Why I Like Zig

I've been writing C code professionally since about 2006, and while I look at a lot of languages (I have varying levels of familiarity with Python, Erlang, Ada, Haskell, Go, Rust, C++, OCaml, D, JavaScript, Prolog, Lua, Lisp, Bash, Perl, Forth...you get the idea), I have yet to find one that I would rather use in cases where C is appropriate. Part of this inclination is that I mainly write for embedded systems, where a lot of the features of higher-level languages don't help.

Zig, though, is showing real promise. There are a few things about Zig that make it suitable as a C replacement, and a few things that make me really want to use it. Of course, there are also things preventing me from using it. This is going to be about those things.

Suitability

Zig's intended as a C replacement, so it's not terribly surprising that it can actually be one. It's got a C compiler, and it has types that are C ABI compatible (at least in principle). It's extremely easy to wrap a C function in a Zig function.

For example, if you have a C function like

int add(int a, int b)
{
    return a + b;
}

Calling it in Zig is as easy as

const c = @cImport({
    @cInclude("adder.h");
});

pub fn add(a: c_int, b: c_int) c_int {
    return c.add(a, b);
}

So easy that wrapping trivial functions like this isn't worth the effort. You'd probably just do

const c = @cImport({
    @cInclude("adder.h");
});

pub const add = c.add;

and let people call it directly.

Calling Zig code from C is pretty easy too. If you want an example, see extending a C library.

So C interoperability is pretty solid, but why does that matter? We're talking about reasons not to write in C, not reasons to keep writing in C. There's a huge amount of code out there that does useful things, and rewriting it all in whatever new language came out last Thursday is wildly impractical (though the Rust community is giving it a go). Effortless C interoperability lets Zig creep in where some other languages would be stuck rewriting the world.

And of course, most languages have at least some of this. extern "C" from C++, for example, is in most pure C libraries' headers. Python's ctypes library makes it very easy to call C functions.
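To make that concrete, here's a minimal ctypes sketch that calls libc's abs. It assumes a POSIX system, where loading None with ctypes.CDLL exposes the symbols already linked into the process, libc included:

```python
import ctypes

# On POSIX, CDLL(None) gives access to the symbols already
# linked into the running process, which includes libc.
libc = ctypes.CDLL(None)

# Declare the C signature of abs: int abs(int);
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

print(libc.abs(-42))  # prints 42
```

No wrapper module, no build step: declare the signature and call the function.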

That brings us to the other part of suitability: features Zig doesn't have. It doesn't require garbage collection, a heap, or C++-style exceptions, features that either prevent a language from being used in embedded systems entirely or have to be disabled before it will work there. I could use this space to rant about C++, but that's been done. This is why most languages are unsuitable for my purposes. Garbage collection is super nice, but sometimes dynamic allocation just isn't an option.

Of course, there's also a taste aspect to this. I like languages that I can mostly keep in my head (looking at you again, C++) because programming is already hard and I don't need to add complex and esoteric concepts like copy and move constructors to my cognitive load. C is, of course, a bigger language than people think, but it's much smaller than languages like Rust and D.

Killer Features

Now that we've established that Zig can work, it's time to talk about what makes it worth switching. It's not effortless, and something has to justify all the time spent looking up language features and the reduction in the hiring pool.

Error Handling

The first thing I'm going to talk about here is how Zig does error handling. It's ubiquitous in Zig code, and it's the best solution to the problem that I've seen. Zig functions can return something called an error union, which is like the Either monad in Haskell: the value is either a useful result, like a u32 (or no result at all), or an error.

Along with error unions as a return type, Zig has a rule that every function's return value has to be used somewhere, or explicitly discarded (unless it's void). Since the type of the variable you're assigning the function's result to isn't an error union, it's a compile-time error to leave a run-time error completely ignored.

Code like this:

struct point_t *point;
point = malloc(sizeof(*point));
point->x = 7;
point->y = 32;

is perfectly valid C. It might even do what you want most of the time. However, this Zig code won't compile:

var point: *Point = allocator.create(Point);
point.x = 7;
point.y = 32;

Instead, you get this helpful error message:

error: expected type '*root.Point', found 'error{OutOfMemory}!*root.Point'
    var point: *Point = allocator.create(Point);
                        ~~~~~~~~~~~~~~~~^~~~~~~
root.zig:9:41: note: cannot convert error union to payload type
root.zig:9:41: note: consider using 'try', 'catch', or 'if'

I won't go into all the ways to handle errors here (the documentation can do that) but the simplest way to get rid of this error is to stick a try before the call to allocator.create:

var point: *Point = try allocator.create(Point);
point.x = 7;
point.y = 32;

Defer

And that brings us to the next great feature Zig has. The thing in D that most makes me want to use it is its scope statements. They let you set up some code to run on scope exit, and Zig's defer does the same thing. The next problem with the snippet above is going to be solved by defer, but first I'm going to take a brief detour to talk a bit about allocators.

Allocators

Zig doesn't have a default heap allocator in the way that most languages do. If you're writing some C code, and you call malloc, or you use the new keyword in C++, you're using the allocator that's built into the language. The standard has things to say about what it does and how it does it. This lets people write incredibly useful tools like valgrind's memcheck to detect the kinds of problems that happen when you manage memory manually. Since Zig doesn't have a default allocator, but does have an allocator interface defined in its standard library, idiomatic Zig code just accepts an allocator as an argument when it's going to do dynamic allocation.

A consequence of that is that someone can write a testing allocator that checks, when a test terminates, if all the memory it allocated was freed. And of course someone did that and it's part of the standard library and testing facilities. So when I run the code from the end of the errors section in a test, the result is this:

run test: error: 'test.allocation' leaked: [gpa] (err): memory address 0x7f84589b1000 leaked:
root.zig:9:45: 0x222d31 in test.allocation (test)
...

If I go look at my source file, line 9 column 45 is the P in try allocator.create(Point). I forgot to free the memory! The test just told me that for free (since I used testing.allocator) and saved me from a memory leak.

</Allocators>

C's solution to this problem is that programmers should remember to free allocated resources when they're done with them. That's clearly the worst. Other languages have other approaches, like automatic garbage collection (disqualified in embedded systems, only works for memory), RAII (now your language is very complicated), try/finally, and context managers.

Python's try/finally is a reasonable approach that mimics the goto-based cleanup used in a lot of C error handling, at the cost of a layer of indentation. Of course, Python is garbage collected, so we don't have this problem with memory, but here's an example with file handles:

f = open('filename.txt', 'r')
try:
    text = f.read()
finally:
    f.close()

This puts the cleanup code at the end of the block where the resource is used, instead of right next to where it's allocated. That makes it a bit harder for the reader to see that the resource is cleaned up. Python solves that problem by hiding the cleanup code in the allocated object with context managers.
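In practice that's the with statement: the file object is its own context manager, so the close happens automatically when the block exits. A self-contained sketch (it writes a throwaway file first so there's something to read):

```python
import os
import tempfile

# Create a throwaway file so the example stands alone.
path = os.path.join(tempfile.mkdtemp(), 'filename.txt')
with open(path, 'w') as f:
    f.write('hello\n')

# The file object's __exit__ method closes it when the block
# ends, even if read() raises an exception.
with open(path, 'r') as f:
    text = f.read()

print(f.closed)  # prints True
```

The reader never sees the close call at all; it lives inside the file object.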

And, of course, if you have 2 files to work on this gets unwieldy quickly:

src = open('filename.txt', 'r')
try:
    dest = open('other.txt', 'w')
    try:
        for line in src.readlines():
            if '$' in line:
                dest.write(line)
    finally:
        dest.close()
finally:
    src.close()

Context managers don't help at all with this indentation proliferation. However, if you can register a statement that's guaranteed to run when the current scope exits, you can write something like this:

var point: *Point = try allocator.create(Point);
defer allocator.destroy(point);
point.x = 7;
point.y = 32;

And that's how Zig's defer statement works. If you see a line of Zig code like

foo.init(allocator);

without an accompanying

defer foo.deinit();

it's a code smell. More importantly, if you make a change to the code that allocates, the code that deallocates is right there on the next line. And there's no extra level of indentation like with Python's context managers. Of course, I was being a little unfair to Python. contextlib.ExitStack lets you do this:

with ExitStack() as stack:
    src = stack.enter_context(open('filename.txt', 'r'))
    dest = stack.enter_context(open('other.txt', 'w'))
    for line in src.readlines():
        if '$' in line:
            dest.write(line)

Hey, that's a lot like defer but more verbose, less discoverable, and with an extra layer of indentation wrapped around all the relevant code.

Back to Zig, there's also errdefer which only executes when there's an error. It makes writing functions that return allocated resources but have to clean them up if there's an error much easier:

var pt = try allocator.create(Point);
errdefer allocator.destroy(pt);

try axes.init(allocator, pt);
errdefer axes.deinit();

axes.setPoint(3, 3);
try axes.rotate(18);

return axes;

If the call to axes.rotate causes an error (which we can see is possible because of the try) then first the axes will be deinitialized then the point will be deallocated. It's clean and easy to use.

Optionals

The next feature that makes me want to ditch C forever and replace it with Zig is optionals. These are the Maybe monad from Haskell, or Option from Rust. Got a pointer? Can it be null sometimes? Wrap it in an optional.

const LinkedList = struct {
    const Node = struct {
        next: ?*Node = null,
        data: u32,
    };
    
    head: ?*Node = null,

    pub fn push(self: *LinkedList, node: *Node) void {
        const next = self.head;
        node.next = next;
        self.head = node;
    }
};

const Iterator = struct {
    current: ?*LinkedList.Node,
    
    pub fn init(lst: *LinkedList) Iterator {
        return .{ .current = lst.head };
    }
    
    pub fn next(self: *Iterator) ?*LinkedList.Node {
        const result = self.current;
        if (self.current) |node| {
            self.current = node.next;
        }
        return result;
    }
};

The language has a lot of sugar around optionals, like while loops with a capture that terminate when the optional is null:

var lst = LinkedList{};
var one = LinkedList.Node{.data = 1};
var two = LinkedList.Node{.data = 2};
lst.push(&one);
lst.push(&two);

var iter = Iterator.init(&lst);
var sum: u32 = 0;
while (iter.next()) |node| {
    sum += node.data;
}
try testing.expectEqual(@as(u32, 3), sum);

And, as seen in the next method, there's if with a capture, which gives you the payload with its optionality stripped off. Of course it has the type checking around it that you'd expect, so if you tried to pass a ?*Node into LinkedList.push it would be an error.

This also means that if you're wrapping a C function, you get some extra safety for free. Instead of void *malloc(size_t size), translate-c gives fn malloc(size: usize) ?*anyopaque.

Metaprogramming

C has a notoriously bad system for metaprogramming. The C preprocessor is a whole new language that doesn't know anything about C and works just as well on text files. That makes it easy to implement, but not great to use. Some other languages have tried to supplant it by coming up with a different whole new language, but one more tightly integrated with the language. That's how we ended up with people proving that C++ templates are Turing complete.

Zig's metaprogramming facilities are just more Zig, but executed at compile time. Zig types are first-class data types, and operating on them is surprisingly ergonomic. I thought it was going to be like D's mixins, which felt difficult to use but better than the alternatives, but this is better. For example, the Linked List structure above would be better written as:

pub fn LinkedList(comptime T: type) type {
    return struct {
        const Self = @This();

        const Node = struct {
            next: ?*Node = null,
            data: T,
        };
    
        head: ?*Node = null,
        
        pub fn push(self: *Self, node: *Node) void {
            const next = self.head;
            node.next = next;
            self.head = node;
        }

        pub fn pop(self: *Self) ?*Self.Node {
            if (self.head) |head| {
                self.head = head.next;
                return head;
            }
            return null;
        }
    };
}

Of course, the standard library has a full implementation of both singly and doubly linked lists (code, docs) that do a lot more than what's above, but the core is the same: pass a type in and get a type out, where the output type uses the input type in some way.

Another consequence of how Zig does metaprogramming is that compile-time reflection is super easy. It's got @TypeOf and @typeInfo, which let you get at the kinds of introspective things you'd do at runtime in a dynamic language.
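For comparison, here's roughly what that kind of introspection looks like at runtime in a dynamic language like Python; it's an analogue, not a one-to-one mapping of @TypeOf and @typeInfo:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(7, 32)

# Runtime analogues of Zig's @TypeOf and @typeInfo:
# type() recovers the type, vars() lists the instance's fields.
print(type(p).__name__)  # prints Point
print(vars(p))           # prints {'x': 7, 'y': 32}
```

The difference is that Zig answers these questions at compile time, so the reflection has no run-time cost.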

I probably wouldn't ditch C just for this feature, but it sure does make life better. There's a reason why there are more C linked list implementations than C programmers, and Zig will not have that problem.

Other Nice Things

There are other features that are common in other programming languages but missing from C, like namespaces, a built-in test system, packages, and strong typing. These are all nice to have, and I occasionally miss each of them when writing C.

Suitability Again

So all those are reasons I would dump C and switch to Zig full time. Of course, I haven't done that. I haven't even advocated that we leave C for Zig at places where I've worked. Why not? It's not ready.

That's the only reason. If, when Zig hits 1.0, it has a spec, an implementation that passes all the tests implied by the spec, and a commitment from its maintainers to preserve backward compatibility, I'll start using it for work. I'm already using it for projects at home, even though it adds a considerable maintenance burden.

I don't love that testing.expectEqual forces me to cast the expected value most of the time, and manual memory management isn't for everything. The way Zig does bitfields might be the least useful implementation of the concept I've ever encountered. The standard library gets awkward at times, and is largely undocumented.

However, none of those things have ever made me think I'd be better off writing in C.