Why I like Zig
I’ve been writing C code professionally since about 2006, and while I look at a lot of languages (I have varying levels of familiarity with Python, Erlang, Ada, Haskell, Go, Rust, C++, OCaml, D, JavaScript, Prolog, Lua, Lisp, Bash, Perl, Forth…you get the idea) I have yet to find one that I would rather use in cases where C is appropriate. Part of this inclination is that I mainly write for embedded systems, where a lot of the features of higher-level languages either don’t help or cost too much.
Zig, though, is showing real promise. There are a few things about Zig that make it suitable as a C replacement, and a few things that make me really want to use it. Of course, there are also things preventing me from using it. This is going to be about those things.
Suitability
Zig’s intended as a C replacement, so it’s not terribly surprising that it can actually be one. It’s got a C compiler, and it has types that are C ABI compatible (at least in principle). It’s extremely easy to wrap a C function in a Zig function.
For example, if you have a C function like
int add(int a, int b)
{
    return a + b;
}
Calling it in Zig is as easy as
const c = @cImport({
    @cInclude("adder.h");
});
pub fn add(a: c_int, b: c_int) c_int {
    return c.add(a, b);
}
So easy that wrapping trivial functions like this isn’t worth the effort. You’d probably just do
const c = @cImport({
    @cInclude("adder.h");
});
pub const add = c.add;
and let people call it directly.
Calling Zig code from C is pretty easy too. If you want an example, see extending a C library.
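As a minimal sketch of that other direction (zig_add and its C prototype are names made up for this illustration, not taken from the linked example), marking a function export gives it the C ABI and an unmangled symbol name, so C code can call it after declaring a matching prototype:

```zig
const std = @import("std");

// Hypothetical function, exported with the C calling convention.
// A C caller would declare it as: int zig_add(int a, int b);
export fn zig_add(a: c_int, b: c_int) c_int {
    return a + b;
}

test "an exported function is still an ordinary Zig function" {
    try std.testing.expectEqual(@as(c_int, 5), zig_add(2, 3));
}
```

How the resulting object file or library gets linked into the C program depends on the build setup, which this sketch leaves out.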
So C interoperability is pretty solid, but why does that matter? We’re talking about reasons not to write in C, not reasons to keep writing in C. Well, there’s a huge amount of code out there that does useful things, and rewriting it all in whatever new language came out last Thursday is wildly impractical (though the Rust community is giving it a go). Effortless C interoperability lets Zig creep in where some other languages would be stuck rewriting the world.
And of course, most languages have at least some of this. extern "C" from C++, for example, is in most pure C libraries’ headers. Python’s ctypes library makes it very easy to call C functions.
That brings us to the other part of suitability: features Zig doesn’t have. It doesn’t require garbage collection, a heap, or C++-style exceptions, features that either rule a language out for embedded systems entirely or have to be disabled before it can be used there. I could use this space to rant about C++, but that’s been done. Requiring those features is why most languages are unsuitable for my purposes. Garbage collection is super nice, but sometimes dynamic allocation just isn’t an option.
Of course, there’s also a taste aspect to this. I like languages that I can mostly keep in my head (looking at you again, C++) because programming is already hard and I don’t need to add complex and esoteric concepts like copy and move constructors to my cognitive load. C is, of course, a bigger language than people think, but it’s much smaller than languages like Rust and D.
Killer Features
Now that we’ve established that Zig can work, it’s time to talk about what makes it worth switching. It’s not effortless, and something has to justify all the time spent looking up language features and the reduction in the hiring pool.
Error Handling
Error handling is ubiquitous in Zig code, and it’s one of the best solutions to the problem that I’ve seen. Zig functions can return something called an error union, which is like the Either monad in Haskell. This is either a useful result like a u32 (or no result) or an error.
Along with error unions as a return type, Zig has a rule that you have to do something with the return value of every function (unless it’s void). Since the type of the variable you’re saving the function’s result to isn’t an error union, it’s a compile-time error to leave a run-time error completely ignored.
Code like this:
struct point_t *point;
point = malloc(sizeof(*point));
point->x = 7;
point->y = 32;
is perfectly valid C. It might even do what you want most of the time. However, this Zig code won’t compile:
var point: *Point = allocator.create(Point);
point.x = 7;
point.y = 32;
Instead, you get this helpful error message:
error: expected type '*root.Point', found 'error{OutOfMemory}!*root.Point'
    var point: *Point = allocator.create(Point);
                        ~~~~~~~~~~~~~~~~^~~~~~~
root.zig:9:41: note: cannot convert error union to payload type
root.zig:9:41: note: consider using 'try', 'catch', or 'if'
I won’t go into all the ways to handle errors here (the documentation can do that) but the simplest way to get rid of this error is to stick a try before the call to allocator.create:
var point: *Point = try allocator.create(Point);
point.x = 7;
point.y = 32;
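The compiler note above mentions try, catch, and if. As a rough sketch of how the three differ (parseDigit and ParseError are hypothetical names invented for this illustration, and the documentation remains the real reference):

```zig
const std = @import("std");

const ParseError = error{ Empty, NotADigit };

// Hypothetical helper: parse the first byte of `text` as an ASCII digit.
fn parseDigit(text: []const u8) ParseError!u8 {
    if (text.len == 0) return error.Empty;
    const ch = text[0];
    if (ch < '0' or ch > '9') return error.NotADigit;
    return ch - '0';
}

test "three ways to handle an error union" {
    // try: unwrap the value, or return the error to our own caller.
    const seven = try parseDigit("7");
    try std.testing.expectEqual(@as(u8, 7), seven);

    // catch: substitute a fallback value when there is an error.
    const fallback = parseDigit("x") catch 0;
    try std.testing.expectEqual(@as(u8, 0), fallback);

    // if/else: branch on success or failure and inspect the error.
    if (parseDigit("")) |_| {
        return error.TestUnexpectedResult;
    } else |err| {
        try std.testing.expect(err == error.Empty);
    }
}
```

Whichever form is used, the error has to be acknowledged somewhere, which is the whole point.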
Defer
That brings us to the next great feature Zig has. The thing in D that most makes me want to use it is its scope statements. They let you set up some code to run as part of scope exit, and Zig’s defer does the same thing. The next problem with the snippet above is going to be solved by defer, but first I’m going to take a brief detour to talk about allocators.
Allocators
Zig doesn’t have a default heap allocator in the way that most languages do. If you’re writing some C code and you call malloc, or you use the new keyword in C++, you’re using the allocator that’s built into the language. The standard has things to say about what it does and how it does it. This lets people write incredibly useful tools like Valgrind’s Memcheck to detect the kinds of problems that happen when you manage memory manually. Since Zig doesn’t have a default allocator, but does have an allocator interface defined in its standard library, idiomatic Zig code just accepts an allocator as an argument when it’s going to do dynamic allocation.
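Here is a small sketch of that convention. The repeatTwice function is a hypothetical helper invented for this example. The caller hands it a std.heap.FixedBufferAllocator, which carves allocations out of a plain array, so no heap is involved at all:

```zig
const std = @import("std");

// Hypothetical helper: returns a newly allocated copy of `text` repeated twice.
// The caller picks the allocator and owns the returned memory.
fn repeatTwice(allocator: std.mem.Allocator, text: []const u8) ![]u8 {
    return std.mem.concat(allocator, u8, &[_][]const u8{ text, text });
}

test "allocate from a fixed buffer instead of a heap" {
    var buffer: [16]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    const allocator = fba.allocator();

    const result = try repeatTwice(allocator, "ab");
    defer allocator.free(result);
    try std.testing.expectEqualStrings("abab", result);
}
```

Because the allocator is only a parameter, the same function works unchanged with any other implementation of the interface.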
A consequence of that is that someone can write a testing allocator that checks, when a test terminates, if all the memory it allocated was freed. And of course someone did that and it’s part of the standard library and testing facilities. So when I run the code from the end of the error handling section in a test, the result is this:
run test: error: 'test.allocation' leaked: [gpa] (err): memory address 0x7f84589b1000 leaked:
root.zig:9:45: 0x222d31 in test.allocation (test)
...
If I go look at my source file, line 9 column 45 is the P in try allocator.create(Point). I forgot to free the memory! The test just told me that for free (since I used testing.allocator) and saved me from a memory leak.
</Allocators>
C’s solution to this problem is that programmers should remember to free allocated resources when they’re done with them. That’s clearly the worst. Other languages have other approaches, like automatic garbage collection (disqualified in embedded systems, only works for memory), RAII (now your language is very complicated), try/finally, and context managers.
Python’s try/finally is a reasonable approach that mimics the goto-based approach used in a lot of C error handling by adding a layer of indentation. Of course, Python is garbage collected, so we don’t have this problem with memory, but here’s an example with file handles:
# Open before the try: if open() fails there is nothing to close.
f = open('filename.txt', 'r')
try:
    text = f.read()
finally:
    f.close()
This puts the cleanup code at the end of the block where the resource is used, instead of right next to where it’s allocated. That makes it a bit harder for the reader to see that the resource is cleaned up. Python solves that problem by hiding the cleanup code in the allocated object with context managers.
And, of course, if you have 2 files to work on this gets unwieldy quickly:
src = open('filename.txt', 'r')
try:
    dest = open('other.txt', 'w')
    try:
        for line in src.readlines():
            if '$' in line:
                dest.write(line)
    finally:
        dest.close()
finally:
    src.close()
Context managers don’t get rid of this indentation: a single with statement can hold several managers, but everything that uses the resources still sits one level deeper. However, if you have a block of code that’s guaranteed to execute when you exit the current scope, you can write something like this:
var point: *Point = try allocator.create(Point);
defer allocator.destroy(point);
point.x = 7;
point.y = 32;
And that’s how Zig’s defer statement works. If you see a line of Zig code like
foo.init(allocator);
without an accompanying
defer foo.deinit();
it’s a code smell. More importantly, if you make a change to the code that allocates, the code that deallocates is right there on the next line. And there’s no extra level of indentation like with Python’s context managers. Of course, I was being a little unfair to Python. contextlib.ExitStack lets you do this:
from contextlib import ExitStack

with ExitStack() as stack:
    src = stack.enter_context(open('filename.txt', 'r'))
    dest = stack.enter_context(open('other.txt', 'w'))
    for line in src.readlines():
        if '$' in line:
            dest.write(line)
Hey, that’s a lot like defer but more verbose, less discoverable, and with an extra layer of indentation wrapped around all the relevant code.
Back to Zig, there’s also errdefer, which only executes when the scope exits with an error. It makes writing functions that return allocated resources but have to clean them up if there’s an error much easier:
var pt = try allocator.create(Point);
errdefer allocator.destroy(pt);
try axes.init(allocator, pt);
errdefer axes.deinit();
axes.setPoint(3, 3);
try axes.rotate(18);
return axes;
If the call to axes.rotate causes an error (which we can see is possible because of the try) then first the axes will be deinitialized and then the point will be deallocated, since deferred statements run in the reverse of the order they were declared. It’s clean and easy to use.
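Here is a self-contained sketch of the same idea that can be run with zig test. The Point type and the createPoint function are stand-ins made up for this example, not the axes API from the snippet above:

```zig
const std = @import("std");

const Point = struct { x: i32 = 0, y: i32 = 0 };

// Hypothetical constructor: returns a heap-allocated Point, rejecting negative coordinates.
fn createPoint(allocator: std.mem.Allocator, x: i32, y: i32) !*Point {
    const pt = try allocator.create(Point);
    // Runs only if this function returns an error after this line.
    errdefer allocator.destroy(pt);

    if (x < 0 or y < 0) return error.NegativeCoordinate;
    pt.* = .{ .x = x, .y = y };
    return pt; // success: the caller now owns pt and the errdefer does not run
}

test "errdefer frees on the error path only" {
    const allocator = std.testing.allocator;

    // Error path: errdefer has already freed the point, so nothing leaks.
    try std.testing.expectError(error.NegativeCoordinate, createPoint(allocator, -1, 2));

    // Success path: the caller owns the point and frees it with a plain defer.
    const pt = try createPoint(allocator, 3, 4);
    defer allocator.destroy(pt);
    try std.testing.expectEqual(@as(i32, 3), pt.x);
}
```

Running the error path under testing.allocator doubles as a check: had the errdefer been missing, the test would fail with a leak report like the one shown earlier.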
Optionals
The next feature that makes me want to ditch C forever and replace it with Zig is optionals. These are the Maybe monad from Haskell, or Option from Rust. Got a pointer? Can it be null sometimes? Wrap it in an optional.
const LinkedList = struct {
    const Node = struct {
        next: ?*Node = null,
        data: u32,
    };
    head: ?*Node = null,
    pub fn push(self: *LinkedList, node: *Node) void {
        const next = self.head;
        node.next = next;
        self.head = node;
    }
};
const Iterator = struct {
    current: ?*LinkedList.Node,
    pub fn init(lst: *LinkedList) Iterator {
        return .{ .current = lst.head };
    }
    pub fn next(self: *Iterator) ?*LinkedList.Node {
        const result = self.current;
        if (self.current) |node| {
            self.current = node.next;
        }
        return result;
    }
};
The language has a lot of sugar around optionals, like while loops that can terminate when their capture is null:
var lst = LinkedList{};
var one = LinkedList.Node{.data = 1};
var two = LinkedList.Node{.data = 2};
lst.push(&one);
lst.push(&two);
var iter = Iterator.init(&lst);
var sum: u32 = 0;
while (iter.next()) |node| {
    sum += node.data;
}
try testing.expectEqual(@as(u32, 3), sum);
And, as seen in the next method, there’s if with a capture that has its optionality stripped off. Of course it has the type checking around it that you’d expect, so if you tried to pass a ?*Node into LinkedList.push it would be an error.
This also means that if you’re wrapping a C function, you get some extra safety for free. Instead of void *malloc(size_t size), translate-c gives something like fn malloc(size: usize) ?*anyopaque.
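To round out the sugar mentioned above, here is a short sketch of two ways to unwrap an optional. The indexOf function is a hypothetical helper written for this example, and orelse is one more operator beyond the if and while captures already shown:

```zig
const std = @import("std");

// Hypothetical lookup: index of the first occurrence of `needle`, or null if absent.
fn indexOf(haystack: []const u8, needle: u8) ?usize {
    var i: usize = 0;
    while (i < haystack.len) : (i += 1) {
        if (haystack[i] == needle) return i;
    }
    return null;
}

test "unwrapping optionals" {
    // orelse supplies a default when the value is null.
    const missing = indexOf("zig", 'c') orelse 99;
    try std.testing.expectEqual(@as(usize, 99), missing);

    // if with a capture runs its first branch only when a value is present.
    if (indexOf("zig", 'i')) |index| {
        try std.testing.expectEqual(@as(usize, 1), index);
    } else {
        return error.TestUnexpectedResult;
    }
}
```

In both cases the compiler refuses to let the ?usize be used as a plain usize until the null case has been dealt with.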
Metaprogramming
C has a notoriously bad system for metaprogramming. The C preprocessor is a whole new language that doesn’t know anything about C and works just as well on text files. That makes it easy to implement, but not great to use. Some other languages have tried to supplant it by coming up with a different whole new language, but one more tightly integrated with the main language. That’s how we ended up with people proving that C++ templates are Turing complete.
Zig’s metaprogramming facilities are just more Zig, but executed at compile time. Types are first-class values at compile time, and operating on them is surprisingly ergonomic. I thought it was going to be like D’s mixins, which felt difficult to use but better than the alternatives, but this is better. For example, the linked list structure above would be better written as:
pub fn LinkedList(comptime T: type) type {
    return struct {
        const Self = @This();
        const Node = struct {
            next: ?*Node = null,
            data: T,
        };
        head: ?*Node = null,
        pub fn push(self: *Self, node: *Node) void {
            const next = self.head;
            node.next = next;
            self.head = node;
        }
        pub fn pop(self: *Self) ?*Self.Node {
            if (self.head) |head| {
                self.head = head.next;
                return head;
            }
            return null;
        }
    };
}
Of course, the standard library has a full implementation of both singly and doubly linked lists (code, docs) that do a lot more than what’s above, but the core is the same: pass a type in and get a type out, where the output type uses the input type in some way.
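The same type-in, type-out pattern works for values as well as types. As a sketch under my own assumptions (BoundedStack is a made-up container, not something from the standard library), here is a stack whose capacity is a compile-time parameter, so its storage is a plain array and it never touches an allocator:

```zig
const std = @import("std");

// Hypothetical fixed-capacity stack: the element type and the capacity are both
// compile-time parameters, and the storage lives inline in the struct.
fn BoundedStack(comptime T: type, comptime capacity: usize) type {
    return struct {
        const Self = @This();

        items: [capacity]T = undefined,
        len: usize = 0,

        pub fn push(self: *Self, item: T) error{Full}!void {
            if (self.len == capacity) return error.Full;
            self.items[self.len] = item;
            self.len += 1;
        }

        pub fn pop(self: *Self) ?T {
            if (self.len == 0) return null;
            self.len -= 1;
            return self.items[self.len];
        }
    };
}

test "instantiate a generic type" {
    var stack = BoundedStack(u8, 2){};
    try stack.push(1);
    try stack.push(2);
    try std.testing.expectError(error.Full, stack.push(3));
    try std.testing.expectEqual(@as(?u8, 2), stack.pop());
}
```

BoundedStack(u8, 2) and BoundedStack(u32, 64) are distinct types, each generated once by running ordinary Zig code at compile time.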
Another consequence of how Zig does metaprogramming is that compile-time reflection is super easy. It’s got @TypeOf and @typeInfo, which let you get at the kinds of introspective things you’d do at runtime in a dynamic language.
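As a small sketch of what that looks like in practice, the hypothetical sumFields function below walks the fields of whatever struct it is given. It uses @TypeOf directly and reaches @typeInfo through std.meta.fields, a standard library wrapper that I chose here because the tag names @typeInfo returns have changed between Zig releases:

```zig
const std = @import("std");

// Hypothetical helper: sums every field of a struct whose fields are all
// integers that fit in an i64. A non-integer field is a compile error.
fn sumFields(value: anytype) i64 {
    const T = @TypeOf(value);
    var total: i64 = 0;
    // The field list is known at compile time, so the loop is unrolled.
    inline for (std.meta.fields(T)) |field| {
        total += @field(value, field.name);
    }
    return total;
}

test "iterate over struct fields at compile time" {
    const Point = struct { x: i32, y: i32 };
    const p = Point{ .x = 7, .y = 32 };
    try std.testing.expectEqual(@as(i64, 39), sumFields(p));
}
```

Nothing here is looked up at run time: by the time the program executes, sumFields for a Point is just two additions.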
I probably wouldn’t ditch C just for this feature, but it sure does make life better. There’s a reason why there are more C linked list implementations than C programmers, and Zig will not have that problem.
Other Nice Things
There are other features that are common in other programming languages but that C lacks, like namespaces, a built-in test system, packages, and strong typing. These are all nice to have, and I occasionally miss each of them when writing C.
Suitability Again
So all those are reasons I would dump C and switch to Zig full time. Of course, I haven’t done that. I haven’t even advocated that we leave C for Zig at places where I’ve worked. Why not? It’s not ready.
That’s the only reason. If, when Zig hits 1.0, it has a spec, an implementation that passes all the tests implied by the spec, and a commitment from its maintainers to preserve backward compatibility, I’ll start using it for work. I’m already using it for projects at home, even though it adds a considerable maintenance burden.
I don’t love that testing.expectEqual forces me to cast the expected value most of the time, and manual memory management isn’t for everything. The way Zig does bitfields might be the least useful implementation of the concept I’ve ever encountered. The standard library gets awkward at times, and is largely undocumented.
However, none of those things have ever made me think I’d be better off writing in C.