Zig Interfaces for the Uninitiated
Update: May 2023 - Killian Vounckx wrote an update to this (in February 2022) that covers some changes to Zig's idioms around how this stuff works. This page is just staying up for posterity.
Zig's standard library takes a peculiar approach to generic interfaces. Since Zig doesn't have language features like inheritance or Go's interfaces, it uses a mechanism that mostly exists in C. However, like most things in C, it's better in Zig.
It may be that some day in the near future, Issue 130 will be resolved and this whole document will be obsolete. In June of 2020, Zig is a new and fast-moving language. Don't be surprised if it changes. Until that day, however, the thing this tutorial describes is what we have.
A Useful Example
Zig doesn't currently have an equivalent to Python's itertools, but it does have a common pattern for iterators:
while (iter.next()) |val| {
// Do something with val
}
Several data structures in the standard library follow a
similar pattern, so let's pull it into a
formal Iterator
struct that we can operate
on.
/// Iterators can be iterated with while loops. The next
/// function returns null when the iterator is exhausted.
pub fn Iterator(comptime T: type) type {
return struct {
const Self = @This();
nextFn: fn (self: *Self) ?T,
resetFn: fn (self: *Self) void,
pub fn next(self: *Self) ?T {
return self.nextFn(self);
}
pub fn reset(self: *Self) void {
return self.resetFn(self);
}
};
}
That's a generic iterator that yields a series
of T
s as its next
method is
called. I included a reset
too, just in case
someone wants to start over. So the interface is two
functions, next
and reset
, that need
to be implemented with a nextFn
and
a resetFn
.
The next step is to make something that implements Iterator.
Implementing an Interface
Let's pick a super-simple use case for our first
iterator. Something finite and familiar. Range
is
a common iterator that counts up to a value by some
interval. It should do nicely.
Basic Range
The first thing to do is figure out what
our Range
's interface will be. It needs a start
value, an end value, and a step value. It also needs to know
what type all these values are going to be. And, of course, it
needs the functions used
by Iterator
: next
and reset
. Let's put the iterated-over type in an
argument to the Range
type itself, and initialize
the rest at runtime.
pub fn Range(comptime T: type) type {
return struct {
const Self = @This();
start: T,
step: T,
end: T,
next_val: T,
pub fn init(start: T, end: T, step: T) Self {
return Self{
.next_val = start,
.start = start,
.end = end,
.step = step,
};
}
pub fn next(self: *Self) ?T {
const rv = self.next_val;
if (self.step < 0) {
if (rv <= self.end) {
// Done counting down
return null;
}
} else {
if (rv >= self.end) {
// Done counting up
return null;
}
}
self.next_val += self.step;
return rv;
}
pub fn reset(self: *Self) void {
self.next_val = self.start;
}
};
}
This struct has all the parts we need: next
and reset
, as well as a way to initialize
it. However, note that both these methods take an instance
of Range(T)
(that's what @This()
means) but our Iterator
interface
functions nextFn
and resetFn
need
pointers to instances of Iterator
. We have to
tell Zig what a Range
Iterator
looks
like.
Iterating
In order to use Range
as
an Iterator
, we have to instantiate
an Iterator
with the particulars
of Range
. That means adding and initializing a
property.
pub fn init(start: T, end: T, step: T) Self {
return Self{
.iterator = Iterator(T){ .nextFn = next, .resetFn = reset },
.next_val = start,
.start = start,
.end = end,
.step = step,
};
}
That new version of init
will initialize our new
property, but there's a problem. next
and reset
accept Range
pointers,
not Iterator
pointers.
@fieldParentPtr
Of course, for a Range
, we'll need to pass
an Iterator
to next
. But how will
that Iterator
get at the Range
's
state like next_val
and step
? Iterator
can't know about
them, since that would make it a pretty lousy generic
interface. Zig has a built-in function for just this
purpose: @fieldParentPtr
. Given a pointer to a
struct field, along with that field's name and the struct's
type, it will return a pointer to the struct. Let's take a
look at the new next
method that uses it.
pub fn next(iterator: *Iterator(T)) ?T {
const self = @fieldParentPtr(Self, "iterator", iterator);
const rv = self.next_val;
if (self.step < 0) {
if (rv <= self.end) {
// Done counting down
return null;
}
} else {
if (rv >= self.end) {
// Done counting up
return null;
}
}
self.next_val += self.step;
return rv;
}
Not much has changed. We moved self
moved from
the function arguments to the first line of the body and
replaced the argument an iterator. Everything else is exactly
the same.
Completed Range
Let's tie the whole thing together along with some tests and documentation.
/// Generic iterator interface. Call next to get the next element or
/// null if the iterator's exhausted. Call reset to start iteration
/// over from the beginning.
pub fn Iterator(comptime T: type) type {
return struct {
const Self = @This();
nextFn: fn (self: *Self) ?T,
resetFn: fn (self: *Self) void,
pub fn next(self: *Self) ?T {
return self.nextFn(self);
}
pub fn reset(self: *Self) void {
return self.resetFn(self);
}
};
}
/// Half-open range type. Call next() on its iterator to get values
/// out. Example usage:
///
/// var range = try Range(u32).init(0, 10, 1);
/// var iter = &range.iterator;
/// while (iter.next()) |n| {
/// std.debug.warn("{}\n", .{n});
/// }
pub fn Range(comptime T: type) type {
return struct {
iterator: Iterator(T),
next_val: T,
start: T,
step: T,
end: T,
const Self = @This();
/// Return the next element in the range, or null if the range
/// has been exhausted.
pub fn next(iterator: *Iterator(T)) ?T {
const self = @fieldParentPtr(Self, "iterator", iterator);
const rv = self.next_val;
if (self.step < 0) {
if (rv <= self.end) {
return null;
}
} else {
if (rv >= self.end) {
return null;
}
}
self.next_val += self.step;
return rv;
}
/// Reset the range back to its start.
pub fn reset(iterator: *Iterator(T)) void {
const self = @fieldParentPtr(Self, "iterator", iterator);
self.next_val = self.start;
}
/// Initialize. Returns error if step size is invalid.
pub fn init(start: T, end: T, step: T) !Self {
if (step == 0) {
return error.ZeroStepSize;
}
return Self{
.next_val = start,
.start = start,
.end = end,
.step = step,
.iterator = Iterator(T){
.nextFn = next,
.resetFn = reset,
},
};
}
};
}
const std = @import("std");
const testing = std.testing;
test "range ascend" {
var range = try Range(u32).init(0, 10, 1);
var iter = &range.iterator;
var correct: u32 = 0;
while (iter.next()) |n| {
testing.expectEqual(correct, n);
correct += 1;
}
testing.expectEqual(correct, 10);
testing.expectEqual(iter.next(), null);
}
test "range descend" {
var range = try Range(i32).init(10, 0, -1);
var iter = &range.iterator;
var correct: i32 = 10;
while (iter.next()) |n| {
testing.expectEqual(correct, n);
correct -= 1;
}
testing.expectEqual(correct, 0);
testing.expectEqual(iter.next(), null);
}
test "range skip" {
var range = try Range(u32).init(0, 10, 2);
var iter = &range.iterator;
var correct: u32 = 0;
while (iter.next()) |n| {
testing.expectEqual(correct, n);
correct += 2;
}
testing.expectEqual(correct, 10);
testing.expectEqual(iter.next(), null);
}
test "range runtime" {
var start: u32 = 0;
while (start < 10) : (start += 1) {
var range = try Range(u32).init(start, 10, 1);
var iter = &range.iterator;
var correct: u32 = start;
while (iter.next()) |n| {
testing.expectEqual(correct, n);
correct += 1;
}
testing.expectEqual(correct, 10);
testing.expectEqual(iter.next(), null);
}
}
And test it...
$ zig test iterators.zig
All 4 tests passed.
There's an important thing to note about the usage. When it
assigns iter
to range.iterator
it
takes the address. If instead it just assigned var iter
= range.iterator;
the code would compile fine. However,
there would be weird bugs. It would look like it's iterating
over random memory. It'd look like that because it would be
iterating over random memory.
var iter = range.iterator;
makes a copy
of the iterator
field in the range
struct. So when we go to call @fieldParentPtr
on
it, we'll be telling Zig to give us the instance
of Range(u32)
a few bytes before the address of
our copy. But the copy isn't inside a Range(u32)
!
So instead, we use var iter =
&range.iterator;
to get a copy of
the address of the iterator, instead of a copy of the
iterator itself. The address is nicely inside
our Range(32)
so everything works pleasantly.
Fold
Now that we have a generic iterator, let's do something with
it. fold
(sometimes called reduce
)
is a function that takes an iterator, a function, and an
initial value and returns a value. We could
implement fold
for arbitrary slices in Zig, but
what if we want to fold a more complex data structure like
a HashMap
? Well, for that we'd have to write a
custom implementation. Or, instead, we could make a
generic Iterator
out of the HashMap
and pass that to our generic fold
function.
/// Apply fun to successive elements in iter. Initial value of acc is
/// init.
pub fn fold(
comptime T: type,
fun: fn (acc: T, val: T) T,
iter: *Iterator(T),
init: T,
) T {
var accumulator: T = init;
while (iter.next()) |val| {
accumulator = fun(accumulator, val);
}
return accumulator;
}
Since we already have a data structure that
implements Iterator
(i.e. Range
)
we'll use that to test our fold
implementation.
fn add(a: u32, b: u32) u32 {
return a + b;
}
test "fold over range" {
var range = try Range(u32).init(1, 10, 1);
var iter = &range.iterator;
const total = fold(u32, add, iter, 17);
testing.expectEqual(total, 17 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9);
}
It's not a traditional functional implementation, but Zig isn't a traditional functional language. A future version of the Zig compiler will be able to guarantee tail call optimization, but until then this will do.
Finally...
While exploring iterators, what we really learned was how
idiomatic Zig uses runtime polymorphism to have
interchangeable data structures. Notable examples in the
standard library include std.mem.Allocator
and std.rand.Random
. Hopefully this has given you
a deeper understanding of how these interfaces are used and
how to implement data structures that match them.
A somewhat more complete version of the iterators library is available on sourcehut.