Zig Interfaces for the Uninitiated

Update: May 2023 - Killian Vounckx wrote an update to this (in February 2022) that covers some changes to Zig's idioms around how this stuff works. This page is just staying up for posterity.

Zig's standard library takes a peculiar approach to generic interfaces. Since Zig doesn't have language features like inheritance or Go's interfaces, it uses a mechanism that mostly exists in C. However, like most things in C, it's better in Zig.

It may be that some day in the near future, Issue 130 will be resolved and this whole document will be obsolete. In June of 2020, Zig is a new and fast-moving language. Don't be surprised if it changes. Until that day, however, the thing this tutorial describes is what we have.

A Useful Example

Zig doesn't currently have an equivalent to Python's itertools, but it does have a common pattern for iterators:


while (iter.next()) |val| {
    // Do something with val
}

Several data structures in the standard library follow a similar pattern, so let's pull it into a formal Iterator struct that we can operate on.


/// Iterators can be iterated with while loops. The next
/// function returns null when the iterator is exhausted.
pub fn Iterator(comptime T: type) type {
    return struct {
        const Self = @This();
        nextFn: fn (self: *Self) ?T,
        resetFn: fn (self: *Self) void,
        pub fn next(self: *Self) ?T {
            return self.nextFn(self);
        }
        pub fn reset(self: *Self) void {
            return self.resetFn(self);
        }
    };
}

That's a generic iterator that yields a series of Ts as its next method is called. I included a reset too, just in case someone wants to start over. So the interface is two functions, next and reset, that need to be implemented with a nextFn and a resetFn.

The next step is to make something that implements Iterator.

Implementing an Interface

Let's pick a super-simple use case for our first iterator. Something finite and familiar. Range is a common iterator that counts up to a value by some interval. It should do nicely.

Basic Range

The first thing to do is figure out what our Range's interface will be. It needs a start value, an end value, and a step value. It also needs to know what type all these values are going to be. And, of course, it needs the functions used by Iterator: next and reset. Let's put the iterated-over type in an argument to the Range type itself, and initialize the rest at runtime.


pub fn Range(comptime T: type) type {
    return struct {
        const Self = @This();

        start: T,
        step: T,
        end: T,
        next_val: T,

        pub fn init(start: T, end: T, step: T) Self {
            return Self{
                .next_val = start,
                .start = start,
                .end = end,
                .step = step,
            };
        }

        pub fn next(self: *Self) ?T {
            const rv = self.next_val;
            if (self.step < 0) {
                if (rv <= self.end) {
                    // Done counting down
                    return null;
                }
            } else {
                if (rv >= self.end) {
                    // Done counting up
                    return null;
                }
            }
            self.next_val += self.step;
            return rv;
        }

        pub fn reset(self: *Self) void {
            self.next_val = self.start;
        }
    };
}

This struct has all the parts we need: next and reset, as well as a way to initialize it. However, note that both these methods take an instance of Range(T) (that's what @This() means) but our Iterator interface functions nextFn and resetFn need pointers to instances of Iterator. We have to tell Zig what a Range Iterator looks like.

Iterating

In order to use Range as an Iterator, we have to instantiate an Iterator with the particulars of Range. That means adding and initializing a property.


pub fn init(start: T, end: T, step: T) Self {
    return Self{
        .iterator = Iterator(T){ .nextFn = next, .resetFn = reset },
        .next_val = start,
        .start = start,
        .end = end,
        .step = step,
    };
}

That new version of init will initialize our new property, but there's a problem. next and reset accept Range pointers, not Iterator pointers.

@fieldParentPtr

Of course, for a Range, we'll need to pass an Iterator to next. But how will that Iterator get at the Range's state like next_val and step? Iterator can't know about them, since that would make it a pretty lousy generic interface. Zig has a built-in function for just this purpose: @fieldParentPtr. Given a pointer to a struct field, along with that field's name and the struct's type, it will return a pointer to the struct. Let's take a look at the new next method that uses it.


pub fn next(iterator: *Iterator(T)) ?T {
        const self = @fieldParentPtr(Self, "iterator", iterator);
        const rv = self.next_val;
        if (self.step < 0) {
            if (rv <= self.end) {
                // Done counting down
                return null;
            }
        } else {
            if (rv >= self.end) {
                // Done counting up
                return null;
            }
        }
         self.next_val += self.step;
        return rv;
}

Not much has changed. We moved self moved from the function arguments to the first line of the body and replaced the argument an iterator. Everything else is exactly the same.

Completed Range

Let's tie the whole thing together along with some tests and documentation.


/// Generic iterator interface. Call next to get the next element or
/// null if the iterator's exhausted. Call reset to start iteration
/// over from the beginning.
pub fn Iterator(comptime T: type) type {
    return struct {
        const Self = @This();
        nextFn: fn (self: *Self) ?T,
        resetFn: fn (self: *Self) void,
        pub fn next(self: *Self) ?T {
            return self.nextFn(self);
        }
        pub fn reset(self: *Self) void {
            return self.resetFn(self);
        }
    };
}

/// Half-open range type. Call next() on its iterator to get values
/// out. Example usage:
///
/// var range = try Range(u32).init(0, 10, 1);
/// var iter = &range.iterator;
/// while (iter.next()) |n| {
///     std.debug.warn("{}\n", .{n});
/// }
pub fn Range(comptime T: type) type {
    return struct {
        iterator: Iterator(T),
        next_val: T,
        start: T,
        step: T,
        end: T,
        const Self = @This();

        /// Return the next element in the range, or null if the range
        /// has been exhausted.
        pub fn next(iterator: *Iterator(T)) ?T {
            const self = @fieldParentPtr(Self, "iterator", iterator);
            const rv = self.next_val;
            if (self.step < 0) {
                if (rv <= self.end) {
                    return null;
                }
            } else {
                if (rv >= self.end) {
                    return null;
                }
            }

            self.next_val += self.step;
            return rv;
        }

        /// Reset the range back to its start.
        pub fn reset(iterator: *Iterator(T)) void {
            const self = @fieldParentPtr(Self, "iterator", iterator);
            self.next_val = self.start;
        }

        /// Initialize. Returns error if step size is invalid.
        pub fn init(start: T, end: T, step: T) !Self {
            if (step == 0) {
                return error.ZeroStepSize;
            }
            return Self{
                .next_val = start,
                .start = start,
                .end = end,
                .step = step,
                .iterator = Iterator(T){
                    .nextFn = next,
                    .resetFn = reset,
                },
            };
        }
    };
}

const std = @import("std");

const testing = std.testing;
test "range ascend" {
    var range = try Range(u32).init(0, 10, 1);
    var iter = &range.iterator;
    var correct: u32 = 0;
    while (iter.next()) |n| {
        testing.expectEqual(correct, n);
        correct += 1;
    }
    testing.expectEqual(correct, 10);
    testing.expectEqual(iter.next(), null);
}

test "range descend" {
    var range = try Range(i32).init(10, 0, -1);
    var iter = &range.iterator;
    var correct: i32 = 10;
    while (iter.next()) |n| {
        testing.expectEqual(correct, n);
        correct -= 1;
    }
    testing.expectEqual(correct, 0);
    testing.expectEqual(iter.next(), null);
}

test "range skip" {
    var range = try Range(u32).init(0, 10, 2);
    var iter = &range.iterator;
    var correct: u32 = 0;
    while (iter.next()) |n| {
        testing.expectEqual(correct, n);
        correct += 2;
    }
    testing.expectEqual(correct, 10);
    testing.expectEqual(iter.next(), null);
}

test "range runtime" {
    var start: u32 = 0;
    while (start < 10) : (start += 1) {
        var range = try Range(u32).init(start, 10, 1);
        var iter = &range.iterator;
        var correct: u32 = start;
        while (iter.next()) |n| {
            testing.expectEqual(correct, n);
            correct += 1;
        }
        testing.expectEqual(correct, 10);
        testing.expectEqual(iter.next(), null);
    }
}

And test it...


$ zig test iterators.zig 
All 4 tests passed.

There's an important thing to note about the usage. When it assigns iter to range.iterator it takes the address. If instead it just assigned var iter = range.iterator; the code would compile fine. However, there would be weird bugs. It would look like it's iterating over random memory. It'd look like that because it would be iterating over random memory.

var iter = range.iterator; makes a copy of the iterator field in the range struct. So when we go to call @fieldParentPtr on it, we'll be telling Zig to give us the instance of Range(u32) a few bytes before the address of our copy. But the copy isn't inside a Range(u32)! So instead, we use var iter = &range.iterator; to get a copy of the address of the iterator, instead of a copy of the iterator itself. The address is nicely inside our Range(32) so everything works pleasantly.

Fold

Now that we have a generic iterator, let's do something with it. fold (sometimes called reduce) is a function that takes an iterator, a function, and an initial value and returns a value. We could implement fold for arbitrary slices in Zig, but what if we want to fold a more complex data structure like a HashMap? Well, for that we'd have to write a custom implementation. Or, instead, we could make a generic Iterator out of the HashMap and pass that to our generic fold function.


/// Apply fun to successive elements in iter. Initial value of acc is
/// init.
pub fn fold(
    comptime T: type,
    fun: fn (acc: T, val: T) T,
    iter: *Iterator(T),
    init: T,
) T {
    var accumulator: T = init;
    while (iter.next()) |val| {
        accumulator = fun(accumulator, val);
    }
    return accumulator;
}

Since we already have a data structure that implements Iterator (i.e. Range) we'll use that to test our fold implementation.


fn add(a: u32, b: u32) u32 {
    return a + b;
}

test "fold over range" {
    var range = try Range(u32).init(1, 10, 1);
    var iter = &range.iterator;
    const total = fold(u32, add, iter, 17);
    testing.expectEqual(total, 17 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9);
}

It's not a traditional functional implementation, but Zig isn't a traditional functional language. A future version of the Zig compiler will be able to guarantee tail call optimization, but until then this will do.

Finally...

While exploring iterators, what we really learned was how idiomatic Zig uses runtime polymorphism to have interchangeable data structures. Notable examples in the standard library include std.mem.Allocator and std.rand.Random. Hopefully this has given you a deeper understanding of how these interfaces are used and how to implement data structures that match them.

A somewhat more complete version of the iterators library is available on sourcehut.