Basic MetaProgramming in Zig
Aug 14, 2024
While I've written a lot about Zig, I've avoided talking about Zig's metaprogramming capabilities which, in Zig, generally fall under the "comptime" umbrella. The idea behind "comptime" is to allow Zig code to be run at compile time in order to generate code. It's often said that an advantage of Zig's comptime is that it's just Zig code, as opposed to the separate, often limited, language seen in other languages. In my experience though, Zig's comptime is still a world unto itself. There are limitations to what you can do at comptime, there are parts of the standard library designed for comptime, and even small parts of the language, like inline for, are used only at comptime.
I'm still not very comfortable with comptime, which is why I've not written much about it. However, there are useful parts that are easy to learn. In particular, I'm talking about the @hasField, @hasDecl and @field builtins, along with some corresponding functions in the std.meta namespace.
It might not be the most useful, but @hasField serves as a good introduction to Zig's metaprogramming capabilities. You give it a type and a field name, and it tells you whether or not the type has the field:
const std = @import("std");

pub fn main() !void {
    std.debug.print("id: {any}\n", .{@hasField(User, "id")});
    std.debug.print("name: {any}\n", .{@hasField(User, "name")});
}

const User = struct {
    id: u32,
};
The above will tell us that User has an id field, but doesn't have a name field. @hasField also works for enums and unions.
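As a quick sketch of that last point, here is @hasField pointed at an enum and a tagged union. The Color and Shape types are made up for this example:

```zig
const std = @import("std");

// Hypothetical types, used only for illustration.
const Color = enum { red, green };

const Shape = union(enum) {
    circle: f32,
    square: f32,
};

pub fn main() void {
    // For an enum, the "fields" are its members.
    std.debug.print("Color.red: {any}\n", .{@hasField(Color, "red")});
    std.debug.print("Color.blue: {any}\n", .{@hasField(Color, "blue")});
    // For a union, the fields are its variants.
    std.debug.print("Shape.circle: {any}\n", .{@hasField(Shape, "circle")});
    std.debug.print("Shape.triangle: {any}\n", .{@hasField(Shape, "triangle")});
}
```

This should report true for red and circle, and false for blue and triangle.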
@hasDecl is used to indicate if a struct, union or enum has a declaration. A declaration is essentially anything that isn't a field, most notably a function/method or a constant (including nested structs/unions/enums). @hasDecl respects visibility rules. If a declaration isn't marked pub, then @hasDecl will return false, unless @hasDecl is called from within the same file.
const std = @import("std");

pub fn main() !void {
    std.debug.print("over9000: {any}\n", .{@hasDecl(User, "over9000")});
}

const User = struct {
    // the field that over9000 reads
    power: u32,

    pub fn over9000(self: *User) bool {
        return self.power > 9000;
    }
};
@hasDecl is useful for conditionally enabling behavior. For example, you might have a library with some default behavior which your user can conditionally override by implementing their own version. But care should be taken when using it: it doesn't tell us the type of the declaration. If we change User to this oddity, we'll get the same output (over9000: true):
const User = struct {
    pub const over9000 = struct {};
};
Shortly, we'll see how to deal with this.
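To make the "default behavior with an optional override" idea concrete, here's a minimal sketch. The labelOf helper, the label declaration and the Order type are all invented for this example; they aren't from any real library:

```zig
const std = @import("std");

// Hypothetical library function: use T.label if T declares one,
// otherwise fall back to the type's name.
fn labelOf(comptime T: type) []const u8 {
    // T is comptime-known, so the branch not taken is never analyzed.
    if (@hasDecl(T, "label")) {
        return T.label;
    }
    return @typeName(T);
}

const User = struct {
    id: u32,
    pub const label = "user"; // the override
};

const Order = struct {
    id: u32, // no label declaration, so the default applies
};

pub fn main() void {
    std.debug.print("{s}\n", .{labelOf(User)});
    std.debug.print("{s}\n", .{labelOf(Order)});
}
```

Note that this has exactly the weakness described above: if someone declares label as a function or a nested struct, @hasDecl still returns true and the return T.label line fails to compile.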
Unlike @hasField and @hasDecl, which operate on a type, @field operates on an instance. It is used to both get and set the value of a field of an instance.
const std = @import("std");

pub fn main() !void {
    var user = User{ .id = 0 };
    @field(user, "id") = 99;
    std.debug.print("id: {d}\n", .{@field(user, "id")});
}

const User = struct {
    id: u32,
};
This will print id: 99. This is obviously a silly example. It would have made more sense to use user.id to access the field. You might be thinking: of course, but @field would be great to dynamically get or set a field value based on some database result. Remember though, Zig comptime happens at compile time. The field name, "id" above, has to be known at compile time. This limits the usefulness of @field.
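Here's a small sketch of what that requirement looks like in a function signature. The setField helper is hypothetical; the point is only that the name parameter has to be marked comptime:

```zig
const std = @import("std");

const User = struct {
    id: u32,
    power: u32,
};

// Hypothetical helper: `name` must be comptime, because @field needs to
// know the field name while the code is being compiled.
fn setField(user: *User, comptime name: []const u8, value: u32) void {
    @field(user.*, name) = value;
}

pub fn main() void {
    var user = User{ .id = 1, .power = 0 };
    setField(&user, "power", 9001);
    std.debug.print("power: {d}\n", .{user.power});
}
```

Passing a string that is only known at runtime, such as a column name read from a database, would be rejected by the compiler.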
Looking at my own libraries, I only ever use it within wider comptime blocks of code. For example, pg.zig has very basic row-to-struct mapping capabilities. That code iterates through every field of the target struct and uses @field to populate it. The code looks a bit like:
var value: T = undefined;
inline for (std.meta.fields(T)) |field| {
    @field(value, field.name) = row.get(field.type, field.name);
}
This example begins to demonstrate some of the larger comptime world. We see std.meta.fields, which returns a list of fields for a type. We also see inline for, which unrolls the loop. Given a struct with two fields, "id" and "name", we could imagine the above comptime code resulting in:
var value: T = undefined;
@field(value, "id") = row.get(u32, "id");
@field(value, "name") = row.get([]const u8, "name");
We should take this a step further and also expand the code generated by @field:
var value: T = undefined;
value.id = row.get(u32, "id");
value.name = row.get([]const u8, "name");
Unfortunately, as far as I know, there's no way to see the Zig-equivalent of comptime generated code, such as the expansion shown above. But it's how I try to visualize comptime code, and, in this case, it highlights why the field names given to @field have to be comptime-known.
Interestingly (and maybe inconsistently) @field works on both fields and declarations.
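A short sketch of both uses side by side. The isAdmin function is invented for this example:

```zig
const std = @import("std");

const User = struct {
    id: u32,

    pub const SAIYAN_LEVEL = 9000;

    // Hypothetical method, only here to show a function declaration.
    pub fn isAdmin(self: User) bool {
        return self.id == 1;
    }
};

pub fn main() void {
    const user = User{ .id = 1 };

    // A field, looked up on an instance.
    std.debug.print("id: {d}\n", .{@field(user, "id")});

    // A constant declaration, looked up on the type itself.
    std.debug.print("level: {d}\n", .{@field(User, "SAIYAN_LEVEL")});

    // A function is a declaration too, so it can be fetched and called.
    const isAdmin = @field(User, "isAdmin");
    std.debug.print("admin: {any}\n", .{isAdmin(user)});
}
```

The first call takes a value (user) while the other two take the type (User), which is the inconsistency hinted at above.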
The std.meta namespace is worth getting familiar with; it's small but useful. One of the functions that stands out is std.meta.hasFn. It builds on top of @hasDecl, determining not only if the type has the specific declaration but also whether or not that declaration is a function:
const std = @import("std");

pub fn main() !void {
    std.debug.print("over9000: {any}\n", .{std.meta.hasFn(User, "over9000")});
    std.debug.print("SAIYAN_LEVEL: {any}\n", .{std.meta.hasFn(User, "SAIYAN_LEVEL")});
}

const User = struct {
    // the field that over9000 reads
    power: u32,

    pub fn over9000(self: *User) bool {
        return self.power > 9000;
    }

    pub const SAIYAN_LEVEL = 9000;
};
This tells us that User does have an over9000 function but does not have a SAIYAN_LEVEL function.
So far, when calling @hasDecl or std.meta.hasFn, we've always used a struct, User. Both of these also work with enums and unions. If we try to call @hasDecl with something else, we'll get a compile-time error:
const std = @import("std");

pub fn main() !void {
    std.debug.print("{any}\n", .{@hasDecl(void, "over9000")});
}
error: expected struct, enum, union, or opaque; found 'void'.
If we try the same with std.meta.hasFn, the code compiles and returns false. This is all pretty reasonable, but in real-world code, there's one common issue we'll run into. Oftentimes, you'll use these functions with a generic type. For example, we might create a cache which optionally calls removedFromCache on items which are purged from the cache:
pub fn Cache(comptime T: type) type {
    return struct {
        lookup: std.StringHashMap(T),
        list: std.DoublyLinkedList(T),
        ...

        fn freeSpace(self: *Cache(T)) void {
            const last = self.list.pop() orelse return;
            if (comptime std.meta.hasFn(T, "removedFromCache")) {
                T.removedFromCache(last.data);
            }
        }
    };
}
std.meta.hasFn is obviously a better choice than @hasDecl. For one, we need to make sure removedFromCache is a function and not another type of declaration. For another, our code should compile even if T isn't a struct, i.e. maybe we want to cache u64 values.
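The cache above is only partially shown, so here's a stripped-down, runnable sketch of just the purge step. The purge function is a hypothetical stand-in for freeSpace, and this User takes self by value to keep the example short:

```zig
const std = @import("std");

const User = struct {
    id: u32,

    pub fn removedFromCache(self: User) void {
        std.debug.print("removed user {d}\n", .{self.id});
    }
};

// Hypothetical stand-in for the cache's purge step: call the hook only
// when T declares a function with that name.
fn purge(comptime T: type, item: T) void {
    if (comptime std.meta.hasFn(T, "removedFromCache")) {
        T.removedFromCache(item);
    }
}

pub fn main() void {
    purge(User, .{ .id = 7 }); // the hook exists, so it gets called
    purge(u64, 42); // no hook; this still compiles and does nothing
}
```

Swapping std.meta.hasFn for @hasDecl here would turn the purge(u64, 42) call into a compile-time error.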
With hasFn our cache works for a struct, like User, or a primitive type, like u32. But it doesn't work for one important case: a pointer to a struct, e.g. *User. We need to fix two things to support this reasonable use case.
The first is that std.meta.hasFn will always return false for a pointer to a struct. It might seem like this should print true:
const std = @import("std");

pub fn main() !void {
    std.debug.print("{any}\n", .{std.meta.hasFn(*User, "removedFromCache")});
}

const User = struct {
    pub fn removedFromCache(self: *User) void {
        _ = self;
    }
};
After all, the first parameter to removedFromCache is a *User. But that just isn't how it works. removedFromCache is a function (a declaration) within the User struct. A pointer to a struct doesn't contain declarations, so hasFn will always return false when using a pointer to a struct. To solve this, we can use std.meta.hasMethod instead. If we take the above code and replace hasFn with hasMethod, we'll get true for either a User or a *User.
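Putting the two functions side by side makes the difference easier to see. This is just the example above, extended to print each combination:

```zig
const std = @import("std");

const User = struct {
    pub fn removedFromCache(self: *User) void {
        _ = self;
    }
};

pub fn main() void {
    const name = "removedFromCache";
    // hasFn only looks at the type it is given.
    std.debug.print("hasFn(User): {any}\n", .{std.meta.hasFn(User, name)});
    std.debug.print("hasFn(*User): {any}\n", .{std.meta.hasFn(*User, name)});
    // hasMethod also looks through a single-item pointer.
    std.debug.print("hasMethod(User): {any}\n", .{std.meta.hasMethod(User, name)});
    std.debug.print("hasMethod(*User): {any}\n", .{std.meta.hasMethod(*User, name)});
}
```

Only the hasFn(*User) line should report false.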
Our second issue is the next line:
fn freeSpace(self: *Cache(T)) void {
    const last = self.list.pop() orelse return;
    if (comptime std.meta.hasMethod(T, "removedFromCache")) {
        T.removedFromCache(last.data);
    }
}
The T.removedFromCache(last.data) call works when T is User, because that translates to User.removedFromCache. But when T is a *User, it translates to *User.removedFromCache, which isn't valid. Again, pointers to structs don't contain declarations.
So while std.meta.hasMethod is useful, it doesn't completely solve our problem.
@typeInfo
You can't talk about metaprogramming in Zig without talking about @typeInfo. It takes a type and returns a tagged union describing that type. Currently, std.builtin.Type, returned by @typeInfo, can represent one of 24 different types, some of those having sub-types and complex fields. It can be a lot to try to learn all at once. Instead, we can start to get a feel for @typeInfo by looking at how hasFn uses it. Here's the full implementation of std.meta.hasFn:
pub inline fn hasFn(comptime T: type, comptime name: []const u8) bool {
    switch (@typeInfo(T)) {
        .Struct, .Union, .Enum, .Opaque => {},
        else => return false,
    }
    if (!@hasDecl(T, name))
        return false;

    return @typeInfo(@TypeOf(@field(T, name))) == .Fn;
}
The code is hopefully simple enough that not only can we get an initial sense for @typeInfo, but we can also see how hasFn is able to use it, along with @hasDecl, to identify if a struct has a specific function. The first part, the switch, ensures that T is a struct, union, enum or opaque; otherwise it returns false. We saw how hasFn returns false for other types, whereas @hasDecl gives a compile-time error. Here we see how @typeInfo can be used to turn that compile-time error into a value.
After making sure that T is a valid type, @hasDecl can safely be called. If we do have the declaration, we still need to assert that it's a function. Here again @typeInfo is used, but this time to check if the declaration is a function (.Fn). The @typeInfo + @TypeOf combination is common. @TypeOf always returns a type. It's often used when a function accepts an anytype, but here we see it used on the return value of @field.
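To illustrate the anytype case, here's a small sketch. The describe function is made up for this example; it prints the name of the type and the name of the active std.builtin.Type tag:

```zig
const std = @import("std");

// Hypothetical helper: report what kind of type an anytype value has.
fn describe(value: anytype) void {
    const T = @TypeOf(value); // always a type
    // @tagName gives the active tag of the union returned by @typeInfo.
    std.debug.print("{s} -> {s}\n", .{ @typeName(T), @tagName(@typeInfo(T)) });
}

const User = struct { id: u32 };

pub fn main() void {
    const n: u32 = 1;
    const user = User{ .id = 1 };
    describe(n);
    describe(true);
    describe(user);
    describe(&user);
}
```

The exact spelling of the printed tags depends on the compiler version. The code in this post uses capitalized names such as .Struct and .Pointer, and I believe later Zig releases switched these to lowercase, so treat the output as illustrative.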
With this understanding of hasFn, you might not be surprised to learn that hasMethod is just a wrapper around hasFn:
pub inline fn hasMethod(comptime T: type, comptime name: []const u8) bool {
    return switch (@typeInfo(T)) {
        .Pointer => |P| switch (P.size) {
            .One => hasFn(P.child, name),
            .Many, .Slice, .C => false,
        },
        else => hasFn(T, name),
    };
}
This is a bit more complicated. We're not just using @typeInfo to check the type; in the case of a Pointer, we're going a bit deeper and checking / using some of the compile-time information we have about the type. Specifically we're checking if it's a single-item pointer and, if it is, we're calling hasFn on the "child" of the pointer. This essentially unwraps our *User, turning a call to hasMethod(*User, "x") into a call to hasFn(User, "x").
We took a little detour to start learning about @typeInfo. The hope is that we can use what we've learned to fix our cache implementation. Remember, we had this code:
fn freeSpace(self: *Cache(T)) void {
    const last = self.list.pop() orelse return;
    if (comptime std.meta.hasMethod(T, "removedFromCache")) {
        T.removedFromCache(last.data);
    }
}
This doesn't compile when T is a pointer to a struct, like *User. If you take a second look at the implementation of hasMethod, can you come up with a possible solution? This is what I'd do:
fn freeSpace(self: *Cache(T)) void {
    const last = self.list.pop() orelse return;
    if (comptime std.meta.hasMethod(T, "removedFromCache")) {
        switch (@typeInfo(T)) {
            .Pointer => |ptr| ptr.child.removedFromCache(last.data),
            else => T.removedFromCache(last.data),
        }
    }
}
We can do the same thing as hasMethod: when T is a pointer, use its child. The use of an else fallthrough might seem a little reckless. Like I said, @typeInfo represents 24 different types, and trying to call removedFromCache certainly wouldn't be valid on a Void type. But this code only executes within a successful hasMethod check. Still, some might prefer being more explicit:
switch (@typeInfo(T)) {
    .Pointer => |ptr| ptr.child.removedFromCache(last.data),
    .Struct, .Union, .Enum => T.removedFromCache(last.data),
    else => unreachable,
}
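Since the cache itself is only partially shown, here's a self-contained sketch of the same dispatch outside of a cache. It makes a few assumptions: notifyRemoved is a hypothetical stand-in for freeSpace, the hook takes a *const User so that both a value and a pointer can be passed to it, and the tag names (.Pointer) are the ones used throughout this post, which newer compilers may spell differently:

```zig
const std = @import("std");

const User = struct {
    id: u32,

    pub fn removedFromCache(self: *const User) void {
        std.debug.print("removed user {d}\n", .{self.id});
    }
};

// Hypothetical stand-in for freeSpace: works for T, *T and types
// without the hook at all.
fn notifyRemoved(comptime T: type, item: T) void {
    if (comptime std.meta.hasMethod(T, "removedFromCache")) {
        // The switch operand is comptime-known, so only the matching
        // branch is analyzed for a given T.
        switch (@typeInfo(T)) {
            // item is already a pointer; call through the child type.
            .Pointer => |ptr| ptr.child.removedFromCache(item),
            // item is a value; the hook wants a pointer to it.
            else => T.removedFromCache(&item),
        }
    }
}

pub fn main() void {
    var user = User{ .id = 1 };
    notifyRemoved(User, user); // value
    notifyRemoved(*User, &user); // pointer to struct
    notifyRemoved(u64, 42); // no hook, nothing happens
}
```

With those assumptions, the first two calls should each print "removed user 1" and the third should print nothing.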
Newcomers to Zig often ask about behavior similar to std.meta.hasFn or @field, and probably come out of the exchange disappointed to learn about the comptime requirement. It's certainly different and in many cases limiting compared to the runtime reflection offered by many higher-level languages. But I think it's logical and useful in a different way.
Zig's comptime is approachable, but, for me, still complicated. I wish there was a way to get Zig code out of comptime (a bit like how you can see the full Erlang code generated by Elixir code, including its macros). Still, taking small steps and using the functions we explored above, plus seeing their implementation, has helped me get more comfortable with comptime in general, and the std.builtin.Type in particular. So if you're intimidated by comptime and struggling to learn it, know that you aren't the only one. Take it one step at a time.