This part continues where the previous one left off: familiarizing ourselves with the language. We'll explore Zig's control flow and types beyond structures. Together with the first part, we'll have covered most of the language's syntax, allowing us to tackle more of the language and the standard library.
Zig's control flow will likely feel familiar, but it interacts with aspects of the language we've yet to explore. We'll start with a quick overview of control flow and come back to it when discussing features that elicit special control flow behavior.
You will notice that instead of the logical operators `&&` and `||`, we use `and` and `or`. Like in most languages, `and` and `or` control the flow of execution: they short-circuit. The right side of an `and` isn't evaluated if the left side is false, and the right side of an `or` isn't evaluated if the left side is true. In Zig, control flow is done with keywords, and thus `and` and `or` are used.
Also, the comparison operator, `==`, does not work between slices, such as `[]const u8`, i.e. strings. In most cases, you'll use `std.mem.eql(u8, str1, str2)`, which will compare the length and then the bytes of the two slices.
Zig's `if`, `else if` and `else` are commonplace:
if (std.mem.eql(u8, method, "GET") or std.mem.eql(u8, method, "HEAD")) {
    // handle a GET or HEAD request
} else if (std.mem.eql(u8, method, "POST")) {
    // handle a POST request
} else {
    // handle any other method
}
The above example is comparing ASCII strings and should likely be case insensitive. `std.ascii.eqlIgnoreCase(str1, str2)` is probably a better option.
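To make the difference between the two comparisons concrete, here is a minimal sketch. `isSafeMethod` is a hypothetical helper introduced only for illustration; `std.mem.eql` and `std.ascii.eqlIgnoreCase` are the standard library functions mentioned above.

```zig
const std = @import("std");

// Hypothetical helper (not part of std): true for GET or HEAD in any casing.
fn isSafeMethod(method: []const u8) bool {
    // `or` short-circuits: the second comparison only runs if the first is false.
    return std.ascii.eqlIgnoreCase(method, "GET") or
        std.ascii.eqlIgnoreCase(method, "HEAD");
}

pub fn main() void {
    // Exact, byte-for-byte comparison: casing matters here.
    std.debug.print("{}\n", .{std.mem.eql(u8, "get", "GET")});
    // ASCII case-insensitive comparison.
    std.debug.print("{}\n", .{isSafeMethod("get")});
    std.debug.print("{}\n", .{isSafeMethod("post")});
}
```

This should print `false`, `true` and `false`, in that order.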
There is no ternary operator, but you can use an `if/else` like so:
const super = if (power > 9000) true else false;
`switch` is similar to an `if/else if/else`, but has the advantage of being exhaustive. That is, it's a compile-time error if not all cases are covered. This code will not compile:
fn anniversaryName(years_married: u16) []const u8 {
    switch (years_married) {
        1 => return "paper",
        2 => return "cotton",
        3 => return "leather",
        4 => return "flower",
        5 => return "wood",
        6 => return "sugar",
    }
}
We're told: switch must handle all possibilities. Since our `years_married` is a 16-bit integer, does that mean we need to handle all 64K cases? Yes, but thankfully there's an `else`:
        6 => return "sugar",
        else => return "no more gifts for you",
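Put together, the fixed function might look like the following sketch. The `main` function is only there to exercise it and is not part of the original example.

```zig
const std = @import("std");

fn anniversaryName(years_married: u16) []const u8 {
    switch (years_married) {
        1 => return "paper",
        2 => return "cotton",
        3 => return "leather",
        4 => return "flower",
        5 => return "wood",
        6 => return "sugar",
        // `else` covers every remaining u16 value, making the switch exhaustive.
        else => return "no more gifts for you",
    }
}

pub fn main() void {
    std.debug.print("{s}\n", .{anniversaryName(3)});
    std.debug.print("{s}\n", .{anniversaryName(40)});
}
```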
We can combine multiple cases or use ranges, and use blocks for complex cases:
fn arrivalTimeDesc(minutes: u16, is_late: bool) []const u8 {
    switch (minutes) {
        0 => return "arrived",
        1, 2 => return "soon",
        3...5 => return "no more than 5 minutes",
        else => {
            if (!is_late) {
                return "sorry, it'll be a while";
            }
            return "never";
        },
    }
}
While a `switch` is useful in a number of cases, its exhaustive nature really shines when dealing with enums, which we'll talk about shortly.
Zig's `for` loop is used to iterate over arrays, slices and ranges. For example, to check if an array contains a value, we might write:
fn contains(haystack: []const u32, needle: u32) bool {
    for (haystack) |value| {
        if (needle == value) {
            return true;
        }
    }
    return false;
}
`for` loops can work on multiple sequences at once, as long as those sequences are the same length. Above we used the `std.mem.eql` function. Here's what it (almost) looks like:
pub fn eql(comptime T: type, a: []const T, b: []const T) bool {
    if (a.len != b.len) return false;
    for (a, b) |a_elem, b_elem| {
        if (a_elem != b_elem) return false;
    }
    return true;
}
The initial `if` check isn't just a nice performance optimization, it's a necessary guard. If we take it out and pass arguments of different lengths, we'll get a runtime panic: for loop over objects with non-equal lengths.
`for` loops can also iterate over ranges, such as:
for (0..10) |i| {
    std.debug.print("{d}\n", .{i});
}
This really shines in combination with one (or more!) sequence:
fn indexOf(haystack: []const u32, needle: u32) ?usize {
    for (haystack, 0..) |value, i| {
        if (needle == value) {
            return i;
        }
    }
    return null;
}
The end of the range is inferred from the length of `haystack`, though we could punish ourselves and write: `0..haystack.len`. `for` loops don't support the more generic `init; compare; step` idiom. For this, we rely on `while`.
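As a sketch of combining two sequences with an index range, the hypothetical `firstMismatch` helper below (not a standard library function) returns the position where two slices first differ. It slices both inputs to the shorter length first, since iterating over sequences of unequal length would panic.

```zig
const std = @import("std");

// Hypothetical helper: index of the first differing byte, or null if equal.
fn firstMismatch(a: []const u8, b: []const u8) ?usize {
    // Trim both slices to a common length so the for loop never sees
    // sequences of different lengths.
    const len = @min(a.len, b.len);
    for (a[0..len], b[0..len], 0..) |x, y, i| {
        if (x != y) return i;
    }
    // Same prefix: they differ only if one is longer than the other.
    return if (a.len == b.len) null else len;
}

pub fn main() void {
    if (firstMismatch("zig", "zag")) |i| {
        std.debug.print("first mismatch at {d}\n", .{i});
    }
}
```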
Because `while` is simpler, taking the form of `while (condition) { }`, we have greater control over the iteration. For example, when counting the number of escape sequences in a string, we need to increment our iterator by 2 to avoid double counting a `\`:
var escape_count: usize = 0;
{
    var i: usize = 0;
    while (i < src.len) {
        if (src[i] == '\\') {
            i += 2;
            escape_count += 1;
        } else {
            i += 1;
        }
    }
}
We added an explicit block around our temporary `i` variable and `while` loop. This narrows the scope of `i`. Blocks like this can be useful, though in this case it's probably overkill. Still, the above example is as close to a traditional `for(init; compare; step)` loop as Zig gets.
A `while` can have an `else` clause, which is executed when the condition is false. It also accepts a statement to execute after each iteration. To run more than one statement, wrap them in a block and separate them with `;`. This feature was commonly used prior to `for` supporting multiple sequences. The above can be written as:
var i: usize = 0;
var escape_count: usize = 0;
while (i < src.len) : (i += 1) {
    if (src[i] == '\\') {
        i += 1;
        escape_count += 1;
    }
}
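The `else` clause and a multi-statement continue expression can be sketched as follows. Both `firstSpace` and `isPalindrome` are hypothetical helpers written for this illustration; the second one assumes the block form `({ ...; ...; })` for running several statements after each iteration.

```zig
const std = @import("std");

// Hypothetical helper: index of the first space, or src.len if there is none.
fn firstSpace(src: []const u8) usize {
    var i: usize = 0;
    // `break i` gives the loop a value; the `else` branch supplies one
    // when the condition becomes false without a break.
    return while (i < src.len) : (i += 1) {
        if (src[i] == ' ') break i;
    } else src.len;
}

// Hypothetical helper: two indexes advance after every iteration.
fn isPalindrome(s: []const u8) bool {
    if (s.len == 0) return true;
    var lo: usize = 0;
    var hi: usize = s.len - 1;
    while (lo < hi) : ({
        lo += 1;
        hi -= 1;
    }) {
        if (s[lo] != s[hi]) return false;
    }
    return true;
}

pub fn main() void {
    std.debug.print("{d}\n", .{firstSpace("hello world")});
    std.debug.print("{}\n", .{isPalindrome("racecar")});
}
```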
`break` and `continue` are supported for either breaking out of the inner-most loop or jumping to the next iteration.
Blocks can be labeled and `break` and `continue` can target a specific label. A contrived example:
outer: for (1..10) |i| {
    for (i..10) |j| {
        if (i * j > (i+i + j+j)) continue :outer;
        std.debug.print("{d} + {d} >= {d} * {d}\n", .{i+i, j+j, i, j});
    }
}
`break` has another interesting behavior, returning a value from a block:
const personality_analysis = blk: {
    if (tea_vote > coffee_vote) break :blk "sane";
    if (tea_vote == coffee_vote) break :blk "whatever";
    // the only case left is tea_vote < coffee_vote; every path must break with a value
    break :blk "dangerous";
};
Blocks like this must be terminated with a semicolon.
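Here is a self-contained sketch of a labeled block used as an expression. The `analyze` function and its explicit `[]const u8` annotation are assumptions added so the snippet compiles on its own; the vote counts are plain parameters.

```zig
const std = @import("std");

// Hypothetical wrapper around the labeled block.
fn analyze(tea_vote: u32, coffee_vote: u32) []const u8 {
    // The explicit type gives every `break :blk` the same result type.
    const personality_analysis: []const u8 = blk: {
        if (tea_vote > coffee_vote) break :blk "sane";
        if (tea_vote == coffee_vote) break :blk "whatever";
        break :blk "dangerous";
    };
    return personality_analysis;
}

pub fn main() void {
    std.debug.print("{s}\n", .{analyze(10, 3)});
    std.debug.print("{s}\n", .{analyze(3, 10)});
}
```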
Later, when we explore tagged unions, error unions and optional types, we'll see what else these control flow structures have to offer.
Enums are integer constants that are given a label. They are defined much like a struct:
const Status = enum {
    ok,
    bad,
    unknown,
};
And, like a struct, can contain other definitions, including functions which may or may not take the enum as a parameter:
const Stage = enum {
    validate,
    awaiting_confirmation,
    confirmed,
    err,

    fn isComplete(self: Stage) bool {
        return self == .confirmed or self == .err;
    }
};
Recall that struct types can be inferred based on their assigned or return type using the `.{...}` notation. Above, we see the enum type being inferred based on its comparison to `self`, which is of type `Stage`. We could have been explicit and written: `return self == Stage.confirmed or self == Stage.err;`. But, when dealing with enums you'll often see the enum type omitted via the `.$value` notation. This is called an enum literal.
The exhaustive nature of `switch` makes it pair nicely with enums as it ensures you've handled all possible cases. Be careful when using the `else` clause of a `switch` though, as it'll match any newly added enum values, which may or may not be the behavior that you want.
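As an illustration of an exhaustive `switch` over an enum, the sketch below adds a hypothetical `describe` method to `Stage`. Because there is no `else`, adding a new value to `Stage` would turn this `switch` into a compile-time error until the new case is handled.

```zig
const std = @import("std");

const Stage = enum {
    validate,
    awaiting_confirmation,
    confirmed,
    err,

    fn isComplete(self: Stage) bool {
        return self == .confirmed or self == .err;
    }

    // Hypothetical helper: every value is listed, so no `else` is needed.
    fn describe(self: Stage) []const u8 {
        return switch (self) {
            .validate => "validating",
            .awaiting_confirmation => "waiting on the user",
            .confirmed => "done",
            .err => "failed",
        };
    }
};

pub fn main() void {
    const stage = Stage.confirmed;
    std.debug.print("{s} (complete: {})\n", .{ stage.describe(), stage.isComplete() });
}
```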
A union defines a set of types that a value can have. For example, this `Number` union can either be an `integer`, a `float` or a `nan` (not a number):
const std = @import("std");

pub fn main() void {
    const n = Number{.int = 32};
    std.debug.print("{d}\n", .{n.int});
}

const Number = union {
    int: i64,
    float: f64,
    nan: void,
};
A union can only have one field set at a time; it's an error to try to access an unset field. Since we've set the `int` field, if we then tried to access `n.float`, we'd get an error. One of our fields, `nan`, has a `void` type. How would we set its value? Use `{}`:
const n = Number{.nan = {}};
A challenge with unions is knowing which field is set. This is where tagged unions come into play. A tagged union merges an enum with a union, which can be used in a switch statement. Consider this example:
const std = @import("std");

pub fn main() void {
    const ts = Timestamp{.unix = 1693278411};
    std.debug.print("{d}\n", .{ts.seconds()});
}

const TimestampType = enum {
    unix,
    datetime,
};

const Timestamp = union(TimestampType) {
    unix: i32,
    datetime: DateTime,

    const DateTime = struct {
        year: u16,
        month: u8,
        day: u8,
        hour: u8,
        minute: u8,
        second: u8,
    };

    fn seconds(self: Timestamp) u16 {
        switch (self) {
            .datetime => |dt| return dt.second,
            .unix => |ts| {
                const seconds_since_midnight: i32 = @rem(ts, 86400);
                return @intCast(@rem(seconds_since_midnight, 60));
            },
        }
    }
};
Notice that each case in our `switch` captures the typed value of the field. That is, `dt` is a `Timestamp.DateTime` and `ts` is an `i32`. This is also the first time we've seen a structure nested within another type. `DateTime` could have been defined outside of the union. We're also seeing two new builtin functions: `@rem` to get the remainder and `@intCast` to convert the result to a `u16` (`@intCast` infers that we want a `u16` from our return type since the value is being returned).
As we can see from the above example, tagged unions can be used somewhat like interfaces, as long as all possible implementations are known ahead of time and can be baked into the tagged union.
Finally, the enum type of a tagged union can be inferred. Instead of defining a `TimestampType`, we could have done:
const Timestamp = union(enum) {
    unix: i32,
    datetime: DateTime,
    ...
and Zig would have created an implicit enum based on our union's fields.
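The same idea applied to the earlier `Number` union can be sketched like this. The `isNan` method and the `describe` function are hypothetical additions for illustration; the tag enum is the implicit one Zig derives from the field names.

```zig
const std = @import("std");

const Number = union(enum) {
    int: i64,
    float: f64,
    nan: void,

    // Hypothetical helper: switching on the union inspects the active tag.
    fn isNan(self: Number) bool {
        return switch (self) {
            .nan => true,
            .int, .float => false,
        };
    }
};

// Hypothetical helper: each case captures the payload with its own type.
fn describe(n: Number) void {
    switch (n) {
        .int => |v| std.debug.print("int: {d}\n", .{v}),
        .float => |v| std.debug.print("float: {d}\n", .{v}),
        .nan => std.debug.print("not a number\n", .{}),
    }
}

pub fn main() void {
    describe(Number{ .int = 32 });
    describe(Number{ .nan = {} });
}
```

Unlike the plain `union` version, there is no need to remember which field was set: the `switch` always knows.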
Any value can be declared as optional by prepending a question mark, `?`, to the type. Optional types can either be `null` or a value of the defined type:
var home: ?[]const u8 = null;
var name: ?[]const u8 = "Leto";
The need to have an explicit type should be clear: if we had just done `const name = "Leto";`, then the inferred type would be non-optional (`*const [4:0]u8`, which coerces to `[]const u8`).
`.?` is used to access the value behind the optional type:
std.debug.print("{s}\n", .{name.?});
But we'll get a runtime panic if we use `.?` on a null. An `if` statement can safely unwrap an optional:
if (home) |h| {
    // h is a []const u8
} else {
    // we don't have a home value
}
`orelse` can be used to unwrap the optional or execute code. This is commonly used to specify a default, or return from the function:
const h = home orelse "unknown";
// or return from the function
const h = home orelse return;
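A runnable sketch of both unwrapping styles follows. `homeOrDefault` and `homeLen` are hypothetical helpers introduced for this example.

```zig
const std = @import("std");

// Hypothetical helper: unwrap with a default value.
fn homeOrDefault(home: ?[]const u8) []const u8 {
    return home orelse "unknown";
}

// Hypothetical helper: unwrap with `if`, capturing the non-null value as `h`.
fn homeLen(home: ?[]const u8) usize {
    if (home) |h| {
        return h.len;
    } else {
        return 0;
    }
}

pub fn main() void {
    const home: ?[]const u8 = null;
    const name: ?[]const u8 = "Leto";
    std.debug.print("{s} {s}\n", .{ homeOrDefault(home), homeOrDefault(name) });
    std.debug.print("{d}\n", .{homeLen(name)});
}
```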
However, `orelse` can also be given a block and execute more complex logic. Optional types also integrate with `while`, and are frequently used for creating iterators. We won't implement an iterator, but hopefully this dummy code makes sense:
while (rows.next()) |row| {
    // we have a row
}
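For readers who want to see the pattern end to end, here is a minimal sketch of an iterator whose `next` returns an optional. `Countdown` is a hypothetical type, unrelated to the `rows` value above; the loop stops as soon as `next` returns `null`.

```zig
const std = @import("std");

// Hypothetical iterator: yields remaining, remaining - 1, ..., 1, then null.
const Countdown = struct {
    remaining: u32,

    fn next(self: *Countdown) ?u32 {
        if (self.remaining == 0) return null;
        const current = self.remaining;
        self.remaining -= 1;
        return current;
    }
};

pub fn main() void {
    var it = Countdown{ .remaining = 3 };
    // `n` is the unwrapped u32; a null result ends the loop.
    while (it.next()) |n| {
        std.debug.print("{d}\n", .{n});
    }
}
```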
So far, every single variable that we've seen has been initialized to a sensible value. But sometimes we don't know the value of a variable when it's declared. Optionals are one option, but don't always make sense. In such cases we can set variables to `undefined` to leave them uninitialized.
One place where this is commonly done is when creating an array to be filled by some function:
var pseudo_uuid: [16]u8 = undefined;
std.crypto.random.bytes(&pseudo_uuid);
The above still creates an array of 16 bytes, but leaves the memory uninitialized.
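Another common case is a scratch buffer that a formatting function fills in. The sketch below uses `std.fmt.bufPrint`; `powerLabel` is a hypothetical helper, and the buffer size of 32 is an arbitrary choice that is large enough for this message.

```zig
const std = @import("std");

// Hypothetical helper: writes into the caller's buffer and returns the
// slice of it that was actually used.
fn powerLabel(buf: []u8, power: u32) ![]u8 {
    return std.fmt.bufPrint(buf, "power: {d}", .{power});
}

pub fn main() !void {
    // The bytes start uninitialized; bufPrint overwrites the part it needs.
    var buf: [32]u8 = undefined;
    const label = try powerLabel(&buf, 9001);
    std.debug.print("{s}\n", .{label});
}
```

Only the returned slice holds meaningful data; the rest of `buf` is still undefined and shouldn't be read.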
Zig has simple and pragmatic error handling capabilities. It all starts with error sets, which look and behave like enums:
const OpenError = error {
    AccessDenied,
    NotFound,
};
A function, including `main`, can now return this error:
pub fn main() void {
    return OpenError.AccessDenied;
}

const OpenError = error {
    AccessDenied,
    NotFound,
};
If you try to run this, you'll get an error: expected type 'void', found 'error{AccessDenied,NotFound}'. This makes sense: we defined `main` with a `void` return type, yet we return something (an error, sure, but that's still not `void`). To solve this, we need to change our function's return type.
pub fn main() OpenError!void {
    return OpenError.AccessDenied;
}
This is called an error union type and it indicates that our function can return either an `OpenError` error or a `void` (aka, nothing). So far we've been quite explicit: we created an error set for the possible errors our function can return, and used that error set in the error union return type of our function. But, when it comes to errors, Zig has a few neat tricks up its sleeve. First, rather than specifying an error union as `error set!return type` we can let Zig infer the error set by using: `!return type`. So we could, and probably would, define our `main` as:
pub fn main() !void
Second, Zig is capable of implicitly creating error sets for us. Instead of creating our error set, we could have done:
pub fn main() !void {
    return error.AccessDenied;
}
Our completely explicit and implicit approaches aren't exactly equivalent. For example, references to functions with implicit error sets require using the special `anyerror` type. Library developers might see advantages to being more explicit, such as self-documenting code. Still, I think both the implicit error sets and the inferred error union are pragmatic; I make heavy use of both.
The real value of error unions is the built-in language support in the shape of `catch` and `try`. A function call that returns an error union can include a `catch` clause. For example, an HTTP server library might have code that looks like:
action(req, res) catch |err| {
    if (err == error.BrokenPipe or err == error.ConnectionResetByPeer) {
        return;
    } else if (err == error.BodyTooBig) {
        res.status = 431;
        res.body = "Request body is too big";
    } else {
        res.status = 500;
        res.body = "Internal Server Error";
    }
};
The `switch` version is more idiomatic:
action(req, res) catch |err| switch (err) {
    error.BrokenPipe, error.ConnectionResetByPeer => return,
    error.BodyTooBig => {
        res.status = 431;
        res.body = "Request body is too big";
    },
    else => {
        res.status = 500;
        res.body = "Internal Server Error";
    },
};
That's all quite fancy, but let's be honest, the most likely thing you're going to do in `catch` is bubble the error to the caller:
action(req, res) catch |err| return err;
This is so common that it's what `try` does. Rather than the above, we do:
try action(req, res);
This is particularly useful given that errors must be handled. Most likely you'll do so with a `try` or `catch`.
Most of the time you'll be using `try` and `catch`, but error unions are also supported by `if` and `while`, much like optional types. In the case of `while`, if the condition returns an error, the `else` clause is executed.
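A minimal sketch of both forms follows. The `Digits` type, its error names and `sumDigits` are all hypothetical, invented for this example. The `while` keeps looping as long as `next` returns a value and hands the error to its `else` clause when it doesn't.

```zig
const std = @import("std");

// Hypothetical reader: yields one digit per call, or an error.
const Digits = struct {
    src: []const u8,
    pos: usize = 0,

    fn next(self: *Digits) !u8 {
        if (self.pos >= self.src.len) return error.EndOfInput;
        const c = self.src[self.pos];
        self.pos += 1;
        if (c < '0' or c > '9') return error.NotADigit;
        return c - '0';
    }
};

// Hypothetical helper: sums digits, treating EndOfInput as a normal stop.
fn sumDigits(src: []const u8) !u32 {
    var it = Digits{ .src = src };
    var total: u32 = 0;
    while (it.next()) |d| {
        total += d;
    } else |err| {
        // Any other error is passed on to the caller.
        if (err != error.EndOfInput) return err;
    }
    return total;
}

pub fn main() void {
    // `if` unwraps the value, or captures the error in its `else` branch.
    if (sumDigits("123")) |total| {
        std.debug.print("sum: {d}\n", .{total});
    } else |err| {
        std.debug.print("error: {s}\n", .{@errorName(err)});
    }
}
```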
There is a special `anyerror` type which can hold any error. While we could define a function as returning `anyerror!TYPE` rather than `!TYPE`, the two are not equivalent. The inferred error set is created based on what the function can return. `anyerror` is the global error set, a superset of all error sets in the program. Therefore, using `anyerror` in a function signature is likely to signal that your function can return errors that, in reality, it cannot. `anyerror` is used for function parameters or struct fields that can work with any error (imagine a logging library).
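As a sketch of the logging case, the hypothetical `errorLabel` function below takes an `anyerror` parameter, so callers can pass errors from any error set. `@errorName` is the builtin that returns an error's name as a string.

```zig
const std = @import("std");

// Hypothetical logging helper: accepts an error from any error set.
fn errorLabel(err: anyerror) []const u8 {
    return @errorName(err);
}

pub fn main() void {
    // These two errors come from unrelated (implicit) error sets.
    std.debug.print("{s}\n", .{errorLabel(error.AccessDenied)});
    std.debug.print("{s}\n", .{errorLabel(error.BodyTooBig)});
}
```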
It's not uncommon for a function to return an error union optional type. With an inferred error set, this looks like:
pub fn loadLast() !?Save {
    return null;
}
There are different ways to consume such functions, but the most compact is by using `try` to unwrap our error and then `orelse` to unwrap the optional. Here's a working skeleton:
const std = @import("std");

// main uses try, so it must be able to return an error
pub fn main() !void {
    const save = (try Save.loadLast()) orelse Save.blank();
    std.debug.print("{any}\n", .{save});
}

pub const Save = struct {
    lives: u8,
    level: u16,

    pub fn loadLast() !?Save {
        return null;
    }

    pub fn blank() Save {
        return .{
            .lives = 3,
            .level = 1,
        };
    }
};
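One of the less compact alternatives is to nest two `if` statements: the outer one unwraps the error union, the inner one unwraps the optional. In this sketch, `loadLast` takes a hypothetical `corrupt` flag purely so the example has a way to fail, and `startingLives` is a made-up caller.

```zig
const std = @import("std");

const Save = struct {
    lives: u8,
    level: u16,

    // Hypothetical variant: the `corrupt` flag exists only to trigger an error.
    fn loadLast(corrupt: bool) !?Save {
        if (corrupt) return error.CorruptSave;
        return null;
    }

    fn blank() Save {
        return .{ .lives = 3, .level = 1 };
    }
};

// Hypothetical caller: handles the error, the null and the value separately.
fn startingLives(corrupt: bool) u8 {
    if (Save.loadLast(corrupt)) |maybe_save| {
        if (maybe_save) |save| return save.lives;
        return Save.blank().lives;
    } else |err| {
        std.debug.print("could not load: {s}\n", .{@errorName(err)});
        return 0;
    }
}

pub fn main() void {
    std.debug.print("{d}\n", .{startingLives(false)});
    std.debug.print("{d}\n", .{startingLives(true)});
}
```

This is more verbose than `(try ...) orelse ...`, but it lets the caller react to the error instead of passing it up.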
While Zig has more depth, and some of the language features have greater capabilities, what we've seen in these first two parts is a significant part of the language. It will serve as a foundation, allowing us to explore more complex topics without getting too distracted by syntax.