Reading a JSON config in Zig
This post was updated on October 30 2023, to reflect changes to std.json currently on the master branch.
A while ago, I wrote a Websocket Server implementation in Zig. I did it just for fun, which means I didn't have to worry about the boilerplate things that go into creating an actual production application.
More recently, I've jumped back into Zig for something a little more serious (but not really). As part of my learning journey, I wanted to add more polish. One of the things I wanted to do was add configuration options, which meant reading a config.json file.
This seemingly simple task presented me with enough challenges that I thought it might be worth a blog post. The difficulties I ran into are threefold. First, Zig's documentation is, generously, experimental. Second, Zig changes enough from version to version that whatever external documentation I did find didn't work as-is (I'm writing this using 0.11-dev). Finally, decades of being coddled by garbage collectors have dulled my education.
Here's the skeleton that we'll start with:
const std = @import("std");

pub fn main() !void {
    const config = try readConfig("config.json");
    std.debug.print("config.root: {s}\n", .{config.root});
}

fn readConfig(path: []const u8) !Config {
    //TODO: all our logic here!
    // since we're not using path yet, we need this to satisfy the compiler
    _ = path;
    return error.NotImplemented;
}

const Config = struct {
    root: []const u8,
};
It would be a nice addition to be able to specify the path/filename of the configuration via a command line option, such as --config config.json, but Zig has no built-in flag parser, so we'd have to fiddle around with std.os.argv or use a third party library.
Also, as a very quick summary of Zig's error handling, the try function(...) syntax is like Go's if err != nil { return err }. Note that our main function returns !void and our readConfig returns !Config. This is called an error union type and it means our functions return either an error or the specified type. We can list an explicit error set (like an enum), in the form of ErrorType!Type. If we want to support any error, we can use the special anyerror type. Finally, we can have Zig infer the error set via !Type which, practically speaking, is similar to using anyerror, but is actually just a shorthand form of having an explicit error set (where the error set is automatically inferred at compile-time).
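To make the three forms concrete, here is a small sketch. The names PortError, parsePort and portOrFail are made up for illustration and have nothing to do with our config code; the point is only to show an explicit error set, an inferred one, and the difference between try and catch.

```zig
const std = @import("std");

// An explicit error set: the only errors parsePort can return.
// (PortError and parsePort are hypothetical names for this sketch.)
const PortError = error{ Empty, NotANumber };

// Explicit error set on the left of the `!`.
fn parsePort(text: []const u8) PortError!u16 {
    if (text.len == 0) return error.Empty;
    // map whatever parseInt fails with onto our own error set
    return std.fmt.parseInt(u16, text, 10) catch return error.NotANumber;
}

// Inferred error set: the compiler works out which errors can come back.
fn portOrFail(text: []const u8) !u16 {
    // `try` hands an error back to our caller, otherwise unwraps the u16
    const port = try parsePort(text);
    return port;
}

pub fn main() !void {
    const port = try portOrFail("8080");
    std.debug.print("port: {d}\n", .{port});

    // `catch` handles the error here instead of propagating it
    const fallback = parsePort("abc") catch 80;
    std.debug.print("fallback: {d}\n", .{fallback});
}
```

Where try passes the error up the call stack, catch lets us deal with it on the spot, here by supplying a default.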
Reading a File
The first thing we need to do is read the contents of our file (after which we can parse it). We could explore the available API, but it should be somewhat obvious that this is going to require allocating memory. One of Zig's key features is explicit memory allocation. This means that any part of the standard library (and hopefully 3rd party libraries) that might allocate memory takes a parameter of type std.mem.Allocator. In other words, instead of having a readFile(path) ![]u8 function which would allocate memory internally using, say, malloc, we'd expect to find a readFile(allocator, path) ![]u8 which would use the supplied allocator.
Hopefully this will make more sense once we look at a concrete example. For now, we'll create an allocator in main and pass it to readConfig:
const std = @import("std");
const Allocator = std.mem.Allocator;

pub fn main() !void {
    // we have a few choices, but this is a safe default
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    const config = try readConfig(allocator, "config.json");
    std.debug.print("config.root: {s}\n", .{config.root});
}

fn readConfig(allocator: Allocator, path: []const u8) !Config {
    //TODO: all our logic here!
    // since we're not using path or allocator yet, we need this to satisfy the compiler
    _ = path;
    _ = allocator;
    return error.NotImplemented;
}
Now to actually read the file, there's a Dir type in the standard library that exposes a readFile and a readFileAlloc method. I find this API a little confusing (though I'm sure there's a reason for it), as these aren't static functions but members of the Dir type. So we need a Dir. Thankfully, we can easily get the Dir of the current working directory using std.fs.cwd().
If we use readFileAlloc, our code looks like:
fn readConfig(allocator: Allocator, path: []const u8) !Config {
    const data = try std.fs.cwd().readFileAlloc(allocator, path, 512);
    defer allocator.free(data);
    ...
}
The last parameter, 512, is the maximum size to allocate/read. Our config is very small, so we're limiting this to 512 bytes. Importantly, this function allocates and returns memory using our provided allocator. We're responsible for freeing this memory, which we'll do when the function returns, using defer.
Alternatively, we could use readFile, which takes and fills in a []u8 instead of an allocator. Using this function, our code looks like:
fn readConfig(allocator: Allocator, path: []const u8) !Config {
    const buffer = try allocator.alloc(u8, 512);
    // register the free right away, so a failing readFile doesn't leak buffer
    defer allocator.free(buffer);
    const data = try std.fs.cwd().readFile(path, buffer);
    ...
}
In the above code, data is a slice of buffer which is fitted to the size of the file. In our specific case, it doesn't matter if you use readFileAlloc or readFile. I find the first one simpler. The benefit of readFile, though, is the ability to re-use buffers.
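As a rough sketch of what re-using a buffer could look like, the code below reads two files into the same stack buffer, so nothing is allocated at all. The fileSize helper and the second filename, config.local.json, are assumptions made up for this example, and the readFile call uses the same signature as above.

```zig
const std = @import("std");

// Hypothetical helper: reads `path` into the caller's buffer and returns
// how many bytes were read. No allocation happens here.
fn fileSize(dir: std.fs.Dir, path: []const u8, buffer: []u8) !usize {
    const data = try dir.readFile(path, buffer);
    return data.len;
}

pub fn main() !void {
    // one buffer, shared by every read
    var buffer: [512]u8 = undefined;

    // config.local.json is a made-up second file for the sake of the example
    const paths = [_][]const u8{ "config.json", "config.local.json" };
    for (paths) |path| {
        const size = fileSize(std.fs.cwd(), path, &buffer) catch |err| {
            std.debug.print("{s}: {s}\n", .{ path, @errorName(err) });
            continue;
        };
        std.debug.print("{s}: {d} bytes\n", .{ path, size });
    }
}
```

The catch is that each read overwrites the previous contents, so anything we want to keep has to be used or copied before the next readFile.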
It turns out that reading a file is straightforward. However, if you're coming from a garbage collected mindset, it's hopefully apparent that [manual] memory management is a significant responsibility. If we did forget to free data, and we did write tests for readConfig, Zig's test runner would detect this and automatically fail. The output would look something like:
zig build test
Test [3/10] test.readConfig... [gpa] (err): memory address 0x1044f8000 leaked:
/opt/zig/lib/std/array_list.zig:391:67: 0x10436fa5f in ensureTotalCapacityPrecise (test)
const new_memory = try self.allocator.alignedAlloc(T, alignment, new_capacity);
^
/opt/zig/lib/std/array_list.zig:367:51: 0x104366c2f in ensureTotalCapacity (test)
return self.ensureTotalCapacityPrecise(better_capacity);
^
/opt/zig/lib/std/io/reader.zig:74:56: 0x104366607 in readAllArrayListAligned__anon_3814 (test)
try array_list.ensureTotalCapacity(math.min(max_append_size, 4096));
^
/opt/zig/lib/std/fs/file.zig:959:46: 0x10436637b in readToEndAllocOptions__anon_3649 (test)
self.reader().readAllArrayListAligned(alignment, &array_list, max_bytes) catch |err| switch (err) {
^
/opt/zig/lib/std/fs.zig:2058:42: 0x104365d0f in readFileAllocOptions__anon_3466 (test)
return file.readToEndAllocOptions(allocator, max_bytes, stat_size, alignment, optional_sentinel);
^
/opt/zig/lib/std/fs.zig:2033:41: 0x104365a0f in readFileAlloc (test)
return self.readFileAllocOptions(allocator, file_path, max_bytes, null, @alignOf(u8), null);
^
/code/demo.zig:13:41: 0x10436745f in readConfig (test)
const data = try std.fs.cwd().readFileAlloc(allocator, path, 512);
^
/code/demo.zig:30:13: 0x104369547 in test.readConfig (test)
const config = try readConfig(allocator, "config.json");
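For reference, a test that can produce this kind of report might look like the sketch below. It assumes the test uses std.testing.allocator, which is what tracks allocations and reports leaks when a test finishes. The loadText function is a hypothetical stand-in for readConfig so that the example doesn't depend on a file being present.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

// Hypothetical stand-in for readConfig: returns memory the caller must free.
fn loadText(allocator: Allocator) ![]u8 {
    return allocator.dupe(u8, "{\"root\": \"/tmp\"}");
}

test "caller frees what loadText allocates" {
    // the testing allocator tracks every allocation made during the test
    const allocator = std.testing.allocator;
    const data = try loadText(allocator);
    // removing this defer should make `zig test` report a leak and fail
    defer allocator.free(data);
    try std.testing.expectEqualStrings("{\"root\": \"/tmp\"}", data);
}
```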
Parsing JSON
We've read our file and have a string, which in Zig is a []u8. Populating a Config from that string is a job for std.json.parseFromSlice:
fn readConfig(allocator: Allocator, path: []const u8) !std.json.Parsed(Config) {
    const data = try std.fs.cwd().readFileAlloc(allocator, path, 512);
    defer allocator.free(data);
    return std.json.parseFromSlice(Config, allocator, data, .{.allocate = .alloc_always});
}
Notice that we're now returning a std.json.Parsed(Config). If you think about turning a JSON string into an object, you know that there'll have to be some allocations. At a minimum, we need to allocate an instance of our Config. That's why we pass our allocator. The Parsed structure we get back contains a value which is our Config as well as an arena allocator. This is convenient as it allows us to easily manage the allocated memory by tying the lifetime of the allocation to our Config within the Parsed structure:
pub fn main() !void {
    // we have a few choices, but this is a safe default
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    const parsed = try readConfig(allocator, "config.json");
    // frees everything the parser allocated for our Config
    defer parsed.deinit();
    const config = parsed.value;
    ...
}
Also take note of the last parameter we passed to parseFromSlice. This is a set of options that configures various aspects of the JSON parsing. We're specifying .allocate = .alloc_always, which tells the function to copy any strings from our input and thus have them owned by the Parsed object. The default is alloc_if_needed. Using this option would be more efficient since our parsed Config would reference our input data, rather than having to make a copy, but then we'd have to extend data's lifetime to match our return value. If you're creating many objects with large text values, it's likely something you'd want to consider.
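One way to extend data's lifetime is to return it alongside the parsed value and free both together. The sketch below is only one possible shape for this: LoadedConfig and parseOwned are hypothetical names, not anything provided by std.json.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

const Config = struct {
    root: []const u8,
};

// Hypothetical wrapper: keeps the raw file contents alive for as long as
// the parsed value, which may point into them.
const LoadedConfig = struct {
    allocator: Allocator,
    data: []const u8,
    parsed: std.json.Parsed(Config),

    fn deinit(self: LoadedConfig) void {
        self.parsed.deinit();
        self.allocator.free(self.data);
    }
};

// Takes ownership of `data`, which must have been allocated with `allocator`.
fn parseOwned(allocator: Allocator, data: []const u8) !LoadedConfig {
    // if parsing fails, nobody else will free data, so we do it here
    errdefer allocator.free(data);
    const parsed = try std.json.parseFromSlice(Config, allocator, data, .{ .allocate = .alloc_if_needed });
    return LoadedConfig{ .allocator = allocator, .data = data, .parsed = parsed };
}

fn readConfig(allocator: Allocator, path: []const u8) !LoadedConfig {
    const data = try std.fs.cwd().readFileAlloc(allocator, path, 512);
    return parseOwned(allocator, data);
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    const loaded = try readConfig(allocator, "config.json");
    // frees the parsed value and the file contents together
    defer loaded.deinit();
    std.debug.print("config.root: {s}\n", .{loaded.parsed.value.root});
}
```

For a config this small the copy made by alloc_always is negligible, so the extra bookkeeping is probably only worth it for large inputs.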
Default Values
With the above code, std.json.parseFromSlice will fail with an error.MissingField if our JSON file doesn't have a root field. But what if we wanted to make it optional with a default value? There are two options. The first is to specify the default value in our structure:
const Config = struct {
    root: []const u8 = "/tmp/demo",
};

Another option is to use an Optional type and default to null:

const Config = struct {
    root: ?[]const u8 = null,
};

When we want to use the value, we can use orelse to set the default:
lmdb.open(config.root orelse "/tmp/db")
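Both approaches can be tried out by parsing an empty JSON object. In this sketch the cache field is a hypothetical second setting, added only so that the two styles can sit side by side in one struct.

```zig
const std = @import("std");

const Config = struct {
    // default baked into the struct
    root: []const u8 = "/tmp/demo",
    // hypothetical optional field; the default is chosen where it's used
    cache: ?[]const u8 = null,
};

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();

    // neither field is present in the input
    const parsed = try std.json.parseFromSlice(Config, allocator, "{}", .{});
    defer parsed.deinit();

    const config = parsed.value;
    std.debug.print("root: {s}\n", .{config.root});
    std.debug.print("cache: {s}\n", .{config.cache orelse "/tmp/cache"});
}
```

The struct default keeps the fallback in one place, while the optional lets the calling code tell "not set" apart from an explicit value.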
Conclusion
The complete code to read a JSON config file into a config structure is:
const std = @import("std");
const Allocator = std.mem.Allocator;

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    const parsed = try readConfig(allocator, "config.json");
    defer parsed.deinit();
    const config = parsed.value;
    std.debug.print("config.root: {s}\n", .{config.root});
}

fn readConfig(allocator: Allocator, path: []const u8) !std.json.Parsed(Config) {
    // 512 is the maximum size to read, if your config is larger
    // you should make this bigger.
    const data = try std.fs.cwd().readFileAlloc(allocator, path, 512);
    defer allocator.free(data);
    return std.json.parseFromSlice(Config, allocator, data, .{.allocate = .alloc_always});
}

const Config = struct {
    root: []const u8,
};
It would be nice to parse our JSON in a single line, and I don't get why readFileAlloc is a member of Dir (but I bet there's a reason). Overall though, it's pretty painless. Hopefully this helps someone :)