Comparing Strings as Integers with @bitCast

Feb 20, 2025

In the last few blog posts, we looked at different ways to compare strings in Zig. A few posts back, we introduced Zig's @bitCast. As a quick recap, @bitCast lets us reinterpret the bits of a value as a different type of the same size. For example, the following prints 1067282596:

const std = @import("std");
pub fn main() !void {
    const f: f32 = 1.23;
    const n: u32 = @bitCast(f);
    std.debug.print("{d}\n", .{n});
}

What's happening here is that Zig represents the 32-bit float value of 1.23 as: [4]u8{164, 112, 157, 63} (on a little-endian machine, which is what most of us run). This is also how Zig represents the 32-bit unsigned integer value of 1067282596. Data is just bytes; it's the type system - the compiler's knowledge of what data is what type - that controls how that data is interpreted and manipulated.
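To make the "data is just bytes" point concrete, here is a minimal sketch that looks at the bytes directly. The helper name floatBytes is made up for this illustration, and the byte order printed depends on the machine's endianness.

```zig
const std = @import("std");

// Hypothetical helper: view the four bytes that back an f32.
fn floatBytes(f: f32) [4]u8 {
    return @bitCast(f);
}

pub fn main() void {
    const f: f32 = 1.23;
    const bytes = floatBytes(f);
    // Reinterpret those same four bytes as a u32. No conversion happens;
    // only the type the compiler attaches to the bytes changes.
    const n: u32 = @bitCast(bytes);
    std.debug.print("{any} -> {d}\n", .{ bytes, n });
}
```

Going from f32 to [4]u8 and then to u32 lands on the same integer as casting the float directly, because the bytes never change.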

It might seem like there's something special about bitcasting from a float to an integer; they're both numbers after all. But you can @bitCast between any two equivalently sized types. Can you guess what this prints?

const std = @import("std");
pub fn main() !void {
    const data = [_]u8{3, 0, 0, 0};
    const x: i32 = @bitCast(data);
    std.debug.print("{d}\n", .{x});
}

The answer is 3. Think about the above snippet a bit more. We're taking an array of bytes and telling the compiler to treat it like an integer. If we made data equal to [_]u8{'b', 'l', 'u', 'e'}, it would still work (and print 1702194274). We're slowly heading towards being able to compare strings as if they were integers.
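As a small preview of where this is going, here is a sketch that compares two 4-byte strings with a single integer comparison. The function name eql4 is invented for this example and isn't part of the standard library.

```zig
const std = @import("std");

// Hypothetical helper: two 4-byte arrays are equal exactly when the
// u32 values made from their bytes are equal.
fn eql4(a: [4]u8, b: [4]u8) bool {
    return @as(u32, @bitCast(a)) == @as(u32, @bitCast(b));
}

pub fn main() void {
    const blue = [_]u8{ 'b', 'l', 'u', 'e' };
    const also_blue = [_]u8{ 'b', 'l', 'u', 'e' };
    const bold = [_]u8{ 'b', 'o', 'l', 'd' };
    std.debug.print("{} {}\n", .{ eql4(blue, also_blue), eql4(blue, bold) });
}
```

Both sides are cast the same way, so the comparison doesn't depend on byte order.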

From the last post, we could use multiple std.mem.eql calls or, more simply, std.meta.stringToEnum to complete the following function:

fn parseMethod(value: []const u8) ?Method {
    // ...
}

const Method = enum {
    get,
    put,
    post,
    head,
};

We can also use @bitCast. Let's take it step-by-step.

The first thing we'll need to do is switch on value.len. This is necessary because the three-byte "GET" will need to be @bitCast to a u24, whereas the four-byte "POST" needs to be @bitCast to a u32:

fn parseMethod(value: []const u8) ?Method {
    switch (value.len) {
        3 => switch (@as(u24, @bitCast(value[0..3]))) {
            // TODO
            else => {},
        },
        4 => switch (@as(u32, @bitCast(value[0..4]))) {
            // TODO
            else => {},
        },
        else => {},
    }

    return null;
}

If you try to run this code, you'll get a compilation error: cannot @bitCast from '*const [3]u8'. @bitCast works on actual bits, but when we slice our []const u8 with a compile-time known range ([0..3]), we get a pointer to an array. We can't @bitCast a pointer, we can only @bitCast actual bits of data. For this to work, we need to dereference the pointer, i.e. use: value[0..3].*. This will turn our *const [3]u8 into a const [3]u8.

fn parseMethod(value: []const u8) ?Method {
    switch (value.len) {
        // changed: we now dereference the value (.*)
        3 => switch (@as(u24, @bitCast(value[0..3].*))) {
            // TODO
            else => {},
        },
        // changed: we now dereference the value (.*)
        4 => switch (@as(u32, @bitCast(value[0..4].*))) {
            // TODO
            else => {},
        },
        else => {},
    }

    return null;
}
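If the pointer-versus-array distinction feels abstract, this sketch checks the types at compile time. The helper firstThree is a name I'm making up here; the rest uses only builtins already seen above.

```zig
const std = @import("std");

// Hypothetical helper: copy the first three bytes of a slice into an array.
fn firstThree(value: []const u8) [3]u8 {
    std.debug.assert(value.len >= 3);
    // value[0..3] is a *const [3]u8; .* dereferences it into a [3]u8.
    return value[0..3].*;
}

pub fn main() void {
    const value: []const u8 = "GET";
    const ptr = value[0..3];
    const arr = value[0..3].*;

    // Slicing with comptime-known bounds yields a pointer to an array...
    comptime std.debug.assert(@TypeOf(ptr) == *const [3]u8);
    // ...and dereferencing that pointer yields the array itself.
    comptime std.debug.assert(@TypeOf(arr) == [3]u8);

    // Only the array, not the pointer, can be bitcast to a u24.
    const n: u24 = @bitCast(firstThree(value));
    std.debug.print("{s} vs {s}: {d}\n", .{ @typeName(@TypeOf(ptr)), @typeName(@TypeOf(arr)), n });
}
```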

Also, you might have noticed the @as(u24, ...) and @as(u32, ...). @bitCast, like most of Zig's builtin functions, infers its return type. When we're assigning the result of a @bitCast to a variable of a known type, i.e: const x: i32 = @bitCast(data);, the return type of i32 is inferred. In the above switch, we aren't assigning the result to a variable. We have to use @as(u24, ...) in order for @bitCast to know what it should be casting to (i.e. what its return type should be).
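Here is a short sketch of the places @bitCast can pick up its result type from. The function names asI32 and printInt are made up for illustration.

```zig
const std = @import("std");

// The declared return type tells @bitCast what to produce.
fn asI32(data: [4]u8) i32 {
    return @bitCast(data);
}

// A parameter type does the same for an argument expression.
fn printInt(n: i32) void {
    std.debug.print("{d}\n", .{n});
}

pub fn main() void {
    const data = [_]u8{ 3, 0, 0, 0 };

    const a: i32 = @bitCast(data); // from the variable's type annotation
    const b = @as(i32, @bitCast(data)); // from an explicit @as
    const c = asI32(data); // from the function's return type
    printInt(@bitCast(data)); // from the parameter type

    std.debug.print("{d} {d} {d}\n", .{ a, b, c });
}
```

A switch operand has none of these hints, which is why the @as wrapper is needed there.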

The last thing we need to do is fill our switch blocks. Hopefully it's obvious that we can't just do:

3 => switch (@as(u24, @bitCast(value[0..3].*))) {
    "GET" => return .get,
    "PUT" => return .put,
    else => {},
},
...

But you might be thinking that, while ugly, something like this might work:

3 => switch (@as(u24, @bitCast(value[0..3].*))) {
    @as(u24, @bitCast("GET".*)) => return .get,
    @as(u24, @bitCast("PUT".*)) => return .put,
    else => {},
},
...

Because "GET" and "PUT" are string literals, they're null-terminated and of type *const [3:0]u8. When we dereference them, we get a const [3:0]u8. It's close, but it means that the value is 4 bytes ([4]u8{'G', 'E', 'T', 0}) and thus cannot be @bitCast into a u24. If we first coerce the literal to a []const u8, the sentinel is no longer part of the type, so slicing and dereferencing gives us a plain [3]u8. This is ugly, but it works:

fn parseMethod(value: []const u8) ?Method {
    switch (value.len) {
        3 => switch (@as(u24, @bitCast(value[0..3].*))) {
            @as(u24, @bitCast(@as([]const u8, "GET")[0..3].*)) => return .get,
            @as(u24, @bitCast(@as([]const u8, "PUT")[0..3].*)) => return .put,
            else => {},
        },
        4 => switch (@as(u32, @bitCast(value[0..4].*))) {
            @as(u32, @bitCast(@as([]const u8, "HEAD")[0..4].*)) => return .head,
            @as(u32, @bitCast(@as([]const u8, "POST")[0..4].*)) => return .post,
            else => {},
        },
        else => {},
    }
    return null;
}

That's a mouthful, so we can add a small function to help:

fn parseMethod(value: []const u8) ?Method {
    switch (value.len) {
        3 => switch (@as(u24, @bitCast(value[0..3].*))) {
            asUint(u24, "GET") => return .get,
            asUint(u24, "PUT") => return .put,
            else => {},
        },
        4 => switch (@as(u32, @bitCast(value[0..4].*))) {
            asUint(u32, "HEAD") => return .head,
            asUint(u32, "POST") => return .post,
            else => {},
        },
        else => {},
    }
    return null;
}

pub fn asUint(comptime T: type, comptime string: []const u8) T {
    return @bitCast(string[0..string.len].*);
}

Like the verbose version, the trick is to cast our null-terminated string literal into a string slice, []const u8. By passing it through the asUint function, we get this without needing to add the explicit @as([]const u8).
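To see the sentinel at work, this sketch asserts the types involved at compile time. The helper literalAsU24 is a hypothetical name; everything else is plain builtins.

```zig
const std = @import("std");

// Hypothetical helper: coerce the literal to a slice first, then bitcast.
fn literalAsU24() u24 {
    const slice: []const u8 = "GET";
    return @bitCast(slice[0..3].*);
}

pub fn main() void {
    const literal = "GET";
    const slice: []const u8 = literal;

    // The literal is a pointer to a sentinel-terminated array...
    comptime std.debug.assert(@TypeOf(literal) == *const [3:0]u8);
    // ...whose storage includes the trailing 0, so it's 4 bytes, not 3.
    comptime std.debug.assert(@sizeOf(@TypeOf(literal.*)) == 4);
    // Going through a []const u8 drops the sentinel from the type.
    comptime std.debug.assert(@TypeOf(slice[0..3].*) == [3]u8);

    std.debug.print("{d}\n", .{literalAsU24()});
}
```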

There is a more advanced version of asUint which doesn't take the uint type parameter (T). If you think about it, the uint type can be inferred from the string's length:

pub fn asUint(comptime string: []const u8) @Type(.{
    .int = .{
        // bits, not bytes, hence * 8
        .bits = string.len * 8,
        .signedness = .unsigned,
    },
}) {
    return @bitCast(string[0..string.len].*);
}

This allows us to call it with a single parameter: asUint("GET"). This might be your first time seeing such a return type. The @Type builtin is the opposite of @typeInfo. The latter takes a type and returns information about it in the shape of a std.builtin.Type union, whereas @Type takes a std.builtin.Type and returns an actual usable type. One of these days I'll find the courage to blog about std.builtin.Type!
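A quick sketch of that round trip, assuming the Zig 0.14-style lowercase .int field used in this post (older releases spell it .Int). UintOfBytes is a name invented for this example.

```zig
const std = @import("std");

// Hypothetical helper: build an unsigned integer type that is n bytes wide.
fn UintOfBytes(comptime n: usize) type {
    return @Type(.{
        .int = .{ .bits = n * 8, .signedness = .unsigned },
    });
}

pub fn main() void {
    // @typeInfo: type -> std.builtin.Type
    const info = @typeInfo(u24);
    std.debug.print("bits={d} signedness={s}\n", .{ info.int.bits, @tagName(info.int.signedness) });

    // @Type: std.builtin.Type -> type, which gets us back where we started.
    comptime std.debug.assert(@Type(info) == u24);
    comptime std.debug.assert(UintOfBytes(3) == u24);
    comptime std.debug.assert(UintOfBytes(4) == u32);

    // std.meta.Int builds the same kind of type from a signedness and bit count.
    comptime std.debug.assert(std.meta.Int(.unsigned, 24) == u24);
}
```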

As a final note, some people dislike the look of this sort of return type and would rather encapsulate the logic in its own function. This is the same:

pub fn asUint(comptime string: []const u8) AsUintReturn(string) {
    return @bitCast(string[0..string.len].*);
}

// Remember that, in Zig, by convention, a function should be
// PascalCase if it returns a type (because types are PascalCase).
fn AsUintReturn(comptime string: []const u8) type {
    return @Type(.{
        .int = .{
            // bits, not bytes, hence * 8
            .bits = string.len * 8,
            .signedness = .unsigned,
        },
    });
}
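Putting the pieces together, this is roughly what a complete, runnable file could look like. It only assembles the snippets above (the single-parameter asUint, assuming Zig 0.14-style @Type fields); the main function and its sample inputs are my own addition for demonstration.

```zig
const std = @import("std");

const Method = enum { get, put, post, head };

fn AsUintReturn(comptime string: []const u8) type {
    return @Type(.{
        .int = .{ .bits = string.len * 8, .signedness = .unsigned },
    });
}

pub fn asUint(comptime string: []const u8) AsUintReturn(string) {
    return @bitCast(string[0..string.len].*);
}

fn parseMethod(value: []const u8) ?Method {
    switch (value.len) {
        3 => switch (@as(u24, @bitCast(value[0..3].*))) {
            asUint("GET") => return .get,
            asUint("PUT") => return .put,
            else => {},
        },
        4 => switch (@as(u32, @bitCast(value[0..4].*))) {
            asUint("HEAD") => return .head,
            asUint("POST") => return .post,
            else => {},
        },
        else => {},
    }
    return null;
}

pub fn main() void {
    // Sample inputs chosen for illustration: a hit for each method,
    // a wrong-case miss and a wrong-length miss.
    const inputs = [_][]const u8{ "GET", "PUT", "HEAD", "POST", "get", "DELETE" };
    for (inputs) |input| {
        if (parseMethod(input)) |method| {
            std.debug.print("{s} -> {s}\n", .{ input, @tagName(method) });
        } else {
            std.debug.print("{s} -> null\n", .{input});
        }
    }
}
```

Note that the match is exact and case-sensitive: "get" has the right length but a different bit pattern, so it falls through to null.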

Conclusion

Of the three approaches, this is the least readable and least approachable. Is it worth it? It depends on your input and the values you're comparing against. In my benchmarks, using @bitCast performs roughly the same as std.meta.stringToEnum. But there are some cases where @bitCast can outperform std.meta.stringToEnum by as much as 50%. Perhaps that's the real value of this approach: the performance is less dependent on the input or the values being matched against.