Zig's @memcpy, copyForwards and copyBackwards
Sep 03, 2024
If you've used Zig for a bit, you've probably come across the @memcpy builtin. It copies bytes from one region of memory to another. For example, if we wanted to concat two arrays, we could write a little helper:
const std = @import("std");

fn concat(comptime T: type, allocator: std.mem.Allocator, arr1: []const T, arr2: []const T) ![]T {
    // const, since the slice itself is never reassigned (only its elements are written)
    const combined = try allocator.alloc(T, arr1.len + arr2.len);
    @memcpy(combined[0..arr1.len], arr1);
    @memcpy(combined[arr1.len..], arr2);
    return combined;
}
You might have also come across std.mem.copyForwards and std.mem.copyBackwards. All of these do the same thing, but have slightly different constraints on their parameters. Specifically, each restricts if or how the two parameters can overlap. At first glance, having three ways to do the same thing might seem like overkill. If we look at the concat example above, combined is a newly allocated chunk of memory: it can never overlap with the memory of arr1 and arr2.
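To pin down what "overlap" means here, this is a minimal sketch of a check for whether two byte slices share any memory. The overlaps helper is hypothetical (my own name, not something from the standard library); it just compares address ranges.

```zig
const std = @import("std");

/// Hypothetical helper: true if the two byte slices share at least one byte.
fn overlaps(a: []const u8, b: []const u8) bool {
    if (a.len == 0 or b.len == 0) return false;
    const a_start = @intFromPtr(a.ptr);
    const b_start = @intFromPtr(b.ptr);
    // Two ranges overlap when each one starts before the other one ends.
    return a_start < b_start + b.len and b_start < a_start + a.len;
}

pub fn main() void {
    const buf = [_]u8{ 1, 2, 3, 4 };
    const other = [_]u8{ 5, 6 };
    std.debug.print("{}\n", .{overlaps(buf[0..2], buf[1..3])}); // true: both cover buf[1]
    std.debug.print("{}\n", .{overlaps(buf[0..2], buf[2..4])}); // false: adjacent, not shared
    std.debug.print("{}\n", .{overlaps(buf[0..2], &other)}); // false: different arrays
}
```

In concat, the equivalent check between combined and either input would always be false, which is why @memcpy is safe there.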
While you're probably using @memcpy in a way similar to concat, there are equally legitimate cases where the two parameters could overlap. For example, in pg.zig we read messages from the PostgreSQL server into a static buffer. For efficiency, we read as much data as possible - ideally, we fill our buffer. As a result, our buffer could end up holding partial payloads. As a made-up example, consider:
var buffer: [16]u8 = undefined;
const n = try socket.read(&buffer);
Assuming this fills up our buffer (i.e. n == 16), buffer might look like:
'D', 0, 4, 'G', 'O', 'K', 'U', // our first row
'D', 0, 9, 'O', 'V', 'E', 'R', '9', '0', // the start of our next row
To complete our 2nd row, we need to read more data from the socket, but our buffer is full. We could dynamically allocate a larger buffer. However, since data for the 1st row is no longer needed, our buffer is large enough as long as we move our partially-read 2nd row to the beginning. Unlike our concat, which was copying into a totally new memory location, we now need to copy from buffer into buffer. Is that ok? Let's extract this specific example and try:
const std = @import("std");

pub fn main() !void {
    var buf = [16]u8{
        'D', 0, 4, 'G', 'O', 'K', 'U',
        'D', 0, 9, 'O', 'V', 'E', 'R', '9', '0',
    };
    @memcpy(buf[0..9], buf[7..16]);
    std.debug.print("{any}\n", .{&buf});
}
If you try to run the above, you'll get a runtime panic: panic: @memcpy arguments alias. To solve this, we need to use std.mem.copyForwards. If I were to implement my own memcpy functionality, I'd likely end up with something similar to copyForwards:
fn cp(comptime T: type, dest: []T, src: []const T) void {
    std.debug.assert(dest.len == src.len);
    // copy front to back, one element at a time
    for (0..src.len) |i| {
        dest[i] = src[i];
    }
}
But, as a general solution, this code has bugs. Consider this usage:
pub fn main() !void {
    var buf = [4]u8{ 1, 2, 3, 4 };
    cp(u8, buf[1..3], buf[0..2]);
    std.debug.print("{any}", .{&buf});
}
We're saying copy the values 1, 2 over 2, 3, so presumably we'd expect the result to be 1, 1, 2, 4. But, with our simple cp function, we'd get 1, 1, 1, 4. The issue is that our cp function overwrites part of src before it's copied into dest. In this case, we can solve the issue by copying backwards. In other words, instead of:
buf[1] = buf[0];
buf[2] = buf[1];
We want to do:
buf[2] = buf[1];
buf[1] = buf[0];
You can see why we need both a copyForwards and a copyBackwards. In some cases we need to copy src front to back and in other cases we need to copy it back to front. It depends on which part (the beginning or the end) of the buffers overlap. In many other cases, like concat above, where there is no overlap, an implementation is free to optimize the copy operation - perhaps copying multiple bytes using a single operation.
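Putting the two directions side by side, here is a small sketch that applies std.mem.copyForwards to the buffer-shifting example and std.mem.copyBackwards to the 1, 2, 3, 4 example. The variable names are only for illustration.

```zig
const std = @import("std");

pub fn main() void {
    // Shift left: dest starts before src, so copy front to back.
    var buf = [16]u8{
        'D', 0, 4, 'G', 'O', 'K', 'U',
        'D', 0, 9, 'O', 'V', 'E', 'R', '9', '0',
    };
    std.mem.copyForwards(u8, buf[0..9], buf[7..16]);
    std.debug.print("{s}\n", .{buf[3..9]}); // OVER90

    // Shift right: dest starts after src, so copy back to front.
    var nums = [4]u8{ 1, 2, 3, 4 };
    std.mem.copyBackwards(u8, nums[1..3], nums[0..2]);
    // nums now holds the values 1, 1, 2, 4
    std.debug.print("{any}\n", .{nums});
}
```

The rule of thumb: when moving data toward the start of a buffer, copy forwards; when moving it toward the end, copy backwards.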
Aliasing
In many languages, including Zig, you're allowed to have multiple variables reference the same memory. As we saw above, aliasing can impact the correctness of code. Because of aliasing, @memcpy crashes on overlapping memory and copyForwards and copyBackwards can give unexpected results.
While Zig documents the behavior of all three functions, the issue is pervasive. This code also crashes:
const std = @import("std");

pub fn main() !void {
    var buf: [10]u8 = undefined;
    const prefix = try std.fmt.bufPrint(&buf, "{s}", .{"over"});
    // prefix is a slice of buf, so this asks bufPrint to copy buf into itself
    const warning = try std.fmt.bufPrint(&buf, "{s}{d}!", .{ prefix, 9000 });
    std.debug.print("{s}\n", .{warning});
}
Because bufPrint internally calls @memcpy, the second time we call it we get a panic. This is because we're trying to copy prefix (which references buf) into buf.
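One way to sidestep the panic, sketched here under the assumption that a second small buffer is acceptable, is to make sure the source and destination never share memory. The prefix_buf name is mine.

```zig
const std = @import("std");

pub fn main() !void {
    // Separate buffers: prefix no longer points into the destination.
    var prefix_buf: [10]u8 = undefined;
    var buf: [10]u8 = undefined;

    const prefix = try std.fmt.bufPrint(&prefix_buf, "{s}", .{"over"});
    const warning = try std.fmt.bufPrint(&buf, "{s}{d}!", .{ prefix, 9000 });
    std.debug.print("{s}\n", .{warning}); // over9000!
}
```

In a case this simple, formatting everything in a single bufPrint call would also work; the point is only that the arguments must not reference the output buffer.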
In addition to correctness, when aliasing is possible, compilers need to be careful about the assumptions they make. This often means avoiding certain optimizations. The output for this convoluted code is 30:
const std = @import("std");

pub fn main() !void {
    var incr = [_]i32{10};
    var result: i32 = 10;
    add(&result, &incr);
    std.debug.print("{d}\n", .{result});
}

fn add(a: *i32, b: []i32) void {
    a.* += b[0];
    a.* += b[0];
}
The code is essentially doing 10 + 10 + 10. If we keep add exactly as-is, but change how we call it, we'll get a different result:
pub fn main() !void {
    var incr = [_]i32{10};
    const result: *i32 = &incr[0];
    add(result, &incr);
    std.debug.print("{d}\n", .{result.*});
}
Even though both cases start with a.* == 10 and b[0] == 10, the result is now 40. Why? Because in this second version, a points to b[0]. Thus the first a.* += b[0] increments b[0] as well: it goes from 10 to 20, and the second statement then computes 20 + 20.
When I look at add, I don't think about a referencing b[0], but the compiler does. And as unlikely as that might be, the compiler has no choice but to play it safe. Instead of loading b[0] once, it has to load it twice: once for each read. More generally, aliasing means that compilers need to handle the possibility that writing to a pointer affects other variables.
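To make the difference concrete without relying on any optimizer, here is a sketch with two hand-written variants of add. The names addReload and addCached are mine: the first reads b[0] before each addition (what the compiler has to assume is necessary), the second reads it once (what the compiler would like to do).

```zig
const std = @import("std");

// Reads b[0] before each addition.
fn addReload(a: *i32, b: []i32) void {
    a.* += b[0];
    a.* += b[0];
}

// Reads b[0] once; only equivalent to addReload when a does not point into b.
fn addCached(a: *i32, b: []i32) void {
    const v = b[0];
    a.* += v;
    a.* += v;
}

pub fn main() void {
    var x = [_]i32{10};
    addReload(&x[0], &x); // a aliases b[0]
    std.debug.print("{d}\n", .{x[0]}); // 40

    var y = [_]i32{10};
    addCached(&y[0], &y); // a aliases b[0]
    std.debug.print("{d}\n", .{y[0]}); // 30
}
```

When a and b don't alias, both variants produce the same result, which is exactly why the single-load version is only a valid rewrite if the compiler can rule aliasing out.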
Like I said above, this is a convoluted example. But in the name of completeness, we can see all of this in action, albeit indirectly. If we take the complete code, save it as "test.zig" and run it in ReleaseFast (for example with zig run test.zig -O ReleaseFast), we'll get the aforementioned output of 40:
const std = @import("std");

pub fn main() !void {
    var incr = [_]i32{10};
    const result: *i32 = &incr[0];
    add(result, &incr);
    std.debug.print("{d}\n", .{result.*});
}

pub fn add(a: *i32, b: []i32) void {
    a.* += b[0];
    a.* += b[0];
}
Now if we mark one or both parameters with noalias and run it again, we'll get 30:
const std = @import("std");

pub fn main() !void {
    var incr = [_]i32{10};
    const result: *i32 = &incr[0];
    add(result, &incr);
    std.debug.print("{d}\n", .{result.*});
}

pub fn add(noalias a: *i32, b: []i32) void {
    a.* += b[0];
    a.* += b[0];
}
Why is this happening? Because the noalias hint lets the compiler reorganize the code so that b[0] is read only once. It's important to note that noalias doesn't forbid aliasing. The fact that this code runs, despite a and b[0] referencing the same memory, proves that. It's merely a promise that we're making to the compiler. To be clear, I've never used noalias in the past and I'm 99% sure I'll never use it in the future. The only reason I bring it up is to hopefully explain/show how aliasing impacts the compiler.
Conclusion
In most cases, you'll end up using @memcpy. Thankfully, if it ever gets called with overlapping memory, you'll get a runtime panic (I say thankfully, because a runtime panic is better than undefined behavior). Still, unless you're copying into newly allocated memory, it's probably worth spending a few seconds to consider whether the source and destination could overlap and, if so, whether std.mem.copyForwards (or, less likely in my experience, std.mem.copyBackwards) is the correct choice.
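If you'd rather not make that decision at every call site, one option is a small wrapper that picks the direction from the pointer addresses. This is only a sketch: move is a hypothetical helper of mine, and it relies on my reading of the documented constraints (copyForwards is safe when dest starts at or before src, copyBackwards when dest starts at or after it).

```zig
const std = @import("std");

/// Hypothetical helper: overlap-safe copy that picks the direction itself.
fn move(comptime T: type, dest: []T, src: []const T) void {
    std.debug.assert(dest.len >= src.len);
    if (@intFromPtr(dest.ptr) <= @intFromPtr(src.ptr)) {
        // dest starts at or before src: front to back never clobbers unread input
        std.mem.copyForwards(T, dest, src);
    } else {
        // dest starts after src: back to front never clobbers unread input
        std.mem.copyBackwards(T, dest, src);
    }
}

pub fn main() void {
    var buf = [5]u8{ 1, 2, 3, 4, 5 };
    move(u8, buf[1..5], buf[0..4]); // shift right: values become 1, 1, 2, 3, 4
    move(u8, buf[0..4], buf[1..5]); // shift left: values become 1, 2, 3, 4, 4
    std.debug.print("{any}\n", .{buf});
}
```

The trade-off is an extra comparison per call and losing the panic that @memcpy gives you when memory you believed was disjoint turns out not to be.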
Beyond that, for the sake of readability and simplicity, aliasing is something worth minimizing. Some guidelines use the word "avoid", but I've settled on merely being more mindful of it (at least for now); maybe now and again I'll manage to limit the scope where two variables reference the same memory.