Zig's MemoryPool Allocator
You're probably familiar with Zig's GeneralPurposeAllocator, ArenaAllocator and FixedBufferAllocator, but Zig's standard library has another allocator you should be keeping at the ready: std.heap.MemoryPool.
What makes the MemoryPool allocator special is that it can only create one type. As compensation for this limitation, it's very fast when allocating and freeing that type. Also, it's easy to use:
const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();

    // This pool can only create and destroy a User
    var user_pool = std.heap.MemoryPool(User).init(allocator);
    defer user_pool.deinit();

    const user1 = try user_pool.create();
    defer user_pool.destroy(user1);
    // ...
}

const User = struct {
    id: i32,
};
The real value of the MemoryPool comes when you're frequently allocating and destroying (not just once, as above). It works by maintaining a free list of previously destroyed objects. When the list is empty, the item is created using the underlying allocator (a GeneralPurposeAllocator above). But when you destroy an item, it's added to the list, where subsequent calls to create can re-use the memory.
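You can see this re-use by comparing addresses across a destroy/create pair. This is just a demonstration sketch: the API doesn't formally promise which address you get back, but with the free-list behavior described above, the first create after a destroy should pop the freed memory:

```zig
const std = @import("std");

const User = struct {
    id: i32,
};

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();

    var user_pool = std.heap.MemoryPool(User).init(allocator);
    defer user_pool.deinit();

    // the free list is empty, so this allocates from the GPA
    const user1 = try user_pool.create();
    const addr = @intFromPtr(user1);

    // destroy pushes the memory onto the free list...
    user_pool.destroy(user1);

    // ...and the next create pops it back off
    const user2 = try user_pool.create();
    defer user_pool.destroy(user2);
    std.debug.print("re-used: {}\n", .{addr == @intFromPtr(user2)});
}
```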
The implementation helps explain why MemoryPool works with a single type: the use of a free list and the re-use of memory require all objects to be the same size. It also explains the performance advantage over other allocators when there's high churn.
In the last blog post, we briefly looked at Zig's @ptrCast and how we can use it to force the compiler to treat memory as a specific type. The MemoryPool is a good example of @ptrCast's potential.
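As a tiny standalone refresher (my own example, not from that post), @ptrCast re-types a pointer without copying or converting the underlying bytes:

```zig
const std = @import("std");

pub fn main() void {
    // four bytes, over-aligned so they can legally be viewed as a u32
    var buf: [4]u8 align(@alignOf(u32)) = .{ 1, 0, 0, 0 };

    // same memory, new type: no copy, no conversion
    const n: *u32 = @ptrCast(&buf);

    // prints 1 on little-endian targets
    std.debug.print("{d}\n", .{n.*});
}
```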
If I were going to write something like MemoryPool, I'd probably start with this skeleton:
fn MemoryPool(comptime Item: type) type {
    return struct {
        free_list: ?*Node,
        fallback: std.mem.Allocator,

        const Node = struct {
            data: *Item,
            next: ?*Node,
        };
    };
}
The problem is that we'd need to allocate a Node on every call to destroy to build up our list. Look at the clever trick MemoryPool uses to avoid this:
// slightly modified version of what MemoryPool does
pub fn destroy(pool: *Self, ptr: *Item) void {
    ptr.* = undefined;
    const node = @as(*Node, @ptrCast(ptr));
    node.* = Node{
        .next = pool.free_list,
    };
    pool.free_list = node;
}
Do you see what's going on? The memory allocated for our User is being re-purposed as the node in our free list. From the time you call create to the time you call destroy, the memory is yours and it's of type *T. But once you call destroy, the memory becomes a *Node and is used to build a chain of free memory. This works for two reasons. The first is that there's no time when the two identities overlap (unless you use the memory after calling destroy, but that's a problem with any allocator). The second is that T needs to be at least as large as MemoryPool(T).Node. The pool guarantees that via this line of code:
pub const item_size = @max(@sizeOf(Node), @sizeOf(Item));
When we need to allocate using the underlying allocator, we'll allocate item_size. In most cases, @sizeOf(Item) will be larger than @sizeOf(Node), so this approach not only results in no additional allocations, but also no overhead.