homedark

Zig's MemoryPool Allocator

Dec 22, 2023

You're probably familiar with Zig's GeneralPurposeAllocator, ArenaAllocator and FixedBufferAllocator, but Zig's standard library has another allocator you should be keeping at the ready: std.heap.MemoryPool.

What makes the MemoryPool allocator special is that it can only create one type. As compensation for this limitation, it's very fast when allocating and freeing that type. Also, it's easy to use:

const std = @import("std");

pub fn main() !void {
  var gpa = std.heap.GeneralPurposeAllocator(.{}){};
  const allocator = gpa.allocator();

  // This pool can only create and destroy a User
  var user_pool = std.heap.MemoryPool(User).init(allocator);
  defer user_pool.deinit();

  const user1 = try user_pool.create();
  defer user_pool.destroy(user1);
  // ...
}

const User = struct {
  id: i32
};

The real value of the MemoryPool is when you're frequently allocating and destroying (not just once, as above). It works by maintaining a free list of previously destroyed objects. When the list is empty, the item is created using the underlying allocator (a GeneralPuposeAllocator above). But when you destroy an item, it's added the list, where subsequent calls to create can re-use the memory.

The implementation helps explain why MemoryPool works with a single type: the use of a free list and re-use of memory requires all objects to be of the same size. It also explains the performance advantage over other allocators when there's high churn.

In the last blog post, we briefly looked at Zig's @ptrCast and how we can use it to force the compiler to treat memory as a specific type. The MemoryPool is a good example of @ptrCast's potential.

If I was going to write something like MemoryPool, I'd probably start with this skeleton:

fn MemoryPool(comptime Item: type) type {
  return struct {
    free_list: ?*Node,
    fallback: std.mem.Allocator,

    const Node = struct {
      data: *Item,
      next: ?*Node,
    };
  };
}

The problem is that we'll need to allocate a Node on every call to destroy to build up our list. Look at the clever trick MemoryPool uses to avoid this:

// slightly modified version of what MemoryPool does
pub fn destroy(pool: *Self, ptr: *Item) void {
  ptr.* = undefined;

  const node = @as(*Node, @ptrCast(ptr));
  node.* = Node{
      .next = pool.free_list,
  };
  pool.free_list = node;
}

Do you see what's going on? The memory allocated for our User is being re-purposed to be the node in our free list. From the time that you you call create to the time that you call destroy the memory is yours and its of type *T. But once you call destroy the memory becomes a *Node and is used to create a chain of free memory. This works for two reasons. The first is that there's no time where the two identities overlap (unless you use the memory after calling destroy, but that's a problem with any allocator). The second is that T needs to be at least as large as MemoryPool(T).Node. The pool guaratees that via this line of code:

pub const item_size = @max(@sizeOf(Node), @sizeOf(Item));

When we need to allocate using the underlying allocator, we'll allocate item_size. In most cases, @sizeOf(Item) will be larger than @sizeOf(Node), so this approach not only results in no additional allocations, but also no overhead.