TCP Server in Zig - Part 1 - Single Threaded

Oct 02, 2024

In this series we're going to look at building a TCP server in Zig. We're going to start with a simple single-threaded server so that we can focus on basics. In following parts, we'll make our server multi-threaded and then introduce polling (poll, epoll and kqueue).

We begin with a little program that compiles and runs but doesn't do much:

const std = @import("std");
const net = std.net;
const posix = std.posix;

pub fn main() !void {
    const address = try std.net.Address.parseIp("127.0.0.1", 5882);

    const tpe: u32 = posix.SOCK.STREAM;
    const protocol = posix.IPPROTO.TCP;
    const listener = try posix.socket(address.any.family, tpe, protocol);
    defer posix.close(listener);
}

The first thing we do is define the address we'll eventually want to listen on. In our example we're going to be listening on 127.0.0.1 (localhost) port 5882. If you wanted to make your server publicly accessible, with all the risk that might entail, you would need to use your computer's public interface, or use "0.0.0.0" to bind to all interfaces. With parseIp we could also specify an IPv6 address, e.g. we could pass "::1" as the IPv6 equivalent to 127.0.0.1.
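To make those address options a bit more concrete, here's a small sketch that parses each of the addresses mentioned above and prints the family we'd end up passing along when creating the socket. The familyName helper is made up for this illustration; it isn't part of the standard library.

```zig
const std = @import("std");
const posix = std.posix;

// Hypothetical helper: names the address family of a parsed address.
fn familyName(address: std.net.Address) []const u8 {
    return switch (address.any.family) {
        posix.AF.INET => "IPv4",
        posix.AF.INET6 => "IPv6",
        else => "other",
    };
}

pub fn main() !void {
    // loopback only, all IPv4 interfaces, and the IPv6 loopback
    const inputs = [_][]const u8{ "127.0.0.1", "0.0.0.0", "::1" };
    for (inputs) |ip| {
        const address = try std.net.Address.parseIp(ip, 5882);
        std.debug.print("{s} -> {s}, port {d}\n", .{ ip, familyName(address), address.getPort() });
    }
}
```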

Next we create the socket with posix.socket. Its first parameter is called the domain and it'll usually be INET or INET6. We make our life a bit easier by using our address's family for that parameter. The next parameter serves two purposes. First, it sets the type of socket we're creating. For TCP this will be SOCK.STREAM. For UDP we'd use SOCK.DGRAM. Second, it can carry flags, combined with a bitwise OR, that enable different behavior. For now we don't set any flags, but we will use posix.SOCK.STREAM | posix.SOCK.NONBLOCK in a later part. Finally, we specify TCP as the protocol that we'll be using.
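As a rough sketch of how the socket type parameter varies, the following creates a TCP socket, a UDP socket and a TCP socket with the NONBLOCK flag OR'd in. It hard-codes AF.INET rather than using an address's family, purely to keep the example short.

```zig
const std = @import("std");
const posix = std.posix;

pub fn main() !void {
    // TCP: a stream socket
    const tcp = try posix.socket(posix.AF.INET, posix.SOCK.STREAM, posix.IPPROTO.TCP);
    defer posix.close(tcp);

    // UDP: a datagram socket
    const udp = try posix.socket(posix.AF.INET, posix.SOCK.DGRAM, posix.IPPROTO.UDP);
    defer posix.close(udp);

    // Extra behavior is OR'd into the socket type
    const tpe: u32 = posix.SOCK.STREAM | posix.SOCK.NONBLOCK;
    const nonblocking = try posix.socket(posix.AF.INET, tpe, posix.IPPROTO.TCP);
    defer posix.close(nonblocking);
}
```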

At this point, we could use our newly created socket in one of two ways. We could use posix.connect to create a client that connects to a server over TCP. Alternatively, and this is what we want, we can use posix.listen to flag our socket as a listening socket, a server, that can accept connections:

pub fn main() !void {
    const address = try std.net.Address.parseIp("127.0.0.1", 5882);

    const tpe: u32 = posix.SOCK.STREAM;
    const protocol = posix.IPPROTO.TCP;
    const listener = try posix.socket(address.any.family, tpe, protocol);
    defer posix.close(listener);

    // we added these three lines
    try posix.setsockopt(listener, posix.SOL.SOCKET, posix.SO.REUSEADDR, &std.mem.toBytes(@as(c_int, 1)));
    try posix.bind(listener, &address.any, address.getOsSockLen());
    try posix.listen(listener, 128);
}

The first new line allows the address to be reused. We'll go over this in more detail a bit later, but include it here so that you can start and stop the program repeatedly without getting an error. The next line binds our socket to our address. It's bind that links our process to a specific address, allowing the operating system to route incoming traffic to our process using the specified port (5882). We pass the socket, address and address length (we'll see why the length is needed when we look at accept, next). Finally we call listen, which is what turns our code into a "server", able to handle incoming requests. The parameters we pass to listen are the socket and a backlog. The backlog is a hint for the number of connections we want the OS to queue while it waits for us to accept connections. Because it's just a hint, it's hard to experiment with it and observe its impact. In following parts, as we improve our server, we'll get a chance to discuss this in a bit more depth.

Finally, we're able to accept and service requests. This is our first complete working example. It's a lot of code, but we'll go over it in detail:

const std = @import("std");
const net = std.net;
const posix = std.posix;

pub fn main() !void {
    const address = try std.net.Address.parseIp("127.0.0.1", 5882);

    const tpe: u32 = posix.SOCK.STREAM;
    const protocol = posix.IPPROTO.TCP;
    const listener = try posix.socket(address.any.family, tpe, protocol);
    defer posix.close(listener);

    try posix.setsockopt(listener, posix.SOL.SOCKET, posix.SO.REUSEADDR, &std.mem.toBytes(@as(c_int, 1)));
    try posix.bind(listener, &address.any, address.getOsSockLen());
    try posix.listen(listener, 128);

    while (true) {
        var client_address: net.Address = undefined;
        var client_address_len: posix.socklen_t = @sizeOf(net.Address);

        const socket = posix.accept(listener, &client_address.any, &client_address_len, 0) catch |err| {
            // Rare that this happens, but in later parts we'll
            // see examples where it does.
            std.debug.print("error accept: {}\n", .{err});
            continue;
        };
        defer posix.close(socket);

        std.debug.print("{} connected\n", .{client_address});

        write(socket, "Hello (and goodbye)") catch |err| {
            // This can easily happen, say if the client disconnects.
            std.debug.print("error writing: {}\n", .{err});
        };
    }
}

fn write(socket: posix.socket_t, msg: []const u8) !void {
    var pos: usize = 0;
    while (pos < msg.len) {
        const written = try posix.write(socket, msg[pos..]);
        if (written == 0) {
            return error.Closed;
        }
        pos += written;
    }
}

If you run this code, you should be able to connect and get the greeting:

$ socat - TCP:127.0.0.1:5882
Hello (and goodbye)

We've added the while block as well as the write function. This new code accepts a connection, writes a message to the new connection, closes the connection and repeats.

accept will block until there's an incoming connection. It takes 4 parameters: the listening socket that we're accepting on, a pointer to the address where accept will write the remote address to, the length of that address, and bitwise OR flags. For now, we don't pass any flags. You can pass null for the 2nd and 3rd parameters (the address and address length). If you do this, you'll need to call posix.getpeername to get the remote address.
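Here's a minimal sketch of that null variant: we accept without asking for the address, and only look it up afterwards with posix.getpeername. The peerAddress helper is a name invented for this example.

```zig
const std = @import("std");
const net = std.net;
const posix = std.posix;

// Hypothetical helper: asks the OS for the remote address of a connected socket.
fn peerAddress(socket: posix.socket_t) !net.Address {
    var address: net.Address = undefined;
    var len: posix.socklen_t = @sizeOf(net.Address);
    try posix.getpeername(socket, &address.any, &len);
    return address;
}

pub fn main() !void {
    const address = try net.Address.parseIp("127.0.0.1", 5882);

    const listener = try posix.socket(address.any.family, posix.SOCK.STREAM, posix.IPPROTO.TCP);
    defer posix.close(listener);

    try posix.setsockopt(listener, posix.SOL.SOCKET, posix.SO.REUSEADDR, &std.mem.toBytes(@as(c_int, 1)));
    try posix.bind(listener, &address.any, address.getOsSockLen());
    try posix.listen(listener, 128);

    // null for both the address and the address length
    const socket = try posix.accept(listener, null, null, 0);
    defer posix.close(socket);

    // the remote address is still available on demand
    std.debug.print("{} connected\n", .{try peerAddress(socket)});
}
```

This can be handy when most connections never need their remote address, at the cost of an extra system call for the ones that do.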

Hopefully it makes sense why the address has to be passed by reference: we need the system call to populate the value for us. But why do we need to pass the length, and why by reference? Keep in mind that there are different kinds of addresses; the two you're probably most familiar with are IPv4 and IPv6. These have different lengths. C doesn't have tagged unions, so it needs a bit more information to properly handle the address parameter. By taking a mutable length, accept is able to tell us the actual length of the address it wrote into the 2nd parameter. Because Zig supports tagged unions, this isn't really something we need to worry about, but we still need to pass in the data.

Once we have a connection, we send it a message. Why is our call to write done in a loop? Because when we write to a socket, it's possible that only part of the bytes are written. We need to loop until all the bytes are written. write returns the number of bytes written. In this case, since we're writing a short 19-byte message, you could try this a million times and only ever need a single write. But the reality is that a 19-byte message could take anywhere from 1 to 19 calls to write to fully write. Throughout this series, we're going to talk a lot more about writing and reading messages. So this is something we'll revisit in greater detail.

Finally, we close the connection and start a new iteration of our loop, blocking on accept until a new client connects.
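We've been using socat as our client, but earlier we mentioned that a socket can also be turned into a client with posix.connect. As a minimal sketch (not something the rest of this series depends on), here's a Zig client that connects to our server and prints whatever it receives until the server closes the connection. It assumes the server from above is running on 127.0.0.1:5882.

```zig
const std = @import("std");
const posix = std.posix;

pub fn main() !void {
    const address = try std.net.Address.parseIp("127.0.0.1", 5882);

    const socket = try posix.socket(address.any.family, posix.SOCK.STREAM, posix.IPPROTO.TCP);
    defer posix.close(socket);

    // connect, rather than bind + listen, makes this socket a client
    try posix.connect(socket, &address.any, address.getOsSockLen());

    var buf: [128]u8 = undefined;
    while (true) {
        const n = try posix.read(socket, &buf);
        if (n == 0) break; // the server closed the connection
        std.debug.print("{s}", .{buf[0..n]});
    }
    std.debug.print("\n", .{});
}
```

Notice that the client reads in a loop for the same reason that the server writes in a loop: a single read isn't guaranteed to return the whole message.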

In the above code, we skipped over our call to setsockopt prior to calling bind. We needed to add this code because after you close a socket, the operating system puts the socket in a TIME_WAIT state to deal with any additional packets that might still be on their way. The length of time is configurable, but as a general rule, you want to leave it as-is. The consequence is that, despite our program exiting, the address-port pair 127.0.0.1:5882 remains in use and thus cannot be reused for a short time. If you take that line out and start and stop the program, you should get error.AddressInUse.

Setting the REUSEADDR option on the listening socket, as we did, using setsockopt is the simplest option. With this option, as long as there isn't an active socket bound and listening to the address, your bind should succeed.
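If you'd like to see the error rather than avoid it, the following sketch wraps bind so that error.AddressInUse produces a friendlier message before being returned. The bindOrExplain helper is hypothetical, and REUSEADDR is deliberately left out. One way to trigger the error is to run a second copy while another server is already listening on the same port.

```zig
const std = @import("std");
const net = std.net;
const posix = std.posix;

// Hypothetical helper: binds, printing a hint when the address is taken.
fn bindOrExplain(listener: posix.socket_t, address: net.Address) !void {
    posix.bind(listener, &address.any, address.getOsSockLen()) catch |err| {
        if (err == error.AddressInUse) {
            std.debug.print("{} is in use (another server, or TIME_WAIT without REUSEADDR?)\n", .{address});
        }
        return err;
    };
}

pub fn main() !void {
    const address = try net.Address.parseIp("127.0.0.1", 5882);

    const listener = try posix.socket(address.any.family, posix.SOCK.STREAM, posix.IPPROTO.TCP);
    defer posix.close(listener);

    // no setsockopt(REUSEADDR) here, on purpose
    try bindOrExplain(listener, address);
    try posix.listen(listener, 128);
}
```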

Another option is to let the operating system pick the port. This has the obvious downside that clients won't be able to use a hard-coded port. Still, it's a useful trick to know:

const std = @import("std");
const net = std.net;
const posix = std.posix;

pub fn main() !void {
    // The port has been changed to 0
    const address = try std.net.Address.parseIp("127.0.0.1", 0);

    const tpe: u32 = posix.SOCK.STREAM;
    const protocol = posix.IPPROTO.TCP;
    const listener = try posix.socket(address.any.family, tpe, protocol);
    defer posix.close(listener);

    try posix.setsockopt(listener, posix.SOL.SOCKET, posix.SO.REUSEADDR, &std.mem.toBytes(@as(c_int, 1)));

    try posix.bind(listener, &address.any, address.getOsSockLen());

    // After binding, we can get the address that the OS used
    try printAddress(listener);
}

fn printAddress(socket: posix.socket_t) !void {
    var address: std.net.Address = undefined;
    var len: posix.socklen_t = @sizeOf(net.Address);

    try posix.getsockname(socket, &address.any, &len);
    std.debug.print("{}\n", .{address});
}

Notice that we set our port to 0. This is what tells the OS to pick a port for us. Once we've called bind, we can use getsockname to get the address, including the port, that the OS picked.

The next logical step is to read data from the client. Maybe we can get a little fancy and have our server echo back any message it receives. Here's our improved while block:

var buf: [128]u8 = undefined;
while (true) {
    var client_address: net.Address = undefined;
    var client_address_len: posix.socklen_t = @sizeOf(net.Address);
    const socket = posix.accept(listener, &client_address.any, &client_address_len, 0) catch |err| {
        // Rare that this happens, but in later parts we'll
        // see examples where it does.
        std.debug.print("error accept: {}\n", .{err});
        continue;
    };
    defer posix.close(socket);

    std.debug.print("{} connected\n", .{client_address});

    const read = posix.read(socket, &buf) catch |err| {
        std.debug.print("error reading: {}\n", .{err});
        continue;
    };

    if (read == 0) {
        continue;
    }

    write(socket, buf[0..read]) catch |err| {
        // This can easily happen, say if the client disconnects.
        std.debug.print("error writing: {}\n", .{err});
    };
}

Which we can test via:

$ echo 'hello?' | socat - TCP:127.0.0.1:5882
hello?

We declare a buffer, buf, which we're going to read into. Since this initial version is single-threaded, we can create and re-use a single buffer, but you can probably guess that, as our implementation gets more complex, buffer management is going to be something we have to think more about.

Like write, read returns the number of bytes read into our buffer. This can be anywhere from 0 to buf.len, with 0 meaning the connection is closed. Like accept, our read is blocking. The code will just wait on that call to posix.read until the client sends something. Even if the client disconnects, read will probably not return/error immediately like you might expect. This is a fairly complicated topic. Next we'll look at implementing a timeout, which is a good start. In subsequent parts we'll look at polling and putting our socket in a nonblocking mode. And beyond this, there are numerous OS settings (like TCP keepalive) that can impact exactly how this behaves. Still, as a general rule, the best way to know if the other side is still there is to try to write to it.
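TCP keepalive was mentioned above as one of the settings that influences this behavior. As a small sketch, this is roughly how it could be switched on for a socket using the same setsockopt pattern we used for REUSEADDR. The enableKeepAlive helper is a made-up name, and how often probes are sent is controlled by OS-specific settings that aren't shown here.

```zig
const std = @import("std");
const posix = std.posix;

// Hypothetical helper: asks the OS to send keepalive probes on an idle connection.
fn enableKeepAlive(socket: posix.socket_t) !void {
    try posix.setsockopt(socket, posix.SOL.SOCKET, posix.SO.KEEPALIVE, &std.mem.toBytes(@as(c_int, 1)));
}

pub fn main() !void {
    // In the server, this would be the socket returned by accept.
    const socket = try posix.socket(posix.AF.INET, posix.SOCK.STREAM, posix.IPPROTO.TCP);
    defer posix.close(socket);

    try enableKeepAlive(socket);
}
```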

TCP can be used for a wide range of applications, and for some of those, having a client and server connected with very little or no traffic for an extended period of time is normal. But in our little sample, it's a bit silly that a client can connect and block our server indefinitely. What we need is a way to time out the read, which brings us back to setsockopt. We saw this function briefly already, when setting the REUSEADDR option on our listening socket. This function takes 4 arguments: the socket that we're setting an option on, the level where the option is defined, the actual option that we're setting, and the value that we're setting the option to. Socket options can be defined at multiple levels, but you'll almost always be using the SOL.SOCKET level. With that in mind, let's look at how we can set a read timeout on the client socket:

// ...

const socket = posix.accept(listener, &client_address.any, &client_address_len, 0) catch |err| {
    std.debug.print("error accept: {}\n", .{err});
    continue;
};
defer posix.close(socket);

std.debug.print("{} connected\n", .{client_address});

// Added these two lines (.tv_sec and .tv_usec before zig 0.14.0)
const timeout = posix.timeval{.sec = 2, .usec = 500_000};
try posix.setsockopt(socket, posix.SOL.SOCKET, posix.SO.RCVTIMEO, &std.mem.toBytes(timeout));

const read = posix.read(socket, &buf) catch |err| {
    std.debug.print("error reading: {}\n", .{err});
    continue;
};

// ...

We've added two lines of code to give a 2.5 second timeout to all subsequent reads on the socket. If you were to run this modified version and connect without sending any data, the call to posix.read should return an error.WouldBlock error after 2.5 seconds of waiting.
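Since a timeout surfaces as error.WouldBlock, we may want to tell it apart from other read errors. Here's one possible sketch: a hypothetical readOrTimeout helper that returns null on a timeout and passes every other error through. It assumes the .sec and .usec field names from Zig 0.14.

```zig
const std = @import("std");
const net = std.net;
const posix = std.posix;

// Hypothetical helper: null means the read timed out.
fn readOrTimeout(socket: posix.socket_t, buf: []u8) !?usize {
    const n = posix.read(socket, buf) catch |err| switch (err) {
        error.WouldBlock => return null,
        else => return err,
    };
    return n;
}

pub fn main() !void {
    const address = try net.Address.parseIp("127.0.0.1", 5882);
    const listener = try posix.socket(address.any.family, posix.SOCK.STREAM, posix.IPPROTO.TCP);
    defer posix.close(listener);

    try posix.setsockopt(listener, posix.SOL.SOCKET, posix.SO.REUSEADDR, &std.mem.toBytes(@as(c_int, 1)));
    try posix.bind(listener, &address.any, address.getOsSockLen());
    try posix.listen(listener, 128);

    // accept a single client to keep the example short
    const socket = try posix.accept(listener, null, null, 0);
    defer posix.close(socket);

    const timeout = posix.timeval{ .sec = 2, .usec = 500_000 };
    try posix.setsockopt(socket, posix.SOL.SOCKET, posix.SO.RCVTIMEO, &std.mem.toBytes(timeout));

    var buf: [128]u8 = undefined;
    if (try readOrTimeout(socket, &buf)) |n| {
        std.debug.print("read {d} bytes\n", .{n});
    } else {
        std.debug.print("client sent nothing in time\n", .{});
    }
}
```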

That &std.mem.toBytes(timeout) is a little strange. Different options take different types of values, from booleans (identified as 0 or non-zero), to integers, to more complex structures as above. C doesn't have tagged unions, so the underlying setsockopt essentially takes an arbitrary byte array along with a length. Rather than taking both the data and the length, Zig's posix.setsockopt takes a []const u8 slice and passes the ptr and len values to the underlying implementation. std.mem.toBytes doesn't return a slice though; it returns an array sized to the parameter type. For a posix.timeval, toBytes returns a [16]u8 on most 64-bit platforms, so we need the address-of operator (&) to coerce it to a slice.
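If you want to see this for yourself, the following sketch prints the type that toBytes returns for a posix.timeval and then coerces a pointer to that array into a slice. The exact size depends on the platform, which is why the example prints it rather than assuming a value. The .sec and .usec field names assume Zig 0.14.

```zig
const std = @import("std");
const posix = std.posix;

pub fn main() void {
    const timeout = posix.timeval{ .sec = 2, .usec = 500_000 };

    // toBytes returns an array, [@sizeOf(posix.timeval)]u8, not a slice
    const bytes = std.mem.toBytes(timeout);
    std.debug.print("{s} holding {d} bytes\n", .{ @typeName(@TypeOf(bytes)), bytes.len });

    // a pointer to an array coerces to a slice, which is what setsockopt wants
    const slice: []const u8 = &bytes;
    std.debug.print("slice len: {d}\n", .{slice.len});
}
```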

It's worth mentioning that, once set, the timeout applies to any subsequent reads. If we want to change the timeout, we have to call setsockopt again. We can remove the timeout by setting the sec and usec fields to zero.
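Since we might want to change or remove the timeout more than once, a small wrapper can help. This is only a sketch: timevalFromMs and setReadTimeout are hypothetical helpers, they take the timeout in milliseconds, and they assume the .sec and .usec field names from Zig 0.14. Passing 0 produces a zeroed timeval, which removes the timeout.

```zig
const std = @import("std");
const posix = std.posix;

// Hypothetical helper: converts milliseconds into a timeval.
fn timevalFromMs(ms: u32) posix.timeval {
    return .{
        .sec = @intCast(ms / 1000),
        .usec = @intCast((ms % 1000) * 1000),
    };
}

// Hypothetical helper: sets the read timeout; 0 removes it.
fn setReadTimeout(socket: posix.socket_t, ms: u32) !void {
    const timeout = timevalFromMs(ms);
    try posix.setsockopt(socket, posix.SOL.SOCKET, posix.SO.RCVTIMEO, &std.mem.toBytes(timeout));
}

pub fn main() !void {
    // In the server, this would be the socket returned by accept.
    const socket = try posix.socket(posix.AF.INET, posix.SOCK.STREAM, posix.IPPROTO.TCP);
    defer posix.close(socket);

    try setReadTimeout(socket, 2500); // 2.5 seconds
    try setReadTimeout(socket, 0); // back to blocking indefinitely
}
```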

We're going to revisit read timeouts throughout this series. The next part will focus on message boundaries, which has a direct impact on trying to enforce an accurate timeout. Later, when we move to nonblocking sockets, we'll have to use a different way to implement timeouts. But, for now, setting RCVTIMEO and getting familiar with setsockopt, which is used for a number of other options, is a good start.

We didn't mention it before, but posix.write is also blocking (at least until we move to nonblocking sockets in later parts). While we can, and probably should, set a write timeout, it's important to understand the differences between reading from a socket and writing to a socket.

Specifically, when write returns, it does not mean that the other end, the client, has received the data. In fact, it doesn't even mean that the bytes passed to write have even begun their journey beyond our computer. It's a lot like writing to a file. When you write to a file, you're really writing to layers of buffers (owned by the operating system, and then by the device being written to). Writing to a socket is the same; a successful call to write should be interpreted as: the OS has made a copy of the data and is aware that it needs to send it to the socket. Also, in case you're wondering, there is no mechanism to flush a socket. The OS and network device are all fairly free to manage bytes as they see fit.

This has a couple of implications. First, with respect to timeouts, it means that write timeouts are generally less common than read timeouts. Our write is only really writing to a buffer owned by our operating system, so in normal cases, it isn't likely to stall. That said, there's obviously a limit to how much data the OS can buffer. If you're writing data faster than the other end can process, write will block. In fact, we can use setsockopt with the SNDBUF option to alter the size of that buffer. If you're sending infrequent and small messages, as we are, a write is unlikely to ever block. But even in these simple cases, it's a good idea to set a write timeout:

// our existing read timeout
const timeout = posix.timeval{.sec = 2, .usec = 500_000};
try posix.setsockopt(socket, posix.SOL.SOCKET, posix.SO.RCVTIMEO, &std.mem.toBytes(timeout));

// add the write timeout
try posix.setsockopt(socket, posix.SOL.SOCKET, posix.SO.SNDTIMEO, &std.mem.toBytes(timeout));

Now we're using setsockopt to set the SO.SNDTIMEO option and, for simplicity, we've used the same 2.5 second timeout as our read.
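We also mentioned that the SNDBUF option can alter the size of the OS's send buffer. As a rough sketch, it follows the same setsockopt pattern, this time with an integer value. The setSendBuffer helper and the 64KB size are arbitrary choices for illustration, and the OS is free to adjust whatever value we request.

```zig
const std = @import("std");
const posix = std.posix;

// Hypothetical helper: requests a send buffer of roughly `size` bytes.
fn setSendBuffer(socket: posix.socket_t, size: c_int) !void {
    try posix.setsockopt(socket, posix.SOL.SOCKET, posix.SO.SNDBUF, &std.mem.toBytes(size));
}

pub fn main() !void {
    // In the server, this would be the socket returned by accept.
    const socket = try posix.socket(posix.AF.INET, posix.SOCK.STREAM, posix.IPPROTO.TCP);
    defer posix.close(socket);

    try setSendBuffer(socket, 64 * 1024);
}
```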

The other important implication of write's behavior is that the operating system makes a copy of the bytes to write. This obviously incurs a cost, possibly a significant cost if you're streaming gigabytes of data. And it isn't an easy problem to solve. In non-trivial cases, we would expect the message to be dynamically allocated or to be part of a re-usable buffer. If the OS didn't make a copy, how would we know when it was safe to free the message or re-use the buffer? So we can think of this as a cost of network programming. However, recent patterns have aimed to solve this. We'll only briefly explore this at the end of this series (hint: io_uring). For the most part, this is just a reality of network programming, but something worth being aware of.

Zig's standard library has an std.net.Address.listen function which returns an std.net.Server. The Server has an accept method which returns a std.net.Server.Connection, which exposes a std.net.Stream, a thin wrapper around a socket.

We could have used all of these and saved ourselves some coding. For example the Stream has a writeAll function that behaves like the write function above. Using the underlying functions directly provides greater flexibility and helps us better understand the fundamentals. Furthermore, many of the std.net APIs feel incomplete. For example, there's no way to set a timeout on a std.net.Stream, so we still need to use posix.setsockopt directly. As our examples leverage more advanced features, these thin wrappers are going to be less and less useful.
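For comparison, here's roughly what our echo server could look like when built only on std.net. Treat it as a sketch that assumes the std.net API as it exists in Zig 0.13 and 0.14 (Address.listen, Server.accept and Stream); it has no timeouts, since those would still require posix.setsockopt on the stream's handle.

```zig
const std = @import("std");
const net = std.net;

pub fn main() !void {
    const address = try net.Address.parseIp("127.0.0.1", 5882);

    // listen creates the socket, optionally sets REUSEADDR, binds and listens
    var server = try address.listen(.{ .reuse_address = true });
    defer server.deinit();

    var buf: [128]u8 = undefined;
    while (true) {
        const conn = server.accept() catch |err| {
            std.debug.print("error accept: {}\n", .{err});
            continue;
        };
        defer conn.stream.close();

        std.debug.print("{} connected\n", .{conn.address});

        const read = conn.stream.read(&buf) catch |err| {
            std.debug.print("error reading: {}\n", .{err});
            continue;
        };
        if (read == 0) {
            continue;
        }

        // writeAll loops until every byte is written, like our write function
        conn.stream.writeAll(buf[0..read]) catch |err| {
            std.debug.print("error writing: {}\n", .{err});
        };
    }
}
```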

That said, if you're building a library, you can have your cake and eat it too. Your code can leverage the underlying functions and structures directly, but you can always expose an std.net.Stream to your users. To see what I mean, consider this example and pay attention to the last 5 lines of the while loop:

const std = @import("std");
const net = std.net;
const posix = std.posix;

pub fn main() !void {
    const address = try std.net.Address.parseIp("127.0.0.1", 5882);

    const tpe: u32 = posix.SOCK.STREAM;
    const protocol = posix.IPPROTO.TCP;
    const listener = try posix.socket(address.any.family, tpe, protocol);
    defer posix.close(listener);

    try posix.setsockopt(listener, posix.SOL.SOCKET, posix.SO.REUSEADDR, &std.mem.toBytes(@as(c_int, 1)));
    try posix.bind(listener, &address.any, address.getOsSockLen());
    try posix.listen(listener, 128);

    var buf: [128]u8 = undefined;
    while (true) {
        var client_address: net.Address = undefined;
        var client_address_len: posix.socklen_t = @sizeOf(net.Address);
        const socket = posix.accept(listener, &client_address.any, &client_address_len, 0) catch |err| {
            // Rare that this happens, but in later parts we'll
            // see examples where it does.
            std.debug.print("error accept: {}\n", .{err});
            continue;
        };
        defer posix.close(socket);

        std.debug.print("{} connected\n", .{client_address});

        const timeout = posix.timeval{.sec = 2, .usec = 500_000};
        try posix.setsockopt(socket, posix.SOL.SOCKET, posix.SO.RCVTIMEO, &std.mem.toBytes(timeout));
        try posix.setsockopt(socket, posix.SOL.SOCKET, posix.SO.SNDTIMEO, &std.mem.toBytes(timeout));

        // we've changed everything from this point on
        const stream = std.net.Stream{.handle = socket};

        const read = try stream.read(&buf);
        if (read == 0) {
            continue;
        }
        try stream.writeAll(buf[0..read]);
    }
}

Instead of using posix.read and posix.write, we've created a std.net.Stream (which just wraps our socket) and used its read and writeAll functions. Again, I think it's best to use the socket directly in your own code, but if it is a library, you can see from the above how easy it would be to expose the higher level Stream.

We've taken the first step in building a server and, hopefully, you have enough code to play and experiment with. Except for the write function which iterated through the unsent bytes, we've completely ignored the fact that TCP deals exclusively in a stream of bytes. While you and I think of sending and receiving distinct messages, that isn't how TCP works and it isn't how we can structure our reads and writes. The next part will focus on adding message boundaries to the TCP stream of bytes. Once we have that out of the way, we'll be able to focus on taking our above example and scaling it.
