Elixir, A Little Beyond The Basics - Part 6: processes
Oct 20, 2021
One of the things I like the most about Elixir is that complexity in the language, standard library and runtime is well layered. By this I mean that it's possible to get up to speed and be productive quickly while slowly and naturally uncovering more advanced and nuanced concepts.
I can think of no better example of this than processes. Processes are a fundamental part of programming in Elixir, yet it's common for developers to build full Phoenix applications without interacting with them directly. Then one day, a need arises, which, with some online searching, leads to examples of Agents and GenServers and a whole new world opens up.
One of the reasons this is such a smooth experience is the synergy between the language, standard library and runtime: GenServers, as a whole, have nice ergonomics despite their relative complexity. But this same ease-of-use can result in a superficial understanding of what's happening behind the scenes.
Let's start from zero. Elixir's terminology (inherited from Erlang) is initially confusing, but, as you learn more, accurate. Elixir doesn't have "threads", it has "processes". When you're just starting out though, it's safe to think of them as any other "green thread" (or "virtual thread"), such as goroutines. This means that they're managed by the Erlang runtime, are cheap to create and have low overhead.
We start a process with the spawn function:
spawn fn -> IO.puts("over 9000!") end
It's worth pointing out that everything is running in a process. If you run the above code in an iex terminal, that's a process (which, like any other code, can spawn more processes).
Where Elixir processes differ from most thread and green thread implementations is that they're isolated. They don't share memory with other processes, including the process that spawned them. This has implications on how processes interact with each other, as typical concurrency primitives (e.g. mutexes) cannot work. Specifically, all interactions between processes are done via message passing. Let's look at an example:
pid = spawn fn ->
  receive do
    msg -> IO.puts("received: #{msg}")
  end
end
send(pid, "hello")
Here we see three things. First, spawn returns the process identifier (PID or pid). Second, we use send to send a message to a pid. Above we're sending a string, but we can send any term (i.e. anything). Finally, receive is used to read messages.
By default, receive will block until a message is received. We can change this behavior, as well as add pattern matching to the received message. We can also receive multiple messages by calling receive multiple times:
defmodule MyApp.Receiver do
  def run(counter) do
    IO.puts(counter)
    receive do
      :incr -> run(counter + 1)
      {:incr, by} -> run(counter + by)
      :stop -> :ok
    after
      250 -> run(counter)
    end
  end
end
pid = spawn fn -> MyApp.Receiver.run(0) end
:timer.sleep(1000)
send(pid, :incr)
:timer.sleep(2000)
send(pid, {:incr, 10})
:timer.sleep(1000)
send(pid, :stop)
You might get different results each time you run this, but it should be similar to 0 printed four times, followed by 1 printed eight times and finally 11 printed four times.
Thread Safety
What you must understand is that each Elixir process has its own mailbox. Messages sent to a process are placed at the end of the mailbox. When a process reads from its mailbox, using receive, the message is removed from the front of its mailbox (it's a queue). Multiple processes can send messages to the same target process, but that target process can only process one message at a time.
When you combine this mailbox pattern with process isolation, you get a strong guarantee: a process can always manipulate its data without needing any concurrency control. (To be clear, there is synchronization within the runtime to allow concurrent access to the mailbox, but this is completely hidden from the application).
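To make that guarantee concrete, here is a minimal sketch in which 100 processes increment a single counter concurrently without any lock. Sketch.Counter and its message shapes ({:incr, from}, {:get, from}, :ack, :done) are names made up for this illustration. The acknowledgement is only there so that the parent knows every increment has been handled before it asks for the total.

```elixir
defmodule Sketch.Counter do
  # Hypothetical module: the count lives only in the argument of loop/1.
  def start, do: spawn(fn -> loop(0) end)

  defp loop(count) do
    receive do
      {:incr, from} ->
        # Acknowledge so the sender knows the increment was applied.
        send(from, :ack)
        loop(count + 1)

      {:get, from} ->
        send(from, {:count, count})
        loop(count)
    end
  end
end

counter = Sketch.Counter.start()
parent = self()

# 100 processes send an increment at roughly the same time.
for _ <- 1..100 do
  spawn(fn ->
    send(counter, {:incr, self()})

    receive do
      :ack -> send(parent, :done)
    end
  end)
end

# Wait until every sender has been acknowledged.
for _ <- 1..100 do
  receive do
    :done -> :ok
  after
    1000 -> raise "timeout"
  end
end

send(counter, {:get, self()})

count =
  receive do
    {:count, value} -> value
  after
    1000 -> raise "timeout"
  end

IO.puts(count)
```

No increment is lost, because the counter process handles its mailbox one message at a time and nothing else can touch count.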
You might be thinking to yourself that our above example is too simple. Of course counter can't be shared between processes: it's just an integer that exists on run's stack. But the reality is that processes always have exclusive access to their data (which we typically call their "state"), regardless of the type. This is because messages that cross process boundaries are deep copied (binaries, and therefore strings, larger than 64 bytes are a special case: they are reference counted and only a reference is passed).
This is less efficient than passing references. But it has two significant advantages. First, as we already mentioned, processes are thread-safe without any additional concurrency control. Second, it allows the garbage collector to be more efficient: since data is isolated per process, garbage collection is also isolated per process.
Send & Receive
While receive will block, send never blocks. In fact, we can send to a process that has already exited:
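pid = spawn fn -> end
:timer.sleep(100)
Process.alive?(pid)  # false: the function has returned, so the process is gone
send(pid, :hello)    # still returns :hello; the message is silently dropped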
If you've used GenServers, you know that you can interact with them asynchronously (via cast/2) and synchronously (via call/2). How is that possible? Our sender can call receive to wait for a reply:
pid = spawn fn ->
  receive do
    {:add, a, b, reply_pid} -> send(reply_pid, {:sum, a + b})
  end
end
send(pid, {:add, 9000, 1, self()})
receive do
  {:sum, value} -> IO.puts(value)
after
  # raise/1 needs a message or an exception module, not a bare atom
  5000 -> raise "timeout"
end
self/0 returns the current PID, which we need to send to our calculator in order for it to know where to direct the reply.
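One weakness of the reply pattern above is that any {:sum, value} tuple sitting in the sender's mailbox would be accepted as the answer. A common way around this is to tag each request with a unique reference from make_ref/0 and match on it in the reply. The sketch below is only an illustration of that idea: Sketch.Call and the {:call, from, ref, request} and {:reply, ref, result} shapes are invented here and are not the exact messages a GenServer uses.

```elixir
defmodule Sketch.Call do
  # Hypothetical helper: sends `request` to `pid` and waits for a reply
  # carrying the same unique reference.
  def call(pid, request, timeout \\ 5000) do
    ref = make_ref()
    send(pid, {:call, self(), ref, request})

    receive do
      # The pin (^ref) only matches the reply to this particular request.
      {:reply, ^ref, result} -> {:ok, result}
    after
      timeout -> {:error, :timeout}
    end
  end
end

calculator =
  spawn(fn ->
    receive do
      {:call, from, ref, {:add, a, b}} -> send(from, {:reply, ref, a + b})
    end
  end)

# A stale reply with a different reference is already in our mailbox.
send(self(), {:reply, make_ref(), :stale})

result = Sketch.Call.call(calculator, {:add, 9000, 1})
IO.inspect(result)
```

The stale message is skipped because its reference differs, and it stays in the mailbox until something else receives it.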
In addition to send/2, there's also Process.send_after/3, which can be used to send a message to a pid after a certain amount of time:
Process.send_after(pid, :refresh_token, :timer.minutes(1))
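Process.send_after/3 returns a timer reference, and Process.cancel_timer/1 can cancel a timer that has not fired yet. Here is a small sketch that sends a delayed message to the current process, then schedules and cancels a second one. The short delays (50 ms and 200 ms) are arbitrary values picked to keep the example quick.

```elixir
# Schedule a message to ourselves 50 ms from now.
Process.send_after(self(), :refresh_token, 50)

first =
  receive do
    :refresh_token -> :refreshed
  after
    1000 -> :nothing
  end

# Schedule another one, then cancel it before it fires.
timer = Process.send_after(self(), :refresh_token, 200)

# cancel_timer/1 returns the milliseconds that were left, or false if the
# timer had already fired or did not exist.
remaining = Process.cancel_timer(timer)

second =
  receive do
    :refresh_token -> :refreshed
  after
    400 -> :nothing
  end

IO.inspect({first, is_integer(remaining), second})
```

The first message arrives after the delay, while the cancelled one never shows up in the mailbox.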
receive also has one neat trick: it'll search the mailbox for messages that match the given pattern(s):
pid = spawn fn ->
  receive do
    {:add, a, b} when is_number(a) and is_number(b) -> IO.puts("#{a} + #{b} == #{a + b}")
    {:sub, a, b} when is_number(a) and is_number(b) -> IO.puts("#{a} - #{b} == #{a - b}")
  end
end
send(pid, {:multiply, 9000, 2})
send(pid, {:add, "hello", "world"})
send(pid, {:sub, 1000, 5})
The above will only output 1000 - 5 == 995, as the first two messages will be ignored. In the above example, our spawned process exits shortly after we send the 3rd message, as this unblocks the receiver and causes the function to exit. However, if our process was long-lived, it's important to know that the first two messages we sent continue to exist in the process' mailbox (the runtime doesn't know if some later call to receive will want those messages). We can see this in action:
pid = spawn fn ->
  receive do
    {:add, a, b} when is_number(a) and is_number(b) -> IO.puts("#{a} + #{b} == #{a + b}")
    {:sub, a, b} when is_number(a) and is_number(b) -> IO.puts("#{a} - #{b} == #{a - b}")
  end
  receive do
    msg -> IO.inspect("unknown1: #{inspect(msg)}")
  end
  receive do
    msg -> IO.inspect("unknown2: #{inspect(msg)}")
  end
  receive do
    _ -> raise "should not be called"
  end
end
send(pid, {:multiply, 9000, 2})
send(pid, {:add, "hello", "world"})
send(pid, {:sub, 1000, 5})
Which outputs
1000 - 5 == 995
"unknown1: {:multiply, 9000, 2}"
"unknown2: {:add, \"hello\", \"world\"}"
Normally, when we call receive, we'll get messages in the order that they were written into the process' mailbox. But we can see from the above, where the first message received is actually the last sent, that a selective receive, via pattern matching, adds another layer.
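Because unmatched messages stay in the mailbox, a long-lived process that only uses selective patterns can slowly accumulate them. One common precaution is a catch-all clause at the end of the receive. The sketch below assumes a hypothetical Sketch.Calculator loop that reports everything back to a parent pid so the behavior can be observed. The :report message and the {:result, _}, {:dropped, _} and {:queue_len, _} replies are invented for this example.

```elixir
defmodule Sketch.Calculator do
  # Hypothetical loop: known messages are handled, everything else is
  # removed from the mailbox and reported as dropped.
  def loop(parent) do
    receive do
      {:add, a, b} when is_number(a) and is_number(b) ->
        send(parent, {:result, a + b})

      :report ->
        {:message_queue_len, len} = Process.info(self(), :message_queue_len)
        send(parent, {:queue_len, len})

      other ->
        send(parent, {:dropped, other})
    end

    loop(parent)
  end
end

parent = self()
pid = spawn(fn -> Sketch.Calculator.loop(parent) end)

send(pid, {:multiply, 9000, 2})
send(pid, {:add, "hello", "world"})
send(pid, {:add, 1, 2})
send(pid, :report)

replies =
  for _ <- 1..4 do
    receive do
      msg -> msg
    after
      1000 -> :timeout
    end
  end

IO.inspect(replies)
Process.exit(pid, :kill)
```

With the catch-all in place, messages are handled strictly in arrival order and nothing is left behind, which is why the reported queue length is 0.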
Process Names, Process Info and Process Dictionaries
We can give a process a name, and use the name rather than the pid when sending a message. This makes it possible to send messages to a process without knowing its current pid.
pid = spawn fn ->
  receive do
    {:add, a, b, reply_pid} -> send(reply_pid, {:sum, a + b})
  end
end
Process.register(pid, :myapp_calculator)
send(:myapp_calculator, {:add, 9000, 2, self()})
receive do
  msg -> IO.inspect(msg)
end
Process.whereis/1 can be used to get the pid for a given name. It returns nil if no process is registered with that name. Process.registered/0 can be used to get a list of all registered process names.
Process.info/1 can be used to get information about a process. It takes a pid, not a registered name. Similarly, Process.info/2 takes a list of the specific fields we're interested in. Twenty or so fields are exposed by Process.info/1:
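A short sketch of these lookup functions in action. The name :sketch_worker is made up for this example. Note one difference from sending to a pid: sending to a name that isn't registered raises an ArgumentError rather than silently dropping the message.

```elixir
pid =
  spawn(fn ->
    receive do
      :stop -> :ok
    end
  end)

# :sketch_worker is a hypothetical name chosen for this example.
Process.register(pid, :sketch_worker)

found = Process.whereis(:sketch_worker)
listed = :sketch_worker in Process.registered()

# Remove the name again; the process itself keeps running.
Process.unregister(:sketch_worker)
gone = Process.whereis(:sketch_worker)

# Sending to a name that is not registered raises, unlike sending to a dead pid.
result =
  try do
    send(:sketch_worker, :hello)
  rescue
    ArgumentError -> :not_registered
  end

IO.inspect({found == pid, listed, gone, result})
send(pid, :stop)
```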
> Process.info(self())
[
  current_function: {Process, :info, 1},
  initial_call: {:proc_lib, :init_p, 5},
  status: :running,
  message_queue_len: 0,
  links: [],
  dictionary: [],
  trap_exit: false,
  error_handler: :error_handler,
  priority: :normal,
  group_leader:
  total_heap_size: 13544,
  heap_size: 2586,
  stack_size: 51,
  reductions: 87142,
  garbage_collection: [
    max_heap_size: %{error_logger: true, kill: true, size: 0},
    min_bin_vheap_size: 46422,
    min_heap_size: 233,
    fullsweep_after: 65535,
    minor_gcs: 6
  ],
  suspending: []
]
> Process.info(self(), [:status, :message_queue_len, :reductions])
[status: :running, message_queue_len: 0, reductions: 140543]
message_queue_len and reductions are particularly useful if you're interested in monitoring the performance and efficiency of your processes. The first indicates the size of the mailbox, or how many messages are waiting to be received by the process. In most cases, you'll want message_queue_len to spend as much time at or near 0 as possible. reductions can be viewed as an opaque unit of work. A process's reductions will continue to increment, so what we care about is the rate. (In reality, the reduction count is a counter which is incremented roughly once per function call. In current versions of the Erlang runtime, a context switch happens every 4000 reductions.) While the absolute count of reductions is less valuable than the rate (else long-running processes will always appear as "heavier"), we can quickly get a list of the processes with the highest reductions:
Process.list()
|> Enum.map(fn pid ->
  info = Process.info(pid, [:reductions, :registered_name])
  name = case info[:registered_name] do
    [] -> pid
    name -> name
  end
  %{name: name, reductions: info[:reductions]}
end)
|> Enum.sort_by(fn p -> p[:reductions] end, :desc)
|> Enum.take(10)
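Since the rate matters more than the absolute count, one rough way to measure it is to sample reductions twice and take the difference. This is only a sketch: Sketch.Reductions, rate/2 and the spin/1 busy loop are hypothetical names, and the 100 ms interval is arbitrary.

```elixir
defmodule Sketch.Reductions do
  # Hypothetical helper: reductions consumed by `pid` over `interval_ms`.
  # Process.info/2 returns nil when the process is no longer alive.
  def rate(pid, interval_ms) do
    with {:reductions, before} <- Process.info(pid, :reductions),
         :ok <- Process.sleep(interval_ms),
         {:reductions, later} <- Process.info(pid, :reductions) do
      {:ok, later - before}
    else
      nil -> {:error, :not_alive}
    end
  end

  # Busy loop, used only so that there is something to measure.
  def spin(n), do: spin(n + 1)
end

busy = spawn(fn -> Sketch.Reductions.spin(0) end)

idle =
  spawn(fn ->
    receive do
      :stop -> :ok
    end
  end)

{:ok, busy_rate} = Sketch.Reductions.rate(busy, 100)
{:ok, idle_rate} = Sketch.Reductions.rate(idle, 100)

IO.inspect(busy_rate > idle_rate)

Process.exit(busy, :kill)
send(idle, :stop)
```

A process blocked in receive does essentially no work, so its rate should stay far below that of the spinning process.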
There's another field in the process information that's interesting: dictionary. In the above sample output, it was just an empty list. However, if you run Process.info(self()) in an iex terminal, you'll get a non-empty list (try running it more than once!).
Every process has an internal dictionary which can be used to store arbitrary values. The process dictionary is only accessible within the process. You interact with it via Process.get/1, Process.put/2 and Process.delete/1:
defmodule MyApp.Process do
  def run() do
    Process.put(:myapp_history, [])
    read_loop(0)
  end
  defp read_loop(3) do
    IO.inspect(history())
    read_loop(0)
  end
  defp read_loop(count) do
    receive do
      msg -> Process.put(:myapp_history, [msg | history()])
    end
    read_loop(count + 1)
  end
  defp history(), do: Process.get(:myapp_history)
end
pid = spawn &MyApp.Process.run/0
send(pid, 1)
send(pid, 2)
send(pid, 3)
IO.inspect(Process.get(:myapp_history))
Similar to iex, we use the process dictionary to keep a history. We print out the history on every 3rd message. Importantly, because every process has its own dictionary, the last line of the above snippet will print nil. Since it's being executed from a different process, it doesn't matter that we're using the same key, namely :myapp_history.
While process dictionaries can be useful in some cases, do note that they can make code hard to understand and maintain. Values stored in the process dictionary are essentially globals of the process - with the advantage that, being fully owned and isolated to the process, they are, of course, thread safe.
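For comparison, here is a sketch of the same history-keeping loop with the state passed explicitly as function arguments instead of living in the process dictionary. Sketch.History is a hypothetical module. It sends the history to a parent pid rather than printing it, purely so the result can be checked from the outside.

```elixir
defmodule Sketch.History do
  # Hypothetical rewrite: the history and the message count are explicit
  # arguments of the loop rather than process dictionary entries.
  def run(parent), do: read_loop(parent, [], 0)

  # Every 3rd message, report the history and reset the count.
  defp read_loop(parent, history, 3) do
    send(parent, {:history, history})
    read_loop(parent, history, 0)
  end

  defp read_loop(parent, history, count) do
    receive do
      msg -> read_loop(parent, [msg | history], count + 1)
    end
  end
end

parent = self()
pid = spawn(fn -> Sketch.History.run(parent) end)

send(pid, 1)
send(pid, 2)
send(pid, 3)

history =
  receive do
    {:history, value} -> value
  after
    1000 -> raise "timeout"
  end

IO.inspect(history)
Process.exit(pid, :kill)
```

Everything the loop depends on is visible in its signature, which tends to be easier to follow than hidden reads and writes of a process-wide key.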
Processes and Modules
This might be obvious, but it's important to be clear: processes and modules aren't tied. Given the following:
defmodule MyApp.Calculator do
  def start() do
    pid = spawn &loop/0                  # 1
    Process.register(pid, :calculator)   # 1
  end
  defp loop() do
    receive do
      {:add, a, b, reply_pid} -> send(reply_pid, a + b)
    end
    loop()
  end
  def add(a, b) do
    send(:calculator, {:add, a, b, self()})  # 1
    receive do                               # 1
      sum -> sum
    after
      # raise/1 needs a message or an exception module, not a bare atom
      1000 -> raise "timeout"
    end
  end
end
MyApp.Calculator.start()            # 1
IO.puts MyApp.Calculator.add(1, 3)  # 1
It's important to know what's being run from what process. Everything commented with # 1 is running in our initial process. This includes add/2, which is located in the MyApp.Calculator module. Only loop/0, the function passed to spawn, runs in the new process. While GenServers (to be discussed in greater detail in a later part) often give the impression that modules and processes map 1 to 1, this isn't the case.
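A quick way to convince yourself of this is to have a single module function return self/0 and call it from two different processes. Sketch.WhoAmI is a hypothetical module used only for this illustration.

```elixir
defmodule Sketch.WhoAmI do
  # Returns the pid of whichever process happens to call this function.
  def pid, do: self()
end

parent = self()

# Call the same function from a freshly spawned process.
spawn(fn -> send(parent, {:child_pid, Sketch.WhoAmI.pid()}) end)

child_pid =
  receive do
    {:child_pid, pid} -> pid
  after
    1000 -> raise "timeout"
  end

same_as_caller = Sketch.WhoAmI.pid() == self()
same_as_child = child_pid == self()

IO.inspect({same_as_caller, same_as_child})
```

The same function yields a different pid depending on who calls it: the module is just code, and the process is whoever is executing it.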
tl;dr
Every Elixir process has a mailbox and all communication between processes happens via messages sent to and received from mailboxes. Because a process can only operate on one message at a time, and because processes are isolated (do not share any of their data/state), processes do not require any application-level concurrency control.