Popol: Minimal non-blocking I/O with Rust
Asynchronous I/O in Rust is still very much in its infancy. Async/await,
Rust’s solution to the problem, was recently stabilized, so when the time came
to implement some peer-to-peer networking code, I reached for the shiny new
feature. To my dismay, it created more problems than it solved. Indeed, I quickly
regretted going down that path and searched for alternatives. All I was
looking for was an easy way to handle between fifty and a hundred TCP connections
(`net::TcpStream`) efficiently, to implement the reactor for nakamoto, a
Bitcoin client I’ve been working on.
The crux of the problem with async/await is that it is incompatible with the
standard library traits, such as `Read` and `Write`. Partly, this is due to the
design decisions behind Rust’s async implementation: the choice of cooperative
multitasking instead of preemptive multitasking, for instance.
Languages built on the latter don’t suffer from this split. Good examples
include Haskell and Erlang, which don’t have a language-level concept of “async”
– the implementation is transparent to the user.
As it stands, the async/await system in Rust comes with an incompatible set of
traits: `AsyncRead` and `AsyncWrite`, which don’t play nicely with the standard
library. Channels, such as the amazing crossbeam-channel, don’t work in
asynchronous code – you need an async variant. Alternatively, most of the
runtimes that provide replacement types for channels, files and
sockets, such as async-std and tokio, have a large dependency footprint,
inherent in the complexity of the problem. For example, smol, one of the
“small” runtimes, still consists of about 24 crates, including its transitive
dependencies, as of today.
```
$ cargo tree -p smol -e no-dev --prefix none --no-dedupe | sort | uniq | wc -l
24
```
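To make the incompatibility concrete, here’s a small sketch; `read_greeting`
and the address are mine, for illustration:

```rust
use std::io::{self, Read};
use std::net;

// A helper written against the standard library's `Read` trait.
fn read_greeting(mut conn: impl Read) -> io::Result<String> {
    let mut greeting = String::new();
    conn.read_to_string(&mut greeting)?;
    Ok(greeting)
}

fn main() -> io::Result<()> {
    // `std::net::TcpStream` implements `Read`, so this just works...
    let conn = net::TcpStream::connect("127.0.0.1:8888")?;
    println!("{}", read_greeting(conn)?);

    // ...but an async runtime's stream type, e.g. `tokio::net::TcpStream`,
    // implements `AsyncRead` instead of `Read`, so it cannot be passed to
    // `read_greeting`: the same I/O logic has to be written twice.
    Ok(())
}
```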
Async/await may be a “zero-cost abstraction”, but it certainly has a cost in usability, maintenance and complexity. I think it still has a long way to go before the benefits outweigh the costs.
So what are the alternatives? Well, for parallelism, we have threads. Threads
are great in Rust. However, for peer-to-peer networking, threads can get a little
unwieldy. For my use case, the ability to handle between a dozen and a few
hundred open connections is all I need. And I believe this is true for most
people, especially in the peer-to-peer space. This requirement places us squarely
within the territory of the venerable `poll(2)` system call, which is available on
almost all platforms nowadays.1
Popol is designed as a minimal ergonomic wrapper around `poll`,
built for use cases such as peer-to-peer networking, where you typically have
no more than a few hundred concurrent connections. It’s meant to be familiar
enough for those with experience using mio, but a little easier to use, and
a lot smaller.2
```rust
use std::io::prelude::*;
use std::{io, process, time::Duration};

// Create a registry to hold I/O sources.
let mut sources = popol::Sources::with_capacity(1);
// Create an events buffer to hold readiness events.
let mut events = popol::Events::with_capacity(1);

// Register the program's standard input as a source of "read" readiness events.
// The first parameter is the key we want to associate with the source. Since
// we only have one source in this example, we just pass in the unit type.
sources.register((), &io::stdin(), popol::interest::READ);

// Wait on our event sources for at most 6 seconds. If an event source is
// ready before then, process its events. Otherwise, time out.
match sources.wait_timeout(&mut events, Duration::from_secs(6)) {
    Ok(()) => {}
    Err(err) if err.kind() == io::ErrorKind::TimedOut => process::exit(1),
    Err(err) => return Err(err),
}

// Iterate over source events. Since we only have one source
// registered, this will only iterate once.
for ((), event) in events.iter() {
    // The standard input has data ready to be read.
    if event.readable || event.hangup {
        let mut buf = [0; 1024];

        // Read what we can from standard input and echo it.
        match io::stdin().read(&mut buf[..]) {
            Ok(n) => io::stdout().write_all(&buf[..n])?,
            Err(err) => panic!("{}", err),
        }
    }
}
```
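To try it out (assuming the snippet above is wrapped in a `main` function
returning `io::Result<()>` in a binary crate), pipe some data to the program:

```
$ echo "hello, world" | cargo run
hello, world
```

If standard input stays silent for six seconds instead, the process exits with
status 1.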
By building on `poll`, we have the following benefits:

- Much smaller implementation than even the smallest async/await runtimes.3
- All the standard library traits just work (`io::Read`, `io::Write`, etc.)
- All the standard library sockets work (`net::TcpStream`, `net::UdpSocket`, etc.)
- Having no “runtime” keeps the code much easier to reason about.
Some people ask why I didn’t build on `epoll`4. There are a couple of reasons:

- `epoll` is more complex than `poll` and thus requires us to write more code.
- `epoll` isn’t portable; it only works on Linux.
- `epoll` requires a system call to add, remove or modify the list of sources to poll. This introduces new failure modes. With `poll`, the set of sources is just an array passed in on each call (see the sketch below).
- Even though `epoll` should be able to handle more connections, `poll` is plenty for all the use cases I’m interested in.
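For a sense of how little machinery `poll` needs, here’s a rough sketch of the
kind of call a wrapper like popol makes, using the `libc` crate directly. The
function and its name are mine, for illustration – not popol’s actual internals:

```rust
use std::io;
use std::os::unix::io::AsRawFd;

/// Wait until `fd` is ready for reading, or until the timeout expires.
/// Returns `Ok(true)` if the source is readable, `Ok(false)` on timeout.
fn wait_readable(fd: &impl AsRawFd, timeout_ms: i32) -> io::Result<bool> {
    // The entire state `poll` needs is this array, passed in on every call.
    // There is nothing to register or unregister with the kernel beforehand.
    let mut fds = [libc::pollfd {
        fd: fd.as_raw_fd(),
        events: libc::POLLIN, // We're interested in "read" readiness.
        revents: 0,           // Filled in by the kernel.
    }];

    // The only system call involved, and the only point of failure.
    let n = unsafe { libc::poll(fds.as_mut_ptr(), fds.len() as libc::nfds_t, timeout_ms) };
    if n < 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(n > 0 && fds[0].revents & libc::POLLIN != 0)
}
```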
How about mio? It solves the first two problems, one might say. But there are a couple of things that irked me about mio. In particular:

- mio comes bundled with its own socket types, which you are supposed to use instead of the ones provided by the standard library.
- mio identifies event sources by a `Token` type, which is basically a wrapper around a `u64`. This is leaked from the underlying `epoll` implementation, and is less than ideal.
Compared to mio, popol:

- Is a lot smaller (about 10% of the size).
- Allows you to use any type implementing `Eq` and `Clone` to identify event sources. It’s not limited to `u64`.
- Supports standard library sockets (and other file types) transparently.
- Supports multiple wakers per wait call.
- Has only a single API call that can fail, since it’s built on `poll`.

Some of the above is exemplified by popol’s `Sources::register` function below:
```rust
impl<K: Eq + Clone> Sources<K> {
    /// Register a new source with the given key, and wait for the specified
    /// events.
    pub fn register(&mut self, key: K, source: &impl AsRawFd, events: Interest);
}
```
It goes without saying that popol is also a lot less mature than mio, so use it at your own risk! Popol also doesn’t currently support Windows, though this is planned.
Finally, we can look at an example of a TCP server which accepts incoming
connections and registers them with popol. This example showcases the
use of more complex types for `K`:
```rust
use std::collections::HashMap;
use std::{io, net};

/// The identifier we'll use with `popol` to figure out the source of an event.
/// The `K` in `Sources<K>`.
#[derive(Eq, PartialEq, Clone)]
enum Source {
    /// An event from a connected peer.
    Peer(net::SocketAddr),
    /// An event on the listening socket. Most probably a new peer connection.
    Listener,
}

let listener = net::TcpListener::bind("0.0.0.0:8888")?;
let mut sources = popol::Sources::new();
let mut events = popol::Events::new();

// Keep the peer streams alive: dropping a `TcpStream` closes the connection.
let mut peers = HashMap::new();

// Register the listener socket, using the corresponding identifier.
sources.register(Source::Listener, &listener, popol::interest::READ);

// It's important to set the socket in non-blocking mode. This allows
// us to know when to stop accepting connections.
listener.set_nonblocking(true)?;

loop {
    // Wait for something to happen on our sources.
    sources.wait(&mut events)?;

    for (key, event) in events.iter() {
        match key {
            Source::Listener => loop {
                // Accept as many connections as we can.
                let (conn, addr) = match listener.accept() {
                    Ok((conn, addr)) => (conn, addr),
                    Err(e) if e.kind() == io::ErrorKind::WouldBlock => break,
                    Err(e) => return Err(e),
                };
                // Register the new peer using the `Peer` variant of `Source`,
                // and store the stream so that it isn't dropped and closed.
                sources.register(Source::Peer(addr), &conn, popol::interest::ALL);
                peers.insert(addr, conn);
            },
            Source::Peer(addr) if event.readable => {
                println!("{} has data to be read", addr);
            }
            Source::Peer(addr) if event.writable => {
                println!("{} is ready to be written", addr);
            }
            // The guards above aren't exhaustive, so we need a catch-all arm.
            Source::Peer(_) => {}
        }
    }
}
```
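Assuming the loop above lives in a `main` function returning `io::Result<()>`,
you can poke at the server with netcat from another terminal:

```
$ echo "hello" | nc localhost 8888
```

The server should print that the peer has data to be read, and, since we
registered it with `popol::interest::ALL`, that it’s ready to be written to.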
Popol is available on crates.io. The source code is hosted on GitHub. Discuss on reddit.