
Rust Patterns That Matter #18: Typestate

Post 18 of 22 in Rust Patterns That Matter. Companion series: Building a Chat Server in Rust.


You have a connection that goes through states: disconnected, connecting, connected. You can only send data when connected. The obvious implementation checks the state at runtime and panics or returns an error on invalid transitions. The typestate pattern makes invalid transitions a compile error instead.

The motivation

use std::io;

enum State { Disconnected, Connecting, Connected }

struct Connection {
    state: State,
}

impl Connection {
    // Every state-dependent method has to repeat a check like this one.
    fn send(&self, data: &[u8]) -> Result<(), io::Error> {
        match self.state {
            State::Connected => { /* actually send `data` */ Ok(()) }
            _ => Err(io::Error::new(io::ErrorKind::NotConnected, "not connected")),
        }
    }
}

The problem: every method that depends on state needs a runtime check. Forget a check and you have a bug. The type system isn't helping - from the compiler's perspective, a Connection is always the same type regardless of its state.

The pattern: states as types

Encode each state as a separate zero-sized type. Make Connection generic over the state. Transition methods consume the connection in its current state and return it in the new one:

// State markers — zero-sized, exist only in the type system
struct Disconnected;
struct Connecting;
struct Connected;

struct Connection<S> {
    addr: String,
    _state: std::marker::PhantomData<S>,
}

impl Connection<Disconnected> {
    fn new(addr: String) -> Self {
        Connection { addr, _state: std::marker::PhantomData }
    }

    fn connect(self) -> Connection<Connecting> {
        println!("connecting to {}...", self.addr);
        Connection { addr: self.addr, _state: std::marker::PhantomData }
    }
}

impl Connection<Connecting> {
    fn on_connected(self) -> Connection<Connected> {
        println!("connected!");
        Connection { addr: self.addr, _state: std::marker::PhantomData }
    }
}

impl Connection<Connected> {
    fn send(&self, data: &[u8]) {
        println!("sending {} bytes", data.len());
    }

    fn disconnect(self) -> Connection<Disconnected> {
        println!("disconnected");
        Connection { addr: self.addr, _state: std::marker::PhantomData }
    }
}

The send method only exists on Connection<Connected>. Calling it on a disconnected connection doesn't produce a runtime error - it doesn't compile:

let conn = Connection::new("127.0.0.1".into());

// conn.send(b"hello"); // ERROR: no method named `send` found
//                      //        for `Connection<Disconnected>`

let conn = conn.connect();     // Connection<Connecting>
let conn = conn.on_connected(); // Connection<Connected>

conn.send(b"hello");            // OK — we're connected

let conn = conn.disconnect();   // Connection<Disconnected>
// conn.send(b"hello");         // ERROR again — disconnected

Why transition methods consume self

Each transition takes self by value, not by reference. Calling .connect() moves the Connection<Disconnected> into the method, so the old binding can no longer be used and any later use of it is a compile error. This prevents you from accidentally acting on a stale state.
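A minimal sketch of the move in action, using a trimmed-down two-state version of the Connection type from above. The variable names idle and pending are made up for illustration, and the failing line is left commented out so the snippet still compiles:

```rust
use std::marker::PhantomData;

struct Disconnected;
struct Connecting;

struct Connection<S> {
    addr: String,
    _state: PhantomData<S>,
}

impl Connection<Disconnected> {
    fn new(addr: String) -> Self {
        Connection { addr, _state: PhantomData }
    }

    // Takes `self` by value: the caller's Connection<Disconnected> is moved in.
    fn connect(self) -> Connection<Connecting> {
        Connection { addr: self.addr, _state: PhantomData }
    }
}

fn main() {
    let idle = Connection::new("127.0.0.1".to_string());
    let pending = idle.connect(); // `idle` is moved here

    // Uncommenting the next line fails to compile with
    // error[E0382]: use of moved value: `idle`
    // let again = idle.connect();

    println!("still connecting to {}", pending.addr);
}
```

Had connect taken &self instead, idle would remain usable after the call, and nothing would stop code from holding both a disconnected and a connecting handle to the same connection.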

Zero-sized types

struct Disconnected; has no fields and occupies zero bytes. The PhantomData<S> in the connection also occupies zero bytes. The state marker exists purely in the type system - it's erased completely at compile time. Connection<Disconnected> and Connection<Connected> have the same runtime representation (just the addr field).
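The size claims can be checked with std::mem::size_of. This is a sketch rather than a guarantee: PhantomData and field-less structs are zero-sized by definition, but the final assertion (that a Connection is exactly as large as its String) reflects what current compilers do for this struct, not a documented layout promise for default-repr types.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

struct Disconnected;
struct Connected;

struct Connection<S> {
    addr: String,
    _state: PhantomData<S>,
}

fn main() {
    // Marker types and PhantomData occupy no space.
    assert_eq!(size_of::<Disconnected>(), 0);
    assert_eq!(size_of::<PhantomData<Connected>>(), 0);

    // Both states have the same size...
    assert_eq!(
        size_of::<Connection<Disconnected>>(),
        size_of::<Connection<Connected>>()
    );
    // ...and in practice that size is just the `addr` field.
    assert_eq!(size_of::<Connection<Connected>>(), size_of::<String>());

    println!("Connection<_> is {} bytes", size_of::<Connection<Connected>>());
}
```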

Tradeoffs

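When to use it: state machines whose states and transitions are fixed at compile time, where calling a method in the wrong state should be impossible rather than merely an error, such as the connection lifecycle above.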

When not to: dynamic state machines, state determined by runtime data, or when the number of states is large and the combinatorics of state-specific methods become unwieldy. Use an enum with runtime checks in those cases.
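For comparison, here is a rough sketch of the enum-with-runtime-checks fallback for the case where transitions are driven by runtime data. The Event type and the step function are hypothetical names introduced for this example; they are not part of the code above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State { Disconnected, Connecting, Connected }

// Hypothetical event type, standing in for input that only arrives at runtime.
#[derive(Debug, Clone, Copy)]
enum Event { Dial, Established, Closed }

// Returns the next state, or an error describing the invalid transition.
fn step(state: State, event: Event) -> Result<State, String> {
    match (state, event) {
        (State::Disconnected, Event::Dial) => Ok(State::Connecting),
        (State::Connecting, Event::Established) => Ok(State::Connected),
        (State::Connected, Event::Closed) => Ok(State::Disconnected),
        (s, e) => Err(format!("invalid transition: {:?} on {:?}", s, e)),
    }
}

fn main() {
    // The event sequence is data, so the compiler cannot know the state in advance.
    let events = [Event::Dial, Event::Established, Event::Closed];
    let mut state = State::Disconnected;
    for event in events {
        state = step(state, event).expect("valid transition");
    }
    assert_eq!(state, State::Disconnected);

    // Invalid transitions surface as runtime errors instead of compile errors.
    assert!(step(State::Disconnected, Event::Established).is_err());
}
```

The whole transition table lives in one match, which stays manageable as states are added. The cost is the one the typestate version avoids: mistakes are only caught when the code runs.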

See it in practice: Building a Chat Server #4: Commands and Plugins uses this pattern for connection lifecycle safety.