Building a Chat Server in Rust #3: Parsing and Performance
Post 3 of 6 in Building a Chat Server in Rust. Companion series: Rust Patterns That Matter.
Previous: #2: Rooms and Users | Next: #4: Commands and Plugins
Our chat server has rooms and users, but the wire format is ad hoc - just raw text. Now we define a proper protocol, parse it without allocating where possible, and add a custom iterator for streaming frame extraction. Four patterns make this work.
The code is on the 03-parsing branch.
The protocol
Every line follows the format TYPE:PAYLOAD\n:
MSG:alice:hello world # chat message
JOIN:general # join a room
NICK:bob # change username
QUIT: # disconnect
Simple, line-delimited, human-readable. We parse each line into a Frame enum:
pub enum Frame<'a> {
    Msg {
        username: Cow<'a, str>,
        body: Cow<'a, str>,
    },
    Join {
        room: Cow<'a, str>,
    },
    Nick {
        name: Cow<'a, str>,
    },
    Quit,
}
Two things to notice: the lifetime 'a on the enum, and Cow
on every string field. Both exist for the same reason: zero-copy parsing.
Pattern #10: Lifetime annotations - borrowing from the input
When we parse "MSG:alice:hello world", the substrings
"alice" and "hello world" already exist in the input
buffer. Why allocate new Strings? We can just point into the
original buffer.
But Rust needs proof that the parsed Frame won't outlive the buffer
it borrows from. That's what the lifetime annotation 'a does -
it ties the Frame to its input:
pub fn parse_frame<'a>(line: &'a str) -> Result<Frame<'a>, ChatError> {
    let (cmd, payload) = line
        .split_once(':')
        .ok_or_else(|| ChatError::Parse("missing ':' delimiter".into()))?;
    match cmd {
        "MSG" => {
            let (username, body) = payload
                .split_once(':')
                .ok_or_else(|| ChatError::Parse("MSG requires username:body".into()))?;
            Ok(Frame::Msg {
                username: Cow::Borrowed(username.trim()),
                body: Cow::Borrowed(body),
            })
        }
        // ... other variants
    }
}
Read the signature: the input line is borrowed for 'a, and the
returned Frame is valid for at most 'a. The compiler rejects any
code that uses the Frame after the input buffer is freed. No
dangling pointers, no use-after-free, no runtime cost.
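To make the borrowing concrete, here is a self-contained sketch of the parser with every arm filled in. The ChatError stand-in, the JOIN, NICK, and QUIT arms, the unknown-type error, and the is_slice_of helper are assumptions for illustration and may differ from the code on the branch.

```rust
use std::borrow::Cow;

// Hypothetical stand-in for the series' ChatError; only a Parse variant is assumed.
#[derive(Debug, PartialEq)]
pub enum ChatError {
    Parse(String),
}

#[derive(Debug, PartialEq)]
pub enum Frame<'a> {
    Msg { username: Cow<'a, str>, body: Cow<'a, str> },
    Join { room: Cow<'a, str> },
    Nick { name: Cow<'a, str> },
    Quit,
}

pub fn parse_frame<'a>(line: &'a str) -> Result<Frame<'a>, ChatError> {
    let (cmd, payload) = line
        .split_once(':')
        .ok_or_else(|| ChatError::Parse("missing ':' delimiter".into()))?;
    match cmd {
        "MSG" => {
            let (username, body) = payload
                .split_once(':')
                .ok_or_else(|| ChatError::Parse("MSG requires username:body".into()))?;
            Ok(Frame::Msg {
                username: Cow::Borrowed(username.trim()),
                body: Cow::Borrowed(body),
            })
        }
        // Assumed handling: the payload is the room or name, trimmed.
        "JOIN" => Ok(Frame::Join { room: Cow::Borrowed(payload.trim()) }),
        "NICK" => Ok(Frame::Nick { name: Cow::Borrowed(payload.trim()) }),
        "QUIT" => Ok(Frame::Quit),
        other => Err(ChatError::Parse(format!("unknown frame type: {other}"))),
    }
}

/// Illustration only: true when `part` points into the memory of `whole`,
/// which means nothing was copied.
pub fn is_slice_of(whole: &str, part: &str) -> bool {
    let start = whole.as_ptr() as usize;
    let p = part.as_ptr() as usize;
    p >= start && p + part.len() <= start + whole.len()
}
```

Calling is_slice_of on the input line and a parsed username or body returns true: the fields are views into the line, not copies. If you drop the line while a Frame parsed from it is still in use, the program fails to compile, which is the guarantee the 'a annotation buys.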
Deep dive: Rust Patterns #10: Lifetime Annotations covers the mental model: lifetimes describe, they don't control.
Pattern #11: Cow - borrow or own
Cow<'a, str> stands for "clone on write." It holds either a
borrowed &'a str or an owned String. The parser
creates Cow::Borrowed when it can point into the input:
// No allocation — just a pointer into the input buffer.
username: Cow::Borrowed(username.trim()),
When the input passes through unchanged, the borrowed slice costs nothing.
But when the server has to change the data, for example by stripping control
characters from user input, the result no longer exists anywhere in the input
buffer and needs an allocation. That's Cow::Owned:
// Allocation — only when the data needs transformation.
body: Cow::Owned(raw_body.trim().to_string()),
The caller doesn't care which variant it got - Cow derefs to
&str either way. Borrow if you can, own if you must.
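A minimal sketch of "borrow if you can, own if you must" in a single function. The strip_control helper is hypothetical and not part of the series' code; it only shows how one return type can cover both cases.

```rust
use std::borrow::Cow;

/// Hypothetical helper: removes control characters, allocating only when
/// there is something to remove.
pub fn strip_control(input: &str) -> Cow<'_, str> {
    if input.chars().any(|c| c.is_control()) {
        // The cleaned text does not exist anywhere in the input,
        // so we have to build a new String.
        Cow::Owned(input.chars().filter(|c| !c.is_control()).collect::<String>())
    } else {
        // Nothing to change: hand back the original slice.
        Cow::Borrowed(input)
    }
}
```

Callers can treat the result as a &str in both cases. Only code that cares about allocations needs to match on the variant.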
Deep dive: Rust Patterns #11: Cow
covers the full pattern including Cow<[u8]> and
Cow<Path>.
Pattern #12: Custom iterators - streaming frame extraction
Our server reads one line at a time with BufReader. But in production
protocol parsers - especially async ones - bytes accumulate in a buffer
from raw reads, and you extract frames as they complete. FrameIter
demonstrates this: implement Iterator's one required method, and you
get map, filter, collect for free:
pub struct FrameIter<'a> {
    buf: &'a str,
    pos: usize,
}
impl<'a> Iterator for FrameIter<'a> {
    type Item = Result<Frame<'a>, ChatError>;
    fn next(&mut self) -> Option<Self::Item> {
        let remaining = &self.buf[self.pos..];
        let newline = remaining.find('\n')?;
        let line = &remaining[..newline];
        self.pos += newline + 1;
        if line.trim().is_empty() {
            return self.next();
        }
        Some(parse_frame(line))
    }
}
Because the iterator is lazy, nothing is parsed until the caller asks for
the next item. It yields one Frame per complete line and stops at
incomplete data. FrameIter's consumed() method (not shown here) tells the
caller how many bytes to drain from the buffer.
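The consumed() method is not shown above, so here is one way it might look. This sketch assumes a simplified LineIter that yields raw lines rather than parsed frames, and a hypothetical drain_lines helper. It also skips blank lines with a loop instead of recursion.

```rust
/// Simplified stand-in for FrameIter: yields raw lines instead of parsed frames.
pub struct LineIter<'a> {
    buf: &'a str,
    pos: usize,
}

impl<'a> LineIter<'a> {
    pub fn new(buf: &'a str) -> Self {
        LineIter { buf, pos: 0 }
    }

    /// Bytes consumed so far: everything up to and including the last
    /// newline the iterator has moved past.
    pub fn consumed(&self) -> usize {
        self.pos
    }
}

impl<'a> Iterator for LineIter<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        loop {
            let remaining = &self.buf[self.pos..];
            // No newline means the last line is incomplete: stop here and
            // leave those bytes for the next read.
            let newline = remaining.find('\n')?;
            self.pos += newline + 1;
            let line = &remaining[..newline];
            if !line.trim().is_empty() {
                return Some(line);
            }
        }
    }
}

/// Hypothetical helper: extracts every complete line from `buffer`, then
/// removes the consumed bytes so only the partial tail remains.
pub fn drain_lines(buffer: &mut String) -> Vec<String> {
    let mut iter = LineIter::new(buffer.as_str());
    let lines: Vec<String> = iter.by_ref().map(str::to_owned).collect();
    let consumed = iter.consumed();
    buffer.drain(..consumed);
    lines
}
```

The lines are copied to owned Strings before the drain because draining mutates the buffer the borrowed slices point into. The borrow checker would reject the drain otherwise.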
Deep dive: Rust Patterns #12: Custom Iterators
covers the full Iterator trait and how to write your own.
Pattern #13: 'static + Clone - the escape hatch
Zero-copy parsing is great - until you need the parsed data to outlive the
input buffer. In our server, we parse a Frame from a line, but then
we need to broadcast the message to other users. The line is about to be freed.
What do we do?
Clone it. Convert the borrowed data to owned data:
impl<'a> Frame<'a> {
    pub fn into_owned(self) -> Frame<'static> {
        match self {
            Frame::Msg { username, body } => Frame::Msg {
                username: Cow::Owned(username.into_owned()),
                body: Cow::Owned(body.into_owned()),
            },
            // ... other variants
        }
    }
}
Cow::into_owned() returns a String, cloning only if the Cow was
borrowed, and we wrap it back in Cow::Owned. The result has lifetime 'static
- it holds no borrowed references, so it can live as long as we need. We use
it at the boundary where zero-copy meets "I need to keep this":
// Parse - zero-copy, borrows from `line`.
let frame = parse_frame(&line)?;
// Clone at the boundary - only when we need to keep it.
let owned_msg = frame.into_owned();
// Broadcast - owned_msg can outlive `line`.
self.broadcast(room_id, user_id, &owned_msg)?;
This is the pragmatic approach: borrow by default, clone at boundaries. Don't optimise until you profile.
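To see why the 'static bound matters, here is a small sketch that moves a parsed value to another thread after its source buffer is gone. The one-field Msg struct and the relay function are hypothetical simplifications, not the server's actual broadcast path.

```rust
use std::borrow::Cow;
use std::thread;

/// Simplified, hypothetical one-field stand-in for Frame::Msg.
#[derive(Debug)]
pub struct Msg<'a> {
    pub body: Cow<'a, str>,
}

impl<'a> Msg<'a> {
    /// Copies borrowed data (if any) so the result no longer depends on 'a.
    pub fn into_owned(self) -> Msg<'static> {
        Msg { body: Cow::Owned(self.body.into_owned()) }
    }
}

/// Builds a Msg from a short-lived buffer and hands it to another thread.
pub fn relay(input: &str) -> String {
    let owned: Msg<'static> = {
        let line = String::from(input); // stands in for the read buffer
        let borrowed = Msg { body: Cow::Borrowed(line.trim_end()) };
        borrowed.into_owned()
        // `line` is dropped at the end of this block; `owned` is unaffected.
    };
    // thread::spawn requires a 'static closure, so a Msg that still
    // borrowed from `line` would not compile here.
    thread::spawn(move || format!("<relay> {}", owned.body))
        .join()
        .expect("relay thread panicked")
}
```

The single allocation happens inside into_owned(), at the point where the data has to outlive its buffer.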
Deep dive: Rust Patterns #13: 'static + Clone covers the escape hatch and when it's the right call.
Try it
# Terminal 1
git checkout 03-parsing
cargo run
# Terminal 2
nc 127.0.0.1 8080
alice # → Welcome, alice!
MSG:alice:hello world # → <alice> hello world
JOIN:general # → * You joined #general
NICK:alicia # → * You are now alicia (was alice)
QUIT: # → * Goodbye!
What we have, what's missing
We now have a structured protocol with four patterns:
- Lifetime annotations - Frame<'a> ties the parsed data to the input buffer's lifetime.
- Cow - borrow from the input buffer when clean, own when transformed. Zero allocations in the common path.
- Custom iterators - FrameIter yields frames from a byte buffer, with all of Iterator's combinators for free.
- 'static + Clone - into_owned() converts borrowed data to owned data at scope boundaries.
What's missing: the protocol works, but the only "commands" are baked into the frame parser. Next time we build a proper command system with enum dispatch, closures, a builder, and typestate connections.