Rust Patterns That Matter #12: Custom Iterators
Post 12 of 22 in Rust Patterns That Matter. Companion series: Building a Chat Server in Rust.
Everyone uses .map().filter().collect(). But when you need to iterate over your own FrameParser or RoomList, you implement Iterator - one required method, and you get map, filter, collect, enumerate, zip, take, skip, and dozens more provided methods for free.
The motivation
You're parsing a chat protocol. Messages arrive as bytes in a buffer. You need to extract frames one at a time: find the delimiter, slice out the payload, advance the cursor. Something like this:
    struct FrameParser<'a> {
        buf: &'a [u8],
        pos: usize,
    }

    impl<'a> FrameParser<'a> {
        fn next_frame(&mut self) -> Option<&'a [u8]> {
            if self.pos >= self.buf.len() {
                return None;
            }
            let start = self.pos;
            match self.buf[start..].iter().position(|&b| b == b'\n') {
                Some(offset) => {
                    self.pos = start + offset + 1;
                    Some(&self.buf[start..start + offset])
                }
                None => {
                    self.pos = self.buf.len();
                    Some(&self.buf[start..])
                }
            }
        }
    }
This works, but now you can't use any iterator adaptors. You write manual loops everywhere:
    let mut parser = FrameParser { buf: &data, pos: 0 };
    let mut messages = Vec::new();
    while let Some(frame) = parser.next_frame() {
        if let Ok(msg) = std::str::from_utf8(frame) {
            if !msg.starts_with("PING") {
                messages.push(msg);
            }
        }
    }
You'd rather write parser.map(...).filter(...).collect(). But FrameParser isn't an iterator - it's just a struct with a method.
The Iterator trait
One required method:
    trait Iterator {
        type Item;
        fn next(&mut self) -> Option<Self::Item>;
    }
Return Some(value) for the next item. Return None when the sequence is done. That's it. Implement this and you get .map(), .filter(), .take(), .enumerate(), .collect(), and dozens of other provided methods for free.
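As a minimal sketch of that contract, here is a tiny iterator unrelated to the chat protocol. Countdown and even_squares are hypothetical names invented for illustration; the only code written for the type is next, and filter, map and collect come from the trait.

```rust
/// Hypothetical example type: yields `remaining`, `remaining - 1`, ..., 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            return None; // exhausted: every later call also returns None
        }
        let current = self.remaining;
        self.remaining -= 1;
        Some(current)
    }
}

/// Uses provided methods that were never written for Countdown.
fn even_squares(from: u32) -> Vec<u32> {
    Countdown { remaining: from }
        .filter(|n| n % 2 == 0)
        .map(|n| n * n)
        .collect()
}
```

Calling even_squares(5) walks 5, 4, 3, 2, 1, keeps 4 and 2, and returns their squares in that order.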
Implementing it
The FrameParser already has the right shape - next_frame takes &mut self and returns an Option. Move it into an Iterator impl and rename it to next:
    impl<'a> Iterator for FrameParser<'a> {
        type Item = &'a [u8];

        fn next(&mut self) -> Option<Self::Item> {
            if self.pos >= self.buf.len() {
                return None;
            }
            let start = self.pos;
            match self.buf[start..].iter().position(|&b| b == b'\n') {
                Some(offset) => {
                    self.pos = start + offset + 1;
                    Some(&self.buf[start..start + offset])
                }
                None => {
                    self.pos = self.buf.len();
                    Some(&self.buf[start..])
                }
            }
        }
    }
Now the manual loop becomes:
    let parser = FrameParser { buf: &data, pos: 0 };
    let messages: Vec<&str> = parser
        .filter_map(|frame| std::str::from_utf8(frame).ok())
        .filter(|msg| !msg.starts_with("PING"))
        .collect();
Same logic. No manual loop. Composable.
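To try the pipeline on concrete input, the pieces can be put together in one self-contained sketch. The parser is the one from this post; chat_messages is a hypothetical wrapper added only so there is something to call, and the sample bytes are made up.

```rust
/// Splits a byte buffer into newline-delimited frames (same logic as above).
struct FrameParser<'a> {
    buf: &'a [u8],
    pos: usize,
}

impl<'a> Iterator for FrameParser<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<Self::Item> {
        if self.pos >= self.buf.len() {
            return None;
        }
        let start = self.pos;
        match self.buf[start..].iter().position(|&b| b == b'\n') {
            Some(offset) => {
                // Skip past the delimiter; the frame excludes it.
                self.pos = start + offset + 1;
                Some(&self.buf[start..start + offset])
            }
            None => {
                // No trailing delimiter: the rest of the buffer is the last frame.
                self.pos = self.buf.len();
                Some(&self.buf[start..])
            }
        }
    }
}

/// Hypothetical helper: every frame that is valid UTF-8 and not a PING.
fn chat_messages(data: &[u8]) -> Vec<&str> {
    FrameParser { buf: data, pos: 0 }
        .filter_map(|frame| std::str::from_utf8(frame).ok())
        .filter(|msg| !msg.starts_with("PING"))
        .collect()
}
```

For the input b"hello\nPING 1\n\xff\xfe\nbye", the PING frame and the invalid UTF-8 frame are dropped, leaving "hello" and "bye".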
IntoIterator
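The for loop in Rust doesn't start by calling .next() - it first calls .into_iter() on the expression, then calls .next() on the result until it returns None. Every Iterator implements IntoIterator automatically, so FrameParser already works in a for loop. For a collection, for x in collection works once the type implements IntoIterator: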
    impl<'a> IntoIterator for &'a RoomList {
        type Item = &'a Room;
        type IntoIter = std::slice::Iter<'a, Room>;

        fn into_iter(self) -> Self::IntoIter {
            self.rooms.iter()
        }
    }
Now users can write for room in &room_list. The convention is three impls: for &T (shared borrows), for &mut T (mutable borrows), and for T (consuming). Not every type needs all three - implement what makes sense.
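A sketch of all three impls side by side, assuming RoomList wraps a Vec<Room>. The post does not show the fields of either type, so the name field, the rooms field, and the rename_and_drain function are assumptions made for illustration.

```rust
// Hypothetical field layout; the real Room and RoomList may differ.
struct Room {
    name: String,
}

struct RoomList {
    rooms: Vec<Room>,
}

// `for room in &list` yields &Room.
impl<'a> IntoIterator for &'a RoomList {
    type Item = &'a Room;
    type IntoIter = std::slice::Iter<'a, Room>;

    fn into_iter(self) -> Self::IntoIter {
        self.rooms.iter()
    }
}

// `for room in &mut list` yields &mut Room.
impl<'a> IntoIterator for &'a mut RoomList {
    type Item = &'a mut Room;
    type IntoIter = std::slice::IterMut<'a, Room>;

    fn into_iter(self) -> Self::IntoIter {
        self.rooms.iter_mut()
    }
}

// `for room in list` consumes the list and yields owned Room values.
impl IntoIterator for RoomList {
    type Item = Room;
    type IntoIter = std::vec::IntoIter<Room>;

    fn into_iter(self) -> Self::IntoIter {
        self.rooms.into_iter()
    }
}

/// Exercises each impl in turn: mutate in place, read, then take ownership.
fn rename_and_drain(mut list: RoomList) -> (usize, Vec<String>) {
    for room in &mut list {
        room.name.make_ascii_uppercase();
    }
    let mut total_len = 0;
    for room in &list {
        total_len += room.name.len();
    }
    let mut names = Vec::new();
    for room in list {
        names.push(room.name); // moves the String out of the owned Room
    }
    (total_len, names)
}
```

Each impl simply delegates to the matching Vec method (iter, iter_mut, into_iter), which is why the associated IntoIter types come straight from std.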
Lazy evaluation
Iterator adaptors don't do anything until you consume them. This chain:
    let result = parser
        .filter_map(|frame| std::str::from_utf8(frame).ok())
        .filter(|msg| !msg.starts_with("PING"))
        .take(10)
        .collect::<Vec<_>>();
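processes one element at a time through the entire chain. No intermediate Vec is allocated between filter_map and filter. And .take(10) stops pulling from the parser once 10 messages have made it through the filters - the rest of the buffer is never parsed. The adaptors are designed as zero-cost abstractions: the chain typically compiles down to much the same code as the hand-written loop.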
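One way to see the laziness is to count how many items the source actually hands out. The sketch below uses a numeric range as a stand-in for the parser and a Cell counter inside .inspect(); pulls_with_take is a hypothetical name, not something from the chat server.

```rust
use std::cell::Cell;

/// Returns the collected items plus how many source items were pulled.
fn pulls_with_take(limit: usize) -> (Vec<u32>, usize) {
    let pulled = Cell::new(0usize);
    let collected: Vec<u32> = (1..=1_000u32)
        // Runs once for every item the source produces.
        .inspect(|_| pulled.set(pulled.get() + 1))
        .filter(|n| n % 2 == 0)
        .take(limit)
        .collect();
    (collected, pulled.get())
}
```

With a limit of 3 the result is [2, 4, 6] and only 6 of the 1,000 source items are ever produced, because take stops asking for more as soon as it has its third item.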
Useful adaptors for custom iterators
- .peekable() - look at the next item without consuming it. Essential for parsers that need one-token lookahead.
- .chain(other) - concatenate two iterators. Useful for prepending a header frame or appending a trailer.
- .zip(other) - pair items from two iterators. Good for correlating parallel data.
- .enumerate() - attach an index to each item. Cheap way to get position information.
- .by_ref() - borrow the iterator so you can use adaptors and then continue using the original iterator.
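A sketch that combines four of these adaptors on a single stream. It uses str::lines() as a stand-in frame source, and the '#' header convention, the "EOF" trailer and the split_frames name are all invented for illustration; any iterator of frames, including FrameParser, could take the place of lines().

```rust
/// Returns (header frames, first body frame, remaining frames with indices).
fn split_frames(input: &str) -> (Vec<&str>, Vec<&str>, Vec<(usize, &str)>) {
    let mut frames = input.lines().peekable();

    // peekable: next_if peeks at the upcoming frame and consumes it only
    // while it looks like a header, so the first body frame is not lost.
    let mut headers = Vec::new();
    while let Some(header) = frames.next_if(|line| line.starts_with('#')) {
        headers.push(header);
    }

    // by_ref: take(1) borrows the iterator, so `frames` stays usable afterwards.
    let first: Vec<&str> = frames.by_ref().take(1).collect();

    // chain + enumerate: append a trailer frame and number what is left.
    let rest: Vec<(usize, &str)> = frames
        .chain(std::iter::once("EOF"))
        .enumerate()
        .collect();

    (headers, first, rest)
}
```

For the input "#v1\n#utf8\nhello\nworld\nbye", the headers are "#v1" and "#utf8", the first body frame is "hello", and the numbered remainder is (0, "world"), (1, "bye"), (2, "EOF").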
When to use it
- Good uses: parsers, file readers, database cursors, tree traversals, any sequential data source where you want to compose operations
- When not to: if you just need to expose a slice, return &[T] - don't wrap it in a custom iterator. Iterators add value when the data is computed lazily or the traversal is non-trivial.
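For the slice case, a sketch of what returning &[T] looks like, again assuming RoomList wraps a Vec<Room> (the fields, the rooms accessor and the summary function are hypothetical). Callers get len, first, indexing and .iter() from the slice itself, with no custom iterator type to maintain.

```rust
// Hypothetical field layout; the real Room and RoomList may differ.
struct Room {
    name: String,
}

struct RoomList {
    rooms: Vec<Room>,
}

impl RoomList {
    /// Exposes the rooms as a plain slice instead of a custom iterator.
    fn rooms(&self) -> &[Room] {
        &self.rooms
    }
}

/// Uses slice methods and the slice's own iterator on the returned &[Room].
fn summary(list: &RoomList) -> (usize, Option<&str>, Vec<&str>) {
    let rooms = list.rooms();
    let count = rooms.len();
    let first = rooms.first().map(|room| room.name.as_str());
    let short_names: Vec<&str> = rooms
        .iter()
        .map(|room| room.name.as_str())
        .filter(|name| name.len() <= 4)
        .collect();
    (count, first, short_names)
}
```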
What comes next
Iterators let you process data lazily without allocating. But sometimes you do need to own the data - especially when crossing scope boundaries into threads or async tasks. That's where 'static + Clone comes in - the next post.
See it in practice: Building a Chat Server #3: Parsing and Performance uses this pattern for streaming frame extraction from a byte buffer.