Rust Patterns That Matter #8: Index-Based Design
Post 8 of 22 in Rust Patterns That Matter. Companion series: Building a Chat Server in Rust.
Previous: #7: Split Borrows | Next: #9: Drop and RAII
Some data structures don't fit Rust's ownership model naturally. Graphs, trees with parent pointers, entity-component systems - anything where things reference each other in cycles or complex webs. The solution is surprisingly simple: stop using references.
The motivation
You're building a graph. Nodes have edges to other nodes. In Python, you'd store a list of references to neighbour objects. In Rust, you try to do the same with references.
struct Node<'a> {
value: String,
neighbors: Vec<&'a Node<'a>>,
}
fn main() {
let mut a = Node { value: "A".into(), neighbors: vec![] };
let mut b = Node { value: "B".into(), neighbors: vec![] };
a.neighbors.push(&b);
b.neighbors.push(&a); // ERROR: cannot borrow `a` as immutable
// because it is also borrowed as mutable
}
This is the tip of the iceberg. Even if you fix this particular error, you
can't add nodes to a Vec<Node> while other nodes reference
into it - the Vec might reallocate, invalidating all references.
Self-referential structures and Rust's borrow checker are fundamentally at odds.
Why references don't work for graphs
Rust references are compile-time-checked pointers. They guarantee that the data they point to is valid and not being mutated by someone else. This works beautifully for tree-shaped ownership. It falls apart when the relationships form cycles or when the container holding the data might move it in memory.
A Vec can reallocate when it grows. If node A stores &node_b
and node_b lives inside a Vec, pushing a new node to that
Vec might move node_b to a new address. The reference in
node A is now dangling. Rust prevents this at compile time, which is why you can't
get a long-lived reference into a Vec that might change.
The pattern: indices into a Vec
Instead of storing references to nodes, store their position in a shared array. Relationships become integers, not pointers.
struct Graph {
nodes: Vec<Node>,
}
struct Node {
value: String,
neighbors: Vec<usize>, // indices into Graph::nodes
}
impl Graph {
fn new() -> Self {
Graph { nodes: vec![] }
}
fn add_node(&mut self, value: String) -> usize {
let idx = self.nodes.len();
self.nodes.push(Node { value, neighbors: vec![] });
idx
}
fn add_edge(&mut self, from: usize, to: usize) {
self.nodes[from].neighbors.push(to);
self.nodes[to].neighbors.push(from);
}
}
fn main() {
let mut g = Graph::new();
let a = g.add_node("A".into());
let b = g.add_node("B".into());
g.add_edge(a, b); // no borrow issues — just integers
}
No lifetimes, no borrow checker conflicts, no self-referential structures.
Relationships are just numbers. The Vec can grow freely because
nobody holds a reference into it - they hold indices, which remain valid
regardless of reallocation.
Arenas
A Vec used as a node store is called an arena. All nodes share the
arena's lifetime. Allocation is a push. Lookup is an index.
Deallocation happens when the arena is dropped.
Dedicated arena crates like typed-arena or bumpalo give
you stable references (nodes don't move once allocated) if you need them. But for
most graph-like structures, plain Vec + indices is sufficient and
requires no dependencies.
Making indices type-safe
A bare usize is error-prone - you could accidentally use a node
index as an edge index, or index into the wrong vector. The newtype pattern
(#1) fixes this:
struct NodeId(usize);
struct EdgeId(usize);
struct Node {
value: String,
neighbors: Vec<NodeId>,
}
Now the compiler catches it if you pass an EdgeId where a
NodeId is expected. Zero runtime cost.
ECS: this pattern at scale
Entity Component Systems take this idea to its logical conclusion. Entities are
indices. Components are stored in parallel arrays keyed by entity index. To check
if entity 42 has a position: positions[42]. To check its health:
healths[42]. No references between components, no ownership tangles,
excellent cache locality.
Game engines (Bevy, Legion) and simulation systems are built entirely on this principle. It scales to millions of entities precisely because it avoids the pointer-chasing and lifetime complexity that references would introduce.
Tradeoffs
- No compile-time validity. An index is just a number. If you remove a node without cleaning up references to it, the index becomes stale. This is a runtime concern, not a compile-time one.
- Generational indices mitigate stale references: each slot has a generation counter that increments on removal. An index stores both the position and the generation; lookup checks that they match.
- Cache-friendly. Nodes live in contiguous memory. Iterating
over all nodes is a linear scan. This is often faster than chasing heap-allocated
Box<Node>pointers.
When to use it
- Graphs with cycles (the motivating example)
- Trees where children need parent pointers
- ECS and simulation systems
- Any structure where Rust's ownership model and reference relationships conflict
When not to: if your data is a simple tree with clear parent-to-child ownership and
no cycles, Box<Node> works fine and the compiler checks everything
statically.
See it in practice: Building a Chat Server #2: Rooms and Users uses this pattern for the server's user and room storage.
Telex