Rust Deep Dives #6: Serde Beyond #[derive]: The Attributes You'll Actually Use
Post 6 of 8 in Rust Deep Dives. Companion series: Rust Patterns That Matter.
If you write Rust that talks to the outside world, you use serde. Config files,
HTTP APIs, databases, message queues — serde is behind nearly all of it. The
derive macros get you started in seconds, but real-world data is messy.
Keys are camelCase when your structs are snake_case. Fields show up sometimes and
don't other times. APIs wrap everything in a "data" envelope. Numbers
arrive as strings. This post covers the attributes and patterns you actually reach
for when that happens.
The basics (quick recap)
Add serde with the derive feature and a format crate
like serde_json. Slap #[derive(Serialize, Deserialize)]
on your struct, and you're done. Serde generates the serialization code at compile
time, so there is no reflection and no runtime type inspection.
use serde::{Serialize, Deserialize};
// That's it. This struct can now be converted to/from JSON, TOML,
// YAML, MessagePack, and dozens of other formats.
#[derive(Serialize, Deserialize, Debug)]
struct User {
name: String,
email: String,
age: u32,
}
fn main() {
let json = r#"{"name": "Alice", "email": "alice@example.com", "age": 30}"#;
let user: User = serde_json::from_str(json).unwrap();
println!("{:?}", user);
let back_to_json = serde_json::to_string_pretty(&user).unwrap();
println!("{}", back_to_json);
}
That covers perhaps 40% of real usage. The other 60% is where the attributes come in. Let's go through them.
Field-level attributes
This is where you spend most of your time with serde. The data you're working with almost never matches your Rust structs perfectly out of the box. Field-level attributes bridge the gap between what the external world sends you and what your code wants to work with.
#[serde(rename = "...")]
The JSON key is "userName" but your Rust field is user_name.
This is the most common attribute you'll use. It maps a Rust field name to a
different serialized name without forcing you to break Rust naming conventions.
#[derive(Serialize, Deserialize)]
struct ApiResponse {
#[serde(rename = "userId")]
user_id: u64,
#[serde(rename = "firstName")]
first_name: String,
#[serde(rename = "lastName")]
last_name: String,
}
This works in both directions: serialization uses the renamed key, and
deserialization expects it. If you need different names for each direction,
there's #[serde(rename(serialize = "...", deserialize = "..."))],
but you rarely need it.
#[serde(rename_all = "...")]
If every field follows the same naming convention, renaming them one by one is
tedious. Put rename_all on the struct (or enum) instead and serde
converts every field automatically.
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct UserProfile {
user_id: u64, // serializes as "userId"
first_name: String, // serializes as "firstName"
last_name: String, // serializes as "lastName"
email_verified: bool, // serializes as "emailVerified"
}
The supported conventions are "lowercase", "UPPERCASE", "camelCase", "PascalCase",
"snake_case", "SCREAMING_SNAKE_CASE",
"kebab-case", and "SCREAMING-KEBAB-CASE". For JSON APIs,
camelCase is by far the most common. For TOML and YAML config files,
kebab-case shows up frequently.
#[serde(skip)]
Some fields are internal state that shouldn't appear in the serialized output and
shouldn't be expected on input. skip excludes the field entirely from
both serialization and deserialization. The field's type must implement Default
(or you supply a function via #[serde(default = "...")]) so serde can fill it in during deserialization.
#[derive(Serialize, Deserialize)]
struct Document {
title: String,
body: String,
// Internal bookkeeping, not part of the serialized format
#[serde(skip)]
dirty: bool,
#[serde(skip)]
cache: Option<String>,
}
There are also #[serde(skip_serializing)] and
#[serde(skip_deserializing)] if you only want to skip in one direction.
#[serde(skip_serializing_if = "...")]
This is the one you use to omit None values from JSON output instead
of writing them as null. It takes a path to a function that returns
bool — if the function returns true, the field is
omitted.
#[derive(Serialize, Deserialize)]
struct UpdateRequest {
#[serde(skip_serializing_if = "Option::is_none")]
name: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
email: Option<String>,
// default lets an omitted "tags" key deserialize back to an empty Vec
#[serde(default, skip_serializing_if = "Vec::is_empty")]
tags: Vec<String>,
}
// With name = Some("Alice"), email = None, tags = vec![]
// Serializes to: {"name": "Alice"}
// Instead of: {"name": "Alice", "email": null, "tags": []}
This is especially important for PATCH-style API requests where you only want to
send the fields that actually changed. The predicate function can be anything —
Option::is_none, Vec::is_empty, or your own function that
checks whatever condition you want.
#[serde(default)]
When a field is missing from the input, instead of failing, use
Default::default() to fill it in. This is essential for config files
where you want sensible defaults for anything the user doesn't specify.
#[derive(Serialize, Deserialize)]
struct ServerConfig {
host: String,
#[serde(default = "default_port")]
port: u16,
#[serde(default)]
verbose: bool, // defaults to false
#[serde(default)]
max_connections: u32, // defaults to 0
#[serde(default)]
allowed_origins: Vec<String>, // defaults to empty vec
}
fn default_port() -> u16 {
8080
}
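Plain #[serde(default)] calls Default::default() for that
field's type. #[serde(default = "path")] calls a specific function. You
can also put #[serde(default)] on the struct itself. In that case the struct must
implement Default, and any missing field is taken from the struct's own Default
value rather than from the field type's default.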
#[serde(flatten)]
Flatten inlines a nested struct's fields into the parent. This is useful when you want to compose structs in Rust but the serialized format is a flat object.
#[derive(Serialize, Deserialize)]
struct Pagination {
page: u32,
per_page: u32,
}
#[derive(Serialize, Deserialize)]
struct SearchRequest {
query: String,
#[serde(flatten)]
pagination: Pagination,
}
// Serializes to: {"query": "rust serde", "page": 1, "per_page": 20}
// Not: {"query": "rust serde", "pagination": {"page": 1, "per_page": 20}}
This pattern is great for reusing common field groups. You define pagination, sorting, and filtering as separate structs, then flatten them into whatever request type needs them. In your Rust code you get nice composition. In the serialized output you get a flat structure that matches what the API expects.
Enum representations
Serde supports four different ways to represent enums in serialized data. This matters a lot because the choice affects what your JSON (or TOML, or whatever) looks like, and you usually need to match an existing format rather than pick your favorite. Let's look at all four using the same enum.
// The enum we'll serialize four different ways
#[derive(Serialize, Deserialize)]
enum Message {
Text { body: String },
Image { url: String, width: u32 },
Ping,
}
Externally tagged (the default)
With no attribute, serde uses external tagging. The variant name is a key, and the content is the value.
// Text { body: "hello" } becomes:
// {"Text": {"body": "hello"}}
//
// Ping becomes:
// "Ping"
It works, but it's awkward for most JSON APIs. You can't easily add a common field like a timestamp alongside the tag.
Internally tagged
#[derive(Serialize, Deserialize)]
#[serde(tag = "type")]
enum Message {
Text { body: String },
Image { url: String, width: u32 },
Ping,
}
// Text { body: "hello" } becomes:
// {"type": "Text", "body": "hello"}
//
// Ping becomes:
// {"type": "Ping"}
The tag is a field inside the object. This is what most REST APIs look like. You
can combine it with rename_all to get
{"type": "text", "body": "hello"} instead.
Adjacently tagged
#[derive(Serialize, Deserialize)]
#[serde(tag = "type", content = "data")]
enum Message {
Text { body: String },
Image { url: String, width: u32 },
Ping,
}
// Text { body: "hello" } becomes:
// {"type": "Text", "data": {"body": "hello"}}
//
// Ping becomes:
// {"type": "Ping"}
The tag and the content live side by side in the same object. This shows up in some APIs and event systems where the payload is explicitly separated from the metadata.
Untagged
#[derive(Serialize, Deserialize)]
#[serde(untagged)]
enum Message {
Text { body: String },
Image { url: String, width: u32 },
Ping,
}
// Text { body: "hello" } becomes:
// {"body": "hello"}
//
// No variant name appears anywhere.
Untagged enums have no discriminator. Serde tries each variant in order and uses the first one that successfully deserializes. This is powerful but comes with trade-offs: error messages are terrible (you just get "data did not match any variant"), and variant order matters because ambiguous data will match whichever variant comes first.
The decision: for JSON APIs, internally tagged is the most common and usually the right choice. It produces clean, readable JSON and makes it obvious what type of object you're looking at. Use untagged when you're consuming an API you don't control that doesn't include a type discriminator. Use adjacently tagged when the API wraps payloads in a separate field. The external default is fine for Rust-to-Rust communication where humans won't read the data.
Handling messy real-world data
The patterns above cover the well-designed cases. Many real APIs are not well-designed. Here's how to deal with the common headaches.
String-or-number fields
Some APIs send the same field as a number sometimes and a string other times.
A user ID might be 12345 in one response and "12345" in
another. An untagged enum handles this cleanly.
use serde::{Deserialize, Serialize};
#[derive(Serialize, Deserialize, Debug, Clone)]
#[serde(untagged)]
enum StringOrNumber {
Number(u64),
Text(String),
}
impl StringOrNumber {
fn as_u64(&self) -> Option<u64> {
match self {
StringOrNumber::Number(n) => Some(*n),
StringOrNumber::Text(s) => s.parse().ok(),
}
}
}
#[derive(Serialize, Deserialize, Debug)]
struct ApiUser {
id: StringOrNumber,
name: String,
}
// Both of these work:
// {"id": 12345, "name": "Alice"}
// {"id": "12345", "name": "Alice"}
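A note on variant order: serde tries untagged variants in order and keeps the first
one that succeeds. Here the two variants don't overlap under serde_json (a JSON number
is not accepted as a String, and a JSON string is not accepted as a u64), so either
order gives the same result. Order starts to matter once variants overlap, for example
u64 next to f64, where the narrower type should come first.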
Unknown fields
By default, serde silently ignores fields it doesn't recognize. That's usually fine, but sometimes you want to be strict or capture everything.
use std::collections::HashMap;
// Strict mode: reject any unexpected fields
#[derive(Deserialize)]
#[serde(deny_unknown_fields)]
struct StrictConfig {
host: String,
port: u16,
}
// Permissive mode: capture unknown fields in a map
#[derive(Serialize, Deserialize)]
struct FlexibleConfig {
host: String,
port: u16,
#[serde(flatten)]
extra: HashMap<String, serde_json::Value>,
}
The deny_unknown_fields approach is good for config files where a typo
in a key name should be an error, not silently ignored. The flatten
with HashMap approach is useful when you want to forward unknown fields
to another system or log them for debugging. Serde does not support combining
deny_unknown_fields with flatten, so pick one approach per struct.
Custom date/time deserialization
Dates are a perennial pain point. Some APIs send Unix timestamps, some send ISO 8601
strings, some send epoch milliseconds. Serde's deserialize_with
attribute lets you write a function that handles whatever format you're dealing with.
use serde::{Deserialize, Deserializer};
#[derive(Deserialize, Debug)]
struct Event {
name: String,
#[serde(deserialize_with = "deserialize_timestamp")]
created_at: u64, // always stored as seconds
}
// Accept seconds or milliseconds. Heuristic: values above 1e12 are assumed
// to be milliseconds (as seconds, 1e12 would be tens of thousands of years out).
fn deserialize_timestamp<'de, D>(deserializer: D) -> Result<u64, D::Error>
where
D: Deserializer<'de>,
{
let value = u64::deserialize(deserializer)?;
if value > 1_000_000_000_000 {
// Looks like milliseconds, convert to seconds
Ok(value / 1000)
} else {
Ok(value)
}
}
// Both work:
// {"name": "deploy", "created_at": 1700000000}
// {"name": "deploy", "created_at": 1700000000000}
Nested JSON strings
Sometimes an API returns a field that contains a JSON string inside a JSON string. Yes, this happens. The outer JSON has a string field, and that string is itself valid JSON that you need to parse into a struct.
use serde::{Deserialize, Deserializer};
#[derive(Deserialize, Debug)]
struct Metadata {
version: u32,
region: String,
}
#[derive(Deserialize, Debug)]
struct Webhook {
event: String,
// The API sends this as a JSON string, not a JSON object
#[serde(deserialize_with = "deserialize_nested_json")]
metadata: Metadata,
}
fn deserialize_nested_json<'de, D, T>(deserializer: D) -> Result<T, D::Error>
where
D: Deserializer<'de>,
T: serde::de::DeserializeOwned,
{
let s = String::deserialize(deserializer)?;
serde_json::from_str(&s).map_err(serde::de::Error::custom)
}
// Handles: {"event": "push", "metadata": "{\"version\":2,\"region\":\"us-east\"}"}
The deserialize_nested_json function is generic over T, so
you can reuse it for any field that has this "JSON string inside JSON" problem.
Custom serialization and deserialization
When attributes aren't enough, serde gives you three levels of customization:
the with module pattern, standalone functions, and full manual
implementations.
#[serde(with = "module")]
The with attribute points to a module that provides both
serialize and deserialize functions. This is the
cleanest approach when you need to customize both directions for a field.
mod hex_bytes {
    use serde::{Deserialize, Deserializer, Serializer};

    pub fn serialize<S>(bytes: &Vec<u8>, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        let hex_string: String = bytes.iter().map(|b| format!("{:02x}", b)).collect();
        serializer.serialize_str(&hex_string)
    }

    pub fn deserialize<'de, D>(deserializer: D) -> Result<Vec<u8>, D::Error>
    where
        D: Deserializer<'de>,
    {
        let s = String::deserialize(deserializer)?;
        // Reject input that would make the two-byte slicing below panic.
        if s.len() % 2 != 0 || !s.is_ascii() {
            return Err(serde::de::Error::custom(
                "expected an even-length ASCII hex string",
            ));
        }
        (0..s.len())
            .step_by(2)
            .map(|i| u8::from_str_radix(&s[i..i + 2], 16).map_err(serde::de::Error::custom))
            .collect()
    }
}
#[derive(Serialize, Deserialize)]
struct FileHash {
name: String,
#[serde(with = "hex_bytes")]
checksum: Vec<u8>,
}
// Serializes as: {"name": "data.bin", "checksum": "a1b2c3d4"}
// Instead of: {"name": "data.bin", "checksum": [161, 178, 195, 212]}
#[serde(deserialize_with = "function")]
If you only need to customize deserialization, point to a single function instead
of a whole module. We already saw this with the timestamp example above. Here's
another common case: deserializing a comma-separated string into a Vec.
use serde::{Deserialize, Deserializer};
fn comma_separated<'de, D>(deserializer: D) -> Result<Vec<String>, D::Error>
where
D: Deserializer<'de>,
{
let s = String::deserialize(deserializer)?;
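// Trim each item and drop empties, so "" yields an empty Vec rather than [""]
Ok(s.split(',').map(str::trim).filter(|t| !t.is_empty()).map(String::from).collect())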
}
#[derive(Deserialize, Debug)]
struct CsvRow {
name: String,
#[serde(deserialize_with = "comma_separated")]
tags: Vec<String>,
}
// {"name": "post", "tags": "rust, serde, json"}
// Parses tags as: vec!["rust", "serde", "json"]
Manual Serialize/Deserialize implementation
Sometimes no combination of attributes will do what you need. You can implement the traits by hand. This is verbose but gives you total control. You'll rarely need this, but it's good to know it's an option.
use serde::{Serialize, Serializer, Deserialize, Deserializer};
struct Color {
r: u8,
g: u8,
b: u8,
}
// Serialize as "#rrggbb" string instead of {"r": 0, "g": 0, "b": 0}
impl Serialize for Color {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
let hex = format!("#{:02x}{:02x}{:02x}", self.r, self.g, self.b);
serializer.serialize_str(&hex)
}
}
// Deserialize from "#rrggbb" string
impl<'de> Deserialize<'de> for Color {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
let s = String::deserialize(deserializer)?;
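// Require exactly one leading '#', then six ASCII hex digits.
// The ASCII check keeps the byte slicing below from panicking.
let s = s
.strip_prefix('#')
.ok_or_else(|| serde::de::Error::custom("expected a leading '#'"))?;
if s.len() != 6 || !s.is_ascii() {
return Err(serde::de::Error::custom("expected 6-character hex color"));
}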
let r = u8::from_str_radix(&s[0..2], 16).map_err(serde::de::Error::custom)?;
let g = u8::from_str_radix(&s[2..4], 16).map_err(serde::de::Error::custom)?;
let b = u8::from_str_radix(&s[4..6], 16).map_err(serde::de::Error::custom)?;
Ok(Color { r, g, b })
}
}
Manual implementations are most useful when your serialized form is radically
different from your Rust representation. A color as a hex string, a duration as
a human-readable string like "30s" or "5m", coordinates
as a two-element array instead of a struct — those are the cases where manual
impls earn their verbosity.
Patterns for API clients
When you're building a client for someone else's HTTP API, several of the patterns above come together. Here are the combinations that show up constantly.
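Response wrappers with generics
Most APIs wrap their response data in an envelope. You don't want to manually unwrap
it every time. A generic wrapper struct keeps things clean: the envelope is written once
and the payload type is a parameter. The usage function below assumes reqwest with its
blocking and json features enabled.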
#[derive(Deserialize, Debug)]
struct ApiEnvelope<T> {
success: bool,
data: T,
#[serde(default)]
request_id: Option<String>,
}
#[derive(Deserialize, Debug)]
struct User {
id: u64,
name: String,
}
#[derive(Deserialize, Debug)]
struct PaginatedList<T> {
items: Vec<T>,
total: u64,
page: u32,
per_page: u32,
}
// Usage:
fn get_user(id: u64) -> Result<ApiEnvelope<User>, Box<dyn std::error::Error>> {
let response = reqwest::blocking::get(
&format!("https://api.example.com/users/{}", id)
)?;
let envelope: ApiEnvelope<User> = response.json()?;
Ok(envelope)
}
// Handles: {"success": true, "data": {"id": 1, "name": "Alice"}, "request_id": "abc123"}
Error responses with untagged enums
APIs return different shapes for success and error responses. An untagged enum lets you handle both with a single deserialize call.
#[derive(Deserialize, Debug)]
struct SuccessResponse<T> {
data: T,
}
#[derive(Deserialize, Debug)]
struct ErrorResponse {
error: ErrorDetail,
}
#[derive(Deserialize, Debug)]
struct ErrorDetail {
code: String,
message: String,
}
#[derive(Deserialize, Debug)]
#[serde(untagged)]
enum ApiResult<T> {
Success(SuccessResponse<T>),
Error(ErrorResponse),
}
impl<T> ApiResult<T> {
fn into_result(self) -> Result<T, ErrorDetail> {
match self {
ApiResult::Success(s) => Ok(s.data),
ApiResult::Error(e) => Err(e.error),
}
}
}
// Handles both:
// {"data": {"id": 1, "name": "Alice"}}
// {"error": {"code": "not_found", "message": "User 99 does not exist"}}
Note that Success comes before Error in the enum. Since
serde tries untagged variants in order, put the more specific (or more common)
variant first to avoid false matches.
Putting it all together
Here's what a real API client struct tends to look like when you combine multiple serde patterns.
use std::collections::HashMap;
#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
struct CreateTaskRequest {
title: String,
#[serde(skip_serializing_if = "Option::is_none")]
description: Option<String>,
#[serde(default = "default_priority")]
priority: String,
#[serde(skip_serializing_if = "Vec::is_empty", default)]
assigned_to: Vec<String>,
#[serde(skip_serializing_if = "Option::is_none")]
due_date: Option<String>,
#[serde(flatten)]
extra: HashMap<String, serde_json::Value>,
}
fn default_priority() -> String {
"medium".to_string()
}
#[derive(Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
struct Task {
id: u64,
title: String,
description: Option<String>,
priority: String,
assigned_to: Vec<String>,
created_at: String,
#[serde(default)]
completed: bool,
}
This combines rename_all for the API's camelCase convention,
skip_serializing_if to keep requests minimal,
default to handle missing fields gracefully,
and flatten to forward any extra fields the API might accept.
Most of the field-level patterns we covered earlier find their way into real client code.
The key insight with serde is that you almost never need to write parsing code by hand. Between the derive macros, field attributes, enum representations, and custom deserializers, there's a declarative solution for nearly every data format mismatch you'll encounter. Start with the simplest attribute that works, and only reach for custom implementations when the attributes genuinely can't express what you need.