PathForge: Writing a BGP-4 Daemon from Scratch in Rust
BGP is the protocol that holds the internet together — the thing that decides, autonomous system by autonomous system, how a packet leaving my laptop ever finds its way to a server on the other side of the planet. I’ve read about it for years, but reading about a protocol and implementing one are very different kinds of understanding. So I sat down with RFC 4271 and wrote a BGP-4 speaker from scratch in Rust. It’s called PathForge, and this is what I learned building it.
It ended up being about 5,400 lines of Rust with 109 unit tests and a handful of fuzz harnesses. But the line count isn’t the point — the point is that almost every paragraph of the RFC turned into a type, a state, or a test, and the parts I would have skimmed while reading became decisions I had to actually make.
It starts with bytes on the wire
Every BGP message begins with the same 19-byte header: a 16-byte marker that’s all ones, a 2-byte length, and a 1-byte type. The marker is a holdover from older authentication schemes, but it’s still mandatory, so the very first thing the parser does is insist on it:
pub const MARKER: [u8; 16] = [0xFF; 16];
pub const HEADER_LEN: usize = 19;
pub const MAX_MESSAGE_LEN: usize = 4096;
// Note: copy_to_slice and get_u16 panic on a short buffer, so
// buf.remaining() >= HEADER_LEN must be checked before this point.
let mut marker = [0u8; 16];
buf.copy_to_slice(&mut marker);
if marker != MARKER {
    return Err(HeaderError::InvalidMarker);
}
let length = buf.get_u16(); // read as big-endian (network byte order)
if (length as usize) < HEADER_LEN || (length as usize) > MAX_MESSAGE_LEN {
    return Err(HeaderError::InvalidLength(length));
}
Here’s the thing that shaped the whole codebase: every one of those bytes arrives from a peer I don’t control. A malformed OPEN or a deliberately broken path attribute should never crash the daemon; it should produce a NOTIFICATION and a clean teardown. That’s exactly the kind of property that’s hard to prove by hand, so the message parsers were the first thing I pointed cargo-fuzz at. Four fuzz targets (the framing, OPEN, UPDATE, and the path-attribute parser) chew on mutated, coverage-guided inputs looking for a panic. Rust already rules out whole categories of memory-safety bugs, but an out-of-bounds index or a failed unwrap still panics, and fuzzing the parser is what lets me actually trust it with hostile input.
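To make the "never panic on hostile bytes" property concrete, here is a minimal sketch of a header parser that cannot index out of bounds. It is an illustration, not PathForge’s actual code: it works on a plain byte slice instead of the bytes crate, and the names parse_header, Header, and HeaderError::Truncated are assumptions introduced for this example.

```rust
/// Fixed values from RFC 4271, section 4.1.
pub const MARKER: [u8; 16] = [0xFF; 16];
pub const HEADER_LEN: usize = 19;
pub const MAX_MESSAGE_LEN: usize = 4096;

/// Hypothetical error type for this sketch; the real one may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum HeaderError {
    /// Fewer than HEADER_LEN bytes were available (assumed variant).
    Truncated,
    InvalidMarker,
    InvalidLength(u16),
}

/// Hypothetical parsed-header type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct Header {
    pub length: u16,
    pub msg_type: u8,
}

/// Parse the 19-byte header, returning an error on any malformed input.
pub fn parse_header(buf: &[u8]) -> Result<Header, HeaderError> {
    // `get` returns None on short input instead of panicking.
    let head = buf.get(..HEADER_LEN).ok_or(HeaderError::Truncated)?;
    // From here on, `head` is exactly 19 bytes, so fixed indices are safe.
    if !head.starts_with(&MARKER) {
        return Err(HeaderError::InvalidMarker);
    }
    // The length field is big-endian and counts the header itself.
    let length = u16::from_be_bytes([head[16], head[17]]);
    if (length as usize) < HEADER_LEN || (length as usize) > MAX_MESSAGE_LEN {
        return Err(HeaderError::InvalidLength(length));
    }
    Ok(Header { length, msg_type: head[18] })
}
```

A function shaped like this is also a natural fuzz target: it takes arbitrary bytes and every outcome is either an Ok or an Err, so any panic the fuzzer finds is a real bug.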
A session is a state machine
A BGP session is defined in the RFC as a finite state machine, and the closest thing to a faithful translation is to write it as one. There are six states — Idle, Connect, Active, OpenSent, OpenConfirm, and Established — and a fixed set of events that move you between them. Modeling both as plain Rust enums meant the transition logic became an exhaustive match, and the compiler started catching cases I’d forgotten:
pub enum BgpState {
    Idle, Connect, Active, OpenSent, OpenConfirm, Established,
}
pub enum BgpEvent {
    ManualStart, TcpConnectionConfirmed, TcpConnectionFail,
    BgpOpen, KeepAliveMsg, UpdateMsg, NotifMsg, HoldTimerExpired, /* ... */
}
The happy path reads the way the spec describes it: from Idle you connect, send your OPEN with its capabilities, wait for the peer’s OPEN in OpenSent, trade KEEPALIVEs through OpenConfirm, and land in Established, the only state where UPDATE messages actually mean anything. Hold-timer expiry can yank you out of OpenSent, OpenConfirm, or Established alike, which is its own small lesson in why timers belong in the state machine and not bolted on beside it.
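The transition logic can be sketched as a pure function over the two enums. This is a deliberately simplified illustration and not PathForge’s implementation: it covers only the eight events named above, models state changes without the accompanying actions (sending messages, starting timers), assumes DelayOpen is not in use, and ignores connection-collision handling. The function name next_state is an assumption. Every inner match lists its events explicitly, so adding a new event variant makes the compiler point at each state that has not handled it.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BgpState {
    Idle, Connect, Active, OpenSent, OpenConfirm, Established,
}

/// Subset of the RFC 4271 events, enough for the happy path and teardown.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BgpEvent {
    ManualStart, TcpConnectionConfirmed, TcpConnectionFail,
    BgpOpen, KeepAliveMsg, UpdateMsg, NotifMsg, HoldTimerExpired,
}

/// Simplified transition function: next state only, no side effects.
pub fn next_state(state: BgpState, event: BgpEvent) -> BgpState {
    use BgpEvent::*;
    use BgpState::*;
    match state {
        Idle => match event {
            ManualStart => Connect,
            // Everything else is ignored while idle.
            TcpConnectionConfirmed | TcpConnectionFail | BgpOpen | KeepAliveMsg
            | UpdateMsg | NotifMsg | HoldTimerExpired => Idle,
        },
        Connect | Active => match event {
            ManualStart => state,
            // TCP is up: the OPEN is sent as part of this transition.
            TcpConnectionConfirmed => OpenSent,
            TcpConnectionFail | BgpOpen | KeepAliveMsg | UpdateMsg | NotifMsg
            | HoldTimerExpired => Idle,
        },
        OpenSent => match event {
            // A second TCP connection would go to collision handling (omitted).
            ManualStart | TcpConnectionConfirmed => OpenSent,
            BgpOpen => OpenConfirm,
            TcpConnectionFail => Active,
            KeepAliveMsg | UpdateMsg | NotifMsg | HoldTimerExpired => Idle,
        },
        OpenConfirm => match event {
            ManualStart | TcpConnectionConfirmed => OpenConfirm,
            KeepAliveMsg => Established,
            // A second OPEN would trigger collision detection (omitted here).
            TcpConnectionFail | BgpOpen | UpdateMsg | NotifMsg
            | HoldTimerExpired => Idle,
        },
        Established => match event {
            ManualStart | TcpConnectionConfirmed => Established,
            KeepAliveMsg | UpdateMsg => Established,
            TcpConnectionFail | BgpOpen | NotifMsg | HoldTimerExpired => Idle,
        },
    }
}
```

Keeping the function pure makes the happy path a four-line unit test: feed it ManualStart, TcpConnectionConfirmed, BgpOpen, and KeepAliveMsg in order and check that the result is Established.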
One task per peer
On the concurrency side, the shape is about as simple as Tokio lets it be: the listener accepts a TCP connection and spawns one async task per peer, and that task owns the session loop for its neighbor — driving the FSM, exchanging KEEPALIVEs, and resetting the hold timer every time it hears from the other side. A routing daemon is mostly a crowd of long-lived, mostly-idle sessions, each babysitting a couple of timers, and that’s precisely the workload async runtimes are good at. I never had to think about a thread pool.
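The timer bookkeeping each peer task does can be sketched independently of the async runtime. The example below is an assumption-laden illustration rather than PathForge’s code: HoldTimer and negotiate_hold_time are hypothetical names, and time is passed in explicitly as an Instant so the logic can be tested without sleeping. What it takes from RFC 4271 is that the session uses the smaller of the two advertised hold times, that zero disables the timers, and that one third of the hold time is the suggested KEEPALIVE interval.

```rust
use std::time::{Duration, Instant};

/// Negotiated hold time: the smaller of the two values from the OPENs.
/// Returns None when the result is zero, which disables hold and keepalive
/// timers. Values of 1 or 2 seconds must be rejected during OPEN
/// validation, which this sketch does not cover.
pub fn negotiate_hold_time(local_secs: u16, peer_secs: u16) -> Option<Duration> {
    match local_secs.min(peer_secs) {
        0 => None,
        secs => Some(Duration::from_secs(u64::from(secs))),
    }
}

/// Hypothetical per-peer hold timer, driven by caller-supplied timestamps.
pub struct HoldTimer {
    hold_time: Duration,
    deadline: Instant,
}

impl HoldTimer {
    pub fn new(hold_time: Duration, now: Instant) -> Self {
        Self { hold_time, deadline: now + hold_time }
    }

    /// Call whenever a KEEPALIVE or UPDATE arrives from the peer.
    pub fn reset(&mut self, now: Instant) {
        self.deadline = now + self.hold_time;
    }

    /// True once the peer has been silent for the full hold time.
    pub fn expired(&self, now: Instant) -> bool {
        now >= self.deadline
    }

    /// Suggested KEEPALIVE interval: one third of the hold time.
    pub fn keepalive_interval(&self) -> Duration {
        self.hold_time / 3
    }
}
```

In an async session loop, the deadline would typically feed a sleep-until future raced against the socket read, with expiry surfacing as a HoldTimerExpired event to the state machine.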
The decision process, as a sort
My favorite part is the BGP decision process. When several neighbors all advertise a route to the same prefix, BGP doesn’t get to be vague about which one wins: RFC 4271 spells out an exact tie-breaking ladder of highest LOCAL_PREF, then shortest AS_PATH, then lowest ORIGIN, then lowest MED, and so on. The lovely thing is that in Rust this whole ladder is just a sort comparator, with each rule chained onto the previous one by Ordering::then (or then_with, when the comparison is worth evaluating lazily):
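candidates.sort_by(|a, b| {
    // An absent LOCAL_PREF is treated as 100.
    let a_lp = a.attrs.local_pref.unwrap_or(100);
    let b_lp = b.attrs.local_pref.unwrap_or(100);
    // 1. Highest LOCAL_PREF wins (note: b vs a, because higher is better)
    b_lp.cmp(&a_lp)
        // 2. Shortest AS_PATH
        .then(a.attrs.as_path_len().cmp(&b.attrs.as_path_len()))
        // 3. Lowest ORIGIN (IGP < EGP < Incomplete)
        .then_with(|| origin(a).cmp(&origin(b)))
        // 4. Lowest MED (compared unconditionally here; RFC 4271 only
        //    compares MED between routes from the same neighboring AS)
        .then(med(a).cmp(&med(b)))
        // 5. Oldest route as the final tiebreaker (the RFC's remaining
        //    steps, such as lowest BGP Identifier, are not shown here)
        .then(a.received_at.cmp(&b.received_at))
});
// first() yields None for an empty list, where candidates[0] would panic.
let best = candidates.first().cloned();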
Reading the rules straight down the comparator, in order, is about as close as code gets to reading the RFC itself. The defaults matter more than they look, too — an absent LOCAL_PREF becomes 100, a missing ORIGIN sorts as “Incomplete,” an absent MED counts as zero — and getting any of those wrong quietly elects the wrong route. Underneath, each peer keeps its own Adj-RIB-In, the winners land in the Loc-RIB, and a longest-prefix-match lookup answers “where does this address actually go.”
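The longest-prefix-match step can be illustrated with a small IPv4-only sketch. This is not how PathForge stores its Loc-RIB (the post does not say, and a real daemon would more likely use a trie); it is a linear scan that shows the rule itself: among all prefixes that contain the address, the one with the longest mask wins. Route, matches, and lookup are names invented for this example.

```rust
use std::net::Ipv4Addr;

/// Hypothetical Loc-RIB entry for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Route {
    pub prefix: Ipv4Addr,
    pub prefix_len: u8, // expected range: 0..=32
    pub next_hop: Ipv4Addr,
}

/// Network mask for a prefix length, e.g. 24 -> 0xFFFF_FF00.
fn mask(len: u8) -> u32 {
    let len = u32::from(len.min(32));
    // Shifting a u32 by 32 would overflow, so /0 is handled separately.
    if len == 0 { 0 } else { u32::MAX << (32 - len) }
}

/// True when `addr` falls inside the route's prefix.
pub fn matches(route: &Route, addr: Ipv4Addr) -> bool {
    let m = mask(route.prefix_len);
    (u32::from(addr) & m) == (u32::from(route.prefix) & m)
}

/// Linear-scan longest-prefix match: the most specific covering route wins.
pub fn lookup(rib: &[Route], addr: Ipv4Addr) -> Option<&Route> {
    rib.iter()
        .filter(|r| matches(r, addr))
        .max_by_key(|r| r.prefix_len)
}
```

With 10.0.0.0/8 and 10.1.0.0/16 both installed, a lookup for 10.1.2.3 returns the /16, while 10.9.9.9 falls back to the /8. A 0.0.0.0/0 entry acts as the default route because its mask matches every address.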
Everything around the core
RFC 4271 is really just the trunk; the interesting routing world hangs off the branches. PathForge grew to negotiate capabilities (multiprotocol IPv4/IPv6, 4-byte AS numbers, route refresh, graceful restart), reflect routes for iBGP scaling, dampen flapping prefixes, validate route origins against RPKI, and filter with prefix and community lists. I kept a strict discipline of one module per concern, each pinned to the RFC it implements: rr.rs for route reflection (RFC 4456), dampening.rs for RFC 2439, mp.rs for multiprotocol NLRI (RFC 4760), and so on. That mapping is the only reason five thousand-odd lines stayed navigable instead of turning into a swamp.
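As one example of what such a module boils down to, here is a rough sketch of the penalty model behind route flap dampening in the spirit of RFC 2439: each flap adds a fixed penalty, the penalty decays exponentially with a configured half-life, a route is suppressed once the penalty exceeds one threshold, and it is reused once the penalty decays below another. The type and field names are assumptions for illustration and are not taken from PathForge’s dampening.rs; the numeric values a caller supplies are configuration, not part of the sketch.

```rust
/// Illustrative parameters; names and structure are assumed for this sketch.
pub struct DampeningConfig {
    pub half_life_secs: f64,
    pub penalty_per_flap: f64,
    pub suppress_limit: f64,
    pub reuse_limit: f64,
}

/// Per-prefix dampening state.
pub struct FlapState {
    pub penalty: f64,
    pub suppressed: bool,
}

impl FlapState {
    pub fn new() -> Self {
        Self { penalty: 0.0, suppressed: false }
    }

    /// Record one flap: add the fixed penalty and suppress the route
    /// if the total now exceeds the suppress limit.
    pub fn flap(&mut self, cfg: &DampeningConfig) {
        self.penalty += cfg.penalty_per_flap;
        if self.penalty > cfg.suppress_limit {
            self.suppressed = true;
        }
    }

    /// Apply exponential decay for `elapsed_secs`: the penalty halves
    /// every `half_life_secs`. A suppressed route becomes usable again
    /// once the penalty drops below the reuse limit.
    pub fn decay(&mut self, elapsed_secs: f64, cfg: &DampeningConfig) {
        self.penalty *= 0.5f64.powf(elapsed_secs / cfg.half_life_secs);
        if self.suppressed && self.penalty < cfg.reuse_limit {
            self.suppressed = false;
        }
    }
}
```

With a 900-second half-life, a per-flap penalty of 1000, a suppress limit of 2000, and a reuse limit of 750, three quick flaps push the penalty to 3000 and suppress the route; after 900 quiet seconds it has decayed to about 1500 and stays suppressed, and after a further 1800 seconds it is near 375 and the route is reusable.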
Being able to see inside it
A routing daemon you can’t observe is a frightening thing to run, so a real chunk of the work went into operability rather than protocol. There are Prometheus counters for sessions, messages, routes, and errors; tracing spans via #[instrument] so a single peer’s story can be followed through the logs; and a Unix-socket management CLI that dumps the live RIB without touching the data path. To check I hadn’t merely implemented my own private dialect of BGP, there’s a Docker Compose harness that peers PathForge against FRRouting and watches a real handshake and route exchange happen end to end.
What I took away
Implementing a protocol from its specification is the most thorough way I know to actually read one. The RFC’s careful English becomes types and state transitions, its “SHOULD”s become judgment calls you can’t defer, and the corners you’d gloss over reading become the tests that fail at 11pm. Rust turned out to be a wonderful language for it — a spec written in terms of states, messages, and attributes maps almost directly onto enums and exhaustive matches, and the borrow checker quietly absorbs a class of bugs that would otherwise haunt a network daemon.
The code is on GitHub at github.com/arazmj/pathforge if you’d like to read the rest.