Gerdu: A Multi-Protocol, Raft-Backed Key-Value Store in Go
It’s easy to use Redis every day and never think about what’s underneath it — the eviction policy, the wire protocol, the way a cluster keeps its nodes in agreement. The best cure I know for that kind of fuzziness is to build a small version yourself. So I wrote Gerdu, an in-memory key-value store in Go that speaks several protocols at once and can run as a single node or a Raft-backed cluster. It’s a few thousand lines, and almost all of it hangs off one very small idea.
One interface, several caches
That idea is an interface with three methods. A cache, as far as the rest of Gerdu is concerned, is anything that can do this:
type UnImplementedCache interface {
	Put(key string, value string) (created bool)
	Get(key string) (value string, ok bool)
	Delete(key string) (ok bool)
}
Nothing else in the system knows what kind of cache it’s holding. That let me write three different ones — an LRU, an LFU, and a weak-reference cache that lets the garbage collector reclaim entries under memory pressure — and swap between them with a command-line flag. Every interesting design decision in the project comes back to keeping that seam clean.
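To make the seam concrete, here is a minimal sketch of one implementation sitting behind that interface and being chosen by a flag. The mapCache type, the newCache helper, and the -type flag name are illustrative assumptions, not Gerdu's actual code; a real policy such as the LRU would slot in as another case.

```go
package main

import (
	"flag"
	"fmt"
)

// UnImplementedCache is the three-method interface from the article.
type UnImplementedCache interface {
	Put(key string, value string) (created bool)
	Get(key string) (value string, ok bool)
	Delete(key string) (ok bool)
}

// mapCache is a hypothetical stand-in for a real policy (LRU, LFU, weak):
// an unbounded map with no eviction and no locking.
type mapCache struct{ data map[string]string }

func (m *mapCache) Put(key, value string) (created bool) {
	_, existed := m.data[key]
	m.data[key] = value
	return !existed
}

func (m *mapCache) Get(key string) (string, bool) {
	v, ok := m.data[key]
	return v, ok
}

func (m *mapCache) Delete(key string) bool {
	_, ok := m.data[key]
	delete(m.data, key)
	return ok
}

// newCache picks an implementation by name. Callers only ever see the
// interface, so adding a policy means adding a case here and nothing else.
func newCache(kind string) (UnImplementedCache, error) {
	switch kind {
	case "map":
		return &mapCache{data: make(map[string]string)}, nil
	default:
		return nil, fmt.Errorf("unknown cache type %q", kind)
	}
}

func main() {
	// The flag name is hypothetical; the article only says a flag picks the cache.
	kind := flag.String("type", "map", "cache implementation to use")
	flag.Parse()

	c, err := newCache(*kind)
	if err != nil {
		fmt.Println(err)
		return
	}
	c.Put("greeting", "hello")
	v, ok := c.Get("greeting")
	fmt.Println(v, ok)
}
```

Everything after newCache works with the interface value only, which is the property the rest of the design leans on.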
The LRU underneath
The default cache is the textbook LRU: a hash map for O(1) lookups, paired with a doubly-linked list that tracks recency. A Get finds the node in the map and moves it to the front of the list; the least-recently-used entries sink to the tail, which is exactly where eviction takes from. The one twist I like is that capacity is measured in bytes rather than entries, so a 64 MB cap bounds the data actually stored rather than the number of keys, and eviction keeps popping the tail until the store fits again:
// on write, after inserting/updating:
for c.size > c.capacity {
	tail := c.linklist.PopTail()
	metrics.Deletes.Inc()
	c.size -= entrySize(tail.Key, tail.Value)
	delete(c.node, tail.Key)
}
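For readers who want to run the whole idea end to end, here is a self-contained sketch of a byte-bounded LRU built on the standard library's container/list. It is not Gerdu's implementation: the lru type, the field names, and the entrySize rule (key bytes plus value bytes) are assumptions, and there is no locking.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is what each list element holds.
type entry struct{ key, value string }

// entrySize is an assumed accounting rule: key bytes plus value bytes.
func entrySize(key, value string) int { return len(key) + len(value) }

// lru is a byte-bounded LRU. It is not safe for concurrent use.
type lru struct {
	capacity int                      // maximum total bytes
	size     int                      // current total bytes
	order    *list.List               // front = most recently used
	nodes    map[string]*list.Element // key -> element in order
}

func newLRU(capacityBytes int) *lru {
	return &lru{capacity: capacityBytes, order: list.New(), nodes: make(map[string]*list.Element)}
}

func (c *lru) Get(key string) (string, bool) {
	el, ok := c.nodes[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el) // a read counts as a use
	return el.Value.(*entry).value, true
}

func (c *lru) Put(key, value string) (created bool) {
	if el, ok := c.nodes[key]; ok {
		e := el.Value.(*entry)
		c.size += len(value) - len(e.value)
		e.value = value
		c.order.MoveToFront(el)
	} else {
		c.nodes[key] = c.order.PushFront(&entry{key, value})
		c.size += entrySize(key, value)
		created = true
	}
	// Evict from the tail until the store fits again. An entry larger than
	// the whole capacity ends up evicting itself.
	for c.size > c.capacity && c.order.Len() > 0 {
		e := c.order.Remove(c.order.Back()).(*entry)
		c.size -= entrySize(e.key, e.value)
		delete(c.nodes, e.key)
	}
	return created
}

func (c *lru) Delete(key string) bool {
	el, ok := c.nodes[key]
	if !ok {
		return false
	}
	e := c.order.Remove(el).(*entry)
	c.size -= entrySize(e.key, e.value)
	delete(c.nodes, key)
	return true
}

func main() {
	c := newLRU(10) // room for two 5-byte entries
	c.Put("a", "1234")
	c.Put("b", "1234")
	c.Get("a")         // "a" is now more recent than "b"
	c.Put("c", "1234") // 15 bytes > 10, so the tail ("b") is evicted
	_, okA := c.Get("a")
	_, okB := c.Get("b")
	_, okC := c.Get("c")
	fmt.Println(okA, okB, okC) // true false true
}
```

Reading "a" before the third write is what saves it: recency, not insertion order, decides which entry sits at the tail.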
A background goroutine wakes up on a ticker and sweeps out anything that’s expired, so dead entries don’t linger just because nobody asked for them. None of this is novel — it’s the canonical LRU — but writing it out is what makes “O(1) with a hash map and a linked list” stop being a phrase you repeat in interviews and start being something you’ve actually felt.
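The ticker-driven sweep can be sketched in a few lines. This is a guess at the shape, not Gerdu's code: the ttlStore type, the per-entry expiresAt field, and the startSweeper helper are all assumptions. The sweep itself takes the current time as a parameter so it can be exercised without waiting on a real clock.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type item struct {
	value     string
	expiresAt time.Time // zero value means "never expires"
}

type ttlStore struct {
	mu    sync.Mutex
	items map[string]item
}

func newTTLStore() *ttlStore { return &ttlStore{items: make(map[string]item)} }

func (s *ttlStore) put(key, value string, expiresAt time.Time) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[key] = item{value: value, expiresAt: expiresAt}
}

func (s *ttlStore) count() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.items)
}

// sweep deletes every entry whose deadline has passed and reports how many went.
func (s *ttlStore) sweep(now time.Time) (removed int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for k, it := range s.items {
		if !it.expiresAt.IsZero() && now.After(it.expiresAt) {
			delete(s.items, k)
			removed++
		}
	}
	return removed
}

// startSweeper runs sweep on every tick until the returned stop func is called.
func (s *ttlStore) startSweeper(interval time.Duration) (stop func()) {
	ticker := time.NewTicker(interval)
	done := make(chan struct{})
	var once sync.Once
	go func() {
		defer ticker.Stop()
		for {
			select {
			case now := <-ticker.C:
				s.sweep(now)
			case <-done:
				return
			}
		}
	}()
	return func() { once.Do(func() { close(done) }) }
}

// demo sweeps at a fixed instant so the result does not depend on the wall clock.
func demo() (removed, remaining int) {
	t0 := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	s := newTTLStore()
	s.put("short", "x", t0.Add(time.Second))
	s.put("long", "y", t0.Add(time.Hour))
	s.put("forever", "z", time.Time{})
	removed = s.sweep(t0.Add(time.Minute))
	return removed, s.count()
}

func main() {
	fmt.Println(demo()) // 1 2

	// Live version: the goroutine does the same thing on a timer.
	s := newTTLStore()
	s.put("k", "v", time.Now().Add(20*time.Millisecond))
	stop := s.startSweeper(10 * time.Millisecond)
	defer stop()
	time.Sleep(100 * time.Millisecond)
	fmt.Println("entries left:", s.count())
}
```

The sweep holds the lock for a full pass over the map, which is fine for a sketch but is the first thing to revisit if the store gets large.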
Four front doors to the same room
Here’s where the interface starts paying off. Gerdu serves HTTP, Redis, memcached, and gRPC — all at the same time, each on its own port. The reason I bothered is simple: whatever client you already have should just work. Point redis-cli at it, point a memcached library at it, curl it, or call it over gRPC.
Each protocol is its own little package, and each one is essentially a translator. It parses its own wire format — RESP for Redis, the line-based text protocol for memcached, protobuf for gRPC — and then bottoms out in the same three calls. A Redis SET, a memcached set, an HTTP PUT, and a gRPC Put all end up at cache.Put(key, value). The cache has no idea which doorway a request came through, and it doesn’t need to.
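As a rough illustration of "each one is a translator," the sketch below puts two doors in front of one cache: an HTTP handler and a function that answers Redis inline commands with RESP replies. None of this is Gerdu's code. The /cache/ URL layout, the status codes, and the names httpDoor, redisDoor, and mapCache are assumptions, and real RESP parsing (arrays of bulk strings) is left out.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// UnImplementedCache is the three-method interface from the article.
type UnImplementedCache interface {
	Put(key string, value string) (created bool)
	Get(key string) (value string, ok bool)
	Delete(key string) (ok bool)
}

// mapCache is a hypothetical backend: a plain map, no eviction, no locking.
type mapCache map[string]string

func (m mapCache) Put(k, v string) bool        { _, had := m[k]; m[k] = v; return !had }
func (m mapCache) Get(k string) (string, bool) { v, ok := m[k]; return v, ok }
func (m mapCache) Delete(k string) bool        { _, had := m[k]; delete(m, k); return had }

// httpDoor maps PUT/GET/DELETE on /cache/<key> to the three cache calls.
// The URL layout and status codes are assumptions.
func httpDoor(c UnImplementedCache) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		key := strings.TrimPrefix(r.URL.Path, "/cache/")
		if key == "" || key == r.URL.Path {
			http.Error(w, "missing key", http.StatusBadRequest)
			return
		}
		switch r.Method {
		case http.MethodPut:
			body, err := io.ReadAll(r.Body)
			if err != nil {
				http.Error(w, err.Error(), http.StatusBadRequest)
				return
			}
			c.Put(key, string(body))
			w.WriteHeader(http.StatusNoContent)
		case http.MethodGet:
			if v, ok := c.Get(key); ok {
				io.WriteString(w, v)
			} else {
				http.NotFound(w, r)
			}
		case http.MethodDelete:
			c.Delete(key)
			w.WriteHeader(http.StatusNoContent)
		default:
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		}
	})
}

// redisDoor answers one space-separated inline command with a RESP reply.
func redisDoor(c UnImplementedCache, line string) string {
	f := strings.Fields(line)
	switch {
	case len(f) == 3 && strings.EqualFold(f[0], "SET"):
		c.Put(f[1], f[2])
		return "+OK\r\n"
	case len(f) == 2 && strings.EqualFold(f[0], "GET"):
		if v, ok := c.Get(f[1]); ok {
			return fmt.Sprintf("$%d\r\n%s\r\n", len(v), v)
		}
		return "$-1\r\n" // RESP null bulk string
	case len(f) == 2 && strings.EqualFold(f[0], "DEL"):
		if c.Delete(f[1]) {
			return ":1\r\n"
		}
		return ":0\r\n"
	}
	return "-ERR unknown command\r\n"
}

// viaHTTP drives httpDoor in-process, without opening a socket.
func viaHTTP(h http.Handler, method, path, body string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, strings.NewReader(body)))
	return rec.Code, rec.Body.String()
}

func main() {
	c := mapCache{}
	h := httpDoor(c)
	viaHTTP(h, http.MethodPut, "/cache/lang", "go") // written through the HTTP door
	fmt.Printf("%q\n", redisDoor(c, "GET lang"))    // "$2\r\ngo\r\n"
}
```

A value written through one door is readable through the other because neither door holds any state of its own.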
Distribution by composition
The part I’m proudest of is how clustering slots in. Gerdu uses HashiCorp’s Raft for replication, and the trick is that the cache is the replicated state machine. Rather than bolt consensus onto the side, I wrote a RaftProxy that implements the very same UnImplementedCache interface — but instead of touching memory directly, a write turns into a little command and gets proposed to the Raft log:
func (c *RaftProxy) applyCommand(cmd *command) (raft.ApplyFuture, error) {
	// Only the leader may propose entries to the Raft log.
	if c.raft.State() != raft.Leader {
		return nil, fmt.Errorf("not a leader but a %v", c.raft.State())
	}
	b, err := json.Marshal(cmd) // {op, key, value, ...}
	if err != nil {
		return nil, err
	}
	return c.raft.Apply(b, raftTimeout), nil
}
The mutation doesn’t happen here. It happens later, when that log entry is committed and Raft hands it back to the state machine’s Apply on every node, which decodes the command and finally calls into the underlying cache. Snapshots are just the map marshalled to JSON; restore reads it back. Because the cache satisfies Raft’s FSM contract, the store’s state and the replicated state are the same thing.
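The other half of that round trip, decoding a committed entry and snapshotting the map, can be sketched with the standard library alone. This is a stand-in, not Gerdu's code: it leaves hashicorp/raft out entirely, and the command fields, the op names, and the replay and cloneViaSnapshot helpers are assumptions. In the real library the three methods would be Apply(*raft.Log), Snapshot(), and Restore(io.ReadCloser) on a raft.FSM.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// command is an assumed shape for a replicated write; the field names are illustrative.
type command struct {
	Op    string `json:"op"`
	Key   string `json:"key"`
	Value string `json:"value,omitempty"`
}

// fsm owns the data. It only changes when a committed entry is applied.
type fsm struct{ data map[string]string }

func newFSM() *fsm { return &fsm{data: make(map[string]string)} }

// apply decodes one committed log entry and only then mutates the map.
func (f *fsm) apply(logData []byte) error {
	var cmd command
	if err := json.Unmarshal(logData, &cmd); err != nil {
		return err
	}
	switch cmd.Op {
	case "put":
		f.data[cmd.Key] = cmd.Value
	case "delete":
		delete(f.data, cmd.Key)
	default:
		return fmt.Errorf("unknown op %q", cmd.Op)
	}
	return nil
}

// snapshot writes the whole map as JSON; restore replaces the map from it.
func (f *fsm) snapshot(w io.Writer) error { return json.NewEncoder(w).Encode(f.data) }

func (f *fsm) restore(r io.Reader) error {
	data := make(map[string]string)
	if err := json.NewDecoder(r).Decode(&data); err != nil {
		return err
	}
	f.data = data
	return nil
}

// replay stands in for Raft: it serializes each command, as a leader would
// before proposing it, and feeds the bytes to a fresh state machine.
func replay(cmds []command) (*fsm, error) {
	f := newFSM()
	for _, cmd := range cmds {
		b, err := json.Marshal(cmd)
		if err != nil {
			return nil, err
		}
		if err := f.apply(b); err != nil {
			return nil, err
		}
	}
	return f, nil
}

// cloneViaSnapshot brings up a second state machine from a snapshot of src.
func cloneViaSnapshot(src *fsm) (*fsm, error) {
	var buf bytes.Buffer
	if err := src.snapshot(&buf); err != nil {
		return nil, err
	}
	dst := newFSM()
	return dst, dst.restore(&buf)
}

func main() {
	node, err := replay([]command{
		{Op: "put", Key: "a", Value: "1"},
		{Op: "put", Key: "b", Value: "2"},
		{Op: "delete", Key: "a"},
	})
	if err != nil {
		panic(err)
	}
	fresh, err := cloneViaSnapshot(node)
	if err != nil {
		panic(err)
	}
	fmt.Println(fresh.data) // map[b:2]
}
```

Because every node applies the same bytes in the same order, each one ends up with the same map, and a node that joins late can start from a snapshot instead of replaying the whole log.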
And here’s the payoff for keeping that seam clean: the protocol servers don’t change at all. On a single node they talk to an LRU cache. In a cluster they talk to a RaftProxy that wraps an LRU cache. Same interface, two behaviors. Writes have to go to the leader; followers reject them with a polite “not a leader.” The usual Raft arithmetic applies: the cluster keeps working as long as a majority of nodes is up, so three nodes survive one failure and five survive two. The log lives in memory by default, or in BoltDB on disk if you point it at storage.
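The failure arithmetic is short enough to write down. The sketch below uses the standard majority rule for Raft clusters; the function names are mine.

```go
package main

import "fmt"

// quorum returns the majority size for an n-node cluster.
func quorum(n int) int { return n/2 + 1 }

// tolerated returns how many nodes can fail while a majority remains.
func tolerated(n int) int { return n - quorum(n) }

func main() {
	for _, n := range []int{1, 2, 3, 4, 5} {
		fmt.Printf("nodes=%d quorum=%d tolerated failures=%d\n", n, quorum(n), tolerated(n))
	}
}
```

It also shows why even cluster sizes buy nothing: four nodes tolerate one failure, the same as three.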
Knowing what it’s doing
Because a cache is exactly the kind of thing whose behavior you want to watch, Gerdu increments Prometheus counters — hits, misses, adds, deletes — right inside the cache operations and exposes them at /metrics. TLS is optional for the HTTP, gRPC, and Redis listeners, and on Ctrl+C a node tries to leave its Raft cluster cleanly rather than just vanishing and making its peers wait out a timeout.
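Here is a sketch of what "counters right inside the cache operations" can look like. It deliberately swaps the Prometheus client library for standard-library atomics and writes the Prometheus text exposition format by hand, so it runs without dependencies. The countedCache type, the metric names, and the choice to count every Put as an add are all assumptions, not Gerdu's code.

```go
package main

import (
	"fmt"
	"strings"
	"sync/atomic"
)

// countedCache is a hypothetical map-backed cache that counts as it works.
// Only the counters are safe for concurrent use; the map itself is not.
type countedCache struct {
	data                        map[string]string
	hits, misses, adds, deletes atomic.Int64
}

func newCountedCache() *countedCache { return &countedCache{data: make(map[string]string)} }

func (c *countedCache) Put(key, value string) (created bool) {
	_, existed := c.data[key]
	c.data[key] = value
	c.adds.Add(1) // assumption: every Put counts as an add
	return !existed
}

func (c *countedCache) Get(key string) (string, bool) {
	v, ok := c.data[key]
	if ok {
		c.hits.Add(1)
	} else {
		c.misses.Add(1)
	}
	return v, ok
}

func (c *countedCache) Delete(key string) bool {
	_, ok := c.data[key]
	if ok {
		delete(c.data, key)
		c.deletes.Add(1)
	}
	return ok
}

// metricsText renders the counters in the Prometheus text exposition format.
// The metric names are made up for this sketch.
func (c *countedCache) metricsText() string {
	var sb strings.Builder
	for _, m := range []struct {
		name  string
		value int64
	}{
		{"cache_hits_total", c.hits.Load()},
		{"cache_misses_total", c.misses.Load()},
		{"cache_adds_total", c.adds.Load()},
		{"cache_deletes_total", c.deletes.Load()},
	} {
		fmt.Fprintf(&sb, "# TYPE %s counter\n%s %d\n", m.name, m.name, m.value)
	}
	return sb.String()
}

func main() {
	c := newCountedCache()
	c.Put("a", "1")
	c.Put("b", "2")
	c.Get("a")
	c.Get("a")
	c.Get("z")    // miss
	c.Delete("b") // counted
	c.Delete("z") // not present, not counted
	fmt.Print(c.metricsText())
}
```

Serving the result is then a matter of writing metricsText() from a handler mounted at /metrics; with the real client library that handler and the counter types come ready-made.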
What I took away
The lesson Gerdu drove home is how much leverage a single, small interface can give you. Three methods were enough to let the same abstraction be a local LRU, a Raft state machine, and the shared backend for four different wire protocols — without any of those pieces needing to know about the others. Most of the “distributed” difficulty I’d braced for turned out to be a question of where to put that seam, not how much machinery to pile on top of it.
The code is on GitHub at github.com/arazmj/gerdu if you’d like to look around.