LRU Cache
The definitive guide to the LRU cache — how to combine a hash map and a doubly linked list to get O(1) get and put, why neither structure alone is enough, and how to walk the full implementation in an interview.
The problem that bites you first
Here is the concrete scenario. You're building a thumbnail service. Generating a thumbnail is expensive — 200ms of CPU per image. You want to cache the last 1000 results so repeated requests are instant. The rules:
get(key)— return the cached thumbnail if it exists, else return -1 (miss).put(key, value)— store a new thumbnail. If the cache is at capacity, evict the item that was accessed least recently to make room.
Both operations must be O(1). You have 10,000 concurrent requests per second. O(log n) is too slow and O(n) is laughable.
This is LeetCode 146 — one of the most common system-design interview openers, and for good reason. It tests whether you understand that the right data structure isn't just one thing.
Why one structure isn't enough
Work through why each solo approach fails:
Hash map only. Lookup is O(1), great. But when you need to evict, you don't know which key was used least recently. You'd have to scan all keys and compare timestamps — O(n).
Array with timestamps. Store (key, value, last_access_time) tuples. On eviction, scan for the minimum timestamp — O(n). Even with a min-heap tracking recency, promoting an item on access is O(log n) because you have to update its position in the heap. Not O(1).
Linked list only. The list maintains recency order perfectly — head is most recent, tail is LRU. Eviction is O(1): chop the tail. Access is O(n): scan for the key. Not O(1).
The breakthrough: the linked list handles ordering and eviction in O(1), but only once you have a pointer to the node. The hash map handles finding that node in O(1). Map stores key → node, not key → value. Then every operation is O(1):
- get: hash map lookup → get node pointer → move node to head → return node's value.
- put: create node → insert at head → store in hash map → if over capacity, remove tail node → delete tail's key from hash map.
flowchart LR
subgraph "hash map"
K1["key: 'thumb_42'"] --> N1
K2["key: 'thumb_07'"] --> N2
K3["key: 'thumb_99'"] --> N3
end
subgraph "doubly linked list (recency order)"
direction LR
H["HEAD\n(dummy)"] <--> N1["thumb_99\nval=..."]
N1 <--> N2["thumb_42\nval=..."]
N2 <--> N3["thumb_07\nval=..."]
N3 <--> T["TAIL\n(dummy)"]
end
style H fill:#22d3ee,color:#0a0a0f
style T fill:#ff4d8d,color:#0a0a0f
style N1 fill:#7c5cff,color:#fff
The hash map arrows all point into the list. The list maintains order. They share the same node objects — nothing is duplicated.
The sentinel trick that eliminates all edge cases
Before writing code, take one design decision seriously: use dummy head and tail sentinel nodes.
Without sentinels, every insert and delete has to handle "what if the list is empty?" and "what if I'm removing the only node?" Those are four extra if-checks scattered through your code, each one a potential bug under pressure. With sentinels:
dummy_head <-> [real nodes...] <-> dummy_tail
Real content always lives between the sentinels. Removing the true head means removing the node right after dummy_head. Removing the true tail means removing the node right before dummy_tail. The sentinels are never themselves removed. Your add-to-head and remove-from-tail functions are now unconditional — no null checks, no special cases.
This is a technique worth burning into muscle memory. Not just for LRU caches — any doubly linked list that needs to handle arbitrary insertions and deletions benefits from it.
The full implementation
class Node:
"""A doubly linked list node that holds both key and value.
We store the key so that when we evict the tail, we know which
key to delete from the hash map."""
def __init__(self, key=0, val=0):
self.key = key
self.val = val
self.prev = None
self.next = None
class LRUCache:
def __init__(self, capacity: int):
self.capacity = capacity
self.map = {} # key → Node
# Sentinel nodes: real content lives between these two.
self.head = Node() # dummy head (most recent side)
self.tail = Node() # dummy tail (eviction side)
self.head.next = self.tail
self.tail.prev = self.head
# ── private helpers ──────────────────────────────────────────────
def _remove(self, node: Node) -> None:
"""Splice a node out of the list — O(1)."""
node.prev.next = node.next
node.next.prev = node.prev
def _insert_at_head(self, node: Node) -> None:
"""Insert a node right after the dummy head — O(1)."""
node.next = self.head.next
node.prev = self.head
self.head.next.prev = node
self.head.next = node
# ── public API ───────────────────────────────────────────────────
def get(self, key: int) -> int:
if key not in self.map:
return -1
node = self.map[key]
self._remove(node) # pull it out of its current position
self._insert_at_head(node) # put it at the front (most recent)
return node.val
def put(self, key: int, value: int) -> None:
if key in self.map:
# Update existing — just change value and promote.
node = self.map[key]
node.val = value
self._remove(node)
self._insert_at_head(node)
else:
if len(self.map) == self.capacity:
# Evict the LRU node — it's right before the dummy tail.
lru = self.tail.prev
self._remove(lru)
del self.map[lru.key] # lru.key is why we store key in Node
node = Node(key, value)
self.map[key] = node
self._insert_at_head(node)
Walk through a concrete sequence to cement this. Capacity = 3.
put(1, "a") → list: [1]
put(2, "b") → list: [2, 1]
put(3, "c") → list: [3, 2, 1]
get(1) → list: [1, 3, 2] (1 promoted to head)
put(4, "d") → list: [4, 1, 3] (2 evicted — LRU was at tail)
get(2) → -1 (2 is gone)
get(3) → list: [3, 4, 1] (3 promoted)
The key moment is put(4, "d"). The cache is full. The tail's predecessor is node 2, which hasn't been touched since it was inserted first. It gets removed from the list, deleted from the map, done. Node 4 goes in at the head.
What the visualizer is showing you
{ "type": "linked-list", "title": "recency queue — head=most recent, tail=evict next", "data": [1, 3, 2] }
Imagine this list after get(1) in the sequence above. Node 1 is at the head because it was just accessed. Node 2 is at the tail and will be the eviction victim on the next put that exceeds capacity. Every get or put rewires two pairs of pointers — that's all. The rest of the list is completely untouched.
This is the O(1) moment. You splice out the accessed node (two pointer writes to close the gap) and splice it in at the head (two pointer writes to open a slot). The hash map is what told you where that node lives without scanning the list. Four pointer writes total, one hash map lookup. Constant time regardless of cache size.
The one-liner Python cheat (and when to use it)
Python's collections.OrderedDict is a dict plus a doubly linked list internally — the exact same design. So you can implement LRU in a handful of lines:
from collections import OrderedDict
class LRUCache:
def __init__(self, capacity: int):
self.capacity = capacity
self.cache = OrderedDict()
def get(self, key: int) -> int:
if key not in self.cache:
return -1
self.cache.move_to_end(key) # promote to most-recent position
return self.cache[key]
def put(self, key: int, value: int) -> None:
if key in self.cache:
self.cache.move_to_end(key)
self.cache[key] = value
if len(self.cache) > self.capacity:
self.cache.popitem(last=False) # remove LRU (the "first" item)
move_to_end(key) is O(1). popitem(last=False) removes the oldest entry in O(1). This is correct and production-worthy.
Know when to use which version. In an interview where the question is explicitly "design an LRU cache" or "implement from scratch," write the explicit Node + doubly linked list version. It proves you understand the mechanics. If you're using an LRU cache as a component inside a larger solution, OrderedDict is the right call — it's faster to write, less likely to have bugs, and clearly communicates intent.
Beyond LRU: LFU, and how real caches think about this
LRU is the right policy when recency predicts future access — which is true for most workloads. But it has a known failure mode: cache pollution from one-time scans. If you iterate through a 10GB dataset once, that scan evicts your hot working set entirely. Your cache goes cold on data that was being genuinely reused. This is called the "scan resistance" problem.
LFU (Least Frequently Used) evicts the item accessed the fewest times total. It resists pollution from one-time scans because newly loaded scan data has a count of one and gets evicted first. The downside: an item that was hot six hours ago but hasn't been touched since will still have a high count and won't get evicted. It takes longer to adapt to changing workloads.
The O(1) LFU implementation is more complex than LRU — it requires a hash map of frequency buckets, each bucket being a linked list of items at that frequency. Worth knowing exists; rarely asked to implement from scratch in interviews.
Redis eviction policies illustrate the spectrum real systems navigate:
| Policy | What it evicts |
|---|---|
noeviction | Returns errors when full — you decide what to do |
allkeys-lru | LRU across all keys (the most common choice) |
volatile-lru | LRU only among keys with a TTL set |
allkeys-lfu | LFU across all keys (Redis 4.0+) |
allkeys-random | Random eviction — fast but usually wrong |
Redis's LRU is approximate — it samples a small number of keys and evicts the least-recently-used among the sample. This trades accuracy for speed and keeps the implementation simple. For most production use cases, the approximation is close enough and the performance is better.
Complexity summary
| Operation | Time | Space |
|---|---|---|
get(key) | O(1) | — |
put(key, value) | O(1) | — |
| Eviction | O(1) | — |
| Total space | — | O(capacity) |
All four pointer-rewire operations (_remove and _insert_at_head each touch 4 pointers) plus one dictionary operation — that's O(1) regardless of how large the cache is.
Common mistakes people make
Storing value in the map instead of node. If your map is key → value instead of key → node, you can't move the node to the front without first searching the list for it. You need the node pointer. Store key → node, and let node.val hold the value.
Forgetting to store key in the node. When evicting the tail, you need to remove that key from the hash map too. If your node only stores val, you have to walk the map to find which key to delete — O(n). Store both key and val in the node. This specific detail is easy to miss in the heat of an interview.
Not handling the update case in put. If you call put(key, new_value) on a key that already exists, two things must happen: update the value, AND promote the node to the head (it was just accessed). Forgetting the promotion is a bug — the old recency order is now wrong.
Using a singly linked list. The O(1) node removal only works with a doubly linked list. A singly linked list can delete the tail in O(1), but deleting an arbitrary interior node requires finding its predecessor — O(n). The moment you need to promote an arbitrary node on get, singly linked falls apart. If you need a refresher on the difference, the linked list article covers singly vs. doubly in detail.
Leaving stale keys in the map after eviction. Evict the tail from the list, fine. But if you don't call del self.map[lru.key], the map keeps growing indefinitely — the evicted values stay in memory and subsequent lookups on evicted keys return wrong results. The sentinel node design makes it easy to get at tail.prev.key before removal.
When NOT to use an LRU cache
Not every caching problem calls for LRU, and it's worth being explicit.
-
Access patterns are uniform random. LRU is only better than random eviction when recency predicts future access. If every key is equally likely to be requested next, LRU's overhead (maintaining order) is pure cost. A simple bounded hash map with random eviction is faster.
-
Your dataset fits in memory entirely. If there's no capacity pressure, there's nothing to evict. Use a plain hash table with no eviction machinery.
-
Write-heavy workloads with few reads. LRU caches are most valuable when reads dominate — you want to serve cached values repeatedly. If you're mostly writing and each key is touched once, you're paying the maintenance cost with no benefit.
-
You need TTL-based expiry, not recency-based eviction. LRU says "evict what was accessed longest ago." If you want "evict what has been here longest regardless of access," that's FIFO, not LRU — and a deque with a hash map is simpler and cheaper.
-
You need both recency AND frequency signals. Some access patterns are better served by LFU or a hybrid policy like TinyLFU (used by Caffeine, the Java caching library, and by Twitter's cache systems). If your hot-set changes slowly but has genuine long-term regulars, LFU resists churn better.
The LRU cache is one of those designs that becomes a lens. Once you understand how combining a hash map with a doubly linked list achieves O(1) on all three axes — lookup, promotion, eviction — you start seeing the pattern elsewhere. Browser back/forward navigation. OS page replacement. CPU L1 caches. Database buffer pools. They are all, at their core, a recency queue backed by fast lookup. Different constraints, same shape.
Frequently asked questions
▸What is an LRU cache and why is it useful?
LRU stands for Least Recently Used. An LRU cache is a fixed-capacity key-value store that evicts the item accessed least recently when it runs out of space. It is useful because most real access patterns have "temporal locality" — things you touched recently are more likely to be needed again than things you touched a long time ago. CPUs use this exact policy for hardware caches, and it is the default eviction policy in Redis (allkeys-lru).
▸How do you implement an LRU cache in O(1) for both get and put?
Combine a hash map with a doubly linked list. The hash map maps each key to the corresponding list node, giving O(1) lookup. The doubly linked list keeps nodes in recency order — most recently used at the head, least recently used at the tail. On get: look up the node in O(1), move it to the head in O(1). On put: insert a new node at the head in O(1), and if over capacity, remove the tail node in O(1). Neither structure alone can do this — the hash map can not maintain order, and the linked list alone requires O(n) to find a node by key.
▸Why not just use an array or a sorted structure for an LRU cache?
An array can maintain order but finding and removing an element by key is O(n). A sorted tree gives O(log n) for all operations but there is no O(1) path. The doubly linked list is special because once you have a pointer to a node, removing or moving it is O(1) — no searching. The hash map is what gets you that pointer instantly. The combination is the only way to achieve O(1) for all three operations: lookup, move-to-front, and evict-from-tail.
▸What is the difference between LRU and LFU caching?
LRU (Least Recently Used) evicts the item that was accessed longest ago, regardless of how many times it has been accessed total. LFU (Least Frequently Used) evicts the item accessed the fewest times overall. LRU is simpler to implement and works well when recent access predicts future access. LFU can outperform LRU for access patterns where some items are consistently popular over long periods, even if they have not been touched recently. Redis supports both policies.
▸How does Python's functools.lru_cache work internally?
Python's functools.lru_cache uses an ordered dictionary (collections.OrderedDict) which itself is a dict plus a doubly linked list — exactly the same design. Each call moves the result to the most-recently-used position in the internal list. When the cache exceeds maxsize, the least-recently-used entry (the tail) is evicted. The implementation in CPython is written in C for speed, but the algorithm is identical to the interview version.
You may also like
Union-Find (Disjoint Set Union)
The near-magic data structure for connectivity queries — are these two nodes in the same component? Nearly O(1) per operation with union by rank and path compression. Kruskal's MST, redundant connections, account merging, and more.
Tries (Prefix Trees)
The data structure powering autocomplete, spell-check, and IP routing — a tree where each edge is a character and every path from the root spells a prefix. Insert, search, and prefix queries all run in O(L) on word length alone, completely independent of how many words you store.
Strings
Strings look like simple text, but they hide a brutal trap: naive concatenation in a loop is O(n²). Learn why, how encoding actually works, and the handful of patterns — sliding window, two pointers, hashing — that solve 80% of string interview problems.