Segment Trees
Segment trees answer range queries AND handle point or range updates in O(log n) — the structure to reach for when prefix sums aren't enough and you need both reads and writes at scale.
Why prefix sums stop working
Take five minutes with prefix sums before reading further — they are the right tool for static arrays, and understanding why they break is the whole motivation for this article.
The prefix sum of arr is pre[i] = arr[0] + arr[1] + ... + arr[i]. A range sum query [l, r] becomes pre[r] - pre[l-1] — one subtraction, O(1). Perfect.
Now someone calls arr[42] = 99. Every pre[i] for i ≥ 42 is wrong. You have to recompute the whole suffix: O(n). Do this after every update and your "fast" solution is O(n) per query. On a 10⁵-element array with 10⁵ mixed updates and queries, you're looking at 10¹⁰ operations. That's a problem.
The root issue: a prefix sum bakes all the array's current values into a single aggregated structure. One change invalidates a cascade. A segment tree avoids this by storing answers at many granularities — when one leaf changes, only its ancestors need updating, and there are only O(log n) of those.
The mental model: divide and precompute
Split the array in half. Store the answer for the whole array, the answer for the left half, the answer for the right half. Recurse. Keep splitting until every segment is a single element.
That's a segment tree. Each node stores the precomputed answer (sum, min, whatever) for its contiguous subarray. The root covers the whole array. The leaves are individual elements. Any range you can possibly query decomposes into a union of segments already in the tree.
The depth of the tree is ⌈log₂ n⌉. A query touches at most 4 nodes per level — so 4 log n total, which is O(log n). An update changes one leaf and walks straight up the tree recomputing, touching O(log n) ancestors.
For n=8 you get a perfect binary tree of depth 3. For arbitrary n the tree is still complete — we pad conceptually to the next power of 2. The leaves land in the lower half of the index space.
The array representation
Just like a heap, a segment tree lives in a flat array — no node objects, no pointers. The root sits at index 1. The children of node i are at 2i and 2i+1. The parent of node i is i // 2.
Index 0 is unused (makes the math cleaner). For an array of n elements, allocate 4n integers. The leaves — the actual array elements — land in the lower half of the tree.
For arr = [1, 3, 5, 7]:
tree index: 1 2 3 4 5 6 7
segment: [0,3] [0,1] [2,3] [0,0][1,1][2,2][3,3]
value: 16 4 12 1 3 5 7
Children of node 2 are at indices 4 and 5. Children of node 3 are at 6 and 7. Leaves are at the bottom. Building the tree is a post-order traversal: fill in the leaves, then combine pairs bottom-up.
Building the tree
class SegmentTree:
def __init__(self, arr):
self.n = len(arr)
self.tree = [0] * (4 * self.n)
if self.n > 0:
self._build(arr, 1, 0, self.n - 1)
def _build(self, arr, node, start, end):
if start == end:
# leaf: store the array value
self.tree[node] = arr[start]
else:
mid = (start + end) // 2
self._build(arr, 2 * node, start, mid)
self._build(arr, 2 * node + 1, mid + 1, end)
# internal node: combine children
self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
This is O(n) — every element is visited exactly once. The 4n allocation is not waste; the implicit tree structure needs that room. For n = 5, the next power of 2 is 8, and the full tree needs 2 × 8 = 16 slots. 4 × 5 = 20 safely covers this.
The combine step here is addition for a range-sum tree. Swap it for min() and you get a range-min tree. Swap for max(), GCD, XOR — same skeleton, different function. This composability is the main reason to prefer a segment tree over a Fenwick tree for non-invertible operations.
Range query
To query [l, r], you recurse from the root. At each node, three cases:
- Current segment is completely outside
[l, r]— return the identity (0 for sum, ∞ for min). - Current segment is completely inside
[l, r]— returntree[node]directly. - Partial overlap — recurse into both children and combine.
def query(self, l, r):
return self._query(1, 0, self.n - 1, l, r)
def _query(self, node, start, end, l, r):
if r < start or end < l:
return 0 # completely outside: identity for sum
if l <= start and end <= r:
return self.tree[node] # completely inside: use precomputed answer
mid = (start + end) // 2
left_sum = self._query(2 * node, start, mid, l, r)
right_sum = self._query(2 * node + 1, mid + 1, end, l, r)
return left_sum + right_sum
The key insight is case 2. When the current node's segment fits entirely inside your query range, you take the precomputed answer and stop descending. The recursion terminates quickly — the number of partial-overlap nodes is bounded by O(log n) per level of the tree, giving O(log n) total.
Let's trace query(2, 5) on [1, 3, 5, 7, 9, 11]:
flowchart TD
R["[0,5] PARTIAL → recurse"]
L["[0,2] PARTIAL → recurse"]
Ri["[3,5] INSIDE ✓ → 27"]
LL["[0,1] OUTSIDE → 0"]
LR["[2,2] INSIDE ✓ → 5"]
R --> L
R --> Ri
L --> LL
L --> LR
style Ri fill:#00ff9d,color:#0a0a0f
style LR fill:#00ff9d,color:#0a0a0f
style LL fill:#ff4d8d,color:#0a0a0f
style R fill:#7c5cff,color:#fff
style L fill:#22d3ee,color:#0a0a0f
Result: 5 + 27 = 32. Only 5 nodes visited instead of iterating through all 4 elements in the range. For large arrays the difference is dramatic.
Point update
Update arr[i] = val: walk from root to the leaf at index i, update the leaf, recompute every ancestor on the way back up.
def update(self, i, val):
self._update(1, 0, self.n - 1, i, val)
def _update(self, node, start, end, i, val):
if start == end:
# reached the leaf
self.tree[node] = val
else:
mid = (start + end) // 2
if i <= mid:
self._update(2 * node, start, mid, i, val)
else:
self._update(2 * node + 1, mid + 1, end, i, val)
# recompute this node from updated children
self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
The recursion descends exactly one path — left or right at each level — and recomputes on the way back. That's O(log n) nodes, same as the height of the tree. Nothing else in the tree changes.
Lazy propagation: range updates in O(log n)
Point updates are clean. But what if you need to add 5 to every element in [3, 97]? Without lazy propagation you'd call update 95 times — O(n log n) total. Terrible.
Lazy propagation fixes this with a simple idea: don't do the work until you have to.
You maintain a parallel lazy array the same size as tree. When you "update" a range, instead of descending all the way to the leaves, you:
- Find the O(log n) nodes that exactly cover your range.
- Update their stored values and mark them with a pending tag in
lazy. - Stop. Don't touch their children yet.
When you later query (or update) and need to descend through a tagged node, you push the tag down to the children first, then continue. This ensures that by the time you actually read a node's value, all pending updates from ancestors have been applied.
class LazySegmentTree:
def __init__(self, arr):
self.n = len(arr)
self.tree = [0] * (4 * self.n)
self.lazy = [0] * (4 * self.n) # pending additive updates
self._build(arr, 1, 0, self.n - 1) # same _build as before
def _push_down(self, node, start, end):
"""Flush pending tag to children before descending."""
if self.lazy[node] != 0:
mid = (start + end) // 2
left, right = 2 * node, 2 * node + 1
# each child's sum grows by (pending_add × its segment length)
self.tree[left] += self.lazy[node] * (mid - start + 1)
self.tree[right] += self.lazy[node] * (end - mid)
self.lazy[left] += self.lazy[node]
self.lazy[right] += self.lazy[node]
self.lazy[node] = 0
def range_add(self, l, r, val):
self._range_add(1, 0, self.n - 1, l, r, val)
def _range_add(self, node, start, end, l, r, val):
if r < start or end < l:
return
if l <= start and end <= r:
# completely inside: update sum and tag, stop here
self.tree[node] += val * (end - start + 1)
self.lazy[node] += val
return
self._push_down(node, start, end)
mid = (start + end) // 2
self._range_add(2 * node, start, mid, l, r, val)
self._range_add(2 * node + 1, mid + 1, end, l, r, val)
self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
def query(self, l, r):
return self._query(1, 0, self.n - 1, l, r)
def _query(self, node, start, end, l, r):
if r < start or end < l:
return 0
if l <= start and end <= r:
return self.tree[node]
self._push_down(node, start, end) # flush before descending
mid = (start + end) // 2
return (self._query(2 * node, start, mid, l, r) +
self._query(2 * node + 1, mid + 1, end, l, r))
The _push_down call appears at the top of every recursive step before the descent. That's the invariant: a node's value is correct for the values already pushed into it; children might have stale values, but we fix them before we look at them.
Lazy propagation keeps both range updates and range queries at O(log n). It's more code, but the pattern is mechanical once you internalize "tag now, push when needed."
Segment tree vs. your other options
When you see a range-query problem, there are three tools on the table. Here is how to pick:
| Situation | Best tool |
|---|---|
| Static array, only queries | Prefix sums — O(1) query, O(n) build, done |
| Point updates + range sum queries | Fenwick tree — simpler code, same O(log n) |
| Point updates + range min/max | Segment tree — Fenwick cannot do non-invertible ops |
| Range updates + range queries | Segment tree with lazy propagation |
| Offline queries, sorted order matters | Segment tree or merge sort tree |
The Fenwick tree (binary indexed tree) deserves explicit comparison. It is roughly 10–15 lines of code for the same O(log n) point-update + range-sum functionality. If that's all you need, use it — the constant factor is better and the code is harder to mess up under pressure. The segment tree earns its complexity when:
- You need range min or max (Fenwick can't do this natively)
- You need range updates (requires lazy propagation, no Fenwick equivalent)
- You need to store arbitrary information per node (merging sorted lists, counting inversions, etc.)
The relationship with binary trees is also worth naming: a segment tree IS a binary tree, just one where each node's value is derived from its children and the key concept is "what does this subtree cover?" The structure you already know from BSTs and heaps maps directly — same index arithmetic, same recursive decomposition.
Three worked problems
Problem 1: Range sum query (static warmup)
Given arr = [2, 4, 5, 7, 8, 9], answer: sum of [1, 4]?
st = SegmentTree([2, 4, 5, 7, 8, 9])
print(st.query(1, 4)) # 4 + 5 + 7 + 8 = 24
Query decomposes like this: root [0,5] is partial → recurse; left [0,2] is partial → leaf [1,1]=4 inside, [2,2]=5 inside; right [3,5] is partial → [3,4]=15 inside, [5,5]=9 outside. Total: 4 + 5 + 15 = 24. Seven nodes visited. At n=10⁶ it's log₂(10⁶) ≈ 20 nodes versus a million — that's the whole point.
Problem 2: Range minimum query with updates
This is where Fenwick trees stop. Min is not invertible, so there's no Fenwick trick — you need a segment tree.
The code is the sum tree from above with exactly two changes:
# 1. combine: swap + for min
self.tree[node] = min(self.tree[2 * node], self.tree[2 * node + 1])
# 2. out-of-range identity: swap 0 for float('inf')
if r < start or end < l:
return float('inf')
That's it. The recursion, the index arithmetic, the push/pop structure — identical. Understanding the template means "range min with updates" is literally a two-line edit away from "range sum with updates." Same for max (identity float('-inf')), GCD (identity 0), XOR (identity 0). The segment tree is a framework, not a one-off structure.
Problem 3: Count of elements in a range (hard)
A more advanced pattern: "how many elements in arr[l..r] are greater than k?" Each node now stores the sorted subarray for its segment — a merge sort tree. A query decomposes the range into O(log n) nodes and binary-searches each. Query time is O(log² n). This is a competitive-programming staple; you'll see it extend further into persistent segment trees and wavelet trees.
The TELL for this pattern: when the question is richer than a scalar aggregate, store richer data per node. The recursive decomposition is the same segment tree skeleton — the payload can be anything associative.
Pitfalls people actually hit
Allocating only 2n instead of 4n. If n is not a power of 2, 2n is not enough. For n=5, the implicit tree needs 16 slots — 2 × 5 = 10 writes out of bounds silently. Use 4n. Always.
Wrong identity element. The "outside range" return must be the identity for your operation: 0 for sum, float('inf') for min, float('-inf') for max. Wrong identity contaminates results in exactly the inputs you won't catch in quick testing.
Forgetting _push_down in the query function. Lazy propagation needs _push_down before every descent — in both range_add AND query. It's easy to add it only in the update path and then wonder why queries return stale values.
Reaching for a segment tree when the array never changes. No updates means prefix sums — O(1) query, O(n) build, a tenth of the code. Using a segment tree for a static array is a flag that you didn't read the constraints.
When NOT to use a segment tree
| You need | Better tool |
|---|---|
| Range queries, no updates | Prefix sums — O(1) query |
| Range sum + point update, no min/max | Fenwick tree — half the code |
| Top-K, scheduling, stream minimum | Heap — O(log n) push/pop |
| k-th smallest in a range (order stats) | Order statistics tree, wavelet tree |
| 2D range queries | 2D segment tree or merge sort tree |
The segment tree is a heavy tool. In interviews, reaching for it immediately on a "range query" problem is often wrong — the interviewer usually expects you to try prefix sums first and only escalate when updates arrive. In competitive programming it's a staple; in system design it almost never shows up because range queries in production get answered by databases with B-tree indexes, not in-memory segment trees.
When you see a problem with "given n operations, each either a range query or a point/range update" — that's the TELL. Segment tree. Build once, handle everything in O(log n).
Frequently asked questions
▸What is a segment tree used for?
A segment tree answers range queries — "what is the sum/min/max of elements from index l to r?" — and handles updates in O(log n) time. The killer use case is when you have both queries AND updates on the same array. Pure prefix sums answer range-sum queries in O(1) but break completely the moment you update an element. A segment tree trades that O(1) query for O(log n) but handles updates just as fast, making it the right structure whenever the array is not static.
▸What is the difference between a segment tree and a Fenwick tree?
Both support range queries and point updates in O(log n). A Fenwick tree (binary indexed tree) is simpler to code — about 10 lines — and has a smaller constant factor, but it only natively handles operations that are invertible (like sum, not min/max). A segment tree is more code but is more general: it handles min, max, GCD, any associative operation, and with lazy propagation it also handles range updates efficiently. If you only need range sums with point updates, reach for a Fenwick tree. If you need range min/max, range updates, or a non-invertible operation, you need a segment tree.
▸What is lazy propagation in a segment tree?
Lazy propagation is the technique that makes range updates efficient. Without it, updating every element in a range [l, r] means O(n) point updates — useless. With lazy propagation, you tag internal nodes with a 'pending update' that you push down only when you actually need to descend into a subtree. This lets you update an entire range in O(log n) by touching at most O(log n) nodes, deferring the work on the rest until they're actually queried.
▸How large should the segment tree array be?
The safe rule is to allocate 4n nodes for an array of n elements. If n is a power of 2, 2n is enough. In general, 4n covers all cases because the tree has at most 2 × (next power of 2 ≥ n) nodes, which is always less than 4n. In competitive programming the constant 4 is burned into muscle memory — just use it.
▸When should I use a segment tree over sorting or binary search?
Segment trees shine when the problem has a mix of queries and mutations interleaved in time — you cannot afford to re-sort or recompute prefix sums after each update. If the array is static and you only need range sums, prefix sums are faster and simpler (O(1) query, O(n) build). If updates are rare, prefix sums plus a full rebuild might be acceptable. The segment tree is the right choice when updates and queries are comparably frequent and both need to be fast.
You may also like
Union-Find (Disjoint Set Union)
The near-magic data structure for connectivity queries — are these two nodes in the same component? Nearly O(1) per operation with union by rank and path compression. Kruskal's MST, redundant connections, account merging, and more.
Tries (Prefix Trees)
The data structure powering autocomplete, spell-check, and IP routing — a tree where each edge is a character and every path from the root spells a prefix. Insert, search, and prefix queries all run in O(L) on word length alone, completely independent of how many words you store.
Strings
Strings look like simple text, but they hide a brutal trap: naive concatenation in a loop is O(n²). Learn why, how encoding actually works, and the handful of patterns — sliding window, two pointers, hashing — that solve 80% of string interview problems.