Binary Search Trees
The left<node<right invariant gives you O(log n) search, insert, and delete — until the tree degenerates into a linked list. Here's the full picture, including why inorder traversal is a free sort.
The invariant: why it gives you binary search for free
You've used binary search on a sorted array — every comparison eliminates half the remaining candidates, so you find any value in O(log n) comparisons. A BST does the exact same thing, except the "sorted array" is replaced by a pointer-linked tree.
At every node, you ask one question: is my target less than this value, greater, or equal? Less → go left. Greater → go right. Equal → found it. Because the BST property guarantees that the left subtree contains only smaller values and the right subtree only larger ones, you never have to second-guess yourself. One comparison, one branch, half the tree eliminated.
For a balanced BST with n nodes, the height is roughly log₂(n). A tree with 1 million nodes is only about 20 levels deep. That's 20 comparisons maximum to find any value — versus up to 1 million for a linear scan.
flowchart TD
A["50"] --> B["30"]
A --> C["70"]
B --> D["20"]
B --> E["40"]
C --> F["60"]
C --> G["80"]
style A fill:#7c5cff,color:#0a0a0f
style B fill:#22d3ee,color:#0a0a0f
style C fill:#22d3ee,color:#0a0a0f
style D fill:#00ff9d,color:#0a0a0f
style E fill:#00ff9d,color:#0a0a0f
style F fill:#00ff9d,color:#0a0a0f
style G fill:#00ff9d,color:#0a0a0f
Searching for 40: start at 50 (go left), reach 30 (go right), reach 40 (found). Three comparisons for a 7-node tree. The height is 3, and log₂(7) ≈ 2.8. The math checks out.
How the tree is actually built: insert
Insert is just search with a twist. Walk the tree exactly as you would for a search. When you fall off a leaf — when the child pointer you'd follow is null — that's where the new node goes.
class Node:
def __init__(self, val):
self.val = val
self.left = None
self.right = None
def insert(root, val):
if root is None:
return Node(val) # base case: empty slot, plant it here
if val < root.val:
root.left = insert(root.left, val)
elif val > root.val:
root.right = insert(root.right, val)
# val == root.val: duplicate, ignore (or handle per your needs)
return root
Each recursive call descends one level. The new node always lands as a leaf. Because you followed the BST property on the way down, the tree property is preserved everywhere — no extra bookkeeping needed.
Search is nearly identical: same left/right branching, but instead of planting a node at null, you return False.
def search(root, target):
if root is None:
return False # fell off the tree, not here
if target == root.val:
return True
if target < root.val:
return search(root.left, target)
return search(root.right, target)
Both operations are O(h) where h is the height of the tree. When the tree is balanced, h = O(log n). When it isn't — well, we'll get there.
The visualizer: watch the invariant work
{ "type": "bst", "title": "insert 35 — watch where it lands", "data": [50, 30, 70, 20, 40, 60, 80] }
Step through inserting 35 and trace the path: 35 < 50 (go left), 35 > 30 (go right), 35 < 40 (go left), null — it lands as 40's left child. Every comparison cut the search space in half. And critically, 35 ended up in exactly the right place: its parent 40 is greater, its grandparent 30 is smaller, so the BST property holds for every ancestor. That's what the recursive structure buys you — correctness is maintained automatically as you go.
Delete: the one operation that requires actual thought
Delete has three cases, and the first two are easy:
Case 1 — leaf node. Just remove it. Nothing else is affected.
Case 2 — one child. Splice the node out and connect its parent directly to its one child. The subtree structure is preserved; only the deleted node disappears.
Case 3 — two children. This is the interesting one. You can't just remove the node without breaking the BST property. The trick: find the node's inorder successor — the smallest value in its right subtree, which is always the leftmost node of that right subtree. Copy that successor's value into the node you're deleting, then delete the successor instead. The successor has at most one child (a right child — it can't have a left child, since it's the minimum of its subtree), so it falls into Case 1 or 2.
def delete(root, val):
if root is None:
return None
if val < root.val:
root.left = delete(root.left, val)
elif val > root.val:
root.right = delete(root.right, val)
else:
# Found it — handle the three cases
if root.left is None: # Case 1 or 2: no left child
return root.right
if root.right is None: # Case 2: no right child
return root.left
# Case 3: two children — find inorder successor
successor = find_min(root.right)
root.val = successor.val # copy successor value down
root.right = delete(root.right, successor.val) # delete it up there
return root
def find_min(node):
while node.left:
node = node.left
return node
The successor swap is elegant: you never actually move pointers around in complex ways — you just swap a value and do a simpler deletion. O(h) total.
Inorder traversal is a free sort
Here's a property that's worth burning into memory: inorder traversal of a BST visits nodes in ascending sorted order. Always.
Inorder means: visit left subtree, then current node, then right subtree. Recursively, every node does this, and by the BST property, everything in the left subtree is smaller and everything in the right is larger. So the order you visit nodes is exactly ascending order.
def inorder(root, result=None):
if result is None:
result = []
if root is None:
return result
inorder(root.left, result) # everything smaller first
result.append(root.val) # then this node
inorder(root.right, result) # then everything larger
return result
# Insert [50, 30, 70, 20, 40, 60, 80], call inorder → [20, 30, 40, 50, 60, 70, 80]
This is how you validate a BST in an interview (inorder should produce a strictly increasing sequence), and it's also how you pull sorted output from one cheaply.
Successor and predecessor
The sorted-order property leads to two O(log n) operations you can't get from a hash table at any price:
- Inorder successor of N — the next-larger value. If N has a right subtree, it's the minimum of that subtree (
find_min(N.right)). If not, it's the nearest ancestor for which N is in the left subtree. - Inorder predecessor of N — the next-smaller value. Mirror image: maximum of the left subtree, or nearest ancestor for which N is in the right subtree.
These show up in range queries, finding the "closest value to X," and database index scans — problems where a hash table would force a full scan.
The fatal flaw: degenerate trees
Here is the thing nobody warns you about until you ship a bug into production: a BST built from sorted input is a linked list.
Insert 10, 20, 30, 40, 50 in that order. Each value is larger than all previous ones, so each lands as the right child of the previous node. The tree looks like this:
flowchart TD
N10["10"] --> N20["20"]
N20 --> N30["30"]
N30 --> N40["40"]
N40 --> N50["50"]
style N10 fill:#ff4d8d,color:#0a0a0f
style N50 fill:#ff4d8d,color:#0a0a0f
Height is n, not log n. Search for 50 takes 5 comparisons in this 5-node tree — O(n). Insert and delete are the same. You've built an O(n) linked list with extra steps.
This is not a theoretical concern. It's the classic gotcha if you're building a BST from pre-sorted data — timestamps from a log file, IDs from an autoincrementing database column, file names in lexicographic order. Real systems hit this.
The deeper issue: the plain BST has no self-correcting mechanism. Once the tree degenerates, it stays degenerate. The only fix is to never let it happen.
Self-balancing trees: the actual fix
The solution is to maintain a height invariant at insert and delete time, using rotations to restructure the tree when it gets lopsided. Two major variants:
- AVL trees — strictly balance-enforced: the heights of the left and right subtrees of any node differ by at most 1. Guaranteed O(log n) worst case; slightly more rotation overhead on insert/delete.
- Red-black trees — a relaxed coloring invariant that permits up to 2× height difference between subtrees, but guarantees O(log n) worst case with fewer rotations. Java's
TreeMap, C++'sstd::map, and Python'ssortedcontainers.SortedList(via skip lists, a cousin) all use this class of structure.
The full rotation mechanics live in the balanced BST deep dive. The point to internalize here: a plain BST is a teaching tool and occasionally the right choice for small, random-ish data. For production, you almost always want the balanced variant.
BST vs. hash table: pick the right one
This is one of the most common interview design questions. The short answer:
| You need… | Winner | Why |
|---|---|---|
| Fast membership test / exact lookup | Hash table | O(1) average vs O(log n) |
| Sorted iteration over keys | BST | Hash tables have no ordering |
| Range query: "all keys 10–50" | BST | Walk a contiguous inorder range |
| Min / max in O(log n) | BST | Leftmost / rightmost node |
| Successor / predecessor | BST | Hash table requires full scan |
| Unknown key distribution, worst-case guarantees | Balanced BST | No hash collision pathology |
A hash table will beat a BST on pure lookup speed — O(1) amortized vs O(log n) is a real gap, and the constant factor favors hash tables too (no pointer chasing through tree levels, better cache behavior for small tables). But the moment you need ordering, the hash table is helpless. That's the dividing line.
If your keys are strings and you need prefix queries ("all keys starting with 'app'"), look at a trie instead. If you need a sorted sequence with fast rank queries ("what's the kth smallest?"), an order-statistics tree (an augmented BST) is the tool.
A fuller worked example: validate a BST
This is one of the most common BST interview problems (Leetcode 98). The naive approach — check at each node that left.val < root.val < right.val — fails on cases like this:
5
/ \
1 4
/ \
3 6
Node 4's children (3 and 6) satisfy the local check, but 3 is in the right subtree of 5, so it should be > 5. The local check misses it.
The correct approach threads a valid range [lo, hi] down the tree:
def is_valid_bst(root, lo=float('-inf'), hi=float('inf')):
if root is None:
return True # empty tree is valid
if not (lo < root.val < hi):
return False # violates the inherited range
return (is_valid_bst(root.left, lo, root.val) and # left must be < root
is_valid_bst(root.right, root.val, hi)) # right must be > root
Every recursive call narrows the valid range. When you go left, the new upper bound is the current node's value. When you go right, the new lower bound is. O(n) time — every node visited once — O(h) space on the call stack.
The inorder approach works too: collect nodes in inorder and check if the sequence is strictly increasing. Equivalent correctness, slightly more code.
Anti-patterns and real gotchas
Inserting already-sorted data — covered above. Profile or shuffle your data before bulk-loading a BST if you can't use a balanced variant.
Forgetting duplicates — the classic BST definition says left < node < right, so duplicates have no home. Decide up front: ignore duplicates, allow them in the left subtree (left ≤ node), or keep a count at each node. Inconsistency here causes subtle bugs in delete.
Treating a BST like a balanced BST — just because the average case is O(log n) doesn't mean yours will be. If you're not using a self-balancing implementation, you're trusting your input to be random-enough. That's an assumption that tends to fail at exactly the moment you can least afford it.
Recursive implementations on deep trees — a degenerate BST with 100,000 nodes is 100,000 levels deep. That blows the call stack. Iterative implementations using an explicit stack or a while loop avoid this. In interviews it rarely matters; in production it absolutely can.
Implementing your own instead of using the stdlib — Python's sortedcontainers.SortedList and SortedDict, Java's TreeMap/TreeSet, C++'s std::map/std::set are all balanced and battle-tested. Reach for them before rolling your own BST unless you're in an interview or need a specific augmentation.
What's next
A plain BST is the concept; balanced BSTs (AVL, red-black) are what you actually deploy. Once you're comfortable with the BST invariant and the core operations here, the balancing machinery in the next article will make sense as a natural extension — rotations are just local pointer swaps that restore height balance without breaking the search property.
If you want to understand why BSTs are designed the way they are, revisit binary trees for the general tree vocabulary, and binary search for the array-based version of the same divide-and-conquer idea. The BST is precisely what you get when you embed binary search into a dynamically modifiable structure.
Frequently asked questions
▸What is the BST property, and why does it matter?
Every node N in a binary search tree satisfies: all values in its left subtree are strictly less than N, and all values in its right subtree are strictly greater. This invariant lets you eliminate half the remaining candidates at every step during search, giving O(log n) average-case performance — the same logic as binary search on a sorted array, but on a dynamic structure that supports cheap insert and delete.
▸Why can a binary search tree degrade to O(n)?
If you insert values in sorted or nearly-sorted order, each new node becomes the right child of the previous one. The tree is a straight line — a linked list wearing a tree costume — and search has to walk every node. This is the fatal flaw of a plain BST. Self-balancing trees (AVL, red-black) prevent it by rotating nodes after insert and delete to keep the height close to log n.
▸How does deleting a node with two children work in a BST?
You can't just remove a two-child node without breaking the BST property. The standard solution: find the node's inorder successor (the smallest value in its right subtree), copy its value into the node you wanted to delete, then delete the successor instead. The successor has at most one child (a right child), so that second deletion is straightforward.
▸What does inorder traversal have to do with sorting?
Inorder traversal — left subtree, then root, then right subtree — visits nodes in ascending order by the BST property. That means if you insert n values into a BST and run inorder traversal, you get them back sorted. This is actually the skeleton of tree sort, though it costs O(n log n) on average for the inserts and O(n) for the traversal.
▸When should I use a BST instead of a hash table?
Use a BST when you need ordering — range queries ("give me everything between 10 and 50"), finding the minimum or maximum, predecessor/successor lookups, or iterating keys in sorted order. A hash table gives O(1) average for exact lookups but knows nothing about order. If all you need is fast membership testing and no ordering, the hash table wins. If order matters, BST (or better, a balanced variant) is the right call.
You may also like
Union-Find (Disjoint Set Union)
The near-magic data structure for connectivity queries — are these two nodes in the same component? Nearly O(1) per operation with union by rank and path compression. Kruskal's MST, redundant connections, account merging, and more.
Tries (Prefix Trees)
The data structure powering autocomplete, spell-check, and IP routing — a tree where each edge is a character and every path from the root spells a prefix. Insert, search, and prefix queries all run in O(L) on word length alone, completely independent of how many words you store.
Strings
Strings look like simple text, but they hide a brutal trap: naive concatenation in a loop is O(n²). Learn why, how encoding actually works, and the handful of patterns — sliding window, two pointers, hashing — that solve 80% of string interview problems.