~/data structures/binary-search-tree
◆◆Intermediateasked at Amazonasked at Googleasked at Metaasked at Microsoft

Binary Search Trees

The left<node<right invariant gives you O(log n) search, insert, and delete — until the tree degenerates into a linked list. Here's the full picture, including why inorder traversal is a free sort.

11 min read2026-01-15Ironclad Academy
Complexity cheat sheet
Operation
Average
Worst
Searchworst case on a degenerate (sorted-input) tree
O(log n)
O(n)
Insertsame caveat — must find correct leaf first
O(log n)
O(n)
Deletesuccessor swap adds a constant, not a factor
O(log n)
O(n)
Min / Maxwalk leftmost / rightmost spine
O(log n)
O(n)
Inorder traversalvisits every node exactly once
O(n)
O(n)
Successor / Predecessorright subtree min or ancestor walk
O(log n)
O(n)
// DEPTH
the full breakdown — requirements, capacity, evolution, trade-offs

The invariant: why it gives you binary search for free

You've used binary search on a sorted array — every comparison eliminates half the remaining candidates, so you find any value in O(log n) comparisons. A BST does the exact same thing, except the "sorted array" is replaced by a pointer-linked tree.

At every node, you ask one question: is my target less than this value, greater, or equal? Less → go left. Greater → go right. Equal → found it. Because the BST property guarantees that the left subtree contains only smaller values and the right subtree only larger ones, you never have to second-guess yourself. One comparison, one branch, half the tree eliminated.

For a balanced BST with n nodes, the height is roughly log₂(n). A tree with 1 million nodes is only about 20 levels deep. That's 20 comparisons maximum to find any value — versus up to 1 million for a linear scan.

flowchart TD
    A["50"] --> B["30"]
    A --> C["70"]
    B --> D["20"]
    B --> E["40"]
    C --> F["60"]
    C --> G["80"]
    style A fill:#7c5cff,color:#0a0a0f
    style B fill:#22d3ee,color:#0a0a0f
    style C fill:#22d3ee,color:#0a0a0f
    style D fill:#00ff9d,color:#0a0a0f
    style E fill:#00ff9d,color:#0a0a0f
    style F fill:#00ff9d,color:#0a0a0f
    style G fill:#00ff9d,color:#0a0a0f

Searching for 40: start at 50 (go left), reach 30 (go right), reach 40 (found). Three comparisons for a 7-node tree. The height is 3, and log₂(7) ≈ 2.8. The math checks out.

How the tree is actually built: insert

Insert is just search with a twist. Walk the tree exactly as you would for a search. When you fall off a leaf — when the child pointer you'd follow is null — that's where the new node goes.

class Node:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def insert(root, val):
    if root is None:
        return Node(val)          # base case: empty slot, plant it here
    if val < root.val:
        root.left = insert(root.left, val)
    elif val > root.val:
        root.right = insert(root.right, val)
    # val == root.val: duplicate, ignore (or handle per your needs)
    return root

Each recursive call descends one level. The new node always lands as a leaf. Because you followed the BST property on the way down, the tree property is preserved everywhere — no extra bookkeeping needed.

Search is nearly identical: same left/right branching, but instead of planting a node at null, you return False.

def search(root, target):
    if root is None:
        return False              # fell off the tree, not here
    if target == root.val:
        return True
    if target < root.val:
        return search(root.left, target)
    return search(root.right, target)

Both operations are O(h) where h is the height of the tree. When the tree is balanced, h = O(log n). When it isn't — well, we'll get there.

The visualizer: watch the invariant work

{ "type": "bst", "title": "insert 35 — watch where it lands", "data": [50, 30, 70, 20, 40, 60, 80] }

Step through inserting 35 and trace the path: 35 < 50 (go left), 35 > 30 (go right), 35 < 40 (go left), null — it lands as 40's left child. Every comparison cut the search space in half. And critically, 35 ended up in exactly the right place: its parent 40 is greater, its grandparent 30 is smaller, so the BST property holds for every ancestor. That's what the recursive structure buys you — correctness is maintained automatically as you go.

Delete: the one operation that requires actual thought

Delete has three cases, and the first two are easy:

Case 1 — leaf node. Just remove it. Nothing else is affected.

Case 2 — one child. Splice the node out and connect its parent directly to its one child. The subtree structure is preserved; only the deleted node disappears.

Case 3 — two children. This is the interesting one. You can't just remove the node without breaking the BST property. The trick: find the node's inorder successor — the smallest value in its right subtree, which is always the leftmost node of that right subtree. Copy that successor's value into the node you're deleting, then delete the successor instead. The successor has at most one child (a right child — it can't have a left child, since it's the minimum of its subtree), so it falls into Case 1 or 2.

def delete(root, val):
    if root is None:
        return None
    if val < root.val:
        root.left = delete(root.left, val)
    elif val > root.val:
        root.right = delete(root.right, val)
    else:
        # Found it — handle the three cases
        if root.left is None:           # Case 1 or 2: no left child
            return root.right
        if root.right is None:          # Case 2: no right child
            return root.left
        # Case 3: two children — find inorder successor
        successor = find_min(root.right)
        root.val = successor.val        # copy successor value down
        root.right = delete(root.right, successor.val)  # delete it up there
    return root

def find_min(node):
    while node.left:
        node = node.left
    return node

The successor swap is elegant: you never actually move pointers around in complex ways — you just swap a value and do a simpler deletion. O(h) total.

Inorder traversal is a free sort

Here's a property that's worth burning into memory: inorder traversal of a BST visits nodes in ascending sorted order. Always.

Inorder means: visit left subtree, then current node, then right subtree. Recursively, every node does this, and by the BST property, everything in the left subtree is smaller and everything in the right is larger. So the order you visit nodes is exactly ascending order.

def inorder(root, result=None):
    if result is None:
        result = []
    if root is None:
        return result
    inorder(root.left, result)   # everything smaller first
    result.append(root.val)      # then this node
    inorder(root.right, result)  # then everything larger
    return result

# Insert [50, 30, 70, 20, 40, 60, 80], call inorder → [20, 30, 40, 50, 60, 70, 80]

This is how you validate a BST in an interview (inorder should produce a strictly increasing sequence), and it's also how you pull sorted output from one cheaply.

Successor and predecessor

The sorted-order property leads to two O(log n) operations you can't get from a hash table at any price:

  • Inorder successor of N — the next-larger value. If N has a right subtree, it's the minimum of that subtree (find_min(N.right)). If not, it's the nearest ancestor for which N is in the left subtree.
  • Inorder predecessor of N — the next-smaller value. Mirror image: maximum of the left subtree, or nearest ancestor for which N is in the right subtree.

These show up in range queries, finding the "closest value to X," and database index scans — problems where a hash table would force a full scan.

The fatal flaw: degenerate trees

Here is the thing nobody warns you about until you ship a bug into production: a BST built from sorted input is a linked list.

Insert 10, 20, 30, 40, 50 in that order. Each value is larger than all previous ones, so each lands as the right child of the previous node. The tree looks like this:

flowchart TD
    N10["10"] --> N20["20"]
    N20 --> N30["30"]
    N30 --> N40["40"]
    N40 --> N50["50"]
    style N10 fill:#ff4d8d,color:#0a0a0f
    style N50 fill:#ff4d8d,color:#0a0a0f

Height is n, not log n. Search for 50 takes 5 comparisons in this 5-node tree — O(n). Insert and delete are the same. You've built an O(n) linked list with extra steps.

This is not a theoretical concern. It's the classic gotcha if you're building a BST from pre-sorted data — timestamps from a log file, IDs from an autoincrementing database column, file names in lexicographic order. Real systems hit this.

The deeper issue: the plain BST has no self-correcting mechanism. Once the tree degenerates, it stays degenerate. The only fix is to never let it happen.

Self-balancing trees: the actual fix

The solution is to maintain a height invariant at insert and delete time, using rotations to restructure the tree when it gets lopsided. Two major variants:

  • AVL trees — strictly balance-enforced: the heights of the left and right subtrees of any node differ by at most 1. Guaranteed O(log n) worst case; slightly more rotation overhead on insert/delete.
  • Red-black trees — a relaxed coloring invariant that permits up to 2× height difference between subtrees, but guarantees O(log n) worst case with fewer rotations. Java's TreeMap, C++'s std::map, and Python's sortedcontainers.SortedList (via skip lists, a cousin) all use this class of structure.

The full rotation mechanics live in the balanced BST deep dive. The point to internalize here: a plain BST is a teaching tool and occasionally the right choice for small, random-ish data. For production, you almost always want the balanced variant.

BST vs. hash table: pick the right one

This is one of the most common interview design questions. The short answer:

You need…WinnerWhy
Fast membership test / exact lookupHash tableO(1) average vs O(log n)
Sorted iteration over keysBSTHash tables have no ordering
Range query: "all keys 10–50"BSTWalk a contiguous inorder range
Min / max in O(log n)BSTLeftmost / rightmost node
Successor / predecessorBSTHash table requires full scan
Unknown key distribution, worst-case guaranteesBalanced BSTNo hash collision pathology

A hash table will beat a BST on pure lookup speed — O(1) amortized vs O(log n) is a real gap, and the constant factor favors hash tables too (no pointer chasing through tree levels, better cache behavior for small tables). But the moment you need ordering, the hash table is helpless. That's the dividing line.

If your keys are strings and you need prefix queries ("all keys starting with 'app'"), look at a trie instead. If you need a sorted sequence with fast rank queries ("what's the kth smallest?"), an order-statistics tree (an augmented BST) is the tool.

A fuller worked example: validate a BST

This is one of the most common BST interview problems (Leetcode 98). The naive approach — check at each node that left.val < root.val < right.val — fails on cases like this:

    5
   / \
  1   4
     / \
    3   6

Node 4's children (3 and 6) satisfy the local check, but 3 is in the right subtree of 5, so it should be > 5. The local check misses it.

The correct approach threads a valid range [lo, hi] down the tree:

def is_valid_bst(root, lo=float('-inf'), hi=float('inf')):
    if root is None:
        return True                        # empty tree is valid
    if not (lo < root.val < hi):
        return False                       # violates the inherited range
    return (is_valid_bst(root.left,  lo, root.val) and   # left must be < root
            is_valid_bst(root.right, root.val, hi))      # right must be > root

Every recursive call narrows the valid range. When you go left, the new upper bound is the current node's value. When you go right, the new lower bound is. O(n) time — every node visited once — O(h) space on the call stack.

The inorder approach works too: collect nodes in inorder and check if the sequence is strictly increasing. Equivalent correctness, slightly more code.

Anti-patterns and real gotchas

Inserting already-sorted data — covered above. Profile or shuffle your data before bulk-loading a BST if you can't use a balanced variant.

Forgetting duplicates — the classic BST definition says left < node < right, so duplicates have no home. Decide up front: ignore duplicates, allow them in the left subtree (left ≤ node), or keep a count at each node. Inconsistency here causes subtle bugs in delete.

Treating a BST like a balanced BST — just because the average case is O(log n) doesn't mean yours will be. If you're not using a self-balancing implementation, you're trusting your input to be random-enough. That's an assumption that tends to fail at exactly the moment you can least afford it.

Recursive implementations on deep trees — a degenerate BST with 100,000 nodes is 100,000 levels deep. That blows the call stack. Iterative implementations using an explicit stack or a while loop avoid this. In interviews it rarely matters; in production it absolutely can.

Implementing your own instead of using the stdlib — Python's sortedcontainers.SortedList and SortedDict, Java's TreeMap/TreeSet, C++'s std::map/std::set are all balanced and battle-tested. Reach for them before rolling your own BST unless you're in an interview or need a specific augmentation.

What's next

A plain BST is the concept; balanced BSTs (AVL, red-black) are what you actually deploy. Once you're comfortable with the BST invariant and the core operations here, the balancing machinery in the next article will make sense as a natural extension — rotations are just local pointer swaps that restore height balance without breaking the search property.

If you want to understand why BSTs are designed the way they are, revisit binary trees for the general tree vocabulary, and binary search for the array-based version of the same divide-and-conquer idea. The BST is precisely what you get when you embed binary search into a dynamically modifiable structure.

// FAQ

Frequently asked questions

What is the BST property, and why does it matter?

Every node N in a binary search tree satisfies: all values in its left subtree are strictly less than N, and all values in its right subtree are strictly greater. This invariant lets you eliminate half the remaining candidates at every step during search, giving O(log n) average-case performance — the same logic as binary search on a sorted array, but on a dynamic structure that supports cheap insert and delete.

Why can a binary search tree degrade to O(n)?

If you insert values in sorted or nearly-sorted order, each new node becomes the right child of the previous one. The tree is a straight line — a linked list wearing a tree costume — and search has to walk every node. This is the fatal flaw of a plain BST. Self-balancing trees (AVL, red-black) prevent it by rotating nodes after insert and delete to keep the height close to log n.

How does deleting a node with two children work in a BST?

You can't just remove a two-child node without breaking the BST property. The standard solution: find the node's inorder successor (the smallest value in its right subtree), copy its value into the node you wanted to delete, then delete the successor instead. The successor has at most one child (a right child), so that second deletion is straightforward.

What does inorder traversal have to do with sorting?

Inorder traversal — left subtree, then root, then right subtree — visits nodes in ascending order by the BST property. That means if you insert n values into a BST and run inorder traversal, you get them back sorted. This is actually the skeleton of tree sort, though it costs O(n log n) on average for the inserts and O(n) for the traversal.

When should I use a BST instead of a hash table?

Use a BST when you need ordering — range queries ("give me everything between 10 and 50"), finding the minimum or maximum, predecessor/successor lookups, or iterating keys in sorted order. A hash table gives O(1) average for exact lookups but knows nothing about order. If all you need is fast membership testing and no ordering, the hash table wins. If order matters, BST (or better, a balanced variant) is the right call.

// RELATED

You may also like