Trees: When Data Branches
Move beyond flat lists and discover why trees power everything from file systems to databases. Learn nodes, roots, leaves, height, and how a balanced tree buys you O(log n) on a silver platter.
The moment you outgrow flat
Here''s a bug you''ve probably written. You have a sorted array of user IDs and you need to check membership constantly as new IDs arrive. Binary search gets you O(log n) lookups — great. But insertions? You''re shifting elements to keep it sorted, O(n) every time. By the time your list hits 100,000 entries, each new insert is scanning and moving thousands of elements.
Or maybe you went the other direction and used a hash set: O(1) insert and lookup, but now a product manager asks for "all users with IDs between 1000 and 2000." Hash tables have no concept of order. You''re scanning the whole thing.
This is the gap trees fill. A balanced BST gives you O(log n) insert, O(log n) search, and the ability to answer range queries efficiently — all at once. Before we get there, let''s build the vocabulary.
Anatomy of a tree
flowchart TD
A["50 (root)"]:::root --> B["30"]:::inner
A --> C["70"]:::inner
B --> D["20"]:::leaf
B --> E["40"]:::leaf
C --> F["60"]:::leaf
C --> G["80"]:::leaf
classDef root fill:#7c5cff,color:#ffffff
classDef inner fill:#22d3ee,color:#0a0a0f
classDef leaf fill:#00ff9d,color:#0a0a0f
A few terms you''ll see everywhere:
- Node — the basic unit. It holds a value and references to its children.
- Root — the single node with no parent. Every tree has exactly one.
- Leaf — a node with no children. The endpoints of every branch.
- Edge — the connection between a parent and a child.
- Height — the longest path from the root down to any leaf, measured in edges. This is the number that controls everything.
- Depth — the distance from the root to a specific node. The root is at depth 0.
- Subtree — any node plus all its descendants. Every node is the root of its own subtree.
In the diagram above, 50 is the root, 20, 40, 60, and 80 are leaves, and the height is 2 (root → 30 → 20 is two edges).
Why height is everything
The cost of searching, inserting, or deleting in a BST is proportional to height — you walk from the root to the target node, and the path can be at most height steps long. So the whole game is: keep the height small.
How small can you get? A perfectly balanced binary tree with n nodes has height ⌊log₂ n⌋. That is as good as it gets.
| Nodes | Balanced height | Worst-case (unbalanced) height |
|---|---|---|
| 7 | 2 | 6 |
| 1,000 | ~10 | 999 |
| 1,000,000 | ~20 | 999,999 |
| 1,000,000,000 | ~30 | 999,999,999 |
That column on the right — worst-case height — is what happens when you insert already-sorted data into a naive BST. You get a straight chain. No branching. It becomes a linked list with extra steps, and O(log n) collapses to O(n). This is not hypothetical; it bites people in production.
The fix is a self-balancing BST: AVL trees and Red-Black trees keep the height within a constant factor of log n through rotations after each insert and delete. You almost never implement these yourself — language standard libraries do it for you (TreeMap in Java, std::map in C++, SortedList in C#) — but you should know why they exist.
The BST ordering rule
Let''s be precise. A binary search tree satisfies one invariant at every node:
All values in the left subtree are strictly less than this node''s value.
All values in the right subtree are strictly greater than this node''s value.
This holds recursively — not just for immediate children, but for the entire subtree. That is a common misconception. The node 40 in the tree above isn''t just greater than 30 (its parent); it''s also less than 50 (its grandparent). Every ancestor constrains it.
{ "type": "bst", "title": "the ordering rule — play with insert and search", "data": [50, 30, 70, 20, 40, 60, 80] }
Use the visualizer to insert a value — say, 45. Watch it walk right from 50 (45 < 50, go left), then right from 30 (45 > 30, go right), then right from 40 (45 > 40, go right) and land as 40''s right child. Every comparison is a binary decision that cuts the remaining candidates in half.
Core operations: how they actually work
Search starts at the root. At each node, compare your target to the current value: if equal, found; if less, go left; if greater, go right. Repeat until you find it or fall off the tree (not present). Cost: O(h) where h is the height.
Insert is the same walk, except when you fall off the tree you''ve found the exact spot where the new node belongs — attach it there. O(h).
Delete has three cases worth knowing:
- The node is a leaf: just remove it.
- The node has one child: splice it out and connect the parent directly to the child.
- The node has two children: find the in-order successor (the smallest value in the right subtree) or in-order predecessor (largest in the left subtree), copy its value to the current node, then delete that node instead. This sounds fiddly but is a standard pattern.
class Node:
def __init__(self, val):
self.val = val
self.left = None
self.right = None
class BST:
def __init__(self):
self.root = None
def insert(self, val):
if not self.root:
self.root = Node(val)
return
cur = self.root
while True:
if val < cur.val:
if cur.left is None:
cur.left = Node(val)
return
cur = cur.left
else:
if cur.right is None:
cur.right = Node(val)
return
cur = cur.right
def search(self, val):
cur = self.root
while cur:
if val == cur.val:
return True
cur = cur.left if val < cur.val else cur.right
return False
Notice how both insert and search are the same loop — walk left or right based on the comparison, stop when you find it or run out of tree.
Traversals: the four ways to visit every node
Once you have a tree, you often need to visit every node. The order you do it in changes what you get out.
flowchart LR
direction LR
T1["In-order\nleft → node → right"]:::t --> R1["Sorted output\n20, 30, 40, 50, 60, 70, 80"]:::r
T2["Pre-order\nnode → left → right"]:::t --> R2["Parent before children\nGood for copying/serializing"]:::r
T3["Post-order\nleft → right → node"]:::t --> R3["Children before parent\nGood for deletion, sizing"]:::r
T4["Level-order (BFS)\nrow by row"]:::t --> R4["Uses a queue\nGood for shortest path in trees"]:::r
classDef t fill:#7c5cff,color:#ffffff
classDef r fill:#1e1e2e,color:#cdd6f4
The one you''ll reach for most in interviews is in-order — for a BST it gives you a perfectly sorted sequence in O(n), which is genuinely magical. Write it once and internalize the shape:
def inorder(node, result=[]):
if node is None:
return
inorder(node.left, result)
result.append(node.val) # process AFTER left, BEFORE right
inorder(node.right, result)
return result
Swap the result.append line before the recursive calls and you have pre-order. Move it after both recursive calls and you have post-order. The structure is identical — only when you process the node changes.
Level-order (breadth-first) is the odd one out — it uses a queue rather than recursion:
from collections import deque
def level_order(root):
if not root:
return []
result, queue = [], deque([root])
while queue:
node = queue.popleft()
result.append(node.val)
if node.left: queue.append(node.left)
if node.right: queue.append(node.right)
return result
Level-order shows up constantly in problems that ask about "the nearest" something — shortest path in a graph, minimum depth of a tree, right-side view. The queue ensures you finish every node at depth d before touching any node at depth d+1.
The two specializations: BST and Heap
Trees branch into two structures you''ll use heavily and both deserve their own deep dives.
The binary search tree is what this article has been building toward — the ordering invariant that makes search logarithmic. The binary tree page covers the general structure without the ordering constraint, which matters for problems involving expression trees, Huffman coding, and tree DP.
The heap is a tree in disguise. It stores data in an array but the conceptual structure is a complete binary tree with a different invariant: every parent is smaller than its children (min-heap) or larger (max-heap). It doesn''t support search by arbitrary value, but it gives you the minimum (or maximum) in O(1) and insert/delete in O(log n). The moment a problem says "give me the k smallest elements" or "always pop the highest-priority task next," you want a heap, not a BST.
| BST | Heap | |
|---|---|---|
| Find min/max | O(log n) | O(1) |
| Search by value | O(log n) | O(n) |
| Insert | O(log n) | O(log n) |
| Delete arbitrary | O(log n) | O(n) |
| Sorted traversal | O(n) in-order | O(n log n) |
| Memory layout | Pointer-based | Array (cache-friendly) |
The pitfalls that get people
Inserting in sorted order into a naive BST. If you insert 1, 2, 3, 4, 5 sequentially, every node goes right and you get a straight chain. Height becomes n. Always think about whether your insert sequence might be monotone — and if it is, use a self-balancing variant.
Confusing height and depth. Height is measured from a node downward to the farthest leaf. Depth is measured from the root down to that node. The root has depth 0 and height equal to the tree''s height. Leaf nodes have depth equal to their distance from the root and height 0. Problems mix these up constantly — read carefully.
Forgetting the recursive invariant. The BST property is not just "left child < node < right child." It is "every node in the left subtree < node < every node in the right subtree." A common buggy approach is to check only adjacent parent-child pairs when validating a BST. The proper check passes down a range [min_allowed, max_allowed] that tightens as you recurse.
def is_valid_bst(node, lo=float('-inf'), hi=float('inf')):
if node is None:
return True
if not (lo < node.val < hi):
return False
return (is_valid_bst(node.left, lo, node.val) and
is_valid_bst(node.right, node.val, hi))
Stack overflow on deep trees. Recursive traversals on a heavily unbalanced tree can blow the call stack (Python''s default recursion limit is 1000). For production code on trees of unknown depth, use an iterative approach with an explicit stack.
When a tree is the wrong tool
Trees are not your default structure — arrays and hash tables are. Reach for a tree when you need at least two of these at once: ordered data, fast inserts, fast range queries. If you only need fast lookups and don''t care about order, a hash table is simpler and faster. If your data never changes after it''s built, a sorted array with binary search is cache-friendlier and needs no pointers.
Also: if you''re storing a small collection (under a few hundred elements), the overhead of pointer-based tree nodes often makes a simple sorted array faster in practice due to cache effects — even when the Big-O says the tree should win.
For the deeper treatment of the BST itself — rotations, self-balancing, the deletion algorithm in full detail, and the interview problems trees are famous for — head to Binary Search Tree. For the heap and its canonical use cases (priority queues, heapsort, top-K problems), Heap is the next read.
Frequently asked questions
▸What is a tree data structure?
A tree is a hierarchical data structure made of nodes, where each node holds a value and zero or more references (called edges) to child nodes. One special node — the root — sits at the top with no parent. Every other node has exactly one parent, and there are no cycles. Trees are everywhere: file systems, HTML DOMs, company org charts, and the decision trees inside your compiler.
▸What is the difference between a binary tree and a binary search tree?
A binary tree is just a tree where every node has at most two children — left and right. That is purely a structural constraint, nothing about the values. A binary search tree (BST) adds an ordering rule on top: every value in a node's left subtree is smaller than the node, and every value in the right subtree is larger. That ordering is what makes search, insert, and delete take O(log n) on a balanced BST instead of O(n).
▸Why does a balanced tree give O(log n) operations?
At each node in a balanced BST you eliminate roughly half the remaining candidates. With 1 million nodes, you make at most ~20 comparisons (because 2^20 ≈ 1 million). More formally, a perfectly balanced tree of n nodes has height ⌊log₂ n⌋, and height is exactly how many steps a search takes. When the tree is unbalanced — a straight chain of nodes — height becomes n and you lose the logarithm entirely.
▸What are the three main ways to traverse a binary tree?
In-order (left → node → right) visits nodes in sorted order for a BST — incredibly useful. Pre-order (node → left → right) processes the parent before its children, which mirrors how you'd serialize a tree or build a copy. Post-order (left → right → node) processes children first, which is natural for deletion or computing subtree sizes. There is also level-order (breadth-first), which visits nodes row by row using a queue.
▸When should I use a tree instead of an array or hash table?
Use a tree when you need sorted order AND fast insert/delete — something arrays can't do simultaneously, and hash tables don't support at all. A BST gives you O(log n) search, insert, delete, and in-order traversal in one package. Hash tables beat trees on pure lookup speed (O(1) vs O(log n)), but they can't give you "find all values between 100 and 200" efficiently. Trees also shine whenever the data is inherently hierarchical (file paths, expression evaluation, decision logic).