DFS Patterns
Depth-first search is the backbone of cycle detection, flood fill, path enumeration, and clone graph — master the visited-set template, the 3-color trick, and when to reach for DFS over BFS.
The mental model: commit and backtrack
Imagine exploring a cave system with one rule: always take the first unexplored tunnel you see, and only backtrack when you hit a dead end or a tunnel you've already mapped. That is DFS. You trace one path all the way to its conclusion, then retrace your steps just far enough to find the next unexplored fork.
The thing that makes this powerful — and different from BFS — is that the path from the entrance to your current position is always fully in memory. The call stack is literally a recording of the route you took. That is what makes DFS natural for problems about paths, cycles, and reachability along specific routes.
flowchart TD
A["0"] --> B["1"]
A --> C["4"]
B --> D["2"]
B --> E["3"]
D --> F["5"]
style A fill:#7c5cff,color:#fff
style B fill:#22d3ee,color:#0a0a0f
style D fill:#22d3ee,color:#0a0a0f
style F fill:#00ff9d,color:#0a0a0f
style E fill:#00ff9d,color:#0a0a0f
style C fill:#00ff9d,color:#0a0a0f
DFS on this graph from node 0: visit 0, go to 1, go to 2, go to 5 (dead end, backtrack), back to 2 (done), back to 1, go to 3 (dead end), back to 1 (done), back to 0, go to 4 (dead end). The call stack at peak depth holds [0, 1, 2, 5] — the exact path from root to current position.
The canonical template
This is the four-line core. Every DFS problem you'll ever write is a variation of this:
def dfs(graph, node, visited):
if node in visited:
return
visited.add(node)
for neighbor in graph[node]:
dfs(graph, neighbor, visited)
That's it. Check if visited. Mark visited. Recurse on each neighbor. The work you actually want to do — processing a node, accumulating a path, checking a condition — gets inserted inside this skeleton.
The visited set is not optional. On a graph with cycles, skipping it sends you into infinite recursion. Even on a DAG, revisiting nodes wastes work and can produce incorrect results. Always bring the set.
Iterative form (when recursion depth is a problem)
Python's default recursion limit is 1000. Grids can be 300×300 = 90,000 cells. You'll hit the limit. The fix is an explicit stack:
def dfs_iterative(graph, start):
visited = set()
stack = [start]
while stack:
node = stack.pop()
if node in visited:
continue
visited.add(node)
for neighbor in graph[node]:
if neighbor not in visited:
stack.append(neighbor)
One subtle difference: because stack.pop() takes from the end and neighbors get appended in list order, the iterative version visits neighbors in reverse order compared to recursive DFS. For most problems this doesn't matter. When exact traversal order does matter — like enumerating paths — stick with recursion or be careful about your append order.
Complexity
| What | Cost | Note |
|---|---|---|
| Time | O(V + E) | Every vertex visited once, every edge examined once |
| Space (call stack) | O(V) | Worst case: a long path graph, recursion V levels deep |
| Space (visited set) | O(V) | One entry per vertex |
Compare this to BFS, which has identical O(V + E) time and O(V) space. The algorithms are equal on raw complexity — the choice between them is about what the problem needs, not about efficiency.
Pattern 1: flood fill / number of islands
This is the "entry-level" DFS problem shape, and it's the one that teaches you the grid-as-graph reframe cleanest.
The tell: You have a 2D grid. You need to count connected regions, or paint a connected region, or find if you can reach one cell from another.
The insight: Each cell is a node. Each adjacent cell (up/down/left/right, sometimes diagonals) is an edge. You never need to build an explicit graph — just run DFS directly on the grid.
def num_islands(grid):
if not grid:
return 0
rows, cols = len(grid), len(grid[0])
count = 0
def dfs(r, c):
# out of bounds or water or already visited
if r < 0 or r >= rows or c < 0 or c >= cols or grid[r][c] != '1':
return
grid[r][c] = '#' # mark visited by mutating in place
dfs(r + 1, c)
dfs(r - 1, c)
dfs(r, c + 1)
dfs(r, c - 1)
for r in range(rows):
for c in range(cols):
if grid[r][c] == '1':
count += 1
dfs(r, c) # sinks the entire island
return count
Two tricks here. First: using '#' to mark visited cells directly in the grid avoids a separate O(rows×cols) visited matrix. Second: when the outer loop finds an unvisited '1', it's guaranteed to be a new island — the DFS call that follows sinks the entire connected landmass before the outer loop continues.
Flood fill (LeetCode 733) is the same algorithm with paint colors instead of '1'/'0'. If you can write one, you can write the other.
{ "type": "graph", "title": "Grid DFS: each cell is a node, neighbors are edges" }
Play with the visualizer above — mentally map the graph nodes to grid cells. DFS starting from any land cell fans out to all connected land before returning. That's your island count right there.
Pattern 2: path existence and full-path enumeration
Sometimes you need to know if a path exists. Sometimes you need all the paths. DFS handles both; the difference is what you do when you reach the target.
Path existence
def has_path(graph, src, dst, visited=None):
if visited is None:
visited = set()
if src == dst:
return True
if src in visited:
return False
visited.add(src)
return any(has_path(graph, neighbor, dst, visited)
for neighbor in graph[src])
Short-circuits the moment it finds the target. O(V + E) worst case (path doesn't exist, you've explored everything).
All paths source to target (no cycles in this version)
LeetCode 797 asks for all paths from node 0 to node n-1 in a DAG. The trick is that you need to include the current path in your state, which means backtracking — explicitly undoing the "add to path" step when you return:
def all_paths(graph):
n = len(graph)
result = []
def dfs(node, path):
if node == n - 1:
result.append(list(path)) # snapshot — don't append the reference
return
for neighbor in graph[node]:
path.append(neighbor)
dfs(neighbor, path)
path.pop() # backtrack: undo before trying next neighbor
dfs(0, [0])
return result
That path.pop() is the backtrack. Without it, you'd be accumulating garbage from previous branches. This is the exact same structure as backtracking — DFS with explicit undo steps.
Pattern 3: cycle detection in directed graphs (3-color)
This is the one that trips people up. If you use a simple two-color visited set on a directed graph, you get false positives.
Why two colors fail: Consider nodes A → B → C and A → C. If you visit A, then B, then C (mark C as visited), then backtrack and try the direct edge A → C — you see C is visited and incorrectly call it a cycle. But it's not: the two paths from A to C don't form a loop.
Three colors fix it:
- WHITE (0): unvisited
- GRAY (1): currently on the DFS stack (in-progress)
- BLACK (2): fully explored — all descendants processed
A cycle exists if and only if you ever reach a GRAY node while doing DFS. A GRAY node means "I'm an ancestor of the current position." Reaching it again means there's a path from a descendant back up to an ancestor — that's a back edge, which is a cycle.
def has_cycle_directed(graph, n):
WHITE, GRAY, BLACK = 0, 1, 2
color = [WHITE] * n
def dfs(node):
color[node] = GRAY
for neighbor in graph[node]:
if color[neighbor] == GRAY:
return True # back edge → cycle
if color[neighbor] == WHITE:
if dfs(neighbor):
return True
color[node] = BLACK
return False
return any(dfs(v) for v in range(n) if color[v] == WHITE)
The color[node] = BLACK line happens after ALL neighbors are fully explored. That's the key timing. GRAY means "I'm somewhere above you in the current path." BLACK means "I've been fully resolved, no back edges reachable from me."
This same algorithm underlies topological sort (specifically Kahn's DFS variant) and deadlock detection in operating systems. A directed graph can be topologically ordered if and only if it has no cycles — i.e., DFS runs without ever hitting a GRAY node.
flowchart LR
A["node: WHITE\n(unvisited)"]
B["node: GRAY\n(on stack)"]
C["node: BLACK\n(done)"]
A -->|"enter DFS"| B
B -->|"all neighbors done"| C
B -->|"reach another GRAY"| E["CYCLE FOUND"]
style A fill:#444,color:#fff
style B fill:#7c5cff,color:#fff
style C fill:#00ff9d,color:#0a0a0f
style E fill:#ff4d8d,color:#fff
Pattern 4: clone graph
Clone graph (LeetCode 133) is the DFS problem that's specifically about building a structure as you traverse. The visited map doubles as your copy-tracking registry:
def clone_graph(node):
if not node:
return None
clones = {} # original node → its clone
def dfs(n):
if n in clones:
return clones[n]
clone = Node(n.val)
clones[n] = clone # register BEFORE recursing (handles cycles)
for neighbor in n.neighbors:
clone.neighbors.append(dfs(neighbor))
return clone
return dfs(node)
The subtle correctness requirement: you must register the clone in clones before recursing into neighbors. Why? The graph can have cycles. If A and B point to each other, when you DFS from A to B, B tries to recurse back to A. If A's clone isn't registered yet, you'll enter infinite recursion. The if n in clones: return clones[n] early-return only works because you register before you recurse.
This pattern generalizes to any "deep copy of a cyclic structure" problem.
DFS vs. BFS: make the call fast
You don't need to think hard about this. One question settles it 90% of the time:
Does the problem care about the shortest path (minimum number of hops)?
- Yes → BFS. BFS's FIFO queue guarantees the first time you reach the destination, you took the minimum number of edges.
- No, or the problem is about structure/reachability/all-paths → DFS.
| Problem type | Reach for |
|---|---|
| Shortest path in unweighted graph | BFS |
| Level-by-level processing | BFS |
| Nearest neighbor / minimum hops | BFS |
| Cycle detection | DFS |
| Connected components | Either (DFS is usually cleaner) |
| All paths from source to target | DFS |
| Flood fill / island count | DFS (BFS works too, DFS is shorter) |
| Topological sort | DFS (Kahn's / 3-color) |
| Clone graph / copy structure | DFS |
| Detect if path exists (don't care which) | Either |
For tree problems — see binary tree — DFS is almost always the right default. Trees have no cycles, so no visited set needed, and recursion on a tree is so natural it barely needs naming.
For general graphs, the choice depends on the question. When in doubt, ask yourself: "Would I need to know how many hops this took?" If no, start with DFS.
Pitfalls and anti-patterns
Forgetting the visited set on a cyclic graph. This is the #1 bug. Recursive DFS on a graph with a cycle will infinite-loop without it. Trees don't need it (no cycles), but graphs do. Habituate yourself: graph DFS → visited set, period.
Two-color visited set on a directed graph for cycle detection. As shown above, it gives false positives. You need three colors: WHITE, GRAY, BLACK. On undirected graphs, two colors is fine — any back edge is a cycle. On directed graphs, it's not.
Mutating input as the visited marker without thinking. Modifying the grid in-place (turning '1' to '#') is fine when the problem says you can, but some problems explicitly state "do not modify input." Check the constraints. When in doubt, carry a separate visited set.
Appending a list reference instead of a copy. Classic bug in all-paths problems:
result.append(path) # WRONG: appends a reference; future pops will corrupt it
result.append(list(path)) # CORRECT: snapshot at this moment
Missing the backtrack. In path-enumeration problems, if you add something to the path before recursing, you must remove it after:
path.append(node)
dfs(node, path)
path.pop() # this must happen — or every branch carries garbage from others
Miss the pop and your paths will accumulate every node you've ever visited across all branches. This is the same mistake beginners make with backtracking — the problem is always a missing undo step.
Stack overflow on large grids. Python's recursion limit bites you on grids larger than ~30×30 in the worst case (a single connected component that's a snake). Either use the iterative DFS template from earlier, or add sys.setrecursionlimit(10**6) at the top of your solution. In competitive programming, the latter is idiomatic. In production code, use the iterative version.
When NOT to use DFS
DFS is powerful but not universal.
- You need shortest path by hop count → use BFS. DFS finds a path, not the shortest one.
- You need shortest weighted path → use Dijkstra's or Bellman-Ford. Neither is DFS.
- The graph is very wide but shallow and you need to find something near the top → BFS finds it in the first few levels; DFS might dive to the bottom of a completely wrong branch first.
- You need connected components on a very large, dense graph → Union-Find (union-find) is often faster in practice due to its nearly O(1) per operation with path compression. DFS is O(V + E) which is fine asymptotically, but Union-Find has a smaller constant and handles dynamic connectivity too.
- The problem requires processing in level order (e.g., a binary tree's level-order output, zigzag traversal) → BFS is the natural tool. You can do it with DFS by tracking depth as a parameter, but it's more work for no gain.
And a meta-pitfall: reaching for DFS on every graph problem because it's familiar. When you read a graph problem, spend five seconds on the "does shortest path matter?" question before committing. Getting it wrong means solving a harder problem than necessary.
Putting it together: the interview game plan
When you see a graph or grid problem:
-
Identify the structure. Is it a grid? A general graph? A tree? Grid → cells are nodes, neighbors are edges, visited array or in-place mutation tracks state.
-
Ask: does distance / shortest path matter? No → DFS is likely fine. Yes → consider BFS.
-
Ask: do you need cycle detection? In a directed graph → 3-color DFS. In an undirected graph → 2-color (or just check
neighbor != parentin the recursive call). -
Reach for the template. Visited set, mark on entry, recurse on unvisited neighbors. Add your problem-specific logic inside the recursion (accumulate a path, flip a cell, build a clone).
-
Check recursion depth. If the graph can have millions of nodes or the grid is large, use the iterative form or bump the recursion limit.
DFS is one of those tools where knowing the template is necessary but not sufficient. What makes the difference is recognizing the shape — flood fill, all-paths, cycle, clone — so you know which variation to reach for. Practice those four patterns until the skeleton is automatic, and the rest is just problem-specific fill.
Frequently asked questions
▸What is depth-first search and how does it work?
Depth-first search (DFS) commits fully to one branch before backtracking and trying another. Starting from a source node, it picks a neighbor, goes as deep as possible along that path until it hits a dead end or an already-visited node, then unwinds and tries the next unvisited neighbor. The call stack — either the language runtime's recursion stack or an explicit stack you manage — is what enforces this "go deep first" discipline.
▸How do you detect a cycle in a directed graph using DFS?
Use three node colors: WHITE (unvisited), GRAY (currently on the DFS stack / in progress), BLACK (fully explored). When you visit a node, mark it GRAY. When you finish exploring all its descendants, mark it BLACK. If during the DFS you ever reach a GRAY node, you have found a back edge — a path that leads back to an ancestor still on the stack — which means there is a cycle. A fully BLACK node is safe; encountering it just means you've merged into an already-explored branch.
▸When should I use DFS instead of BFS?
Reach for DFS when you need to explore full paths (all paths from source to target, permutations, subsets), detect cycles, compute connected components, check reachability without caring about distance, or when you are working on trees where recursion is natural. Use BFS when you need the shortest path in terms of edge count, or when level-by-level processing matters. The structural rule: BFS = shortest hop count; DFS = full exploration, structure, and backtracking.
▸Is the recursive DFS call stack the same as an explicit stack?
Yes, they implement the same algorithm. In the recursive version, the language runtime's call stack stores the DFS state — each recursive call represents "I am currently on this node, mid-way through exploring its neighbors." In the iterative version, you maintain an explicit stack and push/pop nodes manually. Recursive is easier to write and read; iterative avoids stack-overflow on very deep graphs (Python's default recursion limit is 1000). For graphs with millions of nodes, use the iterative version or increase sys.setrecursionlimit.
▸What is the time and space complexity of DFS?
O(V + E) time — every vertex is visited once and every edge is examined once. O(V) space in the worst case for the recursion stack (or explicit stack), which happens on a path graph where the recursion goes V levels deep. The visited set also costs O(V). These bounds hold whether you are doing DFS on a general graph, a grid, or a tree.
You may also like
Topological Sort
Order the nodes of a DAG so every edge points forward — the algorithm that drives build systems, package managers, course schedulers, and anything else that lives and dies by dependency ordering.
Subsets, Permutations & Combinations
Generate every subset, permutation, and combination with the choose/explore/un-choose backtracking template. Understand the 2^n and n! complexity ceilings, why sorting kills duplicate branches, and how to recognize which variant a problem is actually asking for.
Shortest Paths (Dijkstra, Bellman-Ford, BFS)
Pick the right shortest-path algorithm the first time: BFS for unweighted graphs, Dijkstra with a heap for non-negative weights, Bellman-Ford when negative edges appear. Concrete worked examples, a decision table, and every gotcha that costs people the interview.