◆◆Intermediateasked at Googleasked at Metaasked at Amazon

Backtracking

Learn the backtracking algorithm pattern — the systematic brute force that prunes itself. Master the universal choose-explore-unchoose template and apply it to subsets, permutations, combination sum, and N-queens.

10 min read2026-01-15Ironclad Academy

#backtracking #recursion #combinatorics #dfs

// DEPTH

the full breakdown — requirements, capacity, evolution, trade-offs

The moment you realize you need it

You have a problem that looks like this: generate all subsets of [1, 2, 3]. Or find every valid arrangement of 4 queens on a 4×4 chessboard. Or fill in this sudoku grid.

The first instinct is to loop, but the loops are nested and you don't know how many levels deep you need. You could write three nested loops for three elements, but what about n elements? The depth of the nesting is variable, which is the clearest sign that recursion is the right tool — and specifically backtracking, because you need to explore every branch but undo your state between them.

Here is another tell: the problem has constraints that can eliminate whole swaths of possibilities early. For N-queens, placing a queen in column 0 of row 0 instantly invalidates every position in that column, that row, and those diagonals for all future rows. If you catch that early, you skip an enormous subtree. Backtracking builds in that early exit naturally.

The universal template

Every backtracking solution you will ever write is a version of this:

def backtrack(state, choices):
    if is_complete(state):
        results.append(state.copy())   # found a valid solution
        return

    for choice in choices:
        if not is_valid(state, choice):
            continue                    # prune this branch entirely

        # choose
        state.add(choice)

        # explore
        backtrack(state, remaining_choices(choices, choice))

        # un-choose (undo)
        state.remove(choice)

The state is whatever you've built so far — a partial list, a board, a path. The choices are what you could add next. The is_valid check is your pruning logic. The un-choose — state.remove(choice) — is what makes this reusable: you hand the exact same object to every branch, mutate it, recurse, then restore it.

The single most common mistake is forgetting the un-choose. If you skip it, every branch inherits the garbage left by its siblings and your results will be completely wrong.

Worked example 1: all subsets

Given nums = [1, 2, 3], generate every possible subset (including the empty set).

The decision tree is binary: for each element, you either include it or you don't.

flowchart TD
    R["[]"] --> A1["[1]"]
    R --> A2["[]"]
    A1 --> B1["[1,2]"]
    A1 --> B2["[1]"]
    A2 --> B3["[2]"]
    A2 --> B4["[]"]
    B1 --> C1["[1,2,3]"]:::valid
    B1 --> C2["[1,2]"]:::valid
    B2 --> C3["[1,3]"]:::valid
    B2 --> C4["[1]"]:::valid
    B3 --> C5["[2,3]"]:::valid
    B3 --> C6["[2]"]:::valid
    B4 --> C7["[3]"]:::valid
    B4 --> C8["[]"]:::valid
    classDef valid fill:#00ff9d,color:#0a0a0f

Every leaf is valid — there are no constraints here, so no pruning. This makes subsets the cleanest possible example to see the template in action.

def subsets(nums):
    results = []
    current = []

    def backtrack(start):
        results.append(current.copy())  # every partial state is a valid subset

        for i in range(start, len(nums)):
            current.append(nums[i])     # choose
            backtrack(i + 1)            # explore (only elements after i, to avoid duplicates)
            current.pop()               # un-choose

    backtrack(0)
    return results

The start index is the subtle piece. By only looking at elements from i + 1 onward, you guarantee that each subset is generated exactly once and always in the same order. Complexity: O(2^n × n) — there are 2^n subsets and each takes up to O(n) to copy. See subsets and permutations for more depth on the enumeration patterns.

Worked example 2: permutations

Now generate every ordering of [1, 2, 3]. Every element must appear exactly once, and order matters. There are 3! = 6 results.

This time the pruning condition is "don't use an element already in the current arrangement":

def permutations(nums):
    results = []
    current = []
    used = [False] * len(nums)

    def backtrack():
        if len(current) == len(nums):
            results.append(current.copy())
            return

        for i in range(len(nums)):
            if used[i]:
                continue            # prune: already in the current arrangement

            used[i] = True
            current.append(nums[i])     # choose

            backtrack()                 # explore

            current.pop()               # un-choose
            used[i] = False

    backtrack()
    return results

Notice the un-choose is now two lines: current.pop() and used[i] = False. Every mutation you made on the way in gets reversed on the way out. The used array is the state shared across all recursive calls — you never copy it, you mutate and restore.

Complexity: O(n! × n). There are n! permutations; copying each takes O(n).

Worked example 3: combination sum (with pruning that actually matters)

This is where backtracking earns its keep. Given candidates [2, 3, 6, 7] and a target 7, find all combinations of candidates (with repetition allowed) that sum to exactly 7.

def combination_sum(candidates, target):
    candidates.sort()               # enables early termination
    results = []
    current = []

    def backtrack(start, remaining):
        if remaining == 0:
            results.append(current.copy())
            return

        for i in range(start, len(candidates)):
            c = candidates[i]
            if c > remaining:
                break               # sorted: everything after is also too big → prune the rest
            current.append(c)
            backtrack(i, remaining - c)     # i, not i+1, because repetition is allowed
            current.pop()

    backtrack(0, target)
    return results

The break on c > remaining is pruning. Because the candidates are sorted, once one candidate exceeds the remaining sum, all subsequent candidates will too — so you cut the entire rest of that loop. Without sorting and this break, you explore dead branches all the way down. With it, the practical performance is dramatically better even though the worst case is still exponential.

This pattern — sort first, break when a constraint turns monotone — applies in dozens of problems.

Worked example 4: N-queens sketch

N-queens is the canonical hard backtracking problem. Place N queens on an N×N chessboard so no two attack each other (same row, column, or diagonal).

The state is a partial board: you place one queen per row, top to bottom. For each row, you try each column. Before placing, you check whether that column or either diagonal is already threatened.

def solve_n_queens(n):
    results = []
    cols_used = set()
    diag1_used = set()   # row - col is constant on a top-left diagonal
    diag2_used = set()   # row + col is constant on a top-right diagonal
    board = [-1] * n     # board[row] = column where the queen is placed

    def backtrack(row):
        if row == n:
            results.append(board[:])
            return

        for col in range(n):
            if col in cols_used or (row - col) in diag1_used or (row + col) in diag2_used:
                continue            # this position is attacked — prune the entire subtree

            # choose
            board[row] = col
            cols_used.add(col)
            diag1_used.add(row - col)
            diag2_used.add(row + col)

            backtrack(row + 1)      # explore

            # un-choose
            cols_used.discard(col)
            diag1_used.discard(row - col)
            diag2_used.discard(row + col)

    backtrack(0)
    return results

For N=8, the full search space is 8^8 = ~16 million positions. Backtracking with constraint checks visits only 15,720. That is a three-order-of-magnitude reduction from pruning alone. This is why the technique exists.

The same structure applies to sudoku: instead of rows, you iterate over empty cells. Instead of columns and diagonals, you check the row, column, and 3×3 box constraints.

The recursion visualizer and the decision tree

When you're debugging a backtracking solution, the hardest part is seeing which branches got pruned and which didn't. Step back through the call stack and watch the return path:

{ "type": "recursion", "fn": "fibonacci", "n": 5, "title": "overlapping subproblems in a recursive tree" }

Step through this slowly. See how certain subtrees appear more than once? That overlap is why, for some problems, backtracking gets replaced by dynamic programming — you memoize the repeated subproblems instead of recomputing them. But notice the difference: in permutations and N-queens, the path to a node matters (a queen in column 2 of row 1 means something different than a queen in column 2 of row 3), so you cannot memoize on just the current state. Backtracking is correct when the path, not just the current node, defines the problem.

For more on how recursion builds up call stacks, see Recursion crash course.

How pruning interacts with complexity

Worst-case complexity is exponential and nothing changes that. But the actual running time depends on how aggressively you prune:

Problem	Full search space	After pruning (typical)
Subsets of n=20	2^20 ≈ 1M	1M (no constraints to prune)
Permutations of n=8	8^8 ≈ 16M	40,320 (= 8! — used[] eliminates most)
N-queens n=8	8^8 ≈ 16M	~15,720 positions visited
Combination sum (sorted + break)	varies	often 10–100× smaller

The N-queens number is striking. Pruning at each row eliminates invalid columns and diagonals immediately, so the tree that gets explored is tiny compared to the brute-force space.

The lesson: write the constraint check before you recurse, not after. The goal is to kill a branch as early as possible — before any child calls happen. A check at the base case is still correct, but it means you spent the entire depth of the tree before discovering the branch was bad.

Common mistakes

Forgetting the un-choose. This is by far the most common bug. You add something to your state, recurse, and then just… move on to the next iteration. The next branch then starts with corrupted state. Every mutation must be reversed before the loop continues.

Copying state on the way in instead of restoring on the way out. You can avoid the un-choose by passing copies of your state to each recursive call (backtrack(current + [choice], ...) instead of mutating current). This is correct and sometimes cleaner for small problems, but it costs O(n) per call for the copy — n levels deep, that adds an O(n) multiplier to everything. For big inputs, in-place mutation with explicit restoration is faster.

Pruning too late. Checking if remaining < 0: return at the top of the function is better than nothing, but sorting first and breaking when the current candidate already exceeds the remaining budget is much better — it skips entire siblings in one shot instead of one at a time.

Not sorting when you have duplicates. If candidates = [1, 1, 2] and you're generating subsets without duplicates, you need to both sort and skip repeated choices at the same recursion level (if i > start and candidates[i] == candidates[i-1]: continue). Without this, you get the same subset generated multiple times via different paths.

Conflating backtracking with DP. If the problem asks "how many ways can you make change for $N?" — that's DP. You don't need the actual combinations, just the count. Backtracking that count is exponentially slower than a bottom-up DP table.

When NOT to use backtracking

Backtracking is the right tool only when n is small enough. As a rough guide:

Subset-style (2^n): n ≤ 20 is usually fine; n ≤ 25 with heavy pruning; n > 30 is a red flag.
Permutation-style (n!): n ≤ 10 is fine; n = 12 is ~500 million, probably too slow.

If n is large and the problem asks only for a count or an optimal value, you almost certainly want dynamic programming instead. If the problem has an explicit graph structure — find all paths from A to B in a graph — the same choose/explore/un-choose pattern applies but lives in a graph DFS with a visited set playing the role of the constraint checker.

If you need all valid subsets or permutations as actual outputs and n is genuinely large, there is no efficient algorithm — the output itself is exponentially large. That usually means the problem is either constrained more than it appears, or you're supposed to return just one answer (use backtracking and return on the first success), not all of them.

The one-paragraph summary you can take into an interview

Backtracking is depth-first search over a decision tree. At each node you pick a choice from what's available, add it to your current partial solution, recurse, then undo the addition and try the next choice. Before recursing, you check whether the partial solution already violates a constraint — if it does, you skip the entire subtree. The template is always: choose → recurse → un-choose. The input has to be small (n ≤ 20ish) because complexity is exponential, but pruning can make the practical cost orders of magnitude smaller than the theoretical worst case. When you see "generate all," "find all valid," or "enumerate every possible," and the constraints suggest small input, that is your signal to reach for this pattern.

// FAQ

Frequently asked questions

▸What is backtracking in algorithms?

Backtracking is a recursive technique that builds a solution piece by piece, abandoning a partial solution the moment it determines that path cannot lead to a valid answer. It explores a decision tree — at each step you make a choice, recurse into it, and then undo that choice so you can try the next one. It is essentially depth-first search over a space of partial solutions.

▸How is backtracking different from brute force?

Pure brute force generates all possibilities first, then filters them. Backtracking prunes as it goes — as soon as it knows a partial path cannot possibly succeed, it abandons that branch and backtracks immediately. This can cut the actual work down dramatically even though the worst-case complexity is the same. The size of the pruned tree versus the full tree is the measure of how much pruning helps.

▸What is the time complexity of backtracking algorithms?

Exponential in general — O(2^n) for subset-style problems, O(n!) for permutation-style problems. This is unavoidable when you need to enumerate all valid combinations or arrangements. Backtracking is the right tool specifically when n is small (typically n ≤ 20 for subset-like problems, n ≤ 10 for permutations) and you need to enumerate or find all valid solutions, not just count them.

▸When should I use backtracking versus dynamic programming?

Use backtracking when you need to enumerate all solutions or find one valid arrangement (N-queens, sudoku, permutations). Use dynamic programming when you only need a count or an optimal value (number of subsets summing to k, minimum coins). Many problems that feel like backtracking actually only need DP because you discard the actual paths and just keep a number. If your problem says "find ALL" or "generate ALL" and n is small, that is a backtracking signal.

▸What does "un-choose" mean in backtracking?

After you recurse into a choice and return, you must undo whatever that choice did to your state before trying the next option. If you added an element to a list before recursing, remove it after. If you marked a cell as used, unmark it. This restoration is what makes the same mutable data structure safely reusable across all branches of the recursion, avoiding the cost of copying it for each branch.

← previous

BFS Patterns

// RELATED