~/techniques/dp-on-grids
◆◆Intermediateasked at Amazonasked at Googleasked at Microsoft

Dynamic Programming on Grids

Grid DP is the pattern behind unique paths, minimum cost routes, edit distance, and longest common subsequence. Once you see the dependency arrows, every 2D DP problem becomes the same fill-order exercise.

11 min read2026-01-20Ironclad Academy
Complexity cheat sheet
Operation
Average
Worst
Unique paths (m×n grid)each cell filled once from top and left
O(m×n)
O(m×n)
Minimum path sumsame fill order as unique paths
O(m×n)
O(m×n)
Longest common subsequencem and n are the two string lengths
O(m×n)
O(m×n)
Edit distanceLevenshtein; one cell per character pair
O(m×n)
O(m×n)
Space-optimized (rolling rows)discard all but the previous row
O(n)
O(n)
// DEPTH
the full breakdown — requirements, capacity, evolution, trade-offs

The mental model: dependencies and fill order

Here is the rule that generates every correct grid DP solution, and you should burn it into your head:

Fill order follows dependency arrows.

If dp[i][j] depends on dp[i-1][j] and dp[i][j-1], those arrows point up and left. Fill the table top-to-bottom, left-to-right, and both dependencies are always ready when you need them.

This is obvious once stated, but people get it wrong constantly — they write the recurrence correctly and then loop in an order that reads an uncomputed cell. The bugs are silent and painful because the code looks right.

flowchart LR
    subgraph "Dependency arrows point here"
        direction LR
        TL["dp[i-1][j-1]"] -->|diagonal| C["dp[i][j]"]
        T["dp[i-1][j]"] -->|from above| C
        L["dp[i][j-1]"] -->|from left| C
    end

    style C fill:#7c5cff,color:#fff
    style T fill:#22d3ee,color:#0a0a0f
    style L fill:#22d3ee,color:#0a0a0f
    style TL fill:#22d3ee,color:#0a0a0f

Three flavors of neighbors appear in nearly every grid DP problem:

What dp[i][j] needsFill direction
dp[i-1][j] only (row above)top → bottom, any column order
dp[i][j-1] only (left neighbor)left → right, any row order
dp[i-1][j] and dp[i][j-1]top-to-bottom, left-to-right
dp[i-1][j-1] and dp[i-1][j] and dp[i][j-1]top-to-bottom, left-to-right

The diagonal dependency (dp[i-1][j-1]) appears in LCS and edit distance. It does not change the fill order — same sweep, you already have it by the time you need it.

Unique paths — the hello world

A robot starts at (0,0) in an m×n grid and wants to reach (m-1,n-1). It can only move right or down. Count the unique paths.

Recurrence:

dp[i][j] = dp[i-1][j] + dp[i][j-1]

"The number of ways to reach here is the number of ways to reach the cell above plus the number of ways to reach the cell to my left."

Base cases: The entire first row is all 1s (only one way to get there — go right forever). Same for the first column (go down forever).

block-beta
    columns 4
    A["1"] B["1"] C["1"] D["1"]
    E["1"] F["2"] G["3"] H["4"]
    I["1"] J["3"] K["6"] L["10"]
    M["1"] N["4"] O["10"] P["20"]

    style A fill:#7c5cff,color:#fff
    style P fill:#00ff9d,color:#0a0a0f
    style F fill:#22d3ee,color:#0a0a0f
    style K fill:#22d3ee,color:#0a0a0f

Each cell is the sum of its top and left neighbors. The answer for a 4×4 grid is 20 — read right out of dp[3][3].

def unique_paths(m: int, n: int) -> int:
    dp = [[1] * n for _ in range(m)]   # base cases: first row and col = 1

    for i in range(1, m):
        for j in range(1, n):
            dp[i][j] = dp[i-1][j] + dp[i][j-1]

    return dp[m-1][n-1]

O(m×n) time, O(m×n) space. We can do better on space — come back to that.

Variant: obstacles. If some cells are blocked (value 1 in a grid), the fix is trivial: if grid[i][j] == 1, set dp[i][j] = 0. Blocked cells contribute nothing. The rest of the code is unchanged.

Minimum path sum — same skeleton, different recurrence

Now each cell has a cost, and you want the minimum-cost path from top-left to bottom-right. Still right-or-down moves only.

dp[i][j] = grid[i][j] + min(dp[i-1][j], dp[i][j-1])

The cost to reach (i,j) is the cell's own cost plus the cheaper of the two ways to arrive. Everything else — fill order, base cases, space optimization — is identical to unique paths.

def min_path_sum(grid: list[list[int]]) -> int:
    m, n = len(grid), len(grid[0])
    dp = [[0] * n for _ in range(m)]
    dp[0][0] = grid[0][0]

    # First row: can only come from the left
    for j in range(1, n):
        dp[0][j] = dp[0][j-1] + grid[0][j]

    # First col: can only come from above
    for i in range(1, m):
        dp[i][0] = dp[i-1][0] + grid[i][0]

    for i in range(1, m):
        for j in range(1, n):
            dp[i][j] = grid[i][j] + min(dp[i-1][j], dp[i][j-1])

    return dp[m-1][n-1]

The moment you internalize that unique paths and min path sum are the same shape with different operators (+ vs min), you stop thinking of them as two separate problems. They are one problem. You just need to know which value to propagate.

LCS — grid DP on two strings

Longest Common Subsequence: given "ABCDE" and "ACE", the LCS is "ACE" — length 3. There is no grid in the input, but you build one conceptually.

Let rows represent characters of string A (0-indexed, with an empty-string prefix at row 0) and columns represent characters of string B. Cell dp[i][j] is the LCS length of A[:i] and B[:j].

Recurrence:

if A[i-1] == B[j-1]:
    dp[i][j] = dp[i-1][j-1] + 1      # this character matches — extend diagonal
else:
    dp[i][j] = max(dp[i-1][j], dp[i][j-1])   # skip one character from A or B
flowchart LR
    subgraph "LCS of 'ABCD' vs 'ACD'"
        direction TB
        H0[""] --> H1["A"] --> H2["C"] --> H3["D"]
        R0["∅"] --> R1["A"] --> R2["B"] --> R3["C"] --> R4["D"]
    end

The grid for "ABCD" vs "ACD" (rows=ABCD, cols=ACD):

     ""  A  C  D
""  [ 0, 0, 0, 0 ]
A   [ 0, 1, 1, 1 ]
B   [ 0, 1, 1, 1 ]
C   [ 0, 1, 2, 2 ]
D   [ 0, 1, 2, 3 ]

The answer is dp[4][3] = 3. The matching characters A, C, D each lit up the diagonal.

def longest_common_subsequence(a: str, b: str) -> int:
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i-1] == b[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1          # diagonal + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])  # best skip

    return dp[m][n]

A subtlety that trips people up: the indices are off by one because the DP table has a zero-row and zero-column for the empty-string prefix. a[i-1] is the i-th character (1-indexed in the table). Get this boundary right and the whole table clicks into place.

Edit distance — the hard one

Edit distance (Levenshtein) is the minimum number of insertions, deletions, and substitutions to turn string A into string B. Same conceptual grid as LCS, but the recurrence has three choices:

if A[i-1] == B[j-1]:
    dp[i][j] = dp[i-1][j-1]                 # characters match — free
else:
    dp[i][j] = 1 + min(
        dp[i-1][j],      # delete from A (look at row above)
        dp[i][j-1],      # insert into A (look at cell left)
        dp[i-1][j-1]     # substitute (look at diagonal)
    )

The dependency picture is richer — all three neighbors feed into dp[i][j]:

flowchart TD
    diag["dp[i-1][j-1]<br/>(substitute)"] --> cur["dp[i][j]"]
    above["dp[i-1][j]<br/>(delete)"] --> cur
    left["dp[i][j-1]<br/>(insert)"] --> cur

    style cur fill:#7c5cff,color:#fff
    style diag fill:#22d3ee,color:#0a0a0f
    style above fill:#22d3ee,color:#0a0a0f
    style left fill:#22d3ee,color:#0a0a0f

Fill order is still top-to-bottom, left-to-right. All three dependencies are already computed.

def edit_distance(a: str, b: str) -> int:
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    # Base cases: transforming from/to empty string
    for i in range(m + 1):
        dp[i][0] = i          # delete i characters
    for j in range(n + 1):
        dp[0][j] = j          # insert j characters

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i-1] == b[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = 1 + min(
                    dp[i-1][j],      # delete
                    dp[i][j-1],      # insert
                    dp[i-1][j-1]     # substitute
                )

    return dp[m][n]

For a = "horse", b = "ros", the answer is 3. Trace the table and you'll see the path: horse → rorse → rose → ros.

The base cases deserve a pause: dp[i][0] = i because converting A[:i] to an empty string requires i deletions. dp[0][j] = j because building B[:j] from nothing requires j insertions. Miss these and the table is garbage from the first real cell.

Space optimization: collapsing the table

All four problems above allocate an O(m×n) table. In practice you usually don't need all of it.

The key observation: dp[i][j] only depends on row i-1 (and the current row so far). You can keep two 1D arrays — prev and curr — and discard everything else.

def unique_paths_optimized(m: int, n: int) -> int:
    prev = [1] * n   # represents row 0

    for i in range(1, m):
        curr = [1] * n   # first column is always 1
        for j in range(1, n):
            curr[j] = prev[j] + curr[j-1]
        prev = curr

    return prev[n-1]

Space drops from O(m×n) to O(n). For a 1000×1000 grid that's the difference between a million integers and a thousand.

Edit distance and LCS have a diagonal dependency (dp[i-1][j-1]). When you overwrite a row in-place, you need to save prev[j-1] before overwriting it — store it in a scalar diag:

def lcs_optimized(a: str, b: str) -> int:
    n = len(b)
    dp = [0] * (n + 1)      # represents the previous row

    for ch_a in a:
        diag = 0             # tracks dp[i-1][j-1] before overwrite
        for j in range(1, n + 1):
            old = dp[j]      # save before overwrite (becomes diag for j+1)
            if ch_a == b[j-1]:
                dp[j] = diag + 1
            else:
                dp[j] = max(dp[j], dp[j-1])
            diag = old

    return dp[n]

Read that carefully — diag is the value dp[j] held before the current row's iteration touched it, which is exactly dp[i-1][j-1]. The single extra scalar replaces an entire row.

The patterns and their tells

You have seen four problems. Here is what they generalize to:

Counting paths on a grid (unique paths, number of ways with obstacles)

  • Tell: "how many ways to get from (0,0) to (m-1,n-1)" or "count paths subject to constraints"
  • Shape: dp[i][j] = sum of cells you came from
  • Default fill: top-to-bottom, left-to-right

Optimizing cost/value on a grid (min path sum, max gold, dungeon game)

  • Tell: cells have values, find min/max total along a path
  • Shape: dp[i][j] = cell_value + min/max of predecessors
  • Watch out: if you need future cells (dungeon game), fill bottom-right to top-left

Subsequence comparison (LCS, shortest common supersequence, distinct subsequences)

  • Tell: two sequences, find something about how they relate
  • Shape: m×n table where rows and columns are the two sequences
  • Match → diagonal; skip → left or above

Edit operations (edit distance, one-insert-away)

  • Tell: convert one string to another with minimum operations
  • Shape: same as LCS grid, three-way min
  • Base cases encode editing from/to empty string

When you see a problem that fits any of these shapes, draw the dependency arrows before you write a single line of code. The arrows tell you the recurrence, the fill order, and whether space optimization is possible. That thirty-second diagram is worth an hour of debugging.

Pitfalls people actually hit

Wrong fill order. You read dp[i-1][j] before you've filled row i-1. This happens when you loop columns in the outer loop and rows in the inner loop for a problem that needs left-above dependencies. Draw the arrows.

Off-by-one in the base cases. LCS and edit distance both need that zero-row and zero-column for the empty-string prefix. Forget them and cell (1,1) reads garbage. Add them first, always.

Initializing the diagonal wrong. In the space-optimized LCS, diag starts at 0 before each row (the empty-string prefix has zero LCS length with anything). It must reset at the start of each outer iteration.

Not handling blocked cells in path problems. If cell (i,j) is an obstacle, set dp[i][j] = 0 and don't let subsequent cells add from it. Usually this is automatic because you only sum non-blocked predecessors, but if you initialize the table cleverly you can accidentally bake in bad values.

Confusing "paths" and "costs." Min path sum wants the cell's cost in the recurrence. Unique paths does not — there are no weights, just counts. Keep these straight and the recurrences won't bleed into each other.

When NOT to use 2D DP

Grid DP is powerful but it is the wrong tool in a few situations:

When the grid has arbitrary edge weights and you need shortest paths. That's graph territory — run Dijkstra or Bellman-Ford. A grid with arbitrary weights is just a special-case graph, and BFS/Dijkstra exploits the structure better than a fixed fill-order DP. See BFS patterns for grid-specific BFS tricks like 0-1 BFS.

When you need to count or optimize across arbitrary subsets. If your state isn't reducible to two indices, you might be in knapsack patterns territory — bitmask DP or subset DP where the state space is exponential in something.

When you need the actual path, not just the cost. Grid DP gives you the optimal value. Recovering the actual path requires either backtracking through the table (keeping parent pointers) or re-running a trace from the filled table. Not wrong — just know you'll need an extra pass.

When the grid is huge and sparse. A 10⁶ × 10⁶ grid with 10⁴ non-empty cells is not a DP problem — it's a graph problem with implicit edges. Allocating a 10¹² cell table is not going to work out.

The cleaner signal for when 2D DP is right: your problem has two changing integer indices, the subproblem at (i,j) is fully determined by those two values alone, and every cell depends on a fixed small neighborhood. Check all three boxes and fill the table. The rest is arithmetic.


If the general DP machinery is still fuzzy, back up to dynamic programming for the four-step framework and 1D problems first — grid DP is 1D DP with a second dimension bolted on, not a new idea. Once you're comfortable here, the natural next step is knapsack patterns, where the second dimension is a capacity or a budget instead of a grid coordinate.

// FAQ

Frequently asked questions

What makes a problem a "grid DP" problem?

A problem is grid DP when you can express the answer in terms of a 2D table where each cell dp[i][j] depends only on a fixed set of neighboring cells — typically dp[i-1][j], dp[i][j-1], and sometimes dp[i-1][j-1]. The grid does not have to be an actual matrix in the input; LCS and edit distance both use a conceptual grid whose rows and columns are the characters of two strings. The tell is that your state needs exactly two integer indices to identify it.

What order should I fill the DP table in a grid problem?

Fill in the order that guarantees every cell you depend on is already computed when you need it. If dp[i][j] depends on dp[i-1][j] (cell above) and dp[i][j-1] (cell to the left), fill row by row, left to right — both dependencies are satisfied. If you need dp[i+1][j] (cell below), fill bottom-up. The rule is simple: draw the dependency arrows, then fill in the direction those arrows point away from.

How do I space-optimize a 2D DP table?

If dp[i][j] only depends on row i-1 (and optionally the previous value in the same row), you only need to store two rows at once — the previous row and the row you are building. Overwrite in-place from left to right and prev[j] still holds the old value before you overwrite it. Some problems can even collapse to a single row with a scalar to track the diagonal.

Is edit distance really a grid DP problem?

Yes. Imagine a grid where rows represent characters of string A (plus an empty-string row at index 0) and columns represent characters of string B (plus an empty-string column at index 0). dp[i][j] is the edit distance between A[:i] and B[:j]. It depends on dp[i-1][j] (delete from A), dp[i][j-1] (insert into A), and dp[i-1][j-1] (replace or match). The fill order is the standard top-left to bottom-right sweep.

What is the difference between LCS and edit distance, and when do I use each?

LCS (Longest Common Subsequence) finds the longest sequence of characters that appear in both strings in the same order, without caring about gaps or substitutions. Edit distance counts the minimum number of insertions, deletions, and substitutions to turn one string into another. LCS is used for diff tools, plagiarism detection, and DNA alignment. Edit distance is used for spell checking, fuzzy search, and autocorrect. Both use nearly identical 2D DP grids — the recurrences just differ in the match/mismatch case.

// RELATED

You may also like