72. Edit DistanceMedium

Intuition 1. Recursion 2. Top-Down 3. Bottom-Up 4. Space Opt Visualizer Related

Edit Distance

MediumStringDynamic Programming·GoogleAmazonMicrosoftMeta·⚡ Very Frequent

Given two strings word1 and word2, return the minimum number of operations to convert word1 into word2. You have three operations available, each costing 1:

Insert a character at any position
Delete a character at any position
Replace a character with another

This is also called the Levenshtein distance — it's everywhere: spell checkers, diff tools, DNA alignment, autocomplete.

0 ≤ word1.length, word2.length ≤ 100
Both strings consist of lowercase English letters only
Either string can be empty (base case matters!)

Examples

Example 1

Inputword1 = "horse", word2 = "ros"

Output3

horse → rorse (replace h→r) → rose (delete r) → ros (delete e)

Example 2

Inputword1 = "monkeys", word2 = "money"

Output2

monkeys → monkey (delete s) → money (delete k)

Example 3 — empty strings

Inputword1 = "", word2 = "abc"

Output3

Must insert all 3 characters. Empty → any string = length of that string.

How to think about it

Stand at the beginning of both strings, looking at word1[i] and word2[j]. You have two situations:

Characters match: great, nothing to do here. Move both pointers forward and solve the rest: dfs(i+1, j+1).
Characters differ: you must spend one operation. Which one? Try all three and take the cheapest:
- Delete word1[i] → now solving dfs(i+1, j)
- Insert word2[j] into word1 → now solving dfs(i, j+1)
- Replace word1[i] with word2[j] → now solving dfs(i+1, j+1)

Why this is DP: the subproblem "cost to convert word1[i:] → word2[j:]" is called thousands of times with the same (i, j). Every unique pair is an independent subproblem with a fixed answer. That's exactly what memoization is for.

At each mismatch, try all three operations and pick the cheapest. A match costs nothing.

1. Recursion — O(3^(m+n))

The naive recursive solution explores every possible sequence of operations. No caching, so the same subproblems get recomputed exponentially. Fine for tiny strings, useless in practice.

If i == m: word1 is gone — insert all remaining word2 chars. Return n − j.
If j == n: word2 is done — delete all remaining word1 chars. Return m − i.
If word1[i] == word2[j]: free move, return dfs(i+1, j+1).
Otherwise: 1 + min(dfs(i+1,j), dfs(i,j+1), dfs(i+1,j+1)).

recursion.py

def minDistance(word1: str, word2: str) -> int:
    m, n = len(word1), len(word2)

    def dfs(i, j):
        if i == m: return n - j   # insert rest of word2
        if j == n: return m - i   # delete rest of word1
        if word1[i] == word2[j]:
            return dfs(i + 1, j + 1)
        return 1 + min(
            dfs(i + 1, j),        # delete word1[i]
            dfs(i, j + 1),        # insert word2[j]
            dfs(i + 1, j + 1),    # replace word1[i]
        )

    return dfs(0, 0)

2. Top-Down DP (Memoization) — O(m·n)

Same recursion, but cache every (i, j) result the first time you compute it. Since there are only m × n unique pairs, every cell is computed once. Runtime drops from exponential to polynomial.

Cache key is (i, j) — not (i, j, some_operation). The subproblem "min cost to convert word1[i:] into word2[j:]" has one fixed answer regardless of how you got to position (i, j).

top_down.py

def minDistance(word1: str, word2: str) -> int:
    m, n = len(word1), len(word2)
    dp = {}

    def dfs(i, j):
        if i == m: return n - j
        if j == n: return m - i
        if (i, j) in dp: return dp[(i, j)]

        if word1[i] == word2[j]:
            dp[(i, j)] = dfs(i + 1, j + 1)
        else:
            dp[(i, j)] = 1 + min(
                dfs(i + 1, j),
                dfs(i, j + 1),
                dfs(i + 1, j + 1),
            )
        return dp[(i, j)]

    return dfs(0, 0)

3. Bottom-Up DP — O(m·n)

Flip the recursion upside down. Fill a (m+1) × (n+1) table bottom-right to top-left — every cell you need is already filled by the time you reach it.

Think of it as filling a crossword: the bottom row and rightmost column are the base cases (pure insertions or deletions). Everything else follows from those.

Create dp[m+1][n+1]. Set dp[m][j] = n − j and dp[i][n] = m − i.
Fill from i = m−1 down to 0, and j = n−1 down to 0.
Match: dp[i][j] = dp[i+1][j+1]. Mismatch: dp[i][j] = 1 + min(dp[i+1][j], dp[i][j+1], dp[i+1][j+1]).
Return dp[0][0].

bottom_up.py

def minDistance(word1: str, word2: str) -> int:
    dp = [[float("inf")] * (len(word2) + 1)
          for _ in range(len(word1) + 1)]

    for j in range(len(word2) + 1):
        dp[len(word1)][j] = len(word2) - j   # insert remaining
    for i in range(len(word1) + 1):
        dp[i][len(word2)] = len(word1) - i   # delete remaining

    for i in range(len(word1) - 1, -1, -1):
        for j in range(len(word2) - 1, -1, -1):
            if word1[i] == word2[j]:
                dp[i][j] = dp[i + 1][j + 1]
            else:
                dp[i][j] = 1 + min(
                    dp[i + 1][j],      # delete
                    dp[i][j + 1],      # insert
                    dp[i + 1][j + 1],  # replace
                )
    return dp[0][0]

Reading the DP table

For word1 = "horse", word2 = "ros":

	j=0 "r"	j=1 "o"	j=2 "s"	j=3 end
i=0 "h"	3	3	3	5
i=1 "o"	3	2	2	4
i=2 "r"	2	2	2	3
i=3 "s"	2	2	1	2
i=4 "e"	3	3	2	1
i=5 end	3	2	1	0

dp[0][0] = 3 is the answer. The base row and column are the scaffolding everything else builds on.

4. Space-Optimized — O(min(m, n))

Each row of the DP table only depends on the row below it. So you don't need the full 2D table — just two 1D arrays: the "current" row and the "next" row you're reading from. Go further: do it in-place with one array and one extra variable tracking the diagonal.

Swap the strings first so the array size is based on the shorter word. Then sweep right-to-left updating in place, carrying the diagonal dp[i+1][j+1]in a temp variable before overwriting it.

space_optimized.py

def minDistance(word1: str, word2: str) -> int:
    m, n = len(word1), len(word2)
    if m < n:
        m, n = n, m
        word1, word2 = word2, word1   # n is shorter

    dp = [n - i for i in range(n + 1)]   # base: row m

    for i in range(m - 1, -1, -1):
        nextDp = dp[n]                    # save dp[i+1][n]
        dp[n] = m - i                     # base: col n
        for j in range(n - 1, -1, -1):
            temp = dp[j]                  # save dp[i+1][j] (diagonal for next j)
            if word1[i] == word2[j]:
                dp[j] = nextDp
            else:
                dp[j] = 1 + min(dp[j], dp[j + 1], nextDp)
            nextDp = temp                 # shift diagonal left
    return dp[0]

Complexity Comparison

Approach	Time	Space	Notes
Recursion	O(3^(m+n))	O(m+n)	Stack depth only. Runs fine for strings under ~10 chars
Top-Down DP	O(m·n)	O(m·n)	Cache fills the same (m+1)×(n+1) table as bottom-up
Bottom-Up DP	O(m·n)	O(m·n)	No recursion overhead. Easier to visualize
Space Optimized	O(m·n)	O(min(m,n))	Single 1D array + one temp variable. Best for large inputs

See it run

Approach:

DP Table (dp[i][j] = min edits for word1[i:] → word2[j:])

READY—

Press ▶ or step forward to begin.

i—

j—

dp[0][0]—

—

word1

word2

Space Play/Pause·←→ Step·R Reset·F Fullscreen

Executing Code

def minDistance(word1, word2):

    dp = [[float("inf")] * (len(word2)+1) for _ in range(len(word1)+1)]

    for j in range(len(word2)+1): dp[len(word1)][j] = len(word2) - j  # insert rest

    for i in range(len(word1)+1): dp[i][len(word2)] = len(word1) - i  # delete rest

    for i in range(len(word1)-1, -1, -1):

        for j in range(len(word2)-1, -1, -1):

            if word1[i] == word2[j]:

                dp[i][j] = dp[i+1][j+1]

            else:

                dp[i][j] = 1 + min(dp[i+1][j], dp[i][j+1], dp[i+1][j+1])

    return dp[0][0]

Where people go wrong

Confusing insert and delete

When you "insert" into word1, you're consuming word2[j] — so advance j. When you "delete" from word1, you're consuming word1[i] — so advance i. Getting these backwards is the most common source of wrong answers.

Forgetting to add 1 for the operation

When characters don't match, the result is 1 + min(...) — that 1 is the current operation you're performing right now. It's easy to write min(...) and forget the +1, which makes every answer off by some amount.

Wrong base case

When word1 is exhausted (i == m), you still need to insert all remainingword2 characters: return n - j, not 0. Same reasoning for whenword2 is exhausted: return m - i. Return 0 by mistake and every empty-string test case fails.

Quick tests before submitting

"", "abc" → 3 (insert 3)
"abc", "" → 3 (delete 3)
"abc", "abc" → 0 (already equal)
"a", "b" → 1 (one replace)
"ab", "ba" → 2 (not 1 — you can't swap, only insert/delete/replace)

What to do next

Edit Distance is one of the most reused DP patterns. These all share the same 2-pointer / 2D-DP skeleton.

Discussion

…

Loading comments…

LeetMotion