Longest Common Subsequence (LCS) Algorithm

Longest Common Subsequence (LCS) Algorithm

Overview

The Longest Common Subsequence (LCS) problem is a classic problem in computer science. Given two sequences, the goal is to find the length of their longest subsequence that appears in both of them in the same order (but not necessarily contiguously).

Algorithm

The LCS problem is typically solved using Dynamic Programming. Let strings X and Y have lengths m and n, respectively. We define a 2D array dp where dp[i][j] stores the length of LCS of X[0..i-1] and Y[0..j-1].

if X[i-1] == Y[j-1]:
    dp[i][j] = dp[i-1][j-1] + 1
else:
    dp[i][j] = max(dp[i-1][j], dp[i][j-1])

The value at dp[m][n] gives the length of the LCS.

Time Complexity

O(m * n), where m and n are lengths of the input strings.

Space Complexity

O(m * n), or O(n) with space optimization using rolling arrays.

Easy Problems

1. Length of LCS

Given two strings, return the length of their LCS.


#include <iostream>
#include <vector>
#include <string>

using namespace std;

int lcsLength(string X, string Y) {
    int m = X.size(), n = Y.size();
    vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));
    for (int i = 1; i <= m; ++i) {
        for (int j = 1; j <= n; ++j) {
            if (X[i - 1] == Y[j - 1])
                dp[i][j] = 1 + dp[i - 1][j - 1];
            else
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
        }
    }
    return dp[m][n];
}

2. Check if one string is a subsequence of another

Using LCS to determine if a string is a subsequence of another.


bool isSubsequence(string s, string t) {
    return lcsLength(s, t) == s.length();
}

Intermediate Problems

1. Print the LCS

Instead of just the length, return the actual LCS string.


string printLCS(string X, string Y) {
    int m = X.length(), n = Y.length();
    vector<vector<int>> dp(m+1, vector<int>(n+1, 0));

    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (X[i-1] == Y[j-1])
                dp[i][j] = dp[i-1][j-1] + 1;
            else
                dp[i][j] = max(dp[i-1][j], dp[i][j-1]);
        }
    }

    string lcs;
    int i = m, j = n;
    while (i > 0 && j > 0) {
        if (X[i-1] == Y[j-1]) {
            lcs = X[i-1] + lcs;
            i--; j--;
        } else if (dp[i-1][j] > dp[i][j-1])
            i--;
        else
            j--;
    }
    return lcs;
}

2. Minimum number of deletions and insertions to transform one string into another


pair<int, int> minDelInsert(string X, string Y) {
    int lcs = lcsLength(X, Y);
    int deletions = X.size() - lcs;
    int insertions = Y.size() - lcs;
    return {deletions, insertions};
}

Hard Problem

Shortest Common Supersequence

Find the shortest string that has both strings as subsequences.


string shortestCommonSupersequence(string X, string Y) {
    int m = X.length(), n = Y.length();
    vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));

    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (X[i-1] == Y[j-1])
                dp[i][j] = dp[i-1][j-1] + 1;
            else
                dp[i][j] = max(dp[i-1][j], dp[i][j-1]);
        }
    }

    string result;
    int i = m, j = n;
    while (i > 0 && j > 0) {
        if (X[i-1] == Y[j-1]) {
            result = X[i-1] + result;
            i--; j--;
        } else if (dp[i-1][j] > dp[i][j-1]) {
            result = X[i-1] + result;
            i--;
        } else {
            result = Y[j-1] + result;
            j--;
        }
    }

    while (i > 0) {
        result = X[i-1] + result;
        i--;
    }
    while (j > 0) {
        result = Y[j-1] + result;
        j--;
    }

    return result;
}

Conclusion

The LCS algorithm is a powerful tool for comparing sequences, applicable in diff tools, DNA analysis, and file comparison utilities. It scales well with optimizations and is foundational in dynamic programming.