Longest Common Subsequence (LCS) Algorithm
Overview
The Longest Common Subsequence (LCS) problem is a classic problem in computer science. Given two sequences, the goal is to find the length of their longest subsequence that appears in both of them in the same order (but not necessarily contiguously).
Algorithm
The LCS problem is typically solved using Dynamic Programming. Let strings X
and Y
have lengths m
and n
, respectively. We define a 2D array dp
where dp[i][j]
stores the length of LCS of X[0..i-1]
and Y[0..j-1]
.
if X[i-1] == Y[j-1]: dp[i][j] = dp[i-1][j-1] + 1 else: dp[i][j] = max(dp[i-1][j], dp[i][j-1])
The value at dp[m][n]
gives the length of the LCS.
Time Complexity
O(m * n), where m and n are lengths of the input strings.
Space Complexity
O(m * n), or O(n) with space optimization using rolling arrays.
Easy Problems
1. Length of LCS
Given two strings, return the length of their LCS.
#include <iostream>
#include <vector>
#include <string>
using namespace std;
int lcsLength(string X, string Y) {
int m = X.size(), n = Y.size();
vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));
for (int i = 1; i <= m; ++i) {
for (int j = 1; j <= n; ++j) {
if (X[i - 1] == Y[j - 1])
dp[i][j] = 1 + dp[i - 1][j - 1];
else
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
}
}
return dp[m][n];
}
2. Check if one string is a subsequence of another
Using LCS to determine if a string is a subsequence of another.
bool isSubsequence(string s, string t) {
return lcsLength(s, t) == s.length();
}
Intermediate Problems
1. Print the LCS
Instead of just the length, return the actual LCS string.
string printLCS(string X, string Y) {
int m = X.length(), n = Y.length();
vector<vector<int>> dp(m+1, vector<int>(n+1, 0));
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (X[i-1] == Y[j-1])
dp[i][j] = dp[i-1][j-1] + 1;
else
dp[i][j] = max(dp[i-1][j], dp[i][j-1]);
}
}
string lcs;
int i = m, j = n;
while (i > 0 && j > 0) {
if (X[i-1] == Y[j-1]) {
lcs = X[i-1] + lcs;
i--; j--;
} else if (dp[i-1][j] > dp[i][j-1])
i--;
else
j--;
}
return lcs;
}
2. Minimum number of deletions and insertions to transform one string into another
pair<int, int> minDelInsert(string X, string Y) {
int lcs = lcsLength(X, Y);
int deletions = X.size() - lcs;
int insertions = Y.size() - lcs;
return {deletions, insertions};
}
Hard Problem
Shortest Common Supersequence
Find the shortest string that has both strings as subsequences.
string shortestCommonSupersequence(string X, string Y) {
int m = X.length(), n = Y.length();
vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (X[i-1] == Y[j-1])
dp[i][j] = dp[i-1][j-1] + 1;
else
dp[i][j] = max(dp[i-1][j], dp[i][j-1]);
}
}
string result;
int i = m, j = n;
while (i > 0 && j > 0) {
if (X[i-1] == Y[j-1]) {
result = X[i-1] + result;
i--; j--;
} else if (dp[i-1][j] > dp[i][j-1]) {
result = X[i-1] + result;
i--;
} else {
result = Y[j-1] + result;
j--;
}
}
while (i > 0) {
result = X[i-1] + result;
i--;
}
while (j > 0) {
result = Y[j-1] + result;
j--;
}
return result;
}
Conclusion
The LCS algorithm is a powerful tool for comparing sequences, applicable in diff tools, DNA analysis, and file comparison utilities. It scales well with optimizations and is foundational in dynamic programming.