Longest Common Substring

Geeks


Description:

Given two strings ‘X’ and ‘Y’, find the length of the longest common substring.

Example:

Input : X = "GeeksforGeeks", y = "GeeksQuiz"
Output : 5
The longest common substring is "Geeks" and is of
length 5.

Input : X = "abcdxyz", y = "xyzabcd"
Output : 4
The longest common substring is "abcd" and is of
length 4.

Input : X = "zxabcdezy", y = "yzabcdezx"
Output : 6
The longest common substring is "abcdez" and is of
length 6.

Idea:

A simple solution is to one by one consider all substrings of first string and for every substring check if it is a substring in second string. Keep track of the maximum length substring. There will be O(m^2) substrings and we can find whether a string is subsring on another string in O(n) time (See this). So overall time complexity of this method would be O(n * m2)

Dynamic Programming can be used to find the longest common substring in O(m*n) time. The idea is to find length of the longest common suffix for all substrings of both strings and store these lengths in a table.

The longest common suffix has following optimal substructure property
   LCSuff(X, Y, m, n) = LCSuff(X, Y, m-1, n-1) + 1 if X[m-1] = Y[n-1]
                        0  Otherwise (if X[m-1] != Y[n-1])

The maximum length Longest Common Suffix is the longest common substring.
   LCSubStr(X, Y, m, n)  = Max(LCSuff(X, Y, i, j)) where 1 <= i <= m
                                                     and 1 <= j <= n

如果要输出LCS本身而不是长度,则在获得maxLen的时候记录index。

time O(m n), space O(m n)

Code:

#include <vector>
#include <iostream>
#include <string>

using namespace std;

int LCS(string s1, string s2){
    if(s1.empty() || s2.empty()){
        return 0;
    }
    int sizeS1 = s1.size();
    int sizeS2 = s2.size();

    vector<vector<int>> lenLCS(sizeS1+1, vector<int>(sizeS2+1, 0));
    int maxLen = 0;

    for(int i=0; i<sizeS1; i++){
        for(int j=0; j<sizeS2; j++){
            if(s1[i]==s2[j]){
                lenLCS[i+1][j+1] = lenLCS[i][j] + 1;
                maxLen = max(maxLen, lenLCS[i+1][j+1]);
            }
        }
    }
    return maxLen;
}

string LCSReturnString(string s1, string s2){
    if(s1.empty() || s2.empty()){
        return "";
    }
    int sizeS1 = s1.size();
    int sizeS2 = s2.size();

    vector<vector<int>> lenLCS(sizeS1+1, vector<int>(sizeS2+1, 0));
    int maxLen = 0;
    string result;
    int leftIndex=0;
    int rightIndex=0;

    for(int i=0; i<sizeS1; i++){
        for(int j=0; j<sizeS2; j++){
            if(s1[i]==s2[j]){
                lenLCS[i+1][j+1] = lenLCS[i][j] + 1;
                if(lenLCS[i+1][j+1]>maxLen){
                    maxLen = lenLCS[i+1][j+1];
                    rightIndex=i;
                    leftIndex=i-maxLen+1;
                }
            }
        }
    }
    return s1.substr(leftIndex, rightIndex-leftIndex+1);
}

int main(){

    string s1("GeeksforGeeks");
    string s2("Geeks2");

    s1 = "OldSite:GeeksforGeeks.org";
    s2 = "NewSite:GeeksQuiz.com";

    cout<<LCS(s1, s2)<<endl;

    cout<<LCSReturnString(s1, s2)<<endl;
}

results matching ""

    No results matching ""