How do I use rolling hash and binary search to find the longest common sub-string?
-
I understand how a rolling hash is used in Rabin-Karp and how it can be used to find a common substring of a given length. How do I use binary search to determine this length? I tried reading:
http://www.infoarena.ro/blog/rolling-hash
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=stringSearching

EDIT: I am not sure, but this is what I have understood: I take low as 1 and high as the length of the shorter string. I calculate mid and hash every substring of that length. If I find a common substring, I recurse on mid+1 and high. Otherwise, I recurse on low and mid-1. Is this right?
-
Answer:
The main idea behind using binary search in this problem is monotonicity: if the two strings have a common substring of some length n, then they also have a common substring of every length less than n. So finding the longest common substring involves the following steps:

    hash1[] = hash of string 1
    hash2[] = hash of string 2
    lo = 0
    hi = length of shorter string + 1
    while (hi - lo > 1):
        mid = (lo + hi) / 2
        if (there is a common substring of length mid):
            lo = mid
        else:
            hi = mid
    answer = lo

A naive check for a common substring of a given length runs in O(n^2), but with hashing you can do it in O(n log n): store the hash of each substring of that length from one string in an STL set, then compute the hash of each substring of the same length from the other string and look it up in the set. Combined with the O(log n) steps of the binary search, the overall complexity is O(n log^2 n). You can try this problem http://www.codechef.com/problems/SSTORY and check some accepted solutions if you have trouble implementing it.
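The steps above can be sketched in C++ roughly as follows. This is an illustrative sketch, not code from the answer: the base 131, the modulus 1000000007 and all function names are assumptions, and the arrays hash1 and hash2 are interpreted here as prefix hashes so that the hash of any window can be read off in O(1). Only hashes are compared, so a collision could in principle report a match that is not there.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

constexpr std::uint64_t kMod = 1000000007ULL;  // assumed prime modulus
constexpr std::uint64_t kBase = 131ULL;        // assumed base

// prefix[i] is the hash of s[0..i), so the result has s.size() + 1 entries.
std::vector<std::uint64_t> prefix_hashes(const std::string& s) {
    std::vector<std::uint64_t> prefix(s.size() + 1, 0);
    for (std::size_t i = 0; i < s.size(); ++i)
        prefix[i + 1] = (prefix[i] * kBase + static_cast<unsigned char>(s[i])) % kMod;
    return prefix;
}

// Hash of the window s[i..i+len), where pw[len] = kBase^len mod kMod.
std::uint64_t window_hash(const std::vector<std::uint64_t>& prefix,
                          const std::vector<std::uint64_t>& pw,
                          std::size_t i, std::size_t len) {
    assert(i + len < prefix.size());
    return (prefix[i + len] + kMod - prefix[i] * pw[len] % kMod) % kMod;
}

std::size_t longest_common_substring_length(const std::string& s1, const std::string& s2) {
    const auto hash1 = prefix_hashes(s1);
    const auto hash2 = prefix_hashes(s2);
    const std::size_t shorter = std::min(s1.size(), s2.size());
    std::vector<std::uint64_t> pw(shorter + 1, 1);
    for (std::size_t i = 1; i <= shorter; ++i) pw[i] = pw[i - 1] * kBase % kMod;

    // True if some window of length len in s1 has the same hash as one in s2.
    auto has_common = [&](std::size_t len) {
        std::set<std::uint64_t> seen;
        for (std::size_t i = 0; i + len <= s1.size(); ++i)
            seen.insert(window_hash(hash1, pw, i, len));
        for (std::size_t i = 0; i + len <= s2.size(); ++i)
            if (seen.count(window_hash(hash2, pw, i, len))) return true;
        return false;
    };

    std::size_t lo = 0, hi = shorter + 1;  // length lo always works, length hi never does
    while (hi - lo > 1) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (has_common(mid)) lo = mid; else hi = mid;
    }
    return lo;
}
```

For example, longest_common_substring_length("xabcdy", "abcd") returns 4. Each check costs O(n log n) because of the std::set, which gives the O(n log^2 n) total mentioned above.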
Gautam Singh at Quora
Other answers
First observe that the length of the longest common substring is at most the minimum of the lengths of the two input strings and at least zero (when the strings have no characters in common).

    left = 0
    right = min(len(s1), len(s2))

Since you have an algorithm to check whether there is a common substring of length k (call the algorithm kcs), you can call it O(log n) times to find the maximum k:

    while (left < right) {
        k = (left + right + 1) / 2
        if (kcs(s1, s2, k) == true) {
            left = k
        } else {
            right = k - 1
        }
    }

When the loop ends, left is the length of the longest common substring.
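To show the binary search on its own, here is a small C++ sketch that takes the check as a parameter. The names longest_common_length, Check and naive_kcs are made up for illustration; naive_kcs is a brute-force stand-in for kcs (no hashing), used only so the search can be run without worrying about collisions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

using Check = std::function<bool(const std::string&, const std::string&, std::size_t)>;

// Largest k in [0, min(|s1|, |s2|)] for which kcs(s1, s2, k) is true, assuming
// kcs is monotone: true for some k implies true for every smaller k.
std::size_t longest_common_length(const std::string& s1, const std::string& s2,
                                  const Check& kcs) {
    assert(static_cast<bool>(kcs));  // the check must be callable
    std::size_t left = 0;            // length 0 always works
    std::size_t right = std::min(s1.size(), s2.size());
    while (left < right) {
        // Round up so that k > left and the interval shrinks on every step.
        const std::size_t k = left + (right - left + 1) / 2;
        if (kcs(s1, s2, k)) left = k;
        else right = k - 1;
    }
    return left;
}

// Brute-force check: does any k-length substring of s1 occur in s2?
bool naive_kcs(const std::string& s1, const std::string& s2, std::size_t k) {
    for (std::size_t i = 0; i + k <= s1.size(); ++i)
        if (s2.find(s1.substr(i, k)) != std::string::npos) return true;
    return false;
}
```

Calling longest_common_length("banana", "ananas", naive_kcs) returns 5 (for "anana"). Swapping naive_kcs for a rolling-hash check changes only the cost of each probe, not the search itself.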
Saad Taame
The longest common substring (LCS) of two input strings is a common substring (in both of them) of maximum length. We can relax the constraints to generalize the problem: find a common substring of length k. We can then use binary search to find the maximum k. This takes O(n log n) time provided that solving the relaxed problem takes linear time. Finding a k-length common substring can be solved using a rolling hash:

1. Compute hash values of all k-length substrings of the first string and of the second string.
2. If a hash from the first string coincides with a hash from the second, then (barring a hash collision) we've found a k-length common substring.

Step 1 uses a rolling hash to achieve linear time. In order to implement step 2, use a hash table: add all the hashes of the k-length substrings of the first string to the table, then look up the hash of each k-length substring of the second string in the table. This takes expected linear time for a large enough hash table.
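A possible C++ sketch of steps 1 and 2 with a hash table, wrapped in the binary search, is given below. It is not taken from the answer: the base, the modulus and the function names are assumptions. As an addition to the two steps, each hash match is confirmed by comparing the actual characters, so a collision cannot produce a wrong result (at the cost of some extra comparisons).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

constexpr std::uint64_t kMod = 1000000007ULL;  // assumed prime modulus
constexpr std::uint64_t kBase = 131ULL;        // assumed base

// Step 1: hash of every k-length window of s, left to right, in O(|s|) total.
std::vector<std::uint64_t> window_hashes(const std::string& s, std::size_t k) {
    assert(k > 0);
    std::vector<std::uint64_t> out;
    if (k > s.size()) return out;
    std::uint64_t high = 1;  // kBase^(k-1) mod kMod, weight of the outgoing character
    for (std::size_t i = 1; i < k; ++i) high = high * kBase % kMod;
    std::uint64_t h = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (i >= k) {  // slide: remove the character that leaves the window
            const std::uint64_t gone = static_cast<unsigned char>(s[i - k]);
            h = (h + kMod - gone * high % kMod) % kMod;
        }
        h = (h * kBase + static_cast<unsigned char>(s[i])) % kMod;
        if (i + 1 >= k) out.push_back(h);
    }
    return out;
}

// Step 2: start index in a of a k-length substring that also occurs in b, or npos.
std::size_t find_common(const std::string& a, const std::string& b, std::size_t k) {
    std::unordered_map<std::uint64_t, std::vector<std::size_t>> table;
    const auto ha = window_hashes(a, k);
    for (std::size_t i = 0; i < ha.size(); ++i) table[ha[i]].push_back(i);
    const auto hb = window_hashes(b, k);
    for (std::size_t j = 0; j < hb.size(); ++j) {
        const auto it = table.find(hb[j]);
        if (it == table.end()) continue;
        for (std::size_t i : it->second)
            if (a.compare(i, k, b, j, k) == 0) return i;  // confirm the match
    }
    return std::string::npos;
}

// Binary search over k; returns one longest common substring.
std::string longest_common_substring(const std::string& a, const std::string& b) {
    std::size_t lo = 0, hi = std::min(a.size(), b.size()), best_pos = 0;
    while (lo < hi) {
        const std::size_t k = lo + (hi - lo + 1) / 2;
        const std::size_t pos = find_common(a, b, k);
        if (pos != std::string::npos) { lo = k; best_pos = pos; }
        else hi = k - 1;
    }
    return a.substr(best_pos, lo);
}
```

For instance, longest_common_substring("abcdef", "zzcdezz") returns "cde".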
Dharmendra Singh
There are some good answers in this thread already; here is one more. Let us assume that you know about rolling hashes. If you don't, check the answer of Pawan Bhadauria in this thread.

To apply binary search (BS) to a problem, we need a predicate p that is monotone over the search space S. Here p(x) means "the strings have a common substring of length x", and p(x) implies p(y) for all y < x. The important point that allows us to use BS in this problem is that if the given strings have a common substring of length n, they also have at least one common substring of any length m < n. And if the two strings do not have a common substring of length n, they do not have a common substring of any length m > n. So x must be in the range [0, min(|s1|, |s2|)] (both inclusive). For each x we can check whether a common substring of length x exists. If one exists, we don't have to check lengths less than x; if none exists, we don't have to check lengths greater than x.

Pseudocode for the BS:

    l = 0, r = min(s1.length(), s2.length())
    while (l <= r) {
        mid = l + (r - l) / 2
        if (p(s1, s2, mid))   // p(s1, s2, len) checks for a common substring of length len
            l = mid + 1
        else
            r = mid - 1
    }
    return l - 1

Now, to check whether a common substring of length x exists, we use a rolling hash. For s1, we compute the hashes of all substrings of length x and store them in a hash table (I have used an STL map for this purpose; you can use an STL set too, or any data structure of your choice). Next, for s2, we compute the hashes of all substrings of length x and compare each one with the values in the table. If a match is found we return true; otherwise we return false.

Pseudocode for the rolling hash:

    p(s1, s2, x) {
        // s1: string 1, s2: string 2, x: length of substring
        // To compute the hash I'm using a prime B as the base.
        // Calculate hash(0) = hash(s1[0...x-1])
        //                   = s1[0] + s1[1]*B + ... + s1[x-1]*pow(B, x-1)
        // To calculate all the other hashes use this equation:
        //   hash(i) = hash(s1[i...i+x-1])
        //           = (hash(i-1) - s1[i-1]) / B + s1[i+x-1]*pow(B, x-1)
        // Now store all the hashes in the hash table.
        // Similarly calculate all the hashes for string 2 and compare them
        // with the existing hashes.
        if (match found)
            return true
        else
            return false
    }

The overall complexity will be O(log(n) * n log(n)) = O(n log^2 n). Thanks Rushal for the A2A.
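The update equation in the pseudocode can be checked on small inputs with the C++ sketch below. It is only an illustration under stated assumptions: the function names are made up, and no modulus is applied, so the values must fit in 64 bits (small x and B). The division by B is exact here because, after subtracting s1[i-1], every remaining term is a multiple of B.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Direct evaluation of s[i]*B^0 + s[i+1]*B^1 + ... + s[i+x-1]*B^(x-1).
std::uint64_t direct_hash(const std::string& s, std::size_t i, std::size_t x,
                          std::uint64_t B) {
    std::uint64_t h = 0, power = 1;
    for (std::size_t j = 0; j < x; ++j) {
        h += static_cast<unsigned char>(s[i + j]) * power;
        power *= B;
    }
    return h;
}

// The same hashes computed with the rolling update
//   hash(i) = (hash(i-1) - s[i-1]) / B + s[i+x-1] * B^(x-1).
// Exact integer arithmetic only: valid while nothing overflows 64 bits.
std::vector<std::uint64_t> rolling_hashes_exact(const std::string& s, std::size_t x,
                                                std::uint64_t B) {
    assert(B > 1);
    std::vector<std::uint64_t> out;
    if (x == 0 || x > s.size()) return out;
    std::uint64_t top = 1;  // B^(x-1), weight of the character entering the window
    for (std::size_t j = 1; j < x; ++j) top *= B;
    std::uint64_t h = direct_hash(s, 0, x, B);
    out.push_back(h);
    for (std::size_t i = 1; i + x <= s.size(); ++i) {
        h = (h - static_cast<unsigned char>(s[i - 1])) / B
            + static_cast<unsigned char>(s[i + x - 1]) * top;
        out.push_back(h);
    }
    return out;
}
```

With s = "abcd", x = 2 and B = 31 this yields 3135, 3167, 3199: hash(0) = 97 + 98*31 = 3135, and hash(1) = (3135 - 97)/31 + 99*31 = 98 + 3069 = 3167. Once the hash is reduced modulo a prime, plain division no longer works; the division has to become a multiplication by the modular inverse of B, or the polynomial can be written with the highest power on the oldest character so that no division is needed.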
Tirth Bal