International Journal of Computer Applications |
Foundation of Computer Science (FCS), NY, USA |
Volume 81 - Number 2 |
Year of Publication: 2013 |
Authors: Vidya Saikrishna, Sid Ray |
10.5120/13986-1996 |
Vidya Saikrishna, Sid Ray . Improved Approximate Multiple-Pattern String Matching using Consecutive N-Grams. International Journal of Computer Applications. 81, 2 ( November 2013), 26-31. DOI=10.5120/13986-1996
String matching is to find all the occurrences of a given pattern in a large text, the strings being sequence of characters drawn from finite alphabet set. Multiple-Pattern string matching problem involves detection of all the patterns of the Multiple-Pattern set in the text. Shift OR algorithm which we call as the Standard Shift OR algorithm uses the concept of Bit Parallelism to perform approximate string matching. The algorithm as the name suggests performs approximate string matching which means that it finds out some false matches besides detecting correct matches. In other words the algorithm behaves as a filter. In this paper a modification of the standard Shift OR is proposed to improve the filtering efficiency of the standard Shift OR algorithm using the consecutive N-Grams of the patterns of the multiple-pattern set. The proposed method reads N characters of the text at once as compared to a single character in the standard Shift OR algorithm. The number of false matches reduces besides increasing the speed of matching. Extensive experiments have been performed with the algorithm on text and pattern of variable size and the results are compared with the standard Shift OR algorithm.