CFP last date
20 December 2024
Reseach Article

Improved Approximate Multiple Pattern String Matching using Consecutive Q Grams of Pattern

by Syed Danish Ali, Zuber Farooqui
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 76 - Number 15
Year of Publication: 2013
Authors: Syed Danish Ali, Zuber Farooqui
10.5120/13320-0451

Syed Danish Ali, Zuber Farooqui . Improved Approximate Multiple Pattern String Matching using Consecutive Q Grams of Pattern. International Journal of Computer Applications. 76, 15 ( August 2013), 1-6. DOI=10.5120/13320-0451

@article{ 10.5120/13320-0451,
author = { Syed Danish Ali, Zuber Farooqui },
title = { Improved Approximate Multiple Pattern String Matching using Consecutive Q Grams of Pattern },
journal = { International Journal of Computer Applications },
issue_date = { August 2013 },
volume = { 76 },
number = { 15 },
month = { August },
year = { 2013 },
issn = { 0975-8887 },
pages = { 1-6 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume76/number15/13320-0451/ },
doi = { 10.5120/13320-0451 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:45:57.574915+05:30
%A Syed Danish Ali
%A Zuber Farooqui
%T Improved Approximate Multiple Pattern String Matching using Consecutive Q Grams of Pattern
%J International Journal of Computer Applications
%@ 0975-8887
%V 76
%N 15
%P 1-6
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

String matching is to find all the occurrences of a given pattern in a large text both being sequence of characters drawn from finite alphabet set. This problem is fundamental in computer Science and is the basic need of many applications such as text retrieval, symbol manipulation, computational biology, data mining, and network security. Bit parallelism method is used for increasing the processing speed of String matching algorithm. Standard Shift OR algorithm is used to perform approximate string matching. The algorithm is a filter which finds out false matches besides detecting correct matches. To improve the efficiency of basic Shift OR algorithm by reducing the number of false matches that is detected along with the correct matches by the algorithm, proposed Shift OR with consecutive q grams has been implemented. In the algorithm instead of reading a single character at a time, it read q characters at once. Extensive experiments have been done with the algorithm and the results are compared with basic version of shift OR algorithms. The number of false matches also reduced considerably. The gain is due to the improved ?ltering efficiency caused by q-grams.

References
  1. Leena Salmela, J. Tarhio and J. Kytojoki, "Multiple Pattern String Matching with Q grams", ACM Journal Name, Vol. V, No. N, Month 20YY
  2. Rajesh Prasad, Suneeta Agarwal, Ishadutta Yadav, Bharat Singh "Efficient Bit-Parallel Multi-Patterns String Matching Algorithms for Limited Expression", Compute 2010 ACM
  3. Heikki Hyyr¨o, Kimmo Fredriksson Gonzalo Navarro , "Increased Bit-Parallelism for Approximate and Multiple String Matching", ACM Journal of Experimental Algorithmics Vol 10 2006.
  4. Gonzalo Navarro and Mathieu Raffinot. A Bit Parallel approach to Suffix Automata : Fast Extended String Matching. In M. Farach (editor), Proc. CPM'98, LNCS 1448. Pages 14-33, 1998.
  5. G. Navarro,M. Raffinot, Fast and flexible string matching by combining bit-parallelism and suffix automata,ACM J. Experimental Algorithmics (JEA) 5 (4) (2000).
  6. M. Crochemore et al. , A bit-parallel suffix automaton approach for (?, ? )-matching in music retrieval, in: Proc. 10th Internat. Symp. on String Processing and Information Retrieval (SPIRE'03), in: Lecture Notes in Computer. Sci. , vol. 2857, 2003, pp. 211–223
  7. R. Baeza-Yates, G. Gonnet, A new approach to text searching, Comm. ACM 35 (10) (1992) 74–82.
  8. Hannu Peltola and Jorma Tarhio , Alternative Algorithms for Bit-Parallel String Matching, String Processing and Information Retrieval, 2003 - Springer
  9. Leena Salmela, J. Tarhio and J. Kytojoki "MultiPattern String Matching with Very Large Pattern Sets", ACM Journal of Experimental Algorithmics, Volume 11, 2006.
  10. AHO, A. AND CORASICK, M. 1975. Efficient string matching: n aid to bibliographic search. Communications of the ACM 18, 6, 333–340.
  11. HYYR¨O, H. AND NAVARRO, G. 2002. Faster bit-parallel approximate string matching. In Proc. 13th Combinatorial Pattern Matching (CPM '02). LNCS 2373. Berlin, Germany, Springer, New York. 203–224.
  12. NAVARRO, G. AND RAFFINOT, M. 2000. Fast and flexible string matching by combining bit parallelism and suffix automata. ACM Journal of Experimental Algorithmics (JEA). 5, 4.
  13. Vidya Saikrishna, Akhtar Rasool and Nilay Khare. Article: Spam Filtering through Multiple Pattern Bit Parallel String Matching Combining Shift AND & OR. International Journal of Computer Applications 61(5):40-45, January 2013. Published by Foundation of Computer Science, New York, USA.
  14. . Vidya Saikrishna, Akhtar Rasool and Nilay Khare Time Efficient String Matching Solution for Single and Multiple Pattern using Bit Parallelism. International Journal of Computer Applications 46(6):15-20, May 2012. Published by Foundation of Computer Science, New York, USA
Index Terms

Computer Science
Information Sciences

Keywords

Approximate String matching Bit parallelism Shift OR String Matching q Grams