CFP last date
20 January 2025
Reseach Article

Efficient Algorithm for Mining Frequent Subgraphs (Static and Dynamic) based on gSpan

by K. Lakshmi, T. Meyyappan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 63 - Number 19
Year of Publication: 2013
Authors: K. Lakshmi, T. Meyyappan
10.5120/10572-3117

K. Lakshmi, T. Meyyappan . Efficient Algorithm for Mining Frequent Subgraphs (Static and Dynamic) based on gSpan. International Journal of Computer Applications. 63, 19 ( February 2013), 9-12. DOI=10.5120/10572-3117

@article{ 10.5120/10572-3117,
author = { K. Lakshmi, T. Meyyappan },
title = { Efficient Algorithm for Mining Frequent Subgraphs (Static and Dynamic) based on gSpan },
journal = { International Journal of Computer Applications },
issue_date = { February 2013 },
volume = { 63 },
number = { 19 },
month = { February },
year = { 2013 },
issn = { 0975-8887 },
pages = { 9-12 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume63/number19/10572-3117/ },
doi = { 10.5120/10572-3117 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:14:44.860824+05:30
%A K. Lakshmi
%A T. Meyyappan
%T Efficient Algorithm for Mining Frequent Subgraphs (Static and Dynamic) based on gSpan
%J International Journal of Computer Applications
%@ 0975-8887
%V 63
%N 19
%P 9-12
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Frequent sub graph mining is another active research topic in data mining. A graph is a general model to represent data and has been used in many domains like chemo informatics and bioinformatics. Mining patterns from graph databases is challenging since graph related operations, such as sub graph testing, generally have higher time complexity than the corresponding operations on item sets, sequences, and trees. We investigated new approaches for frequent graph-based pattern mining in graph datasets and found that a novel algorithm called span (graph-based Substructure pattern mining), has been used as the standard for comparing performance of new algorithms. It is based on the pattern growth approach of frequent sub graph mining and hence discovers frequent substructures without candidate generation. span is based on a lexicographic order, and maps each graph to a unique minimum DFS code as its canonical label. Based on this lexicographic order, gSpan adopts the depth-first search strategy to mine frequent connected sub graphs efficiently. Based on the literature survey done, algorithms based on pattern growth approach and DFS strategy are found to be better in performance than algorithms based on Apriori approach and BFS strategy. In this paper we propose a new algorithm based on gSpan, for a special class of graphs characterised by the existence of unique node lablels.

References
  1. P. Dickinson, H. Bunked, A. Dadejiichand M. Kraetzl. On graphs with unique node labels. In Proceedings of the IAPR Workshop on GBR, pages 13, 2003.
  2. X. Yan and J. Han. gSpan: Graph-based substructure pattern mining. Proc. 2nd IEEE Int'l Conf.
  3. Yuhua Li Quan Lin Gang Zhong Dongsheng Duan Yanan Jin Wei Bi, A Directed Labeled Graph Frequent Pattern Mining Algorithm based on Minimum Code. In the proceedings of Third International Conference on Multimedia and Ubiquitous Engineering 2009.
  4. Chang Hun You, Lawrence B. Holder and Diane J. Cook :Graph-based Data Mining in Dynamic Networks: Empirical Comparison of Compression-based and Frequency-based Subgraph Mining in International Conference on Data Mining Workshops IEEE, 2004.
  5. Hsun-Ping Hsieh, Cheng-Te Li, Mining Temporal Subgraph Patterns in Heterogeneous Information Networks: In the proceedings of IEEE International Conference on Social Computing / IEEEInternational Conference on Privacy, Security, Risk and Trust.
  6. Nijssen, S. and Kok, J. , A quickstart in frequent structure mining can make a difference. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery andData Mining, ACM, 2004, pp. 647–652.
  7. Bianca Wackersreuther, Peter Wackersreuther, Annahita Oswald : Frequent Subgraph Discovery in Dynamic Networks, ACM 978-1-4503-0214-2 2010.
  8. L. T. Thomas, S. R. Valluri, and K. Karlapalem. Margin:Maximal frequent subgraph mining. Proc. 6th IEEE Int'l Conf. Data mining (ICDM '06), pp. 1097-1101, 2006.
  9. M. Kuramochi and G. Karypis. Grew-a scalable frequent subgraph discovery algorithm. In ICDM, pages 439–442,2004.
  10. J. Huan, W. Wang, and J. Prins. Efficient mining of frequent subgraph in the presence of isomorphism. UNC computer science technique report TR03-021, 2003.
  11. J. Huan, W. Wang, J. Prins, and J. Yang. Spin: Mining maximal frequent subgraphs from graph databases. UNC Technical Report TR04-018, 2004.
  12. T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. MIT Press, 2001, Second Edition.
  13. J. Huan, W. Wang, and J. Prins. Efficient mining of frequent subgraph in the presence ofisomorphism. UNC computer science technique report TR03-021, 2003.
  14. A. Inokuchi, T. Washio, and H. Motoda. An apriori-based algorithm for mining frequent substructures from graph data. In PKDD'00.
  15. M. Kuramochi and G. Karypis . Frequent Subgraph Discovery. In ICDM'01.
  16. M. Deshpande, M. Kuramochi, and G. Karypis. Frequent sub-structure based approaches forclassifying chemical compounds. In Proc. of 2003 IEEE International Conference on Data Mining(ICDM), 2003.
  17. Yong Liu, Jianzhong Li, Hong Gao, JPMiner: Mining Frequent Jump Patterns From Graph Databases. In the proceedings of Sixth International Conference on Fuzzy Systems and Knowledge Discovery 2009.
  18. Nijssen, S. and Kok, J. , Faster association rules for multiple relations. In IJCAI'01: SeventeenthInternational Joint Conference on Artificial Intelligence, 2001, vol. 2, pp. 891–896.
Index Terms

Computer Science
Information Sciences

Keywords

Parallel programming frequent subgraph mining DFS code Isomorphism