Research Article

Article:Efficient Tree Based Distributed Data Mining Algorithms for mining Frequent Patterns

by T.SathishKumar, V.Kavitha, Dr.T.Ravichandran
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 10 - Number 1
Year of Publication: 2010
Authors: T.SathishKumar, V.Kavitha, Dr.T.Ravichandran
DOI: 10.5120/1447-1957

T.SathishKumar, V.Kavitha, Dr.T.Ravichandran . Article:Efficient Tree Based Distributed Data Mining Algorithms for mining Frequent Patterns. International Journal of Computer Applications. 10, 1 ( November 2010), 11-16. DOI=10.5120/1447-1957

@article{ 10.5120/1447-1957,
author = { T.SathishKumar and V.Kavitha and Dr.T.Ravichandran },
title = { Article:Efficient Tree Based Distributed Data Mining Algorithms for mining Frequent Patterns },
journal = { International Journal of Computer Applications },
issue_date = { November 2010 },
volume = { 10 },
number = { 1 },
month = { November },
year = { 2010 },
issn = { 0975-8887 },
pages = { 11-16 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume10/number1/1447-1957/ },
doi = { 10.5120/1447-1957 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T19:58:38.554008+05:30
%A T.SathishKumar
%A V.Kavitha
%A Dr.T.Ravichandran
%T Article:Efficient Tree Based Distributed Data Mining Algorithms for mining Frequent Patterns
%J International Journal of Computer Applications
%@ 0975-8887
%V 10
%N 1
%P 11-16
%D 2010
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Advances in wired and wireless networking have paved the way for many dynamic distributed computing environments. These environments have diverse computing resources and multiple heterogeneous sources of data. Most mining algorithms are designed to mine rules from monolithic, non-distributed databases. Even algorithms designed exclusively for distributed databases normally download the relevant data to a centralized location and then perform the data mining operations there. This centralized approach does not work well in many distributed, ubiquitous, privacy-sensitive data mining applications, which has opened a new area of research within the data mining domain: Distributed Data Mining (DDM). Among the various methods used to mine frequent itemsets, the tree-based methodology shows some efficiency in distributed environments. In this paper we therefore study a set of tree-based algorithms [DTFIM, PP, LFP and PP] for mining frequent patterns in a distributed environment.
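The contrast between centralizing data and mining it in place can be illustrated with a minimal sketch. It is not one of the tree-based algorithms studied in the paper. It only shows the general idea of exchanging local support counts instead of raw transactions. The function names, the sample transactions, and the support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, k):
    """Count every k-item combination at one site; raw data stays at the site."""
    counts = Counter()
    for transaction in transactions:
        # Sort the distinct items so each itemset has one canonical tuple form.
        for itemset in combinations(sorted(set(transaction)), k):
            counts[itemset] += 1
    return counts


def global_frequent(sites, k, min_support):
    """Merge per-site counts and keep itemsets that meet the global threshold."""
    total = Counter()
    for transactions in sites:
        # Only the compact count table would cross the network in a real system.
        total.update(local_counts(transactions, k))
    return {itemset: n for itemset, n in total.items() if n >= min_support}


# Hypothetical transactions held at two separate sites.
site_a = [["bread", "milk"], ["bread", "butter"], ["milk", "butter", "bread"]]
site_b = [["bread", "milk"], ["milk"], ["bread", "milk", "butter"]]

frequent_pairs = global_frequent([site_a, site_b], k=2, min_support=3)
print(frequent_pairs)
```

A tree-based method would replace the flat count table with a trie or FP-tree built at each site, but the principle of sharing summaries rather than transactions is the same.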

References
  1. R. Agrawal, T. Imielinski and A. N. Swami, 1993. "Mining association rules between sets of items in large databases", in ACM SIGMOD Int. Conf. on Management of Data pp. 207-16.
  2. R. Agrawal and R. Srikant, 1994. "Fast algorithms for mining association rules", in VLDB pp. 487-99.
  3. R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A.I. Verkamo, 1996. "Fast discovery of association rules", in Advances in Knowledge Discovery and Data Mining, pp. 307-328.
  4. E. Ansari, G.H. Dastghaibifard, M. Keshtkaran, 2008. "DTFIM: Distributed Trie-based Frequent Itemset Mining", Proceedings of the International MultiConference of Engineers and Computer Scientists 2008 Vol I IMECS, 19-21 March, 2008.
  5. M.Z Ashrafi, D. Taniar and K. Smith, 2004. "ODAM: An optimized distributed association rule mining algorithm", IEEE Distributed Systems Online 1541-4922, 5 (3).
  6. G. Buehrer, S. Parthasarathy, S. Tatikonda, T. Kurc and J. Saltz,2007. "Toward terabyte pattern mining an architecture-conscious solution," in PPoPP, p. 2-12.
  7. D. W. Cheung, J. Han, V. T. Ng, and C. Y.Wong. 1996. “Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique”. In proceedings of 12th ICDE.
  8. S. Cong, J. Han, J. Hoeflinger and D. Padua, 2005. "A sampling-based framework for parallel data mining," in PPoPP , pp. 255-65
  9. D. Chen, C. Lai, W. Hu, W.G. Chen, Y. Zhang and W. Zheng, 2006. "Tree partition based parallel frequent pattern mining on shared memory systems," in IEEE Parallel and Distributed Processing Symposium.
  10. David Wai-Lok Cheung , Jiawei Han , Vincent Ng , C. Y. Wong, February 26-March 01. Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique, Proceedings of the Twelfth International Conference on Data Engineering, p.106-114.
  11. G. Grahne and J. Zhu , May 2003. ”High performance mining of maximal frequent itemsets”, In SIAM’03 Workshop on High Performance Data Mining: Pervasive and Data Stream Mining.
  12. Han, J., Pei, J., and Yin, Y. 2000. Mining frequent patterns without candidate generation. In Proc. 2000 ACM SIGMOD Int. Conf. Management of Data (SIGMOD'00), Dallas, TX, pp. 1-12.
  13. J. Hu and X. Yang-Li, 2008. "A fast parallel association rules mining algorithm based on FP-Forest," in 5th Int. Symposium on Neural Networks , pp. 40-9.
  14. Haoyuan Li, Yi Wang, Dong Zhang, Ming Zhang, Edward Chang, 2008. "PFP: Parallel FP-Growth for Query Recommendation", Proceedings of the 2008 ACM Conference on Recommender Systems, pp. 107-114.
  15. A. Javed and A. Khokhar, 2004. "Frequent pattern mining on message passing multiprocessor systems," Distributed and Parallel Databases , vol. 16, pp. 321-34.
  16. Jian Pei, Jiawei Han, Hongjun Lu, Shojiro Nishio, Shiwei Tang, Dongqing Yang, "H-Mine: Hyper-Structure Mining of Frequent Patterns in Large Databases", First IEEE International Conference on Data Mining (ICDM'01).
  17. Kun-Ming Yu, Jiayi Zhou, and Wei Chen Hsiao , 2007. ”Load Balancing Approach Parallel Algorithm for Frequent Pattern Mining” V. Malyshkin (Ed.): PaCT 2007, LNCS 4671, pp. 623–631, 2007 Springer-Verlag Berlin Heidelberg .
  18. Laila A. Abd-Elmegid, Mohamed E. El-Sharkawi, Laila M. El-Fangary and Yehia K. Helmy, May 2010. "Vertical Mining of Frequent Patterns from Uncertain Data", journal on Computer and Information Science, Vol. 3, No. 2.
  19. Laszlo Szathmary, Petko Valtchev, Amedeo Napoli, and Robert Godin, 2008. "An Efficient Hybrid Algorithm for Mining Frequent Closures and Generators", CLA 2008, pp. 47–58, ISBN 978–80–274–2111–7, Palacký University, Olomouc.
  20. J. Liu, Y. Pan, K. Wang, and J. Han, 2002. "Mining frequent item sets by opportunistic projection", in SIGKDD.
  21. H. Mannila, H. Toivonen, and A. I. Verkamo, 1994. "Efficient algorithms for discovering association rules", in AAAI Workshop on Knowledge Discovery in Databases (KDD 94), pp. 181-192.
  22. Minho Kim, Gye Hyung Kim and R.S. Ramakrishna, 2003. "A Virtual Join Algorithm for Fast Association Rule Mining", Intelligent Data Engineering and Automated Learning, Volume 2690/2003, 796-800, DOI: 10.1007/978-3-540-45080-1_108
  23. Mohammed J. Zaki. May/June 2000. Scalable algorithms for association mining. IEEE Transactions on Knowledge and Data Engineering, 12(3):372-390.
  24. S. Orlando, P. Palmerini, R. Perego and F. Silvestri, 2003."An efficient parallel and distributed algorithm for counting frequent sets," in VECPAR , pp. 421-35.
  25. J.S. Park, M.-S. Chen, and P.S. Yu 1995. ”An effective hash based algorithm for mining association rules”, In Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, volume 24(2) of SIGMOD Record, pages 175–186. ACM Press.
  26. A. Savasere, E. Omiecinski and S. B. Navathe, 1995. "An efficient algorithm for mining association rules in large databases", in VLDB, pp. 432-44.
  27. Tanbeer SK, Ahmed CF, Jeong B. 2009. "Parallel and Distributed Algorithms for Frequent Pattern Mining in Large Databases", IETE Tech Rev; 26:55-65.
  28. H. Toivonen, 1996. ”Sampling large databases for association rules”,in T.M. Vijayaraman, A.P. Buchmann, C. Mohan, and N.L. Sarda, editors, Proceedings 22nd International Conference on Very Large Data Bases, pages 134–145. Morgan Kaufmann.
  29. K.-M. Yu, J. Zhou and W. C. Hsiao, 2007. "Load balancing approach parallel algorithm for frequent pattern mining," in PaCT , pp. 623-31.
  30. O.R. Zaïane, M. El-Hajj and P. Lu, 2001. "Fast parallel association rule mining without candidacy generation," in IEEE Int. Conf. on Data Mining, pp. 665-8.
Index Terms

Computer Science
Information Sciences

Keywords

Tree based algorithms, Distributed data mining, Mining frequent patterns