Research Article

Data Mining - Techniques, Methods and Algorithms: A Review on Tools and their Validity

by Mansi Gera, Shivani Goel
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 113 - Number 18
Year of Publication: 2015
Authors: Mansi Gera, Shivani Goel
10.5120/19926-2042

Mansi Gera, Shivani Goel. Data Mining - Techniques, Methods and Algorithms: A Review on Tools and their Validity. International Journal of Computer Applications. 113, 18 (March 2015), 22-29. DOI=10.5120/19926-2042

@article{ 10.5120/19926-2042,
author = { Mansi Gera and Shivani Goel },
title = { Data Mining - Techniques, Methods and Algorithms: A Review on Tools and their Validity },
journal = { International Journal of Computer Applications },
issue_date = { March 2015 },
volume = { 113 },
number = { 18 },
month = { March },
year = { 2015 },
issn = { 0975-8887 },
pages = { 22-29 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume113/number18/19926-2042/ },
doi = { 10.5120/19926-2042 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:51:15.633134+05:30
%A Mansi Gera
%A Shivani Goel
%T Data Mining - Techniques, Methods and Algorithms: A Review on Tools and their Validity
%J International Journal of Computer Applications
%@ 0975-8887
%V 113
%N 18
%P 22-29
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data mining is the process of extracting useful data, patterns and trends from large amounts of data using techniques such as clustering, classification, association and regression. It has a wide variety of real-life applications. Various tools are available, each supporting different algorithms. The objective of this paper is to summarize the available data mining tools and the algorithms they support. The tools are also compared so that users can choose among them according to their requirements and applications. Different validation indices are also summarized.

Index Terms

Computer Science
Information Sciences

Keywords

Data mining, Algorithms, Clustering