Research Article

Fuzzy based Probability Factor Calculation for Number of Cluster Estimation to K-Mean by using Apriori

by Pratishtha Singh Baghel, Divakar Singh
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 114 - Number 18
Year of Publication: 2015
Authors: Pratishtha Singh Baghel, Divakar Singh
DOI: 10.5120/20078-2105

Pratishtha Singh Baghel, Divakar Singh. Fuzzy based Probability Factor Calculation for Number of Cluster Estimation to K-Mean by using Apriori. International Journal of Computer Applications. 114, 18 (March 2015), 18-21. DOI=10.5120/20078-2105

@article{ 10.5120/20078-2105,
author = { Pratishtha Singh Baghel and Divakar Singh },
title = { Fuzzy based Probability Factor Calculation for Number of Cluster Estimation to K-Mean by using Apriori },
journal = { International Journal of Computer Applications },
issue_date = { March 2015 },
volume = { 114 },
number = { 18 },
month = { March },
year = { 2015 },
issn = { 0975-8887 },
pages = { 18-21 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume114/number18/20078-2105/ },
doi = { 10.5120/20078-2105 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:53:08.643494+05:30
%A Pratishtha Singh Baghel
%A Divakar Singh
%T Fuzzy based Probability Factor Calculation for Number of Cluster Estimation to K-Mean by using Apriori
%J International Journal of Computer Applications
%@ 0975-8887
%V 114
%N 18
%P 18-21
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data mining is a powerful and relatively new field that offers a variety of techniques for converting raw data into useful information across many research areas. Clustering is the process of putting similar data into groups. A popular clustering technique is K-means, in which the data are partitioned into K clusters. In this method the number of clusters is predefined, and the technique is highly dependent on the initial identification of elements that represent the clusters well. The number of clusters cannot be changed in the middle of the algorithm's execution. An important question in K-means is therefore how many clusters to take: the chosen number may be too few or too many. K-means has no mechanism for estimating the number of clusters; the choice depends entirely on the user. For large amounts of data, however, the user cannot judge how much of the data is similar. For example, if most of the data share common similarities, there is no reason to take many clusters; a small number of clusters gives better evaluation and better performance. Similarly, if a large amount of the data is dissimilar, more clusters should be taken. To address this, we use Apriori to generate association rules, put the values obtained from these rules into our proposed equation, and calculate a probability factor that gives the estimated number of clusters for K-means.
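
The dependence on a user-chosen K described above can be illustrated with a minimal sketch of a generic Lloyd-style K-means in plain Python. This is not the authors' probability-factor method, whose equation is not given in the abstract. The function name `k_means`, the evenly spaced seeding of centroids, and the sample points are assumptions made only for illustration.

```python
from math import dist

def k_means(points, k, iterations=10):
    """Generic Lloyd-style K-means; k is fixed by the caller before the run."""
    # Assumption for illustration: seed centroids with k evenly spaced points.
    centroids = [tuple(points[i * len(points) // k]) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters

# Hypothetical sample data: two visibly separate groups of three points each.
points = [(1.0, 1.0), (1.5, 1.2), (0.8, 1.4), (8.0, 8.0), (8.3, 7.6), (7.9, 8.4)]

centroids_2, clusters_2 = k_means(points, k=2)  # matches the two natural groups
centroids_3, clusters_3 = k_means(points, k=3)  # forces one group to be split
```

With `k=2` the two natural groups are recovered, while `k=3` splits one of them, because the algorithm never revises K during execution. This is the gap that an estimate of K computed before clustering is meant to fill.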

Index Terms

Computer Science
Information Sciences

Keywords

Data mining, clustering, Apriori, k-means, association rules, probability factor.