Research Article

Enhancing the Performance of K-Means Clustering by using Fuzzy Partitioning Matrix

by Ahtesham Husain Shaikh, Manoj E. Patil
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 166 - Number 4
Year of Publication: 2017
Authors: Ahtesham Husain Shaikh, Manoj E. Patil
10.5120/ijca2017913996

Ahtesham Husain Shaikh, Manoj E. Patil. Enhancing the Performance of K-Means Clustering by using Fuzzy Partitioning Matrix. International Journal of Computer Applications. 166, 4 (May 2017), 18-24. DOI=10.5120/ijca2017913996

@article{ 10.5120/ijca2017913996,
author = { Ahtesham Husain Shaikh and Manoj E. Patil },
title = { Enhancing the Performance of K-Means Clustering by using Fuzzy Partitioning Matrix },
journal = { International Journal of Computer Applications },
issue_date = { May 2017 },
volume = { 166 },
number = { 4 },
month = { May },
year = { 2017 },
issn = { 0975-8887 },
pages = { 18-24 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume166/number4/27657-2017913996/ },
doi = { 10.5120/ijca2017913996 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:12:47.624522+05:30
%A Ahtesham Husain Shaikh
%A Manoj E. Patil
%T Enhancing the Performance of K-Means Clustering by using Fuzzy Partitioning Matrix
%J International Journal of Computer Applications
%@ 0975-8887
%V 166
%N 4
%P 18-24
%D 2017
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Clustering has two approaches: hard clustering and soft clustering. Hard clustering requires that each data object in the given data belong to exactly one cluster. The problem with hard K-Means (KM) clustering is that different initial partitions can result in different final clusters. Soft clustering, also known as fuzzy clustering, forms clusters such that a data object can belong to more than one cluster, based on its membership levels. However, the resulting membership values do not always correspond well to the degree to which the data actually belong to the clusters. To overcome these problems, an improved Fuzzy K-Means (FKM) clustering approach is proposed. The proposed improved Fuzzy K-Means clustering assigns each object a membership that is inversely related to the relative distance of the object to the cluster prototype. Fuzzy K-Means clustering assigns membership levels, which indicate the degree to which data elements belong to the clusters, and then uses them to assign each data object to one or more clusters. These levels indicate the strength of the association between a data object and a particular cluster. The work also compares the execution time and memory requirements of the proposed Fuzzy K-Means (FKM) with those of existing Fuzzy K-Means clustering.
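
To make the idea of a fuzzy partition matrix concrete, the following is a minimal sketch of the textbook fuzzy c-means membership update, in which the membership of an object in a cluster is inversely related to its distance to that cluster's prototype relative to its distances to all prototypes. It is not the improved FKM proposed in the paper, whose details the abstract does not give. The function names, the fuzzifier m = 2, and the toy data are assumptions chosen for illustration.

```python
import math

def fuzzy_memberships(points, centers, m=2.0, eps=1e-12):
    """Return U where U[i][j] is the membership of point i in cluster j.

    Standard fuzzy c-means update (an assumption, not the paper's variant):
    u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
    """
    exponent = 2.0 / (m - 1.0)
    partition = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        # A point lying exactly on one or more prototypes belongs fully to them.
        on_center = [j for j, d in enumerate(dists) if d < eps]
        if on_center:
            row = [1.0 / len(on_center) if j in on_center else 0.0
                   for j in range(len(centers))]
        else:
            row = [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                   for d_j in dists]
        partition.append(row)
    return partition

def update_centers(points, partition, m=2.0):
    """Recompute each prototype as the mean of all points weighted by u_ij ** m."""
    dim = len(points[0])
    centers = []
    for j in range(len(partition[0])):
        weights = [row[j] ** m for row in partition]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers

def harden(partition):
    """Collapse the fuzzy partition to a hard one by taking the largest membership."""
    return [max(range(len(row)), key=row.__getitem__) for row in partition]

# Toy data (hypothetical): two tight groups plus one point midway between them.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (2.5, 3.0)]
centers = [(0.0, 0.5), (5.0, 5.5)]

U = fuzzy_memberships(points, centers)
new_centers = update_centers(points, U)
labels = harden(U)
```

Each row of U sums to 1. The midway point (2.5, 3.0) is equidistant from both prototypes and so receives membership 0.5 in each cluster, which a hard assignment would have to discard. Alternating fuzzy_memberships and update_centers until the prototypes stop moving gives the usual iterative scheme.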

Index Terms

Computer Science
Information Sciences

Keywords

Fuzzy clustering, Fuzzy Partition Matrix