Outlier Detection using Improved Genetic K-means

M. H. Marghny; Ahmed I. Taloba

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 28 - Number 11

Year of Publication: 2011

Authors: M. H. Marghny, Ahmed I. Taloba

10.5120/3458-4723

M. H. Marghny, Ahmed I. Taloba. Outlier Detection using Improved Genetic K-means. International Journal of Computer Applications. 28, 11 (August 2011), 33-36. DOI=10.5120/3458-4723

@article{ 10.5120/3458-4723,
author = { M. H. Marghny and Ahmed I. Taloba },
title = { Outlier Detection using Improved Genetic K-means },
journal = { International Journal of Computer Applications },
issue_date = { August 2011 },
volume = { 28 },
number = { 11 },
month = { August },
year = { 2011 },
issn = { 0975-8887 },
pages = { 33-36 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume28/number11/3458-4723/ },
doi = { 10.5120/3458-4723 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T20:14:33.244841+05:30
%A M. H. Marghny
%A Ahmed I. Taloba
%T Outlier Detection using Improved Genetic K-means
%J International Journal of Computer Applications
%@ 0975-8887
%V 28
%N 11
%P 33-36
%D 2011
%I Foundation of Computer Science (FCS), NY, USA

Abstract

The outlier detection problem is, in some cases, similar to the classification problem. For example, the main concern of clustering-based outlier detection algorithms is to find clusters and outliers; the outliers are often regarded as noise that should be removed to make the clustering more reliable. In this article, we present an algorithm that performs outlier detection and data clustering simultaneously. The algorithm improves the estimation of the centroids of the generative distribution during clustering and outlier discovery. It consists of two stages: the first runs the improved genetic k-means (IGK) algorithm, and the second iteratively removes the vectors that are far from their cluster centroids.
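The two-stage structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the IGK procedure, so plain Lloyd k-means stands in for the first stage. The removal rule in the second stage (drop any vector whose distance to its centroid exceeds `factor` times the mean distance) is likewise an assumption chosen for illustration. The names `kmeans`, `cluster_and_remove_outliers`, `factor`, and `max_rounds` are all hypothetical.

```python
import math

def assign(points, centroids):
    """Return the index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points]

def kmeans(points, init_centroids, iters=10):
    """Plain Lloyd k-means; an assumed stand-in for the IGK stage."""
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return centroids, assign(points, centroids)

def cluster_and_remove_outliers(points, init_centroids, factor=2.0, max_rounds=10):
    """Stage 1: cluster. Stage 2: repeatedly drop far vectors and re-cluster."""
    kept, outliers = list(points), []
    centroids, labels = kmeans(kept, init_centroids)
    for _ in range(max_rounds):
        dists = [math.dist(p, centroids[lab]) for p, lab in zip(kept, labels)]
        # Assumed rule: a vector is "far" if it lies beyond factor * mean distance.
        cutoff = factor * sum(dists) / len(dists)
        far = [d > cutoff for d in dists]
        if not any(far):
            break
        outliers += [p for p, f in zip(kept, far) if f]
        kept = [p for p, f in zip(kept, far) if not f]
        # Re-estimate the centroids without the removed vectors.
        centroids, labels = kmeans(kept, centroids)
    return centroids, kept, outliers

# Two tight clusters plus one distant vector.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (30, 30)]
centroids, kept, outliers = cluster_and_remove_outliers(data, [(0, 0), (10, 10)])
print(centroids, outliers)  # [(0.5, 0.5), (10.5, 10.5)] [(30, 30)]
```

In this toy run the distant vector initially pulls the second centroid to (14.4, 14.4). Once it is removed, the centroid settles at (10.5, 10.5), which illustrates how outlier removal can improve centroid estimation.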

Index Terms

Computer Science

Information Sciences

Keywords

Outlier detection, Genetic algorithms, Clustering, K-means algorithm, Improved Genetic K-means (IGK)