Research Article

A Comparative Study on Outlier Detection Techniques

by Mohammad Zaid Pasha, Nitin Umesh
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 66 - Number 24
Year of Publication: 2013
Authors: Mohammad Zaid Pasha, Nitin Umesh
DOI: 10.5120/11265-6475

Mohammad Zaid Pasha, Nitin Umesh. A Comparative Study on Outlier Detection Techniques. International Journal of Computer Applications. 66, 24 (March 2013), 23-27. DOI=10.5120/11265-6475

@article{ 10.5120/11265-6475,
author = { Mohammad Zaid Pasha and Nitin Umesh },
title = { A Comparative Study on Outlier Detection Techniques },
journal = { International Journal of Computer Applications },
issue_date = { March 2013 },
volume = { 66 },
number = { 24 },
month = { March },
year = { 2013 },
issn = { 0975-8887 },
pages = { 23-27 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume66/number24/11265-6475/ },
doi = { 10.5120/11265-6475 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:23:19.140536+05:30
%A Mohammad Zaid Pasha
%A Nitin Umesh
%T A Comparative Study on Outlier Detection Techniques
%J International Journal of Computer Applications
%@ 0975-8887
%V 66
%N 24
%P 23-27
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Outlier detection is an extremely important problem with direct application in a wide variety of domains. A key challenge with outlier detection is that it is not a well-formulated problem like clustering. This paper discusses different techniques and compares them by analyzing their different aspects, chiefly time complexity. Every unique problem formulation entails a different approach, resulting in a huge literature on outlier detection techniques. Several techniques have been proposed to target a particular application domain. The classification of outlier detection techniques based on the applied knowledge discipline provides an idea of the research done by different communities and also highlights the unexplored research avenues for the outlier detection problem. The behavior of the different techniques is also discussed with respect to their nature. The feasibility of a technique in a particular problem setting also depends on other constraints. For example, statistical techniques assume knowledge about the underlying distribution characteristics of the data. Distance-based techniques are typically expensive and hence are not applied in scenarios where computational complexity is an important issue.
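
The contrast between statistical and distance-based techniques can be illustrated with a minimal sketch. It is not taken from the paper: the function names, thresholds, and sample data are illustrative assumptions. The statistical detector assumes roughly normally distributed values and runs in linear time. The distance-based detector follows the general DB(p, D) idea of Knorr and Ng in a simplified form that compares each point against every other point, so it runs in quadratic time.

```python
import math
from statistics import mean, stdev


def zscore_outliers(values, threshold=3.0):
    """Statistical sketch: flag values whose z-score exceeds the threshold.

    Assumes the data are roughly normal (an assumption of this example).
    Runs in O(n) time.
    """
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing to flag
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def distance_outliers(points, radius, min_fraction_far):
    """Distance-based sketch in the spirit of DB(p, D) outliers.

    A point is flagged when at least `min_fraction_far` of the other
    points lie farther than `radius` from it (Euclidean distance).
    The nested scan over all pairs makes this O(n^2).
    """
    n = len(points)
    if n < 2:
        return []
    flagged = []
    for i, p in enumerate(points):
        far = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) > radius
        )
        if far / (n - 1) >= min_fraction_far:
            flagged.append(i)
    return flagged


# Hypothetical sample data chosen only for illustration.
values = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.1, 9.9, 25.0]
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]

statistical_hits = zscore_outliers(values, threshold=2.5)
distance_hits = distance_outliers(points, radius=2.0, min_fraction_far=0.9)
```

In this sketch the z-score threshold is lowered to 2.5 because, in a sample of only eleven values, a single extreme value can barely exceed a z-score of 3. The pairwise loop in `distance_outliers` is the source of the computational cost the abstract refers to.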

Index Terms

Computer Science
Information Sciences

Keywords

Outlier, time complexity, statistical techniques, Euclidean distance