Research Article

Case Study: Outlier Detection on Sequential Data

by K. Anusha, S. Manoj Kumar, K. Santhi Sree
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 112 - Number 8
Year of Publication: 2015
Authors: K. Anusha, S. Manoj Kumar, K. Santhi Sree
DOI: 10.5120/19687-1435

K. Anusha, S. Manoj Kumar, K. Santhi Sree. Case Study: Outlier Detection on Sequential Data. International Journal of Computer Applications. 112, 8 (February 2015), 29-35. DOI=10.5120/19687-1435

@article{ 10.5120/19687-1435,
author = { K. Anusha and S. Manoj Kumar and K. Santhi Sree },
title = { Case Study: Outlier Detection on Sequential Data },
journal = { International Journal of Computer Applications },
issue_date = { February 2015 },
volume = { 112 },
number = { 8 },
month = { February },
year = { 2015 },
issn = { 0975-8887 },
pages = { 29-35 },
numpages = {7},
url = { https://ijcaonline.org/archives/volume112/number8/19687-1435/ },
doi = { 10.5120/19687-1435 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:49:43.635950+05:30
%A K. Anusha
%A S. Manoj Kumar
%A K. Santhi Sree
%T Case Study: Outlier Detection on Sequential Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 112
%N 8
%P 29-35
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Time series data streams are now common in wireless sensor networks. Such data is uncertain because of the limitations of measuring equipment and other sources of corrupting noise. As uncertain streaming data is generated continuously, mining algorithms must be able to analyze it. This work proposes two continuous distance-based outlier detection approaches, one exact and one approximate, for uncertain time series data streams. Both are built on the cell-based approach and can be applied to uncertain objects. A set of uncertain objects at a particular timestamp is called a state set. Because the interval between two consecutive timestamps is very short, outliers are detected with an incremental approach that reuses the results obtained from the previous state set to detect outliers in the current state set. An approximate incremental outlier detection approach is proposed to further reduce the cost of incremental detection. In both incremental algorithms, a cell-based algorithm is used to detect outliers efficiently within a state set. The efficiency of the proposed approaches is evaluated on synthetic and real datasets.
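
To make the cell-based idea concrete, the following is a minimal sketch of distance-based outlier detection within a single state set. It is not the authors' algorithm: it assumes deterministic 2-D points rather than uncertain objects, it does not include the incremental step, and the function name `cell_based_outliers` and the parameters `d` (distance threshold) and `k` (minimum neighbor count) are hypothetical names chosen for illustration. A point is treated as an outlier when fewer than `k` other points lie within distance `d`. The grid uses cells of side `d / (2 * sqrt(2))`, so whole cells can often be labeled without any pairwise distance computation.

```python
import math
from collections import defaultdict

def cell_based_outliers(points, d, k):
    """Return sorted indices of 2-D points with fewer than k other points within distance d."""
    # Cell diagonal is d/2, so a point and anything in the 8 adjacent cells are within d.
    side = d / (2 * math.sqrt(2))
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(math.floor(x / side), math.floor(y / side))].append(i)

    def layer(cx, cy, lo, hi):
        """Indices of points in cells whose Chebyshev distance from (cx, cy) is in [lo, hi]."""
        found = []
        for dx in range(-hi, hi + 1):
            for dy in range(-hi, hi + 1):
                if lo <= max(abs(dx), abs(dy)) <= hi:
                    found.extend(grid.get((cx + dx, cy + dy), []))
        return found

    outliers = []
    for (cx, cy), members in grid.items():
        # Guaranteed neighbors: the rest of this cell plus the first layer of cells.
        sure = len(members) - 1 + len(layer(cx, cy, 1, 1))
        if sure >= k:
            continue  # every point in this cell is an inlier
        # Cells two or three steps away may or may not hold neighbors;
        # cells farther away are more than d from every point in this cell.
        candidates = layer(cx, cy, 2, 3)
        if sure + len(candidates) < k:
            outliers.extend(members)  # every point in this cell is an outlier
            continue
        # Undecided cell: fall back to exact distance checks per point.
        for i in members:
            near = sum(1 for j in candidates if math.dist(points[i], points[j]) <= d)
            if sure + near < k:
                outliers.append(i)
    return sorted(outliers)

pts = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.0)]
print(cell_based_outliers(pts, d=1.0, k=2))  # [4]
```

In this example the four clustered points share one cell and are labeled inliers from the cell count alone, while the isolated point at (5.0, 5.0) has no points in its surrounding layers and is labeled an outlier without computing a single distance. An incremental variant could, under the same assumptions, keep the grid between timestamps and re-examine only the cells whose contents changed.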

Index Terms

Computer Science
Information Sciences

Keywords

Clustering, Outlier detection, Cell-Based Approach, Grid-File indexing, Incremental Outlier Approach