International Journal of Computer Applications |
Foundation of Computer Science (FCS), NY, USA |
Volume 122 - Number 8 |
Year of Publication: 2015 |
Authors: A.rama Satish, P.bala Krishna Prasad |
10.5120/21717-4858 |
A.rama Satish, P.bala Krishna Prasad . Spotting Outliers in Large Distributed Datasets using Cell Density based Approach. International Journal of Computer Applications. 122, 8 ( July 2015), 1-7. DOI=10.5120/21717-4858
Outliers are abnormal instances or observations. Detecting data outliers is a very important concept in Knowledge data discovery. Outlier detection has been studied in the context of a large number of research areas like large distributed systems, data mining, wireless sensor networks(WSN), health monitoring, environmental science, statistics, etc. , Density based (DB) outlier detection techniques are robust in detecting outliers. In many applications, too much voluminous distributed data is generating every day. Finding deviating observations in the large distributed database rather than in any individual database is not a simple task. Integrating distributed database cause two major problems. First, render massive data from different databases. In addition, data integration may cause violation of data security and leakage of sensitive information. In this work we propose cell density based mechanism for outlier detection (CDOD) in large distributed databases. A centralized detection paradigm is used; it allows overcoming the expensive data integration and information leakage. The experimental results show robustness for finding outliers in large number of databases, instances and attributes.