Advanced Computing and Communication Technologies for HPC Applications |
Foundation of Computer Science USA |
ACCTHPCA - Number 3 |
July 2012 |
Authors: Sharon Christa, V. Suma, Lakshmi Maduri |
Sharon Christa, V. Suma, Lakshmi Maduri. An Effective Data Preprocessing Technique for Improved Data Management in a Distributed Environment. Advanced Computing and Communication Technologies for HPC Applications. ACCTHPCA, 3 (July 2012), 25-29.
With the evolution of distributed computing, databases are inherently distributed across the globe, and analysing data from multiple sources has become essential to decision making. The core need in the current industrial environment is therefore to extract information from huge, complex and dynamic data through data mining techniques. Integrating data from multiple sources and analysing such large, complex and dynamic data is tedious and difficult work. Additionally, databases contain inconsistent and noisy data. Further, as the quality of the data to be mined decreases, the quality of the knowledge model obtained from it also decreases, which in turn affects the decision-making process. Optimizing data preprocessing can, however, address these issues. This paper presents the design and development of data preprocessing software based on intelligent agents. The software enables data preprocessing operations to be performed in an automated mode and gives accurate results in less time than manual data preprocessing.
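To make the kind of operation described above concrete, the following is a minimal sketch of an automated preprocessing pass over records gathered from two sources. It is not the authors' agent-based design, which the abstract does not detail. The function names, the field names (`city`, `sales`), the sample records, and the choice of median imputation are all assumptions made for illustration.

```python
from statistics import median

def normalize_text(record, field):
    """Return a copy with one text field trimmed and lower-cased (hypothetical rule)."""
    cleaned = dict(record)
    if isinstance(cleaned.get(field), str):
        cleaned[field] = cleaned[field].strip().lower()
    return cleaned

def merge_sources(sources, text_field):
    """Integrate records from several sources, dropping duplicates after normalization."""
    seen, merged = set(), []
    for source in sources:
        for record in source:
            cleaned = normalize_text(record, text_field)
            key = tuple(sorted(cleaned.items()))  # hashable fingerprint of the record
            if key not in seen:
                seen.add(key)
                merged.append(cleaned)
    return merged

def fill_missing(records, field):
    """Replace missing numeric values with the median of the observed ones (assumed policy)."""
    observed = [r[field] for r in records if r.get(field) is not None]
    fill = median(observed) if observed else None
    for r in records:
        if r.get(field) is None:
            r[field] = fill
    return records

def preprocess(sources, text_field, numeric_field):
    """Run integration, cleaning and imputation in one automated pass."""
    return fill_missing(merge_sources(sources, text_field), numeric_field)

# Illustrative data standing in for two distributed databases.
site_a = [{"city": " Bangalore", "sales": 120}, {"city": "Mysore", "sales": None}]
site_b = [{"city": "bangalore ", "sales": 120}, {"city": "Hubli", "sales": 80}]

cleaned = preprocess([site_a, site_b], text_field="city", numeric_field="sales")
```

Here the inconsistently spelled Bangalore record collapses into a single entry, and the missing Mysore value is filled from the remaining observations. A real system would need source-specific cleaning rules and a more careful imputation policy.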