International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 88 - Number 18
Year of Publication: 2014
Authors: Prerana S. Kulkarni, J. W. Bakal
DOI: 10.5120/15450-3813
Prerana S. Kulkarni, J. W. Bakal. Hybrid Approaches for Data Cleaning in Data Warehouse. International Journal of Computer Applications. 88, 18 (February 2014), 7-10. DOI=10.5120/15450-3813
Correct data is essential for well-informed and reliable decision making, and a data warehouse is a viable solution for making such decision making a reality. The quality of the data in a warehouse can only be ensured by cleaning the data before it is loaded, which makes data cleaning a very important process in data warehousing. It is not an easy process, because many different types of unclean data can be present, and whether data is clean or dirty depends heavily on the nature and source of the raw data. Many attempts have been made to clean data using different types of algorithms. This paper proposes a hybrid approach to data cleaning that combines modified versions of the PNRS and transitive closure algorithms with a semantic data matching algorithm, with the aim of getting better results in data correction.