Research Article

A Conceptual Framework for Data Cleansing – A Novel Approach to Support the Cleansing Process

by Kofi Adu-manu Sarpong, Joseph George Davis, Joseph Kobina Panford
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 77 - Number 12
Year of Publication: 2013
Authors: Kofi Adu-manu Sarpong, Joseph George Davis, Joseph Kobina Panford
10.5120/13447-1310

Kofi Adu-manu Sarpong, Joseph George Davis, Joseph Kobina Panford. A Conceptual Framework for Data Cleansing – A Novel Approach to Support the Cleansing Process. International Journal of Computer Applications. 77, 12 (September 2013), 22-26. DOI=10.5120/13447-1310

@article{ 10.5120/13447-1310,
author = { Kofi Adu-manu Sarpong and Joseph George Davis and Joseph Kobina Panford },
title = { A Conceptual Framework for Data Cleansing – A Novel Approach to Support the Cleansing Process },
journal = { International Journal of Computer Applications },
issue_date = { September 2013 },
volume = { 77 },
number = { 12 },
month = { September },
year = { 2013 },
issn = { 0975-8887 },
pages = { 22-26 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume77/number12/13447-1310/ },
doi = { 10.5120/13447-1310 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:50:06.955675+05:30
%A Kofi Adu-manu Sarpong
%A Joseph George Davis
%A Joseph Kobina Panford
%T A Conceptual Framework for Data Cleansing – A Novel Approach to Support the Cleansing Process
%J International Journal of Computer Applications
%@ 0975-8887
%V 77
%N 12
%P 22-26
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data errors arise in various ways when data is transferred from one point to another. These errors do not necessarily originate when data is created or inserted; they can also develop and change as data moves from one process to another along the information chain within the data warehouse infrastructure. The main focus of this study is to conceptualize the data cleansing process from data acquisition to data maintenance. Data cleansing is the process of detecting and correcting errors and inconsistencies in a data warehouse. Poor data, or "dirty data", requires cleansing before it can be useful to organizations. Data cleansing therefore deals with identifying the corrupt and duplicate data inherent in the data sets of a data warehouse in order to enhance data quality. The research investigated some existing approaches and frameworks for data cleansing. It attempted to address the gaps identified in some of those approaches and proposed a conceptual framework to overcome their weaknesses. This novel conceptual framework considers the data cleansing process from the point at which data is obtained to the point at which it is maintained, using a periodic automatic cleansing approach.
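
As a rough illustration of what identifying corrupt and duplicate data can look like in practice, the following minimal Python sketch separates clean records from rejected ones. It is not the framework proposed in the paper: the field names in `REQUIRED_FIELDS`, the duplicate key (name plus email), and the helper names `normalize`, `is_corrupt`, and `cleanse` are all hypothetical choices made for this example.

```python
from typing import Iterable

# Hypothetical schema for illustration only; not taken from the paper.
REQUIRED_FIELDS = ("id", "name", "email")


def normalize(record: dict) -> dict:
    """Trim and lowercase string values so near-identical rows compare equal."""
    return {
        key: value.strip().lower() if isinstance(value, str) else value
        for key, value in record.items()
    }


def is_corrupt(record: dict) -> bool:
    """Treat a record as corrupt when a required field is missing or empty."""
    return any(record.get(field) in (None, "") for field in REQUIRED_FIELDS)


def cleanse(records: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Return (clean, rejected); rejected holds corrupt and duplicate records."""
    clean, rejected, seen = [], [], set()
    for raw in records:
        record = normalize(raw)
        # Assumed duplicate key: two rows with the same name and email match.
        key = (record.get("name"), record.get("email"))
        if is_corrupt(record) or key in seen:
            rejected.append(raw)  # keep the original row for later review
            continue
        seen.add(key)
        clean.append(record)
    return clean, rejected


rows = [
    {"id": 1, "name": "Ada Lovelace", "email": "ada@example.org"},
    {"id": 2, "name": " ada lovelace ", "email": "ADA@example.org"},  # duplicate
    {"id": 3, "name": "", "email": "nobody@example.org"},  # empty required field
]
clean, rejected = cleanse(rows)
```

A periodic automatic cleansing step, as mentioned in the abstract, could in principle call a function like `cleanse` on a schedule; how the paper's framework actually schedules or performs that step is not specified here.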

Index Terms

Computer Science
Information Sciences

Keywords

Conceptual Framework, data cleansing process, gap analysis, dirty data