Scalable Algorithms for Missing Value Imputation

Marghny H. Mohamed; Abdel-rahiem A. Hashem; Mohammed M. Abdelsamea

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 87 - Number 11

Year of Publication: 2014

Authors: Marghny H. Mohamed, Abdel-rahiem A. Hashem, Mohammed M. Abdelsamea

DOI: 10.5120/15255-4019

Marghny H. Mohamed, Abdel-rahiem A. Hashem, Mohammed M. Abdelsamea. Scalable Algorithms for Missing Value Imputation. International Journal of Computer Applications. 87, 11 (February 2014), 35-42. DOI=10.5120/15255-4019

@article{10.5120/15255-4019,
  author = {Marghny H. Mohamed and Abdel-rahiem A. Hashem and Mohammed M. Abdelsamea},
  title = {Scalable Algorithms for Missing Value Imputation},
  journal = {International Journal of Computer Applications},
  issue_date = {February 2014},
  volume = {87},
  number = {11},
  month = {February},
  year = {2014},
  issn = {0975-8887},
  pages = {35-42},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume87/number11/15255-4019/},
  doi = {10.5120/15255-4019},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article

%1 2024-02-06T22:05:40.760144+05:30

%A Marghny H. Mohamed

%A Abdel-rahiem A. Hashem

%A Mohammed M. Abdelsamea

%T Scalable Algorithms for Missing Value Imputation

%J International Journal of Computer Applications

%@ 0975-8887

%V 87

%N 11

%P 35-42

%D 2014

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Statistical imputation techniques have been proposed mainly to predict the missing values in incomplete data sets, an essential step in any data analysis framework. K-means-based imputation, a representative statistical imputation method, has produced satisfactory results in terms of effectiveness and efficiency on popular, freely available data sets (e.g., Bupa, Breast Cancer, Pima). The main idea of K-means-based methods is to impute a missing value using the prototypes of the representative class and the similarity of the data. However, such methods share the limitations of K-means as a data mining technique. Motivated by these drawbacks, this paper introduces simple and efficient K-means-based imputation methods for handling missing data in various classes of data sets. The proposed methods give higher accuracy than standard K-means imputation.
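To make the baseline idea concrete, the following is a minimal sketch of generic K-means-based imputation: each incomplete record is assigned to its nearest cluster prototype using only its observed features, and its missing entries are filled from that prototype. It is not the authors' proposed method, which the abstract does not detail. The function name `kmeans_impute`, its parameters, the column-mean initialisation, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np


def kmeans_impute(X, k=2, n_iter=20, seed=0):
    """Fill NaN entries of X with the matching coordinate of the nearest centroid.

    Illustrative baseline only (hypothetical helper, not the paper's algorithm).
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)

    # Start from column-mean imputation so every row has a full set of coordinates.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(observed, X, col_means)

    # Pick k distinct rows as initial prototypes (fancy indexing returns a copy).
    rng = np.random.default_rng(seed)
    centroids = filled[rng.choice(len(filled), size=k, replace=False)]

    for _ in range(n_iter):
        # Squared distance over observed features only; missing entries contribute 0.
        diff = (filled[:, None, :] - centroids[None, :, :]) * observed[:, None, :]
        labels = np.argmin((diff ** 2).sum(axis=2), axis=1)

        # Update each centroid as the mean of its members; keep the old one if empty.
        for j in range(k):
            members = filled[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)

        # Re-impute the missing entries from the assigned centroid.
        filled = np.where(observed, X, centroids[labels])

    return filled, labels


if __name__ == "__main__":
    # Toy data with two visible groups and two missing entries (assumed example).
    X = np.array([
        [1.0, 1.0],
        [1.2, 0.8],
        [0.9, np.nan],
        [10.0, 10.0],
        [10.2, 9.8],
        [np.nan, 10.1],
    ])
    filled, labels = kmeans_impute(X, k=2)
    print(filled)
    print(labels)
```

Because the prototypes are initialised at random and K-means converges only to a local optimum, the imputed values can depend on the seed and on the choice of k. This sensitivity is one of the limitations of K-means that the abstract alludes to.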

Index Terms

Computer Science

Information Sciences

Keywords

Statistical Imputation, Clustering, K-means