Research Article

An Ensemble approach on Missing Value Handling in Hepatitis Disease Dataset

by Sridevi Radhakrishnan, D. Shanmuga Priyaa
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 130 - Number 17
Year of Publication: 2015
Authors: Sridevi Radhakrishnan, D. Shanmuga Priyaa
10.5120/ijca2015907197

Sridevi Radhakrishnan, D. Shanmuga Priyaa. An Ensemble approach on Missing Value Handling in Hepatitis Disease Dataset. International Journal of Computer Applications. 130, 17 (November 2015), 23-27. DOI=10.5120/ijca2015907197

@article{ 10.5120/ijca2015907197,
author = { Sridevi Radhakrishnan and D. Shanmuga Priyaa },
title = { An Ensemble approach on Missing Value Handling in Hepatitis Disease Dataset },
journal = { International Journal of Computer Applications },
issue_date = { November 2015 },
volume = { 130 },
number = { 17 },
month = { November },
year = { 2015 },
issn = { 0975-8887 },
pages = { 23-27 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume130/number17/23302-2015907197/ },
doi = { 10.5120/ijca2015907197 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:25:54.244601+05:30
%A Sridevi Radhakrishnan
%A D. Shanmuga Priyaa
%T An Ensemble approach on Missing Value Handling in Hepatitis Disease Dataset
%J International Journal of Computer Applications
%@ 0975-8887
%V 130
%N 17
%P 23-27
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Handling missing values is a major task in data pre-processing, one of the primary stages of data mining, and it is central to hepatitis disease diagnosis. Health datasets are typically incomplete. Simply removing the incomplete cases from the original dataset can create more problems than it solves. An appropriate missing value imputation technique can help produce high-quality datasets for better analysis in clinical trials. This paper investigates the use of a machine learning technique as a missing value imputation method for incomplete hepatitis data. Mean/mode imputation, ID3 algorithm imputation, decision tree imputation, and the proposed bootstrap aggregation (bagging) based imputation are each used to fill in missing values, and the resulting datasets are classified using KNN. The experiments show that classifier performance improves when the bagging-based imputation algorithm is used to predict missing attribute values.
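
As a rough illustration of two of the imputation strategies named in the abstract, the following Python sketch contrasts mean/mode imputation with a bagging-style imputation. It is not the authors' implementation: the abstract does not specify the base learner or its settings, so a 1-nearest-neighbour learner is swapped in here for brevity, and the function names, the toy rows, and the parameters `n_estimators` and `seed` are all assumptions made for this example. Predictor columns are assumed to be numeric.

```python
import random
from collections import Counter
from statistics import mean


def is_numeric(values):
    return all(isinstance(v, (int, float)) for v in values)


def mean_mode_impute(rows):
    """Fill each None with the column mean (numeric) or mode (categorical).

    Assumes every column has at least one observed value.
    """
    filled = [list(r) for r in rows]
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        if is_numeric(observed):
            fill = mean(observed)
        else:
            fill = Counter(observed).most_common(1)[0][0]
        for r in filled:
            if r[j] is None:
                r[j] = fill
    return filled


def nearest_value(train, query, target):
    """Stand-in base learner (1-NN): target value of the closest training row."""
    def dist(row):
        return sum((row[j] - query[j]) ** 2
                   for j in range(len(row)) if j != target)
    return min(train, key=dist)[target]


def bagging_impute(rows, target, n_estimators=25, seed=0):
    """Impute column `target` by aggregating learners fit on bootstrap samples."""
    rng = random.Random(seed)
    # Complete the predictor columns first so distances can be computed.
    filled = mean_mode_impute(rows)
    known = [f for f, r in zip(filled, rows) if r[target] is not None]
    numeric = is_numeric([k[target] for k in known])
    # Each bootstrap sample draws len(known) rows with replacement.
    samples = [[rng.choice(known) for _ in known] for _ in range(n_estimators)]
    result = [list(f) for f in filled]
    for i, r in enumerate(rows):
        if r[target] is None:
            votes = [nearest_value(s, filled[i], target) for s in samples]
            # Average numeric predictions; take a majority vote otherwise.
            result[i][target] = (mean(votes) if numeric
                                 else Counter(votes).most_common(1)[0][0])
    return result


# Hypothetical toy data: two numeric predictors and a categorical attribute.
toy_rows = [
    [1.0, 1.0, "live"], [1.2, 0.9, "live"], [0.9, 1.1, "live"],
    [5.0, 5.0, "die"], [5.2, 4.9, "die"], [4.9, 5.1, "die"],
    [1.1, 1.0, None], [5.1, 5.0, None],
]
completed = bagging_impute(toy_rows, target=2)
```

Mean/mode imputation assigns the same value to every missing entry in a column, whereas the bagged learner predicts each missing value from the other attributes of that record. In a pipeline like the one the abstract describes, each completed dataset would then be passed to a KNN classifier for comparison.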

Index Terms

Computer Science
Information Sciences

Keywords

data mining, prediction, KNN, imputation, missing values, bagging, bootstrap