Research Article

Hepatitis-C Classification using Data Mining Techniques

by Huda Yasin, Tahseen A. Jilani, Madiha Danish
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 24 - Number 3
Year of Publication: 2011
Authors: Huda Yasin, Tahseen A. Jilani, Madiha Danish
DOI: 10.5120/2934-3888

Huda Yasin, Tahseen A. Jilani, Madiha Danish. Hepatitis-C Classification using Data Mining Techniques. International Journal of Computer Applications. 24, 3 (June 2011), 1-6. DOI=10.5120/2934-3888

@article{ 10.5120/2934-3888,
author = { Huda Yasin and Tahseen A. Jilani and Madiha Danish },
title = { Hepatitis-C Classification using Data Mining Techniques },
journal = { International Journal of Computer Applications },
issue_date = { June 2011 },
volume = { 24 },
number = { 3 },
month = { June },
year = { 2011 },
issn = { 0975-8887 },
pages = { 1-6 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume24/number3/2934-3888/ },
doi = { 10.5120/2934-3888 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:09:59.838364+05:30
%A Huda Yasin
%A Tahseen A. Jilani
%A Madiha Danish
%T Hepatitis-C Classification using Data Mining Techniques
%J International Journal of Computer Applications
%@ 0975-8887
%V 24
%N 3
%P 1-6
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In this paper, we examine factors that contribute significantly to increasing the risk of hepatitis-C virus. The dataset is taken from the University of California's machine learning repository. It contains nineteen features along with a binary class feature. Of these twenty attributes, 15 (including the class attribute) are binary and 5 are continuous. The dataset contains 155 records. To overcome the missing-values problem, data normalization techniques are applied. First, the dimension of the problem is reduced. Next, binary logistic regression is applied to classify the cases, using qualitative and quantitative approaches for data reduction. The three-stage procedure achieves a classification accuracy of more than 89%. The proposed approach has low feature complexity with a good classification rate, since it uses only 37% of the total fields.
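
The three stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It runs on synthetic data rather than the repository's hepatitis records, uses mean imputation plus z-score scaling for the normalization step, and uses a simple correlation filter as a stand-in for the dimension reduction (the keywords name principal component analysis, which is not reproduced here). The classifier is binary logistic regression fitted by plain gradient descent. Keeping 7 of 19 columns (about 37%) is likewise an assumption, and all function names are hypothetical.

```python
import math
import random


def impute_and_standardize(rows):
    """Stage 1: fill missing values (None) with the column mean, then z-score."""
    cleaned = [list(r) for r in rows]
    for j in range(len(rows[0])):
        present = [r[j] for r in rows if r[j] is not None]
        mean = sum(present) / len(present)
        var = sum((v - mean) ** 2 for v in present) / len(present)
        std = math.sqrt(var) or 1.0  # guard against constant columns
        for r in cleaned:
            value = mean if r[j] is None else r[j]
            r[j] = (value - mean) / std
    return cleaned


def top_k_by_correlation(X, y, k):
    """Stage 2 (stand-in for PCA): keep the k columns most correlated with y."""
    n = len(X)
    y_mean = sum(y) / n
    y_dev = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(d * d for d in y_dev))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        c_dev = [v - c_mean for v in col]
        denom = math.sqrt(sum(d * d for d in c_dev)) * y_norm
        corr = sum(a * b for a, b in zip(c_dev, y_dev)) / denom if denom else 0.0
        scores.append((abs(corr), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Stage 3: batch gradient descent; returns [bias, w_1, ..., w_d]."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for row, target in zip(X, y):
            err = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))) - target
            grad[0] += err
            for j, xi in enumerate(row):
                grad[j + 1] += err * xi
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def predict(w, row):
    """Classify one row with a 0.5 probability threshold."""
    return 1 if sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))) >= 0.5 else 0


# Synthetic stand-in data (assumption): 155 rows, 19 features, about 5% missing.
rng = random.Random(0)
N_ROWS, N_FEATURES, N_KEPT = 155, 19, 7
labels = [rng.randint(0, 1) for _ in range(N_ROWS)]
raw = []
for label in labels:
    row = [rng.gauss(3.0 * label, 1.0)]  # informative continuous column
    row.append(1.0 if rng.random() < (0.8 if label else 0.2) else 0.0)  # informative binary
    row += [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES - 2)]  # noise columns
    raw.append([None if rng.random() < 0.05 else v for v in row])

X = impute_and_standardize(raw)
kept = top_k_by_correlation(X, labels, N_KEPT)
X_reduced = [[row[j] for j in kept] for row in X]
weights = fit_logistic(X_reduced, labels)
accuracy = sum(predict(weights, r) == t for r, t in zip(X_reduced, labels)) / N_ROWS
print(f"kept columns: {kept}")
print(f"training accuracy: {accuracy:.2f}")
```

The accuracy printed here is measured on the synthetic training rows and says nothing about the figure reported in the abstract. A faithful reproduction would need the original records, the authors' reduction method, and a held-out evaluation.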

Index Terms

Computer Science
Information Sciences

Keywords

Binary logistic regression analysis, data mining, hepatitis-C virus (HCV), principal component analysis