Research Article

Exploring Machine Learning Utilization using Real-Life Dataset for Influenza Pandemic

by Shahid Hussain, Ubaida Fatima
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 65
Year of Publication: 2025
Authors: Shahid Hussain, Ubaida Fatima
DOI: 10.5120/ijca2025924411

Shahid Hussain, Ubaida Fatima. Exploring Machine Learning Utilization using Real-Life Dataset for Influenza Pandemic. International Journal of Computer Applications. 186, 65 (Feb 2025), 8-18. DOI=10.5120/ijca2025924411

@article{ 10.5120/ijca2025924411,
author = { Shahid Hussain and Ubaida Fatima },
title = { Exploring Machine Learning Utilization using Real-Life Dataset for Influenza Pandemic },
journal = { International Journal of Computer Applications },
issue_date = { Feb 2025 },
volume = { 186 },
number = { 65 },
month = { Feb },
year = { 2025 },
issn = { 0975-8887 },
pages = { 8-18 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume186/number65/exploring-machine-learning-utilization-using-real-life-dataset-for-influenza-pandemic/ },
doi = { 10.5120/ijca2025924411 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2025-02-03T23:25:44.041530+05:30
%A Shahid Hussain
%A Ubaida Fatima
%T Exploring Machine Learning Utilization using Real-Life Dataset for Influenza Pandemic
%J International Journal of Computer Applications
%@ 0975-8887
%V 186
%N 65
%P 8-18
%D 2025
%I Foundation of Computer Science (FCS), NY, USA
Abstract

An accurate system for monitoring influenza outbreaks is needed so that infected people can be treated and recover effectively. Forecasting plays an important role in reducing the spread of future influenza outbreaks. Influenza A is a type of disease found in animals that is transmitted to humans through pigs. It became a pandemic in Spain; approximately one third of the human population and one quarter of the pig population died. In 2009, influenza A again spread rapidly as a pandemic and caused millions of deaths. A variety of studies have examined data obtained from the World Health Organization and from local hospitals at the country level. This research work is based on mathematical biology and applies data science techniques from the domain of machine learning. It proposes a modeling scheme for predicting influenza pandemics and their different classifications and types, such as H1N1 and B-Victoria, using machine learning regression and classification algorithms: Logistic Regression (LR); Support Vector Machines (SVM) with Linear, Polynomial and RBF kernels; Naïve Bayes (NB); and Random Forest (RF). These algorithms are used to predict influenza disease and its outbreaks, including which influenza type became pandemic and the populated area infected. Across the SVM kernels, the Polynomial and Linear kernels had approximately the same accuracy scores, while the RBF kernel was not the best fit for the considered influenza datasets. In terms of overall performance, RF had the highest average accuracy score at 74%, followed by LR at 72%. Of the ML algorithms considered, Random Forest performed most effectively and was found to be the best-fitted algorithm for the considered datasets.
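
The comparison described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic stand-in data from `make_classification`, not the influenza datasets of the paper. The feature count, class count, 80/20 split, feature scaling, and all hyperparameters are assumptions chosen for illustration, so the printed accuracies say nothing about the reported 74% and 72% figures.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): three classes play the role of
# influenza types; the paper's real datasets are not used here.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, n_classes=3, random_state=0
)

# Hold out 20% of the samples for testing (split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The six classifiers named in the abstract; scaling is added for LR and SVM.
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM (linear)": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    "SVM (polynomial)": make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3)),
    "SVM (RBF)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "NB": GaussianNB(),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Rank the models by test accuracy, highest first.
best = max(scores, key=scores.get)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2f}")
```

To approximate an average score per algorithm, the same loop could be repeated over several datasets and the per-model accuracies averaged.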

Index Terms

Computer Science
Information Sciences

Keywords

Influenza pandemic, Forecasting model, H1N1, Influenza, Data science, Biology, Logistic Regression, SVM Linear, SVM Polynomial, SVM RBF kernel, Naïve Bayes, Random Forest