Research Article

Multifaceted Computational Framework for COVID-19 Variant Classification using Advanced Machine Learning, Signal Processing, and High-Dimensional Feature Reduction Techniques

by Love Fadia, Vatsal Shah, Mohammad Hassanzadeh, Majid Ahmadi, Jonathan Wu
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 70
Year of Publication: 2025
DOI: 10.5120/ijca2025924499

Love Fadia, Vatsal Shah, Mohammad Hassanzadeh, Majid Ahmadi, Jonathan Wu. Multifaceted Computational Framework for COVID-19 Variant Classification using Advanced Machine Learning, Signal Processing, and High-Dimensional Feature Reduction Techniques. International Journal of Computer Applications. 186, 70 (Mar 2025), 1-8. DOI=10.5120/ijca2025924499

@article{ 10.5120/ijca2025924499,
author = { Love Fadia and Vatsal Shah and Mohammad Hassanzadeh and Majid Ahmadi and Jonathan Wu },
title = { Multifaceted Computational Framework for COVID-19 Variant Classification using Advanced Machine Learning, Signal Processing, and High-Dimensional Feature Reduction Techniques },
journal = { International Journal of Computer Applications },
issue_date = { Mar 2025 },
volume = { 186 },
number = { 70 },
month = { Mar },
year = { 2025 },
issn = { 0975-8887 },
pages = { 1-8 },
numpages = {8},
url = { https://ijcaonline.org/archives/volume186/number70/multifaceted-computational-framework-for-covid-19-variant-classification-using-advanced-machine-learning-signal-processing-and-high-dimensional-feature-reduction-techniques/ },
doi = { 10.5120/ijca2025924499 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2025-03-01T12:38:59.232980+05:30
%A Love Fadia
%A Vatsal Shah
%A Mohammad Hassanzadeh
%A Majid Ahmadi
%A Jonathan Wu
%T Multifaceted Computational Framework for COVID-19 Variant Classification using Advanced Machine Learning, Signal Processing, and High-Dimensional Feature Reduction Techniques
%J International Journal of Computer Applications
%@ 0975-8887
%V 186
%N 70
%P 1-8
%D 2025
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The coronavirus pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), had an extensive global impact and caused widespread disruption to public health. Early and accurate identification of the virus and its strains is imperative for safeguarding lives. Over the past few years, a variety of machine learning and deep learning techniques have been used to classify genomic sequences. However, existing methods face several limitations. Many approaches struggle with dataset imbalance, leading to biased and unreliable models. Traditional neural network-based methods are computationally intensive, requiring significant time and resources. Moreover, existing techniques often fail to achieve consistently high classification accuracy across properly balanced datasets. To address these gaps, this article presents an efficient method for classifying the DNA sequences of coronavirus variants using a combination of machine learning and signal processing. The DNA sequences are first converted into numerical signals using Electron-Ion Interaction Potential, Numeric, and Complex coding techniques. Signal processing methods (Discrete Cosine Transform II, Discrete Cosine Transform III, Fast Fourier Transform, Haar Wavelet Transform, and Coiflet Wavelet Transform) are then applied to extract features from the coded data. The resulting high-dimensional feature set is reduced using Linear Discriminant Analysis and Principal Component Analysis. For the classification task, machine learning models such as Decision Tree, Support Vector Classifier, and a fusion of Light Gradient Boosting Machine, AdaBoost, and Random Forest are employed. The proposed approach achieves an accuracy of 99.8%, surpassing the state of the art, using a different combination of transformations with Numeric coding and a Voting Classifier.
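
The first two stages described in the abstract (numeric coding followed by a transform) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the commonly cited EIIP values for the four nucleotides, uses the unnormalized textbook form of the DCT-II, and keeps only the first few coefficients as features. The function names, the `n_coeffs` parameter, and the toy sequence are hypothetical, and the dimensionality reduction and classification stages are not shown.

```python
import math

# Commonly cited EIIP values for the four nucleotides (an assumption here;
# the abstract does not list the values the authors used).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_encode(sequence):
    """Map a DNA string to a list of EIIP values, skipping unknown symbols such as N."""
    return [EIIP[base] for base in sequence.upper() if base in EIIP]


def dct2(signal):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(signal)
    return [
        sum(x * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n, x in enumerate(signal))
        for k in range(n_samples)
    ]


def dct_features(sequence, n_coeffs=8):
    """Encode a sequence and keep its first n_coeffs DCT-II coefficients as features."""
    coeffs = dct2(eiip_encode(sequence))
    # Zero-pad so that every sequence yields a fixed-length feature vector.
    return (coeffs + [0.0] * n_coeffs)[:n_coeffs]


if __name__ == "__main__":
    toy_sequence = "ATGCGTNACGTT"  # hypothetical fragment, not real SARS-CoV-2 data
    print(dct_features(toy_sequence, n_coeffs=4))
```

The direct summation above costs O(N^2) per sequence, which is slow for genomes of roughly 30,000 nucleotides; in practice a fast routine such as `scipy.fft.dct` with `type=2` would likely be used instead, bearing in mind that its default scaling differs from this form by a factor of 2.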

Index Terms

Computer Science
Information Sciences

Keywords

Genomic Sequence Analysis, Signal Processing, Dimensionality Reduction, Machine Learning