Research Article

Comparative Analysis of Tree-based Intrusion Detection Modelling and Machine Learning Classification Models using Cyber-Security Dataset

by Motlatso Mokoele, Sello Mokwena
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 13
Year of Publication: 2024
Authors: Motlatso Mokoele, Sello Mokwena
10.5120/ijca2024923480

Motlatso Mokoele, Sello Mokwena. Comparative Analysis of Tree-based Intrusion Detection Modelling and Machine Learning Classification Models using Cyber-Security Dataset. International Journal of Computer Applications. 186, 13 (Mar 2024), 33-40. DOI=10.5120/ijca2024923480

@article{ 10.5120/ijca2024923480,
author = { Motlatso Mokoele and Sello Mokwena },
title = { Comparative Analysis of Tree-based Intrusion Detection Modelling and Machine Learning Classification Models using Cyber-Security Dataset },
journal = { International Journal of Computer Applications },
issue_date = { Mar 2024 },
volume = { 186 },
number = { 13 },
month = { Mar },
year = { 2024 },
issn = { 0975-8887 },
pages = { 33-40 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume186/number13/comparative-analysis-of-tree-based-intrusion-detection-modelling-and-machine-learning-classification-models-using-cyber-security-dataset/ },
doi = { 10.5120/ijca2024923480 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-03-27T00:44:38.441985+05:30
%A Motlatso Mokoele
%A Sello Mokwena
%T Comparative Analysis of Tree-based Intrusion Detection Modelling and Machine Learning Classification Models using Cyber-Security Dataset
%J International Journal of Computer Applications
%@ 0975-8887
%V 186
%N 13
%P 33-40
%D 2024
%I Foundation of Computer Science (FCS), NY, USA
Abstract

One of the critical problems organizations encounter is the increasing prevalence of cyber-criminals exploiting vulnerabilities, leading to identity theft. This breach of privacy not only threatens an organization’s financial assets but can also have long-lasting consequences, such as damaged reputations and legal implications. To address these issues, this study presents a thorough comparative analysis between a tree-based intrusion detection model and popular machine learning classifiers using the well-established KDD99 dataset. The approach leverages a hybrid feature selection method that integrates the Gini index and information gain within a decision tree framework to enhance model efficiency. Evaluation metrics encompass precision, recall, F1 score, the confusion matrix, and execution time. Rigorous dataset preprocessing removes noise and biases. The findings reveal the strengths and weaknesses of each model and emphasize the efficacy of the hybrid feature selection method in tree-based models. This study offers guidance for cybersecurity professionals, helping them select models based on specific performance criteria. Ultimately, the research contributes to the advancement of intrusion detection techniques and highlights areas for further exploration in the pursuit of more efficient and accurate intrusion detection systems.
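The two criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the study combines them, so the equal-weight average in `hybrid_rank`, the function names, and the toy records below are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_gain(values, labels, impurity):
    """Impurity reduction obtained by partitioning labels on a categorical feature."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    weighted = sum(len(g) / n * impurity(g) for g in groups.values())
    return impurity(labels) - weighted


def hybrid_rank(features, labels):
    """Rank features by an equal-weight average of Gini reduction and
    information gain. The equal weighting is an assumption for this sketch."""
    scores = {
        name: 0.5 * split_gain(col, labels, gini)
        + 0.5 * split_gain(col, labels, entropy)
        for name, col in features.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy records (hypothetical, not taken from KDD99).
labels = ["attack", "attack", "normal", "normal"]
features = {
    "protocol": ["icmp", "icmp", "tcp", "tcp"],  # separates the classes perfectly
    "flag": ["SF", "S0", "SF", "S0"],            # carries no class information
}
ranking = hybrid_rank(features, labels)
print(ranking)
```

In this toy case both criteria agree that `protocol` is the more informative feature. On real data the two scores can rank features differently, which is the usual motivation for combining them.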

Index Terms

Computer Science
Information Sciences
Cyber Threats
Data Preprocessing
Evaluation Metrics
Classification Models
Digital Landscape
Denial-of-Service
Internet of Things

Keywords

Cybersecurity, intrusion detection, machine learning, hybrid feature selection, tree-based intrusion detection modeling, Gini index, Information Gain