Research Article

Malicious Web Page Detection and Content Analysis

by Vishal Jagtap, Vaibhav Shinde, Pratik Sapre, Kartik Karande, Ketaki Bhoyar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 174 - Number 31
Year of Publication: 2021
Authors: Vishal Jagtap, Vaibhav Shinde, Pratik Sapre, Kartik Karande, Ketaki Bhoyar
10.5120/ijca2021921249

Vishal Jagtap, Vaibhav Shinde, Pratik Sapre, Kartik Karande, Ketaki Bhoyar. Malicious Web Page Detection and Content Analysis. International Journal of Computer Applications. 174, 31 (Apr 2021), 10-13. DOI=10.5120/ijca2021921249

@article{ 10.5120/ijca2021921249,
author = { Vishal Jagtap and Vaibhav Shinde and Pratik Sapre and Kartik Karande and Ketaki Bhoyar },
title = { Malicious Web Page Detection and Content Analysis },
journal = { International Journal of Computer Applications },
issue_date = { Apr 2021 },
volume = { 174 },
number = { 31 },
month = { Apr },
year = { 2021 },
issn = { 0975-8887 },
pages = { 10-13 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume174/number31/31876-2021921249/ },
doi = { 10.5120/ijca2021921249 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:23:35.437268+05:30
%A Vishal Jagtap
%A Vaibhav Shinde
%A Pratik Sapre
%A Kartik Karande
%A Ketaki Bhoyar
%T Malicious Web Page Detection and Content Analysis
%J International Journal of Computer Applications
%@ 0975-8887
%V 174
%N 31
%P 10-13
%D 2021
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The detection of malicious web pages is a complex engineering problem due to the dynamic nature of the information contained on the internet. Since the data stored on web servers is updated continuously, it is very hard to find and classify in real time which links are malicious and which are not. Hence, brute-force checks (system scans) and voting-based approaches (blacklisting) fail to capture the full set of malicious content on the internet. A machine learning based model is proposed that classifies malicious links and content on the user's device. It can later be deployed as a web application, as Android and iOS mobile applications, and as a browser extension that reports on a link the user wants to open on a device. The whole system performs a complete scan of that link and generates a report.
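
The contrast the abstract draws between blacklisting and a learned model can be illustrated with a minimal Python sketch. It is not the authors' implementation: the blacklist contents, the function names, and the choice of features are all assumptions made for illustration, since the abstract does not specify which features or classifier the proposed system uses. The sketch shows only that an exact-match blacklist misses hosts it has not seen, while lexical features can be computed for any URL and passed to a trained classifier.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical blacklist; a real one would be a large, frequently updated feed.
BLACKLIST = {"bad.example.com"}


def is_blacklisted(url: str) -> bool:
    """Exact-host lookup: only catches hosts that were already reported."""
    return (urlparse(url).hostname or "") in BLACKLIST


def host_is_ip(host: str) -> bool:
    """Return True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def lexical_features(url: str) -> dict:
    """Illustrative URL features that a classifier could be trained on.

    The feature set is an assumption, not taken from the paper.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip(host),
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    unseen = "http://192.168.0.1/login@secure"
    print(is_blacklisted(unseen))    # the blacklist has no entry for this host
    print(lexical_features(unseen))  # features are still available for a model
```

In a full system, feature dictionaries like these would be converted to vectors and used to train a classifier on labelled URLs; that training step is omitted here because the abstract gives no details about it.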

Index Terms

Computer Science
Information Sciences

Keywords

Malicious Web Page