Malicious URL Detection and Identification

Anjali B. Sayamber; Arati M. Dixit

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 99 - Number 17

Year of Publication: 2014

Authors: Anjali B. Sayamber, Arati M. Dixit

10.5120/17464-8247

Anjali B. Sayamber, Arati M. Dixit. Malicious URL Detection and Identification. International Journal of Computer Applications. 99, 17 (August 2014), 17-23. DOI=10.5120/17464-8247

@article{ 10.5120/17464-8247,
author = { Anjali B. Sayamber and Arati M. Dixit },
title = { Malicious URL Detection and Identification },
journal = { International Journal of Computer Applications },
issue_date = { August 2014 },
volume = { 99 },
number = { 17 },
month = { August },
year = { 2014 },
issn = { 0975-8887 },
pages = { 17-23 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume99/number17/17464-8247/ },
doi = { 10.5120/17464-8247 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article

%1 2024-02-06T22:28:27.060954+05:30

%A Anjali B. Sayamber

%A Arati M. Dixit

%T Malicious URL Detection and Identification

%J International Journal of Computer Applications

%@ 0975-8887

%V 99

%N 17

%P 17-23

%D 2014

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Malicious links serve as distribution channels for spreading malware across the Web. These links are instrumental in giving attackers partial or full control of a system. Victim systems are easily infected, and attackers can then use them for cyber crimes such as credential theft, spamming, phishing, and denial-of-service attacks. To counter such crimes, detection systems should be fast and precise, with the ability to detect new malicious content. This paper introduces various aspects of the URL (Uniform Resource Locator) classification process, which recognizes whether a target website is malicious or benign. Standard datasets from different sources are used for training. The rising problem of spamming, phishing, and malware has generated a need for a reliable framework that can classify and further identify malicious URLs. An alternative approach is proposed that uses a Naive Bayes classifier for automated classification and detection of malicious URLs. The proposed model, based on Naive Bayes, is supported by clustering and classification techniques. Naive Bayes models, however, are rarely used for general probabilistic learning and inference, that is, for estimating conditional and marginal distributions. The work in this paper shows that, for a wide range of benchmark datasets, Naive Bayes models learned using a probability model have better accuracy than Support Vector Machine models.
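To make the idea of Naive Bayes URL classification concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify the feature set, so this sketch assumes simple lexical tokens taken from the URL string, a multinomial Naive Bayes model, and Laplace smoothing. The class name `NaiveBayesURLClassifier`, the helper `tokenize`, and the six training URLs (all on reserved example domains) are hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens (an assumed feature set)."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


class NaiveBayesURLClassifier:
    """Multinomial Naive Bayes over URL tokens with Laplace smoothing (illustrative)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # smoothing constant
        self.class_counts = Counter()             # number of training URLs per class
        self.token_counts = defaultdict(Counter)  # token frequencies per class
        self.vocab = set()                        # all tokens seen during training

    def fit(self, urls, labels):
        for url, label in zip(urls, labels):
            self.class_counts[label] += 1
            for tok in tokenize(url):
                self.token_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def log_scores(self, url):
        """Return log P(class) + sum of log P(token | class) for each class."""
        total_urls = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        tokens = [t for t in tokenize(url) if t in self.vocab]  # skip unseen tokens
        scores = {}
        for label, n_urls in self.class_counts.items():
            n_tokens = sum(self.token_counts[label].values())
            score = math.log(n_urls / total_urls)
            for tok in tokens:
                count = self.token_counts[label][tok]
                score += math.log(
                    (count + self.alpha) / (n_tokens + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, url):
        scores = self.log_scores(url)
        return max(scores, key=scores.get)


# Hypothetical toy training set; a real system would use labeled benchmark data.
train_urls = [
    "http://example.com/docs/index.html",
    "https://example.org/news/article",
    "http://example.net/products/list",
    "http://login-verify.example.test/secure/update/account",
    "http://free-prize.example.test/login/verify",
    "http://example.test/account/update/password/confirm",
]
train_labels = ["benign", "benign", "benign", "malicious", "malicious", "malicious"]

clf = NaiveBayesURLClassifier().fit(train_urls, train_labels)
print(clf.predict("http://example.test/verify/account/login"))
print(clf.predict("https://example.org/docs/news"))
```

Working in log space avoids numerical underflow when many token probabilities are multiplied, and the smoothing constant keeps a token that never appeared in one class from forcing that class's probability to zero. A practical detector would likely add host-based or content-based features, which this sketch leaves out.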

Index Terms

Computer Science

Information Sciences

Keywords

Machine Learning, Feature Extraction, Benign, Malicious Web Pages, Classification Module, Web-Based Attacks