Research Article

Machine Learning-based Phishing URL Detection using Lexical and Structural Features

by Muneiah Tellakula, Phaneendra Kanduri, Sivamanikanta Reddy Ramireddy, UdaySankar Reddy Konche
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Number 96
Year of Publication: 2026
10.5120/ijcaebeeeb7bb333

Muneiah Tellakula, Phaneendra Kanduri, Sivamanikanta Reddy Ramireddy, UdaySankar Reddy Konche. Machine Learning-based Phishing URL Detection using Lexical and Structural Features. International Journal of Computer Applications. 187, 96 (Apr 2026), 1-5. DOI=10.5120/ijcaebeeeb7bb333

@article{ 10.5120/ijcaebeeeb7bb333,
author = { Muneiah Tellakula, Phaneendra Kanduri, Sivamanikanta Reddy Ramireddy, UdaySankar Reddy Konche },
title = { Machine Learning-based Phishing URL Detection using Lexical and Structural Features },
journal = { International Journal of Computer Applications },
issue_date = { Apr 2026 },
volume = { 187 },
number = { 96 },
month = { Apr },
year = { 2026 },
issn = { 0975-8887 },
pages = { 1-5 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume187/number96/machine-learning-based-phishing-url-detection-using-lexical-and-structural-features/ },
doi = { 10.5120/ijcaebeeeb7bb333 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2026-04-19T00:40:22.810206+05:30
%A Muneiah Tellakula
%A Phaneendra Kanduri
%A Sivamanikanta Reddy Ramireddy
%A UdaySankar Reddy Konche
%T Machine Learning-based Phishing URL Detection using Lexical and Structural Features
%J International Journal of Computer Applications
%@ 0975-8887
%V 187
%N 96
%P 1-5
%D 2026
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Phishing attacks, which use fraudulent websites to harvest sensitive information from users, remain one of the major threats in cybersecurity. This work proposes a machine learning method for detecting phishing URLs from the lexical and structural characteristics of the URLs themselves. More than 100,000 URLs from the dataset were encoded as sixteen hand-crafted features capturing domain-level, path-level, and character-level information. A Random Forest classifier with balanced class weights was used to mitigate class imbalance. The experimental results show that the proposed model classifies a given URL as phishing or legitimate with high accuracy and with equal precision and recall. The method has lower computational complexity than deep learning techniques while performing competitively with them, which makes it suitable for real-time phishing prevention systems.
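
The abstract does not list the sixteen features, so the following is only a minimal sketch of what URL-only feature encoding and balanced class weighting could look like. The feature names, the helper functions (`extract_features`, `shannon_entropy`, `is_ipv4`, `balanced_class_weights`) and the example URL are all illustrative assumptions, not the authors' implementation. The weighting formula is the common "balanced" heuristic, n_samples / (n_classes * class_count).

```python
import ipaddress
import math
from collections import Counter
from urllib.parse import urlsplit


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits (0.0 for empty input)."""
    if not text:
        return 0.0
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in Counter(text).values())


def is_ipv4(host: str) -> bool:
    """True when the host is a literal IPv4 address rather than a name."""
    try:
        ipaddress.IPv4Address(host)
        return True
    except ValueError:
        return False


def extract_features(url: str) -> dict:
    """Hypothetical lexical/structural features; NOT the paper's sixteen."""
    # Assume http when the scheme is missing so the host is parsed correctly.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    path = parts.path or ""
    digits = sum(ch.isdigit() for ch in url)
    return {
        # Domain-level information
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "host_is_ipv4": int(is_ipv4(host)),
        "uses_https": int(parts.scheme == "https"),
        # Path-level information
        "path_length": len(path),
        "path_depth": len([seg for seg in path.split("/") if seg]),
        # Character-level information
        "url_length": len(url),
        "num_digits": digits,
        "digit_ratio": digits / len(url) if url else 0.0,
        "has_at_symbol": int("@" in url),
        "char_entropy": shannon_entropy(url),
    }


def balanced_class_weights(labels: list) -> dict:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    return {cls: n / (len(counts) * cnt) for cls, cnt in counts.items()}


if __name__ == "__main__":
    # Made-up URL, used only to show the shape of the output.
    print(extract_features("http://paypal.com@192.168.0.1/login/verify"))
    # Three legitimate (0) samples and one phishing (1) sample.
    print(balanced_class_weights([0, 0, 0, 1]))
```

Feature dictionaries of this kind can be stacked into a matrix and passed to a tree ensemble. In scikit-learn, for instance, `RandomForestClassifier(class_weight="balanced")` applies the same weighting heuristic internally, so the minority class contributes more to each split without resampling the data.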

Index Terms

Computer Science
Information Sciences

Keywords

Phishing Detection, URL Classification, Machine Learning, Random Forest, Lexical Features, Cybersecurity