Research Article

Detection and Prediction of Phishing Websites using Classification Mining Techniques

by Mofleh Al-diabat
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 147 - Number 5
Year of Publication: 2016
Authors: Mofleh Al-diabat
10.5120/ijca2016911061

Mofleh Al-diabat. Detection and Prediction of Phishing Websites using Classification Mining Techniques. International Journal of Computer Applications. 147, 5 (Aug 2016), 5-11. DOI=10.5120/ijca2016911061

@article{ 10.5120/ijca2016911061,
author = { Mofleh Al-diabat },
title = { Detection and Prediction of Phishing Websites using Classification Mining Techniques },
journal = { International Journal of Computer Applications },
issue_date = { Aug 2016 },
volume = { 147 },
number = { 5 },
month = { Aug },
year = { 2016 },
issn = { 0975-8887 },
pages = { 5-11 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume147/number5/25647-2016911061/ },
doi = { 10.5120/ijca2016911061 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:51:03.831439+05:30
%A Mofleh Al-diabat
%T Detection and Prediction of Phishing Websites using Classification Mining Techniques
%J International Journal of Computer Applications
%@ 0975-8887
%V 147
%N 5
%P 5-11
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Phishing is a serious web security problem that involves mimicking legitimate websites to deceive online users and steal their sensitive information. Phishing detection can be treated as a typical classification problem in data mining, where the classifier is constructed from a large number of website features. There is high demand for identifying the set of features that, when mined, most improves the predictive accuracy of the classifiers. This paper investigates feature selection, aiming to determine the most effective set of features in terms of classification performance. We compare two known feature selection methods in order to determine the smallest set of features for phishing detection using data mining. Experimental tests on a data set with a large number of features have been carried out using the Information Gain and Correlation Features Set methods. Further, two data mining algorithms, namely C4.5 and IREP, have been trained on different sets of selected features to show the pros and cons of the feature selection process. We have been able to identify new knowledge in the form of rules that show vital correlations among significant features.
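As an illustration of the Information Gain criterion named in the abstract, the following is a minimal Python sketch of ranking discrete features by how much they reduce the entropy of the class label. It is not the paper's implementation: the feature names (`has_ip_address`, `long_url`), the labels, and the toy values are all invented for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data, invented for illustration only (not taken from the paper's data set).
labels = ["phish", "phish", "legit", "legit"]
features = {
    "has_ip_address": [1, 1, 0, 0],  # hypothetical feature that separates the classes
    "long_url": [1, 0, 1, 0],        # hypothetical feature that carries no class signal
}

# Rank features from most to least informative.
ranked = sorted(
    features,
    key=lambda name: information_gain(features[name], labels),
    reverse=True,
)
print(ranked)
```

A feature selection step of this kind would keep only the top-ranked features before training a classifier such as a decision tree or a rule learner. How many features to keep is a separate choice that this sketch does not address.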

Index Terms

Computer Science
Information Sciences

Keywords

Classification Accuracy, Website Security, Data mining, Feature Assessment, Phishing