Research Article

Detection and Classification of Legitimate and Spam Emails using K-Nearest Neighbor Augmented with Quadratic Sieve Algorithm

by Jumoke Soyemi, Mudasiru Hammed
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 175 - Number 18
Year of Publication: 2020
Authors: Jumoke Soyemi, Mudasiru Hammed
10.5120/ijca2020920700

Jumoke Soyemi, Mudasiru Hammed. Detection and Classification of Legitimate and Spam Emails using K-Nearest Neighbor Augmented with Quadratic Sieve Algorithm. International Journal of Computer Applications. 175, 18 (Sep 2020), 28-32. DOI=10.5120/ijca2020920700

@article{ 10.5120/ijca2020920700,
author = { Jumoke Soyemi and Mudasiru Hammed },
title = { Detection and Classification of Legitimate and Spam Emails using K-Nearest Neighbor Augmented with Quadratic Sieve Algorithm },
journal = { International Journal of Computer Applications },
issue_date = { Sep 2020 },
volume = { 175 },
number = { 18 },
month = { Sep },
year = { 2020 },
issn = { 0975-8887 },
pages = { 28-32 },
numpages = {5},
url = { https://ijcaonline.org/archives/volume175/number18/31554-2020920700/ },
doi = { 10.5120/ijca2020920700 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:25:24.537017+05:30
%A Jumoke Soyemi
%A Mudasiru Hammed
%T Detection and Classification of Legitimate and Spam Emails using K-Nearest Neighbor Augmented with Quadratic Sieve Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 175
%N 18
%P 28-32
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Email spam is a major problem on today's internet: it endangers financial institutions and poses a threat to individual users. Various techniques have been proposed to prevent email spam; however, classification and filtering techniques based on machine intelligence are the most efficient among them. This study employed a K-Nearest Neighbor (KNN) classifier augmented with the Quadratic Sieve algorithm to detect and classify legitimate and spam emails. The sieve algorithm revealed all the prime numbers in the datasets used, starting from the input dataset, to reduce errors that may cause an imbalance in the classification. The results show that KNN augmented with the Quadratic Sieve algorithm detects and correctly classifies both legitimate and spam emails more effectively.
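The KNN step mentioned in the abstract can be illustrated with a minimal sketch. This is plain KNN with Euclidean distance and majority voting, not the authors' augmented method: the abstract does not specify how the sieve algorithm is combined with KNN, so that part is left out. The function name `knn_predict`, the toy data in `TRAIN`, and the two word-count features are all hypothetical and chosen only for illustration.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    # Sort training points by distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of those neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (count of "free", count of "meeting") per email.
TRAIN = [
    ((5, 0), "spam"),
    ((4, 1), "spam"),
    ((6, 0), "spam"),
    ((0, 3), "legitimate"),
    ((1, 4), "legitimate"),
    ((0, 5), "legitimate"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, (5, 1)))
    print(knn_predict(TRAIN, (0, 4)))
```

An odd `k` is used here so that a two-class vote cannot tie; a real system would derive features from the email text and choose `k` by cross-validation, which the keywords suggest the study also used.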

Index Terms

Computer Science
Information Sciences

Keywords

Spam email
K-Nearest Neighbor
Cross-Validation
Quadratic Sieve Algorithm