Research Article

A Novel Method for Detecting Spam Email using KNN Classification with Spearman Correlation as Distance Measure

by Ajay Sharma, Anil Suryawanshi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 136 - Number 6
Year of Publication: 2016
Authors: Ajay Sharma, Anil Suryawanshi
10.5120/ijca2016908471

Ajay Sharma, Anil Suryawanshi. A Novel Method for Detecting Spam Email using KNN Classification with Spearman Correlation as Distance Measure. International Journal of Computer Applications. 136, 6 (February 2016), 28-35. DOI=10.5120/ijca2016908471

@article{ 10.5120/ijca2016908471,
author = { Ajay Sharma and Anil Suryawanshi },
title = { A Novel Method for Detecting Spam Email using KNN Classification with Spearman Correlation as Distance Measure },
journal = { International Journal of Computer Applications },
issue_date = { February 2016 },
volume = { 136 },
number = { 6 },
month = { February },
year = { 2016 },
issn = { 0975-8887 },
pages = { 28-35 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume136/number6/24159-2016908471/ },
doi = { 10.5120/ijca2016908471 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:36:22.042949+05:30
%A Ajay Sharma
%A Anil Suryawanshi
%T A Novel Method for Detecting Spam Email using KNN Classification with Spearman Correlation as Distance Measure
%J International Journal of Computer Applications
%@ 0975-8887
%V 136
%N 6
%P 28-35
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

E-mail is one of the most prevalent means of correspondence because of its availability, fast message exchange, and low sending cost. Spam is a serious problem for this application on today's internet. Spam may contain suspicious URLs or may ask for financial information such as money transfer or credit card details. This creates the need to filter spam from legitimate e-mail, and classification is one way to do so. Various approaches have been proposed that filter spam by classifying messages as spam or legitimate (business) mail. Bayesian classification is a popular spam filtering technique, and SVM-based classification is also used. K-nearest neighbour (KNN) classification is simple and easy to implement, and it has a higher F-measure than Bayesian and SVM classification, but the accuracy of traditional KNN is lower than that of Bayesian classification. This work proposes a spam detection method that uses KNN classification with Spearman's correlation coefficient as the distance measure in place of the traditional Euclidean distance. Experimental results show a significant improvement in accuracy, with a higher F-measure than the traditional algorithms.
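
The idea in the abstract, KNN with a rank-correlation distance in place of Euclidean distance, can be illustrated with a minimal Python sketch. The abstract does not give the authors' exact formulation, so everything here is an assumption made for illustration: the distance is taken as 1 minus Spearman's rho, ties receive average ranks, a constant vector is treated as uncorrelated, and the function names and toy feature vectors are hypothetical.

```python
from collections import Counter
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_distance(a, b):
    """Assumed distance: 1 - Spearman's rho (Pearson correlation of the ranks)."""
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return 1.0  # assumption: a constant vector is treated as uncorrelated
    return 1.0 - cov / sqrt(var_a * var_b)


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to `query`."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: spearman_distance(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: each vector holds four made-up term frequencies.
train_x = [
    [5, 4, 1, 0], [6, 3, 0, 1], [4, 5, 1, 0],  # spam-like profiles
    [0, 1, 4, 5], [1, 0, 5, 3], [0, 1, 3, 6],  # legitimate-like profiles
]
train_y = ["spam", "spam", "spam", "ham", "ham", "ham"]

prediction = knn_predict(train_x, train_y, [7, 5, 1, 0], k=3)
```

Because the distance depends only on the rank order of the features, two messages whose feature values differ in scale but follow the same ordering are treated as close. A Euclidean distance would not behave this way.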

Index Terms

Computer Science
Information Sciences

Keywords

Bayesian classification, SVM classification, spam email, KNN classification, Spearman correlation, spam filtering, accuracy, F-measure