Analysis of Random Forest and Naive Bayes for Spam Mail using Feature Selection Catagorization

Rachana Mishra; R. S. Thakur

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 80 - Number 3

Year of Publication: 2013

DOI: 10.5120/13844-1670

Rachana Mishra, R. S. Thakur. Analysis of Random Forest and Naive Bayes for Spam Mail using Feature Selection Catagorization. International Journal of Computer Applications. 80, 3 (October 2013), 42-47. DOI=10.5120/13844-1670

@article{10.5120/13844-1670,
  author = {Rachana Mishra and R. S. Thakur},
  title = {Analysis of Random Forest and Naive Bayes for Spam Mail using Feature Selection Catagorization},
  journal = {International Journal of Computer Applications},
  issue_date = {October 2013},
  volume = {80},
  number = {3},
  month = {October},
  year = {2013},
  issn = {0975-8887},
  pages = {42-47},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume80/number3/13844-1670/},
  doi = {10.5120/13844-1670},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T21:53:36.340982+05:30
%A Rachana Mishra
%A R. S. Thakur
%T Analysis of Random Forest and Naive Bayes for Spam Mail using Feature Selection Catagorization
%J International Journal of Computer Applications
%@ 0975-8887
%V 80
%N 3
%P 42-47
%D 2013
%I Foundation of Computer Science (FCS), NY, USA

Abstract

As the number of internet users increases, spam mail has become a major problem, and reducing it is a significant challenge for researchers. Spam is commonly defined as unsolicited email messages, and the goal of spam categorization is to distinguish between spam and legitimate email messages. This paper presents the classification of spam mail and addresses various related problems concerning web space. Many machine learning algorithms are used to classify spam and legitimate mail. This paper identifies the best classification approach using a benchmark dataset. The dataset consists of 9324 records and 500 attributes, used for training and testing to build the model. This work can play a significant role in helping to eliminate unsolicited commercial e-mail, viruses, Trojans, and worms, as well as electronically perpetrated fraud and other undesired and troublesome e-mail. Three supervised machine learning algorithms, namely Naive Bayes, Random Tree, and Random Forest, have been applied to the spam mail dataset using two feature selection algorithms.
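
The pipeline the abstract describes (select a subset of attributes, then train a supervised classifier on them) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the two feature selection algorithms, so the document-frequency-difference score below is only a stand-in. The toy corpus, the function names, and the choice of `k` are all assumptions made for illustration. Only Naive Bayes is shown; Random Tree and Random Forest are omitted for brevity.

```python
import math
from collections import Counter

# Toy labeled corpus, invented for illustration (not the paper's dataset).
TRAIN = [
    ("win cash prize now", "spam"),
    ("cheap loans win money", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached", "ham"),
    ("lunch on monday", "ham"),
]


def tokenize(text):
    return text.lower().split()


def select_features(docs, k):
    """Keep the k words whose document frequency differs most between classes.

    This scoring rule is an assumed stand-in for a real feature selection
    algorithm; ties are broken alphabetically so the result is deterministic.
    """
    spam_df, ham_df = Counter(), Counter()
    for text, label in docs:
        target = spam_df if label == "spam" else ham_df
        target.update(set(tokenize(text)))
    vocab = set(spam_df) | set(ham_df)
    ranked = sorted(vocab, key=lambda w: (-abs(spam_df[w] - ham_df[w]), w))
    return set(ranked[:k])


def train_naive_bayes(docs, features):
    """Multinomial Naive Bayes with Laplace smoothing over the selected words."""
    class_counts = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in class_counts}
    for text, label in docs:
        word_counts[label].update(w for w in tokenize(text) if w in features)
    model = {}
    for label, n_docs in class_counts.items():
        denom = sum(word_counts[label].values()) + len(features)
        model[label] = {
            "log_prior": math.log(n_docs / len(docs)),
            "log_likelihood": {
                w: math.log((word_counts[label][w] + 1) / denom) for w in features
            },
        }
    return model


def classify(model, text):
    """Return the label with the highest log-posterior; unselected words are ignored."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for w in tokenize(text):
            if w in params["log_likelihood"]:
                score += params["log_likelihood"][w]
        scores[label] = score
    return max(scores, key=scores.get)


features = select_features(TRAIN, k=5)
model = train_naive_bayes(TRAIN, features)
```

With this toy data, `classify(model, "win a free prize")` is scored using only the selected words, which shows how feature selection shrinks the attribute set the classifier has to consider.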

Index Terms

Computer Science

Information Sciences

Keywords

spam problem, spam classification, Weka