Designing Spam Model- Classification Analysis using Decision Trees

Shweta Rajput; Amit Arora

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 75 - Number 10

Year of Publication: 2013

DOI: 10.5120/13145-0549

Shweta Rajput, Amit Arora. Designing Spam Model- Classification Analysis using Decision Trees. International Journal of Computer Applications. 75, 10 (August 2013), 6-12. DOI=10.5120/13145-0549

@article{10.5120/13145-0549,
  author = {Shweta Rajput and Amit Arora},
  title = {Designing Spam Model- Classification Analysis using Decision Trees},
  journal = {International Journal of Computer Applications},
  issue_date = {August 2013},
  volume = {75},
  number = {10},
  month = {August},
  year = {2013},
  issn = {0975-8887},
  pages = {6-12},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume75/number10/13145-0549/},
  doi = {10.5120/13145-0549},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

Abstract

Spam has diluted the pool of legitimate messages and frustrates users, so automatic processing of email is required. Email is a popular mode of communication and its user base grows day by day, but with the rise of social networks and electronic business, most email traffic now includes unsolicited bulk email, called spam. Several solutions have been proposed to overcome the spam problem, and filtering with decision tree classifiers is one of the most significant techniques. This study constructs a spam model using classification techniques from data mining. Experiments were conducted on the spam dataset downloaded from the UCI Machine Learning Repository, which was classified using WEKA, a popular data mining tool. The final classification result is '1' if a message is spam and '0' otherwise. Three machine learning classifiers, J48, J48graft and Simple CART, were used to classify spam messages. Each tree is induced first and its subtrees are then pruned, which reduces the size and complexity of the tree and improves the predictive accuracy of the final classifier. Grafting is then applied as a post-process to an inferred decision tree. Results showed that J48graft achieved good prediction accuracy compared with the Simple CART and J48 algorithms.
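The induce-then-prune workflow described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the WEKA implementation: WEKA's J48 follows C4.5 (gain ratio, error-based pruning) and Simple CART uses cost-complexity pruning, whereas the sketch below substitutes plain information gain and reduced-error pruning because they are shorter to write. The word features ("free", "winner", "meeting"), the toy messages, and all function names are assumptions made for illustration and do not come from the paper or from the UCI dataset.

```python
import copy
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of 0/1 class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def majority(labels):
    """Most common label; ties go to the label encountered first."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, features):
    """Induce a tree over binary features by greedy information gain."""
    if len(set(labels)) == 1 or not features:
        return {"leaf": majority(labels)}

    def gain(feature):
        split = {0: [], 1: []}
        for row, y in zip(rows, labels):
            split[row[feature]].append(y)
        remainder = sum(len(s) / len(labels) * entropy(s) for s in split.values() if s)
        return entropy(labels) - remainder

    best = max(features, key=gain)
    if gain(best) <= 0:  # no feature separates the classes any further
        return {"leaf": majority(labels)}
    node = {"feature": best, "default": majority(labels), "children": {}}
    rest = [f for f in features if f != best]
    for value in (0, 1):  # positive gain guarantees both branches are non-empty
        subset = [(r, y) for r, y in zip(rows, labels) if r[best] == value]
        node["children"][value] = build_tree(
            [r for r, _ in subset], [y for _, y in subset], rest)
    return node

def predict(node, row):
    """Return 1 (spam) or 0 (not spam) for one message."""
    while "leaf" not in node:
        node = node["children"][row[node["feature"]]]
    return node["leaf"]

def accuracy(node, rows, labels):
    return sum(predict(node, r) == y for r, y in zip(rows, labels)) / len(labels)

def size(node):
    """Number of nodes in the tree, leaves included."""
    return 1 if "leaf" in node else 1 + sum(size(c) for c in node["children"].values())

def prune(node, rows, labels):
    """Reduced-error pruning (modifies the tree in place): bottom-up, replace
    a subtree by a majority-class leaf when that does not lower accuracy on
    the held-out rows that reach it."""
    if "leaf" in node or not rows:
        return node
    for value, child in list(node["children"].items()):
        reached = [(r, y) for r, y in zip(rows, labels) if r[node["feature"]] == value]
        node["children"][value] = prune(
            child, [r for r, _ in reached], [y for _, y in reached])
    leaf = {"leaf": node["default"]}
    if accuracy(leaf, rows, labels) >= accuracy(node, rows, labels):
        return leaf
    return node

def msg(free, winner, meeting):
    """Hypothetical word-presence features for one message."""
    return {"free": free, "winner": winner, "meeting": meeting}

FEATURES = ["free", "winner", "meeting"]
train_rows = [msg(1, 1, 0), msg(1, 0, 0), msg(0, 1, 0), msg(0, 0, 1),
              msg(0, 0, 0), msg(1, 0, 1), msg(0, 0, 1), msg(1, 1, 0)]
train_labels = [1, 1, 1, 0, 0, 0, 0, 1]  # 1 = spam, 0 = not spam
val_rows = [msg(1, 0, 0), msg(1, 1, 0), msg(0, 0, 1), msg(1, 0, 1), msg(0, 1, 1)]
val_labels = [0, 1, 0, 0, 1]

tree = build_tree(train_rows, train_labels, FEATURES)
pruned = prune(copy.deepcopy(tree), val_rows, val_labels)

print("unpruned: size", size(tree), "validation accuracy", accuracy(tree, val_rows, val_labels))
print("pruned:   size", size(pruned), "validation accuracy", accuracy(pruned, val_rows, val_labels))
```

Because a subtree is collapsed only when the held-out accuracy does not drop, the pruned tree in this sketch is never larger than the induced tree and never less accurate on the validation rows, which mirrors the size and accuracy trade-off the abstract attributes to pruning.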

Index Terms

Computer Science

Information Sciences

Keywords

Weka, Simple CART, J48, J48graft, Spam filtration, Post pruning, Pre pruning, Classification, Grafting