International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 75 - Number 10
Year of Publication: 2013
Authors: Shweta Rajput, Amit Arora
DOI: 10.5120/13145-0549
Shweta Rajput, Amit Arora. Designing Spam Model- Classification Analysis using Decision Trees. International Journal of Computer Applications. 75, 10 (August 2013), 6-12. DOI=10.5120/13145-0549
Spam has diluted the pool of legitimate email messages and frustrates users, which creates a need for automatic processing of email. This study constructs a spam model using classification techniques from data mining. To accomplish this, experiments were conducted on the spam dataset downloaded from the UCI Machine Learning Repository, which was classified using WEKA, a popular data mining tool. The final classification result is '1' if a message is spam and '0' otherwise. Email is a popular mode of communication, and its user base grows day by day. However, with the rise of social networks and electronic business, most emails are now unsolicited bulk e-mail, known as spam. Several solutions have been proposed to overcome the spam problem; filtering with decision tree classifiers is one of the most significant. Three machine learning classifiers, J48, J48graft and Simple CART, were used to identify spam messages in e-mail. These trees are first induced and then pruned by removing subtrees, which reduces the size and complexity of the tree and improves the predictive accuracy of the final classifier. Grafting is then applied as a post-process to the inferred decision tree. The results showed that J48graft achieved good prediction accuracy compared with the Simple CART and J48 algorithms.
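
The core step of decision tree induction described above, choosing a split that best separates spam ('1') from non-spam ('0'), can be illustrated with a minimal Python sketch. This is not the WEKA implementation used in the study: it builds a single-split tree using plain information gain, and it omits the gain ratio criterion, pruning and grafting that J48 and J48graft apply. The feature values and labels are hypothetical toy data, and the function names are illustrative only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of 0/1 class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def best_threshold(values, labels):
    """Find the split 'value <= t' with the highest information gain.

    Returns (t, gain); t is None if no split improves on the parent node.
    """
    base = entropy(labels)
    best_t, best_gain = None, 0.0
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        remainder = (len(left) * entropy(left)
                     + len(right) * entropy(right)) / len(labels)
        gain = base - remainder
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


def fit_stump(values, labels):
    """Build a one-level tree: (threshold, label_if_below, label_if_above)."""
    t, _ = best_threshold(values, labels)
    overall = Counter(labels).most_common(1)[0][0]
    if t is None:
        return None, overall, overall
    left = [y for x, y in zip(values, labels) if x <= t]
    right = [y for x, y in zip(values, labels) if x > t]
    return (t,
            Counter(left).most_common(1)[0][0],
            Counter(right).most_common(1)[0][0])


def predict(stump, value):
    """Return 1 (spam) or 0 (not spam) for a single feature value."""
    t, below, above = stump
    if t is None:
        return below
    return below if value <= t else above


# Hypothetical toy data: how often one word appears in each message,
# with 1 = spam and 0 = not spam. These are NOT values from the UCI dataset.
word_freq = [0.0, 0.1, 0.0, 0.2, 1.5, 2.0, 0.9, 1.2]
is_spam = [0, 0, 0, 0, 1, 1, 1, 1]

stump = fit_stump(word_freq, is_spam)
print(predict(stump, 1.8), predict(stump, 0.05))
```

A full tree learner would apply the same split search recursively to each branch and over every attribute, then prune subtrees that do not improve accuracy on held-out data.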