International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 39 - Number 14
Year of Publication: 2012
Authors: Liny Varghese, Supriya M.H, K. Poulose Jacob
DOI: 10.5120/4891-7383
Liny Varghese, Supriya M.H, K. Poulose Jacob. Filtering Template driven spam mails using Vector Space models. International Journal of Computer Applications. 39, 14 (February 2012), 33-35. DOI=10.5120/4891-7383
Spam has become a major problem for society. Some spammers use templates: to send a particular promotion, they create a template and merge each recipient's details into it. Similarities can be found among such mails and used to filter out forthcoming spam. Most high-volume spam is sent using tools that randomize parts of the message, such as the subject, body, and sender address. The general form of the template a spammer is using can often be guessed by inspecting the features of the messages. Most spam filters are either rule-based or Bayesian models. The main objective of this paper is to measure semantic distance and to evaluate the applicability of two information retrieval techniques, the simple Vector Space Model (VSM) and VSM with Rocchio classification, in the spam context. Both methods use cosine similarity to identify spam.
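The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes whitespace tokenization and raw term counts as vector weights (the weighting scheme is not given here), and the function names `tf_vector`, `cosine`, `centroid`, `simple_vsm_score`, and `rocchio_is_spam` are hypothetical. The simple VSM variant compares a message with each known spam message, while the Rocchio variant compares it with one centroid per class.

```python
import math
from collections import Counter


def tf_vector(text):
    """Map a message to a raw term-frequency vector (assumed weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Rocchio-style class prototype: the mean of the class's vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)  # Counter.update adds counts term by term
    return {t: c / len(vectors) for t, c in total.items()}


def simple_vsm_score(message, known_spam):
    """Simple VSM: highest cosine similarity to any known spam message."""
    m = tf_vector(message)
    return max(cosine(m, tf_vector(s)) for s in known_spam)


def rocchio_is_spam(message, spam_centroid, ham_centroid):
    """Rocchio: assign the message to the class with the closer centroid."""
    m = tf_vector(message)
    return cosine(m, spam_centroid) > cosine(m, ham_centroid)
```

With a templated message in which only the recipient's name changes, most terms are shared with earlier copies, so the cosine similarity stays high. A real filter would additionally need a similarity threshold for the simple VSM score, which this sketch leaves to the caller.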