Detection of Fraudulent Emails by Authorship Extraction

A. Pandian; Mohamed Abdul Karim

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 41 - Number 7

Year of Publication: 2012

Authors: A. Pandian, Mohamed Abdul Karim

10.5120/5551-7619

A. Pandian, Mohamed Abdul Karim. Detection of Fraudulent Emails by Authorship Extraction. International Journal of Computer Applications. 41, 7 (March 2012), 7-12. DOI=10.5120/5551-7619

@article{ 10.5120/5551-7619,
author = { A. Pandian and Mohamed Abdul Karim },
title = { Detection of Fraudulent Emails by Authorship Extraction },
journal = { International Journal of Computer Applications },
issue_date = { March 2012 },
volume = { 41 },
number = { 7 },
month = { March },
year = { 2012 },
issn = { 0975-8887 },
pages = { 7-12 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume41/number7/5551-7619/ },
doi = { 10.5120/5551-7619 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T20:28:58.299059+05:30
%A A. Pandian
%A Mohamed Abdul Karim
%T Detection of Fraudulent Emails by Authorship Extraction
%J International Journal of Computer Applications
%@ 0975-8887
%V 41
%N 7
%P 7-12
%D 2012
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Fraudulent emails can be detected by extracting authorship information from their contents. This paper presents an information extraction approach based on unique words in the emails. These unique words serve as representative features to train a radial basis function (RBF) network; the final weights obtained in training are then used for testing. The percentage of correctly identified email authors depends on the number of RBF centers and the type of functional words used for training. One hundred and fifty authors with over one hundred files from the sent folder of the Enron email dataset are considered. A total of 300 unique words, each three to seven characters long, are considered. The RBF network is trained and tested using words of different lengths. Our simulation shows the effectiveness of the proposed RBF network for email authorship identification, with identification accuracy ranging from 95% to 97%.
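The pipeline described in the abstract (word-frequency features feeding an RBF network whose output weights are learned in training) can be illustrated with a minimal sketch. This is not the authors' implementation: the five-word vocabulary, the toy emails, the Gaussian width, the use of training vectors as RBF centers, and the delta-rule update for the output weights are all assumptions made for illustration. The paper itself uses 300 unique words and the Enron sent folders.

```python
import math
import re

def word_features(text, vocab):
    """Relative frequency of each vocabulary word in one email body."""
    tokens = re.findall(r"[a-z]+", text.lower())
    total = max(len(tokens), 1)
    return [tokens.count(w) / total for w in vocab]

def rbf_layer(x, centers, width):
    """Gaussian activation of x against every center, plus a bias term."""
    acts = []
    for c in centers:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        acts.append(math.exp(-dist_sq / (2.0 * width ** 2)))
    return acts + [1.0]

def train_rbf(samples, labels, centers, width, n_authors, epochs=500, lr=0.1):
    """Learn one output weight vector per author with the delta rule (assumed)."""
    weights = [[0.0] * (len(centers) + 1) for _ in range(n_authors)]
    hidden = [rbf_layer(x, centers, width) for x in samples]
    for _ in range(epochs):
        for h, label in zip(hidden, labels):
            for a in range(n_authors):
                target = 1.0 if a == label else 0.0
                output = sum(w * hi for w, hi in zip(weights[a], h))
                err = target - output
                weights[a] = [w + lr * err * hi for w, hi in zip(weights[a], h)]
    return weights

def predict_author(x, centers, width, weights):
    """Return the index of the author whose output unit scores highest."""
    h = rbf_layer(x, centers, width)
    scores = [sum(w * hi for w, hi in zip(row, h)) for row in weights]
    return scores.index(max(scores))

# Hypothetical vocabulary of short words (three to seven characters).
VOCAB = ["the", "and", "for", "that", "with"]

# Hypothetical toy emails: (author index, body).
EMAILS = [
    (0, "please send the report and the budget for review"),
    (0, "the meeting and the agenda for the team"),
    (1, "confirm that you agree with that plan"),
    (1, "note that we go with that other option"),
]

LABELS = [author for author, _ in EMAILS]
X = [word_features(body, VOCAB) for _, body in EMAILS]
CENTERS = X      # assumption: every training vector is an RBF center
WIDTH = 0.2      # assumption: Gaussian width chosen by hand
WEIGHTS = train_rbf(X, LABELS, CENTERS, WIDTH, n_authors=2)

unseen = word_features("the budget and the plan for the group", VOCAB)
print(predict_author(unseen, CENTERS, WIDTH, WEIGHTS))
```

In this sketch the number of centers equals the number of training emails. The abstract notes that identification accuracy depends on the number of RBF centers, so a realistic setup would treat that count, and the width, as tunable parameters.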

Index Terms

Computer Science

Information Sciences

Keywords

Email Authorship Identification, Spam, Word Frequency, Radial Basis Function