Research Article

Authorship Analysis and Identification Techniques: A Review

by Mubin Shaukat Tamboli, Rajesh S. Prasad
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 77 - Number 16
Year of Publication: 2013
Authors: Mubin Shaukat Tamboli, Rajesh S. Prasad
10.5120/13566-1375

Mubin Shaukat Tamboli, Rajesh S. Prasad. Authorship Analysis and Identification Techniques: A Review. International Journal of Computer Applications. 77, 16 (September 2013), 11-15. DOI=10.5120/13566-1375

@article{ 10.5120/13566-1375,
author = { Mubin Shaukat Tamboli and Rajesh S. Prasad },
title = { Authorship Analysis and Identification Techniques: A Review },
journal = { International Journal of Computer Applications },
issue_date = { September 2013 },
volume = { 77 },
number = { 16 },
month = { September },
year = { 2013 },
issn = { 0975-8887 },
pages = { 11-15 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume77/number16/13566-1375/ },
doi = { 10.5120/13566-1375 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:49:03.595185+05:30
%A Mubin Shaukat Tamboli
%A Rajesh S. Prasad
%T Authorship Analysis and Identification Techniques: A Review
%J International Journal of Computer Applications
%@ 0975-8887
%V 77
%N 16
%P 11-15
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Interest in data mining continues to grow. Nearly everything is now available over the internet, which also enables criminal and malicious activity, so establishing the identity behind available content has become a necessity. Most of this content is in the form of text. Authorship analysis is the statistical study of the linguistic and computational characteristics of documents written by individuals. This paper reviews various methods for authorship analysis and identification for a given set of texts. Research in authorship analysis and identification is likely to continue, and even increase, over the coming decades. We also present our vision of future authorship analysis and identification, with high performance and a solution for behavioral feature extraction from a set of text documents.
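
As a rough illustration of the kind of statistical, n-gram based analysis the abstract refers to, the sketch below builds character n-gram frequency profiles and attributes an unknown text to the candidate author whose profile is most similar. This is not the method of the paper or of any work it reviews; the function names (`char_ngram_profile`, `cosine_similarity`, `attribute`), the choice of character trigrams, and the use of cosine similarity are all assumptions made for the example.

```python
from collections import Counter
import math


def char_ngram_profile(text, n=3):
    """Relative frequencies of character n-grams in `text` (hypothetical helper)."""
    # Lowercase and collapse runs of whitespace so layout does not affect the profile.
    text = " ".join(text.lower().split())
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(v * q.get(g, 0.0) for g, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def attribute(unknown_text, known_texts, n=3):
    """Return (best_author, scores) for a dict mapping author -> sample text.

    Assumes `known_texts` is non-empty; the author with the highest
    similarity to the unknown text is chosen.
    """
    unknown = char_ngram_profile(unknown_text, n)
    scores = {
        author: cosine_similarity(unknown, char_ngram_profile(sample, n))
        for author, sample in known_texts.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

Calling `attribute(disputed_text, {"author_a": sample_a, "author_b": sample_b})` returns the best-matching author together with the per-author similarity scores. A real system would use far more text per author and richer lexical, structural, and stylometric features than this sketch does.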

Index Terms

Computer Science
Information Sciences

Keywords

Feature extraction, n-gram, lexical, structural, stylometric features, identification, Writeprint.