Research Article

Pattern Taxonomy Mining for Text Categorization

by Neeraj Kesavan, N. Jaisankar, Ramani S.
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 168 - Number 8
Year of Publication: 2017
Authors: Neeraj Kesavan, N. Jaisankar, Ramani S.
10.5120/ijca2017914465

Neeraj Kesavan, N. Jaisankar, Ramani S. Pattern Taxonomy Mining for Text Categorization. International Journal of Computer Applications. 168, 8 (Jun 2017), 1-5. DOI=10.5120/ijca2017914465

@article{ 10.5120/ijca2017914465,
author = { Neeraj Kesavan and N. Jaisankar and Ramani S. },
title = { Pattern Taxonomy Mining for Text Categorization },
journal = { International Journal of Computer Applications },
issue_date = { Jun 2017 },
volume = { 168 },
number = { 8 },
month = { Jun },
year = { 2017 },
issn = { 0975-8887 },
pages = { 1-5 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume168/number8/27892-2017914465/ },
doi = { 10.5120/ijca2017914465 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:15:33.794941+05:30
%A Neeraj Kesavan
%A N. Jaisankar
%A Ramani S.
%T Pattern Taxonomy Mining for Text Categorization
%J International Journal of Computer Applications
%@ 0975-8887
%V 168
%N 8
%P 1-5
%D 2017
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Most text mining methods rely on term-based mining, and all such methods suffer from common problems such as synonymy and polysemy. Pattern mining has advantages over term-based methods, and Pattern Taxonomy Mining can be used to make the discovery of useful patterns more effective. In addition to the common problems of term-based mining, this paper also tries to address the problem of low-frequency patterns. Algorithms for pattern deploying and inner pattern evolution are used to improve the effectiveness of pattern discovery. The RCV1 text collection is used for the experiments in this paper. The performance and execution of text categorization improved significantly without any loss in accuracy.
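The idea of mining patterns and then deploying them onto terms can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a document is given as a list of paragraphs (each a list of already stemmed terms), treats patterns as unordered termsets, and uses a simplified deployment rule in which each term's weight is the share of closed-pattern slots it occupies. The function names `frequent_termsets`, `closed_patterns`, and `deploy` are hypothetical, and inner pattern evolution is not covered.

```python
from collections import defaultdict
from itertools import combinations


def frequent_termsets(paragraphs, min_support):
    """Map each termset to the number of paragraphs containing it,
    keeping only termsets found in at least min_support paragraphs.
    Enumerates all subsets, so it is only suitable for short paragraphs."""
    counts = defaultdict(int)
    for paragraph in paragraphs:
        terms = sorted(set(paragraph))
        for size in range(1, len(terms) + 1):
            for combo in combinations(terms, size):
                counts[frozenset(combo)] += 1
    return {p: s for p, s in counts.items() if s >= min_support}


def closed_patterns(patterns):
    """Keep a pattern only if no proper superset has the same support."""
    return {
        p: s
        for p, s in patterns.items()
        if not any(p < q and s == patterns[q] for q in patterns)
    }


def deploy(patterns):
    """Assumed simplification of pattern deploying: a term's weight is the
    number of closed patterns containing it divided by the total length
    of all closed patterns."""
    total = sum(len(p) for p in patterns)
    weights = defaultdict(float)
    for p in patterns:
        for term in p:
            weights[term] += 1 / total
    return dict(weights)


if __name__ == "__main__":
    # Toy document with three paragraphs (illustrative data only).
    paragraphs = [
        ["pattern", "mining", "text"],
        ["pattern", "mining"],
        ["text", "mining"],
    ]
    closed = closed_patterns(frequent_termsets(paragraphs, min_support=2))
    print(deploy(closed))
```

On the toy input, the single terms "pattern" and "text" are dropped as non-closed because they never occur without "mining" at the same support, so the surviving closed patterns carry the co-occurrence information that a pure term-based count would lose.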

Index Terms

Computer Science
Information Sciences

Keywords

Pattern Mining, Text Categorization, Inner Patterns, Pattern Taxonomy, Useful Information