Research Article

An Enhanced Data Mining For Text Clustering

Published on May 2013 by Rupali D. Tajanpure, D. B. Kshirsagar
International Conference on Recent Trends in Engineering and Technology 2013
Foundation of Computer Science USA
ICRTET - Number 2

Rupali D. Tajanpure, D. B. Kshirsagar. An Enhanced Data Mining For Text Clustering. International Conference on Recent Trends in Engineering and Technology 2013. ICRTET, 2 (May 2013), 19-23.

@article{
author = { Rupali D. Tajanpure, D. B. Kshirsagar },
title = { An Enhanced Data Mining For Text Clustering },
journal = { International Conference on Recent Trends in Engineering and Technology 2013 },
issue_date = { May 2013 },
volume = { ICRTET },
number = { 2 },
month = { May },
year = { 2013 },
issn = { 0975-8887 },
pages = { 19-23 },
numpages = 5,
url = { /proceedings/icrtet/number2/11770-1321/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 International Conference on Recent Trends in Engineering and Technology 2013
%A Rupali D. Tajanpure
%A D. B. Kshirsagar
%T An Enhanced Data Mining For Text Clustering
%J International Conference on Recent Trends in Engineering and Technology 2013
%@ 0975-8887
%V ICRTET
%N 2
%P 19-23
%D 2013
%I International Journal of Computer Applications
Abstract

Text mining is typically based on the statistical analysis of a term, which may be a word or a phrase. The basic measure in most text mining techniques is term frequency, which is computed to estimate how important a term is, but it captures that importance only within a single document. Statistical analysis alone, however, may not capture the meaning that a term actually carries in the text. To overcome this problem, a new framework is introduced that relies on a concept-based model. The proposed model can efficiently find significant matching and related concepts between documents using concept-based analysis.
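
To illustrate the statistical measure the abstract refers to, the following is a minimal sketch of a term frequency computation. It is not the authors' implementation: the function name `term_frequencies`, the simple letters-only tokenizer, and the sample sentence are assumptions made for illustration, and frequency is taken here as a word's count divided by the document length.

```python
from collections import Counter
import re


def term_frequencies(document: str) -> dict[str, float]:
    """Return each word's count divided by the total number of words.

    Hypothetical helper: tokenization is a plain lowercase letters-only
    split, chosen only to keep the sketch short.
    """
    words = re.findall(r"[a-z]+", document.lower())
    counts = Counter(words)
    total = len(words)
    # An empty document yields an empty dict, so no division by zero occurs.
    return {word: count / total for word, count in counts.items()}


# Illustrative sentence (not from the paper).
doc = "the cat chased the mouse and the mouse escaped"
tf = term_frequencies(doc)

# Rank terms by frequency alone, highest first.
for word, score in sorted(tf.items(), key=lambda item: -item[1]):
    print(f"{word}: {score:.3f}")
```

In this sketch the function word "the" receives the highest score, ahead of "mouse" and "cat", which shows why frequency alone can be a poor proxy for what a term contributes to the meaning of a text.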

Index Terms

Computer Science
Information Sciences

Keywords

Concept-based Mining Model, Sentence-based, Document-based, Corpus-based, Concept Analysis