Research Article

Keyword Extraction using Semantic Analysis

by Mohamed H. Haggag
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 61 - Number 1
Year of Publication: 2013
Authors: Mohamed H. Haggag
10.5120/9889-4445

Mohamed H. Haggag. Keyword Extraction using Semantic Analysis. International Journal of Computer Applications. 61, 1 (January 2013), 1-6. DOI=10.5120/9889-4445

@article{ 10.5120/9889-4445,
author = { Mohamed H. Haggag },
title = { Keyword Extraction using Semantic Analysis },
journal = { International Journal of Computer Applications },
issue_date = { January 2013 },
volume = { 61 },
number = { 1 },
month = { January },
year = { 2013 },
issn = { 0975-8887 },
pages = { 1-6 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume61/number1/9889-4445/ },
doi = { 10.5120/9889-4445 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:07:51.874144+05:30
%A Mohamed H. Haggag
%T Keyword Extraction using Semantic Analysis
%J International Journal of Computer Applications
%@ 0975-8887
%V 61
%N 1
%P 1-6
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Keywords are a list of significant words or terms that best present the document context in brief and relate to the textual context. Extraction models are categorized as statistical, linguistic, machine learning, or a combination of these approaches. This paper introduces a model for extracting keywords based on their relatedness weight among all terms in the text. The strength of the relationship between terms is evaluated by semantic similarity. Document terms are assigned a weighted metric based on the likeness of their meaning content. Terms that are strongly related to each other are weighted highly in each individual term's semantic similarity. Providing the overall term similarity is crucial for defining relevant keywords that best express the text in both frequency and weighted likeness. Keywords are recursively evaluated according to their cohesion with each other and with the document context. The proposed model showed enhanced precision and recall extraction values over other approaches.
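
The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the similarity measure or the exact weighting, so the `TOY_SIMILARITY` table, the `similarity` and `rank_keywords` functions, and the "frequency times mean similarity" score are all assumptions made for illustration. The recursive re-evaluation step mentioned in the abstract is omitted.

```python
from collections import Counter

# Hypothetical similarity table with illustrative values only. A real system
# would take these scores from a lexical resource or a trained model.
TOY_SIMILARITY = {
    frozenset(("bank", "money")): 0.8,
    frozenset(("bank", "loan")): 0.9,
    frozenset(("money", "loan")): 0.7,
}


def similarity(a, b):
    """Return a symmetric similarity in [0, 1]; unknown pairs score 0."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset((a, b)), 0.0)


def rank_keywords(terms, top_k=3):
    """Rank distinct terms by frequency times mean similarity to the other terms.

    This weighting is an assumption; it only mirrors the abstract's idea that
    keywords should score well in both frequency and weighted likeness.
    """
    freq = Counter(terms)
    vocab = sorted(freq)
    scores = {}
    for term in vocab:
        others = [u for u in vocab if u != term]
        if others:
            relatedness = sum(similarity(term, u) for u in others) / len(others)
        else:
            relatedness = 0.0
        scores[term] = freq[term] * relatedness
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


terms = ["bank", "loan", "money", "bank", "river", "loan", "tree"]
top = rank_keywords(terms)
print([term for term, _ in top])
```

In this toy input, "river" and "tree" have no related terms in the table, so they receive a score of zero regardless of frequency, while the mutually related finance terms rise to the top.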

Index Terms

Computer Science
Information Sciences

Keywords

Keywords Extraction, Semantic Similarity, Semantic Relatedness, Semantic Analysis, Word Sense