Research Article

Fuzzy Set Theoretic Approach To Collocation Extraction

by H. S. Dhami, Raj Kishor Bisht
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 5 - Number 3
Year of Publication: 2010
DOI: 10.5120/895-1269

H. S. Dhami, Raj Kishor Bisht. Fuzzy Set Theoretic Approach To Collocation Extraction. International Journal of Computer Applications. 5, 3 (August 2010), 43-49. DOI=10.5120/895-1269

@article{ 10.5120/895-1269,
author = { H. S. Dhami, Raj Kishor Bisht },
title = { Fuzzy Set Theoretic Approach To Collocation Extraction },
journal = { International Journal of Computer Applications },
issue_date = { August 2010 },
volume = { 5 },
number = { 3 },
month = { August },
year = { 2010 },
issn = { 0975-8887 },
pages = { 43-49 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume5/number3/895-1269/ },
doi = { 10.5120/895-1269 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T19:53:19.665395+05:30
%A H. S. Dhami
%A Raj Kishor Bisht
%T Fuzzy Set Theoretic Approach To Collocation Extraction
%J International Journal of Computer Applications
%@ 0975-8887
%V 5
%N 3
%P 43-49
%D 2010
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The fuzzy approach deals with linguistic properties of elements, such as beauty, coldness, and hotness. Collocations are linguistically motivated: deciding whether a word combination is a collocation is a linguistic judgment, since mere co-occurrence of words does not signify the presence of a collocation. Collocation extraction can therefore be approached by looking at its linguistic aspect. In the present paper, an attempt has been made to construct two different fuzzy sets of word combinations to be considered as collocations, with mutual information and the t-test taken as the basis for their construction. Two fuzzy set theoretic models are proposed to identify collocations, and it is shown that the fuzzy set theoretic approach works very well for collocation extraction. The working data is a corpus of about one million words drawn from different novels in Project Gutenberg, available at www.gutenberg.org.
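
The abstract does not give the membership functions or the two models themselves, so the following is only a minimal sketch of the general idea, not the authors' method. It computes pointwise mutual information and the t-test statistic for a word pair from raw counts, maps each score to a degree of membership in [0, 1], and combines the two degrees with the standard fuzzy intersection (minimum). The function names, the piecewise-linear `ramp` membership function, and the threshold ranges are all assumptions introduced for illustration.

```python
import math


def pmi(c_xy, c_x, c_y, n):
    """Pointwise mutual information (base 2) of a word pair from raw counts."""
    if c_xy <= 0:
        raise ValueError("the pair must occur at least once")
    return math.log2((c_xy * n) / (c_x * c_y))


def t_score(c_xy, c_x, c_y, n):
    """t-test statistic: observed pair probability vs. independence."""
    if c_xy <= 0:
        raise ValueError("the pair must occur at least once")
    observed = c_xy / n
    expected = (c_x / n) * (c_y / n)
    # For a rare event the Bernoulli variance p(1 - p) is approximately p.
    return (observed - expected) / math.sqrt(observed / n)


def ramp(value, low, high):
    """Hypothetical membership function: 0 at or below low, 1 at or above high."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def collocation_membership(c_xy, c_x, c_y, n,
                           pmi_range=(0.0, 10.0),   # assumed thresholds
                           t_range=(0.0, 2.576)):   # assumed thresholds
    """Degree to which a pair belongs to both fuzzy sets (min = intersection)."""
    mu_pmi = ramp(pmi(c_xy, c_x, c_y, n), *pmi_range)
    mu_t = ramp(t_score(c_xy, c_x, c_y, n), *t_range)
    return min(mu_pmi, mu_t)
```

Under these assumed thresholds, a pair that occurs 20 times while its two words occur only 30 and 25 times in a million-word corpus receives full membership, whereas a pair whose co-occurrence is close to what independence predicts receives a membership near zero. Replacing `min` with another combination rule, or `ramp` with a different membership function, changes the model without changing the two underlying statistics.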

Index Terms

Computer Science
Information Sciences

Keywords

Collocation, Fuzzy set, Mutual Information, t-test