Decision Tree based Supervised Word Sense Disambiguation for Assamese

Jumi Sarmah; Shikhar Kr. Sarma

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 141 - Number 1

Year of Publication: 2016

Authors: Jumi Sarmah, Shikhar Kr. Sarma

DOI: 10.5120/ijca2016909488

Jumi Sarmah, Shikhar Kr. Sarma. Decision Tree based Supervised Word Sense Disambiguation for Assamese. International Journal of Computer Applications. 141, 1 (May 2016), 42-48. DOI=10.5120/ijca2016909488

@article{ 10.5120/ijca2016909488,
author = { Jumi Sarmah and Shikhar Kr. Sarma },
title = { Decision Tree based Supervised Word Sense Disambiguation for Assamese },
journal = { International Journal of Computer Applications },
issue_date = { May 2016 },
volume = { 141 },
number = { 1 },
month = { May },
year = { 2016 },
issn = { 0975-8887 },
pages = { 42-48 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume141/number1/24752-2016909488/ },
doi = { 10.5120/ijca2016909488 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T23:42:22.511505+05:30
%A Jumi Sarmah
%A Shikhar Kr. Sarma
%T Decision Tree based Supervised Word Sense Disambiguation for Assamese
%J International Journal of Computer Applications
%@ 0975-8887
%V 141
%N 1
%P 42-48
%D 2016
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Word Sense Disambiguation (WSD) aims to automatically disambiguate words that have multiple senses in a context. A sense denotes a meaning of a word, and words that can take several meanings in a context are referred to as ambiguous words. WSD is vital in many Natural Language Processing tasks such as MT, IR, TC and SP. This paper proposes a supervised machine learning approach, the Decision Tree (DT), for the WSD task in the Assamese language. A decision tree is a flowchart-like tree-structured decision model in which each internal node denotes a test, each branch represents an outcome of that test, and each leaf holds a sense label. J48, a Java implementation of the C4.5 decision tree algorithm, is used for the experiments. A few polysemous words, with their real occurrences in Assamese text manually sense-annotated, were collected as the training and test dataset. The DT algorithm produces an average F-measure of 0.611 under 10-fold cross-validation on 10 Assamese ambiguous words.
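
The tree structure described in the abstract (internal nodes as tests, branches as test outcomes, leaves as sense labels) can be illustrated with a minimal sketch. This is not the authors' J48 setup: it is a simplified C4.5-style learner that splits on gain ratio, without pruning or numeric attributes. The features (presence or absence of a context word), the English toy data for the word "bank", and all function names are assumptions made purely for illustration.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of hashable values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())

def gain_ratio(rows, labels, feature):
    """Gain ratio of the binary test 'is `feature` among the context words?'."""
    parts = {}
    for row, label in zip(rows, labels):
        parts.setdefault(feature in row, []).append(label)
    if len(parts) < 2:  # the test does not separate the examples at all
        return 0.0
    total = len(rows)
    remainder = sum(len(p) / total * entropy(p) for p in parts.values())
    split_info = entropy([feature in row for row in rows])
    return (entropy(labels) - remainder) / split_info

def build_tree(rows, labels, features):
    """Return a sense label (leaf) or a (feature, {outcome: subtree}) node."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    best = max(features, key=lambda f: gain_ratio(rows, labels, f))
    if gain_ratio(rows, labels, best) <= 0.0:
        return majority
    rest = [f for f in features if f != best]
    branches = {}
    for outcome in (True, False):
        subset = [(r, l) for r, l in zip(rows, labels) if (best in r) == outcome]
        branches[outcome] = build_tree([r for r, _ in subset],
                                       [l for _, l in subset], rest)
    return (best, branches)

def classify(tree, context):
    """Follow the test outcomes from the root down to a leaf sense label."""
    while isinstance(tree, tuple):
        feature, branches = tree
        tree = branches[feature in context]
    return tree

# Hypothetical sense-annotated occurrences of the ambiguous word "bank":
# each example is (set of context words, sense label).
training = [
    ({"river", "water"}, "shore"),
    ({"river", "boat"}, "shore"),
    ({"money", "loan"}, "finance"),
    ({"money", "account"}, "finance"),
    ({"loan", "account"}, "finance"),
    ({"water", "boat"}, "shore"),
]
rows = [context for context, _ in training]
labels = [sense for _, sense in training]
features = sorted(set().union(*rows))  # sorted so tie-breaking is deterministic

tree = build_tree(rows, labels, features)
print(classify(tree, {"river", "fishing"}))  # shore
print(classify(tree, {"loan", "interest"}))  # finance
```

In a setup like the one the abstract describes, one such classifier would be trained per ambiguous word and scored with 10-fold cross-validation; the sketch above only shows the tree induction and lookup steps on a single toy word.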

Index Terms

Computer Science

Information Sciences

Keywords

Word Sense Disambiguation, Decision Tree, Assamese, Supervised approach