International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 126 - Number 11
Year of Publication: 2015
Authors: Aamira Shabnam, Tapashee Tabassum Urmi, Md. Saiful Islam
DOI: 10.5120/ijca2015906224
Aamira Shabnam, Tapashee Tabassum Urmi, Md. Saiful Islam. A Faster Approach to Sort Unicode Represented Bengali Words. International Journal of Computer Applications. 126, 11 (September 2015), 29-32. DOI=10.5120/ijca2015906224
Sorting Bengali words is a constituent part of Bengali language processing, Bengali data manipulation, and Bengali database systems, and it poses many challenges. A simple lexicographic ordering based on the Unicode representation does not yield the correct order of Bengali words, because the character order in Unicode for Bengali differs from the order suggested by the Bangla Academy. In addition, modifiers, compound characters, the dual representation of some characters in Unicode, and the precedence of vowels complicate the problem further. Our study aims to apply the linguistic order to Unicode-represented Bengali text while keeping time and space costs as low as possible. In this paper, we propose an approach that sorts Bengali text using popular sorting algorithms with a slight modification in the character mapping, so that the result follows the linguistic order of the language and requires no extra memory. We also present a run-time comparison with previous work on this topic.
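The general idea of sorting through a character mapping can be illustrated with a minimal Python sketch. This is not the mapping or the algorithm proposed in the paper: the table `ASSUMED_ORDER`, the helper `bengali_key`, and the function `sort_bengali` are hypothetical names introduced here, the order shown is an assumed one for a subset of letters, and vowel signs, compound characters, and dual representations are not handled.

```python
# Assumed collation order for a subset of Bengali letters (illustrative only,
# not the table used in the paper).
ASSUMED_ORDER = (
    "অআইঈউঊঋএঐওঔ"          # independent vowels
    "\u0982\u0983\u0981"    # anusvara, visarga, candrabindu (assumed position)
    "কখগঘঙচছজঝঞটঠডঢণ"
    "ত\u09ceথদধনপফবভমযরলশষসহ"  # U+09CE (khanda ta) placed right after ত
)

# Map each listed character to its rank in the assumed order.
RANK = {ch: i for i, ch in enumerate(ASSUMED_ORDER)}


def bengali_key(word):
    """Translate a word into a list of ranks used for comparison.

    Characters missing from the table sort after all listed ones,
    in code point order (a simplification made for this sketch).
    """
    return [RANK.get(ch, len(RANK) + ord(ch)) for ch in word]


def sort_bengali(words):
    """Sort words with the standard library sort and the rank-based key."""
    return sorted(words, key=bengali_key)


if __name__ == "__main__":
    letters = ["হ", "\u09ce", "ত", "থ"]
    print(sorted(letters))        # plain code point order
    print(sort_bengali(letters))  # order under the assumed mapping
```

In this sketch the sorting algorithm itself is unchanged; only the comparison key differs. Under plain code point ordering, U+09CE sorts after হ because its code point is higher, while the assumed mapping places it directly after ত.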