Research Article

An Encoding Scheme to Support Efficient Searching and Linguistic Sorting for Bengali Texts

by Tareque Mohmud Chowdhury, M. A. Mottalib
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 46 - Number 23
Year of Publication: 2012
Authors: Tareque Mohmud Chowdhury, M. A. Mottalib
10.5120/7108-9811

Tareque Mohmud Chowdhury, M. A. Mottalib. An Encoding Scheme to Support Efficient Searching and Linguistic Sorting for Bengali Texts. International Journal of Computer Applications. 46, 23 (May 2012), 37-40. DOI=10.5120/7108-9811

@article{ 10.5120/7108-9811,
author = { Tareque Mohmud Chowdhury and M. A. Mottalib },
title = { An Encoding Scheme to Support Efficient Searching and Linguistic Sorting for Bengali Texts },
journal = { International Journal of Computer Applications },
issue_date = { May 2012 },
volume = { 46 },
number = { 23 },
month = { May },
year = { 2012 },
issn = { 0975-8887 },
pages = { 37-40 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume46/number23/7108-9811/ },
doi = { 10.5120/7108-9811 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:40:26.708318+05:30
%A Tareque Mohmud Chowdhury
%A M. A. Mottalib
%T An Encoding Scheme to Support Efficient Searching and Linguistic Sorting for Bengali Texts
%J International Journal of Computer Applications
%@ 0975-8887
%V 46
%N 23
%P 37-40
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Most known encoding schemes for the Bengali language share a common drawback: the order of characters in the encoding scheme differs from the linguistic order. As a result, sorting Bengali texts by encoded value does not produce the correct linguistic order. Even if Bengali characters are encoded in linguistic order, the special properties of Bengali conjunct characters mean that Bengali text cannot be sorted directly using only traditional sorting algorithms. In this paper we propose an encoding scheme for Bengali script that supports linguistic sorting of texts directly by their encoded values. The new encoding scheme can therefore save a significant amount of processing time for sort operations over large volumes of Bengali text.
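
The core idea, that codes assigned in linguistic order let a plain byte-wise sort yield linguistic order, can be illustrated with a minimal Python sketch. It is not the encoding proposed in the paper: the table `ASSUMED_ORDER`, the one-byte codes, and the function `encode` are hypothetical, the character order shown (independent vowels, then anusvara, visarga and candrabindu, then consonants) is an assumption chosen for illustration, and conjunct characters are not handled at all.

```python
# Hypothetical linguistic order for a tiny subset of Bengali characters.
# Assumption for illustration only: independent vowels first, then the
# signs anusvara / visarga / candrabindu, then consonants.
ASSUMED_ORDER = [
    "\u0985",  # letter A
    "\u0986",  # letter AA
    "\u0987",  # letter I
    "\u0982",  # sign anusvara
    "\u0983",  # sign visarga
    "\u0981",  # sign candrabindu
    "\u0995",  # letter KA
    "\u0996",  # letter KHA
    "\u0997",  # letter GA
]

# Rank 1 is the first character in the assumed order.
RANK = {ch: i + 1 for i, ch in enumerate(ASSUMED_ORDER)}


def encode(word: str) -> bytes:
    """Map each character to a one-byte code equal to its assumed rank."""
    return bytes(RANK[ch] for ch in word)


# Three two-character strings: KA+anusvara, KA+I, KA+KA.
words = ["\u0995\u0982", "\u0995\u0987", "\u0995\u0995"]

# Python compares str values by Unicode code point, so anusvara (U+0982)
# sorts before the vowel I (U+0987).
by_code_point = sorted(words)

# Sorting the encoded bytes follows the assumed linguistic table instead.
by_encoded = sorted(words, key=encode)
```

Under these assumptions the two orderings differ: code point order places KA+anusvara first, while the rank-based encoding places KA+I first. Once texts are stored in such an encoding, no per-comparison lookup is needed, since the stored bytes compare in the intended order directly.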

Index Terms

Computer Science
Information Sciences

Keywords

Bengali Character Encoding, Bengali Text Linguistic Sort, Bengali Text Search