A Revised Unicode based Sorting Algorithm for Bengali Texts

Md. Mahfuzur Rahaman

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 147 - Number 14

Year of Publication: 2016

Authors: Md. Mahfuzur Rahaman

10.5120/ijca2016911305

Md. Mahfuzur Rahaman. A Revised Unicode based Sorting Algorithm for Bengali Texts. International Journal of Computer Applications. 147, 14 (Aug 2016), 35-40. DOI=10.5120/ijca2016911305

@article{ 10.5120/ijca2016911305,
  author = { Md. Mahfuzur Rahaman },
  title = { A Revised Unicode based Sorting Algorithm for Bengali Texts },
  journal = { International Journal of Computer Applications },
  issue_date = { Aug 2016 },
  volume = { 147 },
  number = { 14 },
  month = { Aug },
  year = { 2016 },
  issn = { 0975-8887 },
  pages = { 35-40 },
  numpages = {9},
  url = { https://ijcaonline.org/archives/volume147/number14/25836-2016911305/ },
  doi = { 10.5120/ijca2016911305 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T23:51:58.378410+05:30
%A Md. Mahfuzur Rahaman
%T A Revised Unicode based Sorting Algorithm for Bengali Texts
%J International Journal of Computer Applications
%@ 0975-8887
%V 147
%N 14
%P 35-40
%D 2016
%I Foundation of Computer Science (FCS), NY, USA

Abstract

This paper describes a sorting algorithm for Bengali texts. Sorting is one of the most vital tasks in Bengali Natural Language Processing. Because Unicode is far preferable to ASCII encoding, Bengali text should be represented in Unicode. However, owing to some distinct properties of the Bengali language, Bengali texts cannot be sorted directly by the order of characters in the Unicode scheme. A few works have addressed this topic, some for ASCII encoding and others for Unicode, but they have drawbacks, and there is still no standard for sorting Bengali texts. In this paper, we discuss the previous approaches and propose a revised and easier procedure for sorting Unicode Bengali texts. We use a mapping to simplify the sorting process, so the overall efficiency depends on the efficiency of the underlying sorting algorithm. The method can sort any Unicode Bengali text, and it will also work for Unicode text in any language if only the mapping is changed. The process is therefore both keyboard and language independent.
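The mapping idea described in the abstract can be illustrated with a minimal Python sketch. It is not the procedure from the paper: the names COLLATION_ORDER, RANK, sort_key and sort_words are hypothetical, and the small collation order below is an assumption chosen only to show how a mapping can override code point order. Each character is replaced by its rank in the assumed order, and any general-purpose sorting algorithm then sorts the resulting rank sequences.

```python
# Assumed collation order for a small subset of Bengali characters.
# Illustrative only; this is NOT the mapping defined in the paper.
COLLATION_ORDER = [
    "\u0985",  # BENGALI LETTER A (অ)
    "\u0986",  # BENGALI LETTER AA (আ)
    "\u0982",  # BENGALI SIGN ANUSVARA (ং): lower code point than the vowels,
               # but placed after them in this assumed order
    "\u0995",  # BENGALI LETTER KA (ক)
    "\u09A1",  # BENGALI LETTER DDA (ড)
    "\u09DC",  # BENGALI LETTER RRA: higher code point than the basic
               # consonants, but placed right after DDA in this assumed order
    "\u09A4",  # BENGALI LETTER TA (ত)
    "\u09AE",  # BENGALI LETTER MA (ম)
]

# Map each character to its rank (1-based) in the assumed order.
RANK = {ch: i for i, ch in enumerate(COLLATION_ORDER, start=1)}

# Characters missing from the mapping sort after all mapped characters,
# ordered among themselves by code point.
UNMAPPED_BASE = len(COLLATION_ORDER) + 1


def sort_key(word):
    """Translate a word into a list of ranks taken from the mapping."""
    return [RANK.get(ch, UNMAPPED_BASE + ord(ch)) for ch in word]


def sort_words(words):
    """Sort words by mapped ranks; the built-in sort does the actual work."""
    return sorted(words, key=sort_key)


if __name__ == "__main__":
    words = ["\u09A4", "\u09DC", "\u0985", "\u0982"]
    print("code point order:", sorted(words))
    print("mapped order:    ", sort_words(words))
```

Because the sort itself is delegated to the built-in sorted function, the cost is that of the underlying sorting algorithm plus one pass over each word to build its key. Adapting the sketch to another language would only require replacing COLLATION_ORDER.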

Index Terms

Computer Science

Information Sciences

Keywords

Bengali Word Sorting, Bengali Text Sorting, Unicode Bengali Text Sorting, Bengali Linguistic Sort, Bengali Dictionary Sort, Bangla Academy Dictionary Based Sort