Research Article

A Comparative Study on Arabic Stemmers

by Mohamed Y. Dahab, Asma'a Al Ibrahim, Rihab Al-Mutawa
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 125 - Number 8
Year of Publication: 2015
Authors: Mohamed Y. Dahab, Asma'a Al Ibrahim, Rihab Al-Mutawa
DOI: 10.5120/ijca2015906129

Mohamed Y. Dahab, Asma'a Al Ibrahim, Rihab Al-Mutawa. A Comparative Study on Arabic Stemmers. International Journal of Computer Applications. 125, 8 (September 2015), 38-47. DOI=10.5120/ijca2015906129

@article{ 10.5120/ijca2015906129,
author = { Mohamed Y. Dahab, Asma'a Al Ibrahim, Rihab Al-Mutawa },
title = { A Comparative Study on Arabic Stemmers },
journal = { International Journal of Computer Applications },
issue_date = { September 2015 },
volume = { 125 },
number = { 8 },
month = { September },
year = { 2015 },
issn = { 0975-8887 },
pages = { 38-47 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume125/number8/22455-2015906129/ },
doi = { 10.5120/ijca2015906129 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:15:31.611057+05:30
%A Mohamed Y. Dahab
%A Asma'a Al Ibrahim
%A Rihab Al-Mutawa
%T A Comparative Study on Arabic Stemmers
%J International Journal of Computer Applications
%@ 0975-8887
%V 125
%N 8
%P 38-47
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Stemming is a pre-processing step in many applications, including text mining, information retrieval, and machine translation. Arabic has many properties that complicate stemming and other automatic methods, because the language relies on both inflectional and derivational morphology to produce the various forms of its words. Many researchers have proposed algorithms to address these problems. This paper compares the existing Arabic stemmers in terms of their methodology, usage, main idea, algorithm, affixes, limitations, output, and sensitivity to both diacritics and context.
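
To make the notions of affix removal and diacritic sensitivity concrete, the following is a minimal sketch of a light (affix-stripping) stemmer in Python. It is not one of the stemmers compared in the paper: the function name `light_stem`, the affix lists, and the minimum stem length are all illustrative assumptions.

```python
import re

# Arabic diacritics (tashkeel), U+064B through U+0652. Removing them
# first makes this sketch insensitive to diacritics.
DIACRITICS = re.compile(r"[\u064B-\u0652]")

# Illustrative affix lists (an assumption, not taken from any published
# stemmer). Longer prefixes come first so they are tried before shorter ones.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي"]


def light_stem(word: str, min_len: int = 3) -> str:
    """Strip diacritics, then at most one prefix and one suffix.

    An affix is removed only if at least `min_len` letters remain,
    a common guard against over-stemming short words.
    """
    word = DIACRITICS.sub("", word)
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


if __name__ == "__main__":
    for w in ["الكتاب", "المعلمون", "كِتَاب"]:
        print(w, "->", light_stem(w))
```

A sketch like this ignores context entirely and leaves derivational patterns untouched, which is where root-based and dictionary-based approaches differ from light stemming.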

Index Terms

Computer Science
Information Sciences

Keywords

Arabic Stemmers, Arabic Morphological Analyzer