Research Article

Arabic Anaphora Resolution: Corpus of the Holy Qur’an Annotated with Anaphoric Information

by Khadiga M. Seddik, Ali Farghaly, Aly Aly Fahmy
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 124 - Number 15
Year of Publication: 2015
DOI: 10.5120/ijca2015905709

Khadiga M. Seddik, Ali Farghaly, Aly Aly Fahmy. Arabic Anaphora Resolution: Corpus of the Holy Qur’an Annotated with Anaphoric Information. International Journal of Computer Applications. 124, 15 (August 2015), 35-43. DOI=10.5120/ijca2015905709

@article{ 10.5120/ijca2015905709,
author = { Khadiga M. Seddik and Ali Farghaly and Aly Aly Fahmy },
title = { Arabic Anaphora Resolution: Corpus of the Holy Qur’an Annotated with Anaphoric Information },
journal = { International Journal of Computer Applications },
issue_date = { August 2015 },
volume = { 124 },
number = { 15 },
month = { August },
year = { 2015 },
issn = { 0975-8887 },
pages = { 35-43 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume124/number15/22183-2015905709/ },
doi = { 10.5120/ijca2015905709 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:14:31.954499+05:30
%A Khadiga M. Seddik
%A Ali Farghaly
%A Aly Aly Fahmy
%T Arabic Anaphora Resolution: Corpus of the Holy Qur’an Annotated with Anaphoric Information
%J International Journal of Computer Applications
%@ 0975-8887
%V 124
%N 15
%P 35-43
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper reports on the compilation of a large Arabic corpus of the Holy Qur'an script, annotated with anaphoric relations and other anaphoric information. The corpus provides a multi-dimensional feature vector containing most of the basic anaphoric information needed by statistical anaphora resolution systems. About 24,653 personal pronouns are tagged with their antecedents and with further anaphoric information, such as the distance between the anaphor and its antecedent (measured in verses, words, and segments), gender, number, and person, all of which can be used to build the feature vector of a statistical anaphora resolution system. In addition, the paper describes the compilation of a bank of 481 antecedent patterns; each pattern represents the particular part-of-speech tags corresponding to its antecedent phrase. The aim is to provide a valuable resource that enables future research in Arabic anaphora resolution and supports future work on analyzing the Qur'an script. The corpus can also be used for training, testing, and evaluating anaphora resolution systems.
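As a rough illustration of the kind of per-pronoun record the abstract describes, the following Python sketch stores the listed attributes (distance in verses, words, and segments, plus gender, number, and person) and flattens them into a numeric feature vector. The class name, field names, integer encodings, and the example values are all assumptions made for illustration; they are not taken from the corpus or from the authors' annotation scheme.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical integer encodings for the categorical attributes.
# The corpus's actual tag set is not specified in the abstract.
GENDER = {"masculine": 0, "feminine": 1}
NUMBER = {"singular": 0, "dual": 1, "plural": 2}
PERSON = {"first": 0, "second": 1, "third": 2}


@dataclass(frozen=True)
class AnaphorRecord:
    """One annotated pronoun with its antecedent (illustrative layout only)."""
    pronoun: str
    antecedent: str
    verse_distance: int    # verses between anaphor and antecedent
    word_distance: int     # words between anaphor and antecedent
    segment_distance: int  # segments between anaphor and antecedent
    gender: str
    number: str
    person: str

    def to_feature_vector(self) -> List[int]:
        """Flatten the record into a numeric vector for a statistical resolver."""
        return [
            self.verse_distance,
            self.word_distance,
            self.segment_distance,
            GENDER[self.gender],
            NUMBER[self.number],
            PERSON[self.person],
        ]


# Invented example values, not drawn from the corpus.
example = AnaphorRecord(
    pronoun="hum",
    antecedent="al-nas",
    verse_distance=1,
    word_distance=7,
    segment_distance=9,
    gender="masculine",
    number="plural",
    person="third",
)
vector = example.to_feature_vector()
print(vector)
```

A statistical resolver would typically compute one such vector per candidate antecedent and rank the candidates; a one-hot encoding of the categorical fields may suit some learners better than the integer codes used here.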

Index Terms

Computer Science
Information Sciences

Keywords

Anaphora resolution, Arabic language, Corpus, Quran, Pronominal anaphora