Arabic Anaphora Resolution: Corpus of the Holy Qur’an Annotated with Anaphoric Information

Khadiga M. Seddik; Ali Farghaly; Aly Aly Fahmy

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 124 - Number 15

Year of Publication: 2015

DOI: 10.5120/ijca2015905709

Khadiga M. Seddik, Ali Farghaly, Aly Aly Fahmy. Arabic Anaphora Resolution: Corpus of the Holy Qur’an Annotated with Anaphoric Information. International Journal of Computer Applications. 124, 15 (August 2015), 35-43. DOI=10.5120/ijca2015905709

@article{10.5120/ijca2015905709,
  author = {Khadiga M. Seddik and Ali Farghaly and Aly Aly Fahmy},
  title = {Arabic Anaphora Resolution: Corpus of the Holy Qur’an Annotated with Anaphoric Information},
  journal = {International Journal of Computer Applications},
  issue_date = {August 2015},
  volume = {124},
  number = {15},
  month = {August},
  year = {2015},
  issn = {0975-8887},
  pages = {35-43},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume124/number15/22183-2015905709/},
  doi = {10.5120/ijca2015905709},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T23:14:31.954499+05:30
%A Khadiga M. Seddik
%A Ali Farghaly
%A Aly Aly Fahmy
%T Arabic Anaphora Resolution: Corpus of the Holy Qur’an Annotated with Anaphoric Information
%J International Journal of Computer Applications
%@ 0975-8887
%V 124
%N 15
%P 35-43
%D 2015
%I Foundation of Computer Science (FCS), NY, USA

Abstract

This paper reports on the compilation of a large Arabic corpus of the Holy Qur'an text, annotated with anaphoric relations and related anaphoric information. The annotation provides a multi-dimensional feature vector that covers most of the basic anaphoric information needed by statistical anaphora resolution systems. About 24,653 personal pronouns are tagged with their antecedents and with further attributes: the distance between the anaphor and its antecedent measured in verses, words, and segments, together with gender, number, person, and other information that can be used to build the feature vector of a statistical anaphora resolution system. The paper also describes the compilation of a bank of sentence patterns consisting of 481 antecedent patterns; each pattern represents the particular part-of-speech tags corresponding to its antecedent phrase. The aim is to provide a valuable resource that enables future research in Arabic anaphora resolution and supports future work on analyzing the Qur'an text. The corpus can also be used for training, testing, and evaluating anaphora resolution systems.
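As an illustration of how the attributes listed in the abstract might feed a statistical resolver, the following is a minimal sketch, not the authors' annotation format. The class name `AnaphorRecord`, its field names, the category codes, and the sample values are all hypothetical and are not taken from the corpus.

```python
from dataclasses import dataclass

# Hypothetical integer codes for the categorical attributes.
GENDER_CODES = {"M": 0, "F": 1}
NUMBER_CODES = {"S": 0, "D": 1, "P": 2}  # singular, dual, plural


@dataclass(frozen=True)
class AnaphorRecord:
    """One annotated pronoun (hypothetical layout, assumed for illustration)."""
    verse_distance: int    # verses between the anaphor and its antecedent
    word_distance: int     # words between the anaphor and its antecedent
    segment_distance: int  # segments between the anaphor and its antecedent
    gender: str            # "M" or "F"
    number: str            # "S", "D", or "P"
    person: int            # 1, 2, or 3


def to_feature_vector(record: AnaphorRecord) -> list[int]:
    """Flatten a record into a numeric feature vector."""
    return [
        record.verse_distance,
        record.word_distance,
        record.segment_distance,
        GENDER_CODES[record.gender],
        NUMBER_CODES[record.number],
        record.person,
    ]


# Made-up values, used only to show the shape of the vector.
sample = AnaphorRecord(
    verse_distance=1, word_distance=12, segment_distance=20,
    gender="F", number="P", person=3,
)
print(to_feature_vector(sample))  # [1, 12, 20, 1, 2, 3]
```

Encoding the categorical attributes as small integers keeps the sketch short; a real system might prefer one-hot encoding so that a learner does not treat the codes as ordered quantities.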

Index Terms

Computer Science

Information Sciences

Keywords

Anaphora resolution, Arabic language, Corpus, Quran, Pronominal anaphora