International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 74 - Number 17
Year of Publication: 2013
Authors: Ibrahim Imam, Nihal Nounou, Alaa Hamouda, Hebat Allah Abdul Khalek
DOI: 10.5120/12980-0237
Ibrahim Imam, Nihal Nounou, Alaa Hamouda, Hebat Allah Abdul Khalek. An Ontology-based Summarization System for Arabic Documents (OSSAD). International Journal of Computer Applications. 74, 17 (July 2013), 38-43. DOI=10.5120/12980-0237
As web resources multiply and the volume of available information grows, automatic summarization systems have become necessary. Summarization is needed most when searching for information on the web, where the user's query targets a particular domain of interest, so domain-based summaries serve best. Although there is plenty of research on domain-based summarization for English, little exists for Arabic because of the shortage of knowledge bases. This paper introduces an Ontology-based Summarization System for Arabic Documents (OSSAD). Domain knowledge is extracted from an Arabic corpus and represented as topic-related concepts/keywords and the lexical relations among them. The user's query is expanded first using the Arabic WordNet and then using the domain-specific knowledge base. For summarization, a decision tree algorithm (C4.5) is used, trained on a set of features extracted from the original documents. The Essex Arabic Summaries Corpus (EASC) was used as the testing dataset. Recall-Oriented Understudy for Gisting Evaluation (ROUGE) was used to compare OSSAD summaries with human summaries and with the output of other automatic summarization systems; the proposed approach showed promising results.
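To make the evaluation step concrete, the following is a minimal sketch of how a recall-oriented n-gram overlap score in the spirit of ROUGE-N can be computed between a system summary and one human summary. It is not the official ROUGE toolkit and not the authors' evaluation code: the function names `ngrams` and `rouge_n_recall` are hypothetical, a single reference summary is assumed, and tokenization is plain whitespace splitting, which would need to be replaced by proper Arabic tokenization and normalization in practice.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Simplified ROUGE-N recall against a single reference summary.

    Assumption: whitespace tokenization, no stemming or stop-word removal.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        # Reference has no n-grams of this order; define recall as 0.
        return 0.0
    # Clip each match by how often the n-gram occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

Because the score is divided by the number of n-grams in the human summary, it rewards a system summary for covering the reference content rather than for being short, which is why it is described as recall oriented.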