International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 93 - Number 12
Year of Publication: 2014
Authors: Riadh Ouersighni
DOI: 10.5120/16269-6001
Riadh Ouersighni. Robust Rule-based Approach in Arabic Processing. International Journal of Computer Applications. 93, 12 (May 2014), 31-37. DOI=10.5120/16269-6001
A parser is a key component of many computer applications, such as information retrieval, knowledge extraction, and machine translation. This paper presents a robust, large-scale system for parsing Arabic sentences. From a practical point of view, the system can analyze real-world sentences thanks to the wide coverage of its linguistic knowledge, which was developed within the DIINAR-MBC European project. The parser is designed to be robust against difficult input that cannot be parsed correctly by the system's standard grammar rules, whether that input is extra-grammatical, ill-formed, or simply unexpected. Most systems take an algorithmic approach to robustness, in which the parsing program is extended with heuristics to handle defective cases. This study adopts a different, grammar-based approach: robust rules are introduced into the grammar itself, and constraints are relaxed when necessary. The parser has been evaluated on real-world sentences, with very encouraging results: it achieves 95% coverage.
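The grammar-based idea described in the abstract (try the standard rule, relax a constraint if that fails, and fall back to a catch-all robust rule as a last resort) can be illustrated with a toy sketch. This is not the paper's parser and uses none of the DIINAR-MBC resources. The lexicon, the single rule `S -> V N`, the constraint name `gender_agreement`, and the functions `parse` and `robust_parse` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical toy lexicon: transliterated word -> (category, gender).
LEXICON = {
    "kataba": ("V", "m"),      # "he wrote"
    "katabat": ("V", "f"),     # "she wrote"
    "al-waladu": ("N", "m"),   # "the boy"
    "al-bintu": ("N", "f"),    # "the girl"
}


@dataclass
class Analysis:
    label: str                                   # "S" or "FRAG"
    relaxed: list = field(default_factory=list)  # constraints that were relaxed


def parse(tokens, relax=()):
    """Apply the single toy rule S -> V N.

    The rule carries a gender-agreement constraint between verb and noun.
    The constraint is enforced unless its name appears in `relax`.
    Returns an Analysis on success, or None if the rule does not apply.
    """
    if len(tokens) != 2 or any(t not in LEXICON for t in tokens):
        return None
    (cat1, gen1), (cat2, gen2) = LEXICON[tokens[0]], LEXICON[tokens[1]]
    if cat1 != "V" or cat2 != "N":
        return None
    if gen1 == gen2:
        return Analysis("S")
    if "gender_agreement" in relax:
        return Analysis("S", ["gender_agreement"])
    return None


def robust_parse(tokens):
    """Strict parse first, then a relaxed parse, then a catch-all robust rule."""
    result = parse(tokens)
    if result is None:
        result = parse(tokens, relax=("gender_agreement",))
    if result is None:
        # Robust rule: accept any remaining input as an unanalyzed fragment.
        result = Analysis("FRAG")
    return result


if __name__ == "__main__":
    for sentence in (["kataba", "al-waladu"],
                     ["katabat", "al-waladu"],
                     ["al-waladu", "xyz"]):
        print(sentence, robust_parse(sentence))
```

In this sketch the robustness lives in the rule set and its constraints rather than in extra heuristics bolted onto the parsing procedure, and every analysis records which constraints were relaxed so that downstream components can tell a strict parse from a repaired one.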