International Journal of Computer Applications |
Foundation of Computer Science (FCS), NY, USA |
Volume 167 - Number 12 |
Year of Publication: 2017 |
Authors: Saima Munir, Qaisar Abbas, Bushra Jamil |
10.5120/ijca2017914492 |
Saima Munir, Qaisar Abbas, Bushra Jamil. Dependency Parsing using the URDU.KON-TB Treebank. International Journal of Computer Applications. 167, 12 (Jun 2017), 25-31. DOI=10.5120/ijca2017914492
In this paper, we present an evaluation of URDU.KON-TB in the dependency parsing domain. The URDU.KON-TB treebank was developed on the basis of phrase structure and a hyper dependency structure, the latter consisting only of functional labels on constituents. The treebank was annotated with three levels of annotation tagsets, the semi-semantic POS (SSP), semi-semantic syntactic (SSS) and functional (F) tagsets, and had previously been checked in the phrase structure parsing domain. To evaluate the treebank in the dependency parsing domain, we selected MaltParser. To use the data in the parser, we converted the annotated URDU.KON-TB data to the CoNLL format. The compatibility of the data with CoNLL was measured along with its usability in the dependency parsing domain. To make the data compatible, a few assumptions were made. The converted data was used to evaluate the system, with 80% of the data used for training and 20% for testing. We performed eight experiments. Four experiments were conducted on the converted data with six different feature models. The results of these experiments show that the URDU.KON-TB treebank is not suitable for dependency parsing, because the Head information of the dependency relations is missing from the treebank. We then performed four experiments with an assumption-based enhancement that adds Head information. The algorithm used to train and test on the data is the Nivre arc-eager algorithm. The new experiments show that this treebank data can be used to develop a new dependency treebank for Urdu.
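As a rough illustration of the arc-eager transition system named in the abstract, the sketch below applies a hand-written transition sequence to a toy sentence and recovers the Head of each token. It is not MaltParser's implementation and includes no learned classifier or feature model; the function names, the English example sentence, and the simplified three-column output (ID, FORM, HEAD, not the full CoNLL column set) are all assumptions made for illustration.

```python
from collections import deque


def arc_eager_parse(n_tokens, transitions):
    """Apply arc-eager transitions to tokens 1..n_tokens; return {dependent: head}.

    Token 0 is the artificial root. This is a hypothetical helper for
    illustration only; a real parser would predict each transition.
    """
    stack = [0]
    buffer = deque(range(1, n_tokens + 1))
    heads = {}
    for action in transitions:
        if action == "SHIFT":
            # Move the next input token onto the stack.
            stack.append(buffer.popleft())
        elif action == "LEFT-ARC":
            # Front of buffer becomes head of stack top; pop the stack.
            top = stack[-1]
            if top == 0 or top in heads:
                raise ValueError("LEFT-ARC not permitted here")
            heads[top] = buffer[0]
            stack.pop()
        elif action == "RIGHT-ARC":
            # Stack top becomes head of front of buffer; push that token.
            nxt = buffer.popleft()
            heads[nxt] = stack[-1]
            stack.append(nxt)
        elif action == "REDUCE":
            # Pop a token that has already received its head.
            if stack[-1] not in heads:
                raise ValueError("REDUCE requires a token with a head")
            stack.pop()
        else:
            raise ValueError(f"unknown transition: {action}")
    return heads


def to_rows(forms, heads):
    """Format tokens as simplified tab-separated ID, FORM, HEAD rows."""
    return [f"{i}\t{form}\t{heads[i]}" for i, form in enumerate(forms, start=1)]


# Toy example (assumed, not taken from the treebank).
forms = ["She", "reads", "books"]
sequence = ["SHIFT", "LEFT-ARC", "RIGHT-ARC", "RIGHT-ARC"]
heads = arc_eager_parse(len(forms), sequence)
rows = to_rows(forms, heads)
print("\n".join(rows))
```

The HEAD column filled in here is the kind of information the abstract reports as missing from the original treebank annotation; without it, a transition-based parser has no gold arcs to learn from.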