- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 47043 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Ghasem Sani, Gholamreza
- Abstract:
- Grammar induction is one of the research topics of natural language processing. Grammar induction methods fall into three main groups: supervised, semi-supervised, and unsupervised. Recently, the development of treebanks for different languages has motivated supervised methods. The main goal of this project has been to extract a dependency grammar from a dependency treebank. In a treebank, the structure of every sentence is represented as a dependency tree in which the relations between words are specified. In this representation, synonymous sentences that differ only in word order share the same dependency structure. Because of this property, the accuracy of dependency parsers does not decrease on Persian. The two main families of dependency parsing approaches are graph-based and transition-based methods. In this research we examine both approaches and demonstrate that the graph-based method outperforms transition-based methods on Persian, although transition-based methods have a unique feature that makes them superior on some sentences. We also show that preprocessing the sentences can improve the accuracy of transition-based methods.
- Keywords:
- Natural Language Processing ; Dependency Grammar ; Persian Language ; Automatic Grammar Induction