
Introducing an Approach to Build a Phrase Structure Treebank Using Persian Dependency Treebank

Soltanzadeh, Fatemeh | 2014

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 45571 (31)
  4. University: Sharif University of Technology
  5. Department: Languages and Linguistics Center
  6. Advisor(s): Bahrani, Mohammad; Eslami, Moharram
  7. Abstract: Treebanks are useful in many natural language processing applications, such as machine translation, speech recognition, and information extraction. They are also used in theoretical linguistics to study languages: for instance, they are valuable for developing syntactic theories, calculating the frequency of syntactic rules, and evaluating and comparing statistical models.
    Most treebanks are based on either phrase structure grammar or dependency grammar. In a phrase structure treebank, a sentence is divided into phrases, each composed of several words. In a dependency treebank, by contrast, pairs of words are connected by dependency relations.
    The goal of this study is to present a method for converting the dependency-based syntactic parse tree of a sentence into an equivalent phrase structure tree. We extract Persian conversion rules for various syntactic structures, such as coordination, modal and auxiliary verbs, quantifier phrases, and compound sentences. We then apply an algorithm together with these rules to obtain the phrase structure tree of a sentence from its dependency tree. The results show that the system achieves an F-measure of 94.38% for syntactic parsing of Persian sentences.
  8. Keywords: Natural Language Processing; Persian Language; Corpus; Treebank; Phrase Structure Grammar; Dependency Grammar
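
To make the idea of dependency-to-phrase-structure conversion concrete, here is a minimal sketch of the generic head-projection technique: every word that has dependents projects a phrase containing itself and its dependents' subtrees in surface order. This is not the thesis's rule set or algorithm, which the abstract does not detail. The `Token` class, the `PHRASE_LABEL` table, the tag names, and the English toy sentence are all assumptions made for illustration, and the sketch only handles projective trees.

```python
from dataclasses import dataclass

# Hypothetical mapping from a head's part-of-speech tag to a phrase label.
# A real conversion would rely on language-specific rules instead.
PHRASE_LABEL = {"N": "NP", "V": "VP", "ADJ": "ADJP", "P": "PP"}


@dataclass
class Token:
    index: int  # 1-based position in the sentence
    form: str   # surface word
    pos: str    # part-of-speech tag
    head: int   # index of the governing token, 0 for the root


def to_phrase_structure(tokens):
    """Return a bracketed phrase structure string for a projective dependency tree."""
    children = {t.index: [] for t in tokens}
    root = None
    for t in tokens:
        if t.head == 0:
            root = t
        else:
            children[t.head].append(t)

    def project(tok):
        leaf = f"({tok.pos} {tok.form})"
        deps = children[tok.index]
        if not deps:
            return leaf  # a word without dependents stays a bare leaf
        # Place the head among its dependents according to surface position.
        parts = sorted(deps + [tok], key=lambda t: t.index)
        inner = " ".join(leaf if p is tok else project(p) for p in parts)
        label = PHRASE_LABEL.get(tok.pos, "XP")  # "XP" is a placeholder label
        return f"({label} {inner})"

    return project(root)


# Toy example (assumed): "the cat sleeps"
sentence = [
    Token(1, "the", "DET", 2),
    Token(2, "cat", "N", 3),
    Token(3, "sleeps", "V", 0),
]
print(to_phrase_structure(sentence))
```

Structures such as coordination or auxiliary verbs typically need dedicated rules on top of a flat projection like this one, which is presumably why the study extracts construction-specific conversion rules.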

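The abstract reports an F-measure but does not say how it was computed. A common way to score phrase structure output is labeled bracket precision and recall, combined as F = 2PR / (P + R). The sketch below assumes each constituent is represented as a hypothetical `(label, start, end)` tuple; it illustrates the metric in general and is not the evaluation procedure used in the thesis.

```python
from collections import Counter


def f_measure(gold, predicted):
    """Labeled bracket F-measure.

    gold, predicted: iterables of (label, start, end) constituents.
    """
    gold_c, pred_c = Counter(gold), Counter(predicted)
    # Multiset intersection counts constituents present in both trees.
    matched = sum((gold_c & pred_c).values())
    if matched == 0:
        return 0.0
    precision = matched / sum(pred_c.values())
    recall = matched / sum(gold_c.values())
    return 2 * precision * recall / (precision + recall)


# Toy example (assumed spans): one of two predicted brackets matches the gold tree.
gold_brackets = [("NP", 0, 2), ("VP", 0, 3)]
pred_brackets = [("NP", 0, 2), ("VP", 1, 3)]
score = f_measure(gold_brackets, pred_brackets)
```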
