Unsupervised grammar induction using a parent based constituent context model

Mirroshandel, S. A. ; Sharif University of Technology | 2008

  1. Type of Document: Article
  2. DOI: 10.3233/978-1-58603-891-5-293
  3. Publisher: IOS Press, 2008
  4. Abstract:
  5. Grammar induction is one of the attractive research areas in natural language processing. Since both supervised and, to some extent, semi-supervised grammar induction methods require large treebanks, and such treebanks do not currently exist for many languages, we focus our attention on unsupervised approaches. The Constituent Context Model (CCM) seems to be the state of the art in unsupervised grammar induction. In this paper, we show that the performance of CCM on free word order languages (FWOLs) such as Persian is inferior to its performance on fixed word order languages such as English. We also introduce a novel approach, called the parent-based constituent context model (PCCM), and show that by using a notion of history, namely the context and constituent information of each span's parent, the performance of CCM can be significantly improved, especially on FWOLs. © 2008 The authors and IOS Press. All rights reserved.
  6. Keywords:
  7. Computational grammars ; Forestry ; Context modeling ; Fixed-order ; Free word order languages ; Grammar induction ; Semi-supervised ; State of the art ; Treebanks ; Unsupervised approaches ; Natural language processing systems
  8. Source: 18th European Conference on Artificial Intelligence, ECAI 2008, 21 July 2008 through 25 July 2008 ; Volume 178, 2008, Pages 293-297 ; 09226389 (ISSN) ; 978158603891 (ISBN)
  9. URL: https://ebooks.iospress.nl/publication/4378