Modeling Persian Language in the Framework of Complex Networks

Sabooni Aghdam, Amir Mahdi | 2016

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 48110 (31)
  4. University: Sharif University of Technology
  5. Department: Languages and Linguistics Center
  6. Advisor(s): Bahrani, Mohammad
  7. Abstract:
  8. Interest in analyzing human language with complex networks has grown in recent years, and a considerable body of research has accumulated. Unfortunately, complex networks have not yet been applied in Persian linguistics research. To introduce complex networks and their applications to this field, this thesis studies two such applications. First, we build an inclusive network model of the Persian language at two levels, syntax and word co-occurrence, and provide linguistic interpretations of both. In addition, by comparing the co-occurrence networks of different languages, built from a parallel corpus of Quran translations, we examine whether languages can be clustered using word co-occurrence networks. The results indicate that this method clusters languages successfully, which suggests that the properties of these networks reflect the corresponding languages. Second, as an experiment motivated by a shortcoming of the n-gram language model, namely that it disregards long-range dependencies, we build a new language model within the complex networks framework. To evaluate this model, we use motifs as a measure and compare them between real and generated text. The results show that text generated by the new model outperforms text generated by the n-gram model by at least 20 percent in motif signature.
  9. Keywords:
  10. Complex Network ; Language Model ; Language Network ; Language Clustering ; Collocation Network
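
For readers unfamiliar with the term, a word co-occurrence network of the kind mentioned in the abstract can be sketched as follows. This is a minimal illustration only, not the thesis's actual method: the window size, the function names (`cooccurrence_network`, `degrees`, `average_clustering`), and the choice of an undirected weighted graph are all assumptions made for the example.

```python
from collections import defaultdict


def cooccurrence_network(tokens, window=1):
    """Build an undirected weighted graph from a token list.

    Two distinct words are linked when they appear at most `window`
    positions apart; the edge weight counts how often that happens.
    (Hypothetical helper; the thesis may define co-occurrence differently.)
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:  # ignore self-loops
                graph[word][other] += 1
                graph[other][word] += 1
    return graph


def degrees(graph):
    """Number of distinct neighbours of each word."""
    return {word: len(nbrs) for word, nbrs in graph.items()}


def average_clustering(graph):
    """Mean local clustering coefficient (unweighted) over all nodes."""
    if not graph:
        return 0.0
    total = 0.0
    for nbrs in graph.values():
        neighbours = list(nbrs)
        k = len(neighbours)
        if k < 2:
            continue  # such nodes contribute 0 to the average
        links = sum(
            1
            for x in range(k)
            for y in range(x + 1, k)
            if neighbours[y] in graph[neighbours[x]]
        )
        total += 2 * links / (k * (k - 1))
    return total / len(graph)


if __name__ == "__main__":
    g = cooccurrence_network("a b c a b".split(), window=1)
    print(degrees(g), average_clustering(g))
```

Summary statistics such as the degree distribution or the average clustering coefficient, computed per language, are one plausible kind of feature on which languages could then be compared or clustered.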
