Designing a Hybrid Approach to Persian-English Machine Translation

Mohammadifar, Davood | 2016

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 49653 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Ghasem Sani, Gholamreza
  7. Abstract:
  8. With the growth of the web and the resulting increase in data in different languages, the need for machine translation has become unavoidable. Machine translators are built to speed up the translation process. Machine translation methods are generally divided into three categories: rule-based, corpus-based, and hybrid. Rule-based machine translation relies on grammar, but it needs a complete grammar of the language to translate correctly. The corpus-based approach has many variations; one of them is statistical machine translation, which uses probabilistic and statistical rules and is now widely used. Hybrid machine translation combines the advantages of the other two types to improve translation quality. The goal of this project is to design and develop a Persian-to-English hybrid machine translator. Statistical machine translation plays the main role in all of the proposed techniques. Three bilingual corpora, Hun, Central, and Mizan, have been studied. Both pre-processing and post-processing methods are used to improve translation quality: pre-processing methods operate on the input, and post-processing methods check the output. The pre-processing methods, which aim to make the two languages more similar, include lexical modification, separating the components of Persian complex words, reducing verb and noun forms, and changing the Persian “Subject-Object-Verb” word order to the English “Subject-Verb-Object” order. The post-processing methods are also based on reordering the basic elements of the language. In the best case, the BLEU score improvements of our hybrid machine translator (TAMAT) over the statistical machine translator are 0.92 (9.8%), 0.34 (1.5%), and 0.46 (4.1%) points for the Hun, Central, and Mizan corpora, respectively.
  9. Keywords:
  10. Natural Language Processing ; Statistical Machine Translation ; Hybrid Methods ; Compilation ; Persian-English Translator
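
The word-order pre-processing step mentioned in the abstract (Persian "Subject-Object-Verb" to English "Subject-Verb-Object") can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes the sentence has already been split into chunks labeled with hypothetical role tags (`SUBJ`, `OBJ`, `VERB`), and the function name `reorder_sov_to_svo` is made up for illustration. The record does not describe how TAMAT identifies these roles.

```python
from typing import List, Tuple

# (text, role) pairs; the role labels are hypothetical, not from the thesis.
Chunk = Tuple[str, str]


def reorder_sov_to_svo(chunks: List[Chunk]) -> List[Chunk]:
    """Move the first verb chunk in front of the first object chunk.

    Chunks with other roles keep their relative positions. If there is
    no verb or no object, or the verb already precedes the object, the
    input order is returned unchanged.
    """
    roles = [role for _, role in chunks]
    if "VERB" not in roles or "OBJ" not in roles:
        return list(chunks)

    verb_idx = roles.index("VERB")
    obj_idx = roles.index("OBJ")
    if verb_idx < obj_idx:
        return list(chunks)

    reordered = list(chunks)
    verb = reordered.pop(verb_idx)   # take the verb out of its final position
    reordered.insert(obj_idx, verb)  # place it directly before the object
    return reordered


# Transliterated Persian "man ketab ra khandam" ("I read the book").
sentence = [("man", "SUBJ"), ("ketab ra", "OBJ"), ("khandam", "VERB")]
print(" ".join(text for text, _ in reorder_sov_to_svo(sentence)))
```

The idea behind such a step is that the statistical translator then sees source sentences whose constituent order is closer to English, so fewer long-distance reorderings have to be handled during decoding.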
