Automatic Evaluation of Machine Translation Using Abstract Meaning Representation

Sadeghieh, Hamid; Rezae, Saeed; Bahrani, Mohammad

Sadeghieh, Hamid | 2021

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 53944 (31)
University: Sharif University of Technology
Department: Languages and Linguistics Center
Advisor(s): Rezae, Saeed; Bahrani, Mohammad
Abstract:
Compared with other problems in Natural Language Processing, Machine Translation Quality Evaluation faces a particular challenge: translating the same linguistic form in the source language repeatedly does not necessarily yield a unique linguistic form in the target language. Because the Abstract Meaning Representation (AMR) graph is the same for all sentences of similar meaning, this thesis attempts to extend the use of AMR graphs to Machine Translation Quality Evaluation. The main research question is whether the similarity of the AMR graphs of the candidate and reference translations, regardless of their lexical-syntactic structure, can provide an acceptable estimate of the quality of candidate translations. To this end, the maximum F-score calculated from the matching parts of the candidate and reference translations' AMR graphs (referred to as the smatch score) is taken as the semantic similarity index of the two sentences, and therefore as a criterion for machine translation quality. To examine the efficiency of the proposed methods, an experiment was designed in which the smatch:std, smatch:-label, smatch:-vars, and smatch:-wsd scores were computed for the sample data used in the Workshop on Machine Translation held in 2018 (WMT18), covering translation from seven different languages into English. The results show that the smatch scores of the AMR graphs of the candidate and reference translations can serve as an estimate of translation quality, particularly for system-level evaluation of candidate translations.
Moreover, the proposed methods performed acceptably in comparison with the baseline metrics, and even the state-of-the-art metrics, in most of the language pairs under investigation, particularly the Czech-English pair. Among the proposed methods, smatch:-label was the most successful in automated system-level evaluation of translation quality. Finally, although the performance of the proposed methods is affected by the quality level of the candidate translations, just as the baseline metrics are, the correlation of all the proposed methods with human evaluation, except smatch:-vars, was more consistent than that of the baseline and state-of-the-art metrics under investigation.
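The smatch score described in the abstract can be illustrated with a minimal sketch. It assumes each AMR graph is given as a set of (relation, source, target) triples and that variable names never coincide with concept or constant strings. The function names are hypothetical, and the exhaustive search over variable mappings is a stand-in that is only feasible for tiny graphs; it is not the thesis's implementation, and the smatch tool itself uses a heuristic search rather than enumeration.

```python
from itertools import permutations


def f_score(matched, n_candidate, n_reference):
    """F-score from the number of matched triples and the two graph sizes."""
    if matched == 0:
        return 0.0
    precision = matched / n_candidate
    recall = matched / n_reference
    return 2 * precision * recall / (precision + recall)


def rename(triples, mapping):
    """Rewrite candidate variables using the given variable mapping."""
    return {(rel, mapping.get(a, a), mapping.get(b, b)) for rel, a, b in triples}


def smatch_brute_force(candidate, reference):
    """Maximum F-score over all one-to-one variable mappings (tiny graphs only)."""
    cand_vars = sorted({src for rel, src, _ in candidate if rel == "instance"})
    ref_vars = sorted({src for rel, src, _ in reference if rel == "instance"})
    # Pad with None so that surplus candidate variables can stay unmapped.
    padded = ref_vars + [None] * max(0, len(cand_vars) - len(ref_vars))
    best = 0.0
    for perm in permutations(padded, len(cand_vars)):
        # Unmapped variables get a sentinel that cannot match any reference node.
        mapping = {c: (r if r is not None else ("unmapped", c))
                   for c, r in zip(cand_vars, perm)}
        matched = len(rename(candidate, mapping) & reference)
        best = max(best, f_score(matched, len(candidate), len(reference)))
    return best


# Reference: (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b))
reference = {
    ("instance", "w", "want-01"), ("instance", "b", "boy"),
    ("instance", "g", "go-01"), ("ARG0", "w", "b"),
    ("ARG1", "w", "g"), ("ARG0", "g", "b"),
}
# Candidate: same graph with different variable names and one edge missing.
candidate = {
    ("instance", "x", "want-01"), ("instance", "y", "boy"),
    ("instance", "z", "go-01"), ("ARG0", "x", "y"), ("ARG1", "x", "z"),
}

print(round(smatch_brute_force(candidate, reference), 3))  # prints 0.909
```

In this toy example all five candidate triples match under the best mapping, so precision is 1 and recall is 5/6, which gives an F-score of 10/11. The variable names differ between the two graphs, yet the score is unaffected, which is the property that makes the measure independent of surface form.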
Keywords:
Abstract Meaning Representation ; Semantic Graph Similarity ; Automated Machine Translation Quality Evaluation ; Machine Translation

Contents

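Chapter 1: Introduction to the Research
- 1-1 Introduction
- 1-2 Methods of automatic translation quality evaluation
- 1-3 Statement of the problem
- 1-4 Research objective and questions
- 1-5 Research method
- 1-6 Application and significance of the research
- 1-7 Structure of the thesis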
Chapter 2: Background and Theoretical Foundations of the Research
- 2-1 Introduction
- 2-2 Translation quality
- 2-3 Methods of machine translation quality evaluation
  - 2-3-1 Human evaluation of machine translation quality
    - 2-3-1-1 Direct human evaluation criteria
      - Intelligibility and fidelity
      - Adequacy and fluency
      - Readability and comprehensibility
      - Ranking
    - 2-3-1-2 Indirect human evaluation criteria
      - Acceptance
      - Post-editing
      - Cloze test method
      - Clarity of meaning
      - Error classification and analysis
    - 2-3-1-3 Advantages and disadvantages of human evaluation methods
  - 2-3-2 Automatic evaluation of machine translation quality
    - 2-3-2-1 Examples of methods based on lexical similarity
    - 2-3-2-2 Examples of methods based on edit distance
  - 2-3-3 State-of-the-art advances in methods based on semantic similarity
    - 2-3-3-1 RUSE
    - 2-3-3-2 SWSS
    - 2-3-3-3 YiSi
  - 2-3-4 Features and challenges of reference-based automatic evaluation methods
- 2-4 Abstract Meaning Representation
  - 2-4-1 Comparing Abstract Meaning Representation graphs
  - 2-4-2 Abstract Meaning Representation in cross-lingual applications
- 2-5 Summary
Chapter 3: Research Methodology
- 3-1 Introduction
- 3-2 Automatic translation evaluation method using Abstract Meaning Representation
- 3-3 Testing the proposed method
  - 3-3-1 Specifications of the tools and test environment
  - 3-3-2 Sample used in the experiment
  - 3-3-3 Direct human assessment of the samples (gold data)
  - 3-3-4 Baseline methods
- 3-4 Evaluation criteria
Chapter 4: Evaluation Results of the Proposed Methods
- 4-1 Introduction
- 4-2 Comparison of the proposed methods with human evaluation at the system level
- 4-3 Comparison of the proposed methods with human evaluation at the sentence level
- 4-4 Comparison of the proposed methods with the baseline methods at the system level
- 4-5 Comparison of the proposed methods with the baseline methods at the sentence level
- 4-6 Comparison of the proposed methods with the baseline methods at different levels of translation quality
- 4-7 Discussion of the research results
- 4-8 Summary
Chapter 5: Conclusion and Suggestions for Future Work
- 5-1 Review of the research findings
- 5-2 Limitations of the research
- 5-3 Innovations and significance of the findings
- 5-4 Suggestions for future research
References
Glossary