A pool-based active learning method for improving Farsi-English machine translation system
Bakhshaei, Somayeh ; Sharif University of Technology
- Type of Document: Article
- DOI: 10.1109/ISTEL.2012.6483099
- Notes: IST 2012
- Notes: CD no.: 1097
- Abstract:
- In this paper we try to alleviate the problem of scarce resources for developing a Farsi-English Statistical Machine Translation (SMT) system. We do this by applying the Active Learning (AL) idea to choose the more informative sentences, which are then translated by a human and added to the baseline corpus. Human translations are more valuable than corpora gathered by other approaches (such as automatic ones), but they are also more costly. Selecting only the most informative sentences therefore lets us improve the translation system at lower cost. This is done in interaction with a human translator. Applying the active learning idea to an SMT system turns it into a system that can improve its baseline corpus by asking for the essential data that leads directly to system improvement. In addition, combining AL with SMT is a way of using source-side monolingual resources to improve SMT systems, something that is ignored in the original theory of SMT. Our results for the Farsi-English system show an improvement compared to random sentence selection.
- Keywords:
- Component ; Scarce resources ; Active learning ; Farsi-English SMT ; Persian language
- Source: 2012 6th International Symposium on Telecommunications, IST 2012 ; 978-146732073-3
- URL: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6483099