Computer Applications and Software (计算机应用与软件), 2011, Vol. 28, Issue 4: 57-59, 90, 4.
Research on Statistical Machine Translation-Based Chinese-Uyghur Word Alignment
Abstract
This paper describes a Chinese-Uyghur word alignment system based on statistical machine translation. The system has two processing modules: pre-processing and word alignment. Pre-processing covers both the Chinese text and the Uyghur text. For the Uyghur text, the script is first transliterated into Latin-Uyghur, and the exceptional characters in the Latin-Uyghur text are then replaced with unambiguous characters. Word alignment is implemented in two steps: IBM Models 1-3 are applied first, and the result is then optimised with the heuristic method proposed by Och, yielding a Chinese-Uyghur word alignment system based on statistical machine translation. Experimental results show that the system is feasible.
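The word alignment step can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only IBM Model 1 (not Models 2-3), and it stands in for Och's heuristic with plain intersection of the two alignment directions, which is only the usual starting point of that family of heuristics. The function names `train_ibm1`, `align` and `intersect`, the `<NULL>` token, and the German-English toy corpus are all assumptions made for illustration; no Chinese-Uyghur data from the paper is used.

```python
from collections import defaultdict

NULL = "<NULL>"  # hypothetical empty source word that target words may align to


def train_ibm1(bitext, iterations=20):
    """EM training of IBM Model 1; returns t[(tgt_word, src_word)] = P(tgt_word | src_word)."""
    tgt_vocab = {w for _, tgt in bitext for w in tgt}
    uniform = 1.0 / len(tgt_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialisation
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of each (tgt, src) link
        total = defaultdict(float)  # expected count of each src word
        for src, tgt in bitext:
            src_null = [NULL] + src
            for f in tgt:
                # E-step: share one unit of count for f among all source words
                z = sum(t[(f, e)] for e in src_null)
                for e in src_null:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalise expected counts per source word
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


def align(src, tgt, t):
    """Link each target word to its most probable source word; returns {(src_idx, tgt_idx)}."""
    links = set()
    src_null = [NULL] + src
    for j, f in enumerate(tgt):
        best = max(range(len(src_null)), key=lambda i: t.get((f, src_null[i]), 0.0))
        if best > 0:  # position 0 is NULL, so the word stays unaligned
            links.add((best - 1, j))
    return links


def intersect(forward, backward):
    """Keep only links found in both directions (backward links are (tgt_idx, src_idx))."""
    return forward & {(i, j) for (j, i) in backward}


# Toy corpus for illustration only (not from the paper)
bitext = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t_fwd = train_ibm1(bitext)
t_bwd = train_ibm1([(tgt, src) for src, tgt in bitext])
src, tgt = bitext[0]
links = intersect(align(src, tgt, t_fwd), align(tgt, src, t_bwd))
print(sorted(links))
```

Training in both directions and intersecting gives high-precision links; heuristics of the kind the abstract attributes to Och then typically add further links drawn from the union of the two directions. Models 2 and 3 would add position and fertility parameters on top of the lexical table trained here.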
Key words
Word alignment / IBM Model 1-3 / Heuristic optimisation
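Cite this article
Liu Jianming (刘建明), Turgun Ibrahim (吐尔根·依不拉音), Aishan Wumaier (艾山·吾买尔). Research on statistical machine translation-based Chinese-Uyghur word alignment [J]. Computer Applications and Software, 2011, 28(4): 57-59, 90, 4.
Funding
National Natural Science Foundation of China (60663006)
State Language Commission research project (MZ115-75)
Liu Jianming, master's student; main research areas: computer applications and information processing.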