Survey of Uyghur Machine Translation Research (维吾尔语机器翻译研究综述)
As one of the important tasks in China's low-resource machine translation research, Uyghur machine translation can help promote cultural exchange and trade among different regions and ethnic groups. However, Uyghur is an agglutinative language, and machine translation for it faces problems such as complex morphology and scarce corpora. In recent years, at different stages of the field's development, researchers have continually optimized and innovated algorithms and models to address these characteristics and have achieved a number of research results, but a systematic review has been lacking. This paper comprehensively reviews research on Uyghur machine translation and divides it into three types according to the method used: rule- and example-based, statistics-based, and neural network-based Uyghur machine translation. Related academic activities and corpus resources are also summarized. To further explore the potential of Uyghur machine translation, the ChatGPT model is applied to the Uyghur-Chinese machine translation task in a preliminary study. The experimental results show that in the few-shot setting, translation performance first rises and then falls as the number of examples increases, with the best performance at 10-shot. In addition, the chain-of-thought method does not show better translation ability on the Uyghur machine translation task. Finally, future research directions for Uyghur machine translation are discussed.
Halidanmu Abudukelimu (哈里旦木·阿布都克里木); Hou Yutao (侯钰涛); Yao Dengfeng (姚登峰); Abudukelimu Abulizi (阿布都克力木·阿布力孜); Chen Jishang (陈吉尚)
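School of Information Management, Xinjiang University of Finance and Economics, Urumqi, Xinjiang 830012; Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101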
Computers and Automation
Uyghur; rule- and example-based machine translation; statistical machine translation; Neural Machine Translation (NMT); Large Language Model (LLM)
Computer Engineering (《计算机工程》), 2024, No. 1, pp. 1-16 (16 pages)
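National Natural Science Foundation of China (61966033, 62366050); National Social Science Fund of China (21BYY106); General Project of the State Language Commission (YB145-25).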