Computer Technology and Development (计算机技术与发展), 2025, Vol. 35, Issue 11: 138-144, 7. DOI: 10.20165/j.cnki.ISSN1673-629X.2025.0169

Research on Tibetan-English Neural Machine Translation for Low-Resource Scenarios

Zhang Jialiang (张佳亮) 1, Qun Nuo (群诺) 1, Zhaxi Pingcuo (扎西平措) 1, Xian Yukai (鲜昱恺) 2, Li Jiajun (李嘉俊) 1

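Author information

  • 1. School of Information Science and Technology, Tibet University, Lhasa, Tibet 850000; State Key Laboratory of Tibetan Intelligence, Lhasa, Tibet 850000; Tibet Informatization Collaborative Innovation Center (jointly established by the province and the ministry), Lhasa, Tibet 850000
  • 2. School of Information Science and Technology, Tibet University, Lhasa, Tibet 850000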

Abstract

With the development of deep learning, and especially the emergence of large language models (LLMs), Tibetan-English machine translation has made significant progress. However, the severe shortage of Tibetan-English parallel corpora has seriously constrained research in this field. To address this problem, we propose a Tibetan-English neural translation framework suited to low-resource scenarios. Specifically, we first use a pivot-language-based pseudo-parallel data generation method to construct a Tibetan-English parallel corpus of 400,000 entries. In addition, to address unstable training, insensitivity to low-frequency words, and poor translation quality in low-resource scenarios, we improve the model in three respects: pre-normalized residual connections, fixed word embeddings, and normalization of the attention mechanism. Together, these changes improve the model's performance on the low-resource Tibetan-English machine translation task. Experimental results show that, compared with the conventional Transformer model, the proposed method improves BLEU by 1.68 and 2.25 points on the validation set and the test set, respectively.
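To make the first of the three model changes concrete, the sketch below contrasts the post-normalized residual ordering of the original Transformer with the generic pre-normalized ordering. The abstract gives no implementation details, so this is only the textbook pre-LN form and may differ from the authors' variant; `toy_sublayer` is a hypothetical stand-in for an attention or feed-forward sublayer, and the layer normalization omits learned scale and shift parameters.

```python
import math
from typing import Callable, List

Vector = List[float]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def post_norm_block(x: Vector, sublayer: Callable[[Vector], Vector]) -> Vector:
    """Original Transformer ordering: LayerNorm(x + sublayer(x))."""
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])


def pre_norm_block(x: Vector, sublayer: Callable[[Vector], Vector]) -> Vector:
    """Pre-normalized ordering: x + sublayer(LayerNorm(x))."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]


def toy_sublayer(h: Vector) -> Vector:
    """Hypothetical sublayer (assumption for illustration): halves its input."""
    return [0.5 * v for v in h]


x = [1.0, 2.0, 3.0, 4.0]
pre = pre_norm_block(x, toy_sublayer)    # residual path carries x unchanged
post = post_norm_block(x, toy_sublayer)  # residual sum is itself normalized
```

In the pre-normalized form the input reaches the output through an unnormalized identity path, which is the property usually credited with more stable training of deep Transformers; whether that is the mechanism behind the gains reported here is not stated in the abstract.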

Keywords

low-resource/Tibetan-English machine translation/normalization/corpus construction/model optimization

Classification

Computer Science and Automation

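Cite this article

Zhang Jialiang, Qun Nuo, Zhaxi Pingcuo, Xian Yukai, Li Jiajun. Research on Tibetan-English Neural Machine Translation for Low-Resource Scenarios [J]. Computer Technology and Development, 2025, 35(11): 138-144, 7.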

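Funding

National Science and Technology Major Project for New Generation Artificial Intelligence (2022ZD0116100)

National Natural Science Foundation of China, Young Scientists Fund (62406257, 62406256)

Tibet Autonomous Region Science and Technology Program, Technology Innovation Guidance Project (XZ202501JX0004)

Tibet University Graduate "High-Level Talent Training Program" Project (2025-GSP-S137)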

Computer Technology and Development

ISSN 1673-629X
