Journal of Zhengzhou University (Engineering Science), 2025, Vol. 46, Issue 6: 8-14, 7. DOI: 10.13705/j.issn.1671-6833.2025.03.008
Unsupervised Learning Multilingual Neural Machine Translation Based on Pseudo-Siamese Network
Abstract
When unsupervised neural machine translation is trained on monolingual data, it inevitably introduces a large amount of noise. The model's errors accumulate over training iterations and degrade translation quality. To address this problem, this study proposes an unsupervised neural machine translation method based on a pseudo-Siamese network, built on the cross-lingual pre-training model (XLM). The model encoder is divided into two modules. The pseudo-Siamese network module introduces a noise filtering gate mechanism that filters noisy features during encoding, so that the model can better learn the mapping between the source language and the target language. Experimental results show that, on translation tasks in both directions among English, German, French, and Romanian, the proposed method improves on the baseline system by an average of 3.5 percentage points, demonstrating better translation quality. Ablation experiments verify the effectiveness of each component of the model. In addition, the method was tested under different simulated noise conditions on the German-English translation task, where it also showed good noise resistance.
Key words
unsupervised machine translation / pseudo-Siamese network / monolingual data / noise filtering gate mechanism / cross-lingual pre-training model
Classification
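Information Technology and Security Science
Cite this article
都力铭, 屈丹, 张传财, 席阳丽. Unsupervised Learning Multilingual Neural Machine Translation Based on Pseudo-Siamese Network [J]. Journal of Zhengzhou University (Engineering Science), 2025, 46(6): 8-14, 7.
Funding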
National Natural Science Foundation of China (62171470)
Henan Province Zhongyuan Science and Technology Innovation Leading Talent Project (234200510019)
Natural Science Foundation of Henan Province (232300421240)