Bilingual dictionary based unsupervised neural machine translation method for distant language pairs

Huang Mengqin (黄孟钦) 1

Modern Electronics Technique (现代电子技术), 2024, Vol. 47, Issue 7: 161-164, 4. DOI: 10.16652/j.issn.1004-373x.2024.07.028

Author information

  • 1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China

Abstract

Unsupervised methods, which strive to alleviate the impact of the scarcity of large parallel corpora on machine translation quality, have attracted much attention in neural machine translation. However, their translation performance on distant language pairs still needs improvement. Therefore, the translation language model (TLM) is introduced and the Dict-TLM method is proposed. The core idea of this method is to train language models by combining monolingual corpora with unsupervised bilingual dictionaries. Specifically, the model first takes a source-language sentence as input. Then, unlike the traditional TLM, which accepts only parallel corpora, the Dict-TLM model also takes as input a copy of the source-language sentence processed with an unsupervised bilingual dictionary. In this second input, the words of the source sentence that appear in the bilingual dictionary are replaced with their corresponding target-language translations. Importantly, the bilingual dictionary is obtained in an unsupervised manner. Experiments show that Dict-TLM improves the BLEU score by 3% over traditional unsupervised machine translation on the Chinese-English language pair.
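
The input construction described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the `</s>` separator token, the segment ids, and the toy dictionary are all hypothetical. The abstract states only that source-sentence words found in the bilingual dictionary are replaced with target-language translations and that the result is fed to the model together with the original sentence.

```python
from typing import Dict, List, Tuple

# Hypothetical separator token placed between the two segments (an assumption).
SEP = "</s>"


def dict_replace(tokens: List[str], bilingual_dict: Dict[str, str]) -> List[str]:
    """Replace every token that has a dictionary entry with its translation.

    Tokens without an entry are kept unchanged, so the output has the
    same length as the input.
    """
    return [bilingual_dict.get(tok, tok) for tok in tokens]


def build_dict_tlm_input(
    tokens: List[str],
    bilingual_dict: Dict[str, str],
    sep: str = SEP,
) -> Tuple[List[str], List[int]]:
    """Concatenate a source sentence with its dictionary-replaced copy.

    Returns the token stream and a parallel list of segment ids
    (0 for the original sentence and the separator, 1 for the replaced copy).
    The segment ids are an illustrative assumption.
    """
    replaced = dict_replace(tokens, bilingual_dict)
    stream = tokens + [sep] + replaced
    segment_ids = [0] * (len(tokens) + 1) + [1] * len(replaced)
    return stream, segment_ids


if __name__ == "__main__":
    # Toy Chinese-English dictionary; a real one would be induced without supervision.
    toy_dict = {"喜欢": "like", "猫": "cats"}
    stream, segment_ids = build_dict_tlm_input(["我", "喜欢", "猫"], toy_dict)
    print(stream)
    print(segment_ids)
```

In this sketch the second segment plays the role that the target-language sentence plays in a conventional TLM input, which is why no parallel corpus is needed to build it. Masking and model training are outside the scope of the example.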

Key words

unsupervised neural machine translation/distant language pairs/pre-training/TLM/bilingual dictionary/bilingual word embedding

Classification

Information technology and security science

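Cite this article

Huang Mengqin. Bilingual dictionary based unsupervised neural machine translation method for distant language pairs[J]. Modern Electronics Technique, 2024, 47(7): 161-164, 4.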

Modern Electronics Technique (现代电子技术)

Open access; Peking University Core Journal (北大核心); CSTPCD

ISSN 1004-373X
