Low-Resource Machine Translation Based on Training Strategy with Changing Gradient Weight

Wang Jiaqi (王家琪), Zhu Junguo (朱俊国), Yu Zhengtao (余正涛)

Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2024, Vol. 18, Issue 3: 731-739 (9 pages). DOI: 10.3778/j.issn.1673-9418.2211078

Author information

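  • 1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500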

Abstract

In recent years, neural network models such as Transformer have achieved significant success in machine translation. However, training these models relies on rich labeled data, which poses a challenge for low-resource machine translation because parallel corpora are limited in scale. This limitation often leads to subpar performance and a susceptibility to overfitting on high-frequency vocabulary, thereby reducing the model's generalization ability on the test set. To alleviate these issues, this paper proposes a strategy of gradient weight modification. Specifically, it suggests multiplying the gradients generated for each new batch by a coefficient on top of the Adam algorithm. This coefficient incrementally increases, aiming to weaken the model's dependence on high-frequency features during early training while maintaining the rapid convergence advantage of the algorithm in the later stages. This paper also outlines the modified training process, including adjustments and decay of coefficients, to emphasize different aspects at different training stages. The goal of this strategy is to enhance attention to low-frequency vocabulary and prevent the model from overfitting to high-frequency terms. Experimental translation tasks are conducted on three low-resource bilingual datasets, and the proposed method demonstrates improvements of 0.72, 1.37, and 1.04 BLEU points over the baseline model on the respective test sets.
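
The core idea in the abstract, scaling each new batch gradient by a growing coefficient before the Adam update, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the coefficient schedule, so the linear ramp in `gradient_weight`, its `warmup_steps`, `start`, and `end` parameters, and the helper names `weighted_adam_step` and `run_demo` are all assumptions made for illustration. The Adam moment updates themselves follow the standard algorithm.

```python
import math


def gradient_weight(step, warmup_steps=1000, start=0.1, end=1.0):
    """Hypothetical linear schedule: the coefficient grows from `start` to `end`.

    The paper's actual schedule is not stated in the abstract; this is assumed.
    """
    frac = min(step / warmup_steps, 1.0)
    return start + (end - start) * frac


def weighted_adam_step(params, grads, m, v, step,
                       lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on lists of floats (step counts from 1).

    The only change from plain Adam is that the new batch gradient is
    multiplied by the coefficient before it enters the moment estimates.
    """
    c = gradient_weight(step)
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        g = c * g                                  # down-weight the new gradient
        m_i = beta1 * m_i + (1 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m_i / (1 - beta1 ** step)          # bias correction
        v_hat = v_i / (1 - beta2 ** step)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


def run_demo(num_steps=50):
    """Toy check: take a few steps on f(x) = (x - 3)^2 starting from x = 0."""
    params, m, v = [0.0], [0.0], [0.0]
    for step in range(1, num_steps + 1):
        grads = [2.0 * (params[0] - 3.0)]
        params, m, v = weighted_adam_step(params, grads, m, v, step)
    return params[0]
```

Because Adam normalizes by the second-moment estimate, a constant coefficient would leave the update almost unchanged. In this sketch the coefficient matters because it varies over time, which changes how much each new batch contributes to the running moments relative to earlier batches.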

Key words

neural machine translation/overfitting/dynamic gradient weight

Classification

Information technology and security science

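Cite this article

Wang Jiaqi, Zhu Junguo, Yu Zhengtao. Low-Resource Machine Translation Based on Training Strategy with Changing Gradient Weight[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(3): 731-739.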

Funding

This work was supported by the National Natural Science Foundation of China (62166022, 61732005), the General Program of Yunnan Provincial Department of Science and Technology (202101AT070077), and the Fostering Talents Program of Yunnan Province (KKSY201903018).

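Journal of Frontiers of Computer Science and Technology (计算机科学与探索)

OA; Peking University Core Journal (北大核心); CSTPCD

ISSN 1673-9418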
