Research on Low-Resource Cross-Lingual Summarization Method Based on Multi-Strategy Reinforcement Learning

Feng Xiongbo (冯雄波), Huang Yuxin (黄于欣), Lai Hua (赖华), Gao Yumeng (高玉梦)

Computer Engineering (计算机工程), 2024, Vol. 50, Issue (2): 68-77, 10. DOI: 10.19678/j.issn.1000-3428.0067225

Feng Xiongbo¹, Huang Yuxin¹, Lai Hua¹, Gao Yumeng¹

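Author information

  • 1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650504 || Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan 650504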

Abstract

Cross-Lingual Summarization (CLS) aims to generate a summary in a target language (such as Chinese) from a document in a source language (such as Vietnamese). End-to-end CLS models perform well when trained on large-scale, high-quality labeled data, which are usually constructed by machine-translating monolingual summarization corpora into CLS corpora. However, the limited performance of translation models for low-resource languages introduces noise into the CLS corpus, which degrades the performance of the CLS model. This paper proposes a low-resource CLS method based on multi-strategy reinforcement learning to address the training of CLS models on noisy, low-resource training data. Source language summaries are introduced as additional supervisory signals to alleviate the impact of noisy translated target summaries. To compute the reinforcement rewards, the correlation and the degree of missing words between the source language summary and the generated target language summary are calculated, and the CLS model is then optimized under the joint constraints of the cross-entropy loss and the reinforcement rewards. To verify the performance of the proposed model, a noisy Chinese-Vietnamese CLS corpus is constructed. Experimental results on the Chinese-Vietnamese and Vietnamese-Chinese CLS datasets show that the proposed model achieves significantly better ROUGE scores than the NCLS baseline model, improving ROUGE-1 by 0.71 and 0.84, respectively, which effectively weakens noise interference and improves the quality of the generated summaries.
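
The abstract does not give the reward formulas, so the following is only a minimal illustrative sketch of how a reward built from word correlation and missing words could be combined with a cross-entropy loss. Everything in it is an assumption made for illustration rather than the authors' implementation: the bilingual `lexicon`, the functions `relevance`, `missing_degree`, `reward`, and `mixed_loss`, the weights `alpha` and `gamma`, and the use of a baseline-subtracted policy-gradient term.

```python
from typing import Dict, List, Set

# Hypothetical bilingual lexicon: source word -> set of target-language translations.
Lexicon = Dict[str, Set[str]]


def relevance(src_summary: List[str], generated: List[str], lexicon: Lexicon) -> float:
    """Share of generated tokens that translate some word of the source summary."""
    if not generated:
        return 0.0
    allowed: Set[str] = set()
    for word in src_summary:
        allowed |= lexicon.get(word, set())
    return sum(tok in allowed for tok in generated) / len(generated)


def missing_degree(src_summary: List[str], generated: List[str], lexicon: Lexicon) -> float:
    """Share of source-summary words with no translation in the generated summary."""
    if not src_summary:
        return 0.0
    produced = set(generated)
    missing = sum(1 for word in src_summary if not (lexicon.get(word, set()) & produced))
    return missing / len(src_summary)


def reward(src_summary: List[str], generated: List[str], lexicon: Lexicon,
           alpha: float = 0.5) -> float:
    """Assumed combination: reward relevance, penalize missing words."""
    rel = relevance(src_summary, generated, lexicon)
    miss = missing_degree(src_summary, generated, lexicon)
    return alpha * rel + (1.0 - alpha) * (1.0 - miss)


def mixed_loss(ce_loss: float, sampled_log_prob: float, sampled_reward: float,
               baseline_reward: float, gamma: float = 0.5) -> float:
    """Assumed objective: weighted sum of a policy-gradient term and cross entropy."""
    rl_loss = -(sampled_reward - baseline_reward) * sampled_log_prob
    return gamma * rl_loss + (1.0 - gamma) * ce_loss


# Toy Vietnamese -> Chinese example (hypothetical data).
lexicon: Lexicon = {"mèo": {"猫"}, "ngủ": {"睡觉", "睡"}}
src = ["mèo", "ngủ"]
gen = ["猫", "跑"]
r = reward(src, gen, lexicon)
loss = mixed_loss(ce_loss=2.0, sampled_log_prob=-1.0,
                  sampled_reward=r, baseline_reward=0.25)
```

In this toy case only one of the two source-summary words is covered by the generated summary, so both the relevance and the missing degree are 0.5. The point of the sketch is that the reward depends on the source language summary rather than on the possibly noisy translated reference.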

Key words

Chinese-Vietnamese Cross-Lingual Summarization (CLS) / low-resource / noisy data / noise analysis / multi-strategy reinforcement learning

Classification

Information Technology and Security Science

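Cite this article

Feng Xiongbo, Huang Yuxin, Lai Hua, Gao Yumeng. Research on Low-Resource Cross-Lingual Summarization Method Based on Multi-Strategy Reinforcement Learning [J]. Computer Engineering, 2024, 50(2): 68-77, 10.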

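Funding

National Natural Science Foundation of China (U21B2027)

Yunnan Provincial Major Science and Technology Special Project (202202AD080003)

Yunnan Fundamental Research Program, General Project (202201AT070915, 202201AT070768)

Kunming University of Science and Technology "Double First-Class" Joint Special Project (202201BE070001-021)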

Computer Engineering

OA / Peking University Core Journal / CSTPCD

ISSN 1000-3428
