
Cross-Lingual Summarization Method Based on Joint Training and Self-Training in Low-Resource Scenarios

程绍欢 1, 唐煜佳 1, 刘峤 1, 陈文宇 1

Journal of University of Electronic Science and Technology of China (电子科技大学学报), 2024, Vol. 53, Issue 5: 762-770 (9 pages). DOI: 10.12178/1001-0548.2024173

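Author information

  • 1. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China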

Abstract

As globalization advances, cross-lingual summarization has become an important topic in natural language processing. In low-resource scenarios, existing methods suffer from limited representation transfer and insufficient data utilization. To address these issues, this paper proposes a method based on joint training and self-training. Specifically, two models handle the translation and cross-lingual summarization tasks, respectively; this unifies the language vector space of the outputs and avoids the problem of limited representation transfer. Joint training is performed by aligning the output features and probabilities of parallel training pairs, which enhances semantic sharing between the models. Building on joint training, a self-training technique generates synthetic data from additional monolingual summary data, mitigating data scarcity in low-resource scenarios. Experimental results show that the method outperforms existing approaches in multiple low-resource scenarios, with significant improvements in ROUGE scores.
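
The abstract does not give the loss functions, so the following is only a minimal sketch of how the two ideas it names could fit together. It assumes that, for a parallel training pair, the translation model and the cross-lingual summarization model are both decoded against the same target-language summary, so their per-token features and logits have matching shapes. The mean-squared-error feature term, the KL-divergence probability term, the weights `alpha` and `beta`, and the helper names (`joint_loss`, `make_synthetic_pairs`, `translate`) are all illustrative assumptions, not the authors' formulation.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one vocabulary distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def feature_alignment_loss(feat_a: Sequence[Sequence[float]],
                           feat_b: Sequence[Sequence[float]]) -> float:
    """Mean squared error between two equally shaped feature sequences.

    Assumed form: the source only says output features are aligned.
    """
    sq = [(x - y) ** 2
          for va, vb in zip(feat_a, feat_b)
          for x, y in zip(va, vb)]
    return sum(sq) / len(sq)


def probability_alignment_loss(logits_a: Sequence[Sequence[float]],
                               logits_b: Sequence[Sequence[float]]) -> float:
    """Mean per-token KL(p_a || p_b) between the two output distributions.

    Assumed form: the source only says output probabilities are aligned.
    """
    total = 0.0
    for la, lb in zip(logits_a, logits_b):
        p, q = softmax(la), softmax(lb)
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return total / len(logits_a)


def joint_loss(mt_loss: float, cls_loss: float,
               feat_mt, feat_cls, logits_mt, logits_cls,
               alpha: float = 1.0, beta: float = 1.0) -> float:
    """Task losses plus weighted alignment terms (hypothetical weighting)."""
    return (mt_loss + cls_loss
            + alpha * feature_alignment_loss(feat_mt, feat_cls)
            + beta * probability_alignment_loss(logits_mt, logits_cls))


def make_synthetic_pairs(monolingual: Sequence[Tuple[str, str]],
                         translate: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Self-training step: turn (document, same-language summary) pairs into
    (document, pseudo target-language summary) pairs.

    `translate` stands in for the jointly trained translation model.
    """
    return [(doc, translate(summary)) for doc, summary in monolingual]
```

In this sketch the alignment terms vanish when the two models agree exactly on a parallel pair, so they act purely as a consistency penalty on top of the ordinary task losses; the synthetic pairs would then be added to the summarization training set for a further round of training.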

Keywords

cross-lingual summarization / joint training / low-resource scenarios / machine translation / self-training

Classification

Information Technology and Security Science

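Cite this article

程绍欢, 唐煜佳, 刘峤, 陈文宇. Cross-Lingual Summarization Method Based on Joint Training and Self-Training in Low-Resource Scenarios [J]. Journal of University of Electronic Science and Technology of China, 2024, 53(5): 762-770.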

Funding

National Natural Science Foundation of China, Enterprise Joint Fund Key Project (U22B2061)

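Journal of University of Electronic Science and Technology of China (电子科技大学学报)

Open access; indexed in 北大核心 (Peking University Core Journals) and CSTPCD

ISSN 1001-0548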
