
Cross-Lingual Summarization Technology Based on Multi-stage Training

Pan Hangyu (潘航宇), Xi Yaoyi (席耀一), Zhou Huijuan (周会娟), Chen Gang (陈刚), Guo Zhigang (郭志刚)

Journal of Information Engineering University, 2024, Vol. 25, Issue 2: 139-147 (9 pages). DOI: 10.3969/j.issn.1671-0673.2024.02.003


Author information

  • 1. Information Engineering University, Zhengzhou, Henan 450001

Abstract

To address the weaknesses of cross-lingual summarization (CLS) models in semantic understanding, cross-lingual alignment, and text generation, this paper proposes a CLS model based on multi-stage training. First, the model is trained on a multilingual denoising pre-training task, through which it learns general language knowledge of Chinese and English. Then, the model is trained on a multilingual machine translation task, through which it simultaneously learns three abilities: semantic understanding of English, cross-lingual alignment from English to Chinese, and text generation in Chinese. Finally, the model is trained on the CLS task itself, which further strengthens these three abilities and yields a strong English-to-Chinese CLS model. The experimental results show that the proposed model significantly improves CLS performance, and that both the multilingual denoising pre-training task and the multilingual machine translation task contribute to this improvement. On an English-to-Chinese CLS benchmark dataset, compared with the best result among the baseline models, the proposed model increases ROUGE-1, ROUGE-2, and ROUGE-L by 45.70%, 60.53%, and 43.57%, respectively.
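The three-stage schedule described in the abstract can be sketched as follows. This is only an illustrative outline, not the authors' implementation: the names `Stage`, `mask_span`, `denoising_pair`, and `run_stages` are hypothetical, the span-masking noise is an assumption (the abstract does not specify the noising scheme), and `train_step` stands in for whatever sequence-to-sequence update a real model would perform.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (source text, target text)


@dataclass
class Stage:
    """One training stage: a name, its (source, target) pairs, and an epoch count."""
    name: str
    examples: List[Example]
    epochs: int = 1


def mask_span(text: str, start: int, length: int, mask: str = "<mask>") -> str:
    """Replace `length` whitespace-separated tokens from `start` with one mask token.

    Assumed noise function for illustration; the paper may use a different one.
    """
    tokens = text.split()
    return " ".join(tokens[:start] + [mask] + tokens[start + length:])


def denoising_pair(text: str, start: int, length: int) -> Example:
    """Build a (corrupted input, original text) pair for a denoising objective."""
    return mask_span(text, start, length), text


def run_stages(stages: List[Stage], train_step: Callable[[str, str], None]) -> List[str]:
    """Run the stages strictly in order, so later stages start from earlier weights."""
    completed = []
    for stage in stages:
        for _ in range(stage.epochs):
            for source, target in stage.examples:
                train_step(source, target)
        completed.append(stage.name)
    return completed


# Toy data: one example per stage, in the order given in the abstract.
stages = [
    Stage("denoising", [denoising_pair("the cat sat on the mat", 1, 2)]),
    Stage("translation", [("The cat sat on the mat.", "猫坐在垫子上。")]),
    Stage("cls", [("A long English news article ...", "一句中文摘要")]),
]

seen: List[Example] = []
order = run_stages(stages, lambda source, target: seen.append((source, target)))
```

Here the placeholder `train_step` only records the pairs it receives. In a real setup it would update a single shared encoder-decoder model, so that the translation and CLS stages build on what the earlier stages learned.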

Key words

cross-lingual summarization/multi-stage training/multilingual denoising pre-training/multilingual machine translation

Classification

Information Technology and Security Science

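Cite this article

Pan Hangyu, Xi Yaoyi, Zhou Huijuan, Chen Gang, Guo Zhigang. Cross-Lingual Summarization Technology Based on Multi-stage Training[J]. Journal of Information Engineering University, 2024, 25(2): 139-147.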

Funding

National Social Science Fund of China (19CXW027)

Journal of Information Engineering University

ISSN 1671-0673
