Computer Engineering & Science (计算机工程与科学), 2025, Vol. 47, Issue 5: 931-939 (9 pages). DOI: 10.3969/j.issn.1007-130X.2025.05.017
低资源场景下的汉语—传统蒙古语跨语言摘要方法研究
Research on Chinese—traditional Mongolian cross-lingual summarization methods in low-resource scenarios
Abstract
Cross-lingual summarization aims to generate a summary in a target language (such as traditional Mongolian) given a source document in another language (such as Chinese). Typically, traditional multi-task frameworks employ sequence-to-sequence networks with multiple decoders, each dedicated to a specific task. However, when a document is translated from one language into another, such structures cannot effectively capture the relationships and differences between the two languages, because the languages differ in their morphological and structural characteristics. This is particularly evident for traditional Mongolian, whose complex morphological changes and diverse word-formation patterns make it challenging to learn and process language features under low-resource conditions. To address this challenge, we propose a cross-lingual summarization model that embeds consistency learning into a multi-task framework. Consistency is modeled by computing a distance metric over the difference between the probability distributions of the source-language summary and the generated target-language summary. The cross-lingual summarization model is then optimized under the joint constraints of a cross-entropy loss and a consistency loss. Furthermore, we built a Chinese-Mongolian cross-lingual summarization dataset. The competitive ROUGE scores obtained on this dataset demonstrate the effectiveness of the proposed model under low-resource conditions.
Keywords
Chinese-Mongolian cross-lingual summarization / consistency learning / low-resource
Classification
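Information technology and security science
Cite this article
BAN Qi (班琪), YUN Jing (云静), DENG Lei (邓磊). Research on Chinese-traditional Mongolian cross-lingual summarization methods in low-resource scenarios [J]. Computer Engineering & Science, 2025, 47(5): 931-939.
Funding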
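National Natural Science Foundation of China (62062055)
Program for Young Talents of Science and Technology in Universities of Inner Mongolia Autonomous Region (NJYT24061)
Fundamental Research Funds for Universities Directly under the Inner Mongolia Autonomous Region (JY20220249)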
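Illustrative sketch

The abstract describes optimizing the model under a cross-entropy loss plus a consistency loss that measures the distance between two probability distributions. The sketch below shows one way such a combined objective could look for a single decoding step. It is not the authors' implementation. The abstract does not name the distance metric, so the Jensen-Shannon divergence is used here as an assumption. The function names, the `weight` parameter, and the premise that both decoders score the same shared output space (for example, a joint subword vocabulary) are also hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index, eps=1e-12):
    """Negative log-probability assigned to the reference token."""
    return -math.log(probs[target_index] + eps)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two distributions over the same support."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def js_divergence(p, q):
    """Jensen-Shannon divergence: a symmetric, bounded distance-like measure.

    Assumption: the paper's actual metric is not stated in the abstract.
    """
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mid) + 0.5 * kl_divergence(q, mid)


def total_loss(src_logits, tgt_logits, src_ref, tgt_ref, weight=1.0):
    """Cross-entropy for both summaries plus a weighted consistency term.

    Assumption: src_logits and tgt_logits have the same length because both
    decoders are taken to share one output space.
    """
    p_src = softmax(src_logits)
    p_tgt = softmax(tgt_logits)
    ce = cross_entropy(p_src, src_ref) + cross_entropy(p_tgt, tgt_ref)
    consistency = js_divergence(p_src, p_tgt)
    return ce + weight * consistency
```

In this sketch the consistency term vanishes when the two decoders produce identical distributions and grows as they diverge, so a larger `weight` pushes the target-language decoder toward agreeing with the source-language one.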