Computer Applications and Software, 2025, Vol. 42, Issue 5: 108-115, 129. DOI: 10.3969/j.issn.1000-386x.2025.05.016
MODAL JOINT REPRESENTATION LEARNING BASED ON VARIATIONAL DISTILLATION
Abstract
In recent years, deep learning for multimodal interaction has attracted extensive research attention, in which multimodal pre-training models play an indispensable role. However, experiments show that most of these large models are poorly suited to single-modality scenarios, require large amounts of aligned multimodal training corpora that are difficult to obtain, and have too many parameters to deploy easily. Therefore, this paper proposes MJBERT, a lightweight modal joint encoder that requires no aligned multimodal corpora and focuses on single-modality scenarios. To train MJBERT, a distillation method named MJ-KD was designed: the pre-trained models BERT-large and ResNet-152 served as teacher models, and their knowledge was transferred to MJBERT through MJ-KD. Experimental results show that MJBERT performs on par with or better than the benchmark models on multiple tasks in both image and text single-modality scenarios.
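The abstract names MJ-KD but does not detail it here. As general background, classical knowledge distillation trains a student to match the teacher's temperature-softened output distribution via a KL-divergence term. The NumPy sketch below illustrates that generic objective only; it is not the paper's MJ-KD, whose variational mutual-information formulation is given in the full text, and the function names are illustrative.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T yields a softer distribution.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) on temperature-softened outputs,
    # scaled by T^2 as in the classical formulation.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = (p * (np.log(p) - np.log(q))).sum(axis=-1)
    return float(kl.mean() * T * T)

# Example: a student that matches the teacher incurs (near-)zero loss.
teacher = np.array([[2.0, 0.5, -1.0]])
student = np.array([[0.1, 0.2, 0.3]])
loss = distillation_loss(student, teacher)
```

In practice this soft-target term is combined with the ordinary hard-label loss on the student, weighted by a mixing coefficient.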
Keywords: Knowledge distillation; Multimodal; Variational mutual information
Classification: Information Technology and Security Science
张亚伟, 王晶晶, 李嘉贤, 周萌南. Modal Joint Representation Learning Based on Variational Distillation [J]. Computer Applications and Software, 2025, 42(5): 108-115, 129.
Funding
National Natural Science Foundation of China (62006166, 62076175, 62076176)
China Postdoctoral Science Foundation (2019M661930)
A project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD)