Application Research of Computers, 2025, Vol. 42, Issue 9: 2590-2598. DOI: 10.19734/j.issn.1001-3695.2025.02.0043

Multimodal dialogue model and applications based on modality-sensitive attention mechanism

Du Wei (杜维)¹, Zhu Xiaoying (朱晓瑛)², Xu Fangmin (许方敏)³, Zheng Jiansheng (郑建生)⁴, Zhu Fuxi (朱福喜)¹, Gong Mingmin (龚鸣敏)¹, Li Ziyu (李紫玉)¹

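Author information

1. School of Information Engineering, Wuhan College, Wuhan 430212
2. School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876
3. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876
4. School of Electronic Information, Wuhan University, Wuhan 430072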

Abstract

Multimodal dialogue systems use Transformers, cross-attention mechanisms, and pre-trained models to fuse text, speech, and video modalities of different granularities and to extract cross-modal features. However, existing research ignores the fact that different modal features differ in their sensitivity to classification tasks, which leads to excessive fusion and information redundancy. To account for the influence of the sequential characteristics of multimodal fusion on classification results, this paper proposes the multimodal dialogue model MDM-MSAM (multimodal dialogue model based on modality-sensitive attention mechanism). The model is divided into three parts: master-slave modality screening, dual-modal cross-modal fusion, and tri-modal cross-modal fusion. After determining the master and slave modalities and extracting cross-dual-modal features, the model re-fuses them with the tri-modal fusion features to form modality-sensitive hierarchical cross-multimodal features. Classification accuracy on the MIntRec and CMU-MOSI datasets increases by 3.15% and 3.5%, respectively, compared with the currently best-performing model. MDM-MSAM has also been deployed in a flow-engine-based multi-round dialogue system, where it achieves good application results.
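The hierarchical structure outlined in the abstract (dual-modal fusion first, then re-fusion with tri-modal features) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that master-slave screening has already picked the master modality, that the master supplies attention queries while slave modalities supply keys and values, that the two dual-modal results are summed, and that the final feature is a concatenation followed by mean pooling. The function names `cross_attention` and `hierarchical_fusion` are hypothetical, and no learned projection matrices are included.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(master, slave):
    """Scaled dot-product attention without learned weights.

    Assumption: the master sequence provides the queries and the
    slave sequence provides both keys and values.
    """
    d = len(master[0])
    out = []
    for q in master:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in slave]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, slave)) for j in range(d)])
    return out


def hierarchical_fusion(master, slave_a, slave_b):
    """Dual-modal fusion, then re-fusion with tri-modal features (illustrative)."""
    # Stage 1: the master attends to each slave modality separately.
    dual_a = cross_attention(master, slave_a)
    dual_b = cross_attention(master, slave_b)
    dual = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(dual_a, dual_b)]
    # Stage 2: the master attends to all three modalities at once.
    tri = cross_attention(master, master + slave_a + slave_b)
    # Stage 3: concatenate per time step, then mean-pool over time.
    fused = [rd + rt for rd, rt in zip(dual, tri)]
    n = len(fused)
    return [sum(col) / n for col in zip(*fused)]
```

With a feature width of `d` per modality, `hierarchical_fusion` returns a single vector of length `2 * d` that a downstream classifier could consume. A trained model would add learned query, key, and value projections, which this sketch omits.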

Key words

multimodal dialogue system/cross-modal features/sensitive differences/modality-sensitive attention mechanism/master-slave modality

Classification

Information Technology and Security Science

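Cite this article

Du Wei, Zhu Xiaoying, Xu Fangmin, Zheng Jiansheng, Zhu Fuxi, Gong Mingmin, Li Ziyu. Multimodal dialogue model and applications based on modality-sensitive attention mechanism[J]. Application Research of Computers, 2025, 42(9): 2590-2598.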

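Funding

National Natural Science Foundation of China (42374013)

Beijing Natural Science Foundation (L234080)

Wuhan College Scientific Research Fund, Annual Plan Project (JJA202304)

China University Industry-University-Research Innovation Fund, Tencent Technology Innovation Education Special Project (2022TX007)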

Application Research of Computers

Open access; Peking University Core Journal

ISSN: 1001-3695
