Multimodal dialogue model and applications based on modality-sensitive attention mechanism

Application Research of Computers (计算机应用研究), 2025, Vol. 42, Issue 9: 2590-2598 (9 pages). DOI: 10.19734/j.issn.1001-3695.2025.02.0043

Du Wei (杜维)¹, Zhu Xiaoying (朱晓瑛)², Xu Fangmin (许方敏)³, Zheng Jiansheng (郑建生)⁴, Zhu Fuxi (朱福喜)¹, Gong Mingmin (龚鸣敏)¹, Li Ziyu (李紫玉)¹

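Author information

  • 1. School of Information Engineering, Wuhan College, Wuhan 430212
  • 2. School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876
  • 3. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876
  • 4. Electronic Information School, Wuhan University, Wuhan 430072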

Abstract

Multimodal dialogue systems use Transformers, cross-attention mechanisms, and pre-trained models to fuse text, speech, and video modalities of different granularities and to extract cross-modal features. However, existing research ignores the differences in how sensitive the features of each modality are to the classification task, which results in excessive fusion and information redundancy. To address the influence of the sequential order of multimodal fusion on classification results, this paper proposes the multimodal dialogue model MDM-MSAM (multimodal dialogue model based on modality-sensitive attention mechanism). The model consists of three parts: master-slave modality screening, dual-modal cross-modal fusion, and tri-modal cross-modal fusion. After determining the master and slave modalities and extracting cross-dual-modal features, the model re-fuses them with the tri-modal fusion features to form modality-sensitive hierarchical cross-multimodal features. Classification accuracy on the MIntRec and CMU-MOSI datasets increases by 3.15% and 3.5%, respectively, compared with the currently best-performing model. MDM-MSAM has also been deployed in a flow-engine-based multi-round dialogue system, where it achieves good application results.
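
The three stages named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the master modality is selected or how each fusion is computed. The sketch assumes that the master is the modality with the highest unimodal score, that fusion is plain scaled dot-product cross-attention with the master as query, and that the final re-fusion is a concatenation. All function names and numbers below are hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys_values):
    # Scaled dot-product cross-attention; keys_values rows serve as both keys and values.
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(d)])
    return out

def mean_pool(seq):
    # Average a sequence of vectors into one vector.
    return [sum(row[j] for row in seq) / len(seq) for j in range(len(seq[0]))]

def hierarchical_fusion(features, unimodal_scores):
    # Stage 1 (assumed rule): the modality with the best unimodal score is the master.
    master = max(unimodal_scores, key=unimodal_scores.get)
    slaves = [m for m in features if m != master]
    # Stage 2: dual-modal fusion, the master sequence queries each slave sequence.
    dual = [mean_pool(cross_attention(features[master], features[s])) for s in slaves]
    # Stage 3 (assumed form): tri-modal fusion, the master queries all three sequences at once.
    all_tokens = [row for m in features for row in features[m]]
    tri = mean_pool(cross_attention(features[master], all_tokens))
    # Re-fusion (assumed form): concatenate tri-modal and dual-modal vectors.
    return master, tri + [x for vec in dual for x in vec]

# Made-up features: each modality is a short sequence of 2-dimensional vectors.
feats = {
    "text": [[0.9, 0.1], [0.8, 0.3]],
    "audio": [[0.2, 0.7], [0.4, 0.4], [0.1, 0.9]],
    "video": [[0.5, 0.5]],
}
scores = {"text": 0.7, "audio": 0.5, "video": 0.4}  # hypothetical unimodal accuracies
master, fused = hierarchical_fusion(feats, scores)
print(master, len(fused))  # text 6
```

In this sketch the dual-modal vectors keep the master-slave pairs separate, while the tri-modal vector mixes everything; concatenating them is one simple way to keep both levels of the hierarchy visible to a downstream classifier.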

Key words

multimodal dialogue system / cross-modal features / sensitive differences / modality-sensitive attention mechanism / master-slave modality

Classification

Information technology and security science

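Cite this article

Du Wei, Zhu Xiaoying, Xu Fangmin, Zheng Jiansheng, Zhu Fuxi, Gong Mingmin, Li Ziyu. Multimodal dialogue model and applications based on modality-sensitive attention mechanism[J]. Application Research of Computers, 2025, 42(9): 2590-2598.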

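Funding

National Natural Science Foundation of China (42374013)

Beijing Natural Science Foundation (L234080)

Wuhan College Scientific Research Fund Annual Plan (JJA202304)

China University Industry-University-Research Innovation Fund, Tencent Science and Technology Innovation Education Special Project (2022TX007)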

Application Research of Computers (计算机应用研究)

Open access; Peking University Core Journal (北大核心)

ISSN 1001-3695
