Application Research of Computers (计算机应用研究), 2026, Vol. 43, Issue 2: 353-360, 8. DOI: 10.19734/j.issn.1001-3695.2025.07.0229
Global-local interaction with contrastive learning for multimodal emotion recognition in conversations
Abstract
This paper proposes a multimodal emotion recognition method based on global-local interaction and contrastive learning to address three problems in conversational emotion recognition: the lack of global guidance, the difficulty of cross-modal semantic alignment, and imbalanced learning across modalities. The method introduces a semantic-guided global-local interaction mechanism in which a global semantic hub directs deep feature fusion through adaptive attention allocation. It further constructs a text-audio-visual tri-modal contrastive learning framework to align and complement modal representations within a shared semantic space. Additionally, it designs a modality-balanced optimizer that monitors the performance of each modality and dynamically adjusts learning rates, mitigating modal dominance. Experiments on the IEMOCAP and MELD datasets achieve accuracies of 76.09% and 69.66%, with weighted F1-scores of 76.20% and 68.79%, respectively, significantly surpassing existing approaches. The results confirm the method's effectiveness in enhancing multimodal collaboration and emotion recognition.
Keywords
multimodal emotion recognition / multimodal fusion / global-local interaction mechanism / contrastive learning / modal balance optimization
Classification
Information technology and security science
Cite this article
钮焱, 乐颖, 李军. Global-local interaction with contrastive learning for multimodal emotion recognition in conversations[J]. Application Research of Computers, 2026, 43(2): 353-360, 8.
Funding
National Natural Science Foundation of China (62202147)