南京师范大学学报(工程技术版) 2025, Vol.25, Issue(2): 43-53, 11. DOI: 10.3969/j.issn.1672-1292.2025.02.004
基于跨模态语义信息交互学习的多模态情感分析
Multimodal Sentiment Analysis Based on Interactive Learning of Semantic Information Across Modalities
Abstract
Multimodal sentiment analysis aims to accurately recognize, extract, and infer the sentiment polarity expressed across different modalities. Existing image-text sentiment analysis methods suffer from incomplete extraction of semantic features from images and text, and from insufficient interactive fusion of feature information between the visual and textual modalities. To address these problems, we propose a multimodal sentiment analysis approach based on interactive learning of cross-modal semantic information (MULTI-SMSI). First, DeBERTa is used as the textual feature extractor, exploiting its pre-training on large-scale corpora to efficiently obtain high-quality semantic representations, while ResNet50 extracts visual semantic features from images. Second, a cross-modal multi-head cross-attention mechanism dynamically assigns attention weights between the image and text modalities in the hidden states, so that the semantic correlations between images and text are learned accurately during fusion. Meanwhile, a self-attention mechanism captures the global feature information of each modality, and a Transformer encoder further models the key long-range dependencies within the image-text sequence, yielding richer and more accurate feature representations and improving the generalization ability of the model. Finally, experimental results on two publicly available datasets, MVSA-Single and MVSA-Multiple, show that the proposed model outperforms a series of benchmark models, demonstrating its effectiveness.
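To make the fusion pipeline described in the abstract concrete, the sketch below is a minimal PyTorch rendition of its stages: projected DeBERTa text features and ResNet50 image features, cross-modal multi-head cross attention in both directions, per-modality self-attention, and a joint Transformer encoder followed by a polarity classifier. All module names, dimensions, and the pooling/concatenation head are illustrative assumptions and not the authors' published MULTI-SMSI implementation.

# Minimal PyTorch sketch of the fusion stage described in the abstract.
# Module names and dimensions are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn


class CrossModalFusion(nn.Module):
    """Fuses DeBERTa text features with ResNet50 image features via
    cross-modal multi-head cross attention, per-modality self-attention,
    and a Transformer encoder, then classifies sentiment polarity."""

    def __init__(self, d_text=768, d_image=2048, d_model=768,
                 n_heads=8, n_layers=2, n_classes=3):
        super().__init__()
        self.text_proj = nn.Linear(d_text, d_model)    # align text dimension
        self.image_proj = nn.Linear(d_image, d_model)  # align image dimension
        # Cross attention: each modality queries the other.
        self.text2img = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.img2text = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Self-attention to capture global features within each modality.
        self.text_self = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.image_self = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Transformer encoder for long-range dependencies over the fused sequence.
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        self.classifier = nn.Linear(2 * d_model, n_classes)

    def forward(self, text_feats, image_feats):
        # text_feats:  (B, Lt, 768)  e.g. DeBERTa last hidden states
        # image_feats: (B, Li, 2048) e.g. flattened ResNet50 conv feature map
        t = self.text_proj(text_feats)
        v = self.image_proj(image_feats)
        # Cross-modal attention: text queries image regions and vice versa.
        t_cross, _ = self.text2img(query=t, key=v, value=v)
        v_cross, _ = self.img2text(query=v, key=t, value=t)
        # Self-attention over each cross-enriched modality.
        t_glob, _ = self.text_self(t_cross, t_cross, t_cross)
        v_glob, _ = self.image_self(v_cross, v_cross, v_cross)
        # Joint Transformer encoding over the concatenated image-text sequence.
        fused = self.encoder(torch.cat([t_glob, v_glob], dim=1))
        t_pool = fused[:, :t.size(1)].mean(dim=1)
        v_pool = fused[:, t.size(1):].mean(dim=1)
        return self.classifier(torch.cat([t_pool, v_pool], dim=-1))


# Example with dummy tensors standing in for DeBERTa / ResNet50 outputs.
model = CrossModalFusion()
logits = model(torch.randn(2, 32, 768), torch.randn(2, 49, 2048))
print(logits.shape)  # torch.Size([2, 3])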
Keywords: multimodal sentiment analysis; cross-modal multi-head cross-attention mechanism; self-attention mechanism; interactive learning; visual semantics
Classification: Computer and Automation
Cite this article: 赵雪峰, 柏长泽, 狄恒西, 仲兆满. 基于跨模态语义信息交互学习的多模态情感分析[J]. 南京师范大学学报(工程技术版), 2025, 25(2): 43-53, 11.
Funding: National Natural Science Foundation of China (72174079); Jiangsu Province "Qinglan Project" Excellent Teaching Team Program (2022-29).