Acta Scientiarum Naturalium Universitatis Pekinensis, 2024, Vol. 60, Issue 3: 393-402, 10. DOI: 10.13209/j.0479-8023.2024.034
Multimodal Emotion Recognition Based on Hierarchical Fusion Strategy and Contextual Information Embedding
Abstract
Existing fusion strategies often simply concatenate modal features, disregarding the personalized fusion requirements that arise from the characteristics of each modality. In addition, considering the emotion of each utterance in isolation, without accounting for its emotional state within the context, can lead to errors in emotion recognition. To address these issues, this paper proposes a multimodal emotion recognition method based on a hierarchical fusion strategy and contextual information embedding. The method integrates the features of different modalities progressively, level by level, to reduce noise interference from individual modalities and to resolve inconsistencies in expression across modalities. It also uses contextual information to analyze the emotional representation of each utterance within its context, improving overall emotion recognition performance. In binary emotion classification tasks, the proposed method improves accuracy by 1.54% over the state-of-the-art (SOTA) model. In multi-class emotion recognition tasks, it improves the F1 score by 2.79% over the SOTA model.
Key words
hierarchical fusion/noise interference/context information embedding
Cite this article
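Sun Minglong, Ouyang Chunping, Liu Yongbin, Ren Lin. Multimodal Emotion Recognition Based on Hierarchical Fusion Strategy and Contextual Information Embedding[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2024, 60(3): 393-402, 10.
Funding
Supported by the Natural Science Foundation of Hunan Province (2022JJ30495) and the Key Scientific Research Project of the Education Department of Hunan Province (22A0316)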