Abstract
To address the problems of inefficient feature extraction, inadequate feature fusion, and low prediction accuracy in current audio-video multimodal emotion analysis, a multimodal emotion analysis model based on feature fusion and multi-task learning was proposed. First, the pre-trained models BERT (bidirectional encoder representations from transformers), Wav2Vec (waveform to vector), and CLIP (contrastive language-image pre-training) were used to generate low-order feature representations from text, audio, and images respectively, which were then fed into a neural network to extract high-order features capturing local and temporal characteristics. Next, the proposed attention fusion module was employed to fuse the three modalities interactively. Finally, multi-task learning was integrated to further improve emotion recognition accuracy. Experimental results on the public Chinese multimodal dataset CH-SIMS show a significant improvement in emotion classification accuracy.
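The pipeline described above can be illustrated with a minimal sketch, assuming PyTorch: pre-trained encoders are taken to have already produced low-order feature sequences, a single-layer GRU stands in for the high-order (local and temporal) feature extractor, cross-modal attention plays the role of the fusion module, and separate heads give per-modality and fused predictions as in the CH-SIMS multi-task setup. All module names, dimensions, and pooling choices here are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed PyTorch) of the pipeline outlined in the abstract.
# Dimensions are typical for BERT / Wav2Vec / CLIP outputs but are assumptions.
import torch
import torch.nn as nn


class AttentionFusion(nn.Module):
    """Cross-modal attention: one modality attends to the other two."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, query, key_value):
        fused, _ = self.attn(query, key_value, key_value)
        return fused


class MultimodalSentimentModel(nn.Module):
    def __init__(self, text_dim=768, audio_dim=512, image_dim=512, hidden=128):
        super().__init__()
        # Per-modality extractors of high-order (local + temporal) features;
        # a GRU is used here only as a stand-in.
        self.text_rnn = nn.GRU(text_dim, hidden, batch_first=True)
        self.audio_rnn = nn.GRU(audio_dim, hidden, batch_first=True)
        self.image_rnn = nn.GRU(image_dim, hidden, batch_first=True)
        self.fusion = AttentionFusion(hidden)
        # Multi-task heads: one sentiment score per modality plus one for the
        # fused representation, mirroring the CH-SIMS label layout.
        self.heads = nn.ModuleDict(
            {m: nn.Linear(hidden, 1) for m in ("text", "audio", "image", "fused")}
        )

    def forward(self, text_feat, audio_feat, image_feat):
        t, _ = self.text_rnn(text_feat)
        a, _ = self.audio_rnn(audio_feat)
        v, _ = self.image_rnn(image_feat)
        # Text attends to the concatenated audio/visual sequences.
        fused = self.fusion(t, torch.cat([a, v], dim=1)).mean(dim=1)
        return {
            "text": self.heads["text"](t.mean(dim=1)),
            "audio": self.heads["audio"](a.mean(dim=1)),
            "image": self.heads["image"](v.mean(dim=1)),
            "fused": self.heads["fused"](fused),
        }
```

In a multi-task setup of this kind, the total loss would typically be a weighted sum of the losses on the four outputs, so the unimodal labels regularize the fused prediction.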
Key words
video multimodality/emotion recognition/attention mechanism/multi-task learning
Classification
Information Technology and Security Science