Journal of Southeast University (English Edition), 2017, Vol. 33, Issue 4: 444-447 (4 pages). DOI: 10.3969/j.issn.1003-7985.2017.04.009
Multimodal emotion recognition based on deep neural network
Ye Jiayin 1, Zheng Wenming 1, Li Yang 1, Cai Youyi 1, Cui Zhen 1
Author information
- 1. School of Biological Science and Medical Engineering, Southeast University, Nanjing 210029
Abstract
To improve the accuracy of emotion recognition from voice and video, a convolutional neural network (CNN) combined with a recurrent neural network (RNN) is used to encode and integrate the two information sources. For the audio signals, several frequency bands and some energy functions are extracted as low-level features using an audio feature-extraction technique, and these are then encoded with a one-dimensional (1D) CNN to abstract high-level features. Finally, the high-level features are fed into an RNN to capture dynamic tone changes along the temporal dimension. For the video signals, by contrast, a two-dimensional (2D) CNN and a similar RNN are used to capture dynamic changes in facial appearance over temporal sequences. The method was evaluated on the Chinese Natural Audio-Visual Emotion Database of the Chinese Conference on Pattern Recognition (CCPR) 2016. Experimental results show that the proposed method achieves an average classification precision of 41.15%, an increase of 16.62% over the baseline algorithm provided by CCPR 2016. These results indicate that the proposed method identifies emotional information more accurately.
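The audio branch described in the abstract (low-level frame features, a 1D CNN over time, then an RNN) can be illustrated with a minimal, framework-free sketch. Everything concrete here is an assumption made for illustration and does not come from the paper: the function names (`conv1d_relu`, `rnn_last_state`, `audio_branch`), the layer sizes, the number of classes, the plain Elman-style recurrence, and the random untrained weights. It shows only the data flow, not the authors' implementation.

```python
import math
import random


def conv1d_relu(frames, kernels, bias):
    """Valid 1D convolution along the time axis, followed by ReLU.

    frames:  list of T frames, each a list of D low-level features
    kernels: list of C filters, each a list of K rows of D weights
    returns: list of T-K+1 frames, each a list of C activations
    """
    k = len(kernels[0])
    out = []
    for t in range(len(frames) - k + 1):
        row = []
        for filt, b in zip(kernels, bias):
            s = b
            for i in range(k):
                s += sum(w * x for w, x in zip(filt[i], frames[t + i]))
            row.append(max(0.0, s))
        out.append(row)
    return out


def rnn_last_state(seq, w_in, w_rec):
    """Simple Elman-style RNN; returns the hidden state after the last step."""
    h = [0.0] * len(w_rec)
    for x in seq:
        h = [
            math.tanh(
                sum(wi * xi for wi, xi in zip(w_in[j], x))
                + sum(wr * hi for wr, hi in zip(w_rec[j], h))
            )
            for j in range(len(h))
        ]
    return h


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def audio_branch(frames, n_classes=4, channels=4, kernel=3, hidden=5, seed=0):
    """Hypothetical audio branch: 1D CNN -> RNN -> softmax over emotion classes.

    All sizes are illustrative placeholders and the weights are random
    (untrained), so the output is only a shape-correct probability vector.
    """
    rng = random.Random(seed)
    d = len(frames[0])
    kernels = [rand_matrix(kernel, d, rng) for _ in range(channels)]
    bias = [0.0] * channels
    w_in = rand_matrix(hidden, channels, rng)
    w_rec = rand_matrix(hidden, hidden, rng)
    w_out = rand_matrix(n_classes, hidden, rng)

    high_level = conv1d_relu(frames, kernels, bias)        # encode frames
    h = rnn_last_state(high_level, w_in, w_rec)            # temporal dynamics
    logits = [sum(w * v for w, v in zip(row, h)) for row in w_out]
    return softmax(logits)


# Example: 10 frames of 6 made-up low-level features each.
_rng = random.Random(1)
example_frames = [[_rng.random() for _ in range(6)] for _ in range(10)]
example_probs = audio_branch(example_frames)
print(len(example_probs), round(sum(example_probs), 6))
```

Under the same assumptions, the video branch would follow the same pattern, with a 2D convolution applied to each face image before the recurrent step. How the two branches are fused is not specified in the abstract, so it is left out of this sketch.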
Keywords
emotion recognition / convolutional neural network (CNN) / recurrent neural network (RNN)
Classification
Information technology and security science
Cite this article
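Ye Jiayin, Zheng Wenming, Li Yang, Cai Youyi, Cui Zhen. Multimodal emotion recognition based on deep neural network [J]. Journal of Southeast University (English Edition), 2017, 33(4): 444-447 (4 pages).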