Computer Engineering (计算机工程), 2025, Vol. 51, Issue (4): 169-177, 9. DOI: 10.19678/j.issn.1000-3428.0069101
Speech Emotion Recognition Based on Memory Capsules and Attention
Abstract
In current speech emotion recognition systems, insufficient extraction of emotional features and the limited ability of models to represent complex emotional expressions reduce recognition accuracy. This paper proposes a speech emotion recognition method based on memory capsules and attention to improve that accuracy. First, five speech features are extracted: the Mel Frequency Cepstrum Coefficient (MFCC), the Root Mean Square (RMS) of energy, the Mel-spectrogram, the Zero-Crossing Rate (ZCR), and the chromaticity distribution (CHROMA). Next, the first-, second-, and third-order differences of the MFCC are computed from the MFCC features and concatenated with them. Finally, all features are stacked into one-dimensional vectors, and emotion classification is performed by a model built from a memory capsule network and an attention mechanism. The experimental results show that the proposed model exhibits improved generalization and robustness. It achieves accuracies of 95.87%, 98.82%, and 98.23% on the RAVDESS, EMODB, and IEMOCAP datasets, respectively, which are higher than those of the existing methods used for comparison.
Keywords
speech emotion recognition / feature extraction / feature stacking / memory capsule network / attention mechanism
Classification
Information Technology and Security Science
Cite this article
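DONG Hongliang (董红亮), NIU Yan (钮焱), SUN Yang (孙杨), LI Jun (李军). Speech Emotion Recognition Based on Memory Capsules and Attention[J]. Computer Engineering (计算机工程), 2025, 51(4): 169-177, 9.
Funding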
National Natural Science Foundation of China (62202147).
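Illustrative sketch
The feature-stacking step summarized in the abstract (MFCC plus its first-, second-, and third-order differences, combined with other frame-level features into a single one-dimensional vector) might look roughly like the following. This is a minimal sketch and not the authors' implementation. It assumes that an MFCC matrix of shape frames x coefficients has already been computed elsewhere, that the differences are plain frame-to-frame differences, and that each feature is reduced to one dimension by averaging over time; the abstract confirms none of these choices. The Mel-spectrogram and CHROMA features are omitted for brevity, and all function names and the frame parameters are hypothetical.

```python
import math


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (no padding)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def rms(frame):
    """Root mean square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def delta(matrix):
    """Frame-to-frame difference of a (frames x coeffs) matrix.

    Assumption: a plain difference stands in for whatever delta
    estimator the paper actually uses.
    """
    return [[b - a for a, b in zip(prev, cur)]
            for prev, cur in zip(matrix, matrix[1:])]


def time_mean(matrix):
    """Average each coefficient over all frames."""
    n = len(matrix)
    return [sum(col) / n for col in zip(*matrix)]


def stack_features(signal, mfcc, frame_len=400, hop=160):
    """Build one 1-D feature vector from MFCC, its deltas, RMS and ZCR.

    `mfcc` must contain at least four frames so that the
    third-order difference is non-empty.
    """
    frames = frame_signal(signal, frame_len, hop)
    d1 = delta(mfcc)   # first-order difference
    d2 = delta(d1)     # second-order difference
    d3 = delta(d2)     # third-order difference

    vector = []
    for block in (mfcc, d1, d2, d3):
        vector.extend(time_mean(block))
    vector.append(sum(rms(f) for f in frames) / len(frames))
    vector.append(sum(zero_crossing_rate(f) for f in frames) / len(frames))
    return vector
```

For an MFCC matrix with C coefficients, the resulting vector has 4C + 2 entries; a classifier such as the capsule-and-attention model described in the abstract would take this vector as input.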