Speech Emotion Recognition for Imbalanced Datasets

Zhang Huiyun¹, Huang Heming¹

Computer Engineering and Applications, 2024, Vol. 60, Issue 4: 122-132, 11. DOI: 10.3778/j.issn.1002-8331.2209-0099

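Author information

1. School of Computer Science, Qinghai Normal University, Xining 810008; State Key Laboratory of Tibetan Intelligent Speech Information Processing and Application, Xining 810008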


Abstract

Sample balance is crucial for machine learning. On imbalanced datasets, the importance of a class may be greater than its sample count suggests. This paper studies imbalanced datasets for speech emotion recognition. First, the imbalanced baseline datasets EMODB and IEMOCAP are augmented at different signal-to-noise ratios to construct the datasets EMODBM and IEMOCAPM. Second, six techniques, namely SMOTE, RandomOverSampler, SMOTEENN, ADASYN, TomekLinks, and SMOTETomek, are adopted to resample the baseline datasets, and augmented datasets are constructed to achieve class balance. Third, 21-dimensional low-level descriptor features are extracted from the baseline datasets and the augmented datasets. Finally, a novel model, MA-CapsNet, is proposed to validate the effectiveness of the resampling techniques. The results show that the emotion classes are roughly balanced after resampling, which makes the learning of MA-CapsNet fairer. In addition, MA-CapsNet is more robust on the resampled datasets.
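The two data-preparation ideas in the abstract, augmentation at a chosen signal-to-noise ratio and resampling toward class balance, can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes additive white Gaussian noise for the augmentation and uses plain random oversampling as a stand-in for the six resamplers named above. The function names `add_noise_at_snr` and `random_oversample`, the toy sine waveform, and the class labels and sizes are all hypothetical.

```python
import math
import random
from collections import Counter


def add_noise_at_snr(signal, snr_db, rng):
    """Return a copy of `signal` with white Gaussian noise at the given SNR (dB)."""
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def random_oversample(features, labels, rng):
    """Duplicate randomly chosen samples of each smaller class until all classes match the largest."""
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


rng = random.Random(0)

# Toy waveform (assumption): one second of a 5 Hz sine sampled at 1 kHz.
clean = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
noisy = add_noise_at_snr(clean, snr_db=10, rng=rng)

# Toy imbalanced label set (assumed class names and sizes) with 21-dimensional features.
labels = ["anger"] * 50 + ["neutral"] * 30 + ["disgust"] * 8
features = [[rng.random() for _ in range(21)] for _ in labels]
balanced_x, balanced_y = random_oversample(features, labels, rng)
class_counts = Counter(balanced_y)
```

After the call, every class in `class_counts` has as many samples as the largest original class. Random oversampling only duplicates existing samples, whereas SMOTE and ADASYN synthesize new ones and TomekLinks removes samples, so this sketch shows the balancing goal rather than those specific algorithms.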

Keywords

speech emotion recognition / resampling / capsule network / data augmentation

Classification

Information Technology and Security Science

Cite this article

Zhang Huiyun, Huang Heming. Speech emotion recognition for imbalanced datasets[J]. Computer Engineering and Applications, 2024, 60(4): 122-132, 11.

Funding

National Natural Science Foundation of China (62066039)

Natural Science Foundation of Qinghai Province (2022-ZJ-925)

Computer Engineering and Applications

Open access | PKU Core Journal | CSTPCD

ISSN: 1002-8331
