Journal of Southeast University (Natural Science Edition), 2017, Vol. 47, Issue 4: 631-636 (6 pages). DOI: 10.3969/j.issn.1001-0505.2017.04.001
Improved stacked autoencoder for Chinese speech emotion recognition
Abstract
An improved stacked autoencoder built on the autoencoder, the denoising autoencoder and the sparse autoencoder is proposed to improve Chinese speech emotion recognition. The first layer of the structure uses a denoising autoencoder to learn hidden features whose dimension is larger than that of the input features, and the second layer employs a sparse autoencoder to learn sparse features. Finally, a softmax classifier is applied to classify the features. During training, layer-wise pre-training is used to initialize all parameters of the network, and the whole network is then fine-tuned. Experiments on Chinese databases show that the improved stacked autoencoder achieves a better recognition rate than stacked denoising autoencoders or stacked sparse autoencoders. In addition, comparative experiments on the CASIA database show that the recognition rate of the structure is improved by 53.7%, 29.8%, 14.3% and 1.9% compared with the K-nearest neighbor algorithm, the sparse representation method, the traditional support vector machine and the artificial neural network, respectively. On the self-recorded database, the recognition rate of this structure is 1.64% higher than that of the artificial neural network.
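The layer arrangement described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the sigmoid units, squared reconstruction error, masking noise, KL sparsity penalty, the layer sizes (2d, then d) and every hyperparameter value are assumptions chosen only for illustration, and all function names are hypothetical. The sketch covers layer-wise pre-training and the softmax layer; the final fine-tuning of the whole network is omitted for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def pretrain_autoencoder(X, n_hidden, rng, noise=0.0, rho=None, beta=0.0,
                         lr=0.5, epochs=200):
    """Train one sigmoid autoencoder with batch gradient descent (illustrative).

    noise > 0        : denoising variant; inputs are randomly zeroed, target stays clean.
    rho is not None  : sparse variant; a KL penalty pulls mean activations toward rho.
    Returns the encoder weights and bias only.
    """
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        X_in = X * (rng.random(X.shape) >= noise)    # masking corruption (assumed noise type)
        H = sigmoid(X_in @ W1 + b1)                  # hidden code
        X_hat = sigmoid(H @ W2 + b2)                 # reconstruction of the clean input
        d_out = (X_hat - X) * X_hat * (1.0 - X_hat) / n
        d_H = d_out @ W2.T
        if rho is not None:
            rho_hat = np.clip(H.mean(axis=0), 1e-6, 1.0 - 1e-6)
            # gradient of beta * sum_j KL(rho || rho_hat_j) with respect to H
            d_H = d_H + beta * (-rho / rho_hat + (1.0 - rho) / (1.0 - rho_hat)) / n
        d_hid = d_H * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out);    b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X_in.T @ d_hid); b1 -= lr * d_hid.sum(axis=0)
    return W1, b1

def train_softmax(F, y, n_classes, lr=0.5, epochs=300):
    """Softmax regression on fixed features F, trained with cross-entropy loss."""
    n, d = F.shape
    W = np.zeros((d, n_classes)); b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                         # one-hot labels
    for _ in range(epochs):
        G = (softmax(F @ W + b) - Y) / n
        W -= lr * (F.T @ G); b -= lr * G.sum(axis=0)
    return W, b

def build_and_predict(X, y, n_classes, seed=0):
    """Pre-train the two layers in order, then fit a softmax layer on top."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Layer 1: denoising autoencoder with a hidden layer larger than the input (2d is assumed).
    W1, b1 = pretrain_autoencoder(X, 2 * d, rng, noise=0.2)
    H1 = sigmoid(X @ W1 + b1)
    # Layer 2: sparse autoencoder on the layer-1 features (size d and rho=0.05 are assumed).
    W2, b2 = pretrain_autoencoder(H1, d, rng, rho=0.05, beta=0.1)
    H2 = sigmoid(H1 @ W2 + b2)
    Ws, bs = train_softmax(H2, y, n_classes)
    params = {"W1": W1, "b1": b1, "W2": W2, "b2": b2, "Ws": Ws, "bs": bs}
    return softmax(H2 @ Ws + bs), params
```

Calling build_and_predict(X, y, n_classes) on a feature matrix X of shape (samples, features) with integer labels y returns class probabilities and the learned parameters. In the structure the abstract describes, these pre-trained parameters would serve as the initialization for fine-tuning the whole network by backpropagation.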
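Key words
speech emotion recognition / improved stacked autoencoder / denoising autoencoder / sparse autoencoder
Classification
Information technology and security science
Cite this article
Zhu Fangmei (朱芳枚), Zhao Li (赵力), Liang Ruiyu (梁瑞宇), Wang Qingyun (王青云), Zou Cairong (邹采荣). Improved stacked autoencoder for Chinese speech emotion recognition [J]. Journal of Southeast University (Natural Science Edition), 2017, 47(4): 631-636.
Funding
National Natural Science Foundation of China (61375028, 61571106, 61673108); Qing Lan Project of Jiangsu Province; Jiangsu Planned Projects for Postdoctoral Research Funds (1601011B); Six Talent Peaks Project of Jiangsu Province (2016-DZXX-023); China Postdoctoral Science Foundation (2016M601695).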