Speech Emotion Recognition Model Based on Multi-scale Convolution and Multi-head Self-attention

钟善机, 张学习, 陈楚嘉, 高学秋, 陶杰

Automation & Information Engineering (自动化与信息工程), 2024, Vol. 45, Issue (4): 36-41, 49, 7. DOI: 10.3969/j.issn.1674-2605.2024.04.006

Author information

1. Guangdong University of Technology, Guangzhou, Guangdong 510006

Abstract

A speech emotion recognition model based on multi-scale convolution and multi-head self-attention (MCNN-MHA) is proposed to address the inability of traditional convolutional neural networks to fully capture time-domain and frequency-domain details in speech emotion recognition. First, a multi-scale convolutional neural network convolves the input at different scales to obtain features across different time and frequency ranges. Then, a multi-head self-attention mechanism is introduced to automatically learn the relevant and important features in the speech signal, attending to different feature subspaces to strengthen the model's perception of important features. In addition, the frequency-domain and time-domain masks from SpecAugment are used to augment the data samples and improve the generalization and robustness of the model. Experimental results show that the MCNN-MHA model achieves an accuracy of 90.35% on the RAVDESS dataset.
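
The SpecAugment masking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `spec_augment`, the maximum mask widths, the mask value, and the use of a single mask of each kind are all assumptions made for illustration, and the spectrogram is represented as a plain list of rows indexed as `spec[frequency_bin][time_frame]` so that the example needs only the standard library.

```python
import random


def spec_augment(spec, max_freq_width=8, max_time_width=16,
                 mask_value=0.0, rng=None):
    """Return a copy of `spec` with one frequency mask and one time mask.

    `spec` is a list of rows indexed as spec[frequency_bin][time_frame].
    The widths and mask value are illustrative defaults, not values
    taken from the paper.
    """
    rng = rng or random.Random()
    n_freq = len(spec)
    n_time = len(spec[0]) if n_freq else 0
    out = [list(row) for row in spec]  # copy so the input is left untouched

    # Frequency mask: blank f consecutive frequency bins starting at f0.
    f = rng.randint(0, min(max_freq_width, n_freq))
    f0 = rng.randint(0, n_freq - f)
    for i in range(f0, f0 + f):
        out[i] = [mask_value] * n_time

    # Time mask: blank t consecutive time frames starting at t0.
    t = rng.randint(0, min(max_time_width, n_time))
    t0 = rng.randint(0, n_time - t)
    for row in out:
        for j in range(t0, t0 + t):
            row[j] = mask_value

    return out


# Example: a dummy 40-bin by 100-frame spectrogram filled with ones.
dummy = [[1.0] * 100 for _ in range(40)]
augmented = spec_augment(dummy, rng=random.Random(0))
```

Because the mask widths and positions are drawn at random on every call, each training pass would see a differently masked version of the same spectrogram, which is the mechanism by which this kind of augmentation is expected to improve generalization.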

Keywords

speech emotion recognition; multi-scale convolutional neural network; multi-head self-attention mechanism; SpecAugment

Classification

Information Technology and Security Science

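Cite this article

钟善机, 张学习, 陈楚嘉, 高学秋, 陶杰. Speech Emotion Recognition Model Based on Multi-scale Convolution and Multi-head Self-attention[J]. Automation & Information Engineering (自动化与信息工程), 2024, 45(4): 36-41, 49, 7.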

Funding

National Natural Science Foundation of China (62276069)

Automation & Information Engineering (自动化与信息工程)

ISSN: 1674-2605
