Sichuan dialect speech recognition based on ResCNN-BiGRU

Modern Electronics Technique (现代电子技术), 2024, Vol. 47, Issue (1): 89-93, 5. DOI: 10.16652/j.issn.1004-373x.2024.01.016

Xie Jinhong (谢金洪)¹, Wei Xia (魏霞)¹

Author information

1. School of Electrical Engineering, Xinjiang University, Urumqi, Xinjiang 830017

Abstract

Speech recognition models based on deep convolutional neural networks (DCNN) extract the phonemic features of specific dialects poorly, so part of the low-level feature information of dialect pronunciation is lost, which in turn leads to low dialect recognition accuracy and poor robustness. A model combining a residual network (ResNet) with a bidirectional gated recurrent unit (BiGRU) network is therefore proposed. Gammatone frequency cepstral coefficient (GFCC) feature maps are taken as the input. A multi-scale convolutional layer is designed in the residual network so that features are extracted by convolution kernels of different sizes. The BiGRU network then captures the long-term dependencies in the sequence data. Finally, the connectionist temporal classification (CTC) algorithm is used for soft label alignment, yielding a Sichuan dialect speech recognition model. Experimental results on a Sichuan dialect corpus show that the proposed model outperforms the existing benchmark models in recognition performance.
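
The soft label alignment step can be illustrated with a minimal sketch of how a CTC output is turned into a label sequence. The abstract does not say which decoding strategy the authors use, so the greedy best-path rule below is an assumption chosen for simplicity. The function name `ctc_greedy_decode`, the blank index 0, and the toy scores are all hypothetical.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (not stated in the abstract)


def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Greedy best-path CTC decoding.

    frame_scores: one list of per-symbol scores for each time frame.
    Returns the label sequence after merging repeats and dropping blanks.
    """
    # 1. Pick the highest-scoring symbol in every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge consecutive repeats of the same symbol.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Remove blanks; a blank between two equal labels keeps them distinct.
    return [label for label in collapsed if label != blank]


# Toy scores over 3 symbols: 0 = blank, 1 and 2 are hypothetical labels.
frames = [
    [0.10, 0.80, 0.10],  # label 1
    [0.20, 0.70, 0.10],  # label 1 again (merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.70, 0.20],  # label 1 (kept, because a blank separates it)
    [0.10, 0.20, 0.70],  # label 2
]
print(ctc_greedy_decode(frames))  # [1, 1, 2]
```

Because the blank symbol absorbs frames that carry no label, the network does not need a frame-by-frame transcription, which is why CTC training requires no forced alignment between audio frames and labels.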

Keywords

Sichuan dialect/phonemic feature/BiGRU network/multi-scale convolution/connectionist temporal classification/soft label alignment

Classification

Information technology and security science

Cite this article

Xie Jinhong, Wei Xia. Sichuan dialect speech recognition based on ResCNN-BiGRU [J]. Modern Electronics Technique, 2024, 47(1): 89-93, 5.

Modern Electronics Technique

OA | Peking University Core Journal | CSTPCD

ISSN: 1004-373X
