Modern Electronics Technique (现代电子技术), 2024, Vol. 47, Issue (1): 89-93, 5. DOI: 10.16652/j.issn.1004-373x.2024.01.016
Sichuan dialect speech recognition based on ResCNN-BiGRU
Xie Jinhong (谢金洪) 1, Wei Xia (魏霞) 1
Author information
- 1. School of Electrical Engineering, Xinjiang University, Urumqi, Xinjiang 830017
Abstract
Speech recognition models based on deep convolutional neural networks (DCNN) extract the phonemic features of specific dialects poorly, so part of the low-level feature information of dialect pronunciation is lost, which in turn leads to low dialect recognition accuracy and poor robustness. To address this, a model combining a residual network (ResNet) with a bidirectional gated recurrent unit (BiGRU) network is proposed. Gammatone frequency cepstral coefficient (GFCC) feature maps are taken as the input. A multi-scale convolutional layer is designed in the residual network so that features are extracted by convolution kernels of different sizes. The BiGRU network then captures the long-term dependencies in the sequence data. Finally, the connectionist temporal classification (CTC) algorithm is used for soft label alignment, yielding a Sichuan dialect speech recognition model. Experimental results on a Sichuan dialect corpus show that the proposed model outperforms the existing benchmark models.
Key words
Sichuan dialect / phonemic feature / BiGRU network / multi-scale convolution / connectionist temporal classification / label soft alignment
Classification
Information technology and security science
Cite this article
Xie Jinhong, Wei Xia. Sichuan dialect speech recognition based on ResCNN-BiGRU [J]. Modern Electronics Technique, 2024, 47(1): 89-93, 5.
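
The abstract's final step, soft label alignment, can be illustrated with a minimal pure-Python sketch of the standard CTC rule. This is not the authors' implementation. The blank symbol "_", the function names, and the toy per-frame probabilities are all assumptions made for illustration. The first function collapses a frame-level path into a label sequence. The second sums the probability of every frame-level path that collapses to a given label sequence, which is what makes the alignment "soft".

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol, chosen for this sketch


def ctc_collapse(path, blank=BLANK):
    """Merge consecutive repeats, then drop blanks."""
    return [sym for sym, _ in groupby(path) if sym != blank]


def ctc_label_probability(probs, labels, blank=BLANK):
    """Total probability of all frame-level paths that collapse to `labels`.

    probs: one dict per frame, mapping symbol -> probability.
    """
    # Extended label sequence with blanks interleaved: [_, l1, _, l2, _, ...]
    ext = [blank]
    for lab in labels:
        ext += [lab, blank]
    size = len(ext)

    # alpha[s]: probability of all path prefixes that end at position s of ext
    alpha = [0.0] * size
    alpha[0] = probs[0].get(ext[0], 0.0)
    if size > 1:
        alpha[1] = probs[0].get(ext[1], 0.0)

    for frame in probs[1:]:
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]       # advance by one position
            # Skipping a blank is only allowed between two different labels
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * frame.get(ext[s], 0.0)
        alpha = new

    # Valid paths end on the last label or on the trailing blank
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)
```

For example, with two frames that each assign 0.6 to "a" and 0.4 to the blank, the paths "aa", "a_" and "_a" all collapse to ["a"], so `ctc_label_probability` returns approximately 0.84 for that label. In a trained model the per-frame distributions would come from the network's output layer rather than being written by hand.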