Research on Streaming Speech Recognition Technology Based on Two-Pass Approach

GAO Lu¹, WANG Yahao¹, ZHANG Fei¹, REN Xiaoying¹, HAO Bin¹, HAN Yaxu¹

电子器件 (Chinese Journal of Electron Devices), 2025, Vol. 48, Issue 6: 1260-1267 (8 pages). DOI: 10.3969/j.issn.1005-9490.2025.06.010

Author information

1. Institute of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, Inner Mongolia 014010, China

Abstract

Recently, end-to-end models based on the RNN-Transducer (RNN-T) have shown superior performance on streaming speech recognition tasks. Although RNN-T is inherently capable of streaming, its recognition quality still lags behind that of advanced non-streaming models. In addition, RNN-T tends to postpone its predictions, thus incurring higher partial latency. To better balance character error rate (CER) and latency, a two-pass model combining RNN-T with an attention-based encoder-decoder is proposed. Specifically, the first pass replaces the RNN-T encoder with Transformer layers that use blockwise parallelization to capture global context across chunks and to reduce decoding cost, which lowers latency. The second pass adopts an improved Transformer rescorer that processes the entire streaming hypothesis in parallel, making more efficient use of computational resources. Experiments on Aishell-1 show that, at acceptable latency, the proposed two-pass model reduces CER by approximately 40% compared with RNN-T. The proposed model effectively balances recognition accuracy and latency for streaming ASR.
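
The two ideas named in the abstract, chunk-wise attention in the first pass and rescoring of streaming hypotheses in the second pass, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `left_chunks` parameter, the linear interpolation weight `lam`, and the callable `second_pass_score` are all hypothetical choices made for illustration, since the abstract does not specify the mask layout or the score combination rule.

```python
from typing import Callable, List, Sequence, Tuple


def chunk_attention_mask(
    num_frames: int, chunk_size: int, left_chunks: int = -1
) -> List[List[bool]]:
    """Build a chunk-wise self-attention mask (illustrative assumption).

    mask[i][j] is True when frame i may attend to frame j. A frame sees
    every frame in its own chunk plus `left_chunks` preceding chunks
    (all preceding chunks when left_chunks < 0). It never sees frames in
    later chunks, which is what keeps the encoder streamable.
    """
    if num_frames < 0 or chunk_size <= 0:
        raise ValueError("num_frames must be >= 0 and chunk_size > 0")
    mask = []
    for i in range(num_frames):
        chunk_idx = i // chunk_size
        first_chunk = 0 if left_chunks < 0 else max(0, chunk_idx - left_chunks)
        start = first_chunk * chunk_size
        end = min(num_frames, (chunk_idx + 1) * chunk_size)
        mask.append([start <= j < end for j in range(num_frames)])
    return mask


def rescore(
    hyps: Sequence[Tuple[str, float]],
    second_pass_score: Callable[[str], float],
    lam: float = 0.5,
) -> Tuple[str, float]:
    """Pick the best first-pass hypothesis after second-pass rescoring.

    `hyps` holds (text, first_pass_log_prob) pairs. The combined score
    (1 - lam) * first_pass + lam * second_pass is an assumed rule, not
    one taken from the paper.
    """
    if not hyps:
        raise ValueError("hyps must not be empty")
    scored = [
        (text, (1.0 - lam) * first + lam * second_pass_score(text))
        for text, first in hyps
    ]
    return max(scored, key=lambda pair: pair[1])
```

In a real system the mask would be passed to the attention layers of the first-pass encoder, and `second_pass_score` would be the log-probability that the attention-based decoder assigns to each complete hypothesis, computed for all hypotheses in one parallel forward pass.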

Key words

streaming speech recognition/block mechanism/end-to-end/Transformer/Conformer

Classification

Information Technology and Security Science

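Cite this article

GAO Lu, WANG Yahao, ZHANG Fei, REN Xiaoying, HAO Bin, HAN Yaxu. Research on Streaming Speech Recognition Technology Based on Two-Pass Approach [J]. 电子器件 (Chinese Journal of Electron Devices), 2025, 48(6): 1260-1267, 8.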

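Funding

National Natural Science Foundation of China (62161041)

Natural Science Foundation of Inner Mongolia (2022SHZR0375)

Inner Mongolia Autonomous Region Key Research and Development and Achievement Transformation Project (2025SYFHH0223)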

电子器件 (Chinese Journal of Electron Devices)

ISSN: 1005-9490
