Research on Streaming Speech Recognition Technology Based on Two-Pass Approach

GAO Lu, WANG Yahao, ZHANG Fei, REN Xiaoying, HAO Bin, HAN Yaxu

电子器件 (Chinese Journal of Electron Devices), 2025, Vol. 48, Issue 6: 1260-1267, 8. DOI: 10.3969/j.issn.1005-9490.2025.06.010

Research on Streaming Speech Recognition Technology Based on Two-Pass Approach

GAO Lu¹, WANG Yahao¹, ZHANG Fei¹, REN Xiaoying¹, HAO Bin¹, HAN Yaxu¹

Author information

  • 1. Institute of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, Inner Mongolia 014010, China

Abstract

Recently, end-to-end models based on the RNN-Transducer (RNN-T) have shown superior performance on streaming speech recognition tasks. Although RNN-T is inherently capable of streaming, its recognition quality still lags behind that of advanced non-streaming models. In addition, RNN-T tends to delay emitting its predictions, which incurs higher partial latency. To better balance character error rate (CER) and latency, a two-pass model combining RNN-T with an attention-based encoder-decoder is proposed. Specifically, in the first pass the RNN-T encoder is replaced with Transformer layers that leverage blockwise parallelization to capture global context across chunks and to reduce decoding cost for lower latency. The second pass adopts an improved Transformer rescorer that processes the entire streaming hypothesis in parallel, making more efficient use of computational resources. Experiments on Aishell-1 show that the proposed two-pass model reduces CER by approximately 40% compared with RNN-T at an acceptable latency. The proposed model effectively balances recognition accuracy and latency for streaming ASR.
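
For readers unfamiliar with the metric, the following is a minimal sketch of how a character error rate and a reduction figure of this kind are commonly computed. It is not taken from the paper: the function names `edit_distance`, `cer`, and `relative_reduction` are hypothetical, the example sentences and CER values are invented for illustration, and it assumes that the reported "approximately 40%" is a relative (not absolute) reduction.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level CER: total character errors over total reference characters."""
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return total_errors / total_chars


def relative_reduction(baseline: float, proposed: float) -> float:
    """Relative reduction of an error rate with respect to a baseline."""
    return (baseline - proposed) / baseline


if __name__ == "__main__":
    # Illustrative strings only; one substitution and one deletion.
    refs = ["今天天气很好"]
    hyps = ["今天天汽很"]
    print(f"CER = {cer(refs, hyps):.3f}")
    # Illustrative CER values only; not results from the paper.
    print(f"relative reduction = {relative_reduction(0.10, 0.06):.2f}")
```

Under the relative reading, a hypothetical baseline CER of 10% would fall to roughly 6%; under an absolute reading the same figure would mean something quite different, so the interpretation should be checked against the full text.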

Key words

streaming speech recognition; block mechanism; end-to-end; Transformer; Conformer

Classification

Information Technology and Security Science

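Cite this article

GAO Lu, WANG Yahao, ZHANG Fei, REN Xiaoying, HAO Bin, HAN Yaxu. Research on Streaming Speech Recognition Technology Based on Two-Pass Approach[J]. 电子器件 (Chinese Journal of Electron Devices), 2025, 48(6): 1260-1267, 8.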

Funding

National Natural Science Foundation of China (62161041)

Natural Science Foundation of Inner Mongolia (2022SHZR0375)

Inner Mongolia Autonomous Region Key R&D and Achievement Transformation Project (2025SYFHH0223)

电子器件 (Chinese Journal of Electron Devices), ISSN 1005-9490
