End-to-end method based on Conformer for speech recognition

胡从刚 申艺翔 孙永奇 赵思聪

Application Research of Computers (计算机应用研究), 2024, Vol. 41, Issue 7: 2018-2024, 7. DOI: 10.19734/j.issn.1001-3695.2023.11.0563

End-to-end method based on Conformer for speech recognition

胡从刚¹ 申艺翔¹ 孙永奇¹ 赵思聪²

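Author information

  • 1. Key Laboratory of Big Data & Artificial Intelligence in Transportation, Ministry of Education, Beijing 100044, China; School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
  • 2. Beijing Aerospace Chenxin Technology Co., Ltd. (北京航天晨信科技有限责任公司), Beijing 102308, China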

Abstract

Acoustic input networks based on the Conformer encoder extract insufficient information from FBank speech features and lose channel feature information. To address these problems, this paper proposes an end-to-end speech recognition method based on RepVGG-SE-Conformer. First, the model uses the multi-branch structure of RepVGG to strengthen speech information extraction, and applies structural re-parameterization to fuse the branches into a single branch, which reduces computational complexity and speeds up inference. Then, a channel attention mechanism based on the squeeze-and-excitation network compensates for the missing channel feature information and improves recognition accuracy. Finally, experiments on the public Aishell-1 dataset show that the proposed method reduces the character error rate by 10.67% compared with Conformer, which supports the advantage of the method. In addition, the proposed RepVGG-SE acoustic input network generalizes well in end-to-end settings and can effectively improve the overall performance of speech recognition models based on Transformer variants.
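The branch fusion mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a RepVGG-style block with a 3x3 branch, a 1x1 branch, and an identity branch, equal input and output channel counts, stride 1, and no bias or batch normalization (in RepVGG each branch also carries a BN layer that is folded into the convolution before fusion). All function names here are hypothetical. Because convolution is linear, the three branches collapse into one 3x3 kernel by adding the 1x1 weights, and a unit weight for the identity, to the kernel centre.

```python
import random


def conv2d(x, w):
    """'Same'-padded 2-D convolution (cross-correlation), stride 1, no bias.

    x: input as nested lists [C_in][H][W]
    w: kernels as nested lists [C_out][C_in][k][k], with k odd
    """
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    k = len(w[0][0])
    pad = k // 2
    out = []
    for w_o in w:
        plane = [[0.0] * wd for _ in range(h)]
        for i in range(h):
            for j in range(wd):
                acc = 0.0
                for c in range(c_in):
                    for di in range(k):
                        for dj in range(k):
                            ii, jj = i + di - pad, j + dj - pad
                            if 0 <= ii < h and 0 <= jj < wd:  # zero padding
                                acc += w_o[c][di][dj] * x[c][ii][jj]
                plane[i][j] = acc
        out.append(plane)
    return out


def multi_branch(x, w3, w1):
    """Training-time form: 3x3 branch + 1x1 branch + identity branch."""
    y3, y1 = conv2d(x, w3), conv2d(x, w1)
    return [[[y3[c][i][j] + y1[c][i][j] + x[c][i][j]
              for j in range(len(x[0][0]))]
             for i in range(len(x[0]))]
            for c in range(len(x))]


def fuse_branches(w3, w1):
    """Fold the 1x1 and identity branches into a copy of the 3x3 kernel."""
    fused = [[[row[:] for row in k_c] for k_c in k_o] for k_o in w3]
    for o in range(len(w3)):
        for c in range(len(w3[o])):
            fused[o][c][1][1] += w1[o][c][0][0]  # 1x1 kernel sits at the centre
            if o == c:
                fused[o][c][1][1] += 1.0         # identity is a centred unit kernel
    return fused


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two [C][H][W] tensors."""
    return max(abs(p - q)
               for pa, pb in zip(a, b)
               for ra, rb in zip(pa, pb)
               for p, q in zip(ra, rb))


def random_kernel(channels, k, rng):
    """Random square kernel bank with channels inputs and channels outputs."""
    return [[[[rng.uniform(-1, 1) for _ in range(k)] for _ in range(k)]
             for _ in range(channels)] for _ in range(channels)]


# Toy input: 2 channels on a 5x4 grid (think of a tiny time-frequency patch).
rng = random.Random(0)
channels, height, width = 2, 5, 4
x = [[[rng.uniform(-1, 1) for _ in range(width)] for _ in range(height)]
     for _ in range(channels)]
w3, w1 = random_kernel(channels, 3, rng), random_kernel(channels, 1, rng)

slow = multi_branch(x, w3, w1)           # three branches, summed
fast = conv2d(x, fuse_branches(w3, w1))  # one fused 3x3 convolution
print("max |multi-branch - fused| =", max_abs_diff(slow, fast))
```

The two outputs agree up to floating-point rounding, which is why a fused block can replace the multi-branch block at inference time with a single convolution per layer.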

Key words

speech recognition/Conformer/RepVGG/squeeze-and-excitation network

Classification

Information technology and security science

Cite this article

胡从刚, 申艺翔, 孙永奇, 赵思聪. End-to-end method based on Conformer for speech recognition [J]. Application Research of Computers, 2024, 41(7): 2018-2024, 7.

Funding

Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0113002)

Application Research of Computers (计算机应用研究)

OA / PKU Core Journals (北大核心) / CSTPCD

ISSN 1001-3695
