Computer Engineering and Applications, 2025, Vol. 61, Issue (6): 53-63, 11. DOI: 10.3778/j.issn.1002-8331.2405-0145
Review of Research on Fusion Technology of Speech Recognition and Large Language Models
Abstract
Various large language models (LLMs) have emerged in recent years, driving development and innovation in many fields of artificial intelligence. Summarizing the positive effects of LLMs on speech recognition technology and exploring their development prospects can provide new ideas for advancing the field. In current mainstream end-to-end speech recognition models, an additional language model is often used to rescore the recognition results, or is combined with the WFST algorithm to assist decoding, thereby improving recognition accuracy. Recent studies have found that integrating LLMs into the end-to-end training of speech recognition models can further improve accuracy. This review is organized around three methods for fusing speech recognition with language models, namely shallow fusion, deep fusion, and cold fusion, and analyzes their principles, advantages, and disadvantages. Recent experiments have confirmed that combining LLMs with acoustic models can effectively improve recognition accuracy. A systematic review of research progress on LLMs in automatic speech recognition (ASR) also shows that these models play an important role in the field. The integration of speech recognition with LLMs has gradually matured and merits further exploration and in-depth research.
Keywords
speech recognition/large language model/deep learning
Classification
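Electronic Information Engineering
Cite this article
王敬凯, 秦董洪, 白凤波, 李路路, 孔令儒, 徐晨. Review of Research on Fusion Technology of Speech Recognition and Large Language Models[J]. Computer Engineering and Applications, 2025, 61(6): 53-63, 11.
Funding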
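Guangxi Zhuang Autonomous Region Central Government Guided Local Science and Technology Development Fund Project (桂科ZY24212045)
Guangxi Science and Technology Base and Talent Special Project (桂科AD23026054)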