Big Data Mining and Analytics, 2025, Vol. 8, Issue 4: 779-793 (15 pages). DOI: 10.26599/BDMA.2024.9020074
ResDecode: Accelerating Large Language Models Inference via Residual Decoding Heads
Abstract
Keywords
speculative decoding; efficient inference; Large Language Models (LLMs)
Citation
Ziqian Zeng, Jiahong Yu, Qianshi Pang, Zihao Wang, Huiping Zhuang, Fan Yu, Hongen Shao, Xiaofeng Zou. ResDecode: Accelerating Large Language Models Inference via Residual Decoding Heads[J]. Big Data Mining and Analytics, 2025, 8(4): 779-793.

Funding
This work was supported by the National Key R&D Program of China (No. 2021ZD0110400), the National Natural Science Foundation of China (Nos. 62406114 and 62306117), the Guangzhou Basic and Applied Basic Research Foundation (Nos. 2023A04J1687 and 2024A04J3681), the Fundamental Research Funds for the Central Universities (Nos. 2024ZYGXZR074 and 2023ZYGXZR023), the Guangdong Basic and Applied Basic Research Foundation (No. 2024A1515010220), the Postdoctoral Fellowship Program of CPSF (No. GZC20230841), the South China University of Technology-TCL Technology Innovation Fund, and the CAAI-MindSpore Open Fund developed on the OpenI Community.