Acta Electronica Sinica (电子学报), 2025, Vol. 53, Issue 4: 1063-1102, 40. DOI: 10.12263/DZXB.20240691
端智能推理加速技术综述
On-Device Intelligence Acceleration Technologies: A Survey
Abstract
Intelligent edge computing is an essential pathway toward the era of pervasive intelligence, and it has propelled the rapid advancement of on-device intelligence. By deploying and running deep learning models directly on edge devices, on-device intelligence offers inherent advantages in real-time processing, security, and personalization, and has been widely applied in scenarios such as autonomous driving, satellite reconnaissance, and virtual reality/augmented reality (VR/AR). However, as the parameter counts of deep learning models continue to grow, the limited hardware resources at the edge struggle to sustain the rising computational cost. To improve the efficiency of model inference on edge devices, researchers have pursued systematic optimizations at multiple levels, including model algorithms, compilation software, and device hardware, driving the advancement and evolution of on-device intelligence. This paper summarizes existing work on optimizing deep learning model inference at the edge, covering model compression, collaborative design of model-software-hardware, heterogeneous model parallel deployment strategies, and optimizations for large models. Finally, it outlines the challenges facing current on-device inference acceleration technologies and offers insights into future development trends.
Keywords
on-device intelligence / model compression / inference acceleration / deep learning / collaborative design of model-software-hardware
Classification
Information Technology and Security Science
Cite this article
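Zhang Jinrui, Long Tingting, Zhang Deyu, Xu Yuan, Ren Ju, Zhang Yaoxue. On-Device Intelligence Acceleration Technologies: A Survey [J]. Acta Electronica Sinica, 2025, 53(4): 1063-1102, 40.
Funding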
National Key Research and Development Program of China (No. 2022YFF0604502)
National Natural Science Foundation of China (No. 62122095, No. 62341201)