Software Guide (软件导刊), 2025, Vol. 24, Issue 5: 1-7, 7. DOI: 10.11907/rjdk.241918
Training Performance Optimization in Deep Learning: Principles, Techniques and Tools
Abstract
As the scale of models and data in deep learning training tasks grows rapidly, the cost of model training continues to rise, making efficient training one of the primary challenges in deploying deep learning solutions. This paper analyzes the general principles of performance optimization in deep learning training. Starting from the typical training workflow, it systematically analyzes and summarizes common performance optimization techniques for heterogeneous computing models. These techniques can be categorized by the stage at which they apply: data preparation, forward/backward propagation, gradient synchronization, and parameter update. From a computer architecture perspective, they can further be classified into optimizations of computation, communication, memory, and I/O. The paper also introduces commonly used tools for performance analysis and visualization in deep learning, providing a reference for practitioners engaged in optimizing deep learning training performance.
Key words
training performance / deep learning / forward computation / backward propagation / parameter update / workload balance
Classification
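Computer and Automation
Cite this article
介飞, 张海俊, 汪锦想. Training Performance Optimization in Deep Learning: Principles, Techniques and Tools [J]. Software Guide (软件导刊), 2025, 24(5): 1-7, 7.
Funding
Anhui Province Postdoctoral Researcher Scientific Research Activity Funding Project (2022B588)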