Software Guide (软件导刊), 2024, Vol. 23, Issue 6: 44-52, 9. DOI: 10.11907/rjdk.231538
Mixed-Precision Inference with Subgraph Fusion as the Minimum Unit
Abstract
In recent years, convolutional neural networks, the most important technology in deep learning, have achieved strong results in fields such as image classification, object detection, and speech recognition. During this period, deep neural networks composed of multiple convolutional layers have emerged, showing significant accuracy improvements across a range of tasks. However, the weights of neural networks are often restricted to a single precision type, which leads to a large memory footprint relative to what specific hardware platforms provide, and a single precision type such as FP16 or INT8 can no longer meet the practical needs of some model inference workloads. To this end, a mixed-precision inference algorithm is proposed that uses subgraphs as the minimum unit and enriches the available bit-width configurations by judging the fusion relationship between adjacent nodes. First, an FP16 half-precision bit configuration is added to the search space of the original single-type (uniform-precision) quantization design, which enlarges the final search space and provides more opportunities to find the optimal solution. Second, following the idea of subgraph fusion, the precision of each fused subgraph is configured through integer linear programming. The computational graph is partitioned under three constraints (model size, inference latency, and bit-operation count), which reduces the accumulated perturbation error. Finally, experiments on the ResNet series of networks show that the proposed model loses no more than 1% accuracy compared with HAWQ-V3 while also improving inference speed over other mixed-precision quantization methods: by 18.15% and 19.21%, respectively, on ResNet18, and by 13.15% and 13.70%, respectively, on ResNet50.
Key words
subgraph fusion / mixed-precision inference / constrained optimization problem solving / GPU acceleration
Classification
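Information Technology and Security Science
Cite this article
Cui Liqun (崔丽群), Hu Lei (胡磊). Mixed-precision inference with subgraph fusion as the minimum unit [J]. Software Guide (软件导刊), 2024, 23(6): 44-52, 9.
Funding
Basic Scientific Research Project of Higher Education Institutions of Liaoning Province (LJKMZ20220699)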
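Illustrative note
The abstract describes assigning a precision to each fused subgraph by integer linear programming under model-size, latency, and bit-operation constraints. The sketch below shows the shape of that selection problem only. It is not the authors' implementation: the subgraph names, the candidate table `CANDIDATES`, all numeric values, and the function `choose_precisions` are hypothetical, and the small problem is solved by exhaustive enumeration rather than by an ILP solver.

```python
from itertools import product

# Hypothetical data, not taken from the paper. For every fused subgraph and
# every candidate precision: (perturbation, size_kb, latency_us, mbops).
CANDIDATES = {
    "subgraph_a": {
        "int4": (0.90, 512, 200, 1000),
        "int8": (0.30, 1024, 300, 2000),
        "fp16": (0.05, 2048, 500, 4000),
    },
    "subgraph_b": {
        "int4": (0.40, 1024, 300, 2000),
        "int8": (0.10, 2048, 500, 4000),
        "fp16": (0.02, 4096, 900, 8000),
    },
    "subgraph_c": {
        "int4": (1.50, 256, 100, 500),
        "int8": (0.60, 512, 200, 1000),
        "fp16": (0.10, 1024, 300, 2000),
    },
}


def choose_precisions(candidates, max_size_kb, max_latency_us, max_mbops):
    """Pick one precision per subgraph, minimizing total perturbation.

    Returns (total_perturbation, {subgraph: precision}), or None when no
    assignment satisfies all three budgets. Every combination is enumerated,
    which is only practical for a handful of subgraphs.
    """
    names = list(candidates)
    best = None
    for combo in product(*(candidates[name].items() for name in names)):
        perturbation = sum(cost[0] for _, cost in combo)
        size_kb = sum(cost[1] for _, cost in combo)
        latency_us = sum(cost[2] for _, cost in combo)
        mbops = sum(cost[3] for _, cost in combo)
        feasible = (
            size_kb <= max_size_kb
            and latency_us <= max_latency_us
            and mbops <= max_mbops
        )
        if feasible and (best is None or perturbation < best[0]):
            assignment = {name: prec for name, (prec, _) in zip(names, combo)}
            best = (perturbation, assignment)
    return best


result = choose_precisions(
    CANDIDATES, max_size_kb=4096, max_latency_us=1200, max_mbops=8000
)
print(result)
```

With these made-up numbers the budgets rule out FP16 everywhere, so the selected assignment mixes INT8 and FP16 across subgraphs. For realistic graph sizes an ILP solver would replace the enumeration loop.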