Mixed-Precision Inference with Subgraph Fusion as the Minimum Unit (以子图融合为最小单位的混合精度推理)

Cui Liqun, Hu Lei

Software Guide (软件导刊), 2024, Vol. 23, Issue 6: 44-52, 9. DOI: 10.11907/rjdk.231538

Mixed-Precision Inference with Subgraph Fusion as the Minimum Unit

Cui Liqun (崔丽群)¹, Hu Lei (胡磊)¹

Author information

  • 1. School of Software, Liaoning Technical University, Huludao, Liaoning 125105

Abstract

In recent years, convolutional neural networks, one of the most important technologies in deep learning, have achieved strong results in fields such as image classification, object detection, and speech recognition. During this period, deep neural networks built from multi-layer convolutional networks emerged and delivered significant accuracy improvements across a range of tasks. However, the weights of a neural network are often restricted to a single precision type, which leads to a memory footprint that is large relative to specific hardware platforms, and uniform-precision configurations such as FP16 or INT8 no longer meet the practical needs of some model inference workloads. To address this, a mixed-precision inference algorithm is proposed that uses the subgraph as the minimum unit and enriches the available bit widths by examining the fusion relationship between adjacent nodes. First, an FP16 half-precision bit configuration is added to the search space of the original single-precision quantization design, which enlarges the final search space and provides more opportunities to find the optimal solution. Second, following the idea of subgraph fusion, the precision of each fused subgraph is configured through integer linear programming. The computational graph is partitioned under three constraints (model size, inference latency, and bit-operation count), which reduces the accumulated perturbation error. Finally, experiments on the ResNet series show that the proposed model loses no more than 1% accuracy compared with HAWQ V3 while also improving inference speed over other mixed-precision quantization methods: by 18.15% and 19.21%, respectively, on ResNet18, and by 13.15% and 13.70%, respectively, on ResNet50.
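
The precision-assignment step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract names integer linear programming, whereas this sketch swaps in an exhaustive search over a tiny search space so that it runs with the standard library alone. The candidate bit widths, the `SUBGRAPHS` table, its sensitivity and latency numbers, and the function name `assign_precisions` are all hypothetical, and only two of the three constraints (model size and latency) are modeled.

```python
from itertools import product

# Assumed candidate bit widths per fused subgraph (16 stands in for FP16).
PRECISIONS = (4, 8, 16)

# Hypothetical fused subgraphs: parameter count, plus made-up per-precision
# sensitivity (accuracy perturbation) and latency figures.
SUBGRAPHS = [
    {"params": 1000,
     "sensitivity": {4: 5.0, 8: 1.0, 16: 0.1},
     "latency": {4: 1.0, 8: 2.0, 16: 4.0}},
    {"params": 4000,
     "sensitivity": {4: 0.5, 8: 0.2, 16: 0.05},
     "latency": {4: 2.0, 8: 3.0, 16: 6.0}},
    {"params": 2000,
     "sensitivity": {4: 3.0, 8: 0.5, 16: 0.1},
     "latency": {4: 1.5, 8: 2.5, 16: 5.0}},
]


def assign_precisions(subgraphs, size_limit, latency_limit):
    """Choose one bit width per fused subgraph.

    Minimizes total sensitivity subject to a model-size budget (bytes)
    and a latency budget. Exhaustive search is used here in place of an
    ILP solver; it is only practical for very small graphs.
    """
    best_choice, best_cost = None, None
    for choice in product(PRECISIONS, repeat=len(subgraphs)):
        pairs = list(zip(subgraphs, choice))
        size = sum(g["params"] * bits / 8 for g, bits in pairs)
        latency = sum(g["latency"][bits] for g, bits in pairs)
        if size > size_limit or latency > latency_limit:
            continue  # violates a budget, so this assignment is infeasible
        cost = sum(g["sensitivity"][bits] for g, bits in pairs)
        if best_cost is None or cost < best_cost:
            best_choice, best_cost = choice, cost
    return best_choice, best_cost


choice, cost = assign_precisions(SUBGRAPHS, size_limit=7000, latency_limit=9.0)
print(choice, cost)
```

With these made-up numbers, the largest subgraph cannot fit the size budget at 16 bits, so the search pushes it to a lower precision and spends the remaining budget on the more sensitive subgraphs. A real ILP formulation would express the same objective and budgets with one binary variable per (subgraph, precision) pair.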

Key words

subgraph fusion/mixed precision inference/constrained optimization problem-solving/GPU acceleration

Classification

Information Technology and Security Science

Cite this article

Cui Liqun, Hu Lei. Mixed-Precision Inference with Subgraph Fusion as the Minimum Unit [J]. Software Guide (软件导刊), 2024, 23(6): 44-52, 9.

Funding

Basic Scientific Research Project of Higher Education Institutions of Liaoning Province (LJKMZ20220699)

Software Guide (软件导刊)

ISSN 1672-7800
