
深度神经网络低比特量化方法综述


现代信息科技, 2025, Vol. 9, Issue (10): 23-33, 38, 12. DOI: 10.19850/j.cnki.2096-4706.2025.10.006


Review on Low-bit Quantization Methods for Deep Neural Networks

田程¹, 李正杰¹, 陈功富¹, 赵明¹, 冯树萱¹

Author Information

  • 1. 成都华微电子科技股份有限公司, Chengdu 610041, Sichuan, China


Abstract

In recent years, Deep Neural Networks (DNNs) have achieved breakthroughs in fields such as Computer Vision and Natural Language Processing. However, DNNs typically contain a large number of parameters and consume substantial computational resources, which impedes their deployment on resource-constrained devices. To reduce memory usage and computational burden, a steady stream of research on model compression and acceleration has emerged, with low-bit model quantization being one of the primary methods. Low-bit quantization aims to replace the original high-precision FP32 floating-point operations with more efficient low-precision (≤8 bit) fixed-point or bitwise operations, thereby significantly reducing the computational cost of a model and enabling deployment of the quantized network on edge devices. Focusing on image classification and object detection tasks in Computer Vision, this paper presents a comprehensive survey of the current state of low-bit model quantization techniques and analyzes the advantages and disadvantages of these methods in depth. In addition, the fundamentals of low-bit quantization for Deep Neural Networks are outlined, and the performance of representative methods on different datasets is compared. Finally, future development trends in low-bit model quantization are discussed.
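The abstract's core idea of mapping FP32 values onto a low-bit integer grid can be illustrated with a minimal sketch of uniform affine (scale/zero-point) quantization. This is a generic illustration of the technique surveyed, not a method from the paper itself; the function names and the 8-bit setting are the author of this sketch's own assumptions.

```python
import numpy as np

def quantize_affine(x, num_bits=8):
    """Affine (asymmetric) uniform quantization of a float32 tensor.

    Maps real values in [min(x), max(x)] onto the integer grid
    [0, 2**num_bits - 1] via a scale and a zero-point.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    # Guard against a constant tensor (zero dynamic range).
    scale = max(x_max - x_min, 1e-8) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    zero_point = int(np.clip(zero_point, qmin, qmax))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximate float32 tensor from the integer codes."""
    return scale * (q.astype(np.float32) - zero_point)

x = np.random.randn(4, 4).astype(np.float32)
q, s, zp = quantize_affine(x)
x_hat = dequantize(q, s, zp)
# Rounding error per element is bounded by roughly one quantization step.
err = float(np.abs(x - x_hat).max())
```

At 8 bits this per-tensor scheme usually preserves accuracy well; the surveyed low-bit methods (≤8 bit, down to binary) need additional machinery such as per-channel scales or quantization-aware training to control the larger rounding error.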


Key words

Deep Neural Networks/model compression and acceleration/low-bit quantization

Category

Information Technology and Security Science

Cite this article

田程, 李正杰, 陈功富, 赵明, 冯树萱. 深度神经网络低比特量化方法综述[J]. 现代信息科技, 2025, 9(10): 23-33, 38, 12.

现代信息科技 (ISSN 2096-4706)
