Computer Engineering, 2026, Vol. 52, Issue (1): 33-60, 28. DOI: 10.19678/j.issn.1000-3428.0252721
Survey of Post-Training Quantization Methods (Invited)
Abstract
Post-Training Quantization (PTQ) is an efficient model compression method that converts the parameters of high-precision floating-point models into low-bit integer representations without retraining, using only a small amount of unlabeled calibration data. It significantly reduces storage and computational overhead while largely preserving the original model's inference accuracy, and it has therefore been widely adopted in both academia and industry. This paper systematically summarizes research progress on PTQ along four dimensions: quantization steps, method classification, tool ecosystem, and application advancements. First, a clear framework for the quantization process is constructed, covering dynamic range statistics, quantization parameter calculation, weight and activation quantization, error optimization, and model generation. Second, a complete classification system for quantization methods is proposed, organized by quantization granularity, bit width, calibration method, and structure-guided quantization. Third, the tool ecosystem supporting the large-scale application of PTQ is analyzed, and its value for hardware adaptation and engineering deployment is discussed. Finally, the paper summarizes progress in the integration and application of PTQ methods and highlights practical challenges, particularly cross-modal consistency, semantic collapse at extremely low bit widths, and hardware adaptation. These challenges reveal the limitations of current techniques and point to important directions for future research. This review provides a reference framework for PTQ methods in academia and industry, thereby facilitating the widespread application of artificial intelligence in resource-constrained scenarios.
Keywords
Post-Training Quantization (PTQ) / PTQ steps / classification of PTQ methods / tool ecosystem / application progress
Classification
Information Technology and Security Science
Cite this article
张俊娜, 王泓尊, 丁春涛. Survey of Post-Training Quantization Methods (Invited) [J]. Computer Engineering, 2026, 52(1): 33-60, 28.
Funding
National Natural Science Foundation of China (62472147, 62202039)
Key Research and Development Special Project of Henan Province (251111210500)