集成电路与嵌入式系统 2025, Vol. 25, Issue (8): 41-52, 12. DOI: 10.20193/j.ices2097-4191.2025.0042
高效Winograd卷积硬件设计及其量化方案
Design of efficient Winograd convolution hardware and its quantization scheme
严峥 1, 张宸硕 1, 白一川 1, 杜源 1, 杜力 1
Author information
- 1. School of Electronic Science and Engineering, Nanjing University, Nanjing 210023
Abstract
Convolution is the most common operation in CNN networks, and the power consumption of the multiply-accumulate operations in convolution is high, which limits the performance of many CNN hardware accelerators. Reducing the number of multiplications in convolution is one of the effective ways to improve the performance of CNN accelerators. As a fast convolution algorithm, the Winograd algorithm can reduce the number of multiplications in convolution by up to 75%. However, the weights of a model transformed for Winograd convolution have a significantly different distribution, which requires a longer quantization bit width to maintain comparable accuracy and neutralizes the hardware savings brought by the reduction in multiplications. In this paper, we analyze this problem quantitatively and propose a new quantization scheme for Winograd convolution. The quantized Winograd computation hardware module is implemented with an accuracy loss of less than 1%. To further reduce the hardware cost, we apply an approximate multiplier (AM) to Winograd convolution. Compared with the conventional convolution computation block, the Winograd block saves 27.3% of the area, and applying the approximate multiplier in the Winograd block saves 39.6% of the area without significant performance loss.

Keywords

convolution neural networks / Winograd algorithm / model quantization / approximate multiplier / hardware accelerator

Classification

Information technology and security science
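The multiplication savings quoted in the abstract come from Winograd's minimal filtering algorithm. As a plain illustration (not the paper's code), the 1-D case F(2,3) below computes two outputs of a 3-tap filter with 4 multiplications instead of the direct method's 6; tiling the 2-D F(2×2, 3×3) version of the same idea is what yields the "up to 75%" reduction for 3×3 convolutions.

```python
def winograd_f23(d, g):
    """1-D Winograd F(2,3): two outputs of a 3-tap filter g over a
    4-element input tile d, using 4 multiplications (m0..m3) instead
    of the 6 required by direct convolution."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (precomputed once per filter in hardware)
    G0 = g0
    G1 = (g0 + g1 + g2) / 2
    G2 = (g0 - g1 + g2) / 2
    G3 = g2
    # Element-wise products: the only multiplications in the data path
    m0 = (d0 - d2) * G0
    m1 = (d1 + d2) * G1
    m2 = (d2 - d1) * G2
    m3 = (d1 - d3) * G3
    # Inverse transform: combine products with additions only
    return [m0 + m1 + m2, m1 - m2 - m3]


def direct_conv3(d, g):
    """Reference: direct 3-tap convolution (6 multiplications)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]
```

Note that the transformed filter values (G0..G3) follow a different distribution than the raw weights g, which is exactly the quantization difficulty the paper addresses.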
严峥, 张宸硕, 白一川, 杜源, 杜力. 高效Winograd卷积硬件设计及其量化方案[J]. 集成电路与嵌入式系统, 2025, 25(8): 41-52, 12.