
Design of efficient Winograd convolution hardware and its quantization scheme

Yan Zheng (严峥), Zhang Chenshuo (张宸硕), Bai Yichuan (白一川), Du Yuan (杜源), Du Li (杜力)

Integrated Circuits and Embedded Systems, 2025, Vol. 25, Issue 8: 41-52 (12 pages). DOI: 10.20193/j.ices2097-4191.2025.0042


Author information

  • 1. School of Electronic Science and Engineering, Nanjing University, Nanjing 210023


Abstract

Convolution is the most common operation in convolutional neural networks (CNNs), and the high power consumption of its multiply-accumulate operations limits the performance of many CNN hardware accelerators. Reducing the number of multiplications in convolution is therefore an effective way to improve accelerator performance. As a fast convolution algorithm, the Winograd algorithm can eliminate up to 75% of the multiplications in convolution. However, the weights of a model transformed for Winograd convolution follow a significantly different distribution, so a wider quantization bit width is needed to maintain similar accuracy, which offsets the hardware savings gained from the reduced multiplication count. In this paper, we analyze this problem quantitatively and propose a new quantization scheme for Winograd convolution. The quantized Winograd computation hardware module is implemented with an accuracy loss of less than 1%. To further reduce hardware cost, we apply an approximate multiplier (AM) to Winograd convolution. Compared with the conventional convolution computation block, the Winograd block saves 27.3% of the area, and the Winograd block with the approximate multiplier saves 39.6% of the area without significant performance loss.
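
To make the multiplication saving concrete, the following is a minimal sketch of the standard 1-D Winograd F(2,3) transform, not the hardware design described in the paper. The function names (`direct_fir`, `winograd_f23`, `mult_saving`) are hypothetical and chosen only for illustration. The 75% figure is consistent with a 2-D F(4×4, 3×3) tile, which is an assumption here since the abstract does not state the tile size used.

```python
from fractions import Fraction


def direct_fir(d, g):
    """Direct 3-tap sliding dot product: 2 outputs, 6 multiplications."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def winograd_f23(d, g):
    """Winograd F(2,3): the same 2 outputs using 4 multiplications."""
    # Filter transform: done once per filter, so it is not counted
    # among the per-tile multiplications.
    u = [
        Fraction(g[0]),
        (Fraction(g[0]) + g[1] + g[2]) / 2,
        (Fraction(g[0]) - g[1] + g[2]) / 2,
        Fraction(g[2]),
    ]
    # Input transform: additions and subtractions only.
    v = [d[0] - d[2], d[1] + d[2], d[2] - d[1], d[1] - d[3]]
    # Element-wise products: the only 4 multiplications per tile.
    m = [ui * vi for ui, vi in zip(u, v)]
    # Output transform: additions and subtractions only.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


def mult_saving(m, r):
    """Fraction of multiplications saved by 2-D F(m x m, r x r)."""
    direct = m * m * r * r          # direct convolution per output tile
    winograd = (m + r - 1) ** 2     # one multiply per transformed element
    return 1 - Fraction(winograd, direct)


if __name__ == "__main__":
    d, g = [1, 2, 3, 4], [1, 0, -1]
    print(direct_fir(d, g), winograd_f23(d, g))
    print(mult_saving(2, 3), mult_saving(4, 3))
```

Note that the transformed filter values in `u` are sums and differences of the original weights, so their value range differs from that of the original weights. This is the kind of distribution shift the abstract identifies as the cause of the wider quantization bit width.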


Keywords

convolutional neural networks/Winograd algorithm/model quantization/approximate multiplier/hardware accelerator

Classification

Information technology and security science

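Cite this article

Yan Zheng, Zhang Chenshuo, Bai Yichuan, Du Yuan, Du Li. Design of efficient Winograd convolution hardware and its quantization scheme[J]. Integrated Circuits and Embedded Systems, 2025, 25(8): 41-52.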

Integrated Circuits and Embedded Systems

1009-623X
