Application Research of Computers, 2024, Vol. 41, Issue 4: 1070-1076 (7 pages). DOI: 10.19734/j.issn.1001-3695.2023.07.0378
基于可重构阵列的CNN数据量化方法
CNN data quantization method based on reconfigurable array
Abstract
Convolution operations significantly increase network size, which makes CNN models difficult to deploy on embedded hardware platforms, and data of different granularities is poorly matched to the underlying hardware structure, which leads to low computing efficiency. Building on a reconfigurable array whose computing units support multiple bit widths, and using software-hardware cooperation and reconfigurable computing methods, this paper defines the quantization threshold using KL divergence and a random integer method, proposes a strategy for finding the best basis point, and designs an instruction set and a parallel mapping scheme that support multiple bit widths, realizing data quantization at three distinct bit widths. The results show that the quantization scheme with 8-bit weights and feature maps compresses the model parameter quantity to about 50% with a 2% accuracy loss. The acceleration ratios obtained when quantizing the test images to the three bit widths reach 1.012, 1.273, and 1.556, respectively; this shortens execution time by up to 35.7% and reduces the number of memory accesses by 56.2%, while introducing less than 1% relative error. This indicates that the method achieves efficient neural network computation under three quantization bit widths, thereby delivering both hardware acceleration and model compression.
Keywords
convolutional neural network (CNN) / data quantization / reconfigurable structure / parallel mapping / acceleration ratio
Classification
Information Technology and Security Science
Cite this article
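Zhu Jiayang, Jiang Lin, Li Yuancheng, Song Jia, Liu Shuai. CNN data quantization method based on reconfigurable array [J]. Application Research of Computers, 2024, 41(4): 1070-1076.
Funding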
Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0119005)
Key Program of the National Natural Science Foundation of China (61834005)