Study of autonomous and controllable neural network deployment framework based on domestic FPGA
Existing deep learning edge applications rely on non-domestic FPGA architectures and encrypted IP, which raises potential security issues, and they are difficult to deploy quickly on domestic FPGA platforms that are still under development and lack sufficient IP. To address this, a neural network hardware deployment framework for domestic FPGAs is designed. It is equipped with a hardware IP library that is independent of FPGA vendors, so that neural network deployment on domestic FPGAs is secure, autonomous, and controllable. In addition, a fast pipelined convolution circuit and an im2col conversion circuit are designed to speed up computation on the systolic array. Validation experiments show that the framework is practical and enables rapid neural network deployment on domestic FPGAs. A 16-bit Lenet5 network generated with the framework reaches an inference time of 0.024 ms, a 6.67x speedup, and a throughput of 147.8 GOPS, a 5.13x improvement. The framework also specifically optimizes the data-dimension reduction involved in converting convolution into matrix computation: the convolution data pre-conversion (im2col) in the RTL design runs up to 124.9x faster than on an Intel Xeon E-2276M CPU.
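The convolution-to-matrix conversion (im2col) mentioned in the abstract can be illustrated with a small software sketch. This is not the paper's RTL circuit. It is a minimal pure-Python illustration that assumes a single-channel input, stride 1, and no padding, and the function names `im2col`, `conv_via_matmul`, and `conv_direct` are hypothetical. It shows how unfolding each sliding window into a matrix row turns convolution into a matrix product, the form that a systolic array consumes.

```python
# Illustrative sketch only: single channel, stride 1, no padding (assumptions).
# Function names are hypothetical and do not come from the paper.

def im2col(image, k):
    """Unfold every k x k window of a 2-D list into one row of a matrix."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            rows.append([image[i + di][j + dj]
                         for di in range(k) for dj in range(k)])
    return rows


def conv_via_matmul(image, kernel):
    """CNN-style convolution (cross-correlation) as a matrix-vector product."""
    k = len(kernel)
    flat_kernel = [v for row in kernel for v in row]
    out_w = len(image[0]) - k + 1
    # Each im2col row dotted with the flattened kernel gives one output pixel.
    flat_out = [sum(a * b for a, b in zip(row, flat_kernel))
                for row in im2col(image, k)]
    return [flat_out[r:r + out_w] for r in range(0, len(flat_out), out_w)]


def conv_direct(image, kernel):
    """Reference sliding-window implementation used to check the result."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(k) for dj in range(k))
             for j in range(w - k + 1)]
            for i in range(h - k + 1)]


image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2],
         [1, 0, 1, 3]]
kernel = [[1, 0],
          [0, -1]]

# Both routes should agree on the 3 x 3 output.
assert conv_via_matmul(image, kernel) == conv_direct(image, kernel)
```

The unfolded matrix repeats overlapping pixels, so im2col trades extra memory traffic for a regular matrix layout. That cost is one plausible reason the conversion step is worth accelerating in hardware.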
Liu Jiyuan (刘济源); Wang Baoping (王保平); Tang Yongming (汤勇明); Li He (李鹤)
School of Electronic Science and Engineering, Southeast University, Nanjing 210096, China
Electronic information engineering
FPGA; neural network; hardware deployment framework; Lenet5; DSP
Integrated Circuits and Embedded Systems (《集成电路与嵌入式系统》), 2024 (009)
25-35 / 11
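Supported by the National Natural Science Foundation of China, Young Scientists Fund (62304037); the Jiangsu Province Basic Research Program (Natural Science Foundation), Youth Fund Project (BK20230828); the 8th Young Elite Scientists Sponsorship Program of the China Association for Science and Technology (2022QNRC001); and Southeast University start-up research funding for new faculty (RF1028623173).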