Abstract
Graph convolutional neural networks (GCNs) have achieved breakthrough success on tasks involving graph-structured data. However, training a GCN requires a large amount of memory and many random memory accesses, which limits further deployment and application of the algorithm. Existing deployment and acceleration solutions for GCNs mostly rely on the Vitis HLS tool, which is based on development in C/C++. These solutions almost entirely neglect hardware description languages (HDLs), leading to incomplete software-hardware acceleration. To address these issues, a field-programmable gate array (FPGA) deployment and acceleration architecture tailored for GCNs is proposed. The architecture is composed of computing modules and storage modules, both of which can be implemented in a hardware description language. In the computing module, the key GCN algorithm is implemented in an HDL and mapped to the FPGA for hardware acceleration. In the storage module, the read-only memory (ROM) IP core is primarily instantiated and a two-dimensional register file is defined to store the input node features, the normalized adjacency matrices, the quantized parameters of each layer, and intermediate variables, which enhances the parallelism of the GCN algorithm. The model is trained on the PyCharm platform and its parameters are extracted and quantized; the GCN design is then implemented and tested in simulation on the Vivado platform, and its computational performance is compared with that of a CPU and a GPU. The experimental results show that the proposed GCN acceleration architecture improves the inference speed of the model.
Key words
GCN / FPGA accelerator / hardware description language / computing module / storage module / parameter quantization
Classification
Electronic information engineering