A GPU-based high-performance optimization method of sparse convolutional neural networks

方程 邢座程 陈顼颢 张洋

Computer Engineering & Science, 2018, Vol. 40, Issue 12: 2103-2111, 9. DOI: 10.3969/j.issn.1007-130X.2018.12.002

方程 (1), 邢座程 (1), 陈顼颢 (1), 张洋 (1)

Author information

  • 1. College of Computer, National University of Defense Technology, Changsha 410073, Hunan, China

Abstract

As an important branch of neural networks, the convolutional neural network (CNN) is currently better suited to learning and representing image features than other neural network methods. As CNNs continue to develop, they face new challenges. Their parameter counts keep growing, which makes the computational demand enormous. There are many ways to compress a CNN, but the compressed network usually introduces a number of sparse data structures, and these can hurt CNN performance on a GPU. To solve this problem, we adopt the direct sparse convolution algorithm proposed in 2017 to accelerate the GPU's processing of sparse data. Based on the characteristics of this algorithm, we transform the convolution operation into an inner product of a sparse vector and a dense vector on the GPU platform. Our optimization makes full use of the sparse data and the network structure to allocate threads for task scheduling, and uses data locality to manage memory replacement. This enables the GPU to process the convolution layers of a sparse CNN efficiently. Compared with cuBLAS, our method achieves speedups of 1.07×~1.23×, 1.17×~3.51×, and 1.32×~5.00× on AlexNet, GoogleNet, and ResNet, respectively. Compared with cuSPARSE, it achieves speedups of 1.31×~1.42×, 1.09×~2.00×, and 1.07×~3.22× on AlexNet, GoogleNet, and ResNet, respectively.
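The abstract does not give implementation details, so the following is only a minimal CPU sketch of the general idea behind direct sparse convolution: each output value is computed as an inner product between a filter's nonzero weights (a sparse vector) and the flattened input (a dense vector). It assumes stride 1, no padding, and a CSR-style layout in which each column index is precomputed as an offset into the flattened input. The function names `to_csr`, `direct_sparse_conv`, and `dense_conv` are hypothetical and chosen for illustration; this is not the authors' GPU kernel.

```python
def to_csr(weights, H, W):
    """Pack filters [M][C][R][S] into CSR form.

    Assumption: each stored column index is the offset of the weight's
    input element inside the flattened C*H*W input at output position (0, 0).
    """
    values, offsets, row_ptr = [], [], [0]
    for filt in weights:
        for c, channel in enumerate(filt):
            for r, row in enumerate(channel):
                for s, w in enumerate(row):
                    if w != 0:  # keep only nonzero weights
                        values.append(w)
                        offsets.append((c * H + r) * W + s)
        row_ptr.append(len(values))
    return values, offsets, row_ptr


def direct_sparse_conv(x_flat, csr, H, W, R, S):
    """Stride-1, unpadded convolution as sparse-dense inner products."""
    values, offsets, row_ptr = csr
    out_h, out_w = H - R + 1, W - S + 1
    num_filters = len(row_ptr) - 1
    out = [[[0.0] * out_w for _ in range(out_h)] for _ in range(num_filters)]
    for m in range(num_filters):
        for y in range(out_h):
            for x in range(out_w):
                base = y * W + x  # shift of the dense vector for this output
                acc = 0.0
                for j in range(row_ptr[m], row_ptr[m + 1]):
                    acc += values[j] * x_flat[offsets[j] + base]
                out[m][y][x] = acc
    return out


def dense_conv(inp, weights):
    """Reference dense convolution, used only to check the sparse version."""
    C, H, W = len(inp), len(inp[0]), len(inp[0][0])
    R, S = len(weights[0][0]), len(weights[0][0][0])
    return [[[float(sum(f[c][r][s] * inp[c][y + r][x + s]
                        for c in range(C)
                        for r in range(R)
                        for s in range(S)))
              for x in range(W - S + 1)]
             for y in range(H - R + 1)]
            for f in weights]


def flatten(inp):
    """Flatten a [C][H][W] input into one dense vector."""
    return [v for channel in inp for row in channel for v in row]


if __name__ == "__main__":
    inp = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
           [[9, 8, 7], [6, 5, 4], [3, 2, 1]]]
    weights = [[[[1, 0], [0, 0]], [[0, 0], [0, 2]]],
               [[[0, 0], [0, 0]], [[0, -1], [0, 0]]]]
    csr = to_csr(weights, H=3, W=3)
    print(direct_sparse_conv(flatten(inp), csr, 3, 3, 2, 2))
```

Because the loop over `(m, y, x)` has no dependencies between iterations, one plausible GPU mapping assigns each output element to a thread, with the work per thread proportional to the number of nonzeros in the filter. How the paper actually allocates threads and manages memory is not stated in the abstract.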

Key words

convolutional neural network/sparse/parallel/optimization/GPU

Classification

Information technology and security science

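Cite this article

方程, 邢座程, 陈顼颢, 张洋. A GPU-based high-performance optimization method of sparse convolutional neural networks [J]. Computer Engineering & Science, 2018, 40(12): 2103-2111, 9.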

Funding

National Natural Science Foundation of China (61170083)

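Computer Engineering & Science

Open access; indexed in the Peking University Core Journals list (北大核心), CSCD, and CSTPCD

ISSN 1007-130X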
