Computer Engineering & Science, 2018, Vol. 40, Issue (1): 34-39, 6. DOI: 10.3969/j.issn.1007-130X.2018.01.005
Research on accelerating CNN convolution computation on mobile GPUs
Accelerating CNN on mobile GPU
Abstract
Convolutional Neural Networks (CNNs) play an increasingly important role in areas such as image classification and speech recognition because of their excellent performance. Some researchers have sought to run this deep learning process on mobile phones, but the performance of the ported programs is unsatisfactory because of the large amount of computation a CNN requires. To explore how to solve this problem, this paper uses the deep learning framework MXNet to implement the forward process of a CNN on mobile phones and focuses on exploiting the GPU, another powerful computing device on the phone. Based on the general-purpose OpenCL programming framework, we compute the convolution, the most time-consuming step of the forward process, as a matrix multiplication and move it to the GPU. In addition, several improvements are made to achieve better performance. The experimental results show that the time of the forward process is reduced to half of the original.
Key words
CNN/mobile phone/mobile GPU/fast algorithm/OpenCL
Classification
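Information technology and security science
Cite this article
WANG Xiangxin, SHI Yang, WEN Mei. Accelerating CNN on mobile GPU[J]. Computer Engineering & Science, 2018, 40(1): 34-39, 6.
Funding
National Natural Science Foundation of China (61272145)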
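The abstract's central step, computing convolution as a matrix multiplication, can be illustrated with a small sketch. The version below uses the common im2col unrolling in pure Python; the abstract does not say which unrolling scheme the paper uses, nor does it give the OpenCL kernels, so the function names (`im2col`, `matmul`, `conv2d_gemm`, `conv2d_direct`) and the single-channel, stride-1, no-padding setting are assumptions made only for illustration. On a phone, the `matmul` step is the part that would be handed to a GPU matrix-multiplication kernel.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2-D image into one column.

    Returns (cols, out_h, out_w), where cols has kh*kw rows and
    out_h*out_w columns (stride 1, no padding).
    """
    h, w = len(image), len(image[0])
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = [[0] * (out_h * out_w) for _ in range(kh * kw)]
    for i in range(out_h):
        for j in range(out_w):
            col = i * out_w + j
            for di in range(kh):
                for dj in range(kw):
                    cols[di * kw + dj][col] = image[i + di][j + dj]
    return cols, out_h, out_w


def matmul(a, b):
    """Plain matrix product; the step a GPU GEMM kernel would replace."""
    inner, n = len(b), len(b[0])
    return [[sum(row[k] * b[k][c] for k in range(inner)) for c in range(n)]
            for row in a]


def conv2d_gemm(image, kernels):
    """Convolve one single-channel image with several kernels via im2col + GEMM."""
    kh, kw = len(kernels[0]), len(kernels[0][0])
    cols, out_h, out_w = im2col(image, kh, kw)
    # Flatten each kernel into one row of the weight matrix.
    weights = [[v for row in k for v in row] for k in kernels]
    flat = matmul(weights, cols)
    # Reshape each output row back into an out_h x out_w feature map.
    return [[r[i * out_w:(i + 1) * out_w] for i in range(out_h)] for r in flat]


def conv2d_direct(image, kernel):
    """Reference sliding-window convolution (cross-correlation, as in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)] for i in range(out_h)]
```

The unrolled matrix repeats overlapping pixels, so this approach trades extra memory for one large, regular multiplication, which is the kind of workload that tends to map well onto GPU hardware.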