Computer Engineering (计算机工程), 2019, Vol. 45, Issue (3): 41-46, 6. DOI: 10.19678/j.issn.1000-3428.0052189
Design and Implementation of Tucker Decomposition Module Based on CUDA and CUBLAS
Original title: 基于CUDA与CUBLAS的Tucker分解模块设计与实现
Abstract
Tensor Tucker decomposition is widely used in image processing, face recognition, signal processing, and other fields, which has made the Tucker decomposition algorithm a key research subject. However, the currently popular Tucker decomposition algorithms must expand (unfold) the tensor many times, so most of the potential speedup is consumed by the repeated expansion. To address this problem, a modified Tucker decomposition module for the CUDA platform is proposed. By optimizing the Tucker decomposition algorithm and the CUDA platform, the tensor expansion step is omitted, which lowers the requirements on the acceleration system and improves acceleration efficiency. Experimental results show that the modified Tucker decomposition algorithm achieves better acceleration performance on the CUDA platform.
Key words
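Tucker decomposition algorithm / tensor decomposition / Compute Unified Device Architecture (CUDA) / Graphics Processing Unit (GPU) / tensor norm
Classification
Information Technology and Security Science
Cite this article
ZHOU Qi (周琦), CHAI Xiaoli (柴小丽), MA Kejie (马克杰), YU Zeren (俞则人). Design and Implementation of Tucker Decomposition Module Based on CUDA and CUBLAS [J]. Computer Engineering, 2019, 45(3): 41-46, 6.
Funding
China Electronics Technology Group Corporation New Technology R&D Project on Free Hardware for Secure and Controllable Systems (中国电子科技集团安可系统自由硬件新技术研发项目) (170225).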
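Illustrative note: the abstract's central idea is that the mode-n products used in Tucker decomposition do not have to go through an explicit tensor expansion (unfolding). The NumPy sketch below is not the paper's CUDA/CUBLAS module; it only shows, under that assumption, that a mode-n product computed via unfolding and one computed as a direct contraction give the same result. The function names `unfold`, `mode_product_unfold`, and `mode_product_direct` are hypothetical, and NumPy may still reshape data internally.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product_unfold(tensor, matrix, mode):
    """Mode-n product computed through an explicit unfold / refold round trip."""
    rest = [s for i, s in enumerate(tensor.shape) if i != mode]
    out = matrix @ unfold(tensor, mode)            # shape (J, product of rest)
    out = out.reshape([matrix.shape[0]] + rest)    # refold into a tensor
    return np.moveaxis(out, 0, mode)               # put the new axis back at `mode`

def mode_product_direct(tensor, matrix, mode):
    """Same product written as a contraction over `mode`, with no unfolded copy
    created by the caller."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))  # new axis lands at position 0
    return np.moveaxis(out, 0, mode)

# Usage: both routes agree on a random 3 x 4 x 5 tensor contracted along mode 1.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4, 5))
U = rng.standard_normal((2, 4))
assert np.allclose(mode_product_unfold(X, U, 1), mode_product_direct(X, U, 1))
```

On a GPU, the direct form is typically mapped to batched matrix multiplications over strided slices of the tensor, which is one plausible way to avoid materializing the unfolded matrix; whether the paper does exactly this is not stated in the abstract.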