高技术通讯, 2016, Vol. 26, Issue 12: 951-959 (9 pages). DOI: 10.3772/j.issn.1002-0470.2016.12.004
基于矢量DSP的并行化卷积算法
A parallelized convolution algorithm for vector digital signal processors
Abstract
To improve the efficiency of convolution on a vector digital signal processor (DSP), a highly efficient parallelized algorithm, the radix-2 parallelized short convolution (PSC R2), is proposed. Instead of the direct structure of conventional convolution, the PSC R2 algorithm uses a radix-2 short convolution structure, which effectively reduces the number of cycles the algorithm requires. In addition, application-specific DSP instructions are proposed to guarantee the high efficiency of the parallelized algorithm. Empirical analysis shows that the temporal complexity of the PSC R2 algorithm is only 43% of that of the traditional Vectorising the Inner Loop (VIL) algorithm and 55% of that of the traditional Vectorising the Outer Loop (VOL) algorithm, while its memory consumption is nearly the same as that of the two traditional algorithms. In practical applications, the proposed PSC R2 algorithm could significantly reduce the temporal complexity of convolution, correlation and filtering operations in mobile communications and digital signal processing.
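The abstract does not spell out the PSC R2 structure, so the following is only a minimal sketch of the generic radix-2 short convolution idea it builds on, not the authors' implementation. Both sequences are split into even and odd samples, and the two cross terms are recovered from a single product of the summed halves, so three half-length convolutions replace four. The function names `direct_conv` and `radix2_short_conv` are hypothetical, the inputs are assumed to be non-empty, and the sketch is scalar Python: it does not model vector lanes or the proposed DSP instructions.

```python
def direct_conv(x, h):
    """Direct-form linear convolution, used here as the reference."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def radix2_short_conv(x, h):
    """One level of radix-2 short convolution (illustrative sketch only)."""
    n_out = len(x) + len(h) - 1
    # Zero-pad to even lengths so both sequences split cleanly in two.
    x = list(x) + [0] * (len(x) % 2)
    h = list(h) + [0] * (len(h) % 2)
    x0, x1 = x[0::2], x[1::2]  # even / odd samples of x
    h0, h1 = h[0::2], h[1::2]  # even / odd samples of h

    # Three half-length convolutions instead of four.
    a = direct_conv(x0, h0)
    b = direct_conv(x1, h1)
    c = direct_conv([p + q for p, q in zip(x0, x1)],
                    [p + q for p, q in zip(h0, h1)])
    # x0*h1 + x1*h0 = (x0 + x1)*(h0 + h1) - x0*h0 - x1*h1
    cross = [ck - ak - bk for ck, ak, bk in zip(c, a, b)]

    # Interleave the partial results back into the full-rate output.
    y = [0] * (len(x) + len(h) - 1)
    for k in range(len(a)):
        y[2 * k] += a[k]
        y[2 * k + 1] += cross[k]
        y[2 * k + 2] += b[k]
    return y[:n_out]  # drop samples that exist only because of padding


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    h = [1, -1, 2]
    print(radix2_short_conv(x, h) == direct_conv(x, h))
```

For even lengths N and M, one level of this decomposition needs 3·(N/2)·(M/2) multiplications against N·M for the direct form, at the cost of extra additions. The 43% and 55% figures quoted in the abstract refer to the paper's own vectorized implementation and are not reproduced by this sketch.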
Key words
convolution / parallelization / vector digital signal processor (DSP) / instruction set / temporal complexity
Cite this article
林江南, 周一青, 孙刚, 冯雪林. 基于矢量DSP的并行化卷积算法 [J]. 高技术通讯, 2016, 26(12): 951-959.
Funding
Supported by the National Natural Science Foundation of China (61431001) and the Beijing Young Top-Notch Talent program (2015000021223ZK31).