Acta Electronica Sinica, 2016, Vol. 44, Issue 2: 241-246 (6 pages). DOI: 10.3969/j.issn.0372-2112.2016.02.001
An Efficient SIMD Parallel Memory Structure for Radix-2 FFT Computation
Abstract
As more and more execution units are integrated into digital signal processors (DSPs) with single instruction multiple data stream (SIMD) extensions, the flexibility and bandwidth efficiency of parallel memory access have a significant effect on overall practical performance. Based on a detailed analysis of the memory access problems of the radix-2 fast Fourier transform (FFT) algorithm on a general SIMD DSP, this paper applies XOR logic to part of the address bits to implement memory access address translation, achieving conflict-free parallel SIMD memory accesses for FFT computation. Several memory access instructions with special shuffle modes are then proposed, which completely eliminate the extra shuffle instructions that the radix-2 FFT algorithm otherwise requires on a SIMD architecture. Finally, the vector memory (VM) of the 16-way SIMD DSP YHFT-Matrix2 is optimized with these methods. Test results show that the optimized VM achieves fully pipelined, conflict-free memory accesses and 100% parallel memory access bandwidth utilization, at the cost of an 18% increase in area. Compared with the design before optimization, radix-2 FFTs of different sizes achieve speedups ranging from 1.32 to 2.66.
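The idea of XOR-based address translation can be illustrated with a small sketch. This is a generic, textbook-style mapping and not necessarily the one used in the paper: the bank count of 16 is taken from the 16-way SIMD width mentioned above, while the function names, the choice of which address bits are XORed, and the restriction to an aligned base address are all assumptions made for illustration.

```python
BANKS = 16      # assumed: one memory bank per SIMD lane
BANK_BITS = 4   # log2(BANKS)


def linear_bank(addr):
    """Plain low-order interleaving: the bank is the low address bits."""
    return addr % BANKS


def xor_bank(addr):
    """Hypothetical XOR mapping: low bank bits XOR the next-higher bits."""
    low = addr & (BANKS - 1)
    high = (addr >> BANK_BITS) & (BANKS - 1)
    return low ^ high


def is_conflict_free(bank_fn, base, stride):
    """True if base, base+stride, ... (16 addresses) hit 16 distinct banks."""
    banks = {bank_fn(base + i * stride) for i in range(BANKS)}
    return len(banks) == BANKS


def report():
    """Compare both mappings for the power-of-two strides 1..16 at base 0."""
    return {
        stride: (
            is_conflict_free(linear_bank, 0, stride),
            is_conflict_free(xor_bank, 0, stride),
        )
        for stride in (1, 2, 4, 8, 16)
    }


if __name__ == "__main__":
    for stride, (linear_ok, xor_ok) in report().items():
        print(f"stride {stride:2d}: linear={linear_ok}  xor={xor_ok}")
```

Under these assumptions, plain low-order interleaving is conflict-free only for unit stride, whereas the XOR mapping stays conflict-free for every power-of-two stride up to 16, which is the kind of access pattern radix-2 butterflies produce. This simple mapping breaks down for larger strides and unaligned bases, so a real design would need a more careful choice of address bits.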
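Keywords
FFT / SIMD / low-order interleave / parallel memory / access conflict / data shuffle
Classification
Information Technology and Security Science
Cite this article
CHEN Haiyan, YANG Chao, LIU Sheng, LIU Zhong. An Efficient SIMD Parallel Memory Structure for Radix-2 FFT Computation [J]. Acta Electronica Sinica, 2016, 44(2): 241-246.
Funding
National Natural Science Foundation of China ()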