Journal of South China University of Technology (Natural Science Edition), 2025, Vol. 53, Issue 11: 9-17, 9. DOI: 10.12141/j.issn.1000-565X.240524

Design and Optimization of Fast Fourier Transform Algorithm Based on Ascend NPU (基于昇腾NPU的快速傅里叶变换算法设计与优化)

Lu Lu (陆璐)¹, Wang Yuanfei (王远飞)¹, Liang Zhihong (梁志宏)², Suo Siliang (索思亮)²

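Author information

  • 1. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China
  • 2. Electric Power Research Institute, China Southern Power Grid (南方电网科学研究院有限责任公司), Guangzhou 510663, Guangdong, China; Guangdong Provincial Key Laboratory of Power System Network Security (广东省电力系统网络安全企业重点实验室), Guangzhou 510663, Guangdong, China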

Abstract

As a fundamental algorithm in scientific computing and signal processing, the fast Fourier transform (FFT) is widely applied in fields such as digital signal processing, image processing, and deep learning. With the growth of data scale and the increasing demand for processing power, optimizing FFT algorithms on emerging hardware platforms has become particularly crucial. This paper conducts an in-depth analysis of the architectural characteristics of the Ascend NPU and their impact on FFT algorithm optimization. Based on the matrix-computation-based Stockham FFT algorithm, a series of innovative optimization strategies is proposed: (1) a heuristic radix selection algorithm is designed to provide effective radix sequence combinations for different input sizes; (2) an efficient computation flow for single-iteration FFT without real-imaginary separation is developed, significantly reducing global memory access overhead; (3) an on-chip cache-based data reading optimization strategy is proposed, greatly improving data access speed; (4) a data layout optimization method for multiple iterations is designed, effectively enhancing overall memory access efficiency. Experimental results on the Ascend Atlas 800 platform equipped with the Ascend 910 AI processor demonstrate that the proposed optimization strategies achieve an average speedup of 4.61 over the non-optimized implementation. Independent performance analysis and validation of each optimization strategy show that the individual average speedup ratios range from 1.42 to 3.52. This research provides a technical reference for implementing efficient FFT algorithms on emerging NPU architectures.
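For readers unfamiliar with the terms in the abstract, the following is a minimal CPU-side sketch of a mixed-radix Stockham FFT in which each butterfly is written as a small DFT-matrix product. It is not the paper's Ascend implementation. The function names, the candidate radix set (16, 8, 4, 2), and the greedy selection rule are all assumptions made for illustration; the paper's actual heuristic is not described on this page.

```python
import cmath

def choose_radices(n, candidates=(16, 8, 4, 2)):
    """Hypothetical greedy stand-in for a radix selection heuristic:
    repeatedly peel off the largest candidate radix that divides n."""
    radices = []
    while n > 1:
        for r in candidates:
            if n % r == 0:
                radices.append(r)
                n //= r
                break
        else:
            raise ValueError("length does not factor into the candidate radices")
    return radices

def dft_matrix(r):
    """r x r DFT matrix F[k][j] = exp(-2*pi*i*j*k/r)."""
    return [[cmath.exp(-2j * cmath.pi * j * k / r) for j in range(r)]
            for k in range(r)]

def stockham_fft(x, radices=None):
    """Out-of-place Stockham autosort FFT; output is in natural order,
    so no bit-reversal pass is needed."""
    n = len(x)
    if radices is None:
        radices = choose_radices(n)
    prod = 1
    for r in radices:
        prod *= r
    if prod != n:
        raise ValueError("product of radices must equal the input length")
    src = [complex(v) for v in x]
    dst = [0j] * n
    length, stride = n, 1          # current sub-transform length and stride
    for r in radices:
        m = length // r
        f = dft_matrix(r)
        for p in range(m):
            # Twiddle factors w_length^(p*k) for this butterfly group.
            tw = [cmath.exp(-2j * cmath.pi * p * k / length) for k in range(r)]
            for q in range(stride):
                a = [src[q + stride * (p + j * m)] for j in range(r)]
                for k in range(r):
                    # Butterfly as one row of a matrix-vector product.
                    acc = sum(f[k][j] * a[j] for j in range(r))
                    dst[q + stride * (r * p + k)] = tw[k] * acc
        src, dst = dst, src        # ping-pong buffers between iterations
        length, stride = m, stride * r
    return src
```

With the assumed candidate set, a 32-point input is factored as radices [16, 2], so the transform finishes in two iterations. Fewer iterations mean fewer full passes over the data, which is the kind of trade-off a radix selection heuristic has to weigh.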

Key words

fast Fourier transform / Ascend NPU / heterogeneous computing / high-performance computing

Classification

Information technology and security science

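Cite this article

Lu Lu, Wang Yuanfei, Liang Zhihong, Suo Siliang. Design and Optimization of Fast Fourier Transform Algorithm Based on Ascend NPU [J]. Journal of South China University of Technology (Natural Science Edition), 2025, 53(11): 9-17, 9.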

Funding

Supported by the Natural Science Foundation of Guangdong Province (2024A1515010204)

Supported by a project of the Electric Power Research Institute, China Southern Power Grid (南方电网科学研究院有限责任公司) (1500002024030103XA00063)

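Journal of South China University of Technology (Natural Science Edition) (华南理工大学学报(自然科学版))

Open access (OA); Peking University Core Journal (北大核心)

ISSN 1000-565X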
