Integrated Circuits and Embedded Systems, 2026, Vol. 26, Issue (4): 14-25, 12. DOI: 10.20193/j.ices2097-4191.2025.0138
FPGA static timing analysis algorithm accelerated by high-efficiency heterogeneous parallelization strategy
Abstract
The widespread adoption of Field Programmable Gate Arrays (FPGAs) in high-performance computing, AI inference, and 5G communications has driven a sharp increase in design scale and timing-constraint complexity, which places stringent demands on the runtime efficiency of Static Timing Analysis (STA). Current FPGA STA tools, which are built primarily on single-core or multi-core CPU architectures, are increasingly hitting a performance wall: despite persistent algorithmic refinements, they struggle with computational bottlenecks and suboptimal memory throughput on large-scale designs. In recent years, Graphics Processing Units (GPUs), with their massive parallel computing capabilities, have provided new opportunities for improving FPGA STA performance. However, challenges in memory access patterns under heterogeneous GPU architectures, in optimizing timing graph loop detection, and in heterogeneous parallel acceleration strategies continue to limit the effectiveness of current GPU-accelerated methods in FPGA STA scenarios. To address these issues, we propose an FPGA STA algorithm accelerated by an efficient heterogeneous parallel strategy. First, to address the discontinuous memory access and field interleaving of traditional object-oriented data structures under CPU-GPU heterogeneous architectures, a structure-of-arrays (SoA)-based data layout strategy is presented. Combined with data reordering operations, this approach reduces memory access latency and improves bandwidth utilization, providing a data foundation for high-performance FPGA STA computational engines. Second, to overcome the low efficiency and poor robustness of timing graph loop detection, a parallel loop detection algorithm based on color propagation is designed, accelerating the preprocessing stage of FPGA STA. Furthermore, a task decomposition and timing graph traversal method tailored for CPU-GPU heterogeneous architectures is proposed, accelerating core STA operations such as delay calculation, levelization, and graph propagation. Finally, experimental results on both OpenCores and industrial-grade FPGA benchmarks demonstrate that, compared with traditional CPU implementations, the proposed method achieves a runtime speedup of 3.125× to 33.333×, with overall performance surpassing that of the OpenTimer tool. This research provides a practical approach to efficient timing verification for large-scale FPGA designs.
Keywords
FPGA / static timing analysis / heterogeneous computing / parallel acceleration / electronic design automation
Classification
Information technology and security science
Cite this article
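Tian Chunsheng, Zhao Xiangyu, Wang Shuo, Wang Zhuoli, Cao Yongzheng, Zhou Jing, Zhang Yaowei, Chen Lei. FPGA static timing analysis algorithm accelerated by high-efficiency heterogeneous parallelization strategy [J]. Integrated Circuits and Embedded Systems, 2026, 26(4): 14-25, 12.
Funding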
National Natural Science Foundation of China (No. 62374138)