Parallel Multi-Pipeline Design of Embedded Heterogeneous AI Computing Systems
Because embedded AI computing systems operate under tight power budgets and must process data from multiple sensors intelligently in real time, they place severe demands on the hardware platform, both in the energy efficiency of its AI compute and in the degree of parallelism across intelligent computing tasks. The DSP+FPGA digital signal processing architecture commonly used in traditional embedded computing systems is not suited to accelerating inference for multiple neural network models. Based on an ARM+DLP+SRIO embedded heterogeneous intelligent computing architecture, this paper proposes a parallel multi-pipeline design method that exploits the multi-chip, multi-core, and multi-memory-channel characteristics of the deep learning processor. The method accounts for the time cost of each stage of an intelligent computing task, including data transmission, copying, inference, and result feedback, and allocates intelligent computing resources among the different neural network models so as to maximize end-to-end throughput. Experimental results show that, with the parallel multi-pipeline design method, utilization of the deep learning processor is on average about 25.2% higher than with a single pipeline and about 30.7% higher than without pipelining. The design meets the real-time intelligent processing requirements of multi-modal imagery such as visible-light, infrared, and SAR images, and is of practical value.
赵二虎;吴济文;肖思莹;晋振杰;徐勇军
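Special Technology Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190 || University of Chinese Academy of Sciences, Beijing 100049; Special Technology Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; Special Technology Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; Special Technology Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; Special Technology Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190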
Computer Science and Automation
embedded AI computing systems; heterogeneous computing architecture; neural network model; parallel multi-pipeline; deep learning processor (DLP)
Acta Electronica Sinica (《电子学报》), 2023 (11)
pp. 3354-3364 (11 pages)
CAS Technology Support Talent Program; Open Fund of Qianjiang Laboratory, Hangzhou Innovation Institute, Beihang University (No. 2020-Y8-A-023)