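A hardware offloading structure and method for ultra long vector reduction operation based on direct memory access and dynamic shared buffer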

Xu Jinbo (徐金波), Dai Yi (戴艺), Jian Jie (翦杰)

Computer Engineering & Science (计算机工程与科学), 2025, Vol. 47, Issue 4: 571-581, 11. DOI: 10.3969/j.issn.1007-130X.2025.04.001

A hardware offloading structure and method for ultra long vector reduction operation based on direct memory access and dynamic shared buffer

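Xu Jinbo 1, Dai Yi 1, Jian Jie 1

Author information

  • 1. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China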

Abstract

MPI (Message Passing Interface) collective communication enhances system performance by organizing multiple processes across multiple computing nodes to collaboratively complete a series of communication operations. Among these, reduction operations on ultra-long operand vectors are widely used in high-performance computing and AI (Artificial Intelligence) computations. This paper proposes a hardware offloading structure and method for ultra-long vector reduction operations based on DMA (Direct Memory Access) and dynamic shared buffers. It controls the hardware offloading process for collective communication through a dedicated hardware communication sequence trigger mechanism. The DMA transmission protocol is employed to enhance the software-hardware transmission efficiency of reduction operands. An on-chip dynamic shared buffer storage structure is introduced to achieve flexible and efficient caching of a large number of operands. By deploying an on-chip ALU (Arithmetic Logic Unit) array, computations are performed directly within the network chip. Experimental results demonstrate significant acceleration compared to both non-offloaded MPI methods and the original offloading method used in Tianhe, especially when dealing with longer reduction vectors.
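
As a rough illustration of what an element-wise reduction over a long vector looks like when it has to pass through a bounded buffer, the sketch below reduces several equally long vectors one chunk at a time. It is a plain software model written for this summary, not the paper's hardware design: the function name `chunked_reduce`, the `chunk_len` parameter, and the list used as a staging buffer are all hypothetical, and nothing here models DMA transfers, the communication sequence trigger mechanism, or the ALU array.

```python
import operator
from typing import Callable, List, Sequence


def chunked_reduce(
    vectors: Sequence[Sequence[float]],
    op: Callable[[float, float], float],
    chunk_len: int,
) -> List[float]:
    """Element-wise reduce equally long vectors, chunk_len elements at a time.

    Hypothetical software model: the per-chunk list stands in for a
    bounded staging buffer; it is not the structure described in the paper.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    if not vectors:
        raise ValueError("at least one vector is required")
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("all vectors must have the same length")

    result: List[float] = []
    for start in range(0, length, chunk_len):
        end = min(start + chunk_len, length)
        # Load the first contributor's chunk into the staging buffer.
        buffer = list(vectors[0][start:end])
        # Fold every other contributor's chunk into the buffer in place.
        for vec in vectors[1:]:
            for i, value in enumerate(vec[start:end]):
                buffer[i] = op(buffer[i], value)
        # The reduced chunk is complete; append it to the output.
        result.extend(buffer)
    return result


if __name__ == "__main__":
    contributions = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50], [100, 200, 300, 400, 500]]
    print(chunked_reduce(contributions, operator.add, chunk_len=2))
```

The chunk length only bounds how much data is held at once; the reduced vector is the same for any positive `chunk_len`, which is the property a buffered reduction has to preserve.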

Key words

collective communication/reduce/direct memory access/dynamic shared buffer/hardware offloading

Classification

Computer science and automation

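Cite this article

Xu Jinbo, Dai Yi, Jian Jie. A hardware offloading structure and method for ultra long vector reduction operation based on direct memory access and dynamic shared buffer [J]. Computer Engineering & Science, 2025, 47(4): 571-581, 11.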

Funding

National Defense Science and Technology Key Laboratory Fund (2022-KJWPDL-11)

Independent Innovation Science Fund (22-ZZCX-002)

Computer Engineering & Science (计算机工程与科学)

Open access; Peking University Core Journal

ISSN 1007-130X
