Research on Optical Network Architecture for Intelligent Computing Centers Based on Lossless Ethernet (Invited)

翟锐 李壮志 侯广营 马艺嘉 徐化朗

光通信研究 (Study on Optical Communications), Issue (5): 70-75, 6. DOI: 10.13756/j.gtxyj.2024.240028

Research on Optical Network of Intelligent Computing Center based on Ethernet Lossless Networking

翟锐¹, 李壮志¹, 侯广营¹, 马艺嘉¹, 徐化朗¹

Author information

  • 1. China United Network Communications Co., Ltd., Shandong Branch, Jinan 250002, China

Abstract

[Objective] In recent years, Artificial Intelligence Generated Content (AIGC) has set off an artificial intelligence revolution. Network connectivity in the Intelligent Computing Center (ICC) is developing toward ultra-high bandwidth, intelligent lossless operation, and computing-network convergence. The optical network of the ICC therefore needs to reduce inter-card communication time in order to improve data access efficiency. [Methods] This paper addresses the networking architecture of optical networks for ICC scenarios, aiming at a lossless network with large bandwidth, low latency, and high Central Processing Unit (CPU) efficiency that can meet the demands of large model training and inference in the ICC. It analyzes in detail the traffic distribution characteristics of the ICC and the communication flow characteristics in AI large model training networking scenarios. It also studies in depth technologies such as lossless Ethernet based on Remote Direct Memory Access (RDMA) and optoelectronic co-packaging. Finally, it carries out networking practice and latency tests in the ICC scenario. [Results] The RDMA over Converged Ethernet (RoCE)-based transport scheme proposed in this paper supports Priority-based Flow Control (PFC), Explicit Congestion Notification (ECN), Enhanced Transmission Selection (ETS), and the Data Center Bridging Capability Exchange (DCBX) protocol, which together enable lossless transmission over Ethernet in data centers. The test results show that the transmission delay with the RoCE protocol is stable at around 1 μs and significantly outperforms the Internet Wide Area RDMA Protocol (iWARP). [Conclusion] Based on the traffic characterization of the intelligent computing scenario, this paper studies the key characteristics of the lossless Ethernet network in the ICC and uses RDMA technology to improve the transmission efficiency of the optical switching network in the ICC scenario. It also puts forward a lossless Ethernet network scheme for the large model inference scenario of the ICC, exploring a feasible direction for the application of RDMA technology in intelligent computing.
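
The abstract names PFC and ECN as the mechanisms that keep the Ethernet fabric lossless but does not describe how they interact. As a rough illustration only, and not the authors' implementation, the sketch below models a single switch queue: ECN marks packets with rising probability as the queue fills, and PFC pauses the upstream sender only when the queue gets close to overflowing. The class name `LosslessQueue` and every threshold value are hypothetical, chosen purely for readability.

```python
from dataclasses import dataclass


@dataclass
class LosslessQueue:
    """Toy model of one priority queue on a lossless Ethernet switch port.

    All thresholds are hypothetical queue depths in KB, not values from the paper.
    """
    ecn_kmin: float = 100.0   # below this depth, packets are never ECN-marked
    ecn_kmax: float = 400.0   # at or above this depth, every packet is ECN-marked
    ecn_pmax: float = 0.2     # marking probability reached just below ecn_kmax
    pfc_xoff: float = 800.0   # pause the upstream sender at or above this depth
    pfc_xon: float = 600.0    # resume the upstream sender at or below this depth
    paused: bool = False

    def ecn_mark_probability(self, depth: float) -> float:
        """Return the probability of ECN-marking a packet at this queue depth."""
        if depth < self.ecn_kmin:
            return 0.0
        if depth >= self.ecn_kmax:
            return 1.0
        span = self.ecn_kmax - self.ecn_kmin
        return self.ecn_pmax * (depth - self.ecn_kmin) / span

    def update_pfc(self, depth: float) -> bool:
        """Update and return the pause state, with hysteresis between XON and XOFF."""
        if not self.paused and depth >= self.pfc_xoff:
            self.paused = True
        elif self.paused and depth <= self.pfc_xon:
            self.paused = False
        return self.paused


if __name__ == "__main__":
    q = LosslessQueue()
    for depth in (50, 250, 500, 850, 700, 550):
        print(depth, q.ecn_mark_probability(depth), q.update_pfc(depth))
```

Placing the ECN thresholds well below the PFC thresholds reflects the usual intent that end-to-end rate reduction acts first, with hop-by-hop pausing as a last resort against packet loss.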

Keywords

RDMA (remote direct memory access) / lossless Ethernet network / ICC (intelligent computing center) / optical switching

Classification

Information Technology and Security Science

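Cite this article

翟锐, 李壮志, 侯广营, 马艺嘉, 徐化朗. Research on Optical Network of Intelligent Computing Center based on Ethernet Lossless Networking (Invited) [J]. 光通信研究 (Study on Optical Communications), 2024, (5): 70-75, 6.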

Journal: 光通信研究 (Study on Optical Communications), ISSN 1005-8788. Indexing: OA, Peking University Core Journals (北大核心), CSTPCD.
