Four issues to consider in building a computer system supporting large model training

ZHENG Weimin (郑纬民)

Big Data Research (大数据), 2024, Vol. 10, Issue 1: 1-8 (8 pages). DOI: 10.11959/j.issn.2096-0271.2024016


Author information

  • 1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084

Abstract

Three types of computer systems support large model training. Among them, the software ecosystem of systems based on domestic AI chips is still immature. To change this situation, ten key pieces of software, such as AI compilers and parallel acceleration software, need to be developed well. Systems based on supercomputers, in turn, require careful hardware-software co-design to serve large model training effectively. This article proposes a four-point balanced design for building large model infrastructure to ensure system performance, reliability, and scalability.

Key words

large model training/computer system/supercomputing system/large model infrastructure

Classification

Information technology and security science

Cite this article

ZHENG Weimin. Four issues to consider in building a computer system supporting large model training [J]. Big Data Research, 2024, 10(1): 1-8.

Big Data Research (大数据), ISSN 2096-0271. Open access (OA); indexed in CSTPCD.
