Information and Communications Technology and Policy, 2024, Vol. 50, Issue 12: 13-20, 8. DOI: 10.12267/j.issn.2096-5931.2024.12.003
Analysis of large language model architecture evolution
王蕴韬1
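Author information
- 1. Artificial Intelligence Research Institute, China Academy of Information and Communications Technology, Beijing 100191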
Abstract
This paper systematically reviews and analyzes the major directions of innovation built on the Transformer architecture. It examines the evolution of large language model architectures along three dimensions: innovation within the Transformer architecture itself, fusion of the Transformer with other architectures, and non-Transformer architectures. It also offers an outlook on the future development of foundation models.
Keywords
large model architecture / Transformer / attention mechanism / architectural innovation
Classification
Information Technology and Security Science
Cite this article
王蕴韬. Analysis of large language model architecture evolution [J]. Information and Communications Technology and Policy, 2024, 50(12): 13-20, 8.