
A Survey of FPGA Inference Implementation Techniques for Large Models and Future Challenges (大模型FPGA推理实现技术综述与未来挑战)

黄思晓 彭皓翔 施旭 苏志锋 黄明强 余浩

Integrated Circuits and Embedded Systems (集成电路与嵌入式系统), 2025, Vol. 25, Issue 6: 1-13, 13. DOI: 10.20193/j.ices2097-4191.2025.0023

A survey on hardware accelerator for large model inference on FPGA

黄思晓¹, 彭皓翔¹, 施旭¹, 苏志锋¹, 黄明强¹, 余浩¹

Author information

  • 1. School of Microelectronics (深港微电子学院), Southern University of Science and Technology, Shenzhen 518055, China

Abstract

In recent years, with the widespread adoption of large models (such as GPT, LLaMA, and DeepSeek), the compute demands and energy-efficiency problems of the inference stage have become increasingly prominent. Traditional GPU solutions deliver high throughput but face challenges in power consumption, real-time performance, and cost. With their customizable architecture, deterministic low latency, and high energy efficiency, FPGAs have become an important alternative for deploying large-model inference. This paper systematically reviews the network structure of large models and the techniques for implementing large-model inference on FPGAs, covering three major directions: hardware architecture adaptation, algorithm-hardware co-optimization, and system-level challenges. At the hardware level, the focus is on the design of computing units and memory-hierarchy optimization strategies. At the algorithm level, key techniques such as model compression, dynamic quantization, and compiler optimization are analyzed. At the system level, challenges such as multi-FPGA scaling, thermal management, and emerging compute-in-memory (integrated storage and computing) architectures are discussed. In addition, this paper summarizes the limitations of the current FPGA inference ecosystem (such as insufficient toolchain maturity) and looks ahead to future trends, including chiplet-based heterogeneous integration, integration with photonic computing, and the establishment of a standardized evaluation system. The review shows that the architectural flexibility of FPGAs gives them a unique advantage in efficient large-model inference, but interdisciplinary collaboration is still needed to bring the technology into practical deployment.

Key words

large language models / hardware accelerator / FPGA / integrated storage and computing architecture / Transformer

Classification

Electronic and Information Engineering

Cite this article

黄思晓, 彭皓翔, 施旭, 苏志锋, 黄明强, 余浩. 大模型FPGA推理实现技术综述与未来挑战[J]. 集成电路与嵌入式系统, 2025, 25(6): 1-13, 13.

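Funding

National Key R&D Program of China (2021YFE0204000)

Shenzhen Science and Technology Innovation Commission, 2021 Shenzhen High-Level Talent Peacock Team Project (KQTD20200820113051096)

Shenzhen Science and Technology Innovation Commission, Key Basic Research Project (JCYJ20220818100217038)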

Integrated Circuits and Embedded Systems (集成电路与嵌入式系统)

ISSN 1009-623X
