Design of multi-precision reconfigurable tensor computing unit (一种多精度可重构张量计算单元的设计)

Integrated Circuits and Embedded Systems (集成电路与嵌入式系统), 2026, Vol. 26, Issue 3: 81-89, 9. DOI: 10.20193/j.ices2097-4191.2025.0109

胡湘宏¹, 梁克龙¹, 尹飞跃¹, 冯兆樟¹, 林元妙¹, 蔡述庭¹, 熊晓明¹

Author information

  • 1. School of Integrated Circuits, Guangdong University of Technology, Guangzhou 510000, China

Abstract

With the rapid development of artificial intelligence and deep learning applications, tensor computing urgently demands high-efficiency, multi-precision hardware accelerators. Traditional general-purpose processors face energy-efficiency bottlenecks when processing large-scale matrix multiplication, while existing dedicated accelerators often lack flexibility in supporting diverse data precisions and hybrid computing modes. This paper presents a multi-precision and mixed-precision tensor processing unit (TPU) based on a reconfigurable architecture. It supports five data formats (INT4, INT8, FP16, BF16, FP32) and two hybrid modes (FP16+FP32, BF16+FP32), and it efficiently performs matrix multiplication and accumulation for three matrix shapes (m16n16k16, m32n8k16, m8n32k16). By incorporating a reconfigurable computing array, dynamic data flow control, a multi-mode buffer design, and a unified floating-point processing unit, the design achieves high hardware reuse and significantly improved computational efficiency. Synthesized on the VCU118 FPGA platform at 251.13 MHz, it delivers a peak theoretical performance of 257.16 GOPS/GFLOPS (INT4/INT8/FP16/BF16) and 64.29 GFLOPS (FP32). The design is well suited to applications such as deep learning inference, autonomous driving, and medical imaging, where both computational efficiency and flexibility are critical.
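The figures in the abstract can be cross-checked with a little arithmetic. The Python sketch below is only an illustration and not the authors' method: it reads each shape name as (m, n, k) for C[m×n] += A[m×k]·B[k×n], counts the multiply-accumulate (MAC) operations per tile, and divides each reported peak by the reported clock to see how many operations per cycle that peak would imply. The helper names are hypothetical, and the "1 MAC = 2 operations" convention is an assumption; the abstract does not state the number of MAC units.

```python
# Shapes named in the abstract, read as (m, n, k): C[m x n] += A[m x k] @ B[k x n].
SHAPES = {
    "m16n16k16": (16, 16, 16),
    "m32n8k16": (32, 8, 16),
    "m8n32k16": (8, 32, 16),
}

CLOCK_MHZ = 251.13  # synthesis frequency reported in the abstract
PEAK_GOPS = {       # peak figures reported in the abstract
    "INT4/INT8/FP16/BF16": 257.16,
    "FP32": 64.29,
}


def macs_per_tile(m, n, k):
    """Multiply-accumulate operations needed for one m x n x k tile."""
    return m * n * k


def implied_ops_per_cycle(peak_gops, clock_mhz):
    """Operations per clock cycle implied by a peak figure.

    This is an inference from the reported numbers, not a stated specification.
    """
    return peak_gops * 1e9 / (clock_mhz * 1e6)


tile_macs = {name: macs_per_tile(*dims) for name, dims in SHAPES.items()}
ops_per_cycle = {
    fmt: round(implied_ops_per_cycle(gops, CLOCK_MHZ))
    for fmt, gops in PEAK_GOPS.items()
}

if __name__ == "__main__":
    for name, macs in tile_macs.items():
        print(f"{name}: {macs} MACs per tile")
    for fmt, ops in ops_per_cycle.items():
        # Assumption: one MAC is counted as two operations (multiply + add).
        print(f"{fmt}: ~{ops} ops/cycle, ~{ops // 2} MACs/cycle")
```

Under these assumptions, all three shapes require the same number of MACs per tile, which is consistent with a single computing array being reconfigured among them, and the low-precision peak is four times the FP32 peak.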

Key words

tensor processing unit/multi-precision computation/reconfigurable architecture/matrix multiplication/hardware reutilization

Classification

Information technology and security science

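Cite this article

胡湘宏, 梁克龙, 尹飞跃, 冯兆樟, 林元妙, 蔡述庭, 熊晓明. Design of multi-precision reconfigurable tensor computing unit [J]. Integrated Circuits and Embedded Systems (集成电路与嵌入式系统), 2026, 26(3): 81-89, 9.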

Funding

National Natural Science Foundation of China (62301165)

Guangzhou Science and Technology Plan Project (2023B01J0007)

Integrated Circuits and Embedded Systems (集成电路与嵌入式系统)

ISSN 1009-623X
