A low-power keyword spotting system with SRAM buffer and computing-in-memory

Open Access (OA); Peking University Core Journals (北大核心); CSTPCD

To address the high power consumption that keyword spotting (KWS) algorithms incur when deployed on edge computing hardware, which shortens the battery life of battery-powered devices, this paper proposes a low-power KWS system based on computing-in-memory (CIM) technology and software-hardware co-optimization. At the algorithm level, a ternary quantized MFCC-CNN joint algorithm is proposed based on the topology of the standard MFCC algorithm, and all general matrix multiplications (GEMMs) in MFCC are mapped to the neural network accelerator. At the circuit level, an SRAM-based CIM core is proposed to overcome the power wall and memory wall of traditional von Neumann accelerators. In addition, by reusing the SRAM storage function of the CIM core, a look-up-table-based buffer circuit is proposed to replace the register delay chain. Both the SRAM-based CIM core and the SRAM buffer are implemented with custom cells. At the system level, a low-power KWS system is built on these two custom circuits. The system is designed with a combined ASIC and custom circuit design flow and synthesized with a 28 nm CMOS process library. At 250 kHz, the system achieves a latency of 64 ms on a 10-class classification task, with a total power consumption of 645.28 μW. The dynamic power of the MFCC pipeline accounts for 5.9% of the total dynamic power, and the pipeline's total power accounts for only 1.3% of the system power.
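The idea of expressing an MFCC stage as a matrix multiplication with ternary weights can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 4 x 8 DCT size, the sample input, and the thresholding rule (0.7 times the mean absolute weight, a common heuristic for ternary weights) are all assumptions chosen for illustration. The sketch builds the DCT-II basis used in the last MFCC step as a matrix, quantizes it to {-1, 0, +1}, and evaluates the product using only additions and subtractions.

```python
import math

def dct_matrix(n_out, n_in):
    """DCT-II basis as an (n_out x n_in) matrix, so the DCT becomes a matrix product."""
    return [[math.cos(math.pi * k * (2 * n + 1) / (2 * n_in)) for n in range(n_in)]
            for k in range(n_out)]

def ternarize(matrix, ratio=0.7):
    """Map each weight to -1, 0, or +1.

    The threshold ratio * mean(|w|) is an assumed heuristic, not taken from the paper.
    """
    flat = [abs(w) for row in matrix for w in row]
    threshold = ratio * sum(flat) / len(flat)
    return [[0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in row]
            for row in matrix]

def ternary_matvec(tmatrix, x):
    """Matrix-vector product with ternary weights: only adds and subtracts, no multiplies."""
    out = []
    for row in tmatrix:
        acc = 0.0
        for w, v in zip(row, x):
            if w == 1:
                acc += v
            elif w == -1:
                acc -= v
        out.append(acc)
    return out

# Made-up log-mel energies for one frame (hypothetical input).
log_mel = [0.5, 1.0, -0.25, 2.0, 0.0, -1.5, 0.75, 1.25]
T = ternarize(dct_matrix(4, 8))
coeffs = ternary_matvec(T, log_mel)
```

Because every weight is -1, 0, or +1, each output reduces to a signed sum of inputs, which is the kind of operation a multiply-free accumulate array can evaluate. How the paper actually quantizes and maps the MFCC matrices is not specified in the abstract.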

Huang Zhirui (黄至锐); Jia Xinru (贾心茹); Zhu Haozhe (朱浩哲); Chen Chixiao (陈迟晓)

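State Key Laboratory of Integrated Chips and Systems, Fudan University, Shanghai 200433; Frontier Institute of Chip and System, Fudan University, Shanghai 200438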

Computer and automation

keyword spotting; ternary quantized neural network; computing-in-memory; serial fast Fourier transform (FFT); software-hardware co-design

Computer Engineering & Science (《计算机工程与科学》), 2024 (008)

Pages 1331-1339 (9 pages)

National Key Research and Development Program of China (2022YFB4500101)

10.3969/j.issn.1007-130X.2024.08.001