A storage-efficient last-level branch target buffer based on dynamic compression
Indexing: Open Access; Peking University Core Journals (北大核心); CSTPCD
As software systems grow in size and complexity, their large instruction footprints cause frequent misses in instruction caches and branch target buffers (BTBs), which degrades central processing unit (CPU) performance. Modern industrial CPU designs use sufficiently large multi-level BTBs in a decoupled front end to reduce the performance loss caused by these misses. Because on-chip storage is limited, a large-capacity last-level BTB needs higher storage efficiency. However, existing compressed BTBs statically allocate storage for target offsets and cannot adapt to the actual storage needs of each branch, which leads to low storage efficiency. To address this problem, this work proposes a dynamically compressed BTB called the zipped branch target buffer (ZBTB). ZBTB represents target offsets with a variable-length encoding and allocates their storage dynamically. It combines this with a least-recently-used (LRU) replacement policy that needs no extra storage and with skewed associativity to mitigate conflicts, thereby improving storage efficiency. ZBTB is evaluated on traces released for the First Instruction Prefetching Championship (IPC-1). Compared with state-of-the-art storage-efficient BTBs, ZBTB reduces BTB misses (mispredictions) by over 66% with a 33.5 kB storage budget.
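To illustrate why a fixed-width target-offset field wastes storage, the sketch below counts the bits each branch needs for its signed offset and compares a static allocation with a per-branch allocation. It is a minimal illustration under stated assumptions, not the ZBTB encoding: the function names, the 20-bit static width, and the sample addresses are hypothetical, and the per-branch total ignores any length metadata a real variable-length scheme would need.

```python
from typing import List, Tuple

Branch = Tuple[int, int]  # (branch PC, target address); hypothetical sample format


def offset_bits(pc: int, target: int) -> int:
    """Bits needed to hold (target - pc) as a two's-complement signed value."""
    delta = target - pc
    # An n-bit signed field covers [-2**(n-1), 2**(n-1) - 1].
    if delta >= 0:
        return delta.bit_length() + 1
    return (-delta - 1).bit_length() + 1


def static_total(branches: List[Branch], width: int) -> int:
    """Total offset bits when every entry reserves the same `width` bits."""
    return len(branches) * width


def dynamic_total(branches: List[Branch]) -> int:
    """Total offset bits when each entry gets only what its offset needs.

    Assumption: length/tag metadata is not counted here.
    """
    return sum(offset_bits(pc, target) for pc, target in branches)


# Hypothetical branches: two short jumps and one far call.
sample = [(0x1000, 0x1008), (0x2000, 0x1FF0), (0x3000, 0x43000)]
widths = [offset_bits(pc, target) for pc, target in sample]
print(widths, static_total(sample, 20), dynamic_total(sample))
```

In this toy case the two short branches each need 5 bits while the far call needs 20, so a static 20-bit field spends most of its space on bits the short branches never use.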
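Tan Hongze (谭弘泽); Wang Jian (王剑)
State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049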
Keywords: branch prediction; branch target buffer (BTB); compression; skewed associativity
Chinese High Technology Letters (《高技术通讯》), 2024, No. 7
Pages 671-680 (10 pages)
Supported by the National Key Research and Development Program of China (2022YFB3105103).