Chinese High Technology Letters (高技术通讯), 2025, Vol. 35, Issue 12: 1277-1290 (14 pages). DOI: 10.3772/j.issn.1002-0470.2025.12.002
Optimization and performance analysis of the linear scaling three-dimensional fragment method on ARM architecture
Abstract
As semiconductor device dimensions shrink to the nanoscale, quantum effects have an increasingly significant impact on device simulation, which makes it necessary to integrate electronic structure calculations into computer-aided design. Large-scale electronic structure calculations face the dual challenges of computational complexity and communication load. To address them, this paper optimizes the linear scaling three-dimensional fragment method (LS3DF) at both the algorithmic and system levels on the Fugaku supercomputer platform, significantly improving computational efficiency and scalability. The algorithmic improvements are a mixed-precision strategy and angular optimization of the all-band conjugate gradient method. At the system level, a coarse-grained parallel strategy, a band blocking strategy, and a three-dimensional fast Fourier transform (FFT) strategy are proposed. Together these optimizations yield a 4.61-fold increase in computational efficiency, with efficiency reaching 93.69% in large-scale tests on 2,560 nodes. In addition, a performance model is abstracted from this work; its estimated running times differ from the measured running times by less than 5.00%.
Keywords
high-performance computing / electronic structure / first-principles calculations / linear scaling three-dimensional fragment method / Fugaku supercomputer / performance model
Cite this article
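Yan Yujin, Tan Guangming, Jia Weile. Optimization and performance analysis of the linear scaling three-dimensional fragment method on ARM architecture [J]. Chinese High Technology Letters (高技术通讯), 2025, 35(12): 1277-1290.
Funding
Supported by the National Key Research and Development Program of China (2021YFB030060).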