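Optimization and performance analysis of the linear scaling three-dimensional fragment method on ARM architecture (基于ARM架构的线性标度三维分块算法优化及性能分析)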

Yan Yujin, Tan Guangming, Jia Weile

Chinese High Technology Letters (高技术通讯), 2025, Vol. 35, Issue 12: 1277-1290 (14 pages). DOI: 10.3772/j.issn.1002-0470.2025.12.002

Optimization and performance analysis of the linear scaling three-dimensional fragment method on ARM architecture

Yan Yujin (1), Tan Guangming (2), Jia Weile (1)

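Author information

  • 1. High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190
  • 2. University of Chinese Academy of Sciences, Beijing 100049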

Abstract

As semiconductor device dimensions shrink to the nanoscale, the impact of quantum effects on semiconductor device simulations has become increasingly significant, necessitating the integration of electronic structure calculations into computer-aided design. Faced with the dual challenges of computational complexity and communication load in large-scale electronic structure calculations, this paper presents optimizations of the linear scaling three-dimensional fragment method (LS3DF) at both the algorithmic and system levels on the Fugaku supercomputer platform, significantly enhancing computational efficiency and scalability. Algorithmic improvements include the adoption of a mixed-precision strategy and angular optimization of the all-band conjugate gradient method. At the system level, a coarse-grained parallel strategy, a band blocking strategy, and a three-dimensional fast Fourier transform (FFT) strategy are proposed. These optimizations result in a 4.61-fold increase in computational efficiency, with efficiency reaching 93.69% in large-scale tests involving 2,560 nodes. Additionally, a performance model is abstracted from the research, showing less than 5.00% discrepancy between estimated and actual running times.

Key words

high-performance computing; electronic structures; first-principles calculations; linear scaling three-dimensional fragment method; Fugaku supercomputer; performance model

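Cite this article

Yan Yujin, Tan Guangming, Jia Weile. Optimization and performance analysis of the linear scaling three-dimensional fragment method on ARM architecture [J]. Chinese High Technology Letters, 2025, 35(12): 1277-1290.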

Funding

Supported by the National Key Research and Development Program of China (2021YFB030060).

Chinese High Technology Letters (高技术通讯)

Open access; Peking University Core Journal

ISSN 1002-0470
