Computer Engineering, 2025, Vol. 51, Issue (7): 59-67, 9. DOI: 10.19678/j.issn.1000-3428.0069035
一种集成于超算作业调度系统应用的并行参数优化方法
A Parallel Parameter Optimization Method Integrated with Job Scheduling System for Supercomputing Applications
Abstract
High-performance computing architectures have given rise to software and hardware with multi-layer parallel structures. These multi-layer system resources can be assigned, through various schemes, to computational tasks distributed across different vertical tiers and horizontal groups. The allocation schemes, typically determined at runtime by user-defined parallel parameters, significantly affect computational efficiency. As computational scale and complexity increase, the configurable space for these parallel parameters expands, making it increasingly difficult for users to identify the optimal settings. Although such runtime optimization problems are prevalent in scientific computing applications, related research and effective solutions remain scarce. Using the Vienna Ab initio Simulation Package (VASP) as a case study, this study analyzes its multi-layer parallel structure to demonstrate how different parallel parameter configurations can lead to significant variations in computational speed. It then proposes a fully automated optimization method based on a reduced parallel efficiency metric. This approach enables users to quickly determine optimal parallel parameters and identifies the most efficient hardware resource allocation, facilitating effective scaling for large-scale parallel computing. Finally, this study integrates the optimization method with a cluster job scheduling system and applies it to actual VASP calculation jobs submitted by users. Statistical results demonstrate that the proposed method significantly improves job execution speed and enhances the utilization efficiency of supercomputing resources, showing great promise for practical engineering applications.
Keywords
parallel computing / job scheduling / runtime optimization / supercomputing / VASP application
Classification
Information Technology and Security Science
Cite this article
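ZHANG Wenshuai, LI Huimin, LI Jing, PAN Bicai. A Parallel Parameter Optimization Method Integrated with Job Scheduling System for Supercomputing Applications[J]. Computer Engineering, 2025, 51(7): 59-67, 9.
Funding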
Strategic Priority Research Program (Category A) of the Chinese Academy of Sciences (XDA19020102).