Computer and Modernization, Issue (12): 38-45, 8. DOI: 10.3969/j.issn.1006-2475.2025.12.006
MPI-based Heterogeneous Computing Resource Integration and Scheduling Platform
Abstract
To address the problem that high-performance computing centers, especially small and medium-sized ones, cannot undertake large-scale computing jobs because their heterogeneous computing resources are decentralized, this paper designs and implements a heterogeneous computing resource integration and scheduling platform that provides unified management of, and collaborative computing across, heterogeneous computing resources such as x86 and ARM. The platform adopts a layered fusion scheduling architecture: a cluster manager server (CMS) and a job manager client (JMC) dynamically monitor resource status, and a job scheduler (JS) enables collaborative parallel computing among heterogeneous computing nodes. Through master-slave JMC process collaboration and the MPI reduction mechanism, cross-architecture data synchronization at the physical machine level is achieved, and parallel execution of a single job on heterogeneous computing nodes at the physical machine level is realized for the first time. To address the long-tail delay effects and synchronization overhead caused by performance imbalances in heterogeneous clusters, this paper proposes a deadline-constrained minimal resource (DCMR) algorithm, which minimizes computational resource consumption while ensuring that job completion deadlines are met. Test results show that the platform incurs almost no loss of computing performance in heterogeneous environments and that the DCMR algorithm effectively improves the utilization efficiency of heterogeneous computing resources, providing a reliable system solution for heterogeneous computing environments.
Keywords
heterogeneous computing resource / resource scheduling / MPI / physical machine / small and medium-sized computing centers
Classification
Information Technology and Security Science
Cite this article
叶宁, 付康, 胡少文, 龚一峰, 王康, 杨宇仙. MPI-based Heterogeneous Computing Resource Integration and Scheduling Platform[J]. Computer and Modernization, 2025, (12): 38-45, 8.
Funding
Key Research and Development Program of Jiangxi Province (20224BBC31002)