Research on Intelligent Task Orchestration for High Performance Computing Environment (面向高性能计算环境的智能任务编排架构研究)
[Objective] A large-scale scientific computing task often consists of multiple computing jobs or a job group, and these jobs have a required execution order and dependencies among them. Users must wait for one job to complete before submitting the next. To reduce this waiting time, a new job submission method is urgently needed that allows users to submit multiple interdependent jobs at the same time. [Methods] This paper proposes an intelligent task orchestration architecture for high-performance computing environments. It automatically resolves the dependencies between jobs, orchestrates the job submission order, monitors job status, and submits each subsequent job once the job it depends on has completed. [Results] In practical use, the intelligent task orchestration service effectively simplifies user operations. [Conclusions] The proposed architecture works well in practical application.
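The workflow described under [Methods] (resolve dependencies, order submissions, submit a job only after the jobs it depends on have completed) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function `orchestrate`, the `run_job` callback, and the job names are hypothetical, and the sketch assumes a blocking callback that stands in for submitting a job to a scheduler and monitoring it until it finishes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def orchestrate(dependencies, run_job):
    """Submit jobs in dependency order (illustrative sketch only).

    dependencies: maps each job name to the set of jobs it depends on.
    run_job: hypothetical callback that submits one job, waits for it to
             finish, and returns True on success.
    Returns the job names in the order they were submitted.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    submitted = []
    while sorter.is_active():
        # get_ready() yields jobs whose dependencies have all completed;
        # sorting only makes the order deterministic for this example.
        for job in sorted(sorter.get_ready()):
            submitted.append(job)
            if not run_job(job):
                raise RuntimeError(f"job {job!r} failed; dependents not submitted")
            sorter.done(job)  # unlocks jobs that were waiting on this one
    return submitted


if __name__ == "__main__":
    # Hypothetical job group: two post-processing jobs depend on one simulation.
    job_group = {
        "preprocess": set(),
        "simulate": {"preprocess"},
        "analyze": {"simulate"},
        "visualize": {"simulate"},
    }
    print(orchestrate(job_group, lambda job: True))
```

With this approach the user declares the whole job group once, and the orchestrator, rather than the user, waits for each prerequisite to finish. A production service would presumably poll the scheduler asynchronously instead of blocking on each job.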
Wu Can; Xiao Haili; Wang Xiaoning; Lu Shasha; He Rong
Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083 (all authors)
high performance computing environment; job group; job dependency; intelligent task orchestration
Frontiers of Data and Computing (《数据与计算发展前沿》), 2025 (1)
Pages 99-107 (9 pages)
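National Key Research and Development Program of China (2023YFB3002302); Computer Network Information Center, Chinese Academy of Sciences project "Research on Intelligent Task Orchestration Architecture for Domestic Heterogeneous Supercomputers"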