Journal of Xidian University (Natural Science Edition), 2025, Vol. 52, Issue 1: 181-195 (15 pages). DOI: 10.19665/j.issn1001-2400.20241005
Multi-workflow fault-tolerant scheduling strategy for WaaS platforms
(Original title: 面向WaaS平台的多工作流容错调度策略)
Abstract
As scientific computation grows more complex, workflows have become an essential model for automating it. Workflow as a Service (WaaS) platforms rent virtual machines from Infrastructure as a Service (IaaS) providers in order to run scientific workflows on behalf of users. However, existing research on workflow scheduling for WaaS platforms considers neither task failures caused by virtual machine downtime nor delays in virtual machine provisioning. To address this gap, this paper proposes a multi-workflow fault-tolerant scheduling strategy for WaaS platforms. First, because WaaS platforms do not schedule hardware resources but operate at the level of virtual machines and containers, we establish a workflow scheduling model suited to WaaS platforms that accounts for the impact of virtual machine provisioning delays on scheduling. Second, we propose the scheduling strategy itself, which consists of four parts: preprocessing, fault-tolerance selection, task scheduling, and resource adjustment. Specifically, an improved deadline division algorithm determines the scheduling order; a fault-tolerance selection algorithm combines replication and resubmission; virtual machine selection and task allocation take task attributes and virtual machine provisioning delays into account; and a resource adjustment algorithm deploys resources in advance for upcoming tasks so that they do not wait on the provisioning delay of virtual machines or containers. Finally, we compare the proposed strategy with other algorithms under different virtual machine downtime probabilities, workloads, and deadlines, and the results demonstrate its effectiveness for WaaS platforms.
Keywords
multi-workflow / fault-tolerant scheduling algorithm / WaaS platforms / resource provisioning delay
Classification
Information technology and security science
Cite this article
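Zhi Wentao, Zhao Hui, Meng Fanxin, Wang Jing, Wan Bo, Wang Quan. Multi-workflow fault-tolerant scheduling strategy for WaaS platforms [J]. Journal of Xidian University (Natural Science Edition), 2025, 52(1): 181-195 (15 pages).
Funding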
Key Research and Development Program of Shaanxi Province (2024GX-YBXM-010, 2024GX-YBXM-140, 2024GX-YBXM-039)
Innovation Capability Support Program of Shaanxi Province (2023-CX-TD-08)
Qinchuangyuan "Scientist + Engineer" Team of Shaanxi Province (2023KXJ-040)
Fundamental Research Funds for the Central Universities (ZYTS24089)