运筹与管理2025,Vol.34Issue(7):1-8,8.DOI:10.12005/orms.2025.0200
多任务并行云工作流调度模型与算法
Model and Algorithm for Cloud Workflows Scheduling Considering Multi-task Parallelism
摘要
Abstract
Workflow is a series of related tasks connected by control flow and data dependencies.Many large-scale scientific computing problems in various fields can be represented as workflow models.These workflow models involve complex computational tasks,increasing data volume and computational requirements,which cannot be met by a single computer.With the rapid development of cloud computing technology and the increasing number of large-scale scientific computing problems,the scheduling problem of workflows in cloud computing environments has gained widespread attention.Developing scheduling strategies to map tasks in workflows to cloud resource nodes is an NP-hard problem,and the quality of scheduling algorithms affects the performance of the scheduling system,resources utilization,and user satisfaction.Therefore,studying the workflow scheduling problem in cloud computing environments is of significant importance. Existing literature has extensively studied the workflow scheduling problem under different user service quality requirements,different constraint conditions,and specific application scenarios and optimization objectives.However,most studies have not considered the parallel execution of tasks on the same virtual machine and have ignored the matching degree between task characteristics and resources diversity.To address the shortcomings in existing research,this paper studies the cloud workflows scheduling considering multi-task parallelism.Specifically,it aims to minimize the total cost while meeting the deadline requirements of workflows.The proposed approach also considers the parallel execution of multiple independent tasks on the same virtual machine to reduce data transmission between virtual machines.Finally,the matching degree between task characteristics and virtual machine resource diversity is taken into account to improve processing speed and resource utilization. Firstly,this paper constructs a mathematical model for the cloud workflows scheduling considering multi-task parallelism.Then,it designs a simulated annealing algorithm based on destruction and reconstruction to solve this model to achieve efficient scheduling and cost optimization of workflows.The simulated annealing algorithm first uses a greedy algorithm to generate an initial feasible solution.Then,starting from the initial solution,it generates neighborhood solutions through destruction and reconstruction operations.It uses the Metropolis criterion in simulated annealing to determine the acceptance rule for neighborhood solutions.If the newly generated solution is better than the current solution,the new solution is accepted.Otherwise,it accepts a worse solution with a certain probability,which helps to jump out of local optima to some extent.The algorithm repeats the destruction,reconstruction,and solution update operations until the pre-set solving time is reached,and then terminates the algorithm.Finally,the best solution obtained in the iteration process is taken as the approximate optimal solution. To verify the effectiveness of the proposed algorithm,simulation experiments are conducted using the cloud simulation platform CloudSim.CyberShake,Epigenomics,Inspiral,and Sipht workflows with different scales,released by the Pegasus project,are used as test cases,and the CPU and memory configuration requirements of tasks in different types of workflows are determined based on literature.Besides,considering that the deadline decomposition algorithm is an effective algorithm for solving cloud workflow scheduling problems in the current literature,this paper designs a deadline decomposition algorithm for this problem and compares it with the simulated annealing algorithm through experiments.The deadline decomposition algorithm first divides tasks in the workflow into levels,grouping tasks of different levels into different task sets.Next,the algorithm uses the deadline decomposition strategy to set appropriate sub-deadlines for each task set.After that,it prioritizes the tasks and generates a task scheduling list that meets the pre and post dependency relationships.Finally,taking into account both time and cost factors,the algorithm assigns virtual machines to tasks in different sets in sequence. The initial temperature value is an important factor that affects the global search performance of the simulated annealing algorithm,while the cooling rate and iteration number are closely related to the solving performance of the simulated annealing algorithm.Therefore,this paper first determines the optimal parameter values for the initial temperature,cooling rate,and iteration number through experiments.Then,the algorithm performance of the deadline decomposition algorithm and the simulated annealing algorithm under the optimal parameter values are compared.The experimental results show that the simulated annealing algorithm based on destruction and reconstruction designed in this paper has better solving performance than the deadline decomposi-tion algorithm.But the algorithm's solving time is relatively long,and stability needs to be improved.Overall,the proposed algorithm can effectively help cloud service providers develop workflow scheduling plans,improve resources utilization and user service quality,and maximize the interests of both cloud service providers and users.关键词
云计算/工作流调度/模拟退火/期限分解Key words
cloud computing/workflow scheduling/simulated annealing/deadline division分类
管理科学引用本文复制引用
王阳,王洪普,范琼瑜,刘海潮..多任务并行云工作流调度模型与算法[J].运筹与管理,2025,34(7):1-8,8.基金项目
国家自然科学基金资助项目(71971172,72371200) (71971172,72371200)
陕西省社会科学基金资助项目(2019S051) (2019S051)
西北工业大学特色文科发展计划——青年创新能力培养项目(23GH030611) (23GH030611)
西北工业大学博士论文创新基金资助项目(CX2022057) (CX2022057)