D3DQN-CAA: A DRL-Based Adaptive Edge Computing Task Scheduling Method
To address the problems of fixed action-space exploration, low sample efficiency, high memory demand, and poor stability in existing deep-reinforcement-learning-based edge computing task scheduling, and to schedule tasks effectively in edge computing systems with relatively limited computing resources, an adaptive task scheduling method, D3DQN-CAA, is proposed on the basis of an improved deep reinforcement learning model, D3DQN (Dueling Double DQN). In the task offloading decision, the mapping between tasks and processors is treated as a multidimensional knapsack problem, and the computing node that best matches the current task is selected according to the state information of the task and the computing nodes. To improve the parameter-update efficiency of the evaluation network and reduce the influence of overestimation, a comprehensive Q-value calculation method is proposed. To further accelerate the convergence of the neural network, an adaptive dynamic exploration-degree adjustment strategy for the action space is proposed. To reduce the storage resources required by the system and improve sample efficiency, an adaptive lightweight prioritized replay mechanism is proposed. Experimental results show that, compared with multiple baseline algorithms, D3DQN-CAA effectively reduces the number of training steps of the deep reinforcement learning network, makes full use of edge computing resources to improve the real-time performance of task processing, and lowers system energy consumption.
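The offloading decision above treats the task-to-processor mapping as a multidimensional knapsack problem: a task consumes several node resources at once, and only nodes with enough remaining capacity in every dimension are feasible. Below is a minimal greedy sketch of such a feasibility-plus-matching check; the resource vectors and the matching score are hypothetical stand-ins, since the abstract does not give the paper's actual matching-degree formula.

```python
import numpy as np

def best_node(task_demand, node_capacity, node_speed):
    """Pick the feasible node with the highest matching score.

    task_demand:   (d,)   resource demand of the task (e.g. CPU, memory, bandwidth)
    node_capacity: (n, d) remaining capacity of each node
    node_speed:    (n,)   processing speed, used here as part of a stand-in score

    Returns the index of the chosen node, or None if no node can hold the
    task in every resource dimension (the knapsack feasibility check).
    """
    feasible = np.all(node_capacity >= task_demand, axis=1)
    if not feasible.any():
        return None
    # Hypothetical matching score: prefer fast nodes with capacity to spare.
    slack = (node_capacity - task_demand).sum(axis=1)
    score = np.where(feasible, node_speed + 0.1 * slack, -np.inf)
    return int(np.argmax(score))
```

A real scheduler would plug the DRL policy's output into this selection step; the greedy score here only illustrates the feasibility structure of the knapsack view.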
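The three DRL-side improvements can likewise be pictured with small stand-ins: a "comprehensive" Q-target that blends the Double-DQN estimate with the plain target-net maximum, a TD-error-driven epsilon schedule standing in for the adaptive exploration-degree adjustment, and a ring-buffer prioritized replay that keeps one priority scalar per transition to stay lightweight. The blend weight `beta`, the epsilon update rule, and the buffer layout are illustrative assumptions, not the paper's formulas.

```python
import numpy as np

def comprehensive_target(q_eval_next, q_target_next, reward, gamma, done, beta=0.5):
    """Blend the Double-DQN target with the plain target-net maximum.

    q_eval_next / q_target_next: Q-value vectors for the next state from the
    evaluation and target networks. `beta` weights the two estimates (an
    illustrative assumption about the paper's 'comprehensive' Q-value).
    """
    a_star = int(np.argmax(q_eval_next))       # action picked by eval net (Double DQN)
    double_q = q_target_next[a_star]           # evaluated by target net -> less overestimation
    max_q = float(np.max(q_target_next))       # classic DQN target component
    mixed = beta * double_q + (1.0 - beta) * max_q
    return reward + (0.0 if done else gamma * mixed)

def adapt_epsilon(epsilon, recent_td_error, low=0.01, high=1.0, k=0.05):
    """Raise exploration while TD error is large, shrink it as learning
    stabilises -- a hypothetical stand-in for the adaptive exploration-degree
    adjustment strategy."""
    target = np.clip(recent_td_error, 0.0, 1.0)
    return float(np.clip(epsilon + k * (target - epsilon), low, high))

class LightweightPrioritizedReplay:
    """Fixed-size ring buffer with proportional prioritised sampling.

    Stores transitions plus one float priority each instead of a full
    sum-tree, as one plausible way to keep the replay memory small."""
    def __init__(self, capacity, alpha=0.6, seed=0):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.prios = [], np.zeros(capacity, dtype=np.float32)
        self.pos, self.rng = 0, np.random.default_rng(seed)

    def push(self, transition):
        max_p = self.prios[:len(self.data)].max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.prios[self.pos] = max_p             # new samples start at max priority
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        p = self.prios[:len(self.data)] ** self.alpha
        p /= p.sum()
        idx = self.rng.choice(len(self.data), size=batch_size, p=p)
        return idx, [self.data[i] for i in idx]

    def update(self, idx, td_errors, eps=1e-3):
        self.prios[idx] = np.abs(td_errors) + eps  # re-prioritise by new TD error
```

In a training loop one would sample a batch from the buffer, build targets with comprehensive_target, write the resulting TD errors back via update, and feed their moving average to adapt_epsilon.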
Ju Tao; Wang Zhiqiang; Liu Shuai; Huo Jiuyuan; Li Qinan
School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China
Electronic Information Engineering
edge computing; task scheduling; deep Q-learning; deep reinforcement learning
Journal of Hunan University (Natural Sciences), 2024, No. 6
Pages 73-85 (13 pages)
National Natural Science Foundation of China (61862037, 62262038); Science and Technology Project of Gansu Province (23CXGA0028); Natural Science Foundation of Gansu Province (22JR5RA356).