Joint Optimization of UAV Task Offloading and Charging
In harsh outdoor environments, unmanned aerial vehicles (UAVs), with their flexibility and convenience, can assist in carrying user tasks to edge servers via wireless data transmission. However, UAV flight platforms struggle to provide long-duration task offloading services, which greatly limits their application prospects. This paper studies how to effectively integrate UAV task offloading and charging scheduling in a mobile edge computing environment. First, a new application model is constructed that jointly handles UAV task offloading scheduling and the UAV's own charging needs, adding several wireless charging platforms to the UAV-assisted task offloading scenario. Second, the value of user tasks and the UAV's charging needs are considered in order to optimize the benefit of UAV-assisted task offloading for user devices under delay-sensitive and energy-constrained conditions. Finally, a deep reinforcement learning algorithm is adopted: the deep Q-network (DQN) is tuned to form the Fixed DQN algorithm, which effectively handles the large-scale state-action search space in the model. Under the premise that the UAV serves only as a task carrier and must manage its own charging, the feasibility of the Fixed DQN algorithm is verified in a region with a radius of 3 000 m containing 11 nodes, and its performance is evaluated against the ant colony algorithm, the genetic algorithm, and the DQN algorithm under different numbers of user nodes, numbers of charging nodes, and service times. Experimental results show that the proposed Fixed DQN algorithm significantly outperforms the ant colony, genetic, and DQN algorithms under all test conditions, especially as the number of nodes and the service time increase; moreover, the improvement of Fixed DQN over DQN highlights the effectiveness of deep reinforcement learning parameter tuning. The results confirm the efficiency of the Fixed DQN algorithm and the importance of the parameter-tuning strategy in solving the UAV task offloading and charging scheduling problem.
In harsh outdoor environments, unmanned aerial vehicles (UAVs), known for their flexibility and convenience, can be utilized to carry user tasks to edge servers through wireless data transmission. However, UAV flight platforms struggle to provide long-duration task offloading services, which significantly limits their application prospects. This study investigated how to effectively integrate UAV task offloading and charging scheduling in a mobile edge computing environment. Firstly, a new application model was constructed that cohesively manages UAV task offloading scheduling and the UAV's own charging needs, incorporating several wireless charging platforms into the UAV-assisted task offloading scenario. These platforms enable UAVs to recharge autonomously during task execution, providing automated charging services without human intervention. A UAV independently decides whether to proceed to the nearest charging node for power replenishment based on its current power level and its upcoming task offloading plans. However, choosing to recharge at a charging station not only incurs additional time and energy consumption for descending from cruising altitude to the charging node, but also requires accounting for the time cost of the charging process and its impact on overall task scheduling.
Secondly, the value of user tasks and the UAV's charging needs were jointly considered to optimize the benefit of UAV-assisted task offloading for user devices under delay-sensitive and energy-constrained conditions. This involves optimizing not only the UAV's flight path and task allocation but also its charging schedule, ensuring sufficient charging and efficient operation while executing tasks. Such a cooperative scheduling strategy enables the UAV to maximize the processing of user tasks while maintaining the energy necessary for operation, thereby enhancing the performance of the entire mobile edge computing system.
Finally, a deep reinforcement learning algorithm was employed: the deep Q-network (DQN) was fine-tuned to form the Fixed DQN algorithm, which effectively addresses the large-scale state-action search space in the model. This approach handles complex decision-making problems and supports effective learning and optimization across a wide state space. With the deep learning framework, the algorithm processes high-dimensional input data and makes accurate offloading and charging decisions in various dynamic environments, which is important for improving the efficiency and effectiveness of UAV task offloading and charging scheduling. The design of the algorithm comprehensively considers the following key aspects. First, the state space and action space are defined, ensuring that the agent can accurately perceive the environment and make effective decisions. Next, the composition of the reward function is detailed, guiding the agent toward the desired goal during training. Using only the maximization of task offloading benefit as the objective was found to prevent the agent from satisfying the condition of serving each user at least once; therefore, a method of minor learning-goal constraints is proposed, in which the task offloading rewards accumulated before the minor learning goals are completed are not awarded directly, so that the agent does not deviate from the path toward achieving these goals. An experience replay mechanism is then introduced, which improves learning efficiency and reduces correlations between samples by storing and reusing past experiences. In addition, two asynchronously updated neural networks are employed to stabilize the learning process. On this basis, the hyperparameters of the Fixed DQN algorithm are carefully optimized to further enhance its performance.
Most existing research assumes that UAVs possess certain task processing capabilities. A different assumption is adopted in this paper: the primary role of the UAV is only to carry tasks, not to participate directly in task processing, and the UAV's autonomous charging needs are also considered. This assumption is closer to actual application scenarios, where UAVs are primarily used for data collection and transmission rather than data processing, and where the limits of UAV endurance and the need for charging during task execution must be taken into account. In the study, 11 nodes were set up within a circular area with a radius of 3 000 m as a test environment to verify the feasibility of the Fixed DQN algorithm. To comprehensively evaluate the performance of the proposed Fixed DQN algorithm, extensive experiments were then conducted under various conditions, including different numbers of user nodes, different numbers of charging nodes, and varying lengths of service time, with the ant colony algorithm, the genetic algorithm, and the DQN algorithm used for comparison. In this way, the effectiveness of the Fixed DQN algorithm in different scenarios, especially complex and dynamically changing environments, was explored in depth. The experimental results showed that under all test conditions the Fixed DQN algorithm significantly outperformed the ant colony algorithm, the genetic algorithm, and the DQN algorithm, particularly in scenarios with more nodes and longer service times. Furthermore, the performance improvement of Fixed DQN over DQN highlights the effectiveness of deep reinforcement learning parameter tuning. These findings confirm the efficiency of the Fixed DQN algorithm and the importance of parameter-tuning strategies in addressing the UAV task offloading and charging scheduling problem.
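To make the application model above concrete, the following minimal Python sketch illustrates the bookkeeping a UAV performs when deciding between serving a user node and descending to a wireless charging platform. The class names, cost parameters, and numeric values are illustrative assumptions, not the paper's model.

```python
# Illustrative sketch of the application model: a UAV either visits a user node
# (collecting that task's offloading value) or a charging node (paying descent
# and charging time to refill its battery). All costs are assumed placeholders.

import math
from dataclasses import dataclass


@dataclass
class Node:
    x: float
    y: float
    kind: str            # "user" or "charge"
    task_value: float = 0.0


@dataclass
class UAV:
    x: float = 0.0
    y: float = 0.0
    energy: float = 100.0
    clock: float = 0.0   # elapsed service time in seconds

    def fly_to(self, node: Node, speed: float = 15.0, energy_per_m: float = 0.01) -> None:
        """Cruise to a node, accumulating flight time and energy consumption."""
        d = math.hypot(node.x - self.x, node.y - self.y)
        self.clock += d / speed
        self.energy -= d * energy_per_m
        self.x, self.y = node.x, node.y

    def serve(self, node: Node, tx_time: float = 20.0, tx_energy: float = 2.0) -> float:
        """Relay the user's task to the edge server; returns the offloading benefit."""
        self.clock += tx_time
        self.energy -= tx_energy
        return node.task_value

    def recharge(self, descend_time: float = 60.0, charge_time: float = 300.0) -> None:
        """Descend from cruising altitude to the wireless charging platform and refill."""
        self.clock += descend_time + charge_time   # extra time cost of choosing to recharge
        self.energy = 100.0                        # back to a full battery
```

The delay-sensitive and energy-constrained objective described in the abstract would then be expressed over the `clock` and `energy` fields accumulated by such a trajectory.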
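The minor learning-goal constraint described above can be illustrated with a small, hedged sketch: task offloading rewards earned before every user has been served at least once are banked rather than paid out immediately, so the agent is not pulled away from completing the minor goal. The state fields and the exact reward split below are assumptions made for illustration, not the authors' formulation.

```python
# Hedged sketch of the "minor learning-goal" reward shaping described in the abstract.
# `served_users`, `banked_offload_reward`, and the lump-sum release rule are assumed.

from dataclasses import dataclass, field
from typing import Set


@dataclass
class EpisodeState:
    num_users: int                      # user nodes that must each be served at least once
    served_users: Set[int] = field(default_factory=set)
    banked_offload_reward: float = 0.0  # offloading rewards withheld until the minor goal is met


def shaped_reward(state: EpisodeState, user_id: int, offload_benefit: float,
                  step_cost: float) -> float:
    """Return the reward for one offloading action under the minor-goal constraint."""
    state.served_users.add(user_id)
    goal_met = len(state.served_users) == state.num_users

    if not goal_met:
        state.banked_offload_reward += offload_benefit
        return -step_cost                      # only the movement/energy cost is felt now
    # Once the minor goal is completed, release the banked benefit in one lump sum.
    released = state.banked_offload_reward + offload_benefit
    state.banked_offload_reward = 0.0
    return released - step_cost


if __name__ == "__main__":
    s = EpisodeState(num_users=3)
    print(shaped_reward(s, user_id=0, offload_benefit=5.0, step_cost=1.0))  # -1.0 (banked)
    print(shaped_reward(s, user_id=1, offload_benefit=4.0, step_cost=1.0))  # -1.0 (banked)
    print(shaped_reward(s, user_id=2, offload_benefit=6.0, step_cost=1.0))  # 14.0 (released)
```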
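Similarly, the two core ingredients of the Fixed DQN algorithm named in the abstract, experience replay and two asynchronously updated networks (an online Q-network and a periodically synchronized target network), can be sketched as follows. The use of PyTorch, the network architecture, the state and action dimensions, and the hyperparameters are placeholders, not the tuned values reported in the paper.

```python
# Minimal sketch of a Fixed DQN update: sample past transitions from a replay
# buffer and bootstrap targets from a frozen, periodically synchronized network.

import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM, NUM_ACTIONS = 16, 12          # e.g. UAV position/energy features, node choices (assumed)
GAMMA, LR, SYNC_EVERY = 0.95, 1e-3, 200  # placeholder hyperparameters


def make_q_net() -> nn.Module:
    return nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                         nn.Linear(128, 128), nn.ReLU(),
                         nn.Linear(128, NUM_ACTIONS))


online_net, target_net = make_q_net(), make_q_net()
target_net.load_state_dict(online_net.state_dict())   # start synchronized
optimizer = torch.optim.Adam(online_net.parameters(), lr=LR)
replay = deque(maxlen=10_000)                          # experience replay buffer


def train_step(batch_size: int = 64) -> None:
    """One DQN update: sample stored transitions, bootstrap from the frozen target net."""
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)          # breaks correlation between samples
    s, a, r, s2, done = (torch.as_tensor(x, dtype=torch.float32) for x in zip(*batch))
    q = online_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():                               # target net is not updated by backprop
        target = r + GAMMA * (1.0 - done) * target_net(s2).max(dim=1).values
    loss = nn.functional.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()


def maybe_sync_target(step: int) -> None:
    """Asynchronous target update: copy online weights every SYNC_EVERY steps."""
    if step % SYNC_EVERY == 0:
        target_net.load_state_dict(online_net.state_dict())
```

In a full training loop, each environment transition (state, action, reward, next state, done flag) would be appended to `replay`, `train_step` called at every step, and `maybe_sync_target` called with the global step counter.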
He Han; Liu Peng; Zhao Liang; Wang Qingshan
School of Computer Science, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China; School of Computer Science, Shenyang Aerospace University, Shenyang 110136, Liaoning, China; School of Mathematics, Hefei University of Technology, Hefei 230009, Anhui, China
Computer Science and Automation
edge computing; UAV; task offloading; reinforcement learning; charging scheduling
《工程科学与技术》(Advanced Engineering Sciences), 2024(001)
99-109 (11 pages)
General Program of the National Natural Science Foundation of China (62172134)