工程科学学报, 2026, Vol. 48, Issue (1): 142-151, 10. DOI: 10.13374/j.issn2095-9389.2025.01.20.001

Fast task allocation for heterogeneous UAVs employing improved deep Q-network

WANG Yuehai (1), QIU Guoshuai (2), XING Na (2), ZHAO Xinyi (3), WANG Jie (4), HAN Xi (2)

Author information

  • 1. School of Artificial Intelligence and Computer Science, North China University of Technology, Beijing 100144, China; Beijing Open University, Beijing 100081, China
  • 2. School of Artificial Intelligence and Computer Science, North China University of Technology, Beijing 100144, China
  • 3. Sino-European Institute of Aviation Engineering, Civil Aviation University of China, Tianjin 300300, China
  • 4. School of Intelligence Science and Technology, University of Science and Technology Beijing, Beijing 100083, China

Abstract

The rapid advancement of unmanned aerial vehicle (UAV) technology has underscored the significant potential of multi-UAV systems in managing complex tasks. Efficient task allocation strategies are crucial for enhancing the overall performance of these systems. Although conventional methods perform adequately in simple environments, they often struggle in more complex scenarios where environmental disturbances and resource constraints hinder their effectiveness, resulting in suboptimal task allocation outcomes. By contrast, reinforcement learning (RL), as a powerful optimization technique, is particularly suitable for addressing the challenges inherent in multi-UAV task allocation. Unlike conventional approaches, RL does not rely on predefined models or external knowledge, enabling the system to learn optimal strategies via continuous interactions with the environment. This flexibility enables the system to adapt to dynamic conditions and improve its decision making over time. This study proposes an approach based on deep reinforcement learning to address the challenges encountered in multi-UAV task allocation, with specific consideration given to the uncertainties typically prevalent in real-world battlefield scenarios. These uncertainties include variable wind conditions, precipitation, and other environmental factors that can potentially affect UAV performance. The primary objective of this study is to ensure that multi-UAV systems can respond rapidly to multiple simultaneous tasks while optimizing resource utilization. Traditional task allocation methods, which are often heuristic or rule-based, lack the flexibility required to handle environmental complexity or dynamic changes. They are typically rigid and struggle to adapt to unanticipated situations, which results in inefficiencies and delays in task allocation. To address these challenges, this study modeled the task allocation problem as a Markov decision process. In this framework, the system can select the most appropriate task allocation strategy based on the current state of the environment, ensuring flexibility and timeliness in decision making. To enhance the stability and robustness of the model, an evaluation network and a target network were designed in tandem to ensure reliable learning. By separating the state and advantage values, the model effectively reduces the noise introduced by action selection, resulting in more accurate predictions and enhanced decision making. In addition, this study introduces a prioritized experience replay module that ranks the importance of each experience sample based on its temporal difference error, thereby prioritizing the most useful experiences for learning. This approach enables the model to focus on more informative samples, thereby accelerating the learning process and improving algorithm efficiency. By addressing the inefficiencies of traditional experience replay methods, which often reuse low-value samples, this technique ensures a more efficient use of the available training time. Moreover, this study employed neural network approximation techniques to reduce the computational demands of online learning, which is particularly important in real-time applications with limited processing power. Experimental results demonstrate that the proposed method substantially reduces resource waste in UAV task scheduling. On average, each UAV assignment is completed in just 0.24 s, indicating substantial improvement in task allocation efficiency. The proposed algorithm outperforms traditional methods in efficiency as well as in convergence speed and stability, owing to the prioritized experience replay module. Furthermore, the scalability of the algorithm was validated via simulations involving larger UAV fleets, where performance remained robust without degradation. Additional simulation tests confirmed that the proposed method can optimize resource allocation, reduce system interference, and accelerate convergence. In conclusion, the proposed method offers significant improvements in multi-UAV system task allocation, particularly in terms of task allocation efficiency and system adaptability.
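
The two mechanisms named in the abstract, separating state and advantage values and ranking replay samples by temporal difference error, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the mean-subtracted aggregation and the proportional priority rule with exponent `alpha` and offset `eps` are the common textbook forms and are assumed here, and the names `dueling_q_values` and `PrioritizedReplay` are hypothetical. Importance-sampling corrections and the networks themselves are omitted.

```python
import random


def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a).

    Assumed aggregation: Q = V + A - mean(A), which keeps V and A identifiable.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


class PrioritizedReplay:
    """Proportional prioritized replay (assumed form, not taken from the paper).

    Each transition gets priority (|td_error| + eps) ** alpha and is sampled
    with probability proportional to that priority.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha
        self.eps = eps
        self.samples = []
        self.priorities = []
        self.next_slot = 0  # slot that the next add() writes or overwrites

    def add(self, transition, td_error):
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.samples) < self.capacity:
            self.samples.append(transition)
            self.priorities.append(priority)
        else:
            # Buffer is full: overwrite the oldest entry (ring buffer).
            self.samples[self.next_slot] = transition
            self.priorities[self.next_slot] = priority
        self.next_slot = (self.next_slot + 1) % self.capacity

    def probabilities(self):
        total = sum(self.priorities)
        return [p / total for p in self.priorities]

    def sample(self, batch_size, rng=random):
        # Draw indices with replacement, weighted by priority.
        indices = rng.choices(
            range(len(self.samples)), weights=self.priorities, k=batch_size
        )
        return indices, [self.samples[i] for i in indices]
```

With `alpha=1.0` and `eps=0.0`, a transition stored with a temporal difference error of 3.0 is drawn three times as often, on average, as one stored with an error of 1.0, which is the sense in which high-error samples are prioritized.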

Key words

UAV swarm / task allocation / reinforcement learning / deep Q-network / Markov decision process

Classification

Information technology and security science

Cite this article

WANG Yuehai, QIU Guoshuai, XING Na, ZHAO Xinyi, WANG Jie, HAN Xi. Fast task allocation for heterogeneous UAVs employing improved deep Q-network [J]. 工程科学学报, 2026, 48(1): 142-151, 10.

Funding

National Natural Science Foundation of China (62403009, 62473042)

Scientific Research Program of the Beijing Municipal Education Commission (KM202310009001)

工程科学学报

ISSN 2095-9389
