A DDPG-based Optimization Method for Multi-partition Task Assignment of IMA

OA | CSTPCD

Integrated modular avionics (IMA) systems integrate multiple avionics functions on a shared resource platform through a spatio-temporal partitioning mechanism, and the quality of the task assignment method across partitions determines the overall effectiveness of the avionics system. To address the assignment and scheduling of avionics task sets within multiple partitions, an optimization method based on deep reinforcement learning (DRL) is proposed. An avionics system model and a task model are constructed; with system resource limits and task real-time requirements as constraints and improved system resource utilization as the optimization objective, the task assignment process is described as a sequential decision problem. A Markov decision model is introduced, an IMA task assignment model based on the deep deterministic policy gradient (DDPG) method is established, and a generic assignment architecture is proposed. Policy training techniques such as state normalization and behavior noise are introduced to improve the learning performance and training capability of the DDPG algorithm. Simulation results show that the proposed optimization algorithm begins to converge at 500 iterations, and that after 800 iterations the task residency scheme across partitions satisfies the constraints while improving the minimum processing efficiency by 20.55%. Compared with the traditional assignment scheme and the Actor-Critic (AC) algorithm, the proposed DDPG algorithm shows significant advantages in convergence ability, optimization performance, and stability.
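The abstract names state normalization and behavior noise as training techniques but does not give their form. The following is a minimal sketch of what such helpers could look like, not the authors' implementation. It assumes a running mean/variance normalizer, Gaussian exploration noise clipped to the action range, and a uniform mapping from a continuous action to a partition index. All names (`RunningNormalizer`, `noisy_action`, `action_to_partition`) and the action range [-1, 1] are hypothetical, and the actor and critic networks are omitted.

```python
import math
import random


class RunningNormalizer:
    """Tracks a running mean/variance per state dimension (Welford's method).

    Assumption: the paper's "state normalization" is not specified in the
    abstract; this is one common way to do it.
    """

    def __init__(self, dim, eps=1e-8):
        self.n = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations
        self.eps = eps

    def update(self, state):
        self.n += 1
        for i, x in enumerate(state):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (x - self.mean[i])

    def normalize(self, state):
        # With fewer than two samples the variance is undefined; return zeros.
        if self.n < 2:
            return [0.0] * len(state)
        return [
            (x - m) / math.sqrt(m2 / self.n + self.eps)
            for x, m, m2 in zip(state, self.mean, self.m2)
        ]


def noisy_action(action, sigma, rng, low=-1.0, high=1.0):
    """Adds Gaussian exploration noise to a deterministic action, then clips.

    Assumption: Gaussian noise stands in for the unspecified "behavior noise".
    """
    return max(low, min(high, action + rng.gauss(0.0, sigma)))


def action_to_partition(action, num_partitions, low=-1.0, high=1.0):
    """Maps a continuous action in [low, high] to a partition index."""
    frac = (action - low) / (high - low)
    return min(num_partitions - 1, int(frac * num_partitions))
```

In a training loop under these assumptions, each observed state would pass through `update` and `normalize` before reaching the actor, and the actor's output would pass through `noisy_action` and `action_to_partition` to select the partition for the next task.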

ZHAO Changxiao (赵长啸); LI Daojun (李道俊); WANG Penghui (汪鹏辉); TIAN Yi (田毅)

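College of Safety Science and Engineering, Civil Aviation University of China, Tianjin 300300; Key Laboratory of Civil Aircraft Airworthiness Certification Technology, Tianjin 300300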

Keywords: integrated modular avionics (IMA); task allocation and scheduling; deep reinforcement learning; DDPG algorithm

Telecommunication Engineering (《电讯技术》), 2024, No. 1

Pages 58-66 (9 pages)

Funding: National Key R&D Program of China (2021YFB1600601); Natural Science Foundation of Tianjin (21JCQNJC00900)

DOI: 10.20079/j.issn.1001-893x.230103001
