Target Tracking Radiation Method and Design Based on Multi-Agent Reinforcement Learning

OA; CSTPCD

To address the maximum transmit power limit of a single-platform microwave transmitter in distributed spatial power synthesis, this paper proposes a path planning method for microwave transmitters based on Friend-Q multi-agent reinforcement learning, with the goal of holding a radiation intensity of 10 mW/cm² to 15 mW/cm² on the target for 4 min or longer. The method balances exploration and exploitation with a variable ε-greedy strategy and introduces a selective input power scheme to reduce system energy consumption. Models were trained on three representative simulation scenarios. The experimental results show that: (1) compared with the dispersed-and-distant scenario and the single-approach scenario, the path-combination scenario raises the success rate by 55.7% and 120.9%, respectively, confirming that a well-chosen arrangement of the microwave radiation sources can greatly improve the model's success rate; (2) compared with a model using a random policy, the model trained with multi-agent reinforcement learning raises the success rate in the three scenarios by 48.8%, 72%, and 41.8%, respectively, which further supports the effectiveness of the algorithm.
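The abstract names two ingredients, Friend-Q learning and a variable ε-greedy strategy, without giving their details. The following is a minimal tabular sketch of how the two usually fit together, not the authors' implementation. It assumes a fully cooperative setting with a shared reward, so that a single joint-action Q-table suffices. The exponential decay schedule in `epsilon_at`, the class name `FriendQ`, and all hyperparameter values are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict
from itertools import product


def epsilon_at(episode, eps_start=1.0, eps_end=0.05, decay=0.995):
    """Exploration rate that shrinks with the episode index.

    The exponential schedule is an assumption; the abstract only says
    that epsilon is variable.
    """
    return max(eps_end, eps_start * decay ** episode)


class FriendQ:
    """Tabular Friend-Q over the joint actions of cooperating agents."""

    def __init__(self, n_agents, actions, alpha=0.1, gamma=0.95, seed=0):
        # Every combination of per-agent actions is one joint action.
        self.joint_actions = list(product(actions, repeat=n_agents))
        self.alpha = alpha
        self.gamma = gamma
        self.q = defaultdict(float)  # (state, joint_action) -> value
        self.rng = random.Random(seed)

    def best_joint_action(self, state):
        return max(self.joint_actions, key=lambda a: self.q[(state, a)])

    def select(self, state, epsilon):
        # Explore with probability epsilon, otherwise act greedily.
        if self.rng.random() < epsilon:
            return self.rng.choice(self.joint_actions)
        return self.best_joint_action(state)

    def update(self, state, joint_action, reward, next_state, done):
        # Friend-Q target: the other agents are treated as friends, so the
        # next-state value is the maximum over all joint actions.
        if done:
            future = 0.0
        else:
            best_next = self.best_joint_action(next_state)
            future = self.q[(next_state, best_next)]
        key = (state, joint_action)
        self.q[key] += self.alpha * (reward + self.gamma * future - self.q[key])
```

In a training loop, one would call `select(state, epsilon_at(episode))` to pick the transmitters' joint move and `update(...)` after each environment step. The joint-action table grows exponentially with the number of agents, so this form is only practical for a small number of platforms and a coarse action set.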

CHEN Han (陈翰); ZHANG Yuanyuan (张远媛); HE Cong (何聪); ZHU Chenglei (朱城磊); ZHANG Wei (张为)

School of Instrument Science and Engineering, Southeast University, Nanjing, Jiangsu 210096, China

Subject: Electronic and Information Engineering

Keywords: multi-agent reinforcement learning; distributed space power synthesis; tracking radiation; path planning

Chinese Journal of Electron Devices (《电子器件》), 2024, No. 2

pp. 544-551 (8 pages)

Funding: 2021 National Defense Science and Technology Innovation Special Zone Project

DOI: 10.3969/j.issn.1005-9490.2024.02.038