Improvement and application of MCTS in turn-based orbital games

ZHENG Xinyu 1, ZHANG Yi 1, ZHOU Jie 1, TANG Peijia 1, PENG Shengren 1, DANG Zhaohui 2

Chinese Space Science and Technology, 2024, Vol. 44, Issue 5: 75-82 (8 pages). DOI: 10.16708/j.cnki.1000-758X.2024.0075

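Author information

  • 1. Qian Xuesen Laboratory of Space Technology, China Academy of Space Technology, Beijing 100094
  • 2. School of Astronautics, Northwestern Polytechnical University, Xi'an 710072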

Abstract

In turn-based orbital pursuit-evasion games, the delay in sensing an opponent's orbit change makes differential-game approaches difficult to apply, while algorithms based on deep reinforcement learning remain risky for engineering use because their decisions are hard to interpret. A predictive-value-accumulate Monte Carlo tree search (PVA-MCTS) algorithm is proposed for the turn-based orbital pursuit-evasion game. Exploiting the predictability of spacecraft orbital motion, the algorithm predicts and accumulates decision values during the game. This addresses the sparse rewards and long time spans of the game and improves learning efficiency. PVA-MCTS is applied to the turn-based orbital pursuit-evasion game and its results are compared with those of the Monte Carlo tree search (MCTS) algorithm. The results show that PVA-MCTS reduces the pursuit time by about 27.6% for the pursuer and increases the escape time by about 6.8% for the evader. The PVA-MCTS algorithm is of practical value for orbital-game applications such as approaching non-cooperative targets and collision avoidance.

Key words

pursuit-evasion of spacecraft / turn-based pursuit-evasion game / Monte Carlo tree search / sensing delay of orbit change / predictive value accumulation

Classification

Aerospace

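Cite this article

ZHENG Xinyu, ZHANG Yi, ZHOU Jie, TANG Peijia, PENG Shengren, DANG Zhaohui. Improvement and application of MCTS in turn-based orbital games[J]. Chinese Space Science and Technology, 2024, 44(5): 75-82.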

Funding

National Natural Science Foundation of China (12172288)

Chinese Space Science and Technology

Open access; indexed in the Peking University Core Journals list (北大核心) and CSTPCD

ISSN 1000-758X