
基于强化学习的舰船目标跟踪有限理性博弈算法研究

Chen Suxia (陈素霞), Xu Qingwen (徐清雯), Liu Jiufu (刘久富), Xie Hui (解晖), Liu Xiangwu (刘向武)

Computer Engineering and Applications, 2024, Vol. 60, Issue 20: 116-123 (8 pages). DOI: 10.3778/j.issn.1002-8331.2401-0032

Research on Bounded Rational Game Algorithm for Ship Target Tracking Based on Reinforcement Learning

Chen Suxia¹, Xu Qingwen², Liu Jiufu², Xie Hui², Liu Xiangwu²

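Author information

  • 1. Department of Computer and Art Design, Henan Light Industry Vocational College, Zhengzhou 450008
  • 2. College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106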

Abstract

Since real-world decision-makers cannot always analyze problems with perfect rationality, a pursuit-evasion game algorithm based on bounded rationality is proposed. A pursuit-evasion game model is established, and the saddle-point strategies of the two players under perfect rationality are solved first. The level-k model of bounded rationality is then introduced, which imposes a structural assumption on the levels of strategic thinking of pursuers and evaders. This allows the two parties to have different strategic reasoning abilities, and the value functions and strategies of each level are given, which satisfy the Hamilton-Jacobi-Isaacs (HJI) equation. As the level increases, the strategy eventually tends toward the Nash equilibrium. Because the HJI equation is difficult to solve directly, an actor-critic algorithm based on reinforcement learning is used to solve it. The algorithm is designed to enable pursuers to estimate the thinking level of evaders and adopt appropriate strategies. Simplifying the motion of a ship to a two-dimensional mathematical model, the paper establishes a ship pursuit-evasion game model and verifies the algorithm on it through simulation.
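The level-k idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' method: the paper works with a continuous pursuit-evasion model, the HJI equation, and an actor-critic solver, whereas this sketch assumes a small discrete zero-sum matrix game. The cost matrix `COST`, the level-0 choices, and all function names are hypothetical and chosen only for illustration. A level-k player best-responds to a level-(k-1) opponent; for this particular matrix the strategies settle on the saddle point after a few levels, which is a property of the chosen example and not a general guarantee.

```python
# Toy level-k reasoning in a zero-sum matrix game (illustrative assumption,
# not the continuous-time model or actor-critic solver used in the paper).
# COST[i][j] is the cost when the pursuer plays row i and the evader plays
# column j. The pursuer minimizes the cost, the evader maximizes it.
COST = [
    [3, 9, 8],
    [1, 6, 7],
    [4, 2, 5],
]


def pursuer_best_response(cost, evader_action):
    """Row that minimizes the cost against a fixed evader column."""
    return min(range(len(cost)), key=lambda i: cost[i][evader_action])


def evader_best_response(cost, pursuer_action):
    """Column that maximizes the cost against a fixed pursuer row."""
    row = cost[pursuer_action]
    return max(range(len(row)), key=lambda j: row[j])


def level_k_strategies(cost, max_level, pursuer_level0=0, evader_level0=0):
    """Return the actions of level-0..max_level pursuers and evaders.

    Level 0 plays a fixed naive action (a hypothetical choice here).
    Level k best-responds to the opponent's level-(k-1) action.
    """
    pursuer = [pursuer_level0]
    evader = [evader_level0]
    for k in range(1, max_level + 1):
        pursuer.append(pursuer_best_response(cost, evader[k - 1]))
        evader.append(evader_best_response(cost, pursuer[k - 1]))
    return pursuer, evader


def is_saddle_point(cost, i, j):
    """True if (i, j) is a pure saddle point: neither player gains by deviating."""
    column = [cost[r][j] for r in range(len(cost))]
    return cost[i][j] == min(column) and cost[i][j] == max(cost[i])


pursuer_levels, evader_levels = level_k_strategies(COST, max_level=4)
for level, (p, e) in enumerate(zip(pursuer_levels, evader_levels)):
    print(f"level {level}: pursuer row {p}, evader column {e}")
print("saddle point reached:",
      is_saddle_point(COST, pursuer_levels[-1], evader_levels[-1]))
```

In this example a pursuer who believes the evader reasons at level k-1 would play the level-k row, so misjudging the evader's level leads to a different action than the saddle-point one.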

Key words

pursuit-evasion game/target tracking/reinforcement learning/bounded rationality

Classification

Information Technology and Security Science

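Cite this article

Chen Suxia, Xu Qingwen, Liu Jiufu, Xie Hui, Liu Xiangwu. Research on Bounded Rational Game Algorithm for Ship Target Tracking Based on Reinforcement Learning [J]. Computer Engineering and Applications, 2024, 60(20): 116-123.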

Funding

National Natural Science Foundation of China (61473144).

Computer Engineering and Applications

Open access; Peking University Core Journal; CSTPCD

ISSN 1002-8331
