Convergence Analysis of Multistep Reinforcement Learning Algorithm (多步强化学习算法的收敛性分析)

Yang Rui (杨瑞)

Computer & Digital Engineering (计算机与数字工程), 2019, Vol. 47, Issue 7: 1582-1585, 4. DOI: 10.3969/j.issn.1672-9722.2019.07.005

Yang Rui¹

Author information

1. School of Mathematics, Tianjin University, Tianjin 300072

Abstract

Recently, a new algorithm called Q(σ) was proposed for estimating value functions in reinforcement learning, where σ is the degree of sampling. Q(σ) interpolates between full sampling and no sampling, and it unifies Sarsa and Expected Sarsa. However, the original paper only evaluated the performance of Q(σ) experimentally. This paper gives a theoretical analysis of Q(σ): it proves that, under some conditions, Q(σ) converges to the value function.
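To make the role of σ concrete, the following is a minimal sketch of a one-step Q(σ) backup target as it is commonly written in the reinforcement learning literature. It is not taken from this paper, whose own notation and conditions are not shown on this page. The function and parameter names (`q_sigma_target`, `q_sigma_update`, `q_next`, `pi_next`) are hypothetical and chosen only for illustration.

```python
def q_sigma_target(reward, gamma, q_next, pi_next, next_action, sigma):
    """One-step Q(sigma) backup target (illustrative sketch).

    q_next      -- list of action values Q(s', a), one entry per action
    pi_next     -- list of policy probabilities pi(a | s'), same length
    next_action -- index of the action actually sampled in s'
    sigma       -- degree of sampling in [0, 1]
    """
    # Fully sampled term: the value of the action actually taken (Sarsa).
    sampled = q_next[next_action]
    # Expectation term: average over the policy, no sampling (Expected Sarsa).
    expected = sum(p * q for p, q in zip(pi_next, q_next))
    # sigma mixes the two terms linearly.
    return reward + gamma * (sigma * sampled + (1.0 - sigma) * expected)


def q_sigma_update(q_value, alpha, target):
    """Move the current estimate toward the target with step size alpha."""
    return q_value + alpha * (target - q_value)
```

With `sigma = 1` the target reduces to the Sarsa target, and with `sigma = 0` it reduces to the Expected Sarsa target, which is the sense in which Q(σ) unifies the two.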

Key words

reinforcement learning / value function estimation / optimization / temporal difference

Classification

Information Technology and Security Science

Cite this article

Yang Rui. Convergence Analysis of Multistep Reinforcement Learning Algorithm [J]. Computer & Digital Engineering, 2019, 47(7): 1582-1585, 4.

Computer & Digital Engineering

OA · CSTPCD

ISSN: 1672-9722
