
Zeroth-level classifier system based on an average reward reinforcement learning algorithm

Zang Zhaoxiang, Li Zhao, Wang Junying, Dan Zhiping

Computer Engineering and Applications, 2016, Vol. 52, Issue (21): 14-20, 48, 8. DOI: 10.3778/j.issn.1002-8331.1603-0061


Zeroth-level classifier system with average reward reinforcement learning

Zang Zhaoxiang¹, Li Zhao², Wang Junying¹, Dan Zhiping²

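Author information

1. Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University, Yichang, Hubei 443002, China
2. College of Computer and Information Technology, China Three Gorges University, Yichang, Hubei 443002, China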

Abstract

As a genetics-based machine learning technique, the Zeroth-level Classifier System (ZCS) has shown promise on multi-step problems. However, the standard ZCS is based on a discounted reward reinforcement learning algorithm, which optimizes the discounted total reward received by an agent and is not suitable for all multi-step problems. Average reward reinforcement learning methods such as R-learning instead optimize the average reward per time step. In this paper, R-learning replaces the discounted reward reinforcement learning approach used by ZCS. The results show that the modified classifier system can effectively prevent overgeneralization and support long action chains, and is thus able to solve large multi-step problems.
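
To make the contrast with discounted reward concrete, the following is a minimal sketch of the tabular R-learning update in its common textbook form. It is not the paper's modified ZCS: the class name `RLearner`, the step sizes `beta` and `alpha`, and the choice to test greediness before the table is changed are all assumptions made for illustration.

```python
from collections import defaultdict


class RLearner:
    """Tabular R-learning sketch (an average-reward analogue of Q-learning).

    Hypothetical illustration only; names and step sizes are assumptions.
    """

    def __init__(self, actions, beta=0.1, alpha=0.01):
        self.actions = list(actions)
        self.beta = beta    # step size for the relative action values
        self.alpha = alpha  # step size for the average-reward estimate
        self.rho = 0.0      # running estimate of the average reward per step
        self.values = defaultdict(float)  # (state, action) -> relative value

    def best_value(self, state):
        """Largest relative value over all actions in the given state."""
        return max(self.values[(state, a)] for a in self.actions)

    def update(self, state, action, reward, next_state):
        # Assumption: greediness is checked before the table is modified.
        was_greedy = self.values[(state, action)] == self.best_value(state)
        # The reward is measured relative to rho instead of being discounted.
        target = reward - self.rho + self.best_value(next_state)
        old = self.values[(state, action)]
        self.values[(state, action)] = old + self.beta * (target - old)
        if was_greedy:
            # rho is adjusted only after greedy (non-exploratory) actions.
            self.rho += self.alpha * (
                reward - self.rho
                + self.best_value(next_state) - self.best_value(state)
            )


learner = RLearner(["left", "right"], beta=0.5, alpha=0.1)
learner.update("s0", "right", reward=1.0, next_state="s1")
```

Because no discount factor appears in the target, a reward reached at the end of a long action chain is not shrunk exponentially on its way back to early states, which is the property the abstract appeals to.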

Keywords

average reward / reinforcement learning / R-learning / Learning Classifier Systems (LCS) / Zeroth-level Classifier System (ZCS) / multi-step problems

Classification

Information technology and security science

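Cite this article

Zang Zhaoxiang, Li Zhao, Wang Junying, Dan Zhiping. Zeroth-level classifier system with average reward reinforcement learning[J]. Computer Engineering and Applications, 2016, 52(21): 14-20, 48, 8.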

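Funding

National Natural Science Foundation of China (No. 61502274); Natural Science Foundation of Hubei Province (No. 2015CFB336, No. 2014CFC1144, No. 2015CFA025); Open Fund of the Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University (No. 2015KLA08, No. 2014KLA08); Talent Research Start-up Fund of China Three Gorges University (No. KJ2013B064, No. KJ2013B063).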

Computer Engineering and Applications

OA / Peking University Core Journal / CSCD / CSTPCD

ISSN: 1002-8331
