A Hybrid Transfer Algorithm for Reinforcement Learning Based on Spectral Method (一类基于谱方法的强化学习混合迁移算法)

Zhu Meiqiang (朱美强), Cheng Yuhu (程玉虎), Li Ming (李明), Wang Xuesong (王雪松), Feng Huanting (冯涣婷)

Acta Automatica Sinica (自动化学报), 2012, Vol. 38, Issue 11: 1765-1776 (12 pages). DOI: 10.3724/SP.J.1004.2012.01765


Author information

  • 1. School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou 221116 (all five authors)

Abstract

In scaling-up state space transfer under the proto-value function framework, only the basis functions corresponding to smaller eigenvalues are transferred effectively, which leads to an inaccurate approximation of the value function in the target task. To solve this problem, and exploiting the fact that the Laplacian eigenmap preserves the local topological structure of the state space, an improved hierarchical decomposition algorithm based on spectral graph theory is proposed, and a hybrid transfer method that integrates basis function transfer with the transfer of optimal subtask policies is designed. First, the basis functions of the source task are constructed using the spectral method, and the basis functions of the target task are produced by linearly interpolating those of the source task. Second, the second basis function produced for the target task (which approximates the Fiedler eigenvector) is used to decompose the target task. Then the optimal policies of the subtasks are obtained using the improved hierarchical decomposition algorithm. Finally, the obtained basis functions and optimal subtask policies are transferred to the target task. The proposed hybrid transfer method can directly obtain the optimal policies of some states, and it reduces both the number of iterations and the minimum number of basis functions needed to approximate the value function. The method is suited to scaling-up state space transfer tasks with a hierarchical control structure. Simulation results on a grid world verify the validity of the proposed hybrid transfer method.
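The first two steps of the abstract (spectral basis construction, linear interpolation to a larger task, and decomposition by the approximate Fiedler eigenvector) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-room grid world, the use of the combinatorial Laplacian L = D - A, the separable linear interpolation scheme, and all function names are hypothetical. The improved hierarchical decomposition algorithm and the transfer of subtask policies are not shown.

```python
import numpy as np


def two_room_adjacency(height, room_width, door_row):
    """Adjacency matrix of two side-by-side rooms joined by one doorway edge.

    States are numbered row by row on a grid of shape (height, 2 * room_width).
    """
    width = 2 * room_width
    n_states = height * width
    adjacency = np.zeros((n_states, n_states))

    def index(row, col):
        return row * width + col

    for row in range(height):
        for col in range(width):
            if row + 1 < height:  # edge to the state below
                a, b = index(row, col), index(row + 1, col)
                adjacency[a, b] = adjacency[b, a] = 1.0
            if col + 1 < width:  # edge to the state on the right
                crosses_wall = col == room_width - 1
                if not crosses_wall or row == door_row:
                    a, b = index(row, col), index(row, col + 1)
                    adjacency[a, b] = adjacency[b, a] = 1.0
    return adjacency


def laplacian_basis(adjacency, n_basis):
    """Return the n_basis smoothest eigenvectors of L = D - A as columns."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)  # ascending order
    return eigenvalues[:n_basis], eigenvectors[:, :n_basis]


def interpolate_to_target(values, source_shape, target_shape):
    """Linearly interpolate one basis function onto a larger grid."""
    grid = values.reshape(source_shape)
    src_rows = np.linspace(0.0, 1.0, source_shape[0])
    src_cols = np.linspace(0.0, 1.0, source_shape[1])
    dst_rows = np.linspace(0.0, 1.0, target_shape[0])
    dst_cols = np.linspace(0.0, 1.0, target_shape[1])
    # Interpolate along each row first, then along each column.
    wide = np.array([np.interp(dst_cols, src_cols, r) for r in grid])
    tall = np.array([np.interp(dst_rows, src_rows, c) for c in wide.T]).T
    return tall.ravel()


def fiedler_partition(fiedler_vector):
    """Split states into two subtasks by the sign of the Fiedler vector."""
    return fiedler_vector >= 0.0


# Source task: two 3x3 rooms. Target task: the same layout scaled to 5x10.
source_shape, target_shape = (3, 6), (5, 10)
adjacency = two_room_adjacency(height=3, room_width=3, door_row=1)
eigenvalues, basis = laplacian_basis(adjacency, n_basis=4)

# The second basis function (index 1) is the Fiedler eigenvector.
source_labels = fiedler_partition(basis[:, 1]).reshape(source_shape)
target_fiedler = interpolate_to_target(basis[:, 1], source_shape, target_shape)
target_labels = fiedler_partition(target_fiedler).reshape(target_shape)

print(source_labels.astype(int))
print(target_labels.astype(int))
```

In this toy layout the sign of the Fiedler eigenvector separates the two rooms, and the interpolated vector induces the same left/right split on the larger grid. Which label each room receives depends on the arbitrary sign of the eigenvector returned by the solver.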

Key words

Reinforcement learning / transfer learning / spectral graph theory / proto-value functions / hierarchical decomposition

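Cite this article

Zhu Meiqiang, Cheng Yuhu, Li Ming, Wang Xuesong, Feng Huanting. A hybrid transfer algorithm for reinforcement learning based on spectral method [J]. Acta Automatica Sinica, 2012, 38(11): 1765-1776.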

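Funding

Supported by the National Natural Science Foundation of China (60974050, 61072094, 61273143), the Youth Science and Technology Foundation of China University of Mining and Technology (OC080252), the Program for New Century Excellent Talents in University of the Ministry of Education (NCET-08-0836, NCET-10-0765), and the Specialized Research Fund for the Doctoral Program of Higher Education of the Ministry of Education (20110095110016)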

Acta Automatica Sinica (自动化学报)

Open access; indexed in PKU Core Journals (北大核心), CSCD, CSTPCD

ISSN 0254-4156
