A Reinforcement Learning Data Augmentation Method Based on Conditional Generative Adversarial Networks

Computer & Digital Engineering, 2024, Vol. 52, Issue 6: 1739-1745 (7 pages). DOI: 10.3969/j.issn.1672-9722.2024.06.025

Xiang Yu (项宇)¹, Qin Jin (秦进)¹, Yuan Linlin (袁琳琳)²

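Author information

  • 1. College of Computer Science and Technology, Guizhou University, Guiyang 550025
  • 2. School of Information Engineering, Guizhou Open University, Guiyang 550023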

Abstract

Reinforcement learning has attracted growing attention for its success in sequential decision making, but it still suffers from low data efficiency when high-dimensional states are used as input. One reason is that it is difficult for an agent to extract effective features from a high-dimensional space. To improve data efficiency, this paper proposes cGDA (cGANs-based Data Augment), a data augmentation method suited to reinforcement learning tasks. Conditional generative adversarial nets (cGANs) are used to model the dynamics of the environment: the state and action at the current time step are the input to the conditional generative model, and the model outputs the state at the next time step as augmented data. During training, both real and augmented data are used to train the agent, which helps it extract useful knowledge from different data quickly. On the Atari100K benchmark, cGDA outperforms methods that use data augmentation in 16 of 26 discrete-control environments, and outperforms the approach without data augmentation in 14 environments.
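
The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the generator below is an untrained random linear map standing in for a trained cGAN generator, and every name and dimension (`ConditionalGenerator`, `augment_batch`, `STATE_DIM`, `NUM_ACTIONS`, `NOISE_DIM`) is hypothetical. The discriminator, reward handling, and the agent update are omitted because the abstract does not specify them.

```python
import random

# Hypothetical sizes for illustration only; real Atari states are image stacks.
STATE_DIM = 4
NUM_ACTIONS = 3
NOISE_DIM = 2


def one_hot(action, n=NUM_ACTIONS):
    """Encode a discrete action index as a one-hot list."""
    return [1.0 if i == action else 0.0 for i in range(n)]


class ConditionalGenerator:
    """Stand-in for a cGAN generator conditioned on (state, action).

    Assumption: an untrained random linear map replaces the trained network,
    so the outputs only demonstrate shapes and data flow.
    """

    def __init__(self, rng):
        self.rng = rng
        in_dim = STATE_DIM + NUM_ACTIONS + NOISE_DIM
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(STATE_DIM)
        ]

    def generate(self, state, action):
        """Map (state, action, noise) to a synthetic next state."""
        noise = [self.rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
        x = list(state) + one_hot(action) + noise
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]


def augment_batch(real_transitions, generator):
    """Pair every real (state, action, next_state) with a generated one."""
    batch = []
    for state, action, next_state in real_transitions:
        batch.append({"state": state, "action": action,
                      "next_state": next_state, "source": "real"})
        batch.append({"state": state, "action": action,
                      "next_state": generator.generate(state, action),
                      "source": "generated"})
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    generator = ConditionalGenerator(rng)
    real = [([0.1, 0.2, 0.3, 0.4], 1, [0.2, 0.3, 0.4, 0.5])]
    for transition in augment_batch(real, generator):
        print(transition["source"], transition["next_state"])
```

The mixed batch would then be passed to whatever agent update is in use, so that real and generated next states are seen side by side for the same state-action pair.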

Key words

reinforcement learning/data augmentation/data efficiency/conditional generative adversarial nets/Atari games

Classification

Information Technology and Security Science

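Cite this article

Xiang Yu, Qin Jin, Yuan Linlin. A Reinforcement Learning Data Augmentation Method Based on Conditional Generative Adversarial Networks [J]. Computer & Digital Engineering, 2024, 52(6): 1739-1745.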

Funding

Guizhou Provincial Science and Technology Foundation (No. 黔科合基础[2020]1Y275)

Guizhou Provincial Science and Technology Program (No. 黔科合基础[2019]1130号)

Computer & Digital Engineering (OA, CSTPCD), ISSN 1672-9722
