Acta Automatica Sinica, 2026, Vol. 52, Issue 3: 381-410, 30. DOI: 10.16383/j.aas.c250422
Overview of Visual Reinforcement Learning Methods
Abstract
Vision, as the primary means by which reinforcement learning agents perceive their environment, provides rich and detailed information that supports more complex and precise decisions. However, the high dimensionality of visual data often leads to information redundancy and low sample efficiency, posing a key challenge in the application of reinforcement learning. How to efficiently extract key visual representations from limited interaction data to enhance agents' decision-making capabilities has become a current research focus. To address this, this paper systematically reviews visual reinforcement learning methods, classifying them into five categories based on their core ideas and implementation mechanisms: image-enhanced, model-enhanced, task-assisted, knowledge-transferred, and offline visual reinforcement learning approaches. It provides an in-depth analysis of the research progress in each category, as well as the strengths and limitations of representative works. The paper also reviews four major benchmark platforms, DMControl, DMControl-GB, DCS, and RL-ViGen, and summarizes the applications of visual reinforcement learning in typical scenarios such as robotic control, autonomous driving, and multimodal large models. Finally, based on current research bottlenecks, this paper discusses future development trends and potential research directions, aiming to offer a clear technical framework and research reference for this field.
Key words
reinforcement learning / visual representation / visual reinforcement learning / agent
Cite this article
Wang Rongrong, Cheng Yuhu, Wang Xuesong. Overview of Visual Reinforcement Learning Methods [J]. Acta Automatica Sinica, 2026, 52(3): 381-410, 30.
Funding
Supported by National Natural Science Foundation of China (62373364, 62573416) and Key Research and Development Program of Jiangsu Province (BE2022095)