Bipedal robot walking control based on dynamic potential rewards

Wang Quande (王泉德), Wang Junhao (王君豪), Liu Zihang (刘子航)

Journal of Huazhong University of Science and Technology (Natural Science Edition), 2025, Vol. 53, Issue (5): 9-17, 9.

DOI: 10.13245/j.hust.250078

Dynamic potential based rewards for learning bipedal locomotion control

Wang Quande¹, Wang Junhao¹, Liu Zihang¹

Author information

1. Electronic Information School, Wuhan University, Wuhan 430072, Hubei, China

Abstract

To address insufficient exploration ability, low sample efficiency, and unstable gaits when legged robots learn to walk, dynamic potential was integrated into potential-based reward shaping, and a reward function based on dynamic potential reward shaping was proposed. During training, the reward function dynamically adjusts the reward the robot obtains for its current control action, which improves exploration. In a virtual training environment for legged robots, the proximal policy optimization (PPO) algorithm was combined with this reward calculation to achieve fixed-speed walking control of a bipedal robot. Test results show that the proposed method effectively accelerates training and that the resulting robot motion is more natural and stable.
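The abstract does not give the paper's potential function, so the following is only a minimal sketch of how a dynamic (time-varying) potential-based shaping term is commonly computed, F = γΦ(s', t+1) − Φ(s, t), and added to the environment reward. The function names, the velocity-tracking potential, the `tanh` weighting schedule, and the value of `GAMMA` are all illustrative assumptions, not the authors' design.

```python
import math

GAMMA = 0.99  # discount factor (assumed value, not from the paper)


def potential(forward_velocity, target_velocity, step):
    """Hypothetical time-varying potential Phi(s, t).

    It is larger when the robot tracks the target speed more closely.
    The weight grows with the training step, so the shaping signal
    changes over the course of training. Both choices are illustrative.
    """
    weight = 1.0 + math.tanh(step / 1e6)  # rises smoothly from 1 toward 2
    tracking_error = abs(forward_velocity - target_velocity)
    return weight * math.exp(-tracking_error)


def shaped_reward(env_reward, vel, next_vel, target_velocity, step, gamma=GAMMA):
    """Return the environment reward plus the dynamic shaping term
    F = gamma * Phi(s', t + 1) - Phi(s, t)."""
    phi_now = potential(vel, target_velocity, step)
    phi_next = potential(next_vel, target_velocity, step + 1)
    return env_reward + gamma * phi_next - phi_now


# Usage: a transition toward the target speed earns a larger shaped
# reward than a transition away from it.
closer = shaped_reward(0.0, vel=0.5, next_vel=0.9, target_velocity=1.0, step=0)
away = shaped_reward(0.0, vel=0.5, next_vel=0.1, target_velocity=1.0, step=0)
```

In a PPO training loop, a value like `shaped_reward(...)` would replace the raw per-step reward stored in the rollout buffer; how the paper actually wires this in is not stated in the abstract.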

Key words

deep reinforcement learning/bipedal locomotion control/reward shaping/dynamic potential/proximal policy optimization algorithm

Classification

Information Technology and Security Science

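Cite this article

Wang Quande, Wang Junhao, Liu Zihang. Dynamic potential based rewards for learning bipedal locomotion control[J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2025, 53(5): 9-17, 9.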

Funding

Supported by the National Natural Science Foundation of China (62061160370).

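Journal of Huazhong University of Science and Technology (Natural Science Edition)

Open access; Peking University Core Journal

ISSN: 1671-4512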
