Gait learning method of quadruped robot based on policy distillation

Author： Zhu, Xiaoqing; Wang, Tao; Ruan, Xiaogang; Chen, Jiangtao; Nan, Borui; Bi, Lanyue

Indexed by：

EI Scopus

Abstract：

Reinforcement learning algorithms, represented by soft actor-critic (SAC), have been successful in reproducing the motor skills of higher animals. The SAC framework combines policy search with a state-action value function. However, the agent's policy exploration is greedy, and the Q-value estimated by the critic network tends to be an underestimate. This paper proposes a policy distillation soft actor-critic (PDSAC) algorithm that integrates policy distillation (PD) with SAC to enable agents to adopt better policies. The algorithm lets the agent explore using hybrid policies and speeds up the convergence of the reward during reinforcement learning. To validate the proposed algorithm, the paper gives a theoretical proof that PDSAC improves the efficiency of policy exploration and verifies it on quadruped robot gait learning tasks. According to simulation results, PDSAC outperforms SAC in the gait learning task, achieving a 40% increase in convergence speed and a 26.7% improvement in reward value. © 2025 Beijing University of Aeronautics and Astronautics (BUAA). All rights reserved.
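
The abstract does not say how PDSAC implements policy distillation or its hybrid exploration policy, so the following is only a generic illustration of those two ideas and not the paper's algorithm. It assumes diagonal Gaussian action distributions (as commonly used with SAC), measures the teacher-to-student mismatch with a KL divergence, and mixes the two policies with a hypothetical probability `beta`. All function and parameter names here are made up for the sketch.

```python
import math
import random


def gaussian_kl(mu_t, std_t, mu_s, std_s):
    """KL(teacher || student) between two diagonal Gaussian action distributions.

    A distillation loss of this form pulls the student policy toward the teacher.
    All standard deviations must be strictly positive.
    """
    kl = 0.0
    for mt, st, ms, ss in zip(mu_t, std_t, mu_s, std_s):
        # Closed-form KL divergence between two univariate Gaussians.
        kl += math.log(ss / st) + (st ** 2 + (mt - ms) ** 2) / (2 * ss ** 2) - 0.5
    return kl


def hybrid_action(teacher, student, state, beta, rng):
    """Sample an action from the teacher with probability beta, else from the student.

    `teacher` and `student` are callables mapping a state to (means, stds).
    The mixing rule is an assumption made for illustration only.
    """
    policy = teacher if rng.random() < beta else student
    mu, std = policy(state)
    return [rng.gauss(m, s) for m, s in zip(mu, std)]
```

In a setup like this, the KL term would typically be added to the student's policy loss with a weighting coefficient, while `beta` would be decayed over training so that the student gradually takes over exploration. Both choices are assumptions rather than details taken from the paper.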

Keyword：

Robot learning; Multipurpose robots; Reinforcement learning; Adversarial machine learning; Learning algorithms

Author Affiliations：

[ 1 ] [Zhu, Xiaoqing]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[ 2 ] [Zhu, Xiaoqing]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing; 100124, China
[ 3 ] [Wang, Tao]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[ 4 ] [Wang, Tao]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing; 100124, China
[ 5 ] [Ruan, Xiaogang]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[ 6 ] [Ruan, Xiaogang]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing; 100124, China
[ 7 ] [Chen, Jiangtao]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[ 8 ] [Chen, Jiangtao]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing; 100124, China
[ 9 ] [Nan, Borui]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[ 10 ] [Nan, Borui]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing; 100124, China
[ 11 ] [Bi, Lanyue]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[ 12 ] [Bi, Lanyue]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing; 100124, China

Related Articles：

Multi-strategy Central Pattern Generator and Reinforcement Learning Integration for Quadruped Locomotion
2025, 3rd International Conference on Machine Learning, Cloud Computing and Intelligent Mining, MLCCIM 2024
Quadruped Robot Get Bionic Learning Method Based on Intelligent Memory Soft Actor-Critic
2023, 35th Chinese Control and Decision Conference, CCDC 2023
Map building for multi-robotic system based on robot technology middleware
2012, International Conference on Automatic Control and Artificial Intelligence, ACAI 2012
Modeling and solution for assignment problem of multiple robots system
2013, Journal of Central South University (Science and Technology)

Source ：

Journal of Beijing University of Aeronautics and Astronautics

ISSN： 1001-5965

Year： 2025

Issue： 2

Volume： 51

Page： 428-439
