Gait Learning of Quadruped Robot Based on Deep Arbitration Strategy

Author：

Zhu, Xiaoqing | Chen, Jiangtao | Zhang, Siyuan | Liu, Xinyuan | Ruan, Xiaogang

Abstract：

Reproducing the learning process of higher organisms is an important direction in robotics research. Commonly used reinforcement learning algorithms based on actor-critic (AC) networks have been explored for this task, and, because these algorithms still have shortcomings, several improvements have been made. In the deep deterministic policy gradient (DDPG) algorithm, overestimation of the Q value degrades the learning effect. Inspired by the arbitration mechanism in the prefrontal cortex of the brain, a deep arbitration actor-critic (DAAC) algorithm with two sets of evaluation networks was proposed. Through the arbitration mechanism, the optimal evaluation network is selected to update the policy parameters, which effectively solves the Q-value overestimation problem. The algorithm enables the quadruped robot to reproduce the bionic gait learning process. In simulation experiments, DAAC was compared with three algorithms: DDPG, soft actor-critic (SAC), and proximal policy optimization (PPO). The experimental results show that the gait of the quadruped robot trained by DAAC performs better in three respects (reward value, machine stability, and speed), verifying the superiority of the algorithm. © 2023 Beijing Institute of Technology. All rights reserved.
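
The abstract does not say how the arbitration mechanism chooses between the two evaluation networks, so the following is only a minimal sketch of one possible rule, not the authors' method. It assumes that each critic is scored by a running average of its absolute temporal-difference (TD) error and that the critic with the lower score supplies the value used for the update. The names `CriticRecord`, `td_target`, and `arbitrate`, the averaging rate, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CriticRecord:
    """Running reliability score for one critic (hypothetical helper)."""
    ema_abs_td_error: float = 0.0

    def update(self, td_error: float, rate: float = 0.1) -> None:
        # Exponential moving average of |TD error|; lower means more reliable.
        self.ema_abs_td_error += rate * (abs(td_error) - self.ema_abs_td_error)


def td_target(reward: float, next_q: float, gamma: float = 0.99,
              done: bool = False) -> float:
    """One-step TD target: r + gamma * Q(s', a'), with no bootstrap at episode end."""
    return reward + (0.0 if done else gamma * next_q)


def arbitrate(records: Sequence[CriticRecord]) -> int:
    """Assumed arbitration rule: pick the critic with the smallest running |TD error|.

    Ties resolve to the lowest index.
    """
    return min(range(len(records)), key=lambda i: records[i].ema_abs_td_error)


if __name__ == "__main__":
    records = [CriticRecord(), CriticRecord()]
    next_q = [12.0, 9.0]      # illustrative estimates; critic 0 is the inflated one
    records[0].update(2.0)    # larger TD error observed for critic 0
    records[1].update(0.5)    # smaller TD error observed for critic 1
    chosen = arbitrate(records)
    print("selected critic:", chosen)
    print("target:", td_target(reward=1.0, next_q=next_q[chosen]))
```

In this sketch, a critic that keeps producing large TD errors, as an overestimating critic tends to, loses the arbitration and stops driving the update. The rule used in the paper may differ.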

Keyword：

Learning systems; Reinforcement learning; Learning algorithms; Multipurpose robots; Deterioration

Author Community：

[1] [Zhu, Xiaoqing] Department of Information, Beijing University of Technology, Beijing 100124, China
[2] [Zhu, Xiaoqing] Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China
[3] [Chen, Jiangtao] Department of Information, Beijing University of Technology, Beijing 100124, China
[4] [Chen, Jiangtao] Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China
[5] [Zhang, Siyuan] Department of Information, Beijing University of Technology, Beijing 100124, China
[6] [Zhang, Siyuan] Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China
[7] [Liu, Xinyuan] Department of Information, Beijing University of Technology, Beijing 100124, China
[8] [Liu, Xinyuan] Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China
[9] [Ruan, Xiaogang] Department of Information, Beijing University of Technology, Beijing 100124, China
[10] [Ruan, Xiaogang] Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China

Related Keywords：

Quadruped Robot Get Bionic Learning Method Based on Intelligent Memory Soft Actor-Critic
2023, 35th Chinese Control and Decision Conference, CCDC 2023

Application of Q-learning based on adaptive greedy considering negative rewards in football match system
2019, International Journal of Wireless and Mobile Computing

Multi-agent Deep Reinforcement Learning based on Maximum Entropy
2021, 4th IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference, IMCEC 2021

The research on reinforcement learning based on robocup
2007, Journal of Harbin Institute of Technology

Source ：

Transactions of Beijing Institute of Technology

ISSN： 1001-0645

Year： 2023

Issue： 11

Volume： 43

Page： 1197-1204

Cited Count：

WoS CC Cited Count： 0

SCOPUS Cited Count： 1

ESI Highly Cited Papers on the List： 0

30 Days PV： 9
