Author:

Zhu, Xiaoqing | Wang, Tao | Ruan, Xiaogang | Chen, Jiangtao | Nan, Borui | Bi, Lanyue

Indexed by:

EI Scopus

Abstract:

Reinforcement learning algorithms, represented by soft actor-critic (SAC), have been successful in reproducing the motor skills of higher animals. The SAC framework combines policy search with a state-action value function. However, the agent's policy exploration is greedy, and the Q-value function estimated by the critic network tends to be underestimated. This paper proposes a policy distillation soft actor-critic (PDSAC) algorithm that integrates policy distillation (PD) with SAC so that agents can adopt better policies. The algorithm lets the agent explore using hybrid policies and speeds up the convergence of the reward function during reinforcement learning. The paper gives a theoretical proof that PDSAC improves the efficiency of policy exploration and validates the algorithm on a quadruped robot gait learning task. According to simulation results, PDSAC outperforms SAC on the gait learning task, achieving a 40% increase in convergence speed and a 26.7% improvement in the reward value function. © 2025 Beijing University of Aeronautics and Astronautics (BUAA). All rights reserved.
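
The abstract does not give the PDSAC update rules, so the following is only a minimal sketch of the two ideas it names: a policy distillation loss and exploration with a hybrid policy. It assumes diagonal Gaussian action distributions, a KL-divergence distillation loss, and a simple probabilistic mix of teacher and student actions. The names `gaussian_kl`, `hybrid_action`, and `beta` are hypothetical and do not come from the paper.

```python
import math
import random


def gaussian_kl(mu_t, std_t, mu_s, std_s):
    """KL(teacher || student) between two diagonal Gaussian action distributions.

    Assumption: a KL term like this is a common policy distillation loss;
    the paper's actual loss is not stated in the abstract.
    """
    kl = 0.0
    for mt, st, ms, ss in zip(mu_t, std_t, mu_s, std_s):
        # Closed-form KL for one dimension, summed over action dimensions.
        kl += math.log(ss / st) + (st ** 2 + (mt - ms) ** 2) / (2 * ss ** 2) - 0.5
    return kl


def hybrid_action(teacher_action, student_action, beta, rng=random):
    """Pick the teacher's action with probability beta, else the student's.

    Assumption: this mixing rule is illustrative only, one possible reading
    of "explore using hybrid policies".
    """
    return teacher_action if rng.random() < beta else student_action


if __name__ == "__main__":
    # Identical policies have zero divergence; a shifted mean is penalized.
    print(gaussian_kl([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]))
    print(gaussian_kl([0.0], [1.0], [1.0], [1.0]))
    print(hybrid_action([0.1, 0.2], [0.3, 0.4], beta=0.5))
```

In a sketch like this, the distillation term would be added to the student's SAC policy loss with some weight, and `beta` would typically be annealed toward zero as the student improves. Both choices are assumptions, not details taken from the paper.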

Keyword:

Robot learning | Multipurpose robots | Reinforcement learning | Adversarial machine learning | Learning algorithms

Author Community:

  • [ 1 ] [Zhu, Xiaoqing]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
  • [ 2 ] [Zhu, Xiaoqing]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing; 100124, China
  • [ 3 ] [Wang, Tao]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
  • [ 4 ] [Wang, Tao]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing; 100124, China
  • [ 5 ] [Ruan, Xiaogang]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
  • [ 6 ] [Ruan, Xiaogang]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing; 100124, China
  • [ 7 ] [Chen, Jiangtao]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
  • [ 8 ] [Chen, Jiangtao]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing; 100124, China
  • [ 9 ] [Nan, Borui]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
  • [ 10 ] [Nan, Borui]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing; 100124, China
  • [ 11 ] [Bi, Lanyue]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
  • [ 12 ] [Bi, Lanyue]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing; 100124, China

Source:

Journal of Beijing University of Aeronautics and Astronautics

ISSN: 1001-5965

Year: 2025

Issue: 2

Volume: 51

Page: 428-439
