Indexed by:
Abstract:
Agile locomotion in quadruped robots traditionally requires extensive expert knowledge and tedious manual tuning. Reinforcement learning, by contrast, allows a quadruped robot to learn better gaits and motor skills without such expert knowledge. The TD3 algorithm is widely used in continuous motion control, but it tends to converge to boundary actions, which prevents it from learning the optimal policy and leads to overfitting. Inspired by behavior cloning, this paper proposes a policy-constrained TD3 algorithm (PC-TD3), which adds behavioral constraints to the policy update step of TD3, so that the policy is updated in the direction of the expected behavior and boundary actions are reduced. Experiments were conducted on the PyBullet platform. The results show that the proposed algorithm enables the quadruped robot to learn walking skills, and that PC-TD3 performs better than the other mainstream algorithms it was compared with. © 2023 IEEE.
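The abstract does not state the PC-TD3 loss itself, so the following is only a minimal sketch of what "adding a behavioral constraint to the TD3 policy update" could look like. It assumes the constraint is a mean squared distance between the policy's actions and some expected actions, weighted by a coefficient. The function name, the `bc_weight` parameter, and the squared-error form are all hypothetical illustrations, not the paper's formulation.

```python
import numpy as np


def policy_constrained_actor_loss(q_values, policy_actions, expected_actions, bc_weight=1.0):
    """Hypothetical actor loss: the usual TD3 term plus an assumed behavior penalty."""
    q_values = np.asarray(q_values, dtype=float)
    policy_actions = np.asarray(policy_actions, dtype=float)
    expected_actions = np.asarray(expected_actions, dtype=float)

    # Standard TD3 actor objective: maximize Q(s, pi(s)), i.e. minimize its negative mean.
    td3_term = -q_values.mean()

    # Assumed constraint (not taken from the paper): mean squared distance between
    # the policy's actions and the expected actions. It penalizes actions that
    # drift far from the expected behavior, e.g. toward the action bounds.
    constraint_term = np.mean((policy_actions - expected_actions) ** 2)

    return td3_term + bc_weight * constraint_term


# Toy batch of two states with two-dimensional actions saturated at the bounds.
q = [1.0, 3.0]
pi_actions = [[1.0, -1.0], [1.0, 1.0]]
expected = [[0.0, 0.0], [0.0, 0.0]]
print(policy_constrained_actor_loss(q, pi_actions, expected, bc_weight=0.5))
```

With `bc_weight=0` the sketch reduces to the plain TD3 actor loss. Larger values pull the policy more strongly toward the expected behavior, at the cost of following the critic less closely.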
Keyword:
Reprint Author's Address:
Email:
Source:
Year: 2023
Page: 4910-4914
Language: English
Cited Count:
WoS CC Cited Count: 0
SCOPUS Cited Count:
ESI Highly Cited Papers on the List: 0
WanFang Cited Count:
Chinese Cited Count:
Affiliated Colleges: