Indexed by:
Abstract:
Door opening is a common everyday action and has become an important application for robotic arms. Different door handles open in different ways. To enable a robotic arm to perform the door-opening operation appropriate to the handle category, the Proximal Policy Optimization (PPO) algorithm is used to learn the task. Door opening is a multi-segment process that includes approaching the handle, operating the handle, and pushing the door open. A sparse reward that considers only whether the door ends up open prolongs the training of the robotic arm and may even prevent convergence. To address this problem, this paper proposes a segmented adaptive reward. First, the door-opening task is divided into segments, a segmented reward is designed, and segmented training rules are formulated to guide the robotic arm step by step and improve the overall training effect. In addition, the reward includes an adaptive weight adjustment mechanism that adjusts the weights of the different subtasks according to how much attention each deserves at the current stage, which complements the segmented training and accelerates training. In a simulation environment, experimental results show that the door-opening success rate of our algorithm is 61.04% higher than that of the original PPO algorithm, and that it can complete the round-handle opening task, which the original algorithm cannot solve. © 2024 Technical Committee on Control Theory, Chinese Association of Automation.
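The abstract does not give the reward formula, so the following is only a minimal illustrative sketch of how a segmented reward with stage-dependent weights could be structured. Every name, threshold, and weight value here (DoorState, current_stage, adaptive_weights, segmented_reward, grasp_dist, handle_open, focus) is a hypothetical assumption and is not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical segment names, following the three phases named in the abstract.
STAGES = ("approach", "operate", "push")


@dataclass
class DoorState:
    """Assumed observation summary; the paper's actual state is not given."""
    gripper_to_handle: float  # distance from gripper to handle, metres
    handle_angle: float       # handle rotation, radians
    door_angle: float         # door hinge angle, radians


def current_stage(s: DoorState, grasp_dist: float = 0.02,
                  handle_open: float = 0.5) -> str:
    """Pick the active segment from assumed thresholds."""
    if s.gripper_to_handle > grasp_dist:
        return "approach"
    if s.handle_angle < handle_open:
        return "operate"
    return "push"


def adaptive_weights(stage: str, focus: float = 0.8) -> dict:
    """Give the active segment weight `focus`; split the rest evenly."""
    rest = (1.0 - focus) / (len(STAGES) - 1)
    return {name: (focus if name == stage else rest) for name in STAGES}


def segmented_reward(s: DoorState) -> float:
    """Weighted sum of per-segment shaping terms (all terms are assumptions)."""
    terms = {
        "approach": -s.gripper_to_handle,  # closer to the handle is better
        "operate": s.handle_angle,         # more handle rotation is better
        "push": s.door_angle,              # wider door opening is better
    }
    weights = adaptive_weights(current_stage(s))
    return sum(weights[name] * terms[name] for name in STAGES)
```

In a sketch like this, a value such as segmented_reward(DoorState(0.5, 0.0, 0.0)) would replace the sparse success-only signal at each environment step, so that the policy receives feedback during the approach phase as well as at the end of an episode.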
Keyword:
Reprint Author's Address:
Email:
Source:
ISSN: 1934-1768
Year: 2024
Page: 2970-2975
Language: English
Cited Count:
SCOPUS Cited Count:
ESI Highly Cited Papers on the List: 0
WanFang Cited Count:
Chinese Cited Count:
Affiliated Colleges: