SAR-PPO(Segmented Adaptive Reward): Robotic Arm Open Door Motion Control With Reinforcement Learning Based on Segmented Adaptive Reward

Authors:

Yu, J. | Feng, X. | Gong, D. | Gong, Y.

Indexed by:

EI, Scopus

Abstract:

Door opening, one of the most common actions in daily life, has become an important direction for robotic arm applications. Different door handles open in different ways. To enable the robotic arm to complete the door-opening operation that matches the handle category, the Proximal Policy Optimization (PPO) algorithm is used to open the door. Opening a door is a multi-segment process: approaching the handle, operating the handle, and pushing the door open. A sparse reward that considers only whether the door ends up open prolongs the training time of the robotic arm and may even prevent convergence. To address this problem, this paper proposes a segmented adaptive reward. First, the door-opening task is divided into segments, a segmented reward is designed, and segmented training rules are formulated to guide the robotic arm step by step and improve the overall training effect. In addition, the reward includes an adaptive weight adjustment mechanism that adjusts the weights according to how much attention the current stage gives to each task, which complements the segmented training and speeds up training. In a simulation environment, experimental results show that the door-opening success rate of the proposed algorithm is 61.04% higher than that of the original PPO algorithm, and that it can complete the round-handle opening task, which the original algorithm cannot solve. © 2024 Technical Committee on Control Theory, Chinese Association of Automation.
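
The abstract does not give the reward formula, so the following is only a minimal Python sketch of what a segmented reward with stage-dependent weights could look like. Everything in it is an assumption made for illustration: the `DoorState` fields, the thresholds `reach_tol` and `handle_done`, the per-segment reward terms, and the `focus` weight are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# The three segments named in the abstract.
STAGES = ("approach", "operate", "push")


@dataclass
class DoorState:
    """Hypothetical observation summary; not the paper's state definition."""
    gripper_to_handle: float  # distance from gripper to handle, in metres
    handle_angle: float       # fraction of handle travel completed, 0..1
    door_angle: float         # fraction of door opening completed, 0..1


def current_stage(s, reach_tol=0.05, handle_done=0.9):
    """Pick the active segment from the state (thresholds are assumed)."""
    if s.gripper_to_handle > reach_tol:
        return "approach"
    if s.handle_angle < handle_done:
        return "operate"
    return "push"


def stage_rewards(s):
    """One dense reward term per segment (assumed shaping terms)."""
    return {
        "approach": -s.gripper_to_handle,  # closer to the handle is better
        "operate": s.handle_angle,         # more handle travel is better
        "push": s.door_angle,              # wider door opening is better
    }


def adaptive_weights(stage, focus=0.7):
    """Give the active segment weight `focus`; split the rest evenly."""
    rest = (1.0 - focus) / (len(STAGES) - 1)
    return {name: (focus if name == stage else rest) for name in STAGES}


def segmented_adaptive_reward(s):
    """Weighted sum of the segment rewards, with weights set by the stage."""
    weights = adaptive_weights(current_stage(s))
    rewards = stage_rewards(s)
    return sum(weights[name] * rewards[name] for name in STAGES)
```

In a PPO training loop, a function like `segmented_adaptive_reward` would stand in for a sparse success-only signal, for example `segmented_adaptive_reward(DoorState(0.5, 0.0, 0.0))` while the gripper is still far from the handle. The paper's actual weight adjustment rule may differ substantially from this fixed `focus` split.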

Keywords:

Segmented Adaptive Reward; Robotic Arm Motion Control; Proximal Policy Optimization Algorithm

Author Affiliations:

[1] [Yu J.] Beijing University of Technology, The Beijing Key Laboratory of Computational Intelligence and Intelligent Systems, Beijing, 100020, China
[2] [Feng X.] Beijing University of Technology, The Beijing Key Laboratory of Computational Intelligence and Intelligent Systems, Beijing, 100020, China
[3] [Gong D.] Beijing University of Technology, The Beijing Key Laboratory of Computational Intelligence and Intelligent Systems, Beijing, 100020, China
[4] [Gong Y.] Beijing University of Technology, The Beijing Key Laboratory of Computational Intelligence and Intelligent Systems, Beijing, 100020, China

Related Articles:

SAR-PPO(Segmented Adaptive Reward): Robotic Arm Open Door Motion Control With Reinforcement Learning Based on Segmented Adaptive Reward
2024, 2024 43rd Chinese Control Conference, CCC 2024
Motion control strategy for robotic arm using deep cascaded feature-enhancement Bayesian broad learning system with motion constraints
2025, ISA Transactions
Motion control strategy for robotic arm using cascaded feature-enhancement ElasticNet broad learning system
2025, Control Engineering Practice
A Cascaded Broad Learning System for Manipulator Motion Control
2024, 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2024

Source: 2024 43rd Chinese Control Conference, CCC 2024

ISSN: 1934-1768

Year: 2024

Page: 2970-2975

Language: English
