Abstract:
Existing path-planning algorithms seldom consider safety, and the conventional proximal policy optimization (PPO) algorithm suffers from a variance-adaptability problem. To address these issues, the Safe-PPO algorithm, which combines an evolutionary strategy with a safety reward function, is proposed; the algorithm is safety-oriented for path planning. The covariance matrix adaptation evolution strategy (CMA-ES) is used to improve the PPO algorithm, and a hazard coefficient and a movement coefficient are introduced to evaluate the safety of a path. Simulation experiments on a grid map compare the traditional PPO algorithm with the Safe-PPO algorithm, and a hexapod robot is used for physical experiments in a constructed scene. The simulation results show that the Safe-PPO algorithm is reasonable and feasible for safety-oriented path planning: compared with the conventional PPO algorithm, Safe-PPO increases the convergence rate during training by 18% and the obtained reward by 5.3%. During testing, the algorithm combining the hazard coefficient and the movement coefficient enables the robot to learn to choose the safer path rather than the fastest one. The physical experiments demonstrate that the robot can select a safer route to the goal in the constructed scene. © 2023 Beijing University of Aeronautics and Astronautics (BUAA). All rights reserved.
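The abstract does not give the exact form of the safety reward. As a rough illustration only, a safety-shaped step reward on a grid map might combine goal progress with hazard and movement terms as in the Python sketch below; all names, weights, and the functional form (safe_step_reward, hazard_coeff, move_coeff, the inverse-distance hazard penalty) are hypothetical assumptions, not the authors' formulation.

```python
# Hypothetical sketch of a safety-shaped step reward for grid-map path
# planning, in the spirit of the abstract's hazard/movement coefficients.
# All names and constants are illustrative assumptions.

def safe_step_reward(dist_to_goal_prev, dist_to_goal_curr,
                     dist_to_nearest_hazard, steps_taken,
                     hazard_coeff=1.0, move_coeff=0.01,
                     reached_goal=False, collided=False):
    """Combine goal progress with hazard and movement penalties."""
    if collided:
        # Terminal penalty for stepping into a hazard cell.
        return -10.0
    if reached_goal:
        # Goal bonus, discounted by path length via the movement cost.
        return 10.0 - move_coeff * steps_taken

    # Positive when the agent moved closer to the goal this step.
    progress = dist_to_goal_prev - dist_to_goal_curr
    # Hazard penalty grows as the agent approaches obstacles or danger zones.
    hazard_penalty = hazard_coeff / (1.0 + dist_to_nearest_hazard)
    # Constant per-step movement cost discourages needlessly long paths.
    return progress - hazard_penalty - move_coeff
```

Under a shaping of this kind, raising hazard_coeff relative to move_coeff would push the learned policy toward longer but safer detours, matching the trade-off the abstract describes.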
Source:
Journal of Beijing University of Aeronautics and Astronautics
ISSN: 1001-5965
Year: 2023
Issue: 8
Volume: 49
Page: 2108-2118
Cited Count:
WoS CC Cited Count: 0
SCOPUS Cited Count: 6
ESI Highly Cited Papers on the List: 0