Abstract:
In the domain of multi-agent pathfinding under partially observable Markov decision processes, existing research focuses mainly on grid or particle environments, which are far removed from real-world physical settings. This paper addresses improving collaborative multi-agent pathfinding in environments that are closer to actual physical constraints. To account for physical limitations, a multi-constraint action space covering actuator saturation and underactuation is constructed, and a multisource input state space based on distance and spatial coordinates is developed. An anti-redundancy reward function is further designed to reduce redundant actions during the navigation of unmanned vehicles. To address the high training complexity, low efficiency, and convergence difficulties encountered in the Gazebo simulation environment, we propose a pre-training and fine-tuning based multi-agent twin delayed deep deterministic policy gradient algorithm. Pre-training gives the model a better initial state, which improves training efficiency; fine-tuning then refines the pre-trained model, further enhancing its resilience to the non-stationary environment during training. In the Gazebo simulation environment, the effectiveness of the proposed algorithm is verified through comparisons with algorithms such as PMATD3, MATD3, and MADDPG. © 2025 Northeast University. All rights reserved.
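As an illustration of the constrained action space and anti-redundancy reward described in the abstract, the following Python sketch shows one plausible realization for a differential-drive (underactuated) vehicle. It is a minimal sketch under stated assumptions: the limits V_MAX and W_MAX, the penalty coefficient, and all function names are hypothetical placeholders, not taken from the paper.

    import numpy as np

    # Hypothetical actuator limits for a differential-drive vehicle: only linear
    # velocity v and yaw rate w are controllable (underactuation), and both
    # saturate at fixed bounds (actuator saturation). Values are assumptions.
    V_MAX = 0.5   # m/s
    W_MAX = 1.0   # rad/s

    def constrain_action(raw_action):
        """Map a raw actor output in [-1, 1]^2 onto the saturated (v, w) space."""
        v = np.clip(raw_action[0], -1.0, 1.0) * V_MAX
        w = np.clip(raw_action[1], -1.0, 1.0) * W_MAX
        return np.array([v, w])

    def anti_redundancy_penalty(action, prev_action, coeff=0.1):
        """Reward term penalizing abrupt, redundant action changes between steps."""
        return -coeff * float(np.linalg.norm(action - prev_action))

    # Example: raw actor output -> saturated command
    a = constrain_action(np.array([0.8, -1.5]))   # yields [0.4, -1.0]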
Source: Control and Decision
ISSN: 1001-0920
Year: 2025
Issue: 6
Volume: 40
Page: 1838-1846