Abstract:
Multi-agent deep reinforcement learning (MADRL) is a promising learning-based, data-driven approach, particularly when applied to traffic signal control (TSC) in a large-scale urban road network. However, existing MADRL-based TSC methods still require further study with respect to mutual coordination among multiple agents and the computational efficiency of training algorithms. In this study, we propose an adaptive value decomposition-based multi-agent actor-critic (AVDMAC) approach using a parallel training algorithm for cooperative traffic network flow control. First, the original centralized actor-critic reinforcement learning task is decomposed into several local learning tasks, in which each actor-critic agent optimizes its control decisions in response to local environment feedback. Second, the multiple agents striving toward a common objective are coordinated through an adaptive factorization model. In this model, weight coefficients are adaptively assigned to the action-value functions of the distributed agents by aggregating local rewards, so as to form a global joint action-value function that accounts for the differing contributions of individual local decisions. Furthermore, we develop parallel training formulations and an implementation algorithm for the AVDMAC approach on the Spark cloud platform, in order to reduce the time consumed by MADRL training. Numerical experiments demonstrate the advantages of the AVDMAC approach in terms of convergence, computational efficiency, and control performance.
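As a rough illustration of the adaptive factorization described in the abstract (the notation Q_i, w_i, o_i, a_i and the softmax weighting below are assumptions for this sketch, not taken from the paper), the global joint action-value can be read as an adaptively weighted sum of per-agent action-values:

Q_{\text{tot}}(s, \mathbf{a}) \;=\; \sum_{i=1}^{N} w_i \, Q_i(o_i, a_i),
\qquad
w_i \;=\; \frac{\exp(\bar{r}_i)}{\sum_{j=1}^{N} \exp(\bar{r}_j)},

where \bar{r}_i denotes agent i's aggregated local reward, so that agents whose local decisions contribute more to the common objective receive larger weight in the joint value.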
Source: NEUROCOMPUTING
ISSN: 0925-2312
Year: 2025
Volume: 623
Impact Factor: 6.000 (JCR@2022)