Abstract:
In this article, an adjustable behavior-guided adaptive dynamic programming (BGADP) algorithm is designed to solve the optimal regulation problem for discrete-time systems. Conventional adaptive dynamic programming methods require gradient information of the system dynamics for policy improvement, so they run into difficulty when that information cannot be computed or when the system dynamics are non-differentiable. To overcome these limitations, a human-behavior-inspired swarm intelligence approach is used to search for superior policies during the iterative process, eliminating the need for gradient information. Additionally, a relaxation factor is introduced into the value function update to accelerate the convergence of the algorithm. The monotonicity and convergence properties of the iterative value function are rigorously analyzed. Finally, the effectiveness and practicality of the adjustable BGADP algorithm are validated through two simulation studies, which are implemented using the actor-critic framework with neural networks.
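The two ideas named in the abstract, gradient-free policy improvement and a relaxation factor in the value update, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the BGADP algorithm itself: a plain particle swarm stands in for the human-behavior-inspired optimizer, whose details this record does not give; the update rule V_{i+1} = (1 - omega) V_i + omega * min_u [U(x, u) + V_i(f(x, u))] is one common way to write a relaxed value iteration, so omega = 1 recovers ordinary value iteration; and the plant is a hypothetical scalar linear system with quadratic cost, so the value function can be written as V_i(x) = p_i x^2 and no neural networks are needed.

```python
import random


def swarm_minimize(cost, lo, hi, rng, n_particles=30, n_iters=60):
    """Plain 1-D particle swarm: uses only cost evaluations, no gradients."""
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                          # best position seen by each particle
    best_val = [cost(x) for x in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g], best_val[g]    # best position seen by the swarm
    for _ in range(n_iters):
        for j in range(n_particles):
            vel[j] = (0.7 * vel[j]
                      + 1.5 * rng.random() * (best_pos[j] - pos[j])
                      + 1.5 * rng.random() * (g_pos - pos[j]))
            pos[j] = min(hi, max(lo, pos[j] + vel[j]))   # keep inside the box
            val = cost(pos[j])
            if val < best_val[j]:
                best_pos[j], best_val[j] = pos[j], val
                if val < g_val:
                    g_pos, g_val = pos[j], val
    return g_pos, g_val


def relaxed_value_iteration(a=1.2, b=1.0, q=1.0, r=1.0,
                            omega=1.0, n_iters=30, seed=0):
    """Relaxed value iteration for x_{k+1} = a*x + b*u, cost q*x^2 + r*u^2.

    Assumed (illustrative) setup: V_i(x) = p_i * x^2 and u = -k * x, so each
    iteration only has to search over the scalar gain k.
    """
    rng = random.Random(seed)
    p = 0.0                                    # V_0(x) = 0
    for _ in range(n_iters):
        def lookahead(k, p=p):
            # One-step cost plus current value estimate, evaluated at x = 1.
            return q + r * k * k + p * (a - b * k) ** 2

        # Policy improvement without gradients: the swarm searches over k.
        _, target = swarm_minimize(lookahead, -3.0, 3.0, rng)
        # Value update with relaxation factor omega.
        p = (1.0 - omega) * p + omega * target
    return p
```

For these illustrative parameters the scalar Riccati equation reduces to p^2 - 1.44 p - 1 = 0, whose positive root is about 1.95, so `relaxed_value_iteration(omega=1.0)` and `relaxed_value_iteration(omega=1.2)` should both settle near that value. Which range of omega actually speeds up convergence is analyzed in the article, not in this sketch.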
Source:
NEUROCOMPUTING
ISSN: 0925-2312
Year: 2025
Volume: 636
Impact Factor: 6.000 (JCR@2022)
ESI Highly Cited Papers on the List: 0