Abstract:
In this paper, a new stabilizing value iteration Q-learning (SVIQL) algorithm is presented to achieve online evolving control of unknown nonlinear systems. To this end, we establish a data-driven evolving control framework, ensure the stability of the iterative policies derived from SVIQL, and reduce the computational cost of online learning. First, by initializing with an admissible policy, the monotonicity and stability properties of SVIQL are theoretically analyzed: under this framework, the iterative Q-function sequence is monotonically nonincreasing and all iterative policies are admissible. Second, we compare the convergence speed, computational complexity, and evolving control performance of SVIQL against policy iteration Q-learning. The results show that SVIQL updates its policy faster and at lower computational cost. Third, in the online evolving control stage, offline and online data are combined to learn each new iterative policy, which regulates the nonlinear system over time. To implement SVIQL, critic and action networks are constructed to approximate the iterative Q-function and control policy, respectively, and the convergence of both networks' weights is analyzed. Finally, numerical simulations confirm the effectiveness of online evolving control under the SVIQL scheme.
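The record does not reproduce the paper's equations or pseudocode. As a rough, non-authoritative orientation to the recursion the abstract describes, the tabular Python sketch below runs value iteration Q-learning, Q_{j+1}(x,u) = U(x,u) + min_{u'} Q_j(x',u'), starting from a Q-function built out of an admissible policy. The dynamics, utility, grids, and initial policy are all illustrative assumptions, not taken from the paper; the neural-network approximation and the offline/online data combination are omitted.

```python
import numpy as np

# Minimal tabular sketch of stabilizing value iteration Q-learning.
# Assumed toy problem (not from the paper): scalar dynamics
# x_{k+1} = 0.9*sin(x_k) + u_k on a grid, with utility U(x,u) = x^2 + u^2.

X = np.linspace(-2.0, 2.0, 81)   # discretized state space
A = np.linspace(-1.0, 1.0, 41)   # discretized action space

def nearest(grid, v):
    """Index of the grid point closest to v (clipped to the grid range)."""
    return int(np.argmin(np.abs(grid - np.clip(v, grid[0], grid[-1]))))

# Tabulate the discretized dynamics and utility once.
nxt = np.array([[nearest(X, 0.9 * np.sin(x) + u) for u in A] for x in X])
cost = np.array([[x**2 + u**2 for u in A] for x in X])

# Admissible initial policy (assumed): u = -0.9*sin(x) cancels the drift,
# so every trajectory reaches the origin and its total cost is finite.
pi0 = np.array([nearest(A, -0.9 * np.sin(x)) for x in X])

# Evaluate pi0: iterate J(x) = U(x, pi0(x)) + J(x') to a fixed point.
idx = np.arange(len(X))
J = np.zeros(len(X))
for _ in range(1000):
    J_new = cost[idx, pi0] + J[nxt[idx, pi0]]
    if np.max(np.abs(J_new - J)) < 1e-12:
        break
    J = J_new

# Stabilizing initialization: Q0(x,u) = U(x,u) + J_pi0(x').
Q = cost + J[nxt]

# Value iteration sweeps: Q_{j+1}(x,u) = U(x,u) + min_{u'} Q_j(x',u').
for j in range(500):
    Q_next = cost + Q.min(axis=1)[nxt]
    # With the admissible initialization the sequence never increases.
    assert np.all(Q_next <= Q + 1e-9)
    if np.max(np.abs(Q_next - Q)) < 1e-9:
        Q = Q_next
        break
    Q = Q_next

greedy = A[np.argmin(Q, axis=1)]  # iterative (greedy) policy on the grid
print(f"sweeps: {j + 1}, Q at origin: {Q[nearest(X, 0.0), nearest(A, 0.0)]:.6f}")
```

The role of the admissible initialization is visible in the assert: because Q0 is the Q-function of a stabilizing policy, each sweep can only lower the table, so every intermediate greedy policy inherits a finite cost bound rather than having to wait for full convergence.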
Source:
NONLINEAR DYNAMICS
ISSN: 0924-090X
Year: 2024
Issue: 11
Volume: 112
Page: 9137-9153
Impact Factor: 5.600 (JCR@2022)
Cited Count:
WoS CC Cited Count: 2
SCOPUS Cited Count: 2
ESI Highly Cited Papers on the List: 0