Abstract:
This article develops data-based online evolving control for zero-sum games with unknown dynamics. First, a value-iteration-based Q-learning framework is established, and its relevant properties, including convergence and monotonicity, are analyzed. Then, the stability property is investigated, and online data are employed for off-policy learning. More importantly, two effective algorithms are designed to achieve online evolving control. In the first algorithm, the monotonically nondecreasing Q-function sequence requires an admissibility criterion to guarantee stability under a simple Q-function initialization. In the second algorithm, the monotonically nonincreasing Q-function sequence ensures stability without the admissibility criterion, but it requires an elaborate initial Q-function. Finally, two examples with real physical backgrounds exhibit the excellent performance of online evolving control under the given algorithms. © 2013 IEEE.
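A minimal sketch of the value-iteration-based Q-learning idea the abstract describes, for a two-player zero-sum setting. The tiny randomly generated game below (state space, action spaces, rewards, deterministic transitions) and the zero Q-function initialization are illustrative assumptions, not the paper's actual benchmark or algorithm details.

```python
import numpy as np

# Two-player zero-sum game: the controller (u) maximizes the return,
# the disturbance/opponent (w) minimizes it. All problem data here are
# assumed for illustration only.
n_states, n_u, n_w = 3, 2, 2
rng = np.random.default_rng(0)
R = rng.uniform(0.0, 1.0, (n_states, n_u, n_w))      # stage reward r(s, u, w)
P = rng.integers(0, n_states, (n_states, n_u, n_w))  # deterministic next state s'
gamma = 0.9                                          # discount factor

def value(Q):
    # Pure-strategy saddle-point value: controller maximizes, opponent minimizes.
    return Q.min(axis=2).max(axis=1)

# Simple zero initialization; with nonnegative rewards the iterates
# Q_0 <= Q_1 <= ... are monotonically nondecreasing, mirroring the
# first algorithm in the abstract.
Q = np.zeros((n_states, n_u, n_w))
for k in range(1000):
    V = value(Q)
    Q_next = R + gamma * V[P]   # value-iteration update: Q_{k+1} = r + gamma * V_k(s')
    if np.max(np.abs(Q_next - Q)) < 1e-10:
        break
    Q = Q_next

print(np.round(value(Q), 4))    # converged game value per state
```

Since the update is a gamma-contraction in the sup-norm, the iteration converges to the unique fixed point regardless of initialization; the choice of initial Q-function affects only the direction of monotonicity, which is the distinction the abstract draws between its two algorithms.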
Source:
IEEE Transactions on Systems, Man, and Cybernetics: Systems
ISSN: 2168-2216
Year: 2025
Impact Factor: 8.700 (JCR@2022)
ESI Highly Cited Papers on the List: 0