Model-free tracking design for nonlinear zero-sum games with an improved utility function

Author：

Wang, D. | Tang, G. | Ren, J. | Zhao, M. | Qiao, J.

Indexed by：

Scopus SCIE

Abstract：

In this article, an advanced accelerated Q-learning (AQL) approach is designed to address the nonlinear discrete-time optimal tracking problem of zero-sum games with unknown dynamics. Different from conventional adaptive dynamic programming methods, the advanced Q-learning algorithm incorporates both the control input and the disturbance signal into the tracking error, which obviates the quadratic form of control and disturbance inputs directly. This innovative Q-function is used to derive the optimal tracking control policy pair that ensures the terminal tracking error asymptotically converges to zero, independent of the feedforward control input. In order to improve the convergence speed of the iterative process and reduce computational complexity, an accelerated factor is introduced. After collecting offline input–output data, a backpropagation neural network is employed to approximate the proposed Q-function, which enables model-free tracking control of zero-sum games through an off-policy learning mechanism. Furthermore, the theoretical properties of the developed algorithm are analyzed under specific preconditions. Finally, the effectiveness of the AQL algorithm is validated through a numerical simulation, which is implemented using a critic-only structure. © The Author(s), under exclusive licence to Springer Nature B.V. 2025.

Keyword：

Adaptive dynamic programming | Zero-sum games | Nonlinear tracking control | Accelerated Q-learning | Neural networks

Author Community：

[ 1 ] [Wang D.]School of Information Science and Technology, Beijing University of Technology, Beijing, 100124, China
[ 2 ] [Wang D.]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, 100124, China
[ 3 ] [Wang D.]Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, 100124, China
[ 4 ] [Wang D.]Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing, 100124, China
[ 5 ] [Tang G.]School of Information Science and Technology, Beijing University of Technology, Beijing, 100124, China
[ 6 ] [Tang G.]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, 100124, China
[ 7 ] [Tang G.]Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, 100124, China
[ 8 ] [Tang G.]Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing, 100124, China
[ 9 ] [Ren J.]School of Information Science and Technology, Beijing University of Technology, Beijing, 100124, China
[ 10 ] [Ren J.]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, 100124, China
[ 11 ] [Ren J.]Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, 100124, China
[ 12 ] [Ren J.]Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing, 100124, China
[ 13 ] [Zhao M.]School of Information Science and Technology, Beijing University of Technology, Beijing, 100124, China
[ 14 ] [Zhao M.]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, 100124, China
[ 15 ] [Zhao M.]Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, 100124, China
[ 16 ] [Zhao M.]Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing, 100124, China
[ 17 ] [Qiao J.]School of Information Science and Technology, Beijing University of Technology, Beijing, 100124, China
[ 18 ] [Qiao J.]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, 100124, China
[ 19 ] [Qiao J.]Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, 100124, China
[ 20 ] [Qiao J.]Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing, 100124, China

Related Articles：

Neural critic learning for tracking control design of constrained nonlinear multi-person zero-sum games
2022, Neurocomputing
Neural Q-learning for discrete-time nonlinear zero-sum games with adjustable convergence rate
2024, Neural Networks
Accelerated Value Iteration for Nonlinear Zero-Sum Games with Convergence Guarantee
2024, Guidance, Navigation and Control
Advanced optimal tracking integrating a neural critic technique for asymmetric constrained zero-sum games
2024, Neural Networks

Source ：

Nonlinear Dynamics

ISSN： 0924-090X

Year： 2025

Impact Factor： 5.600 (JCR@2022)

