Abstract:
This paper presents a theoretical analysis of value iteration Q-learning with non-discounted costs. The analysis focuses on two main aspects: the convergence of the iterative Q-function and the stability of the system under the final iterative control policy. Unlike previous theoretical results on Q-learning, our analysis takes into account the effect of approximation errors, leading to a more comprehensive investigation. We first discuss how approximation errors affect the iterative Q-function update. Then, considering the presence of approximation errors in each iteration, we analyze the convergence of the iterative Q-function. Furthermore, we establish a sufficient condition, also accounting for the approximation errors, that ensures the stability of the system under the final iterative control policy. Finally, two simulation cases are conducted to validate the presented convergence and stability results.
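To make the iteration described in the abstract concrete, the sketch below implements undiscounted value-iteration Q-learning on a toy tabular problem, with function approximation modeled as a bounded per-iteration perturbation of the update target. The 3-state/2-action system, the cost table, and the error bound are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

# Minimal sketch of undiscounted value-iteration Q-learning with bounded
# approximation errors. The system below is a made-up toy example.
f = np.array([[0, 0],          # deterministic next-state table f(x, u);
              [0, 2],          # action 0 drives the state toward the
              [1, 2]])         # zero-cost equilibrium (state 0)
U = np.array([[0.0, 0.0],      # stage cost U(x, u); zero at the equilibrium
              [1.0, 0.5],
              [1.0, 0.8]])

rng = np.random.default_rng(0)
eps = 1e-3                     # assumed bound on the per-iteration approximation error
Q = np.zeros_like(U)           # Q_0 = 0 initialization

for i in range(500):
    # Exact update: Q_{i+1}(x, u) = U(x, u) + min_{u'} Q_i(f(x, u), u')
    target = U + Q[f].min(axis=-1)
    # Approximation modeled as a bounded perturbation of the exact target.
    Q_next = target + rng.uniform(-eps, eps, size=Q.shape)
    if np.max(np.abs(Q_next - Q)) < 10 * eps:
        Q = Q_next
        break
    Q = Q_next

policy = Q.argmin(axis=1)      # final iterative control policy: argmin_u Q(x, u)
print("approximate Q-function:\n", np.round(Q, 3))
print("final iterative policy:", policy)
```

Under these assumptions the perturbed iterates settle into a neighborhood of the exact Q-function whose size scales with the error bound, which is the kind of error-aware convergence behavior the abstract refers to.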
Source:
NEUROCOMPUTING
ISSN: 0925-2312
Year: 2024
Volume: 606
Impact Factor: 6.000 (JCR 2022)