Neural-network-based discounted optimal control via an integrated value iteration with accuracy guarantee

Author：

Ha, Mingming | Wang, Ding (王鼎) | Liu, Derong

Indexed by：

EI Scopus SCIE

Abstract：

A data-based value iteration algorithm with the bidirectional approximation feature is developed for discounted optimal control. The unknown nonlinear system dynamics is first identified by establishing a model neural network. To improve the identification precision, biases are introduced to the model network. The model network with biases is trained by the gradient descent algorithm, where the weights and biases across all layers are updated. The uniform ultimate boundedness stability with a proper learning rate is analyzed, by using the Lyapunov approach. Moreover, an integrated value iteration with the discounted cost is developed to fully guarantee the approximation accuracy of the optimal value function. Then, the effectiveness of the proposed algorithm is demonstrated by carrying out two simulation examples with physical backgrounds. © 2021 Elsevier Ltd. All rights reserved.
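
Illustrative note (not part of the published abstract): for orientation, the sketch below shows plain textbook discounted value iteration on a scalar linear-quadratic problem, where the value function has the form V_i(x) = p_i * x^2 and the recursion starts from V_0 = 0. It is not the paper's integrated value iteration and involves no neural networks. The function name `discounted_vi_lq`, the symbols `a`, `b`, `q`, `r`, `gamma`, and the numeric values in the usage note are all assumptions chosen for illustration.

```python
def discounted_vi_lq(a, b, q, r, gamma, iterations=200):
    """Discounted value iteration for x_{k+1} = a*x + b*u
    with stage cost q*x^2 + r*u^2 and discount factor gamma.

    Hypothetical helper: V_i(x) = p_i * x^2, so only the scalar p_i is iterated.
    """
    p = 0.0          # V_0(x) = 0
    k = 0.0          # feedback gain of the greedy policy u = -k*x
    history = [p]
    for _ in range(iterations):
        # Greedy policy w.r.t. V_i: minimize q*x^2 + r*u^2 + gamma*p*(a*x + b*u)^2 over u.
        k = gamma * p * a * b / (r + gamma * p * b * b)
        # Value update: V_{i+1}(x) = stage cost + gamma * V_i(next state) under that policy.
        p = q + r * k * k + gamma * p * (a - b * k) ** 2
        history.append(p)
    return p, k, history
```

Calling `discounted_vi_lq(1.1, 1.0, 1.0, 1.0, 0.95)` returns the final coefficient, the greedy feedback gain, and the list of iterates. With a zero initial value function the iterates form a non-decreasing sequence, which is the usual behaviour of value iteration under a non-negative stage cost.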

Keyword：

Value iteration; Adaptive dynamic programming; Data-based discounted optimal control; Uniformly ultimately bounded stability; Lyapunov method

Author Affiliations：

[ 1 ] [Ha, Mingming]Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
[ 2 ] [Wang, Ding]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[ 3 ] [Wang, Ding]Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China
[ 4 ] [Liu, Derong]Univ Illinois, Dept Elect & Comp Engn, Chicago, IL 60607 USA

Reprint Author's Address：

王鼎
[Wang, Ding]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China

Email：

hamingming_0705@foxmail.com
dingwang@bjut.edu.cn
derong@uic.edu

Related Articles：

Improved value iteration for neural-network-based stochastic optimal control design
2020, NEURAL NETWORKS
Swarm-intelligence-based value iteration for optimal regulation of continuous-time nonlinear systems
2025, SWARM AND EVOLUTIONARY COMPUTATION
General multi-step value iteration for optimal learning control
2025, AUTOMATICA
Neural critic learning with accelerated value iteration for nonlinear model predictive control
2024, NEURAL NETWORKS

Source ：

NEURAL NETWORKS

ISSN： 0893-6080

Year： 2021

Volume： 144

Page： 176-186

Impact Factor： 7.800 (JCR@2022)

ESI Discipline： COMPUTER SCIENCE;

ESI HC Threshold：87

JCR Journal Grade：1

Cited Count：

WoS CC Cited Count： 18

SCOPUS Cited Count： 19

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：

30-Day Page Views： 4

Affiliated Colleges：

School of Information Science and Technology (records not explicitly assigned to a unit within this school/department)
