Abstract:
A data-based value iteration algorithm with the bidirectional approximation feature is developed for discounted optimal control. The unknown nonlinear system dynamics are first identified by establishing a model neural network. To improve the identification precision, biases are introduced into the model network. The model network with biases is trained by the gradient descent algorithm, where the weights and biases across all layers are updated. The uniform ultimate boundedness stability with a proper learning rate is analyzed using the Lyapunov approach. Moreover, an integrated value iteration with the discounted cost is developed to fully guarantee the approximation accuracy of the optimal value function. Then, the effectiveness of the proposed algorithm is demonstrated by carrying out two simulation examples with physical backgrounds. © 2021 Elsevier Ltd. All rights reserved.
Source:
NEURAL NETWORKS
ISSN: 0893-6080
Year: 2021
Volume: 144
Page: 176-186
7.800 (JCR@2022)
ESI Discipline: COMPUTER SCIENCE
ESI HC Threshold: 87
JCR Journal Grade: 1
Cited Count:
WoS CC Cited Count: 18
SCOPUS Cited Count: 19
ESI Highly Cited Papers on the List: 0
30 Days PV: 4