Abstract:
In this paper, a novel value iteration adaptive dynamic programming (ADP) algorithm, called the improved value iteration ADP algorithm, is presented to obtain the optimal policy for discrete stochastic processes. In the improved value iteration ADP algorithm, for the first time we propose a new criterion to verify whether the obtained policy is stable for stochastic processes. By analyzing the convergence properties of the proposed algorithm, it is shown that the iterative value functions converge to the optimum. In addition, our algorithm allows the initial value function to be an arbitrary positive semi-definite function. Finally, two simulation examples are presented to validate the effectiveness of the developed method.
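For context, the sketch below shows generic tabular value iteration for a discrete stochastic (Markov decision) process, assuming a known transition model P, a one-step cost C, and a discount factor gamma; it is not the paper's improved ADP algorithm and does not reproduce its stability criterion, but it illustrates the iterative Bellman backup and an arbitrary nonnegative initial value function V0 as described in the abstract. All names and parameters here are illustrative assumptions.

```python
import numpy as np

def value_iteration(P, C, gamma=0.95, V0=None, tol=1e-8, max_iter=10_000):
    """Generic tabular value iteration for a finite stochastic process.

    Illustrative sketch only -- not the paper's improved ADP algorithm
    or its stability criterion.

    P : (A, S, S) array, P[a, s, s'] = transition probability
    C : (A, S) array, expected one-step cost of action a in state s
    V0: optional arbitrary nonnegative initial value function, shape (S,)
    """
    A, S, _ = P.shape
    V = np.zeros(S) if V0 is None else np.asarray(V0, dtype=float)
    for _ in range(max_iter):
        # Bellman backup: Q[a, s] = C[a, s] + gamma * E_{s'}[V(s')]
        Q = C + gamma * (P @ V)
        V_new = Q.min(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    policy = Q.argmin(axis=0)  # greedy (cost-minimizing) policy
    return V, policy

# Toy 2-state, 2-action example starting from a nonzero initial value function.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.3, 0.7]]])
C = np.array([[1.0, 0.0],
              [0.5, 2.0]])
V, pi = value_iteration(P, C, V0=np.array([3.0, 1.0]))
print(V, pi)
```

The cost-minimization form and the discount factor are assumptions made so the toy iteration provably converges; the paper's setting and its convergence analysis under an arbitrary positive semi-definite initialization should be taken from the paper itself.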
Source: NEURAL NETWORKS
ISSN: 0893-6080
Year: 2020
Volume: 124
Page: 280-295
Impact Factor: 7.800 (JCR@2022)
ESI Discipline: COMPUTER SCIENCE
ESI HC Threshold: 132
Cited Count:
WoS CC Cited Count: 15
SCOPUS Cited Count: 18
ESI Highly Cited Papers on the List: 0