Abstract:
The actor-critic (AC) learning structure is an effective framework that combines policy-based reinforcement learning (RL) with value-based RL. However, the cost function in the AC framework exhibits large variance, which makes it difficult to achieve the optimization objective. Based on the discounted generalized value iteration method with ℓ1-regularization, a regularized AC (RAC) framework is developed to address optimal regulation problems and accelerate the convergence of the cost function. Two neural networks are constructed to update the cost function and the policy gradient, respectively, and ℓ1-regularization is applied to both the policy gradient and the cost function during value iteration. The cost function is proved to converge to the optimal cost function in a monotonically decreasing manner. Finally, the effectiveness of RAC is demonstrated through two experiments. © 2023 IEEE.
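As an informal illustration of the update pattern the abstract describes (two networks, with an ℓ1 penalty on both the cost-function and policy updates), the following minimal Python/PyTorch sketch shows one possible actor-critic step with ℓ1 weight regularization. It is not taken from the paper: the network sizes, learning rates, deterministic-policy objective, and all variable names are assumptions for illustration only.

import torch
import torch.nn as nn

state_dim, action_dim, gamma, l1_coef = 4, 1, 0.95, 1e-3

# Critic approximates the discounted cost of a state-action pair; actor outputs the control.
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 32), nn.Tanh(), nn.Linear(32, 1))
actor = nn.Sequential(nn.Linear(state_dim, 32), nn.Tanh(), nn.Linear(32, action_dim))
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)

def l1_penalty(net):
    # Sum of absolute parameter values: the l1-regularization term (illustrative choice).
    return sum(p.abs().sum() for p in net.parameters())

def update(state, action, utility, next_state):
    # Value-iteration-style critic target: U(x, u) + gamma * Q(x', actor(x')).
    with torch.no_grad():
        target = utility + gamma * critic(torch.cat([next_state, actor(next_state)], dim=-1))
    q = critic(torch.cat([state, action], dim=-1))
    critic_loss = (q - target).pow(2).mean() + l1_coef * l1_penalty(critic)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor update: minimize the critic's predicted cost of the actor's own action,
    # again with an l1 penalty on the actor weights (a stand-in policy objective,
    # not the paper's exact policy-gradient form).
    actor_loss = critic(torch.cat([state, actor(state)], dim=-1)).mean() + l1_coef * l1_penalty(actor)
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

# Example call with a random batch of 8 transitions.
s, a = torch.randn(8, state_dim), torch.randn(8, action_dim)
u, s_next = torch.randn(8, 1), torch.randn(8, state_dim)
update(s, a, u, s_next)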
Year: 2023
Page: 105-110
Language: English
Cited Count:
WoS CC Cited Count: 0
ESI Highly Cited Papers on the List: 0