Data-driven Policy Optimization for Stochastic Systems Involving Adaptive Critic; [融合自适应评判的随机系统数据驱动策略优化]

Author：

Wang, D. | Wang, J.-Y. | Qiao, J.-F.

Indexed by：

EI Scopus

Abstract：

Adaptive critic technology has been widely employed to solve the optimal control problems of complicated nonlinear systems, but it faces some limitations in solving the infinite-horizon optimal problems of discrete-time nonlinear stochastic systems. In this paper, we establish a data-driven discounted optimal regulation method for discrete-time stochastic systems involving adaptive critic technology. First, we investigate the infinite-horizon optimal problems with a discount factor for stochastic systems under a relaxed assumption. The developed stochastic Q-learning algorithm can optimize an initial admissible policy to the optimal one in a monotonically nonincreasing way. Based on the data-driven idea, the policy optimization of the stochastic Q-learning algorithm is executed without a dynamic model. Then, the stochastic Q-learning algorithm is implemented by utilizing actor-critic neural networks. Finally, two nonlinear benchmarks are given to demonstrate the overall performance of the developed stochastic Q-learning algorithm. © 2024 Science Press. All rights reserved.

Keyword：

stochastic optimal control | data-driven | Q-learning | adaptive critic design | neural networks | discrete-time systems

Author Affiliations：

[ 1 ] [Wang D.]Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
[ 2 ] [Wang D.]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing, 100124, China
[ 3 ] [Wang D.]Beijing Institute of Artificial Intelligence, Beijing, 100124, China
[ 4 ] [Wang D.]Beijing Laboratory of Smart Environmental Protection, Beijing, 100124, China
[ 5 ] [Wang J.-Y.]Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
[ 6 ] [Wang J.-Y.]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing, 100124, China
[ 7 ] [Wang J.-Y.]Beijing Institute of Artificial Intelligence, Beijing, 100124, China
[ 8 ] [Wang J.-Y.]Beijing Laboratory of Smart Environmental Protection, Beijing, 100124, China
[ 9 ] [Qiao J.-F.]Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
[ 10 ] [Qiao J.-F.]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing, 100124, China
[ 11 ] [Qiao J.-F.]Beijing Institute of Artificial Intelligence, Beijing, 100124, China
[ 12 ] [Qiao J.-F.]Beijing Laboratory of Smart Environmental Protection, Beijing, 100124, China

Related Articles：

Optimal Dynamic Supply of Parking Permits Under Uncertainties: A Stochastic Control Approach
2022, IEEE Transactions on Intelligent Transportation Systems
Policy Gradient Adaptive Critic Design With Dynamic Prioritized Experience Replay for Wastewater Treatment Process Control
2022, IEEE Transactions on Industrial Informatics
Event-based online learning control design with eligibility trace for discrete-time unknown nonlinear systems
2023, Engineering Applications of Artificial Intelligence
Multilayer adaptive critic design with digital twin for data-driven optimal tracking control and industrial applications
2024, Engineering Applications of Artificial Intelligence

Source ：

Acta Automatica Sinica

ISSN： 0254-4156

Year： 2024

Issue： 5

Volume： 50

Page： 980-990

SCOPUS Cited Count： 1

ESI Highly Cited Papers on the List： 0
