Abstract:
When facing large amounts of data, optimizing policies with all of the data at once is a challenging task. In this paper, a data-driven Q-learning scheme with parallel multi-step deduction is developed to improve learning efficiency using small-batch data for discrete-time nonlinear control. Specifically, a data-driven model is first established from all available data. The proposed algorithm then deduces the small-batch data in parallel to effectively accelerate the learning process. Furthermore, the step size of the multi-step deduction can be adjusted to balance the utilization of data and model. A near-optimal policy is ultimately obtained using hybrid data from the real system and the data-driven model. Finally, a torsional pendulum plant is used to demonstrate the effectiveness of the proposed method. © 2024 IEEE.
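The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: Q-learning updated from real-system transitions, supplemented by k-step rollouts of a learned (data-driven) model over a small batch of stored states, with the rollout depth `k_steps` trading off model use against data use (Dyna-style planning). Every name and parameter here (`DynaQ`, `k_steps`, `batch_size`, the deterministic dictionary model, the toy usage at the bottom) is a hypothetical stand-in, not the paper's actual scheme, and the sequential batch loop merely stands in for the paper's parallel deduction.

```python
import numpy as np

# Hypothetical tabular sketch (NOT the paper's algorithm): Q-learning from
# real data plus k-step simulated rollouts from a learned model, drawn over
# a small batch of previously visited state-action pairs.
class DynaQ:
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95,
                 epsilon=0.1, k_steps=3, batch_size=8, seed=0):
        self.Q = np.zeros((n_states, n_actions))
        self.model = {}              # (s, a) -> (r, s'): data-driven model
        self.alpha, self.gamma = alpha, gamma
        self.epsilon = epsilon
        self.k_steps = k_steps       # step size of multi-step deduction
        self.batch_size = batch_size # small batch deduced each iteration
        self.rng = np.random.default_rng(seed)

    def act(self, s):
        # Epsilon-greedy action selection on the real system.
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(self.Q.shape[1]))
        return int(np.argmax(self.Q[s]))

    def update(self, s, a, r, s2):
        # Standard one-step Q-learning temporal-difference update.
        td = r + self.gamma * self.Q[s2].max() - self.Q[s, a]
        self.Q[s, a] += self.alpha * td

    def learn_real(self, s, a, r, s2):
        self.update(s, a, r, s2)      # update from real-system data
        self.model[(s, a)] = (r, s2)  # refresh the data-driven model

    def planning(self):
        # Deduce a small batch of stored (s, a) pairs forward for up to
        # k_steps using the learned model; larger k leans more on the model.
        keys = list(self.model)
        if not keys:
            return
        idx = self.rng.choice(len(keys),
                              size=min(self.batch_size, len(keys)),
                              replace=False)
        for i in idx:
            s, a = keys[i]
            for _ in range(self.k_steps):
                r, s2 = self.model[(s, a)]
                self.update(s, a, r, s2)
                a2 = int(np.argmax(self.Q[s2]))  # greedy continuation
                if (s2, a2) not in self.model:
                    break                        # model has no data here
                s, a = s2, a2
```

As a usage sketch on some toy MDP (not the paper's torsional pendulum plant), each real interaction would call `agent.learn_real(s, a, r, s2)` followed by `agent.planning()`, so the policy improves from both real and model-generated (hybrid) data.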
Year: 2024
Page: 739-744
Language: English