Author:

Zhao, M. | Wang, D. | Qiao, J.

Indexed by:

EI; Scopus; SCIE

Abstract:

For unknown nonlinear systems with state constraints, it is difficult to achieve safe optimal control using Q-learning methods based on traditional quadratic utility functions. To solve this problem, this article proposes an accelerated safe Q-learning (SQL) technique that addresses the concurrent requirements of safety and optimality for discrete-time nonlinear systems within an integrated framework. First, an adjustable control barrier function is designed and integrated into the cost function, so that the constrained optimal control problem is transformed into an unconstrained one. The augmented cost function is closely linked to the next state, which lets the state move away from constraint boundaries more quickly. Second, leveraging offline data that adheres to the safety constraints, we introduce an off-policy value iteration SQL approach to search for a safe optimal policy, thus mitigating the risk of unsafe interactions that may result from suboptimal iterative policies. Third, the vast amount of offline data and the complex augmented cost function can slow the learning of the algorithm. To address this issue, we integrate historical iteration information into the current iteration step to accelerate policy evaluation, and introduce the Nesterov momentum technique to expedite policy improvement. Additionally, the theoretical analysis demonstrates the convergence, optimality, and safety of the SQL algorithm. Finally, simulation results for two nonlinear systems with state constraints, obtained under different parameter settings, demonstrate the efficacy and advantages of the accelerated SQL approach. The proposed method requires fewer iterations while enabling the system state to converge to the equilibrium point more rapidly. © 2025 Elsevier Ltd
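The first two steps of the abstract (a barrier term added to the cost, then off-policy value iteration on safe offline data) can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the paper works with neural networks and an accelerated update, while everything below (the integer toy dynamics `x_next = x + u`, the bound `BOUND`, the weight `MU`, and the particular barrier shape) is a hypothetical choice made only for illustration. The acceleration steps (historical iteration information, Nesterov momentum) are omitted.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
BOUND = 4                 # hypothetical state constraint |x| < BOUND
STATES = range(-3, 4)     # integer states strictly inside the constraint
ACTIONS = (-1, 0, 1)
MU = 1.0                  # barrier weight (stands in for the "adjustable" part)


def barrier(x_next):
    """Zero at the origin, grows without bound as |x_next| approaches BOUND."""
    return MU * x_next**2 / (BOUND**2 - x_next**2)


def cost(x, u, x_next):
    """Quadratic utility plus a barrier term evaluated at the next state."""
    return x**2 + u**2 + barrier(x_next)


def collect_safe_data():
    """Offline transitions (x, u, x_next) that stay inside the safe set."""
    data = []
    for x in STATES:
        for u in ACTIONS:
            x_next = x + u  # toy dynamics, used only to generate the data
            if abs(x_next) < BOUND:
                data.append((x, u, x_next))
    return data


def value_iteration(data, iterations=50):
    """Q_{i+1}(x, u) = cost(x, u, x') + min_u' Q_i(x', u'), starting from Q_0 = 0."""
    q = {(x, u): 0.0 for x, u, _ in data}
    for _ in range(iterations):
        # State value under the current Q: minimum over actions seen in the data.
        v = {}
        for (x, u), val in q.items():
            v[x] = min(v.get(x, math.inf), val)
        q = {(x, u): cost(x, u, x_next) + v[x_next] for x, u, x_next in data}
    return q


def greedy_policy(q):
    """Pick, for each state, the recorded action with the smallest Q-value."""
    policy = {}
    for (x, u), val in q.items():
        if x not in policy or val < q[(x, policy[x])]:
            policy[x] = u
    return policy


if __name__ == "__main__":
    q_table = value_iteration(collect_safe_data())
    print(greedy_policy(q_table))
```

Because the update only ever reads recorded transitions, the learner never needs to try an unsafe action itself, and state-action pairs that would leave the safe set simply have no entry in the table.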

Keyword:

Adaptive dynamic programming; Accelerated value iteration; Control barrier functions; State constraints; Neural networks; Safe Q-learning

Author Community:

  • [ 1 ] [Zhao M.]School of Information Science and Technology, Beijing University of Technology, Beijing, 100124, China
  • [ 2 ] [Zhao M.]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, 100124, China
  • [ 3 ] [Zhao M.]Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, 100124, China
  • [ 4 ] [Zhao M.]Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing, 100124, China
  • [ 5 ] [Wang D.]School of Information Science and Technology, Beijing University of Technology, Beijing, 100124, China
  • [ 6 ] [Wang D.]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, 100124, China
  • [ 7 ] [Wang D.]Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, 100124, China
  • [ 8 ] [Wang D.]Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing, 100124, China
  • [ 9 ] [Qiao J.]School of Information Science and Technology, Beijing University of Technology, Beijing, 100124, China
  • [ 10 ] [Qiao J.]Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, 100124, China
  • [ 11 ] [Qiao J.]Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, 100124, China
  • [ 12 ] [Qiao J.]Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing, 100124, China

Source:

Neural Networks

ISSN: 0893-6080

Year: 2025

Volume: 186

JCR@2022: 7.800
