CLC number: TP13
On-line Access: 2021-02-01
Received: 2019-11-11
Revision Accepted: 2020-03-27
Crosschecked: 2020-09-28
Haiyun Zhang, Deyuan Meng, Jin Wang, Guodong Lu. Indirect adaptive fuzzy-regulated optimal control for unknown continuous-time nonlinear systems[J]. Frontiers of Information Technology & Electronic Engineering, 2021, 22(2): 155-169.
@article{FITEE.1900610,
title="Indirect adaptive fuzzy-regulated optimal control for unknown continuous-time nonlinear systems",
author="Haiyun Zhang and Deyuan Meng and Jin Wang and Guodong Lu",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="22",
number="2",
pages="155--169",
year="2021",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.1900610"
}
%0 Journal Article
%T Indirect adaptive fuzzy-regulated optimal control for unknown continuous-time nonlinear systems
%A Haiyun Zhang
%A Deyuan Meng
%A Jin Wang
%A Guodong Lu
%J Frontiers of Information Technology & Electronic Engineering
%V 22
%N 2
%P 155-169
%@ 2095-9184
%D 2021
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.1900610
TY  - JOUR
T1  - Indirect adaptive fuzzy-regulated optimal control for unknown continuous-time nonlinear systems
A1  - Haiyun Zhang
A1  - Deyuan Meng
A1  - Jin Wang
A1  - Guodong Lu
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 22
IS  - 2
SP  - 155
EP  - 169
SN  - 2095-9184
Y1  - 2021
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.1900610
ER  - 
Abstract: We present a novel indirect adaptive fuzzy-regulated optimal control scheme for continuous-time nonlinear systems with unknown dynamics, mismatches, and disturbances. First, the Hamilton-Jacobi-Bellman (HJB) equation associated with the performance function is derived for the original nonlinear system. Unlike existing adaptive dynamic programming (ADP) approaches, this scheme uses a special non-quadratic variable performance function as the reinforcement medium in the actor-critic architecture. An adaptive fuzzy-regulated critic structure is then constructed to configure the weighting matrix of the performance function so as to approximate and balance the HJB equation. A concurrent self-organizing learning technique is designed to update the critic weights adaptively. Based on this critic, an adaptive optimal feedback controller is developed as the actor, using a new form of augmented Riccati equation to optimize the fuzzy-regulated variable performance function in real time. The result is an online indirect adaptive optimal control mechanism, implemented as an actor-critic structure, that adapts both the optimal cost and the optimal control policy in continuous time. Convergence and closed-loop stability of the proposed system are proved. Simulation examples and comparisons show the effectiveness and advantages of the proposed method.
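For orientation, the HJB equation mentioned in the abstract can be written in its textbook form. The block below is only a sketch under assumptions not stated on this page: an input-affine system, a state penalty Q(x), and a quadratic control penalty with a fixed weight R. The paper itself uses a non-quadratic variable performance function, so its HJB equation differs from this standard form. The symbols f, g, Q, R, and V are illustrative.

```latex
% Assumed input-affine dynamics (illustrative, not taken from the paper)
\dot{x} = f(x) + g(x)\,u

% Assumed infinite-horizon cost with fixed weights Q(x) \ge 0, R \succ 0
V(x_0) = \int_0^{\infty} \bigl( Q(x) + u^{\top} R\, u \bigr)\, dt

% HJB equation for the optimal cost V^*
0 = \min_{u} \Bigl[ Q(x) + u^{\top} R\, u
      + \nabla V^{*}(x)^{\top} \bigl( f(x) + g(x)\,u \bigr) \Bigr]

% Stationarity in u: 2 R u + g^T \nabla V^* = 0
u^{*}(x) = -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla V^{*}(x)

% Substituting u^* back into the HJB equation
0 = Q(x) + \nabla V^{*}(x)^{\top} f(x)
    - \tfrac{1}{4}\, \nabla V^{*}(x)^{\top} g(x) R^{-1} g(x)^{\top} \nabla V^{*}(x)
```

In the linear-quadratic special case (f(x) = Ax, g(x) = B, Q(x) = x^T Q x, V*(x) = x^T P x), the last line reduces to the algebraic Riccati equation A^T P + P A + Q - P B R^{-1} B^T P = 0. This is the standard relative of the augmented Riccati equation that the abstract refers to.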