JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering 2025 Vol.26 No.3 P.456-471

Reinforcement learning based privacy-preserving consensus tracking control of nonstrict-feedback discrete-time multi-agent systems

Author(s): Yang YANG, Fanming HUANG, Dong YUE
Affiliation(s): College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
Corresponding email(s): yyang@njupt.edu.cn, medongy@vip.163.com
Key Words: Multi-agent systems, Consensus tracking, Privacy-preserving, Reinforcement learning


Yang YANG, Fanming HUANG, Dong YUE. Reinforcement learning based privacy-preserving consensus tracking control of nonstrict-feedback discrete-time multi-agent systems[J]. Frontiers of Information Technology & Electronic Engineering, 2025, 26(3): 456-471.

@article{title="Reinforcement learning based privacy-preserving consensus tracking control of nonstrict-feedback discrete-time multi-agent systems",
author="Yang YANG and Fanming HUANG and Dong YUE",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="26",
number="3",
pages="456-471",
year="2025",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2300532"
}

%0 Journal Article
%T Reinforcement learning based privacy-preserving consensus tracking control of nonstrict-feedback discrete-time multi-agent systems
%A Yang YANG
%A Fanming HUANG
%A Dong YUE
%J Frontiers of Information Technology & Electronic Engineering
%V 26
%N 3
%P 456-471
%@ 2095-9184
%D 2025
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2300532

TY - JOUR
T1 - Reinforcement learning based privacy-preserving consensus tracking control of nonstrict-feedback discrete-time multi-agent systems
A1 - Yang YANG
A1 - Fanming HUANG
A1 - Dong YUE
JO - Frontiers of Information Technology & Electronic Engineering
VL - 26
IS - 3
SP - 456
EP - 471
SN - 2095-9184
Y1 - 2025
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2300532
ER -


Abstract: This paper investigates the privacy-preserving consensus tracking problem for a class of nonstrict-feedback discrete-time multi-agent systems (MASs). An improved Liu cryptosystem is developed to reduce the errors that encryption and decryption introduce on the plaintext, which ensures satisfactory recovery of the plaintext information. A reinforcement learning (RL) technique is then employed to compensate for unknown dynamics and for the errors between the true signals and the decrypted ones. Based on backstepping and graph theory, an RL-based privacy-preserving consensus tracking control strategy is designed. Using graph theory and Lyapunov stability theory, it is shown that the consensus tracking errors and all signals in the MAS are ultimately bounded. Finally, simulation examples verify the effectiveness of the control strategy.
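
The notion of an ultimately bounded consensus tracking error can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three-agent path graph, the single-integrator followers, the constant leader output, the gain, and the use of plain rounding as a stand-in for the mismatch between true and decrypted signals. It is not the paper's improved Liu cryptosystem, its RL compensator, or its backstepping design.

```python
# Illustrative sketch only (not the paper's method): leader-follower consensus
# tracking where neighbor values arrive with a small bounded mismatch.
# Rounding is a hypothetical stand-in for encryption/decryption error.

def quantize(value, resolution):
    """Round a value to a fixed resolution (assumed stand-in for decrypted data)."""
    return round(value / resolution) * resolution

def local_errors(y, y_ref, adjacency, pinning, resolution):
    """Neighborhood error e_i = sum_j a_ij (y_i - y_j) + b_i (y_i - y_ref),
    computed from the received (quantized) neighbor and leader values."""
    n = len(y)
    received = [quantize(v, resolution) for v in y]
    ref_received = quantize(y_ref, resolution)
    return [
        sum(adjacency[i][j] * (y[i] - received[j]) for j in range(n))
        + pinning[i] * (y[i] - ref_received)
        for i in range(n)
    ]

def simulate(steps=200, gain=0.3, resolution=0.01):
    """Run the assumed update y_i(k+1) = y_i(k) - gain * e_i(k)."""
    adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # undirected path 1-2-3
    pinning = [1, 0, 0]                            # only agent 1 sees the leader
    y = [2.0, -1.0, 0.5]                           # arbitrary initial outputs
    y_ref = 1.0                                    # constant leader output
    for _ in range(steps):
        e = local_errors(y, y_ref, adjacency, pinning, resolution)
        y = [yi - gain * ei for yi, ei in zip(y, e)]
    return y, y_ref

if __name__ == "__main__":
    outputs, reference = simulate()
    print([abs(v - reference) for v in outputs])
```

Because the received values differ from the true ones by at most half the resolution, the tracking errors in this toy setting settle into a small neighborhood of zero rather than vanishing exactly, which is the qualitative behavior the term "ultimately bounded" describes.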

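Reinforcement learning based privacy-preserving consensus tracking control of nonstrict-feedback discrete-time multi-agent systems

Yang YANG¹, Fanming HUANG¹, Dong YUE¹,²
¹College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
²Institute of Advanced Technology for Carbon Neutrality, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
Summary: This paper studies the privacy-preserving consensus tracking problem for a class of nonstrict-feedback discrete-time multi-agent systems. To reduce the effect of the errors between encryption and decryption of the plaintext, an improved Liu cryptosystem is developed so that the plaintext information is recovered well. A reinforcement learning technique is used to compensate for unknown dynamics and for the errors between the true signals and the decrypted signals. Using backstepping and graph theory, a reinforcement learning based privacy-preserving consensus tracking control strategy is designed. With the help of Lyapunov stability theory, it is proved that the consensus tracking errors and all signals of the multi-agent system are ultimately bounded. Finally, simulation examples verify the effectiveness of the designed control strategy.

Key words: Multi-agent systems; Consensus tracking; Privacy-preserving; Reinforcement learning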


