JZUS - Journal of Zhejiang University SCIENCE

ENGINEERING Information Technology & Electronic Engineering

Accepted manuscript available online (unedited version)

Black-box adversarial attacks on deep reinforcement learning-based proportional–integral–derivative controllers for load frequency control

Author(s): Wei WANG, Zhenyong ZHANG, Xin WANG, Xuguo JIAO
Affiliation(s): State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
Corresponding email(s): zhangzy@gzu.edu.cn
Key Words: Adaptive controller; Deep reinforcement learning; Load frequency control; Adversarial attacks

Wei WANG, Zhenyong ZHANG, Xin WANG, Xuguo JIAO. Black-box adversarial attacks on deep reinforcement learning-based proportional–integral–derivative controllers for load frequency control[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2401021

@article{wang_fitee2401021,
title="Black-box adversarial attacks on deep reinforcement learning-based proportional–integral–derivative controllers for load frequency control",
author="Wei WANG and Zhenyong ZHANG and Xin WANG and Xuguo JIAO",
journal="Frontiers of Information Technology & Electronic Engineering",
year="in press",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2401021"
}

%0 Journal Article
%T Black-box adversarial attacks on deep reinforcement learning-based proportional–integral–derivative controllers for load frequency control
%A Wei WANG
%A Zhenyong ZHANG
%A Xin WANG
%A Xuguo JIAO
%J Frontiers of Information Technology & Electronic Engineering
%P 2128-2142
%@ 2095-9184
%D in press
%I Zhejiang University Press & Springer
%R https://doi.org/10.1631/FITEE.2401021

TY - JOUR
T1 - Black-box adversarial attacks on deep reinforcement learning-based proportional–integral–derivative controllers for load frequency control
A1 - Wei WANG
A1 - Zhenyong ZHANG
A1 - Xin WANG
A1 - Xuguo JIAO
JO - Frontiers of Information Technology & Electronic Engineering
SP - 2128
EP - 2142
SN - 2095-9184
Y1 - in press
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2401021
ER -

Abstract: Load frequency control (LFC) is usually managed by traditional proportional–integral–derivative (PID) controllers. Recently, deep reinforcement learning (DRL)-based adaptive controllers have been widely studied for their superior performance. However, DRL-based adaptive controllers are inherently vulnerable to adversarial attacks. To develop more robust control systems, this study presents an in-depth analysis of the vulnerability of DRL-based adaptive controllers under adversarial attacks. First, an adaptive controller is developed based on a DRL algorithm. Subsequently, considering the limited capability of attackers, the DRL-based LFC is evaluated under black-box adversarial attacks generated with the zeroth-order optimization (ZOO) method, which needs only query access to the model rather than its gradients. Finally, we use adversarial training to enhance the robustness of DRL-based adaptive controllers. Extensive simulations are conducted to evaluate the performance of the DRL-based PID controller with and without adversarial attacks.
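
The black-box setting described in the abstract can be illustrated with a minimal sketch of a ZOO-style attack. It is not the authors' implementation: the abstract does not specify the victim model, the attack objective, or any hyperparameters, so the fixed-gain toy loss `black_box_loss`, the step size, and the perturbation bound `eps` below are all assumptions chosen for illustration. The sketch also swaps the Adam/Newton coordinate updates of the original ZOO method for a simpler signed ascent step with projection onto an L-infinity ball.

```python
def zoo_gradient(f, x, h=1e-4):
    """Estimate the gradient of a black-box scalar function f at x using
    symmetric finite differences, one coordinate at a time (queries only)."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad


def zoo_attack(f, x0, eps=0.1, step=0.02, iters=50):
    """Increase f by perturbing x0, using only function queries.
    The perturbed input stays within an L-infinity ball of radius eps."""
    x = list(x0)
    for _ in range(iters):
        g = zoo_gradient(f, x)
        for i in range(len(x)):
            sign = (g[i] > 0) - (g[i] < 0)   # signed ascent direction
            x[i] += step * sign
            # Project back into [x0[i] - eps, x0[i] + eps].
            x[i] = min(max(x[i], x0[i] - eps), x0[i] + eps)
    return x


def black_box_loss(obs):
    """Hypothetical stand-in for the victim: a PID-like law with fixed
    gains. The attacker only sees the returned scalar, not the gains."""
    kp, ki, kd = 1.0, 0.5, 0.1            # assumed gains, illustration only
    err, integ, deriv = obs
    u = kp * err + ki * integ + kd * deriv
    return u * u                           # squared control effort


x0 = [0.1, 0.05, -0.02]                    # assumed clean observation
x_adv = zoo_attack(black_box_loss, x0, eps=0.1)
print("clean loss:", black_box_loss(x0))
print("adversarial loss:", black_box_loss(x_adv))
```

Each iteration costs two queries per input coordinate, which is why query-based attacks of this kind suit an attacker with limited capability: no access to model parameters or gradients is assumed, only the ability to observe outputs.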

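Black-box adversarial attacks on deep reinforcement learning proportional–integral–derivative controllers in load frequency control scenarios

Wei WANG^1, Zhenyong ZHANG^1,2, Xin WANG^2, Xuguo JIAO^3,4
^1 State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
^2 Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
^3 School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266033, China
^4 State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
Abstract: Load frequency control is usually managed by traditional proportional–integral–derivative (PID) controllers. In recent years, adaptive controllers based on deep reinforcement learning have attracted wide attention for their excellent performance. However, such controllers are inherently vulnerable and susceptible to adversarial attacks. To develop more robust control systems, this paper analyzes in depth the vulnerability of deep reinforcement learning-based adaptive controllers under adversarial attacks. First, an adaptive controller is developed based on a deep reinforcement learning algorithm. Second, considering the limited capability of attackers, the zeroth-order optimization method is used to evaluate the performance of deep reinforcement learning-based load frequency control under adversarial attacks. Finally, adversarial training is used to enhance the robustness of deep reinforcement learning-based adaptive controllers. Extensive simulations evaluate the performance of the deep reinforcement learning-based PID controller with and without adversarial attacks.

Key words: Adaptive controller; Deep reinforcement learning; Load frequency control; Adversarial attacks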
