CLC number: TP391.1
On-line Access: 2024-05-06
Received: 2023-07-12
Revision Accepted: 2024-05-06
Crosschecked: 2023-10-08
Wei LIN, Lichuan LIAO. Towards sustainable adversarial training with successive perturbation generation[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2300474
Towards sustainable adversarial training with successive perturbation generation

1 School of Computer Science and Mathematics, Fujian University of Technology, Fuzhou 350118, China
2 School of Economics and Management, Xi'an University of Technology, Xi'an 710048, China
3 Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of Technology, Fuzhou 350118, China

Abstract: Adversarial training with online-generated adversarial examples has achieved promising results in defending against adversarial attacks and improving the robustness of convolutional neural network (CNN) models. However, most existing adversarial training methods are dedicated to finding strong adversarial examples that force the model to learn the adversarial data distribution, which inevitably incurs substantial computational overhead and causes a loss of accuracy on clean data. In this paper, we show that progressively strengthening the adversarial examples themselves across training epochs effectively improves model robustness, that an appropriate model transfer preserves generalization performance, and that the computational cost of this transfer is negligible. We therefore propose a successive perturbation generation method for adversarial training (SPGAT), which progressively strengthens adversarial examples by adding perturbations to the adversarial examples transferred from the previous epoch, and transfers models across epochs to improve the efficiency of adversarial training. Experiments show that SPGAT is both efficient and effective; for example, it takes 900 min of computation time, compared with 4100 min for standard adversarial training, while improving adversarial accuracy and clean-sample accuracy by more than 7% and 3%, respectively. We extensively evaluate SPGAT on different datasets, including the small-scale MNIST, middle-scale CIFAR-10, and large-scale CIFAR-100. Experimental results show that the proposed method is more effective than state-of-the-art methods.
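The core idea described in the abstract — carrying perturbations over from the previous epoch and taking only one cheap ascent step per epoch, so that adversarial strength accumulates gradually instead of being recomputed from scratch — can be sketched in NumPy. This is a hypothetical toy illustration on a logistic-regression model with an FGSM-style sign-gradient step, not the authors' SPGAT implementation; all names, hyperparameters, and the model itself are illustrative assumptions.

```python
import numpy as np

# Toy sketch of successive perturbation generation (hypothetical, not
# the paper's code): each epoch takes ONE sign-gradient ascent step on
# the perturbation carried over from the previous epoch, projected back
# into the eps-ball, so adversarial strength grows across epochs while
# per-epoch attack cost stays low.

rng = np.random.default_rng(0)
n, d = 200, 10
eps, alpha, lr, epochs = 0.3, 0.05, 0.1, 20

X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true > 0).astype(float)   # linearly separable toy labels

w = np.zeros(d)                      # logistic-regression weights
delta = np.zeros_like(X)             # perturbations persist across epochs

def gradients(w, X_adv, y):
    """Gradients of the mean logistic loss w.r.t. weights and inputs."""
    p = 1.0 / (1.0 + np.exp(-(X_adv @ w)))
    grad_w = X_adv.T @ (p - y) / len(y)
    grad_x = np.outer(p - y, w)      # per-sample input gradient (up to scale)
    return grad_w, grad_x

for _ in range(epochs):
    # Successive perturbation: strengthen the carried-over delta by one
    # sign-gradient step, then clip back into the L-inf eps-ball.
    _, grad_x = gradients(w, X + delta, y)
    delta = np.clip(delta + alpha * np.sign(grad_x), -eps, eps)
    # Standard descent step on the strengthened adversarial examples.
    grad_w, _ = gradients(w, X + delta, y)
    w -= lr * grad_w

clean_acc = np.mean(((X @ w) > 0).astype(float) == y)
```

Because each epoch reuses the previous epoch's `delta` rather than restarting a multi-step attack, the per-epoch attack cost is a single gradient evaluation, which mirrors the efficiency argument the abstract makes for SPGAT.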
Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou
310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn