CLC number: TP14
On-line Access: 2021-11-15
Received: 2020-11-08
Revision Accepted: 2021-02-15
Crosschecked: 2021-04-01
Bihao Sun, Jinhui Hu, Dawen Xia, Huaqing Li. A distributed stochastic optimization algorithm with gradient-tracking and distributed heavy-ball acceleration. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2000615
A distributed stochastic optimization algorithm with gradient-tracking and distributed heavy-ball acceleration
1 Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
2 School of Data Science and Information Engineering, Guizhou Minzu University, Guiyang 550025, China
Abstract: Distributed optimization has developed rapidly in recent years owing to its wide applications in machine learning and signal processing. This paper studies distributed optimization for finding the global minimum of an objective function, defined as the sum of smooth and strongly convex local cost functions distributed over an undirected network of n nodes. Unlike existing work, we introduce a distributed heavy-ball term to improve the algorithm's convergence performance. To accelerate the convergence of existing distributed stochastic first-order gradient algorithms, the momentum term is combined with a gradient-tracking technique. Simulation results show that, without increasing complexity, the proposed algorithm attains a higher convergence rate than GT-SAGA. Numerical experiments on real-world datasets demonstrate its effectiveness and correctness.
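The combination the abstract describes — consensus averaging plus a heavy-ball momentum term, with a gradient tracker estimating the network-average gradient — can be sketched as follows. This is a minimal deterministic-gradient sketch, not the paper's stochastic variance-reduced method; the ring network, quadratic local costs, and step sizes are illustrative assumptions.

```python
import numpy as np

# Toy setup: 5 nodes on a ring, each holding a smooth, strongly convex
# local least-squares cost f_i(x) = 0.5 * ||A_i x - b_i||^2.
rng = np.random.default_rng(0)
n_nodes, dim = 5, 3
A = [rng.standard_normal((10, dim)) for _ in range(n_nodes)]
x_true = rng.standard_normal(dim)
b = [Ai @ x_true + 0.1 * rng.standard_normal(10) for Ai in A]

def grad(i, x):
    """Local gradient of node i's cost at x."""
    return A[i].T @ (A[i] @ x - b[i])

# Doubly stochastic mixing matrix for the ring (Metropolis weights).
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    W[i, (i - 1) % n_nodes] = W[i, (i + 1) % n_nodes] = 1.0 / 3.0
    W[i, i] = 1.0 / 3.0

alpha, beta = 0.01, 0.3          # step size and heavy-ball momentum (assumed)
x = np.zeros((n_nodes, dim))     # one row per node
x_prev = x.copy()
g = np.stack([grad(i, x[i]) for i in range(n_nodes)])
y = g.copy()                     # trackers, initialized at the local gradients

for _ in range(3000):
    # Consensus step + tracked-gradient descent + heavy-ball momentum.
    x_new = W @ x - alpha * y + beta * (x - x_prev)
    g_new = np.stack([grad(i, x_new[i]) for i in range(n_nodes)])
    # Gradient-tracking update: y asymptotically tracks the average gradient.
    y = W @ y + g_new - g
    x_prev, x, g = x, x_new, g_new

# Every node's iterate converges to the minimizer of sum_i f_i.
```

In the paper's stochastic setting, the exact local gradient in the tracking update would be replaced by a variance-reduced stochastic estimate; the consensus, tracking, and momentum structure is unchanged.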