CLC number: TN929.5
On-line Access: 2022-08-22
Received: 2021-11-19
Revision Accepted: 2022-08-29
Crosschecked: 2022-02-10
https://orcid.org/0000-0002-9047-8889
Peixi LIU, Jiamo JIANG, Guangxu ZHU, Lei CHENG, Wei JIANG, Wu LUO, Ying DU, Zhiqin WANG. Training time minimization for federated edge learning with optimized gradient quantization and bandwidth allocation[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(8): 1247-1263.
@article{Liu2022,
title="Training time minimization for federated edge learning with optimized gradient quantization and bandwidth allocation",
author="Peixi LIU and Jiamo JIANG and Guangxu ZHU and Lei CHENG and Wei JIANG and Wu LUO and Ying DU and Zhiqin WANG",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="23",
number="8",
pages="1247--1263",
year="2022",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.2100538"
}
%0 Journal Article
%T Training time minimization for federated edge learning with optimized gradient quantization and bandwidth allocation
%A Peixi LIU
%A Jiamo JIANG
%A Guangxu ZHU
%A Lei CHENG
%A Wei JIANG
%A Wu LUO
%A Ying DU
%A Zhiqin WANG
%J Frontiers of Information Technology & Electronic Engineering
%V 23
%N 8
%P 1247-1263
%@ 2095-9184
%D 2022
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2100538
TY  - JOUR
T1  - Training time minimization for federated edge learning with optimized gradient quantization and bandwidth allocation
A1  - Peixi LIU
A1  - Jiamo JIANG
A1  - Guangxu ZHU
A1  - Lei CHENG
A1  - Wei JIANG
A1  - Wu LUO
A1  - Ying DU
A1  - Zhiqin WANG
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 23
IS  - 8
SP  - 1247
EP  - 1263
SN  - 2095-9184
Y1  - 2022
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.2100538
ER  - 
Abstract: Training a machine learning model with federated edge learning (FEEL) is typically time-consuming because of the constrained computation power of edge devices and the limited wireless resources in edge networks. In this study, the training time minimization problem is investigated in a quantized FEEL system, where heterogeneous edge devices send quantized gradients to the edge server via orthogonal channels. In particular, a stochastic quantization scheme is adopted to compress the uploaded gradients, which reduces the per-round communication burden but may increase the number of communication rounds. The training time is modeled by taking into account the communication time, the computation time, and the number of communication rounds. Based on the proposed training time model, the intrinsic trade-off between the number of communication rounds and the per-round latency is characterized. Specifically, we analyze the convergence behavior of quantized FEEL in terms of the optimality gap. Furthermore, a joint data-and-model-driven fitting method is proposed to obtain the exact optimality gap, based on which closed-form expressions for the number of communication rounds and the total training time are obtained. Constrained by the total bandwidth, the training time minimization problem is formulated as a joint quantization level and bandwidth allocation optimization problem. To solve it, an algorithm based on alternating optimization is proposed, which alternately solves the quantization optimization subproblem through successive convex approximation and the bandwidth allocation subproblem by bisection search. Simulation results with different learning tasks and models demonstrate the validity of our analysis and the near-optimal performance of the proposed optimization algorithm.
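The trade-off described in the abstract (fewer bits per round against more quantization noise) can be illustrated with a minimal sketch of an unbiased stochastic quantizer. The sketch assumes a QSGD-style uniform grid scaled by the largest gradient magnitude; the names `stochastic_quantize`, `bits_per_entry`, and `levels` are hypothetical, and the scheme used in the paper may differ in its details.

```python
import math
import random


def stochastic_quantize(g, levels, rng):
    """Map each entry of g onto a uniform grid with `levels` steps.

    Assumption: the grid spans [0, max|g|] and each magnitude is rounded
    up or down at random, so the quantized value is unbiased.
    """
    scale = max(abs(x) for x in g)
    if scale == 0.0:
        return list(g)
    out = []
    for x in g:
        pos = abs(x) / scale * levels          # position in [0, levels]
        low = math.floor(pos)
        # Round up with probability equal to the fractional part.
        step = low + (1 if rng.random() < pos - low else 0)
        out.append(math.copysign(step * scale / levels, x))
    return out


def bits_per_entry(levels):
    """One sign bit plus the bits needed to index levels + 1 magnitudes."""
    return 1 + math.ceil(math.log2(levels + 1))


if __name__ == "__main__":
    rng = random.Random(0)
    gradient = [0.3, -0.7, 1.0, 0.0]
    for levels in (1, 3, 15):
        print(levels, bits_per_entry(levels),
              stochastic_quantize(gradient, levels, rng))
```

A smaller `levels` value lowers the per-entry payload but increases the variance of each quantized gradient, which is the mechanism behind the larger number of communication rounds mentioned in the abstract.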