CLC number: TP39
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2023-04-07
Tao SHEN, Jie ZHANG, Xinkang JIA, Fengda ZHANG, Zheqi LV, Kun KUANG, Chao WU, Fei WU. Federated mutual learning: a collaborative machine learning method for heterogeneous data, models, and objectives[J]. Frontiers of Information Technology & Electronic Engineering, 2023, 24(10): 1390-1402.
@article{shen2023federated,
title="Federated mutual learning: a collaborative machine learning method for heterogeneous data, models, and objectives",
author="Tao SHEN, Jie ZHANG, Xinkang JIA, Fengda ZHANG, Zheqi LV, Kun KUANG, Chao WU, Fei WU",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="24",
number="10",
pages="1390-1402",
year="2023",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2300098"
}
%0 Journal Article
%T Federated mutual learning: a collaborative machine learning method for heterogeneous data, models, and objectives
%A Tao SHEN
%A Jie ZHANG
%A Xinkang JIA
%A Fengda ZHANG
%A Zheqi LV
%A Kun KUANG
%A Chao WU
%A Fei WU
%J Frontiers of Information Technology & Electronic Engineering
%V 24
%N 10
%P 1390-1402
%@ 2095-9184
%D 2023
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2300098
TY - JOUR
T1 - Federated mutual learning: a collaborative machine learning method for heterogeneous data, models, and objectives
A1 - Tao SHEN
A1 - Jie ZHANG
A1 - Xinkang JIA
A1 - Fengda ZHANG
A1 - Zheqi LV
A1 - Kun KUANG
A1 - Chao WU
A1 - Fei WU
JO - Frontiers of Information Technology & Electronic Engineering
VL - 24
IS - 10
SP - 1390
EP - 1402
SN - 2095-9184
Y1 - 2023
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2300098
ER -
Abstract: Federated learning (FL) is a deep learning technique that enables clients to collaboratively train a shared model while keeping their data decentralized. However, researchers working on FL face several unique challenges, especially in the context of heterogeneity: differences in data distributions, computational capabilities, and scenarios among clients necessitate customized models and objectives, which existing methods such as FedAvg cannot effectively accommodate. To address the challenges arising from heterogeneity in FL, we first provide an overview of the heterogeneities in data, model, and objective (DMO). We then propose a novel framework called federated mutual learning (FML), which enables each client to train a personalized model that accounts for data heterogeneity (DH). A "meme model" serves as an intermediary between the personalized and global models to address model heterogeneity (MH), and a knowledge distillation technique called deep mutual learning (DML) transfers knowledge between these two models on local data. To overcome objective heterogeneity (OH), the global model shares only certain parts of the network, while the personalized model remains task-specific and is enhanced through mutual learning with the meme model. We evaluate the performance of FML in addressing DMO heterogeneities through experiments and compare it with other commonly used FL methods in similar scenarios. The results demonstrate that FML outperforms these methods and effectively addresses the DMO challenges encountered in the FL setting.
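The local training step at the heart of FML can be made concrete. Below is a minimal PyTorch sketch of one round of mutual learning between the personalized model and the meme model on local data, assuming the standard DML objective (cross-entropy on local labels plus a bidirectional KL-divergence term); the function name local_mutual_update, the weights alpha and beta, and the detached-target simplification are illustrative assumptions, not the authors' exact implementation.

    # Minimal sketch of one FML-style local update (assumed formulation).
    import torch
    import torch.nn.functional as F

    def local_mutual_update(personal, meme, loader, opt_p, opt_m,
                            alpha=0.5, beta=0.5, device="cpu"):
        # One pass of deep mutual learning (DML) between the personalized
        # model and the meme model (the local copy of the global model).
        personal.train()
        meme.train()
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            logits_p = personal(x)
            logits_m = meme(x)
            # Each model fits the labels and mimics the other's softened
            # predictions; targets are detached so each loss term updates
            # only its own model.
            kl_p = F.kl_div(F.log_softmax(logits_p, dim=1),
                            F.softmax(logits_m.detach(), dim=1),
                            reduction="batchmean")
            kl_m = F.kl_div(F.log_softmax(logits_m, dim=1),
                            F.softmax(logits_p.detach(), dim=1),
                            reduction="batchmean")
            loss_p = alpha * F.cross_entropy(logits_p, y) + (1 - alpha) * kl_p
            loss_m = beta * F.cross_entropy(logits_m, y) + (1 - beta) * kl_m
            opt_p.zero_grad()
            opt_m.zero_grad()
            (loss_p + loss_m).backward()
            opt_p.step()
            opt_m.step()
        return meme

Because the two networks interact only through their output distributions, the personalized model and the meme model may have entirely different architectures, which is how the framework accommodates model heterogeneity; after local training, only the meme model would be uploaded for server-side aggregation, so the personalized model never leaves the client.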
[1]Alam S, Liu LY, Yan M, et al., 2023. FedRolex: model-heterogeneous federated learning with rolling sub-model extraction. https://arxiv.org/abs/2212.01548
[2]Chen HT, Wang YH, Xu C, et al., 2019. Data-free learning of student networks. IEEE/CVF Int Conf on Computer Vision, p.3513-3521.
[3]Chen HY, Chao WL, 2022. On bridging generic and personalized federated learning for image classification. https://arxiv.org/abs/2107.00778
[4]Corchado JM, Li WG, Bajo J, et al., 2016. Special issue on distributed computing and artificial intelligence. Front Inform Technol Electron Eng, 17(4):281-282.
[5]Gao DS, Ju C, Wei XG, et al., 2020. HHHFL: hierarchical heterogeneous horizontal federated learning for electroencephalography. https://arxiv.org/abs/1909.05784
[6]Gao JQ, Li JQ, Shan HM, et al., 2023. Forget less, count better: a domain-incremental self-distillation learning benchmark for lifelong crowd counting. Front Inform Technol Electron Eng, 24(2):187-202.
[7]He CY, Annavaram M, Avestimehr S, et al., 2021. FedNAS: federated deep learning via neural architecture search. https://arxiv.org/abs/2004.08546v1
[8]Hinton G, Vinyals O, Dean J, 2015. Distilling the knowledge in a neural network. https://arxiv.org/abs/1503.02531
[9]Jiang YH, Konečný J, Rush K, et al., 2023. Improving federated learning personalization via model agnostic meta learning. https://arxiv.org/abs/1909.12488
[10]Kairouz P, McMahan HB, Avent B, et al., 2021. Advances and open problems in federated learning. Found Trends Mach Learn, 14(1-2):1-210.
[11]Khodak M, Balcan MF, Talwalkar A, 2019. Adaptive gradient-based meta-learning methods. https://arxiv.org/abs/1906.02717
[12]Krizhevsky A, 2009. Learning Multiple Layers of Features from Tiny Images. Master's Thesis, Department of Computer Science, University of Toronto, Canada.
[13]LeCun Y, Boser B, Denker J, et al., 1989. Handwritten digit recognition with a back-propagation network. Proc 2nd Int Conf on Neural Information Processing Systems, p.396-404.
[14]LeCun Y, Bottou L, Bengio Y, et al., 1998. Gradient-based learning applied to document recognition. Proc IEEE, 86(11):2278-2324.
[15]Li DL, Wang JP, 2019. FedMD: heterogenous federated learning via model distillation. https://arxiv.org/abs/1910.03581
[16]Li JH, 2018. Cyber security meets artificial intelligence: a survey. Front Inform Technol Electron Eng, 19(12):1462-1474.
[17]Li T, Sahu AK, Zaheer M, et al., 2020. Federated optimization in heterogeneous networks. https://arxiv.org/abs/1812.06127v5
[18]Li WH, Bilen H, 2020. Knowledge distillation for multi-task learning. Proc European Conf on Computer Vision, p.163-176.
[19]Li X, Huang KX, Yang WH, et al., 2019. On the convergence of FedAvg on non-IID data. https://arxiv.org/abs/1907.02189
[20]Li X, Yang WH, Wang SS, et al., 2021. Communication efficient decentralized training with multiple local updates. https://arxiv.org/abs/1910.09126v1
[21]Lian XR, Zhang C, Zhang H, et al., 2017. Can decentralized algorithms outperform centralized algorithms? A case study for decentralized parallel stochastic gradient descent. Proc 31st Int Conf on Neural Information Processing Systems, p.5336-5346.
[22]Liang PP, Liu T, Liu ZY, et al., 2020. Think locally, act globally: federated learning with local and global representations. https://arxiv.org/abs/2001.01523
[23]Lim WYB, Luong NC, Hoang DT, et al., 2020. Federated learning in mobile edge networks: a comprehensive survey. IEEE Commun Surv Tutor, 22(3):2031-2063.
[24]Liu FL, Wu X, Ge S, et al., 2020. Federated learning for vision-and-language grounding problems. Proc AAAI Conf Artif Intell, 34(7):11572-11579.
[25]Liu PX, Jiang JM, Zhu GX, et al., 2022. Training time minimization for federated edge learning with optimized gradient quantization and bandwidth allocation. Front Inform Technol Electron Eng, 23(8):1247-1263.
[26]McMahan B, Moore E, Ramage D, et al., 2017. Communication-efficient learning of deep networks from decentralized data. Proc 20th Int Conf on Artificial Intelligence and Statistics, p.1273-1282.
[27]Padhya M, Jinwala DC, 2019. MULKASE: a novel approach for key-aggregate searchable encryption for multi-owner data. Front Inform Technol Electron Eng, 20(12):1717-1748.
[28]Pan YH, 2017. Special issue on artificial intelligence 2.0. Front Inform Technol Electron Eng, 18(1):1-2.
[29]Pan YH, 2018. 2018 special issue on artificial intelligence 2.0: theories and applications. Front Inform Technol Electron Eng, 19(1):1-2.
[30]Smith V, Chiang CK, Sanjabi M, et al., 2017. Federated multi-task learning. Proc 31st Int Conf on Neural Information Processing Systems, p.4427-4437.
[31]Wang J, Li R, Wang J, et al., 2020. Artificial intelligence and wireless communications. Front Inform Technol Electron Eng, 21(10):1413-1425.
[32]Wang TZ, Zhu JY, Torralba A, et al., 2020. Dataset distillation. https://arxiv.org/abs/1811.10959
[33]Wu BC, Dai XL, Zhang PZ, et al., 2019. FBNet: hardware-aware efficient ConvNet design via differentiable neural architecture search. IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.10726-10734.
[34]Wu JX, Li JH, Ji XS, 2018. Security for cyberspace: challenges and opportunities. Front Inform Technol Electron Eng, 19(12):1459-1461.
[35]Yang Q, Liu Y, Cheng Y, et al., 2019. Federated Learning. Springer, Cham, Switzerland, p.1-207.
[36]Yu T, Bagdasaryan E, Shmatikov V, 2022. Salvaging federated learning by local adaptation. https://arxiv.org/abs/2002.04758
[37]Zhang X, Li YC, Li WP, et al., 2022. Personalized federated learning via variational Bayesian inference. Proc Int Conf on Machine Learning, p.26293-26310.
[38]Zhang Y, Xiang T, Hospedales TM, et al., 2018. Deep mutual learning. IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.4320-4328.
[39]Zhao Y, Li M, Lai LZ, et al., 2022. Federated learning with non-IID data. https://arxiv.org/abs/1806.00582