On-line Access: 2024-11-05

Received: 2024-06-01

Revision Accepted: 2024-09-13

Frontiers of Information Technology & Electronic Engineering

http://doi.org/10.1631/FITEE.2400468


Adaptive layer splitting for wireless LLM inference in edge computing: a model-based reinforcement learning approach


Author(s):  Yuxuan CHEN, Rongpeng LI, Xiaoxue YU, Zhifeng ZHAO, Honggang ZHANG

Affiliation(s):  College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou 310027, China

Corresponding email(s):   cyx00@zju.edu.cn, lirongpeng@zju.edu.cn, sdwhyxx@zju.edu.cn, zhaozf@zhejianglab.com, honggangzhang@zju.edu.cn

Key Words:  Large language models (LLMs), Edge computing, Model-based reinforcement learning (MBRL), Split inference, Transformer


Yuxuan CHEN, Rongpeng LI, Xiaoxue YU, Zhifeng ZHAO, Honggang ZHANG. Adaptive layer splitting for wireless LLM inference in edge computing: a model-based reinforcement learning approach[J]. Frontiers of Information Technology & Electronic Engineering. http://doi.org/10.1631/FITEE.2400468

@article{FITEE.2400468,
title="Adaptive layer splitting for wireless LLM inference in edge computing: a model-based reinforcement learning approach",
author="Yuxuan CHEN and Rongpeng LI and Xiaoxue YU and Zhifeng ZHAO and Honggang ZHANG",
journal="Frontiers of Information Technology \& Electronic Engineering",
issn="2095-9184",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.2400468"
}


Abstract: 
Optimizing the deployment of large language models (LLMs) in edge computing environments is critical for enhancing privacy and computational efficiency. Toward efficient wireless LLM inference in edge computing, this study comprehensively analyzes how the choice of splitting point affects mainstream open-source LLMs. Building on this analysis, it introduces a framework inspired by model-based reinforcement learning (MBRL) to determine the optimal splitting point between the edge and the user equipment (UE). By incorporating a reward surrogate model, the approach significantly reduces the computational cost of frequent performance evaluations. Extensive simulations demonstrate that the method effectively balances inference performance and computational load under varying network conditions, providing a robust solution for LLM deployment in decentralized settings.
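
The abstract gives no algorithmic detail, so the following is only a toy sketch of the general idea behind a reward surrogate: evaluate a few candidate splitting points with an expensive reward, fit a cheap model to those samples, and pick the splitting point from the model's predictions. It is not the authors' method and involves no reinforcement learning loop. Every name and number in it (`N_LAYERS`, `LOSS_RATE`, `true_reward`, `fit_line`, `choose_split`, the linear reward shape, the probe points) is a hypothetical assumption made for illustration.

```python
N_LAYERS = 32  # hypothetical model depth; layers [0, split) run on the UE
LOSS_RATE = {"good": 0.02, "bad": 0.30}  # hypothetical channel loss rates


def true_reward(split: int, channel: str) -> float:
    """Toy stand-in for an expensive evaluation (e.g., running the LLM).

    Assumptions (not from the paper): activations transmitted from an
    earlier layer suffer more from channel loss, and each layer kept on
    the UE adds a fixed compute cost.
    """
    sensitivity = 1.0 - split / N_LAYERS
    quality_penalty = 10.0 * LOSS_RATE[channel] * sensitivity
    ue_cost = 0.05 * split
    return -(quality_penalty + ue_cost)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def choose_split(channel: str, probe_splits=(1, 8, 16, 24)) -> int:
    """Pick a splitting point using a surrogate fitted to a few probes."""
    # Only these few calls pay the (simulated) expensive evaluation cost.
    rewards = [true_reward(s, channel) for s in probe_splits]
    slope, intercept = fit_line(probe_splits, rewards)
    # Every other candidate is scored by the cheap surrogate instead.
    candidates = range(1, N_LAYERS)
    return max(candidates, key=lambda s: slope * s + intercept)


if __name__ == "__main__":
    for channel in LOSS_RATE:
        print(channel, choose_split(channel))
```

Under these toy assumptions the sketch selects an early split (layer 1) on the good channel and a late split (layer 31) on the bad channel, using four expensive evaluations per channel instead of 31. A realistic reward would not be linear in the splitting point, so the surrogate would need a richer model and periodic refitting.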
