Affiliation(s):
College of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350108, China;
5G+ Industrial Internet Institute, Fuzhou University, Fuzhou 350108, China
Zhenyi ZHANG, Jie HUANG, Congjie PAN. Multi-agent reinforcement learning behavioral control for nonlinear second-order systems[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2300394
@article{zhang_marlbc,
  title="Multi-agent reinforcement learning behavioral control for nonlinear second-order systems",
  author="Zhenyi ZHANG and Jie HUANG and Congjie PAN",
  journal="Frontiers of Information Technology \& Electronic Engineering",
  year="in press",
  publisher="Zhejiang University Press \& Springer",
  doi="10.1631/FITEE.2300394"
}
%0 Journal Article
%T Multi-agent reinforcement learning behavioral control for nonlinear second-order systems
%A Zhenyi ZHANG
%A Jie HUANG
%A Congjie PAN
%J Frontiers of Information Technology & Electronic Engineering
%@ 2095-9184
%D in press
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2300394
TY  - JOUR
T1  - Multi-agent reinforcement learning behavioral control for nonlinear second-order systems
A1  - Zhenyi ZHANG
A1  - Jie HUANG
A1  - Congjie PAN
JO  - Frontiers of Information Technology & Electronic Engineering
SN  - 2095-9184
Y1  - in press
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.2300394
ER  -
Abstract: Reinforcement learning behavioral control (RLBC) is limited to an individual agent and cannot handle swarm missions, because it models behavior priority learning as a single-agent Markov decision process. In this study, a novel multi-agent reinforcement learning behavioral control (MARLBC) method is proposed to overcome this limitation through joint learning. Specifically, a multi-agent reinforcement learning mission supervisor (MARLMS) is designed for a group of nonlinear second-order systems to assign behavior priorities at the decision layer. By modeling behavior priority switching as a cooperative Markov game, the MARLMS learns an optimal joint behavior priority, reducing dependence on human intelligence and high-performance computing hardware. At the control layer, a group of second-order reinforcement learning controllers (SORLCs) is designed to learn the optimal control policies for tracking position and velocity signals simultaneously. In particular, input saturation constraints are strictly enforced by designing a group of adaptive compensators. Numerical simulation results show that the proposed MARLBC method achieves a lower switching frequency and control cost than finite-time and fixed-time behavioral control and RLBC methods.
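To illustrate the decision-layer idea described in the abstract, the sketch below implements cooperative Q-learning over a joint behavior-priority assignment, treating priority switching as a cooperative Markov game with a single shared team reward. This is a minimal, hedged illustration only: the agent counts, the discretized mission states, and the `team_reward` function are invented for the example and are not the paper's actual formulation or implementation.

```python
import numpy as np

# Illustrative sizes (assumptions, not from the paper)
NUM_AGENTS = 3     # agents in the swarm
NUM_BEHAVIORS = 2  # candidate behavior priorities per agent
NUM_STATES = 4     # discretized mission states

rng = np.random.default_rng(0)

# One shared Q-table over the JOINT priority assignment (cooperative game):
# Q[state, a1, a2, a3], all agents optimizing one team reward.
Q = np.zeros((NUM_STATES,) + (NUM_BEHAVIORS,) * NUM_AGENTS)

def team_reward(state, joint_action):
    """Toy team reward (assumption): penalize agents whose chosen
    behavior index does not match the mission state's parity."""
    return -sum(abs(a - state % NUM_BEHAVIORS) for a in joint_action)

alpha, gamma, eps = 0.1, 0.9, 0.2  # learning rate, discount, exploration
state = 0
for step in range(5000):
    if rng.random() < eps:
        # Explore: random joint priority assignment
        joint = tuple(int(a) for a in rng.integers(NUM_BEHAVIORS, size=NUM_AGENTS))
    else:
        # Exploit: greedy joint assignment from the shared Q-table
        joint = np.unravel_index(int(np.argmax(Q[state])), Q[state].shape)
    r = team_reward(state, joint)
    next_state = int(rng.integers(NUM_STATES))  # toy random mission dynamics
    td_target = r + gamma * Q[next_state].max()
    idx = (state,) + tuple(joint)
    Q[idx] += alpha * (td_target - Q[idx])      # standard Q-learning update
    state = next_state

# Learned greedy joint priorities per mission state
for s in range(NUM_STATES):
    print(s, np.unravel_index(int(np.argmax(Q[s])), Q[s].shape))
```

The design choice to learn one table over the joint action (rather than independent per-agent tables) is what makes the game cooperative: every agent's priority is evaluated against the same team reward, mirroring the joint behavior-priority learning the abstract attributes to the MARLMS.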
Open peer comments: Debate/Discuss/Question/Opinion