Zhenyu LIU, Gang LEI, Yong XIAN, Leliang REN, Shaopeng LI, Daqiao ZHANG. A time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning[J]. Journal of Zhejiang University Science A, in press. https://doi.org/10.1631/jzus.A2500144
@article{title="A time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning", author="Zhenyu LIU, Gang LEI, Yong XIAN, Leliang REN, Shaopeng LI, Daqiao ZHANG", journal="Journal of Zhejiang University Science A", year="in press", publisher="Zhejiang University Press & Springer", doi="https://doi.org/10.1631/jzus.A2500144" }
%0 Journal Article %T A time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning %A Zhenyu LIU %A Gang LEI %A Yong XIAN %A Leliang REN %A Shaopeng LI %A Daqiao ZHANG %J Journal of Zhejiang University SCIENCE A %P %@ 1673-565X %D in press %I Zhejiang University Press & Springer doi="https://doi.org/10.1631/jzus.A2500144"
TY - JOUR T1 - A time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning A1 - Zhenyu LIU A1 - Gang LEI A1 - Yong XIAN A1 - Leliang REN A1 - Shaopeng LI A1 - Daqiao ZHANG J0 - Journal of Zhejiang University Science A SP - EP - %@ 1673-565X Y1 - in press PB - Zhejiang University Press & Springer ER - doi="https://doi.org/10.1631/jzus.A2500144"
Abstract: To meet the requirement of simultaneous arrival for multiple hypersonic glide vehicles (HGVs), we propose a time control entry guidance (TCEG) method leveraging deep reinforcement learning. First, the entry guidance problem is solved with a reinforcement learning framework based on a designed reference flight profile. By appropriately designing the observation space and training environment, the well-trained agent demonstrates robust guidance performance under varying widths of the heading error corridor. Then, a novel method for predicting the remaining flight time is established, which consists of two main components. The first component estimates the remaining flight time using an analytical formula, while the second component employs a deep neural network (DNN) to predict the residual error between the estimated and true values. Subsequently, based on the predicted arrival time error, the threshold of the heading error and the observation vector are corrected in real time, thereby guiding the agent to dynamically adjust its output actions. This enables precise control of the terminal arrival time. Since the generation of guidance commands requires only forward computation by the neural network, the proposed method exhibits excellent real-time performance. Finally, the effectiveness and robustness of the method are demonstrated through numerical simulations in various scenarios.
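The two-component prediction of remaining flight time described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' formulation: the constant-speed formula stands in for the paper's analytical estimate, `ResidualMLP` is a hypothetical one-hidden-layer network with random (untrained) weights standing in for the trained DNN, and the state keys and feature scaling are invented names.

```python
import numpy as np


def analytical_time_to_go(range_to_go_m, speed_mps):
    """Placeholder analytical estimate (assumption): constant-speed travel time."""
    return range_to_go_m / speed_mps


class ResidualMLP:
    """Hypothetical one-hidden-layer network standing in for the trained DNN.

    Weights are random here; in practice they would come from training on
    (state, true time-to-go minus analytical estimate) pairs.
    """

    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(scale=0.1, size=(n_hidden, 1))
        self.b2 = np.zeros(1)

    def __call__(self, x):
        # Forward pass only: this is the sole computation needed online.
        h = np.tanh(x @ self.w1 + self.b1)
        return float((h @ self.w2 + self.b2)[0])


def predict_time_to_go(state, net):
    """Analytical estimate plus the network-predicted residual."""
    t_analytic = analytical_time_to_go(state["range_to_go_m"], state["speed_mps"])
    # Illustrative feature scaling (assumption) to keep inputs near unit size.
    features = np.array([
        state["range_to_go_m"] / 1e6,
        state["speed_mps"] / 1e3,
        state["altitude_m"] / 1e4,
    ])
    return t_analytic + net(features)


def arrival_time_error(t_now_s, t_go_pred_s, t_desired_s):
    """Predicted arrival time minus desired arrival time (positive means late)."""
    return (t_now_s + t_go_pred_s) - t_desired_s


if __name__ == "__main__":
    net = ResidualMLP(n_in=3, n_hidden=8)
    state = {"range_to_go_m": 3.0e6, "speed_mps": 5000.0, "altitude_m": 45000.0}
    t_go = predict_time_to_go(state, net)
    print("predicted time to go [s]:", t_go)
    print("arrival time error [s]:", arrival_time_error(100.0, t_go, 720.0))
```

In a scheme like the one the abstract outlines, the arrival time error would then drive the real-time correction of the heading error threshold and the observation vector; that correction law is not given in the abstract and is not sketched here.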