
Zhenyu LIU, Gang LEI, Yong XIAN, Leliang REN, Shaopeng LI, Daqiao ZHANG. Time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning[J]. Journal of Zhejiang University Science A, in press. https://doi.org/10.1631/jzus.A2500144
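Time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning
Affiliation: Department of Missile Engineering, Rocket Force University of Engineering, Xi'an 710025, China
Objective: To meet the requirement of simultaneous arrival (time coordination) for multiple hypersonic glide vehicles (HGVs) in cooperative strike missions.
Innovations: 1. A time control entry guidance (TCEG) framework based on deep reinforcement learning (DRL) is proposed; 2. A hybrid time-to-go prediction method that combines an analytical formula with a deep neural network (DNN) is designed; 3. A TCEG method that adaptively adjusts the heading error corridor is proposed, which integrates learning-based control and mission-level objectives in a modular way.
Methods: 1. The reinforcement learning environment and observation space are designed around a reference flight profile, and the agent is trained to guide robustly under heading error corridors of varying width; 2. The time-to-go is estimated with an analytical formula, a DNN predicts the residual between this estimate and the true value, and the two are combined to improve prediction accuracy; 3. Based on the predicted arrival time error, the heading error threshold and the observation vector are corrected in real time, which guides the agent to adjust its output actions dynamically and thereby control the arrival time.
Conclusions: 1. A DRL-based time control entry guidance method is proposed; by predicting the time-to-go in real time and adaptively adjusting the heading error, it achieves high-precision, strongly robust entry guidance without complex reference trajectory design; 2. Commands are generated by a forward pass of the neural network, which greatly reduces the computational demand compared with traditional entry guidance methods and gives excellent real-time performance.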
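The second and third method steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the range-over-speed analytical estimate, the `residual_model` callable standing in for the trained DNN, the linear corridor update, and all names, gains, and limits are hypothetical. The sketch also assumes that an early predicted arrival widens the heading error corridor (more lateral maneuvering, longer path) and a late one narrows it.

```python
from typing import Callable


def analytic_time_to_go(range_to_go_m: float, speed_mps: float) -> float:
    """Crude analytical estimate (assumed form): remaining range over current speed."""
    return range_to_go_m / max(speed_mps, 1e-6)


def hybrid_time_to_go(
    range_to_go_m: float,
    speed_mps: float,
    residual_model: Callable[[float, float], float],
) -> float:
    """Analytical estimate plus a learned correction for its error.

    residual_model is a hypothetical stand-in for the trained DNN that
    predicts (true time-to-go) - (analytical estimate).
    """
    base = analytic_time_to_go(range_to_go_m, speed_mps)
    return base + residual_model(range_to_go_m, speed_mps)


def adjust_heading_threshold(
    base_threshold_deg: float,
    time_error_s: float,
    gain_deg_per_s: float = 0.25,   # hypothetical gain
    min_deg: float = 2.0,           # hypothetical lower bound
    max_deg: float = 30.0,          # hypothetical upper bound
) -> float:
    """Shift the heading error threshold in proportion to the arrival time error.

    time_error_s = desired arrival time - predicted arrival time.
    Positive (predicted arrival too early): widen the corridor.
    Negative (predicted arrival too late): narrow the corridor.
    The result is clamped to [min_deg, max_deg].
    """
    threshold = base_threshold_deg + gain_deg_per_s * time_error_s
    return min(max(threshold, min_deg), max_deg)


if __name__ == "__main__":
    # Constant residual used only for demonstration; a real model would be a trained network.
    demo_residual = lambda r, v: 25.0
    elapsed_s = 900.0
    desired_arrival_s = 1533.0
    t_go = hybrid_time_to_go(3_000_000.0, 5_000.0, demo_residual)
    time_error = desired_arrival_s - (elapsed_s + t_go)
    print("time-to-go [s]:", t_go)
    print("new threshold [deg]:", adjust_heading_threshold(10.0, time_error))
```

Keeping the analytical term separate from the learned residual means the network only has to model the estimation error, and the corridor update stays a small, inspectable function outside the trained agent.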
On-line Access: 2026-04-18; Received: 2025-04-27; Revision Accepted: 2025-11-07; Crosschecked: 2026-04-20
Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou
310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2026 Journal of Zhejiang University-SCIENCE