Journal of Zhejiang University

Journal of Zhejiang University SCIENCE A

Accepted manuscript available online (unedited version)

Efficient sensorimotor cues for training a glider to soar autonomously

Author(s): Siyuan ZHENG, Jiachi ZHAO, Lifang ZENG, Zhouhong WANG, Jun LI
Affiliation(s): School of Aeronautics and Astronautics, Zhejiang University, Hangzhou 310027, China
Corresponding email(s): jiachizhao@outlook.com, lifang_zeng@zju.edu.cn
Key Words: Autonomous soaring; Glider; Reinforcement learning; Twin delayed deep deterministic policy gradient (TD3); Sensorimotor cues


Siyuan ZHENG, Jiachi ZHAO, Lifang ZENG, Zhouhong WANG, Jun LI. Efficient sensorimotor cues for training a glider to soar autonomously[J]. Journal of Zhejiang University Science A, in press. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/jzus.A2400567

@article{title="Efficient sensorimotor cues for training a glider to soar autonomously",
author="Siyuan ZHENG, Jiachi ZHAO, Lifang ZENG, Zhouhong WANG, Jun LI",
journal="Journal of Zhejiang University Science A",
year="in press",
publisher="Zhejiang University Press & Springer",
doi="https://doi.org/10.1631/jzus.A2400567"
}

%0 Journal Article
%T Efficient sensorimotor cues for training a glider to soar autonomously
%A Siyuan ZHENG
%A Jiachi ZHAO
%A Lifang ZENG
%A Zhouhong WANG
%A Jun LI
%J Journal of Zhejiang University SCIENCE A
%P 128-141
%@ 1673-565X
%D in press
%I Zhejiang University Press & Springer
%R 10.1631/jzus.A2400567

TY - JOUR
T1 - Efficient sensorimotor cues for training a glider to soar autonomously
A1 - Siyuan ZHENG
A1 - Jiachi ZHAO
A1 - Lifang ZENG
A1 - Zhouhong WANG
A1 - Jun LI
JO - Journal of Zhejiang University Science A
SP - 128
EP - 141
SN - 1673-565X
Y1 - in press
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.A2400567
ER -

Abstract: Migratory birds depend on the perception of atmospheric updrafts for long-distance flight. To make autonomous soaring of an unpowered glider more efficient, strategies for using candidate sensorimotor cues were compared and optimized. A simulation framework for autonomous soaring of an unpowered glider was developed based on a reinforcement learning algorithm. The framework comprises three models: an updraft environment model, the glider's dynamics and control model, and a reinforcement learning agent that learns to harvest more energy in flight. Using this simulation, the effects of different combinations of 12 candidate sensorimotor cues on soaring efficiency were studied. First, the effects of removing one particular cue and of using only a single cue were analyzed. The results showed that the vertical airflow velocity gradient (a_w) and the wing-tip updraft velocity difference (τ) have advantages over the other cues. Second, strategies combining a_w or τ with other cues were analyzed to achieve more effective autonomous soaring, and seven potentially effective combinations of sensorimotor cues were identified. The final results showed that, among the tested combinations, the combination of vertical airflow velocity (V_w) and τ enables the most efficient autonomous soaring. This study identified a highly effective sensorimotor cue strategy to guide an intelligent glider in long-distance autonomous soaring flight.
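
To make the three cues named in the abstract concrete, the following minimal sketch computes V_w, a_w, and τ for a glider near a single thermal. Everything beyond the cue names is an assumption made for illustration and is not taken from the paper: the Gaussian updraft profile, its peak strength and radius, the wingspan, the time step, the finite-difference form of a_w, and the sign convention of τ (left wing tip minus right wing tip).

```python
import math

def updraft(x, y, w_peak=3.0, radius=60.0):
    """Vertical airflow velocity (m/s) of an assumed Gaussian thermal centred at the origin."""
    return w_peak * math.exp(-(x * x + y * y) / radius ** 2)

def sensorimotor_cues(x, y, heading, speed, dt=0.1, wingspan=2.0):
    """Return (V_w, a_w, tau) for a glider at (x, y) flying level along `heading` (radians).

    All parameter defaults are hypothetical values chosen for illustration.
    """
    # V_w: vertical airflow velocity at the glider's position.
    v_w = updraft(x, y)

    # a_w: finite-difference rate of change of V_w along the flight path.
    x_next = x + speed * dt * math.cos(heading)
    y_next = y + speed * dt * math.sin(heading)
    a_w = (updraft(x_next, y_next) - v_w) / dt

    # tau: updraft at the left wing tip minus updraft at the right wing tip.
    # The left-pointing unit normal of the heading vector is (-sin, cos).
    half = wingspan / 2.0
    left = (x - half * math.sin(heading), y + half * math.cos(heading))
    right = (x + half * math.sin(heading), y - half * math.cos(heading))
    tau = updraft(*left) - updraft(*right)

    return v_w, a_w, tau
```

With the thermal centre to the glider's left, τ is positive under this convention, which is the kind of signal an agent could use to decide which way to bank; a_w is positive while the glider flies toward stronger lift and negative while it flies away.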

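Sensorimotor cues for efficient autonomous soaring of a glider

Authors: Siyuan ZHENG¹, Jiachi ZHAO¹, Lifang ZENG¹, Zhouhong WANG¹, Jun LI¹,²
Affiliations: ¹School of Aeronautics and Astronautics, Zhejiang University, Hangzhou 310027, China; ²Huanjiang Laboratory, Shaoxing 311800, China
Objective: This study addresses autonomous soaring of an unpowered glider in atmospheric thermals. Using deep reinforcement learning, it compares individual sensorimotor cues and their combinations in order to identify the sensing scheme that maximizes energy-harvesting efficiency while keeping sensing dependence low, thereby improving the glider's long-endurance flight capability.
Innovations: 1. A systematic framework for evaluating cue effectiveness: state-space design for reinforcement-learning-based autonomous soaring often relies on trial and error and lacks a quantitative basis. Through ablation analysis, single-variable tests, and multi-cue combination analysis, this study systematically reveals the independent contributions and synergistic coupling of 12 sensorimotor cues with respect to soaring performance. 2. A minimal strategy that combines low sensing dependence with high efficiency: departing from the reliance of conventional methods on multidimensional, complex information, the study finds and verifies that a two-variable combination of the left-right wing-tip vertical airflow velocity difference (τ) and the vertical airflow velocity (V_w) is sufficient for efficient autonomous soaring, markedly reducing the system's dependence on sensors while maintaining flight performance.
Methods: A simulation framework for autonomous soaring of an unpowered glider was built on the twin delayed deep deterministic policy gradient (TD3) reinforcement learning algorithm. The framework consists of three core models: an updraft environment model, a glider dynamics and control model, and a reinforcement learning agent. On this platform, 12 candidate sensorimotor cues were selected, including the vertical airflow velocity gradient (a_w), τ, and V_w. The study proceeded in three steps: 1. Sensitivity analysis: tests that omit one specific cue and tests that use only a single cue were run to screen for the core variables with the greatest influence on soaring performance (Figs. 8 and 10); 2. Combination strategy evaluation: building on the core variables, seven candidate strategies combining a_w or τ with other cues were constructed and evaluated in terms of energy harvesting; 3. Trajectory comparison: for the best-performing combinations (τ+a_w and τ+V_w), flight trajectory characteristics from different initial positions were compared, in particular the centripetality and eccentricity of circling, to verify each strategy's ability to locate and track the updraft center (Fig. 17).
Conclusions: 1. Key single cues: a_w and τ are the two most important sensorimotor cues; unlike the other cues, each can guide the glider to soar autonomously on its own, which is a clear advantage. 2. Optimal combination strategy: among all tested cue combinations, τ+V_w performs best and yields the most efficient autonomous soaring. 3. Performance validation: trajectory analysis shows that, compared with the eccentric trajectories produced by the τ+a_w combination, the τ+V_w combination guides the glider to circle more tightly around the updraft center and thus harvest more energy, confirming the efficiency of this low-sensing-dependence strategy in long-distance flight.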
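
The cue-screening step described in the methods (omitting one cue at a time, then using one cue at a time) can be sketched as an enumeration of observation subsets. This is only an illustrative outline under stated assumptions: the summary names three of the 12 cues, so the remaining nine names below are placeholders, and the helper names `leave_one_out`, `single_cue`, and `observation` are hypothetical rather than part of the authors' code.

```python
# Three cues are named in the summary; the other nine names are placeholders.
CUES = ["V_w", "a_w", "tau"] + [f"cue_{i}" for i in range(4, 13)]

def leave_one_out(cues):
    """One configuration per cue, each with that cue removed (ablation test)."""
    return [[c for c in cues if c != dropped] for dropped in cues]

def single_cue(cues):
    """One configuration per cue, each containing that cue alone."""
    return [[c] for c in cues]

def observation(all_values, active):
    """Build the agent's state vector from the active cues only."""
    return [all_values[c] for c in active]
```

Each configuration would then be used to train and evaluate a separate agent, so that differences in harvested energy can be attributed to the presence or absence of a cue.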

Key words: Autonomous soaring; Glider; Reinforcement learning; Twin delayed deep deterministic policy gradient (TD3); Sensorimotor cues


