On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2022-06-24
Ya-kun ZHANG, Guo-fang GONG, Hua-yong YANG, Yu-xi CHEN, Geng-lin CHEN. Towards autonomous and optimal excavation of shield machine: a deep reinforcement learning-based approach[J]. Journal of Zhejiang University Science A, in press. https://doi.org/10.1631/jzus.A2100325
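Towards autonomous and optimal excavation of shield machine: a deep reinforcement learning-based approach

Affiliations: 1 State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou 310027, China; 2 School of Electrical and Power Engineering, China University of Mining and Technology, Xuzhou 221116, China

Objective: Autonomous excavation is the development trend for the new generation of intelligent tunnel boring machines (TBMs). Existing techniques, however, are limited to supervised machine learning and static optimization. Their performance cannot exceed that of human operators, and they struggle with changing geological conditions and long-term excavation performance indicators. This paper addresses the dynamic optimization of shield machine excavation performance in order to achieve autonomous optimal excavation.

Innovations: 1. For the shield machine-environment interaction dynamics of the excavation process, a high-accuracy hybrid modeling method is proposed that combines first-principles analysis with deep neural networks, improving model interpretability and simplifying feature selection. 2. A dimensionless, multi-objective comprehensive excavation performance index is proposed for intelligent shield machine operating systems. 3. An autonomous optimal excavation method that combines deep learning with optimal control is proposed, providing intelligent decision-making for excavation parameters and multi-objective dynamic optimization of long-term comprehensive excavation performance.

Methods: 1. Theoretical derivation reveals the multi-system coupling relationships of the excavation process and yields the two degrees of freedom available for designing the autonomous optimal excavation system (Fig. 8). 2. Hybrid modeling driven jointly by mechanism and data is used to build a high-accuracy training environment for the deep reinforcement learning agent. 3. Simulations based on construction-site data compare the performance of the autonomous optimal excavation system with that of human operation, verifying the feasibility and effectiveness of the proposed method (Figs. 11-13).

Conclusions: 1. When human operators choose excavation parameters, the relative weighting of specific excavation speed to specific excavation energy consumption is close to 6:4. 2. Different geological conditions call for different parameter decision strategies: ordinary geology should use an autonomous optimal excavation system with a higher k1 value, whereas difficult geology, in which the specific excavation speed drops markedly, should use one with a higher k2 value. 3. Although training a deep reinforcement learning agent is very time-consuming, it still has a large advantage over training a skilled shield machine operator.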
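The weighted trade-off between specific excavation speed and specific excavation energy consumption can be illustrated with a small sketch. This is not the paper's index, whose exact form is not given in this summary. The function name `composite_index`, the reference values used for normalization, the linear form, and the reading of k1 as the speed weight and k2 as the energy weight are all assumptions made for illustration.

```python
def composite_index(speed, energy, speed_ref, energy_ref, k1=0.6, k2=0.4):
    """Hypothetical dimensionless excavation performance index.

    Assumptions (not taken from the paper):
      - speed and energy are made dimensionless by dividing by reference values;
      - k1 weights the speed term and k2 weights the energy term;
      - the index is linear, rewarding speed and penalizing energy use.
    """
    if speed_ref <= 0 or energy_ref <= 0:
        raise ValueError("reference values must be positive")
    if k1 < 0 or k2 < 0:
        raise ValueError("weights must be non-negative")

    speed_term = speed / speed_ref      # dimensionless specific speed
    energy_term = energy / energy_ref   # dimensionless specific energy
    return k1 * speed_term - k2 * energy_term


if __name__ == "__main__":
    # Same operating point scored under two weightings:
    # the 6:4 split reported for human operators, and an energy-heavy split.
    point = dict(speed=0.8, energy=1.2, speed_ref=1.0, energy_ref=1.0)
    print(composite_index(**point, k1=0.6, k2=0.4))
    print(composite_index(**point, k1=0.4, k2=0.6))
```

Under these assumptions, raising k2 relative to k1 lowers the score of an operating point with high energy use, which is one way a reinforcement learning reward could shift its priorities between ordinary and difficult geology.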