
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2022-04-23
Yu LIU, Zhi LI, Zhizhuo JIANG, You HE. Prospects for multi-agent collaboration and gaming: challenge, technology, and application[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2200055
Prospects for multi-agent collaboration and gaming: challenge, technology, and application
1 Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
2 Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
Abstract: In recent years, multi-agent systems have made remarkable progress in solving a variety of decision-making problems in complex environments, achieving decision-making performance comparable to, and in some cases better than, that of humans. This paper briefly reviews technologies related to multi-agent collaboration and gaming from three perspectives: task challenges, technical directions, and application domains. We first review typical research problems and challenges in recent work on multi-agent systems, then discuss frontier research directions for multi-agent collaboration and gaming tasks, and finally highlight prospective application domains of multi-agent collaboration and gaming.
Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou
310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2026 Journal of Zhejiang University-SCIENCE

