On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2021-04-29
Yueting Zhuang, Siliang Tang. Visual knowledge: an attempt to explore machine creativity[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2100116
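Visual knowledge: an attempt to explore machine creativity
Institute of Artificial Intelligence, College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Abstract: A long-standing question in artificial intelligence is whether AI can be creative, or, put differently, whether an algorithm's reasoning process can be creative. This paper examines AI creativity from the perspective of the science of thinking. It first surveys related research on reasoning with imagery thinking, then focuses on a particular form of visual knowledge representation, the visual scene graph, and finally describes in detail the problem of constructing visual scene graphs and their potential applications. The evidence presented indicates that visual knowledge and visual thinking can not only improve performance on current AI tasks but can also be put to use in the practice of machine creativity.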