
CLC number: TP242

On-line Access: 2022-10-26

Received: 2022-03-20

Revision Accepted: 2022-10-26

Crosschecked: 2022-07-29


ORCID:
Shaopeng LIU: https://orcid.org/0000-0002-9624-846X
Guohui TIAN: https://orcid.org/0000-0001-8332-3064


Frontiers of Information Technology & Electronic Engineering 

Accepted manuscript available online (unedited version)


A deep Q-learning network based active object detection model with a novel training algorithm for service robots


Author(s):  Shaopeng LIU, Guohui TIAN, Yongcheng CUI, Xuyang SHAO

Affiliation(s):  School of Control Science and Engineering, Shandong University, Jinan 250061, China

Corresponding email(s):  shaopeng.liu66@mail.sdu.edu.cn, g.h.tian@sdu.edu.cn

Key Words:  Active object detection; Deep Q-learning network; Training method; Service robots



Shaopeng LIU, Guohui TIAN, Yongcheng CUI, Xuyang SHAO. A deep Q-learning network based active object detection model with a novel training algorithm for service robots[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2200109



Abstract: 
This paper focuses on the problem of active object detection (AOD). AOD is essential for service robots completing tasks in home environments: it guides the robot toward the target object through a sequence of appropriate movement actions. Most current AOD methods are based on reinforcement learning and suffer from low training efficiency and testing accuracy. Therefore, an AOD model based on a deep Q-learning network (DQN) with a novel training algorithm is proposed in this paper. The DQN model is designed to fit the Q-values of the various actions, and comprises a state space, a feature-extraction module, and a multilayer perceptron. In contrast to existing research, a novel memory-based training algorithm is designed for the proposed DQN model to improve training efficiency and testing accuracy. In addition, a method of generating the end state is presented to judge when to stop the AOD task during training. Extensive comparison experiments and ablation studies on an AOD dataset show that the presented method outperforms comparable methods and that the proposed training algorithm is more effective than the raw training algorithm.
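The training pipeline sketched in the abstract (a Q-network fit to action values, a replay memory, and an end-state check that stops bootstrapping) can be illustrated roughly as follows. This is a minimal sketch, not the paper's actual model: the toy linear "network", action set, state encoding, and hyperparameters are invented stand-ins for the DQN, state space, and feature extraction described above.

```python
import random
from collections import deque

# Illustrative action set; the paper's robot actions are movement commands,
# but the exact set is an assumption here.
ACTIONS = ["forward", "backward", "left", "right", "stop"]

class TinyQNet:
    """Toy linear Q-value approximator (stand-in for the paper's MLP-based DQN)."""
    def __init__(self, n_features, n_actions, lr=0.01):
        self.w = [[0.0] * n_features for _ in range(n_actions)]
        self.lr = lr

    def q_values(self, state):
        # One Q-value per action, as in "fitting the Q-values of various actions".
        return [sum(wi * si for wi, si in zip(row, state)) for row in self.w]

    def update(self, state, action, target):
        # One gradient step on the squared TD error for the taken action.
        err = target - self.q_values(state)[action]
        self.w[action] = [wi + self.lr * err * si
                          for wi, si in zip(self.w[action], state)]

def train_step(net, memory, batch_size=4, gamma=0.9):
    """Sample past transitions from memory and fit Q-values toward TD targets."""
    batch = random.sample(list(memory), min(batch_size, len(memory)))
    for state, action, reward, next_state, done in batch:
        # An end state (done) terminates bootstrapping, loosely mirroring the
        # paper's end-state generation idea for stopping the AOD task.
        target = reward if done else reward + gamma * max(net.q_values(next_state))
        net.update(state, action, target)

random.seed(0)
net = TinyQNet(n_features=3, n_actions=len(ACTIONS))
memory = deque(maxlen=100)

# Fill the replay memory with fake transitions, then run a few training steps.
for _ in range(50):
    s = [random.random() for _ in range(3)]
    a = random.randrange(len(ACTIONS))
    r = random.choice([0.0, 1.0])
    s2 = [random.random() for _ in range(3)]
    memory.append((s, a, r, s2, random.random() < 0.1))
for _ in range(20):
    train_step(net, memory)

print(len(memory), len(net.q_values([0.1, 0.2, 0.3])))  # → 50 5
```

The replay memory is what distinguishes this kind of training from naive on-policy updates: sampling past transitions decorrelates consecutive updates, which is the mechanism the abstract credits for improved training efficiency.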

Chinese abstract (translated):

A deep Q-learning network based active object detection model with a novel training algorithm for service robots

Shaopeng LIU, Guohui TIAN, Yongcheng CUI, Xuyang SHAO
School of Control Science and Engineering, Shandong University, Jinan 250061, China
Abstract: This paper studies the problem of active object detection (AOD). AOD is an important component of the service tasks that service robots perform in home environments; it guides the robot toward the target object through appropriate movement actions. Current reinforcement-learning-based AOD models suffer from low training efficiency and poor testing accuracy. Therefore, this paper proposes an AOD model based on a deep Q-learning network and designs a novel training algorithm for the model. The model is designed to fit the Q-values of the various actions, and comprises the state space, feature extraction, and a multilayer perceptron. Unlike existing studies, a memory-based training algorithm is designed for the proposed AOD model to improve training efficiency and testing accuracy. In addition, an end-state generation method is proposed to judge when the AOD task should stop during training. Sufficient comparison and ablation experiments are conducted on an AOD dataset. The results show that the proposed method outperforms other comparable methods and that the designed training algorithm is more efficient than the raw training algorithm.

Key words: Active object detection; Deep Q-learning network; Training algorithm; Service robots




Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2024 Journal of Zhejiang University-SCIENCE