CLC number: TP391
On-line Access: 2024-06-29
Received: 2023-07-26
Revision Accepted: 2023-11-14
Crosschecked: 2024-09-29
Ruihui PENG, Jie LAI, Xueting YANG, Dianxing SUN, Shuncheng TAN, Yingjuan SONG, Wei GUO. Camouflaged target detection based on multimodal image input pixel-level fusion[J]. Frontiers of Information Technology & Electronic Engineering, 2024, 25(9): 1226-1239.
@article{title="Camouflaged target detection based on multimodal image input pixel-level fusion",
author="Ruihui PENG, Jie LAI, Xueting YANG, Dianxing SUN, Shuncheng TAN, Yingjuan SONG, Wei GUO",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="25",
number="9",
pages="1226-1239",
year="2024",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2300503"
}
%0 Journal Article
%T Camouflaged target detection based on multimodal image input pixel-level fusion
%A Ruihui PENG
%A Jie LAI
%A Xueting YANG
%A Dianxing SUN
%A Shuncheng TAN
%A Yingjuan SONG
%A Wei GUO
%J Frontiers of Information Technology & Electronic Engineering
%V 25
%N 9
%P 1226-1239
%@ 2095-9184
%D 2024
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2300503
TY - JOUR
T1 - Camouflaged target detection based on multimodal image input pixel-level fusion
A1 - Ruihui PENG
A1 - Jie LAI
A1 - Xueting YANG
A1 - Dianxing SUN
A1 - Shuncheng TAN
A1 - Yingjuan SONG
A1 - Wei GUO
JO - Frontiers of Information Technology & Electronic Engineering
VL - 25
IS - 9
SP - 1226
EP - 1239
SN - 2095-9184
Y1 - 2024
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2300503
ER -
Abstract: Camouflaged targets are nonsalient targets that blend closely into the background and expose very little distinguishing feature information, which makes them extremely difficult to recognize. Most detection algorithms for camouflaged targets use information from only a single spectral band, resulting in low detection accuracy and a high missed detection rate. In this paper, we present a camouflaged target detection technique based on multimodal image fusion (MIF-YOLOv5). First, we provide a multimodal image input that fuses the optical and infrared images of the camouflaged target at the pixel level, increasing the effective feature information available for the target. Second, a loss function is designed, and the K-Means++ clustering technique is used to optimize the target anchor boxes for the dataset, improving the accuracy and robustness of camouflaged personnel detection. Third, a comprehensive detection index for camouflaged targets is proposed to compare the overall effectiveness of different approaches. In addition, we construct a multispectral camouflaged target dataset to test the proposed technique. Experimental results show that the proposed method has the best comprehensive detection performance, with a detection accuracy of 96.5%, a recognition probability of 92.5%, an increase of 1×10^4 in the number of parameters, an increase of 0.03 GFLOPs in theoretical computation, and a comprehensive detection index of 0.85. The method's advantage in detection accuracy is also apparent in performance comparisons with other target detection algorithms.
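The anchor optimization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so everything below is an assumption: the function names `iou_wh` and `kmeanspp_anchors` are hypothetical, the distance is taken to be 1 - IoU between (width, height) pairs (a common choice for YOLO-style anchors), and the input is assumed to contain at least k distinct boxes.

```python
import random


def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), aligned at a common corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeanspp_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) pairs into k anchors.

    Assumed setup (not taken from the paper): K-Means++ seeding followed by
    Lloyd iterations, with 1 - IoU as the distance between a box and a centre.
    Requires at least k distinct boxes.
    """
    rng = random.Random(seed)

    # K-Means++ seeding: pick the first centre uniformly at random, then pick
    # each further centre with probability proportional to the squared
    # distance to the nearest centre chosen so far.
    centres = [rng.choice(boxes)]
    while len(centres) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centres) ** 2 for b in boxes]
        centres.append(rng.choices(boxes, weights=d2, k=1)[0])

    # Lloyd iterations: assign each box to the centre with the highest IoU,
    # then move each centre to the mean width and height of its cluster.
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centres[i]))
            clusters[best].append(b)
        new_centres = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centres[i]  # keep the old centre if a cluster is empty
            for i, c in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres

    # Return anchors ordered from smallest to largest area.
    return sorted(centres, key=lambda a: a[0] * a[1])


# Illustrative box sizes in pixels (made-up values, not from the paper's dataset).
example_boxes = [
    (10, 10), (12, 11), (11, 12),
    (50, 60), (52, 58), (48, 62),
    (200, 100), (210, 95), (190, 105),
]
example_anchors = kmeanspp_anchors(example_boxes, k=3)
```

Compared with uniform random initialization, the K-Means++ seeding tends to spread the initial centres across boxes of different sizes, which usually makes the resulting anchors less sensitive to the random start.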