Journal of Zhejiang University

ENGINEERING Information Technology & Electronic Engineering 2026 Vol.27 No.2 P.1-12

http://doi.org/10.1631/ENG.ITEE.2025.0177

An attention mechanism-based multi-domain feature fusion approach for active sonar target recognition

Author(s): Tongjing SUN, Haoran XU, Shishuo REN, Denghui ZHANG
Affiliation(s): 1. School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China
Corresponding email(s): stj@hdu.edu.cn
Key Words: Acoustic target recognition, Neural network, Attention mechanism, Multi-domain feature fusion


Tongjing SUN, Haoran XU, Shishuo REN, Denghui ZHANG. An attention mechanism-based multi-domain feature fusion approach for active sonar target recognition[J]. Journal of Zhejiang University Science C, 2026, 27(2): 1-12.

@article{title="An attention mechanism-based multi-domain feature fusion approach for active sonar target recognition",
author="Tongjing SUN, Haoran XU, Shishuo REN, Denghui ZHANG",
journal="Journal of Zhejiang University Science C",
volume="27",
number="2",
pages="1-12",
year="2026",
publisher="Zhejiang University Press & Springer",
doi="10.1631/ENG.ITEE.2025.0177"
}

%0 Journal Article
%T An attention mechanism-based multi-domain feature fusion approach for active sonar target recognition
%A Tongjing SUN
%A Haoran XU
%A Shishuo REN
%A Denghui ZHANG
%J Frontiers of Information Technology & Electronic Engineering
%V 27
%N 2
%P 1-12
%@ 1869-1951
%D 2026
%I Zhejiang University Press & Springer
%DOI 10.1631/ENG.ITEE.2025.0177

TY - JOUR
T1 - An attention mechanism-based multi-domain feature fusion approach for active sonar target recognition
A1 - Tongjing SUN
A1 - Haoran XU
A1 - Shishuo REN
A1 - Denghui ZHANG
JO - Frontiers of Information Technology & Electronic Engineering
VL - 27
IS - 2
SP - 1
EP - 12
SN - 1869-1951
Y1 - 2026
PB - Zhejiang University Press & Springer
DO - 10.1631/ENG.ITEE.2025.0177
ER -


Abstract: The complex and variable marine environment has long made active sonar target recognition a difficult problem in underwater acoustics. Deep learning-based fusion recognition offers an effective way to address it, but fusing multi-domain features by simple concatenation introduces redundant information and makes it hard to mine the correlations between domains. This paper therefore proposes an attention mechanism-based multi-domain feature fusion approach for active sonar target recognition. The method preprocesses active sonar echo signals and builds a multi-domain feature extraction and fusion network, in which a one-dimensional convolutional neural network with long short-term memory (1DCNN-LSTM) and a two-dimensional convolutional neural network (2DCNN) with channel attention extract deep features from different domains. Feature concatenation is then combined with multi-domain cross-attention to perform intra- and cross-domain feature fusion, which effectively eliminates redundant information and promotes inter-domain information interaction while retaining as much of the target features as possible. Experimental results show that, compared with single-domain methods, the network that uses an attention mechanism for multi-domain feature fusion strengthens cross-domain information interaction and significantly improves feature representation capability. Compared with other methods, the proposed method shows clear performance advantages and maintains stable generalization ability in scenarios with low signal-to-clutter ratios.
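As a rough illustration of the cross-domain fusion step described in the abstract, the following Python sketch applies scaled dot-product cross-attention between two sets of domain features and then concatenates the pooled results. It is not the authors' network. The names `feats_a`, `feats_b`, `cross_attention`, and `fuse` are hypothetical, the learned query/key/value projections are replaced by identity mappings, and the residual-plus-mean-pooling step is an assumption made only to keep the example short.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an n x k matrix by a k x m matrix (lists of lists)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def cross_attention(query_feats, context_feats):
    """Queries come from one domain, keys and values from the other.

    Assumption: identity projections stand in for learned W_q, W_k, W_v.
    Returns the attended features and the attention weights.
    """
    dim = len(query_feats[0])
    scores = matmul(query_feats, transpose(context_feats))
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    return matmul(weights, context_feats), weights


def mean_pool(feats):
    """Average the feature vectors of one domain into a single vector."""
    return [sum(col) / len(feats) for col in zip(*feats)]


def fuse(feats_a, feats_b):
    """Hypothetical fusion: each domain attends to the other, the result is
    added back to the original features, and the pooled vectors are
    concatenated into one fused descriptor."""
    a_from_b, _ = cross_attention(feats_a, feats_b)
    b_from_a, _ = cross_attention(feats_b, feats_a)
    a_enh = [[x + y for x, y in zip(r, s)] for r, s in zip(feats_a, a_from_b)]
    b_enh = [[x + y for x, y in zip(r, s)] for r, s in zip(feats_b, b_from_a)]
    return mean_pool(a_enh) + mean_pool(b_enh)
```

With two feature vectors of width 2 in one domain and three in the other, `fuse` returns a single vector of width 4. Each row of the attention weights sums to 1, so every query position receives a convex combination of the other domain's features instead of a plain copy of them.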

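An attention mechanism-based multi-domain feature fusion method for active sonar target recognition

Tongjing SUN¹,², Haoran XU¹,², Shishuo REN¹,², Denghui ZHANG¹,²
¹School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China
²National Defense Key Discipline Laboratory of Communication Information Transmission and Fusion Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Abstract: Because the marine environment is complex and variable, active sonar target recognition has always been a difficult problem in underwater acoustics. Deep learning-based fusion recognition provides an effective route to solving it, but fusing multi-domain features with a simple concatenation strategy causes information redundancy and makes it difficult to mine inter-domain correlation information. An attention mechanism-based multi-domain feature fusion method for active sonar target recognition is therefore proposed. By preprocessing active sonar echo signals and constructing a multi-domain feature extraction and fusion network, the method uses a one-dimensional convolutional neural network with long short-term memory (1DCNN-LSTM) and a two-dimensional convolutional neural network (2DCNN) with channel attention to extract deep features from different domains. Feature concatenation is then combined with multi-domain cross-attention to perform intra-domain and cross-domain feature fusion, which effectively eliminates redundant information and promotes inter-domain information interaction while retaining as much of the target features as possible. Experimental results show that, compared with single-domain methods, the attention mechanism-based multi-domain feature fusion network strengthens cross-domain information interaction and significantly improves feature representation capability. Compared with other methods, the proposed method has clear performance advantages and maintains stable generalization ability in scenarios with low signal-to-clutter ratios.

Key words: Underwater acoustic target recognition; Neural network; Attention mechanism; Multi-domain feature fusion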


