
CLC number: TP391.4
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2020-07-20
Luo-yang Xue, Qi-rong Mao, Xiao-hua Huang, Jie Chen. NLWSNet: a weakly supervised network for visual sentiment analysis in mislabeled web images[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.1900618
NLWSNet: weakly supervised sentiment analysis of noisy-labeled web images

1 School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China
2 Jiangsu Provincial Key Laboratory of Industrial Network Security Technology, Zhenjiang 212013, China
3 School of Computer Engineering, Nanjing Institute of Technology, Nanjing 211167, China
4 Center for Machine Vision and Signal Analysis, University of Oulu, Oulu 8000, Finland

Abstract: Large-scale datasets have driven the rapid development of sentiment analysis based on deep convolutional neural networks. However, annotating large-scale datasets is both expensive and time-consuming. In contrast, weakly labeled web images are easy to obtain from the Internet. When such web images are used directly to train deep models, their noisy labels cause a sharp drop in performance. To address this problem, an end-to-end weakly supervised learning architecture that is robust to weakly labeled web images is proposed. Specifically, an attention module automatically suppresses the negative influence of mislabeled samples by lowering their attention scores during training. In addition, under this weakly supervised setting, a class activation map module guides network learning by focusing on the sentiment-relevant regions of correctly labeled samples. Beyond feature learning, regularization is applied to the classifier to minimize the distance between samples of the same class and to maximize the distance between the centroids of different classes. Quantitative and qualitative evaluations on web image datasets with both correct and noisy labels show that the proposed algorithm outperforms existing methods.
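The classifier regularization described in the abstract (pull same-class samples toward their class centroid, push different-class centroids apart) resembles a center-loss-style term. A minimal NumPy sketch under that reading; the function name, the squared-distance intra-class term, and the margin hinge on centroid distances are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def centroid_regularizer(features, labels, margin=1.0):
    """Center-loss-style regularizer (illustrative sketch, not the paper's
    exact loss): penalizes each sample's squared distance to its own class
    centroid, and penalizes pairs of class centroids closer than `margin`."""
    classes = np.unique(labels)
    centroids = {c: features[labels == c].mean(axis=0) for c in classes}

    # Intra-class term: mean squared distance of samples to their own centroid.
    intra = np.mean([np.sum((f - centroids[c]) ** 2)
                     for f, c in zip(features, labels)])

    # Inter-class term: squared hinge on pairwise centroid distances,
    # active only when two centroids are closer than `margin`.
    inter = 0.0
    for i, ci in enumerate(classes):
        for cj in classes[i + 1:]:
            d = np.linalg.norm(centroids[ci] - centroids[cj])
            inter += max(0.0, margin - d) ** 2
    return intra + inter
```

With well-separated, tight classes both terms vanish; the loss grows as samples drift from their centroid or as centroids of different classes collapse onto each other.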
Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou
310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000-2026 Journal of Zhejiang University-SCIENCE

