CLC number: TP183
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2018-01-19
Shuang Li, Shi-ji Song, Cheng Wu. Layer-wise domain correction for unsupervised domain adaptation[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(1): 91-103.
@article{Li2018,
title="Layer-wise domain correction for unsupervised domain adaptation",
author="Shuang Li and Shi-ji Song and Cheng Wu",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="19",
number="1",
pages="91-103",
year="2018",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1700774"
}
%0 Journal Article
%T Layer-wise domain correction for unsupervised domain adaptation
%A Shuang Li
%A Shi-ji Song
%A Cheng Wu
%J Frontiers of Information Technology & Electronic Engineering
%V 19
%N 1
%P 91-103
%@ 2095-9184
%D 2018
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1700774
TY  - JOUR
T1  - Layer-wise domain correction for unsupervised domain adaptation
A1  - Shuang Li
A1  - Shi-ji Song
A1  - Cheng Wu
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 19
IS  - 1
SP  - 91
EP  - 103
SN  - 2095-9184
Y1  - 2018
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.1700774
ER  -
Abstract: Deep neural networks have been successfully applied to numerous machine learning tasks because of their impressive feature abstraction capabilities. However, conventional deep networks assume that the training and test data are sampled from the same distribution, and this assumption is often violated in real-world scenarios. To address the problem of domain shift, or data bias, we introduce layer-wise domain correction (LDC), a new unsupervised domain adaptation algorithm that adapts an existing deep network through additive correction layers spaced throughout the network. Through the additive layers, the representations of source and target domains can be perfectly aligned. The corrections, which are trained via maximum mean discrepancy, adapt to the target domain while increasing the representational capacity of the network. LDC requires no target labels, achieves state-of-the-art performance across several adaptation benchmarks, and requires significantly less training time than existing adaptation methods.
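The abstract names two ingredients: an additive correction and a maximum mean discrepancy (MMD) objective. The sketch below illustrates only those two ideas on synthetic 2-D features and is not the authors' LDC implementation. The Gaussian kernel, its bandwidth, the function names (`gaussian_kernel`, `mmd2`, `add_correction`), and the constant shift `delta` are all assumptions made for illustration; in the paper the corrections are layers trained inside a deep network rather than a fixed offset.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples."""
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


def add_correction(features, delta):
    """Additive correction: add the vector delta to every feature vector."""
    return [[f + d for f, d in zip(row, delta)] for row in features]


# Synthetic stand-ins for source and target representations (not real data).
random.seed(0)
source = [[random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(50)]
target = [[random.gauss(2.0, 1.0), random.gauss(2.0, 1.0)] for _ in range(50)]

# Hypothetical correction: a constant shift equal to the difference of sample means.
# A trained correction layer would instead be learned by minimizing the MMD.
delta = [
    sum(s[i] for s in source) / len(source) - sum(t[i] for t in target) / len(target)
    for i in range(2)
]
corrected = add_correction(target, delta)

mmd_before = mmd2(source, target)
mmd_after = mmd2(source, corrected)
print(f"squared MMD before correction: {mmd_before:.4f}")
print(f"squared MMD after correction:  {mmd_after:.4f}")
```

Under these assumptions the squared MMD drops once the shift is applied, which is the quantity an MMD-trained correction would be driving down. No target labels are used anywhere in the computation.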
The online version of this article contains electronic supplementary materials, which are available to authorized users.