CLC number: TP391

On-line Access: 2012-10-01

Received: 2012-02-12

Revision Accepted: 2012-07-31

Crosschecked: 2012-09-11

Journal of Zhejiang University SCIENCE C 2012 Vol.13 No.10 P.719-735


Learning a hierarchical image manifold for Web image classification

Author(s):  Rong Zhu, Min Yao, Li-hua Ye, Jun-ying Xuan

Affiliation(s):  School of Information Engineering, Jiaxing University, Jiaxing 314001, China

Corresponding email(s):   sikexing@163.com, myao@zju.edu.cn

Key Words:  Web image classification, Manifold learning, Image manifold, Semantic granularity, Distance measure

Rong Zhu, Min Yao, Li-hua Ye, Jun-ying Xuan. Learning a hierarchical image manifold for Web image classification[J]. Journal of Zhejiang University Science C, 2012, 13(10): 719-735.

@article{title="Learning a hierarchical image manifold for Web image classification",
author="Rong Zhu and Min Yao and Li-hua Ye and Jun-ying Xuan",
journal="Journal of Zhejiang University Science C",
volume="13",
number="10",
pages="719-735",
year="2012",
publisher="Zhejiang University Press \& Springer",
issn="1869-1951",
doi="10.1631/jzus.C1200032"
}

%0 Journal Article
%T Learning a hierarchical image manifold for Web image classification
%A Rong Zhu
%A Min Yao
%A Li-hua Ye
%A Jun-ying Xuan
%J Journal of Zhejiang University SCIENCE C
%V 13
%N 10
%P 719-735
%@ 1869-1951
%D 2012
%I Zhejiang University Press & Springer
%R 10.1631/jzus.C1200032

TY  - JOUR
T1  - Learning a hierarchical image manifold for Web image classification
A1  - Rong Zhu
A1  - Min Yao
A1  - Li-hua Ye
A1  - Jun-ying Xuan
JO  - Journal of Zhejiang University Science C
VL  - 13
IS  - 10
SP  - 719
EP  - 735
SN  - 1869-1951
Y1  - 2012
PB  - Zhejiang University Press & Springer
DO  - 10.1631/jzus.C1200032
ER  - 

Image classification is an essential task in content-based image retrieval. However, because of the semantic gap between low-level visual features and high-level semantic concepts, and the diversity of Web images, traditional classification approaches fall far short of users’ expectations. To reduce the semantic gap and to meet the requirements of dimensionality reduction, high-quality retrieval results, and batch-based processing, we propose a hierarchical image manifold computed with novel distance measures. Assuming that the images in an image set depict the same or a similar object in various scenes, we formulate two kinds of manifolds, the object manifold and the scene manifold, at different levels of semantic granularity. The object manifold is developed for object-level classification using an algorithm named extended locally linear embedding (ELLE), which is based on intra- and inter-object difference measures. The scene manifold is built for scene-level classification using an algorithm named locally linear submanifold extraction (LLSE), which combines linear perturbation and region growing. Experimental results show that our method improves the performance of Web image classification.
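
The ELLE algorithm named above extends locally linear embedding (LLE), but the abstract does not spell out its intra- and inter-object difference measures. As a point of reference only, the following is a minimal NumPy sketch of standard LLE with plain Euclidean neighbour selection. It is not the authors' ELLE or LLSE; the function name `lle`, its parameters, and the regularisation constant are assumptions made for illustration.

```python
import numpy as np


def lle(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Standard LLE (illustrative baseline, not the paper's ELLE).

    Returns the embedding Y (n_samples x n_components) and the
    reconstruction weight matrix W (n_samples x n_samples).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Step 1: k nearest neighbours under squared Euclidean distance.
    # (Assumption: an ELLE-style variant would substitute its own
    # difference measures at this step.)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Step 2: weights that best reconstruct each point from its
    # neighbours, constrained to sum to one.
    W = np.zeros((n, n))
    ones = np.ones(n_neighbors)
    for i in range(n):
        Z = X[nbrs[i]] - X[i]          # neighbours centred on x_i
        G = Z @ Z.T                    # local Gram matrix
        tr = np.trace(G)
        G = G + np.eye(n_neighbors) * reg * (tr if tr > 0 else 1.0)
        w = np.linalg.solve(G, ones)
        W[i, nbrs[i]] = w / w.sum()

    # Step 3: embedding from the bottom eigenvectors of
    # M = (I - W)^T (I - W), skipping the first (near-constant) one.
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    _, vecs = np.linalg.eigh(M)        # eigenvalues in ascending order
    Y = vecs[:, 1:n_components + 1]
    return Y, W
```

Calling `lle(X, n_neighbors=6, n_components=2)` on an array of feature vectors returns a two-dimensional embedding together with the weight matrix. The dense pairwise-distance and eigendecomposition steps keep the sketch short but scale poorly, so it is suitable only for small illustrative data sets.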

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2024 Journal of Zhejiang University-SCIENCE