CLC number: TP391.4

On-line Access: 2010-11-04

Received: 2010-09-14

Revision Accepted: 2010-09-30

Crosschecked: 2010-09-14

Journal of Zhejiang University SCIENCE C 2010 Vol.11 No.11 P.860-871


Multi-task multi-label multiple instance learning

Author(s):  Yi Shen, Jian-ping Fan

Affiliation(s):  Department of Computer Science, University of North Carolina at Charlotte 28223, USA

Corresponding email(s):   yshen9@uncc.edu, jfan@uncc.edu

Key Words:  Object network, Loosely tagged images, Multi-task learning, Multi-label learning, Multiple instance learning

Yi Shen, Jian-ping Fan. Multi-task multi-label multiple instance learning[J]. Journal of Zhejiang University Science C, 2010, 11(11): 860-871.

%0 Journal Article
%T Multi-task multi-label multiple instance learning
%A Yi Shen
%A Jian-ping Fan
%J Journal of Zhejiang University SCIENCE C
%V 11
%N 11
%P 860-871
%@ 1869-1951
%D 2010
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.C1001005

For automatic object detection, object classifiers are usually trained on large numbers of labeled images, which is costly because it requires hiring professionals to label large-scale training sets. As the number of object classes grows, obtaining enough labeled training images becomes even more critical. There are three potential ways to reduce the labeling burden: (1) allowing people to provide object labels loosely at the image level rather than at the object level (i.e., loosely tagged images in which the exact object locations are not identified); (2) harnessing the large-scale collaboratively tagged images available on the Internet; and (3) developing new machine learning algorithms that can directly leverage large-scale collaboratively or loosely tagged images to train a large number of object classifiers more effectively. Based on these observations, this paper develops a multi-task multi-label multiple instance learning (MTML-MIL) algorithm that leverages both inter-object correlations and large-scale loosely labeled images for object classifier training. By seamlessly integrating multi-task learning, multi-label learning, and multiple instance learning, the MTML-MIL algorithm trains a large number of inter-related object classifiers more accurately; an object network is constructed to determine the inter-related learning tasks directly in the feature space rather than in the label space. Experimental results show that the MTML-MIL algorithm achieves higher detection accuracy for automatic object detection.
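
To make the loosely tagged setting concrete, the following minimal sketch treats each image as a "bag" of segmented regions and applies the standard multiple instance assumption: an image carries a tag if at least one of its regions matches that object class. This is only an illustration of the general multi-label, multiple instance idea and is not the MTML-MIL formulation itself. The linear per-class classifiers, the function names, the toy features, and the hand-set weights are all assumptions made for the example, and the multi-task coupling through the object network is omitted.

```python
from typing import Dict, List, Sequence, Tuple

# A "bag" is one loosely tagged image: a list of region feature vectors.
Bag = List[Sequence[float]]
# Hypothetical linear classifier for one object class: (weights, bias).
Linear = Tuple[Sequence[float], float]


def instance_score(weights: Sequence[float], bias: float,
                   region: Sequence[float]) -> float:
    """Linear score of one region for one object class."""
    return sum(w * x for w, x in zip(weights, region)) + bias


def bag_score(weights: Sequence[float], bias: float, bag: Bag) -> float:
    """Standard MIL assumption: a bag scores as high as its best instance."""
    return max(instance_score(weights, bias, region) for region in bag)


def predict_tags(classifiers: Dict[str, Linear], bag: Bag) -> List[str]:
    """Multi-label prediction: one image may receive several object tags."""
    return sorted(
        name
        for name, (weights, bias) in classifiers.items()
        if bag_score(weights, bias, bag) > 0.0
    )


# Toy 2-D region features and hand-set weights, purely illustrative.
classifiers: Dict[str, Linear] = {
    "sky": ([1.0, 0.0], -0.5),
    "grass": ([0.0, 1.0], -0.5),
    "car": ([-1.0, -1.0], -0.5),
}
image: Bag = [[0.9, 0.1], [0.2, 0.8]]  # two segmented regions
tags = predict_tags(classifiers, image)
```

Here the image receives both "sky" and "grass" because a different region triggers each classifier, while no region supports "car". The tags are assigned at the image level only, without recording which region produced them, which mirrors the loosely tagged images described in the abstract.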
