
CLC number: TP311

On-line Access: 2012-04-07

Received: 2011-08-05

Revision Accepted: 2012-01-18

Crosschecked: 2012-02-27

Journal of Zhejiang University SCIENCE C 2012 Vol.13 No.4 P.257-267


VDoc+: a virtual document based approach for matching large ontologies using MapReduce

Author(s):  Hang Zhang, Wei Hu, Yu-zhong Qu

Affiliation(s):  State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China

Corresponding email(s):   hangzhang@smail.nju.edu.cn, whu@nju.edu.cn

Key Words:  Ontology matching, Virtual document, MapReduce, TF-IDF, Semantic Web

Hang Zhang, Wei Hu, Yu-zhong Qu. VDoc+: a virtual document based approach for matching large ontologies using MapReduce[J]. Journal of Zhejiang University Science C, 2012, 13(4): 257-267.

%0 Journal Article
%T VDoc+: a virtual document based approach for matching large ontologies using MapReduce
%A Hang Zhang
%A Wei Hu
%A Yu-zhong Qu
%J Journal of Zhejiang University SCIENCE C
%V 13
%N 4
%P 257-267
%@ 1869-1951
%D 2012
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.C1101007

Many ontologies have been published on the Semantic Web to be shared for describing resources. Among them, large ontologies from real-world domains pose a scalability problem for semantic technologies such as ontology matching (OM). Matching such ontologies either takes too long to run or relies on strong assumptions about the running environment. To deal with this issue, we propose VDoc+, a three-stage approach for matching large ontologies, based on the MapReduce framework and the virtual document technique. Specifically, two MapReduce processes are performed in the first stage to extract the textual descriptions of named entities (classes, properties, and instances) and blank nodes, respectively. In the second stage, the extracted descriptions are exchanged with neighbors in Resource Description Framework (RDF) graphs to construct virtual documents. This step also benefits from the MapReduce-based implementation. A word-weight-based partitioning method is proposed in the third stage to conduct parallel similarity calculation using the term frequency–inverse document frequency (TF-IDF) model. Experimental results on two large-scale real datasets and the benchmark testbed from the Ontology Alignment Evaluation Initiative (OAEI) are reported, showing that the proposed approach significantly reduces the run time with minor loss in precision and recall.
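The three stages summarized in the abstract can be illustrated with a small single-process sketch. This is not the authors' implementation: the `map_reduce` helper only imitates the map, group, and reduce phases in memory, the toy triples, the `o1:`/`o2:` prefixes, and the `neighbor_weight` parameter are assumptions made for illustration, blank nodes are ignored, and the word-weight-based partitioning of the third stage is replaced by a plain inverted index that limits comparisons to entity pairs sharing at least one word.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (subject, predicate, object) triples from two ontologies.
TRIPLES = [
    ("o1:Paper", "rdfs:label", "conference paper"),
    ("o1:Paper", "rdfs:subClassOf", "o1:Document"),
    ("o1:Document", "rdfs:label", "document"),
    ("o2:Article", "rdfs:label", "conference article paper"),
    ("o2:Article", "rdfs:subClassOf", "o2:Publication"),
    ("o2:Publication", "rdfs:label", "publication document"),
]

def is_entity(term):
    return term.startswith(("o1:", "o2:"))

def map_reduce(records, mapper, reducer):
    """In-memory stand-in for one MapReduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(values) for key, values in groups.items()}

def sum_counters(counters):
    total = Counter()
    for counter in counters:
        total.update(counter)
    return total

def description_mapper(triple):
    # Stage 1: words of literal values form the entity's own description.
    subject, _predicate, obj = triple
    if not is_entity(obj):
        yield subject, Counter(obj.lower().split())

def virtual_documents(triples, neighbor_weight=0.5):
    # Stage 2: each entity receives the descriptions of its graph neighbors,
    # damped by neighbor_weight (an assumed value).
    descriptions = map_reduce(triples, description_mapper, sum_counters)

    def neighbor_mapper(triple):
        subject, _predicate, obj = triple
        if is_entity(obj):
            yield subject, descriptions.get(obj, Counter())
            yield obj, descriptions.get(subject, Counter())

    neighbors = map_reduce(triples, neighbor_mapper, sum_counters)
    docs = {}
    for entity in set(descriptions) | set(neighbors):
        doc = defaultdict(float)
        for word, count in descriptions.get(entity, {}).items():
            doc[word] += count
        for word, count in neighbors.get(entity, {}).items():
            doc[word] += neighbor_weight * count
        docs[entity] = dict(doc)
    return docs

def tfidf(docs):
    # Weight each term frequency by a smoothed inverse document frequency.
    n = len(docs)
    df = Counter(word for doc in docs.values() for word in doc)
    return {entity: {w: tf * (1 + math.log(n / df[w])) for w, tf in doc.items()}
            for entity, doc in docs.items()}

def cosine(a, b):
    dot = sum(value * b.get(word, 0.0) for word, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_matches(vectors):
    # Stage 3 (simplified): compare only cross-ontology pairs that share a word.
    index = defaultdict(set)
    for entity, vector in vectors.items():
        for word in vector:
            index[word].add(entity)
    best = {}
    for entities in index.values():
        for a in entities:
            for b in entities:
                if a.startswith("o1:") and b.startswith("o2:"):
                    score = cosine(vectors[a], vectors[b])
                    if score > best.get(a, ("", -1.0))[1]:
                        best[a] = (b, score)
    return best

if __name__ == "__main__":
    for source, (target, score) in sorted(best_matches(tfidf(virtual_documents(TRIPLES))).items()):
        print(f"{source} -> {target} ({score:.3f})")
```

In a real cluster setting each call to `map_reduce` would correspond to a distributed job, and the candidate generation in `best_matches` is where a partitioning scheme would spread the similarity computation across workers.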
