CLC number: TP392

On-line Access: 2012-04-07

Received: 2011-08-08

Revision Accepted: 2012-01-27

Crosschecked: 2012-02-17

Journal of Zhejiang University SCIENCE C 2012 Vol.13 No.4 P.281-294

http://doi.org/10.1631/jzus.C1101009


Improving SPARQL query performance with algebraic expression tree based caching and entity caching


Author(s):  Gang Wu, Meng-dong Yang

Affiliation(s):  College of Information Science and Engineering, Northeastern University, Shenyang 110004, China

Corresponding email(s):   wugang@ise.neu.edu.cn, mdyang@seu.edu.cn

Key Words:  SPARQL, Resource Description Framework (RDF), Semantic caching, Algebraic expression tree (AET), Entity


Gang Wu, Meng-dong Yang. Improving SPARQL query performance with algebraic expression tree based caching and entity caching[J]. Journal of Zhejiang University Science C, 2012, 13(4): 281-294.

@article{title="Improving SPARQL query performance with algebraic expression tree based caching and entity caching",
author="Gang Wu, Meng-dong Yang",
journal="Journal of Zhejiang University Science C",
volume="13",
number="4",
pages="281-294",
year="2012",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.C1101009"
}

%0 Journal Article
%T Improving SPARQL query performance with algebraic expression tree based caching and entity caching
%A Gang Wu
%A Meng-dong Yang
%J Journal of Zhejiang University SCIENCE C
%V 13
%N 4
%P 281-294
%@ 1869-1951
%D 2012
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.C1101009

TY - JOUR
T1 - Improving SPARQL query performance with algebraic expression tree based caching and entity caching
A1 - Gang Wu
A1 - Meng-dong Yang
JO - Journal of Zhejiang University Science C
VL - 13
IS - 4
SP - 281
EP - 294
SN - 1869-1951
Y1 - 2012
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.C1101009
ER -


Abstract: 
To achieve query performance comparable to that of relational databases, diverse database technologies have to be adapted to cope with the complexity posed by both Resource Description Framework (RDF) data and SPARQL queries. Database caching is one such technology: it improves database performance at a reasonable space cost by exploiting the principles of spatial, temporal, and semantic locality. However, the caching schemes currently used in RDF stores are ineffective for queries with complex semantics. Although semantic caching approaches work effectively in this case, little work has been done in this area. In this paper, we try to improve SPARQL query performance with two semantic caching approaches: SPARQL algebraic expression tree (AET) based caching and entity caching. With these two approaches, successive queries that share multiple identical sub-queries or contain star-shaped joins can be evaluated efficiently. The approaches are implemented on a two-level storage structure. The main memory stores the most frequently accessed cache items, and items swapped out of memory are stored on disk for possible future reuse. Evaluation results on three mainstream RDF benchmarks illustrate the effectiveness and efficiency of our approaches. Comparisons with previous research are also provided.
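
The two-level storage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `TwoLevelCache`, the helper `canonical_key`, the nested-tuple encoding of an expression tree, and the least-recently-used eviction rule (swapped in here as a simple stand-in for "most frequently accessed") are all assumptions made for illustration.

```python
import os
import pickle
import tempfile
from collections import OrderedDict
from hashlib import sha1


def canonical_key(node):
    """Serialize a nested-tuple expression tree into a cache key.

    Assumed encoding: a leaf is a string, an inner node is (op, child, ...).
    Operands of "join" are sorted so that A JOIN B and B JOIN A share a key.
    """
    if isinstance(node, str):
        return node
    op, *children = node
    keys = [canonical_key(child) for child in children]
    if op == "join":
        keys.sort()
    return op + "(" + ",".join(keys) + ")"


class TwoLevelCache:
    """Hypothetical cache: bounded in-memory tier backed by files on disk."""

    def __init__(self, capacity, disk_dir=None):
        self.capacity = capacity
        self.memory = OrderedDict()  # least recently used entry comes first
        self.disk_dir = disk_dir or tempfile.mkdtemp(prefix="aet_cache_")

    def _path(self, key):
        name = sha1(key.encode("utf-8")).hexdigest()
        return os.path.join(self.disk_dir, name)

    def put(self, key, result):
        self.memory[key] = result
        self.memory.move_to_end(key)
        # Swap the coldest entries out to disk instead of discarding them.
        while len(self.memory) > self.capacity:
            old_key, old_result = self.memory.popitem(last=False)
            with open(self._path(old_key), "wb") as handle:
                pickle.dump(old_result, handle)

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)
            return self.memory[key]
        path = self._path(key)
        if os.path.exists(path):
            with open(path, "rb") as handle:
                result = pickle.load(handle)
            os.remove(path)
            self.put(key, result)  # promote back into main memory
            return result
        return None  # cache miss: the caller evaluates the sub-query
```

In this sketch, a query evaluator would compute `canonical_key` for each sub-tree of a query, call `get` before evaluating it, and call `put` with the result after a miss, so that a later query containing an identical sub-query can reuse the stored result.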


Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - Journal of Zhejiang University-SCIENCE