CLC number: TP391
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2015-01-05
Cited: 4
Clicked: 7007
Citations: Bibtex RefMan EndNote GB/T7714
Ahmet Sayar, Süleyman Eken, Okan Öztürk. Kd-tree and quad-tree decompositions for declustering of 2D range queries over uncertain space[J]. Frontiers of Information Technology & Electronic Engineering, 2015, 16(2): 98-108.
@article{title="Kd-tree and quad-tree decompositions for declustering of 2D range queries over uncertain space",
author="Ahmet Sayar, Süleyman Eken, Okan Öztürk",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="16",
number="2",
pages="98-108",
year="2015",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1400165"
}
%0 Journal Article
%T Kd-tree and quad-tree decompositions for declustering of 2D range queries over uncertain space
%A Ahmet Sayar
%A Süleyman Eken
%A Okan Öztürk
%J Frontiers of Information Technology & Electronic Engineering
%V 16
%N 2
%P 98-108
%@ 2095-9184
%D 2015
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1400165
TY - JOUR
T1 - Kd-tree and quad-tree decompositions for declustering of 2D range queries over uncertain space
A1 - Ahmet Sayar
A1 - Süleyman Eken
A1 - Okan Öztürk
J0 - Frontiers of Information Technology & Electronic Engineering
VL - 16
IS - 2
SP - 98
EP - 108
%@ 2095-9184
Y1 - 2015
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/FITEE.1400165
Abstract: We present a study to show the possibility of using two well-known space partitioning and indexing techniques, kd trees and quad trees, in declustering applications to increase input/output (I/O) parallelization and reduce spatial data processing times. This parallelization enables time-consuming computational geometry algorithms to be applied efficiently to big spatial data rendering and querying. The key challenge is how to balance the spatial processing load across a large number of worker nodes, given significant performance heterogeneity in nodes and processing skews in the workload.
The paper aims to evaluate the efficiency of kd-tree and quad-tree space partitioning techniques in terms of reducing skew in distributed parallel spatial query processing. Specifically, the paper addresses the declustering of spatial domain for load-balancing spatial (2D) range query execution among worker nodes. The dataset evaluated is a point dataset. The paper presents a metric based on the area of the intersection region and presents experimental results.
[1]Bentley, J.L., 1975. Multidimensional binary search trees used for associative searching. Commun. ACM, 18(9):509-517.
[2]Beynon, M., Chang, C., Catalyurek, U., et al., 2002. Processing large-scale multi-dimensional data in parallel and distributed environments. Parall. Comput., 28(5):827-859.
[3]Chakka, V.P., Everspaugh, A.C., Patel, J.M., 2003. Indexing large trajectory data sets with SETI. Proc. 1st Biennial Conf. on Innovative Data Systems Research.
[4]Chilès, J.P., Delfiner, P., 2009. Geostatistics: Modeling Spatial Uncertainty. John Wiley & Sons, New York, USA.
[5]Chou, T.C.K., Abraham, J.A., 1982. Load balancing in distributed systems. IEEE Trans. Softw. Eng., SE-8(4):401-412.
[6]Cudre-Mauroux, P., Wu, E., Madden, S., 2010. TrajStore: an adaptive storage system for very large trajectory data sets. Proc. IEEE 26th Int. Conf. on Data Engineering, p.109-120.
[7]DeWitt, D., Gray, J., 1992. Parallel database systems: the future of high performance database systems. Commun. ACM, 35(6):85-98.
[8]Furht, B., Escalante, A., 2011. Handbook of Data Intensive Computing. Springer, New York, USA.
[9]Li, R., Bhanu, B., Ravishankar, C., et al., 2007. Uncertain spatial data handling: modeling, indexing and query. Comput. Geosci., 33(1):42-61.
[10]Moon, B., Saltz, J.H., 1998. Scalability analysis of declustering methods for multidimensional range queries. IEEE Trans. Knowl. Data Eng., 10(2):310-327.
[11]Ray, S., Simion, B., Brown, A.D., et al., 2013. A parallel spatial data analysis infrastructure for the cloud. Proc. 21st ACM SIGSPATIAL Int. Conf. on Advances in Geographic Information Systems, p.284-293.
[12]Reich, B.J., Chang, H.H., Strickland, M.J., 2014. Spatial health effects analysis with uncertain residential locations. Stat. Methods Med. Res., 23(2):156-168.
[13]Samet, H., 2006. Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann, San Francisco, USA.
[14]Sayar, A., 2013. Fine-grained federation of geographic information services through metadata aggregation. Sci. Res. Essays, 8(46):2242-2256.
[15]Sayar, A., Marlon, P., Geoffrey, F.C., 2014. An adaptive range-query optimization technique with distributed replicas. J. Cent. South Univ., 21(1):190-198.
[16]Sinha, R., Samaddar, S., Bhattacharyya, D., et al., 2010. A tutorial on spatial data handling. Int. J. Database Theory Appl., 3(1):1-12.
[17]Wang, L., Wu, P., Chen, H., 2013. Finding probabilistic prevalent colocations in spatially uncertain data sets. IEEE Trans. Knowl. Data Eng., 25(4):790-804.
[18]Wei, W., 2010. Analysis of spatial database index technology. Proc. 2nd Int. Conf. on Computer Engineering and Technology, p.29-32.
[19]Zhang, Y., Lin, X., Zhang, W., et al., 2010. Effectively indexing the uncertain space. IEEE Trans. Knowl. Data Eng., 22(9):1247-1261.
[20]Zhong, Y., Han, J., Zhang, T., et al., 2012. Towards parallel spatial query processing for big spatial data. Proc. IEEE 26th Int. Parallel and Distributed Processing Symp. Workshops & PhD Forum, p.2085-2094.
Open peer comments: Debate/Discuss/Question/Opinion
<1>