Abstract
This work introduces decentralized query processing techniques based on MIDAS, a novel distributed multidimensional index. In particular, MIDAS implements a distributed k-d tree, where leaves correspond to peers and internal nodes dictate message routing. MIDAS requires peers to maintain little network information, and features mechanisms that support fault tolerance and load balancing. The proposed algorithms process point and range queries over the multidimensional indexed space in only O(log n) hops in expectation, where n is the network size. For nearest neighbor queries, two processing alternatives are discussed. The first, termed eager processing, has low latency (expected value of O(log n) hops) but may involve a large number of peers. The second, termed iterative processing, has higher latency (expected value of O(log² n) hops) but involves far fewer peers. A detailed experimental evaluation demonstrates that our query processing techniques outperform existing methods both on real spatial data and on high-dimensional synthetic data.
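To make the leaf-per-peer idea concrete, the following is a minimal, centralized sketch of a k-d tree whose leaves stand for peers. It is not the MIDAS protocol itself: the names `Node`, `split`, `locate`, `range_peers`, and `build_demo_tree` are hypothetical, the midpoint splitting rule and the cycling of split dimensions are assumptions made for illustration, and the per-peer links that the real system uses to route messages between peers are not modeled. The sketch only shows how a point maps to exactly one leaf and how a range query touches every leaf whose zone overlaps the query box.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    lo: List[float]                    # lower corner of the zone
    hi: List[float]                    # upper corner of the zone
    depth: int = 0
    path: str = ""                     # binary id: "0" = left, "1" = right
    split_dim: Optional[int] = None
    split_val: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None


def split(leaf: Node) -> None:
    """Split a leaf zone in two, as if a new peer joined (assumed midpoint rule)."""
    d = leaf.depth % len(leaf.lo)      # assumption: dimensions cycle with depth
    mid = (leaf.lo[d] + leaf.hi[d]) / 2
    left_hi, right_lo = list(leaf.hi), list(leaf.lo)
    left_hi[d] = mid
    right_lo[d] = mid
    leaf.split_dim, leaf.split_val = d, mid
    leaf.left = Node(list(leaf.lo), left_hi, leaf.depth + 1, leaf.path + "0")
    leaf.right = Node(right_lo, list(leaf.hi), leaf.depth + 1, leaf.path + "1")


def locate(root: Node, point: List[float]) -> Node:
    """Point query: descend to the single leaf (peer) whose zone holds the point."""
    node = root
    while not node.is_leaf():
        go_left = point[node.split_dim] < node.split_val
        node = node.left if go_left else node.right
    return node


def range_peers(node: Node, lo: List[float], hi: List[float]) -> List[str]:
    """Range query: ids of all leaves whose zone overlaps the box [lo, hi]."""
    if node.is_leaf():
        return [node.path]
    found: List[str] = []
    d = node.split_dim
    if lo[d] < node.split_val:         # box reaches into the left half
        found += range_peers(node.left, lo, hi)
    if hi[d] >= node.split_val:        # box reaches into the right half
        found += range_peers(node.right, lo, hi)
    return found


def build_demo_tree() -> Node:
    """Four peers over the unit square: ids 00, 01, 10, 11."""
    root = Node([0.0, 0.0], [1.0, 1.0])
    split(root)
    split(root.left)
    split(root.right)
    return root
```

In this toy setting the length of a leaf's binary id equals its depth, so a lookup in a reasonably balanced tree of n leaves follows a path of roughly log n steps, which is the intuition behind the hop bounds quoted above.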
Notes
Peers periodically inform their backlinks about their load.
In our implementation, a timeout process is initiated. Note that missed messages do not affect the algorithm’s correctness, since the global guarantee is correctly computed (see proof of Lemma 9) on the set G of retrieved local guarantees.
Cite this article
Tsatsanifos, G., Sacharidis, D. & Sellis, T. Index-based query processing on distributed multidimensional data. Geoinformatica 17, 489–519 (2013). https://doi.org/10.1007/s10707-012-0163-x