Abstract
Applications that require multidimensional index structures for efficient similarity queries often involve large amounts of data. The conventional tuple-loading approach is inefficient for building such an index structure over a large data set. To overcome this problem, a number of algorithms have been proposed to bulk-load index structures such as the R-tree from scratch for large data sets in continuous data spaces. However, many of them cannot be applied directly to a non-ordered discrete data space (NDDS), where the data values on each dimension are discrete and have no natural ordering. No bulk-loading algorithm has been developed specifically for an index structure in an NDDS, such as the ND-tree. In this paper, we present a bulk-loading algorithm, called NDTBL, for the ND-tree in NDDSs. It adopts a special in-memory structure to construct the target ND-tree efficiently. It uses and extends some operations from the original ND-tree tuple-loading algorithm to exploit the properties of an NDDS when choosing and splitting data sets/nodes during the bulk-loading process. It also employs strategies such as multi-way splitting and memory buffering to enhance efficiency. Our experimental studies show that the presented algorithm is quite promising for bulk-loading the ND-tree for large data sets in NDDSs.
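To make the notion of an NDDS concrete, the following is a minimal Python sketch, not the NDTBL algorithm itself. It assumes Hamming distance as the similarity measure and summarizes a group of vectors by the set of values seen on each dimension, since values with no natural ordering cannot be bounded by a numeric interval. The names `hamming_distance`, `bounding_sets`, and `may_contain_match` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Set

Vector = Sequence[str]  # one discrete, unordered value per dimension


def hamming_distance(a: Vector, b: Vector) -> int:
    """Count the dimensions on which two vectors differ (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x != y for x, y in zip(a, b))


def bounding_sets(vectors: Sequence[Vector]) -> List[Set[str]]:
    """Per-dimension sets of the values that occur in a group of vectors.

    This plays the role a bounding box plays in a continuous space:
    without an ordering, a dimension is summarized by a set, not a range.
    """
    dims = len(vectors[0])
    return [{v[d] for v in vectors} for d in range(dims)]


def may_contain_match(box: List[Set[str]], query: Vector, radius: int) -> bool:
    """Return False only if no vector summarized by `box` can be within `radius`.

    Every dimension where the query value is absent from the box's set
    contributes at least 1 to the Hamming distance, giving a lower bound.
    """
    misses = sum(q not in values for q, values in zip(query, box))
    return misses <= radius


# Illustrative data: short strings over an unordered alphabet.
group = ["ACGT", "ACGA", "TCGA"]
box = bounding_sets(group)
```

With this summary, a range query can skip a whole group whenever `may_contain_match` returns `False`, which is the kind of pruning a tree-structured index relies on.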
Research supported by the US National Science Foundation (under grants # IIS-0414576 and # IIS-0414594), the US National Institutes of Health (under OK-INBRE Grant # 5P20-RR-016478), The University of Michigan, and Michigan State University.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Seok, HJ., Qian, G., Zhu, Q., Oswald, A.R., Pramanik, S. (2008). Bulk-Loading the ND-Tree in Non-ordered Discrete Data Spaces. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds) Database Systems for Advanced Applications. DASFAA 2008. Lecture Notes in Computer Science, vol 4947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78568-2_14
DOI: https://doi.org/10.1007/978-3-540-78568-2_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78567-5
Online ISBN: 978-3-540-78568-2
eBook Packages: Computer Science, Computer Science (R0)