Space-Partitioning-Based Bulk-Loading for the NSP-Tree in Non-ordered Discrete Data Spaces

  • Conference paper
Database and Expert Systems Applications (DEXA 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5181)

Abstract

Properly designed bulk-loading techniques are more efficient than the conventional tuple-loading method in constructing a multidimensional index tree for a large data set. Although a number of bulk-loading algorithms have been proposed in the literature, most of them were designed for continuous data spaces (CDS) and cannot be directly applied to non-ordered discrete data spaces (NDDS). In this paper, we present a new space-partitioning-based bulk-loading algorithm for the NSP-tree, a multidimensional index tree recently developed for NDDSs. The algorithm constructs the target NSP-tree by repeatedly partitioning the underlying NDDS for a given data set until the input vectors in every subspace fit into a leaf node. Strategies to increase the efficiency of the algorithm, such as multi-way splitting, memory buffering and balanced space partitioning, are employed. Histograms that characterize the data distribution in a subspace are used to decide space partitions. Our experiments show that the proposed bulk-loading algorithm is more efficient than both the tuple-loading algorithm and a popular generic bulk-loading algorithm that could be utilized to build the NSP-tree.
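
The partitioning idea described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: it uses binary rather than multi-way splits, keeps everything in memory with no buffering, and builds a plain nested dictionary instead of an actual NSP-tree. All names (`histogram`, `balanced_letter_split`, `bulk_load`, `leaves`, `leaf_capacity`) are hypothetical and chosen only for illustration. Vectors are assumed to be equal-length strings over a small alphabet such as "ACGT".

```python
from collections import Counter


def histogram(vectors, dim):
    """Count how often each letter occurs in dimension `dim`."""
    return Counter(v[dim] for v in vectors)


def balanced_letter_split(hist):
    """Greedily split the letters of a histogram into two groups whose
    total counts are as close as this simple heuristic can make them."""
    left, right = set(), set()
    left_n = right_n = 0
    # Place frequent letters first; ties are broken by letter for determinism.
    for letter, n in sorted(hist.items(), key=lambda kv: (-kv[1], kv[0])):
        if left_n <= right_n:
            left.add(letter)
            left_n += n
        else:
            right.add(letter)
            right_n += n
    return left, right, abs(left_n - right_n)


def bulk_load(vectors, leaf_capacity):
    """Recursively partition the space until each subspace fits in a leaf."""
    if len(vectors) <= leaf_capacity:
        return {"leaf": list(vectors)}
    best = None
    for dim in range(len(vectors[0])):
        hist = histogram(vectors, dim)
        if len(hist) < 2:
            continue  # this dimension cannot separate the vectors
        left, right, imbalance = balanced_letter_split(hist)
        if best is None or imbalance < best[0]:
            best = (imbalance, dim, left, right)
    if best is None:
        # Assumption of this sketch: identical vectors stay in one oversized leaf.
        return {"leaf": list(vectors)}
    _, dim, left, right = best
    lo = [v for v in vectors if v[dim] in left]
    hi = [v for v in vectors if v[dim] in right]
    return {
        "dim": dim,
        "left_letters": left,
        "right_letters": right,
        "children": [bulk_load(lo, leaf_capacity), bulk_load(hi, leaf_capacity)],
    }


def leaves(node):
    """Collect the vector lists stored in all leaves of the sketch tree."""
    if "leaf" in node:
        return [node["leaf"]]
    return [leaf for child in node["children"] for leaf in leaves(child)]


data = ["AC", "AG", "CA", "GT", "TT", "TA"]
tree = bulk_load(data, leaf_capacity=2)
```

Calling `leaves(tree)` returns the groups of vectors that ended up together. Because each split divides the letters of one dimension into two non-empty sets, every recursive call works on a strictly smaller input and the recursion terminates.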

Research supported by the US National Science Foundation (under grants #IIS-0414576 and #IIS-0414594), the US National Institutes of Health (under OK-INBRE Grant #P2PRR016478), The University of Michigan, and Michigan State University.

Editor information

Sourav S. Bhowmick, Josef Küng, Roland Wagner

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qian, G., Seok, HJ., Zhu, Q., Pramanik, S. (2008). Space-Partitioning-Based Bulk-Loading for the NSP-Tree in Non-ordered Discrete Data Spaces. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2008. Lecture Notes in Computer Science, vol 5181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85654-2_37

  • DOI: https://doi.org/10.1007/978-3-540-85654-2_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85653-5

  • Online ISBN: 978-3-540-85654-2

  • eBook Packages: Computer Science, Computer Science (R0)
