Bulk Insertion for R-Tree by Seeded Clustering
In many scientific and commercial applications, such as the Earth Observation System (EOSDIS) and mobile phone services that track a large number of clients, it is a daunting task to archive and index the ever-increasing volume of complex data that is continuously added to databases. R-tree based index structures have been widely used to manage multidimensional data efficiently in scientific and data warehousing environments. In this paper, we propose a scalable technique called Seeded Clustering that allows us to maintain R-tree indexes by bulk insertion while keeping pace with high data arrival rates. Our approach uses a seed tree, which is copied from the top k levels of a target R-tree, to classify input data objects into clusters. We then build an R-tree for each of the clusters and insert these input R-trees into the target R-tree in bulk, one at a time. We present detailed algorithms for seeded clustering and bulk insertion, as well as the results of our extensive experimental study. The experimental results show that bulk insertion by seeded clustering outperforms the previously known methods in terms of both insertion cost and the quality of the target R-trees, as measured by their query performance.
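The classification step can be illustrated with a minimal sketch. It is not the algorithm from the paper: the seed tree is flattened here to a plain list of bounding rectangles (standing in for the nodes at level k), and each input rectangle is assigned to the seed that needs the least area enlargement, which is an assumed criterion borrowed from the usual R-tree insertion heuristic. All names (`seeded_clusters`, `enlargement`, and so on) are hypothetical. Building an R-tree per cluster and inserting it into the target tree are left out.

```python
from collections import defaultdict

# A rectangle is a tuple (xmin, ymin, xmax, ymax).

def area(r):
    """Area of a rectangle."""
    return (r[2] - r[0]) * (r[3] - r[1])

def union(a, b):
    """Smallest rectangle that encloses both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))

def enlargement(seed, r):
    """Extra area the seed rectangle would need in order to cover r."""
    return area(union(seed, r)) - area(seed)

def seeded_clusters(seed_mbrs, inputs):
    """Group input rectangles by the seed rectangle that fits them best.

    Assumption: "best" means least area enlargement, with ties broken
    by the smaller seed area. The paper may use a different rule.
    """
    clusters = defaultdict(list)
    for r in inputs:
        best = min(
            range(len(seed_mbrs)),
            key=lambda i: (enlargement(seed_mbrs[i], r), area(seed_mbrs[i])),
        )
        clusters[best].append(r)
    return dict(clusters)

# Two hypothetical seed rectangles and four input objects.
seeds = [(0, 0, 10, 10), (20, 0, 30, 10)]
inputs = [(1, 1, 2, 2), (21, 1, 22, 2), (8, 8, 9, 9), (28, 3, 29, 4)]
clusters = seeded_clusters(seeds, inputs)
```

In this toy setup each cluster holds the objects that fall under one seed rectangle, so a small R-tree built from a cluster could in principle be attached near the corresponding node of the target tree rather than being inserted object by object.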