Optimal Distributed Declustering Using Replication

  • Conference paper
Database Theory - ICDT 2005 (ICDT 2005)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3363)

Abstract

A common technique for improving the performance of database query retrieval is to decluster the database among multiple disks so that retrievals can be parallelized. In this paper we focus on answering range queries over a multidimensional database in which each dimension is divided uniformly to obtain tiles that are placed on different disks. There has been a significant amount of research on how to place the records on disks to minimize the retrieval time. Recently, the idea of using replication (i.e., placing records on more than one disk) to improve performance has been introduced. When using replication there are two goals: i) to minimize the retrieval time and ii) to minimize the scheduling overhead needed to determine which disk supplies a specific record when processing a query. The previously known replicated declustering schemes with low retrieval times are randomized, and one of the primary advantages of randomized schemes is that, with high probability, they balance the load evenly among the disks for large queries. In this paper we introduce a new class of replicated placement schemes, called the shift schemes, that: i) are deterministic, ii) have retrieval performance comparable to the randomized schemes, iii) have a strictly optimal retrieval time for all large queries, and iv) have a more efficient query scheduling algorithm than those for the randomized placements. Furthermore, we present experimental results that suggest that the shift schemes have stronger average performance (in terms of retrieval times) than the randomized schemes.
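To make the quantities in the abstract concrete, the following is a minimal sketch in Python of how the retrieval time of a range query can be measured under a tile placement. It is not the shift scheme of the paper, whose definition is not given in the abstract. The placement used here is the classic disk modulo rule (tile coordinates summed modulo the number of disks), and the replicated variant assumes a hypothetical second copy at a fixed `offset` from the primary disk, scheduled with a simple greedy rule that is not guaranteed to be optimal. All function names are illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import ceil


def disk_modulo(tile, num_disks):
    """Disk modulo placement: tile (i, j, ...) goes to disk (i + j + ...) mod M."""
    return sum(tile) % num_disks


def tiles_in_range(lower, upper):
    """All tiles in the axis-aligned range [lower, upper], bounds inclusive."""
    return list(product(*(range(lo, hi + 1) for lo, hi in zip(lower, upper))))


def retrieval_time(tiles, num_disks):
    """Parallel retrieval time: the largest number of tiles read from one disk."""
    load = Counter(disk_modulo(t, num_disks) for t in tiles)
    return max(load.values())


def optimal_time(tiles, num_disks):
    """Lower bound for any placement: ceil(|Q| / M) tiles on the busiest disk."""
    return ceil(len(tiles) / num_disks)


def greedy_replicated_time(tiles, num_disks, offset):
    """Retrieval time with two copies per tile and greedy scheduling.

    Assumption for illustration only: the second copy of a tile sits on
    disk (primary + offset) mod M. Each tile is read from whichever of its
    two disks currently has the smaller load (ties go to the primary).
    """
    load = [0] * num_disks
    for t in tiles:
        primary = disk_modulo(t, num_disks)
        secondary = (primary + offset) % num_disks
        chosen = primary if load[primary] <= load[secondary] else secondary
        load[chosen] += 1
    return max(load)


if __name__ == "__main__":
    query = tiles_in_range((0, 0), (1, 1))       # a 2 x 2 range query
    print(retrieval_time(query, 4))              # unreplicated disk modulo
    print(optimal_time(query, 4))                # lower bound
    print(greedy_replicated_time(query, 4, 2))   # with one extra copy
```

For the 2 x 2 query on four disks, the unreplicated placement puts two tiles on the same disk and needs two parallel reads, whereas the lower bound is one. With the assumed second copy the greedy scheduler reaches that bound for this query, which illustrates the trade-off the abstract describes: replication can lower retrieval time, but it adds the cost of deciding which copy to read.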

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Frikken, K.B. (2004). Optimal Distributed Declustering Using Replication. In: Eiter, T., Libkin, L. (eds) Database Theory - ICDT 2005. ICDT 2005. Lecture Notes in Computer Science, vol 3363. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30570-5_10

  • DOI: https://doi.org/10.1007/978-3-540-30570-5_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24288-8

  • Online ISBN: 978-3-540-30570-5

  • eBook Packages: Computer Science (R0)
