Abstract
In this paper we address the solution of large instances of the distribution design problem. Traditional approaches do not consider that the size of an instance can significantly affect the efficiency of the solution process. This paper shows that it is feasible to solve large-scale instances of the distribution design problem by compressing the instance to be solved. The goal of the compression is to reduce the amount of resources needed to solve the original instance without significantly reducing the quality of its solution. To preserve solution quality, the compression summarizes the access pattern of the original instance using clustering techniques. To validate the approach, we tested it on a new model of the replicated version of the distribution design problem that incorporates generalized database objects. The experimental results show that, by using an efficient clustering algorithm, our approach reduces the computational resources needed to solve large instances. We also present experimental evidence of the clustering efficiency of the algorithm.
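The idea of compressing an instance by summarizing its access pattern can be illustrated with a minimal sketch. It is not the clustering algorithm used in the paper, which the abstract does not specify. It assumes a hypothetical instance format in which each query is a pair of accessed object ids and an access frequency, and it swaps in a simple greedy leader clustering with Jaccard similarity. The names `jaccard`, `compress_instance`, and `threshold` are illustrative only.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of object ids."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def compress_instance(queries, threshold=1.0):
    """Greedy leader clustering of query access patterns (illustrative only).

    queries: list of (accessed_objects, frequency) pairs (assumed format).
    threshold: minimum Jaccard similarity to join an existing cluster;
               1.0 merges only queries with identical access patterns.
    Returns a list of (representative_pattern, total_frequency) pairs,
    where the representative is the first pattern placed in the cluster.
    """
    clusters = []  # each entry is [representative_pattern, total_frequency]
    for objects, freq in queries:
        pattern = frozenset(objects)
        for cluster in clusters:
            if jaccard(cluster[0], pattern) >= threshold:
                cluster[1] += freq  # absorb the query into this cluster
                break
        else:
            clusters.append([pattern, freq])  # start a new cluster
    return [(pattern, freq) for pattern, freq in clusters]


# Hypothetical instance: four queries over objects "a", "b", "c".
queries = [({"a", "b"}, 5), ({"a", "b"}, 3), ({"c"}, 2), ({"a", "b", "c"}, 1)]
exact = compress_instance(queries)                 # identical patterns only
loose = compress_instance(queries, threshold=0.6)  # also merges similar ones
```

With `threshold=1.0` the compression is lossless with respect to the access pattern, since only duplicate patterns are merged. Lower thresholds shrink the instance further and approximate the original pattern more coarsely, which is the trade-off between resources and solution quality that the abstract describes.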
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fraire H., H., Castilla V., G., Hernández R., A., Gómez S., C., Mora O., G., Godoy V., A. (2005). A Model for the Distribution Design of Distributed Databases and an Approach to Solve Large Instances. In: Pal, A., Kshemkalyani, A.D., Kumar, R., Gupta, A. (eds) Distributed Computing – IWDC 2005. IWDC 2005. Lecture Notes in Computer Science, vol 3741. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11603771_56
DOI: https://doi.org/10.1007/11603771_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30959-8
Online ISBN: 978-3-540-32428-7
eBook Packages: Computer Science, Computer Science (R0)