Abstract
Similarity search in metric spaces is an important paradigm for content-based retrieval in many applications. Existing centralized search structures can speed up retrieval, but they do not scale to large volumes of data because their response time grows linearly with the size of the searched file. The proposed GHT* index is a scalable, distributed structure. By exploiting parallelism in a dynamic network of computers, the GHT* achieves practically constant search time for similarity range queries in data-sets of arbitrary size. The structure also scales well with respect to the growing volume of retrieved data. Moreover, the small amount of routing information replicated on each server increases only logarithmically. At the same time, the potential for interquery parallelism increases as data-sets grow, because the relative number of servers used by individual queries decreases. All these properties are verified by experiments on a prototype system using real-life data-sets.
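To make the notion of a similarity range query concrete, the following is a minimal, single-machine sketch in Python. It assumes that the GHT in the name refers to generalized hyperplane partitioning, which the abstract does not spell out, and it is not the distributed GHT* structure itself. All names (`build`, `range_search`, `Leaf`, `Node`), the Euclidean metric, the naive pivot choice, and the leaf capacity are illustrative assumptions.

```python
from dataclasses import dataclass


def euclidean(a, b):
    """Example metric; any function obeying the triangle inequality works."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


@dataclass
class Leaf:
    objects: list


@dataclass
class Node:
    p1: tuple      # first pivot
    p2: tuple      # second pivot
    left: object   # subtree of objects at least as close to p1 as to p2
    right: object  # subtree of objects strictly closer to p2


def build(objects, dist, capacity=2):
    """Recursively split objects by the generalized hyperplane between two pivots."""
    if len(objects) <= capacity:
        return Leaf(list(objects))
    p1, p2 = objects[0], objects[1]  # naive pivot choice (assumption)
    left, right = [], []
    for o in objects:
        (left if dist(o, p1) <= dist(o, p2) else right).append(o)
    if not left or not right:        # degenerate split, e.g. duplicate pivots
        return Leaf(list(objects))
    return Node(p1, p2, build(left, dist, capacity), build(right, dist, capacity))


def range_search(node, q, r, dist):
    """Return all stored objects within distance r of the query object q."""
    if isinstance(node, Leaf):
        return [o for o in node.objects if dist(q, o) <= r]
    d1, d2 = dist(q, node.p1), dist(q, node.p2)
    found = []
    # By the triangle inequality, a side can hold a qualifying object
    # only if the corresponding condition below is satisfied.
    if d1 - r <= d2 + r:
        found += range_search(node.left, q, r, dist)
    if d2 - r <= d1 + r:
        found += range_search(node.right, q, r, dist)
    return found
```

A call such as `range_search(build(points, euclidean), q, r, euclidean)` returns the same objects as a linear scan over `points`, while skipping any subtree that the pruning conditions rule out. In a distributed setting, the two branches of an inner node could be forwarded to different servers, which is the kind of parallelism the abstract describes.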
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Batko, M., Gennaro, C., Zezula, P. (2005). Similarity Grid for Searching in Metric Spaces. In: Türker, C., Agosti, M., Schek, HJ. (eds) Peer-to-Peer, Grid, and Service-Orientation in Digital Library Architectures. Lecture Notes in Computer Science, vol 3664. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11549819_3
DOI: https://doi.org/10.1007/11549819_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28711-7
Online ISBN: 978-3-540-28712-4
eBook Packages: Computer Science (R0)