Abstract
An approximate algorithm that efficiently solves the k-Closest-Pairs problem in high-dimensional spaces is presented. The method reduces the dimensionality of the space ℝ^d by means of the Hilbert space-filling curve and performs at most d+1 scans of the data set. After each scan, the points whose contribution to the solution has already been analyzed are eliminated from the data set. The pruning is lossless: the remaining points, together with the approximate solution found, can be used to compute the exact solution. Although we are able to guarantee only an O(d^{1+1/t}) approximation to the solution, where t = 1,…,∞ denotes the L_t metric in use, experimental results give the exact k-Closest-Pairs for all the data sets considered and show that the pruning of the search space is effective.
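The general idea of using a space-filling curve for closest pairs can be illustrated with a minimal sketch. It is not the algorithm of the paper: it is restricted to two dimensions, performs a single scan rather than up to d+1, does no pruning, and carries no approximation guarantee. The function names, the `order` and `window` parameters, and the assumption that points are integer pairs on a 2^order × 2^order grid are all hypothetical choices made for illustration. Points are sorted by their position along the Hilbert curve, and each point is compared only with its next few successors in that order.

```python
import heapq
import math
from itertools import combinations


def hilbert_index(x, y, order):
    """Position of grid cell (x, y) along a 2-D Hilbert curve of the given order.

    Assumes 0 <= x, y < 2**order (illustrative convention, not from the paper).
    """
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is in canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def approx_k_closest_pairs(points, k, order=10, window=4):
    """Heuristic k closest pairs: one scan over the Hilbert-sorted points.

    Each point is compared with at most `window` successors along the curve,
    so pairs that are close in space but far apart on the curve can be missed.
    """
    ranked = sorted(points, key=lambda p: hilbert_index(p[0], p[1], order))
    candidates = []
    for i, p in enumerate(ranked):
        for q in ranked[i + 1 : i + 1 + window]:
            candidates.append((math.dist(p, q), p, q))
    return heapq.nsmallest(k, candidates)


def exact_k_closest_pairs(points, k):
    """Brute-force reference: examines all pairs (quadratic time)."""
    all_pairs = ((math.dist(p, q), p, q) for p, q in combinations(points, 2))
    return heapq.nsmallest(k, all_pairs)


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (5, 5), (5, 7), (12, 3), (15, 15), (9, 9)]
    print(approx_k_closest_pairs(pts, k=2, order=4, window=2))
    print(exact_k_closest_pairs(pts, k=2))
```

Because the heuristic examines only a subset of all pairs, the i-th distance it reports can never be smaller than the true i-th closest distance; widening `window` trades running time for accuracy.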
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Angiulli, F., Pizzuti, C. (2002). Approximate k-Closest-Pairs with Space Filling Curves. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2002. Lecture Notes in Computer Science, vol 2454. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46145-0_13
DOI: https://doi.org/10.1007/3-540-46145-0_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44123-6
Online ISBN: 978-3-540-46145-6
eBook Packages: Springer Book Archive