Abstract
In this paper we present a new data structure for estimating distances in a pseudo-metric space. Given a database of objects and a distance function on those objects that is a pseudo-metric, we map the objects to vectors in a pseudo-Euclidean space of reasonably low dimension while approximately preserving the distance between any two objects. Such a data structure can serve as an approximate oracle for processing a broad class of pattern-matching-based queries. Experimental results on both synthetic and real data show that the oracle performs well in distance estimation.
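To make the idea of a pseudo-Euclidean embedding concrete, the following is a minimal sketch and not the construction used in the paper, which the abstract does not describe. It assumes the standard textbook approach: double-centre the squared distance matrix, eigendecompose it, and keep the k eigenvalues of largest magnitude, recording their signs as the signature of the space. The function names `pseudo_euclidean_embed` and `estimate_distance` and the toy distance matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def pseudo_euclidean_embed(D, k):
    """Embed n objects with pairwise distance matrix D into k dimensions.

    Returns coordinates X (n x k) and a signature vector of +1/-1 entries;
    a -1 marks an axis that contributes negatively to squared distance.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Double centring turns squared distances into an inner-product matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # B is symmetric, so eigh applies; its eigenvalues may be negative.
    eigvals, eigvecs = np.linalg.eigh(B)
    # Keep the k eigenvalues of largest magnitude, whatever their sign.
    order = np.argsort(-np.abs(eigvals))[:k]
    lam = eigvals[order]
    X = eigvecs[:, order] * np.sqrt(np.abs(lam))
    signs = np.sign(lam)
    return X, signs

def estimate_distance(x, y, signs):
    """Estimate the distance between two embedded objects."""
    sq = np.sum(signs * (x - y) ** 2)
    # A truncated embedding can give a negative squared distance; clamp it.
    return np.sqrt(max(sq, 0.0))

# Toy example (assumed data): shortest-path distances on a 4-cycle,
# which cannot be embedded exactly in an ordinary Euclidean space.
D = np.array([[0, 1, 2, 1],
              [1, 0, 1, 2],
              [2, 1, 0, 1],
              [1, 2, 1, 0]], dtype=float)
X, signs = pseudo_euclidean_embed(D, k=4)
est = estimate_distance(X[0], X[2], signs)  # should be close to D[0, 2]
```

When every eigenvalue is kept, the signed squared differences reproduce the input distances up to rounding. Choosing a smaller k trades accuracy for a lower dimension, which is the kind of trade-off an approximate oracle relies on.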
Work supported in part by the Natural Sciences and Engineering Research Council of Canada under Grant No. OGP0046373, and by the U.S. NSF grants IRI-9224601, IRI-9224602, IRI-9531548 and IRI-9531554.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, Y., Zhang, K., Wang, X., Wang, J.T.L., Shasha, D. (1998). An approximate oracle for distance in metric spaces. In: Farach-Colton, M. (eds) Combinatorial Pattern Matching. CPM 1998. Lecture Notes in Computer Science, vol 1448. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0030784
DOI: https://doi.org/10.1007/BFb0030784
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64739-3
Online ISBN: 978-3-540-69054-2
eBook Packages: Springer Book Archive