An approximate oracle for distance in metric spaces
In this paper we present a new data structure for estimating distances in a pseudo-metric space. Given a database of objects and a distance function on those objects that is a pseudo-metric, we map the objects to vectors in a pseudo-Euclidean space of reasonably low dimension while approximately preserving the distance between any two objects. The resulting data structure can serve as an approximate oracle for processing a broad class of pattern-matching queries. Experimental results on both synthetic and real data show that the oracle performs well in distance estimation.
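As an illustration of how such an oracle might answer a query once objects have been mapped to vectors, the sketch below evaluates a distance in a pseudo-Euclidean space. It is not the construction from the paper: the function names, the `num_positive` parameter (the number of coordinates that contribute positively), and the choice to clamp negative squared distances to zero are all assumptions made for the example. It relies only on the standard definition of a pseudo-Euclidean squared distance, in which some coordinates add their squared differences and the rest subtract them.

```python
import math
from typing import Sequence


def pseudo_euclidean_sq_dist(
    x: Sequence[float], y: Sequence[float], num_positive: int
) -> float:
    """Squared pseudo-Euclidean distance between two embedded vectors.

    The first `num_positive` coordinates contribute positively and the
    remaining coordinates contribute negatively (hypothetical layout).
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    total = 0.0
    for i, (a, b) in enumerate(zip(x, y)):
        sq = (a - b) ** 2
        total += sq if i < num_positive else -sq
    return total


def estimate_distance(
    x: Sequence[float], y: Sequence[float], num_positive: int
) -> float:
    """Approximate the original distance from the embedded vectors.

    The squared form can be negative in a pseudo-Euclidean space, so it is
    clamped to zero before the square root (an assumption of this sketch).
    """
    sq = pseudo_euclidean_sq_dist(x, y, num_positive)
    return math.sqrt(max(sq, 0.0))


if __name__ == "__main__":
    # Two objects embedded in three dimensions, two of them positive.
    u = (3.0, 0.0, 1.0)
    v = (0.0, 4.0, 1.0)
    print(estimate_distance(u, v, num_positive=2))
```

A query then costs time linear in the embedding dimension, independent of how expensive the original distance function is, which is the practical appeal of replacing exact distance computations with estimates from a low-dimensional embedding.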