An approximate oracle for distance in metric spaces

  • Yanling Yang
  • Kaizhong Zhang
  • Xiong Wang
  • Jason T. L. Wang
  • Dennis Shasha
Session III
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1448)


In this paper we present a new data structure for estimating distances in a pseudo-metric space. Given a database of objects and a distance function on those objects that is a pseudo-metric, we map the objects to vectors in a pseudo-Euclidean space of reasonably low dimension while approximately preserving the distance between any two objects. Such a data structure can serve as an approximate oracle for processing a broad class of pattern-matching queries. Experimental results on both synthetic and real data show that the oracle performs well in distance estimation.
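The abstract does not spell out the construction, so the following is only a minimal sketch of one standard way to embed objects into a pseudo-Euclidean space from their pairwise distances: double-center the squared distance matrix, take its eigendecomposition, and keep the axes with the largest absolute eigenvalues, recording a sign for each axis. It is not presented as the authors' method. The function names `pseudo_euclidean_embed` and `estimate_distance`, the parameter `k`, and the choice to clamp negative squared distances to zero are all assumptions made for illustration.

```python
import numpy as np


def pseudo_euclidean_embed(D, k):
    """Map n objects to k-dimensional vectors from an n x n distance matrix D.

    Illustrative sketch only (hypothetical helper, not the paper's algorithm).
    Returns (X, signs): X has shape (n, k); signs[j] is +1.0 or -1.0 and
    tells whether axis j adds to or subtracts from the squared distance.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # symmetric, possibly indefinite
    w, V = np.linalg.eigh(B)              # eigenvalues and eigenvectors of B
    keep = np.argsort(-np.abs(w))[:k]     # k largest eigenvalues by magnitude
    w, V = w[keep], V[:, keep]
    X = V * np.sqrt(np.abs(w))            # scale each axis by sqrt(|eigenvalue|)
    signs = np.where(w >= 0, 1.0, -1.0)
    return X, signs


def estimate_distance(x, y, signs):
    """Estimate the distance between two embedded objects.

    The signed sum of squares can be negative in a pseudo-Euclidean space;
    clamping it to zero is an assumption of this sketch.
    """
    sq = np.sum(signs * (x - y) ** 2)
    return float(np.sqrt(max(sq, 0.0)))


# Example: a star-shaped metric on four objects that no Euclidean space
# can realize exactly, so at least one axis needs a negative sign.
D = np.array([
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 2.0, 2.0],
    [1.0, 2.0, 0.0, 2.0],
    [1.0, 2.0, 2.0, 0.0],
])
X, signs = pseudo_euclidean_embed(D, k=3)
print(estimate_distance(X[1], X[2], signs))
```

Once the vectors are stored, a distance query touches only two short vectors rather than the original objects, which is the sense in which such a structure can act as an oracle. Choosing a smaller `k` trades accuracy for space and query time.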





Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Yanling Yang: Department of Mathematics, Beijing Institute of Light Industry, Beijing, China
  • Kaizhong Zhang: Department of Computer Science, The University of Western Ontario, London, Canada
  • Xiong Wang: Department of CIS, New Jersey Institute of Technology, Newark, USA
  • Jason T. L. Wang: Department of CIS, New Jersey Institute of Technology, Newark, USA
  • Dennis Shasha: Courant Institute of Mathematical Sciences, New York University, USA
