Adaptable Similarity Search Using Vector Quantization

  • Christian Böhm
  • Hans-Peter Kriegel
  • Thomas Seidl
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2114)

Abstract

Adaptable similarity queries based on quadratic form distance functions are widely popular in data mining applications, particularly for domains such as multimedia, CAD, molecular biology or medical image databases. Recently, it has been recognized that quantization of feature vectors can substantially improve query processing for Euclidean distance functions, as demonstrated by the scan-based VA-file and the index structure IQ-tree. In this paper, we address the problem that determining quadratic form distances between quantized vectors is difficult and computationally expensive. Our solution provides a variety of new approximation techniques for quantized vectors which are combined by an extended multistep query processing architecture. In our analysis section we show that the filter steps complement each other. Consequently, it is useful to apply our filters in combination. We show the superiority of our approach over other architectures and over competitive query processing methods. In our experimental evaluation, the sequential scan is outperformed by a factor of 2.3. Compared to the X-tree, on 64-dimensional color histogram data, we measured an improvement factor of 5.6.
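
To make the terms in the abstract concrete, the following minimal sketch shows a quadratic form distance and a generic filter-and-refine range query over quantized vectors (grid cells). It is not the approximation technique of the paper: the filter used here is a simple, deliberately loose bound that scales the Euclidean distance to a grid cell by a Gershgorin lower bound on the smallest eigenvalue of the similarity matrix. All function names, the similarity matrix `A`, and the sample data are hypothetical and chosen only for illustration.

```python
import math


def quadratic_form_distance(p, q, A):
    """d_A(p, q) = sqrt((p - q)^T A (p - q)); A is assumed symmetric positive definite."""
    diff = [pi - qi for pi, qi in zip(p, q)]
    total = 0.0
    for i, di in enumerate(diff):
        for j, dj in enumerate(diff):
            total += di * A[i][j] * dj
    return math.sqrt(max(total, 0.0))


def gershgorin_lambda_min(A):
    """Lower bound on the smallest eigenvalue of a symmetric matrix A, clamped at 0.

    Assumption for this sketch: a cheap bound is good enough. It is only
    useful (positive) when A is strictly diagonally dominant.
    """
    n = len(A)
    bound = min(
        A[i][i] - sum(abs(A[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )
    return max(bound, 0.0)


def box_lower_bound(q, lo, hi, lam):
    """Lower bound on d_A(q, x) for every x inside the grid cell [lo, hi].

    Uses d_A(q, x)^2 >= lam * ||q - x||^2 and the Euclidean distance
    from q to the nearest point of the cell.
    """
    sq = 0.0
    for qi, l, h in zip(q, lo, hi):
        gap = max(l - qi, 0.0, qi - h)
        sq += gap * gap
    return math.sqrt(lam * sq)


def range_query(q, eps, A, cells, vectors):
    """Filter on the quantized cells first, then refine on the exact vectors."""
    lam = gershgorin_lambda_min(A)
    result = []
    for idx, (lo, hi) in enumerate(cells):
        if box_lower_bound(q, lo, hi, lam) > eps:
            continue  # filter step: no vector in this cell can qualify
        if quadratic_form_distance(q, vectors[idx], A) <= eps:
            result.append(idx)  # refinement step on the exact vector
    return result


# Hypothetical 2-dimensional example data.
A = [[2.0, 0.5], [0.5, 1.0]]
vectors = [[0.1, 0.1], [0.9, 0.9]]
cells = [([0.0, 0.0], [0.25, 0.25]), ([0.75, 0.75], [1.0, 1.0])]
hits = range_query([0.0, 0.0], 0.5, A, cells, vectors)
```

Because the filter never overestimates the true distance, discarding a cell cannot lose an answer; the exact quadratic form distance is evaluated only for vectors whose cells survive the filter.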

Keywords

Grid Cell; Query Processing; Vector Quantization; Range Query; Query Point
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Christian Böhm (1)
  • Hans-Peter Kriegel (1)
  • Thomas Seidl (1)
  1. University of Munich, Munich, Germany