Querying Objects Modeled by Arbitrary Probability Distributions

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4605))

Abstract

In many modern applications, such as biometric identification systems, sensor networks, medical imaging, geology, and multimedia databases, data objects are not described exactly. Recent solutions therefore propose to model data objects by probability density functions (pdfs). Since the pdf describing an uncertain object is often not explicitly known, approximation techniques such as Gaussian mixture models (GMMs) need to be employed. In this paper, we introduce a method for efficiently indexing and querying GMMs that allows fast object retrieval for arbitrarily shaped pdfs. We consider probability ranking queries, which are very important for probabilistic similarity search. Our method stores the components and weighting functions of each GMM in an index structure. During query processing, the mixture models are dynamically reconstructed whenever necessary. In an extensive experimental evaluation, we demonstrate that GMMs yield a compact and descriptive representation of video clips. Additionally, we show that our new query algorithm outperforms competitive approaches when answering the given probabilistic queries on a database of GMMs comprising about 100,000 single Gaussians.
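As a rough illustration of the kind of query the abstract describes, the sketch below ranks objects modeled as GMMs by their probability given an exact query point. It is not the paper's algorithm: it uses a plain linear scan instead of an index structure, and it assumes diagonal covariances, a uniform prior over objects, and a Bayes-normalised density as the ranking score. All names (`gaussian_density`, `gmm_density`, `rank_objects`, `clip_a`, `clip_b`) are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# One mixture component: (weight, mean vector, per-dimension standard deviations).
# Diagonal covariance is an assumption made here for brevity.
Component = Tuple[float, List[float], List[float]]


def gaussian_density(x: List[float], mean: List[float], sigma: List[float]) -> float:
    """Density of an axis-aligned (diagonal-covariance) Gaussian at point x."""
    density = 1.0
    for xi, mi, si in zip(x, mean, sigma):
        z = (xi - mi) / si
        density *= math.exp(-0.5 * z * z) / (si * math.sqrt(2.0 * math.pi))
    return density


def gmm_density(x: List[float], components: List[Component]) -> float:
    """Weighted sum of component densities; weights are assumed to sum to 1."""
    return sum(w * gaussian_density(x, m, s) for w, m, s in components)


def rank_objects(
    query: List[float], database: Dict[str, List[Component]], k: int
) -> List[Tuple[str, float]]:
    """Return the k objects with the highest posterior probability for the query.

    Assumes a uniform prior over objects, so each posterior is the object's
    density at the query point divided by the sum over all objects.
    """
    densities = {oid: gmm_density(query, comps) for oid, comps in database.items()}
    total = sum(densities.values())
    if total == 0.0:
        return []
    ranked = sorted(densities.items(), key=lambda item: item[1], reverse=True)
    return [(oid, d / total) for oid, d in ranked[:k]]


# Hypothetical toy database: two objects, each a two-component GMM in 2-D.
database = {
    "clip_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [4.0, 4.0], [1.0, 1.0])],
    "clip_b": [(0.7, [10.0, 10.0], [2.0, 2.0]), (0.3, [-5.0, 0.0], [1.0, 1.0])],
}
result = rank_objects([0.2, -0.1], database, k=2)
```

A linear scan like this evaluates every component of every object. An index over the individual Gaussians, as the abstract proposes, would aim to avoid most of those evaluations.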

Editor information

Dimitris Papadias, Donghui Zhang, George Kollios

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Böhm, C., Kunath, P., Pryakhin, A., Schubert, M. (2007). Querying Objects Modeled by Arbitrary Probability Distributions. In: Papadias, D., Zhang, D., Kollios, G. (eds) Advances in Spatial and Temporal Databases. SSTD 2007. Lecture Notes in Computer Science, vol 4605. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73540-3_17

  • DOI: https://doi.org/10.1007/978-3-540-73540-3_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73539-7

  • Online ISBN: 978-3-540-73540-3

  • eBook Packages: Computer Science, Computer Science (R0)