Novel Indexing Strategy and Similarity Measures for Gaussian Mixture Models

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10439)

Abstract

Efficient similarity search for data with complex structures is a challenging task in many modern data mining applications, such as image retrieval, speaker recognition, and stock market analysis. A common way to model these data objects is to use Gaussian Mixture Models, which can approximate arbitrary distributions concisely. Indexes are essential for answering queries efficiently. However, because Gaussian Mixture Models differ in their numbers of components, the performance of existing index methods tends to break down. In this paper we propose a novel technique, Normalized Transformation, that reorganizes the index structure to account for the different numbers of components in Gaussian Mixture Models. In addition, Normalized Transformation enables us to derive a set of similarity measures from existing ones that have closed-form expressions. Extensive experiments demonstrate the effectiveness of the proposed technique for Gaussian component-based indexing and the performance of the novel similarity measures for clustering and classification.
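
To illustrate what a closed-form similarity measure between Gaussian Mixture Models can look like, here is a minimal sketch of the squared L2 (Euclidean) distance between two mixtures. It is not the paper's Normalized Transformation or any of its derived measures. It assumes univariate components for brevity, and the function names and the `(weight, mean, variance)` tuple layout are hypothetical choices made for this example. The sketch relies on the standard identity that the integral of the product of two Gaussian densities N(x; m1, v1) and N(x; m2, v2) equals N(m1; m2, v1 + v2).

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_inner_product(f, g):
    """Closed-form integral of f(x) * g(x) over the real line.

    Each GMM is a list of (weight, mean, variance) tuples (an assumed layout).
    The double sum runs over all component pairs, so the two mixtures may have
    different numbers of components.
    """
    return sum(
        wf * wg * gaussian_pdf(mf, mg, vf + vg)
        for wf, mf, vf in f
        for wg, mg, vg in g
    )


def gmm_squared_l2(f, g):
    """Squared L2 distance: <f, f> - 2 <f, g> + <g, g>."""
    return (
        gmm_inner_product(f, f)
        - 2.0 * gmm_inner_product(f, g)
        + gmm_inner_product(g, g)
    )


# Two example mixtures with different numbers of components (made-up values).
gmm_a = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]
gmm_b = [(0.3, 0.1, 1.2), (0.3, 3.8, 0.9), (0.4, 8.0, 2.0)]

print(gmm_squared_l2(gmm_a, gmm_a))  # distance of a mixture to itself
print(gmm_squared_l2(gmm_a, gmm_b))  # distance between two different mixtures
```

The cost is proportional to the product of the two component counts, because every pair of components contributes one term to the sum.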

Notes

  1. To the best of our knowledge.

  2. Implementation provided by WEKA at http://weka.sourceforge.net/doc.dev/weka/clusterers/EM.html.

  3. https://drive.google.com/open?id=0B3LRCuPdnX1BSTU3UjBCVDJSLWs.

  4. http://archive.ics.uci.edu/ml/machine-learning-databases/00287/.

  5. http://aloi.science.uva.nl/.

  6. http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/Main/16kHz_16bit/.

Author information

Corresponding author

Correspondence to Christian Böhm.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Zhou, L., Ye, W., Wackersreuther, B., Plant, C., Böhm, C. (2017). Novel Indexing Strategy and Similarity Measures for Gaussian Mixture Models. In: Benslimane, D., Damiani, E., Grosky, W., Hameurlain, A., Sheth, A., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2017. Lecture Notes in Computer Science, vol 10439. Springer, Cham. https://doi.org/10.1007/978-3-319-64471-4_14

  • DOI: https://doi.org/10.1007/978-3-319-64471-4_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-64470-7

  • Online ISBN: 978-3-319-64471-4

  • eBook Packages: Computer Science, Computer Science (R0)
