Learning Similarities from Examples Under the Evidence Accumulation Clustering Paradigm

Chapter in Similarity-Based Pattern Analysis and Recognition

Abstract

The SIMBAD project puts forward a unified theory of data analysis under a (dis)similarity-based object representation framework. Our work builds on the duality of probabilistic and similarity notions in pairwise object comparison. We address the Evidence Accumulation Clustering paradigm as a means of learning pairwise similarity between objects, summarized in a co-association matrix. We show the dual similarity/probabilistic interpretation of the co-association matrix and exploit both interpretations to derive coherent consensus clustering methods: either by exploring embeddings over the learned pairwise similarities, in an attempt to better highlight the clustering structure of the data, or by means of a unified probabilistic approach leading to soft assignments of objects to clusters.
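
The abstract does not define the co-association matrix, so the following is only a minimal sketch under the definition commonly used in the evidence accumulation literature: entry (i, j) is the fraction of partitions in a clustering ensemble that place objects i and j in the same cluster. The function name `co_association` and the toy ensemble are illustrative assumptions, not taken from the chapter.

```python
import numpy as np


def co_association(partitions):
    """Fraction of partitions in which each pair of objects shares a cluster.

    partitions: array-like of shape (n_partitions, n_objects); each row holds
    the cluster labels that one clustering assigns to the objects.
    """
    labels = np.asarray(partitions)
    n_partitions, n_objects = labels.shape
    counts = np.zeros((n_objects, n_objects))
    for row in labels:
        # Pair (i, j) receives one vote when this partition groups i with j.
        counts += (row[:, None] == row[None, :])
    return counts / n_partitions


# Hypothetical toy ensemble: three partitions of five objects.
ensemble = [
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0],
]
C = co_association(ensemble)
```

Under this definition each entry lies in [0, 1] and can be read either as a learned similarity or as an empirical co-occurrence frequency, which is the kind of dual reading the abstract refers to. Label values need not agree across partitions, since only co-membership within each partition is counted.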

Notes

  1. Technically, these distances are computed along a graph formed by connecting all k-nearest neighbors.

Author information

Correspondence to Ana L. N. Fred.

Copyright information

© 2013 Springer-Verlag London

About this chapter

Cite this chapter

Fred, A.L.N. et al. (2013). Learning Similarities from Examples Under the Evidence Accumulation Clustering Paradigm. In: Pelillo, M. (ed.) Similarity-Based Pattern Analysis and Recognition. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-5628-4_5

  • DOI: https://doi.org/10.1007/978-1-4471-5628-4_5

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-5627-7

  • Online ISBN: 978-1-4471-5628-4

  • eBook Packages: Computer Science (R0)
