A framework for deep constrained clustering

Abstract

Constrained clustering has been extensively explored by researchers and used by practitioners. Constrained formulations exist for popular algorithms such as k-means, mixture models, and spectral clustering, but they have several limitations. A fundamental strength of deep learning is its flexibility, and here we explore a deep learning framework for constrained clustering, focusing on how it can extend the field. We show that our framework can handle not only standard together/apart constraints generated from labeled side information (without the well-documented negative effects reported earlier) but also more complex constraints generated from new types of side information such as continuous values and high-level domain knowledge. Furthermore, we propose an efficient training paradigm that is generally applicable to these four types of constraints. We validate the effectiveness of our approach with empirical results on both image and text datasets. We also study the robustness of our framework when learning with noisy constraints and show how different components of our framework contribute to the final performance. Our source code is available at: http://github.com/blueocean92.
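To make the together/apart constraints concrete, the sketch below shows one common way such constraints can be turned into a differentiable penalty on soft cluster assignments. It is an illustration under stated assumptions, not necessarily the loss used in this article: the function name `pairwise_constraint_loss`, the soft-assignment matrix `q`, and the log-likelihood form of the penalty are all hypothetical choices made for this example.

```python
import numpy as np


def pairwise_constraint_loss(q, must_link, cannot_link, eps=1e-12):
    """Mean penalty over together (must-link) and apart (cannot-link) pairs.

    Illustrative assumption, not taken from the article:
    q           : (n, k) array of soft cluster assignments, rows sum to 1
    must_link   : list of (i, j) index pairs that should share a cluster
    cannot_link : list of (i, j) index pairs that should be separated
    """
    def same_cluster_prob(pairs):
        # Probability that both points of each pair fall in the same cluster,
        # treating the two assignments as independent.
        idx = np.asarray(pairs, dtype=int).reshape(-1, 2)
        return np.sum(q[idx[:, 0]] * q[idx[:, 1]], axis=1)

    # Together pairs are penalized when the same-cluster probability is low.
    ml_terms = -np.log(same_cluster_prob(must_link) + eps)
    # Apart pairs are penalized when the same-cluster probability is high.
    cl_terms = -np.log(1.0 - same_cluster_prob(cannot_link) + eps)

    terms = np.concatenate([ml_terms, cl_terms])
    return float(terms.mean()) if terms.size else 0.0


# Three points, two clusters: points 0 and 1 lean to cluster 0, point 2 to cluster 1.
q = np.array([[0.9, 0.1],
              [0.8, 0.2],
              [0.1, 0.9]])
satisfied = pairwise_constraint_loss(q, must_link=[(0, 1)], cannot_link=[(0, 2)])
violated = pairwise_constraint_loss(q, must_link=[(0, 2)], cannot_link=[(0, 1)])
```

In this toy example the assignments agree with the first constraint set and contradict the second, so `satisfied` comes out smaller than `violated`. In a deep clustering setting, `q` would typically be produced by a network, and a penalty of this kind would be minimized alongside the clustering objective.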

Notes

  1. https://wordnet.princeton.edu/

Acknowledgements

We acknowledge support for this work from a Google Gift entitled: “Combining Symbolic Reasoning and Deep Learning”.

Author information

Corresponding author

Correspondence to Hongjing Zhang.

Additional information

Responsible editor: Shuiwang Ji.

About this article

Cite this article

Zhang, H., Zhan, T., Basu, S. et al. A framework for deep constrained clustering. Data Min Knowl Disc 35, 593–620 (2021). https://doi.org/10.1007/s10618-020-00734-4

Keywords

  • Constrained clustering
  • Deep learning
  • Representation learning
  • Semi-supervised learning