Integer Linear Programming Models for Constrained Clustering

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6332)

Abstract

We address the problem of building a clustering as a subset of a (possibly large) set of candidate clusters under user-defined constraints. In contrast to most approaches to constrained clustering, we do not constrain the way observations can be grouped into clusters, but the way candidate clusters can be combined into suitable clusterings. The constraints may concern the type of clustering (e.g., complete clusterings, overlapping or encompassing clusters) and the composition of clusterings (e.g., certain clusters excluding others). In this paper, we show that these constraints can be translated into integer linear programs, which can be solved by standard optimization packages. Our experiments with benchmark and real-world data investigate the quality of the clusterings and the running times depending on a variety of parameters.
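The preview does not show the paper's actual models, so the following is only a minimal sketch of the general idea, under stated assumptions. Each candidate cluster j gets a binary variable x_j; a complete, non-overlapping clustering requires every observation to be covered by exactly one selected cluster, and "cluster a excludes cluster b" becomes x_a + x_b <= 1. The function name `select_clustering`, the integer quality scores, and the toy data are hypothetical, and exhaustive enumeration stands in for an optimization package so the example stays dependency-free.

```python
from itertools import product


def select_clustering(candidates, quality, n_points, excludes=()):
    """Choose candidate clusters maximizing total quality.

    Illustrative 0-1 program (an assumption, not the paper's model):
      maximize   sum_j quality[j] * x[j]
      subject to sum_{j : i in candidates[j]} x[j] == 1  for every observation i
                 x[a] + x[b] <= 1                        for every (a, b) in excludes
                 x[j] in {0, 1}
    Solved here by brute force, which is only viable for a handful of candidates.
    """
    best_value, best_choice = None, None
    for x in product((0, 1), repeat=len(candidates)):
        # Completeness without overlap: each observation is covered exactly once.
        covered_once = all(
            sum(x[j] for j, cluster in enumerate(candidates) if i in cluster) == 1
            for i in range(n_points)
        )
        if not covered_once:
            continue
        # Composition constraint: two mutually exclusive clusters are never both chosen.
        if any(x[a] + x[b] > 1 for a, b in excludes):
            continue
        value = sum(q * xj for q, xj in zip(quality, x))
        if best_value is None or value > best_value:
            best_value, best_choice = value, x
    return best_choice, best_value


# Hypothetical toy instance: four observations (0..3) and six candidate clusters.
candidates = [{0, 1}, {2, 3}, {0, 1, 2}, {3}, {0}, {1, 2, 3}]
quality = [9, 8, 7, 2, 1, 10]  # made-up integer quality scores

print(select_clustering(candidates, quality, n_points=4))
print(select_clustering(candidates, quality, n_points=4, excludes=[(0, 1)]))
```

In this toy instance the unconstrained optimum selects clusters 0 and 1; once those two are declared mutually exclusive, the best remaining partition uses clusters 4 and 5. A real instance would pass the same objective and constraint rows to an integer programming solver instead of enumerating all 2^n selections.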



Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mueller, M., Kramer, S. (2010). Integer Linear Programming Models for Constrained Clustering. In: Pfahringer, B., Holmes, G., Hoffmann, A. (eds) Discovery Science. DS 2010. Lecture Notes in Computer Science, vol 6332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16184-1_12

  • DOI: https://doi.org/10.1007/978-3-642-16184-1_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16183-4

  • Online ISBN: 978-3-642-16184-1

  • eBook Packages: Computer Science, Computer Science (R0)
