On Coupling FCA and MDL in Pattern Mining

  • Tatiana Makhalova
  • Sergei O. Kuznetsov
  • Amedeo Napoli
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11511)


Pattern mining is a well-studied field in data mining and machine learning. Modern methods are based on dynamically updated models, among which MDL-based ones ensure high-quality pattern sets. Formal concepts also characterize patterns in a condensed form. In this paper we study an MDL-based algorithm called Krimp in the FCA setting and propose a modified version that benefits from FCA and relies on the probabilistic assumptions that underlie MDL. We provide experimental evidence that the proposed approach improves the quality of the pattern sets generated by Krimp.
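As a rough illustration of the two ingredients named in the abstract, the sketch below computes the FCA closure of an itemset (the intent of the formal concept it generates) and the Shannon code lengths that an MDL-style code table would assign to patterns based on how often they are used. It is a minimal sketch on a made-up toy dataset, not the authors' method or the Krimp implementation; the names `DATA`, `extent`, `closure`, and `code_lengths` are hypothetical and chosen only for this example.

```python
from math import log2

# Toy transaction database (formal context), invented for illustration:
# each row is the set of attributes of one object.
DATA = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("cd"),
]


def extent(itemset, data):
    """Indices of the transactions that contain every item of `itemset`."""
    return [i for i, row in enumerate(data) if itemset <= row]


def closure(itemset, data):
    """Largest itemset shared by all transactions that contain `itemset`."""
    rows = extent(itemset, data)
    if not rows:
        # Empty extent: by convention the closure is the full attribute set.
        return frozenset().union(*data)
    return frozenset.intersection(*(data[i] for i in rows))


def code_lengths(usage):
    """Code length in bits per pattern, -log2 of its relative usage.

    `usage` maps a pattern to the number of times it is used to cover the data
    (an assumed input here; computing the cover is not shown).
    """
    total = sum(usage.values())
    return {p: -log2(u / total) for p, u in usage.items() if u > 0}
```

For instance, `closure(frozenset("a"), DATA)` yields the closed itemset `{a, b}`, because every transaction containing `a` also contains `b`. Frequently used patterns receive shorter codes, which is the intuition behind selecting patterns that compress the data well.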



The work of Tatiana Makhalova and Sergei O. Kuznetsov was supported by the Russian Science Foundation under grant 17-11-01294 and performed at the National Research University Higher School of Economics, Moscow, Russia.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Tatiana Makhalova (1, 2)
  • Sergei O. Kuznetsov (1)
  • Amedeo Napoli (2)

  1. National Research University Higher School of Economics, Moscow, Russia
  2. Université de Lorraine, CNRS, Inria, LORIA, Nancy, France
