
Number of Minimal Hypergraph Transversals and Complexity of IFM with Infrequency: High in Theory, but Often Not so Much in Practice!

  • Domenico Saccà
  • Edoardo Serra
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11946)

Abstract

Hypergraph Dualization (also called hitting set enumeration) is the problem of enumerating all minimal transversals of a hypergraph \({\mathcal {H}}\), i.e., all inclusion-wise minimal sets of vertices that intersect every hyperedge in \({\mathcal {H}}\). Dualization is at the core of many important Artificial Intelligence (AI) problems. As a contribution to a better understanding of the complexity of Dualization, this paper introduces a tight upper bound on the number of minimal transversals, where the bound itself can be computed in polynomial time. In addition, the paper applies this upper bound to characterize the complexity of the data mining problem called \(\mathtt {IFM}_{\mathtt {I}}\) (Inverse Frequent itemset Mining with Infrequency constraints), that is, the problem of finding a transaction database whose frequent and infrequent itemsets satisfy a number of frequency/infrequency patterns given as input.
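
To make the definition of a minimal transversal concrete, the following is a minimal brute-force sketch in Python. It is not the method of the paper: the function name `minimal_transversals` and the example hypergraphs are assumptions introduced here purely for illustration, and the exhaustive search is exponential in the number of vertices, so it only suits very small inputs.

```python
from itertools import combinations


def minimal_transversals(vertices, hyperedges):
    """Enumerate all inclusion-wise minimal transversals by brute force.

    Illustrative only: every subset of `vertices` is tested, so the running
    time is exponential in the number of vertices.
    """
    edges = [frozenset(e) for e in hyperedges]
    found = []
    # Candidates are visited by increasing size, so any candidate that
    # contains an already-found transversal cannot be minimal.
    for size in range(len(vertices) + 1):
        for candidate in combinations(sorted(vertices), size):
            cand = frozenset(candidate)
            if any(t <= cand for t in found):
                continue  # proper superset of a smaller transversal
            if all(cand & e for e in edges):
                found.append(cand)  # intersects every hyperedge
    return found


# A path-like hypergraph on four vertices (hypothetical example data).
path = minimal_transversals({1, 2, 3, 4}, [{1, 2}, {2, 3}, {3, 4}])
print(sorted(sorted(t) for t in path))

# Three disjoint 2-element hyperedges: one vertex must be picked per edge.
disjoint = minimal_transversals(range(1, 7), [{1, 2}, {3, 4}, {5, 6}])
print(len(disjoint))
```

The second example hints at why the count of minimal transversals matters: with k pairwise disjoint two-vertex hyperedges there are 2^k minimal transversals, so the output of Dualization can be exponentially larger than its input.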

Keywords

Hypergraph transversal, Hypergraph dualization, Inverse data mining

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Calabria, Rende, Italy
  2. Boise State University, Boise, USA
