Modelling large bases of categorical data with acyclic schemes

  • F. M. Malvestuto
Contributed Papers
Part of the Lecture Notes in Computer Science book series (LNCS, volume 243)


The design and implementation of a large base of categorical data raise several problems, among them storage requirements, performance of the query-processing system, and consistency. Most of these problems have a simple and efficient solution if and only if the database has an acyclic scheme.
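Whether a scheme is acyclic can be checked mechanically. The sketch below is not taken from the paper. It assumes the scheme is given as a list of attribute sets and applies the standard Graham (GYO) reduction, under which a scheme is acyclic exactly when it reduces to nothing. The function name `is_acyclic` and the attribute names in the usage lines are illustrative only.

```python
def is_acyclic(scheme):
    """Graham (GYO) reduction: True if the scheme (a hypergraph) is acyclic."""
    edges = [set(e) for e in scheme]
    changed = True
    while changed:
        changed = False
        # Rule 1: drop attributes that occur in exactly one relation scheme.
        for e in edges:
            lonely = {a for a in e if sum(a in f for f in edges) == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: drop a relation scheme that is empty or contained in another.
        for i, e in enumerate(edges):
            if not e or any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    # The scheme is acyclic exactly when the reduction removes every edge.
    return not edges


# A chain of overlapping category-attribute sets (hypothetical names).
chain = [{"SEX", "AGE"}, {"AGE", "REGION"}, {"REGION", "YEAR"}]
# Three pairwise-overlapping sets with no set covering all three attributes.
triangle = [{"SEX", "AGE"}, {"AGE", "REGION"}, {"SEX", "REGION"}]

print(is_acyclic(chain))     # True
print(is_acyclic(triangle))  # False
```

This quadratic version favours readability. For large schemes a linear-time acyclicity test would be preferable.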


Keywords: Relational Database; Category Attribute; Database Scheme; Universal Schema; Categorical Database





Copyright information

© Springer-Verlag Berlin Heidelberg 1986

Authors and Affiliations

  • F. M. Malvestuto (1)
  1. Studi: Documentazione e Informazione, ENEA, Italy
