Gröbner Bases, pp. 165–221

# Markov Bases and Designed Experiments

## Abstract

Markov bases first appeared in a 1998 paper by Diaconis and Sturmfels (Ann Stat 26:363–397, 1998). In that paper, they considered the problem of using Markov chain Monte Carlo methods to estimate the p-values of conditional tests for data summarized in contingency tables, which is one of the fundamental problems in applied statistics. This approach requires a connected Markov chain over the given finite sample space. Diaconis and Sturmfels formulated this requirement through the notion of a Markov basis, and they showed that a Markov basis corresponds to a set of generators of a well-specified toric ideal. Their work is attractive because the theory of Gröbner bases, a concept from pure mathematics, can be applied to actual problems in applied statistics. Indeed, their work became one of the origins of the relatively new field of computational algebraic statistics. In this chapter, we first introduce their work along with the necessary background in statistics. We then use the theory of Gröbner bases to solve applied statistical problems in experimental design.
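To make the idea of a connected Markov chain over tables concrete, the following is a minimal Python sketch, not the chapter's own code. It assumes the simplest setting, a two-way contingency table under the independence model with fixed row and column sums, where the basic moves (adding +1 and -1 at the corners of a 2 × 2 subtable) are known to connect all tables with the same marginals. The function names, the example table, and the choice of the Pearson chi-square statistic are illustrative assumptions.

```python
import random
from math import exp, lgamma


def log_weight(table):
    # Log of the unnormalized hypergeometric probability: -sum(log(x_ij!)).
    return -sum(lgamma(x + 1) for row in table for x in row)


def basic_move_step(table, rng):
    # Propose a basic move: +/-1 on the diagonal of a random 2 x 2 subtable
    # and -/+1 on its off-diagonal, which leaves all row and column sums fixed.
    i1, i2 = rng.sample(range(len(table)), 2)
    j1, j2 = rng.sample(range(len(table[0])), 2)
    sign = rng.choice((1, -1))
    proposal = [row[:] for row in table]
    proposal[i1][j1] += sign
    proposal[i2][j2] += sign
    proposal[i1][j2] -= sign
    proposal[i2][j1] -= sign
    # Reject proposals that leave the sample space (negative cell counts).
    if min(proposal[i1][j1], proposal[i2][j2],
           proposal[i1][j2], proposal[i2][j1]) < 0:
        return table
    # Metropolis acceptance step for the hypergeometric target distribution.
    log_ratio = log_weight(proposal) - log_weight(table)
    if log_ratio >= 0 or rng.random() < exp(log_ratio):
        return proposal
    return table


def chi_square(table):
    # Pearson chi-square statistic; assumes all row and column sums are positive.
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, x in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (x - expected) ** 2 / expected
    return stat


def mcmc_p_value(observed, n_steps=5000, burn_in=500, seed=0):
    # Estimate the conditional p-value as the fraction of sampled tables
    # whose statistic is at least as large as the observed one.
    rng = random.Random(seed)
    obs_stat = chi_square(observed)
    current = [row[:] for row in observed]
    hits = 0
    for step in range(burn_in + n_steps):
        current = basic_move_step(current, rng)
        if step >= burn_in and chi_square(current) >= obs_stat - 1e-12:
            hits += 1
    return hits / n_steps


# Hypothetical 3 x 3 table, used only to exercise the sketch.
observed = [[3, 1, 2], [0, 4, 1], [2, 2, 5]]
print(mcmc_p_value(observed))
```

For models beyond two-way independence, these basic moves are in general not enough to connect the sample space, which is where computing a Markov basis from the generators of the corresponding toric ideal becomes necessary.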

## Keywords

Contingency Table; Nuisance Parameter; Full Factorial Design; Fractional Factorial Design; Markov Chain Monte Carlo Method

## References

1. S. Aoki, A. Takemura, Minimal basis for a connected Markov chain over 3 × 3 × K contingency tables with fixed two-dimensional marginals. Aust. N. Z. J. Stat. 45, 229–249 (2003)
2. S. Aoki, A. Takemura, The list of indispensable moves of unique minimal Markov basis for 3 × 4 × K and 4 × 4 × 4 contingency tables with fixed two-dimensional marginals. Technical Report METR 03-38, Department of Mathematical Engineering and Information Physics, The University of Tokyo (2003)
3. S. Aoki, A. Takemura, Minimal invariant Markov basis for sampling contingency tables with fixed marginals. Ann. Inst. Stat. Math. 60, 229–256 (2008)
4. S. Aoki, A. Takemura, The largest group of invariance for Markov bases and toric ideals. J. Symb. Comput. 43, 342–358 (2008)
5. S. Aoki, A. Takemura, A short survey on design of experiments and Gröbner bases. Bull. Jpn. Soc. Symb. Algebr. Comput. 16(2), 15–22 (2009) (in Japanese)
6. S. Aoki, A. Takemura, Markov basis for design of experiments with three-level factors, in Algebraic and Geometric Methods in Statistics (dedicated to Professor Giovanni Pistone on the occasion of his sixty-fifth birthday), ed. by P. Gibilisco, E. Riccomagno, M.P. Rogantin, H.P. Wynn (Cambridge University Press, Cambridge, 2009), pp. 225–238
7. S. Aoki, A. Takemura, Markov chain Monte Carlo tests for designed experiments. J. Stat. Plan. Inference 140, 817–830 (2010)
8. S. Aoki, A. Takemura, R. Yoshida, Indispensable monomials of toric ideals and Markov bases. J. Symb. Comput. 43, 490–507 (2008)
9. L.W. Condra, Reliability Improvement with Design of Experiments (Dekker, New York, 1993)
10. J. Cornfield, A statistical problem arising from retrospective studies, in Proceedings of 3rd Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, vol. 4 (1956), pp. 135–148
11. P. Diaconis, B. Sturmfels, Algebraic algorithms for sampling from conditional distributions. Ann. Stat. 26, 363–397 (1998)
12. A. Dobra, Markov bases for decomposable graphical models. Bernoulli 9, 1093–1108 (2003)
13. M.H. Gail, N. Mantel, Counting the number of r × c contingency tables with fixed marginals. J. Am. Stat. Assoc. 72, 859–862 (1977)
14. S.J. Haberman, A warning on the use of chi-squared statistics with frequency tables with small expected cell counts. J. Am. Stat. Assoc. 83, 555–560 (1988)
15. W.K. Hastings, Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97–109 (1970)
16. R. Hemmecke, P. Malkin, Computing generating sets of toric ideals. arXiv:math.CO/0508359 (2005)
17. R. Hemmecke, P.N. Malkin, Computing generating sets of lattice ideals and Markov bases of lattices. J. Symb. Comput. 44, 1463–1476 (2009)
18. E.L. Lehmann, Testing Statistical Hypotheses, 2nd edn. (Wiley, New York, 1986)
19. E.L. Lehmann, G. Casella, Theory of Point Estimation, 2nd edn. (Springer, Berlin, 2001)
20. C.R. Mehta, N.R. Patel, A network algorithm for performing Fisher’s exact test in r × c contingency tables. J. Am. Stat. Assoc. 78, 427–434 (1983)
21. R. Mukerjee, C.F.J. Wu, A Modern Theory of Factorial Designs. Springer Series in Statistics (Springer, New York, 2006)
22. G. Pistone, H.P. Wynn, Generalised confounding with Gröbner bases. Biometrika 83, 653–666 (1996)
23. G. Pistone, E. Riccomagno, H.P. Wynn, Algebraic Statistics, Computational Commutative Algebra in Statistics (Chapman & Hall, London, 2000)
24. R.L. Plackett, The Analysis of Categorical Data, 2nd edn. (Griffin, London, 1981)
25. F. Santos, B. Sturmfels, Higher Lawrence configurations. J. Comb. Theory Ser. A 103, 151–164 (2003)
26. A. Takemura, S. Aoki, Some characterizations of minimal Markov basis for sampling from discrete conditional distributions. Ann. Inst. Stat. Math. 56, 1–17 (2004)