Abstract
This paper relates computational commutative algebra to tree classification with binary covariates. For a single classification variable, uniqueness properties of the tree polynomial are established. In the binary multivariate setting, it is shown how trees for many response variables can be combined into a single polynomial ideal for computation. Finally, a new sequential algorithm is proposed for uniform conditional sampling. The algorithm combines the lexicographic Groebner basis with importance sampling, and it can be used for conditional comparisons of regulatory network maps. Because the state space is binary, the design ideal has an explicit form, which yields radical and extension properties that are used in the algorithms.
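To make the idea of a tree polynomial concrete, here is a minimal Python sketch. It rests on assumptions not spelled out in the abstract: the polynomial is built by the usual path-indicator construction (each split on x_i contributes a factor x_i or 1 - x_i), and products are reduced with x_i^2 = x_i, which holds on the binary points {0,1}^n. The tuple encoding of trees and the names `tree_poly`, `tree_eval`, and `poly_eval` are hypothetical and chosen only for illustration.

```python
from itertools import product

# A multilinear polynomial is stored as {frozenset of variable indices: coefficient}.
# Keeping only multilinear monomials amounts to reducing modulo x_i^2 - x_i.

def poly_add(p, q):
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
        if out[mono] == 0:
            del out[mono]
    return out

def poly_mul(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = m1 | m2  # set union encodes x_i * x_i = x_i on binary points
            out[mono] = out.get(mono, 0) + c1 * c2
            if out[mono] == 0:
                del out[mono]
    return out

# Hypothetical tree encoding: a leaf is an int label; an internal node is
# (i, low, high), where low is the subtree for x_i = 0 and high for x_i = 1.

def tree_poly(tree):
    if isinstance(tree, int):
        return {frozenset(): tree} if tree != 0 else {}
    i, low, high = tree
    x = {frozenset([i]): 1}
    one_minus_x = {frozenset(): 1, frozenset([i]): -1}
    return poly_add(poly_mul(one_minus_x, tree_poly(low)),
                    poly_mul(x, tree_poly(high)))

def tree_eval(tree, point):
    while not isinstance(tree, int):
        i, low, high = tree
        tree = high if point[i] else low
    return tree

def poly_eval(p, point):
    # A multilinear monomial is 1 exactly when all of its variables are 1.
    return sum(c for mono, c in p.items() if all(point[i] for i in mono))

# Two differently shaped trees that both classify by (x0 AND x1).
tree_a = (0, 0, (1, 0, 1))   # split on x0 first
tree_b = (1, 0, (0, 0, 1))   # split on x1 first

# A tree computing (x0 XOR x1).
tree_xor = (0, (1, 0, 1), (1, 1, 0))

poly_a = tree_poly(tree_a)
poly_b = tree_poly(tree_b)
poly_xor = tree_poly(tree_xor)

agrees_everywhere = all(
    poly_eval(poly_xor, pt) == tree_eval(tree_xor, pt)
    for pt in product([0, 1], repeat=2)
)
```

In this sketch the two AND trees reduce to the same polynomial x0*x1 even though they split in a different order, which illustrates the kind of uniqueness one expects for reduced representatives on binary points; the paper's own statements and proofs may be formulated differently.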
Cite this article
Dinwoodie, I.H. Polynomials for classification trees and applications. Stat Methods Appl 19, 171–192 (2010). https://doi.org/10.1007/s10260-009-0123-2
DOI: https://doi.org/10.1007/s10260-009-0123-2