Simultaneous Threshold Interaction Detection in Binary Classification

  • Claudio Conversano
  • Elise Dusseldorp
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

The Classification Trunk Approach (CTA) is a method for the automatic selection of threshold interactions in generalized linear modelling (GLM). It results from integrating classification trees with GLM. Interactions between predictors are expressed as “threshold interactions” instead of traditional cross-products. Unlike classification trees, CTA uses a different splitting criterion, and it is framed in a new algorithm, STIMA, that can be used to estimate threshold interaction effects in classification and regression models. This paper focuses on the binary response case and presents the results of an application to the Liver Disorders dataset, illustrating the advantages of CTA over other model-based or decision tree-based approaches. The methods are compared in terms of prediction accuracy and model complexity.
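
To make the idea of a threshold interaction concrete, the following minimal sketch contrasts it with a cross-product term in a logistic model for a binary response. It is only an illustration under stated assumptions, not the STIMA algorithm: the thresholds `c1` and `c2`, the coefficients, and all function names are hypothetical, and the thresholds are supplied by hand rather than selected by any splitting criterion.

```python
import math


def threshold_indicator(x1, x2, c1, c2):
    """Return 1.0 if the observation falls in the region x1 > c1 and x2 <= c2.

    The region and both thresholds are hypothetical; in a data-driven
    method they would be chosen by a search over candidate splits.
    """
    return 1.0 if (x1 > c1 and x2 <= c2) else 0.0


def linear_predictor(x1, x2, beta0, beta1, beta2, gamma, c1, c2):
    """Main effects plus one threshold interaction term.

    A cross-product model would add gamma * x1 * x2 instead; here the
    interaction enters as gamma times a region indicator.
    """
    return (beta0 + beta1 * x1 + beta2 * x2
            + gamma * threshold_indicator(x1, x2, c1, c2))


def predict_probability(x1, x2, beta0, beta1, beta2, gamma, c1, c2):
    """Apply the inverse logit link to obtain P(y = 1 | x1, x2)."""
    eta = linear_predictor(x1, x2, beta0, beta1, beta2, gamma, c1, c2)
    return 1.0 / (1.0 + math.exp(-eta))


if __name__ == "__main__":
    # Illustrative coefficients only; they are not estimates from any dataset.
    params = dict(beta0=0.5, beta1=1.0, beta2=-1.0, gamma=2.0, c1=1.5, c2=1.0)
    print(predict_probability(2.0, 1.0, **params))  # inside the region
    print(predict_probability(1.0, 1.0, **params))  # outside the region
```

In this form the interaction shifts the linear predictor by a constant `gamma` only for observations inside the region, which is what makes it easier to read than a cross-product coefficient.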

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  1. Department of Economics, University of Cagliari, Cagliari, Italy