Exploratory Versus Decision Trees

  • Conference paper
COMPSTAT

Abstract

Tree-structured methods based on recursive partitioning provide a powerful tool both for exploring the structure of data and for predicting the outcomes of new cases. Attention is given to partitioning algorithms; in particular, this paper recalls two-stage segmentation. The step from exploratory trees to decision trees requires simplification methods and statistical criteria that define the final rule for classifying or predicting unseen cases. Alternative strategies are considered, resulting either in the selection of one method over the others or in the definition of a compromise among them.
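The splitting step of recursive partitioning can be illustrated with a minimal sketch: an exhaustive CART-style search, in the spirit of Breiman et al. (1984), for the binary split of a single numeric predictor that maximises the decrease in Gini impurity. This is not the paper's two-stage segmentation algorithm (which first ranks predictors, then searches splits); the function names `gini` and `best_split` are illustrative assumptions, not from the paper.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys):
    """Exhaustive search over thresholds on one numeric predictor,
    returning the (threshold, impurity_decrease) pair that maximises
    the weighted decrease in Gini impurity of the two child nodes."""
    n = len(xs)
    parent = gini(ys)
    best_t, best_dec = None, 0.0
    for t in sorted(set(xs))[:-1]:  # candidate cut points
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        dec = parent - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)
        if dec > best_dec:
            best_t, best_dec = t, dec
    return best_t, best_dec
```

Growing an exploratory tree amounts to applying this search recursively to each child node; turning it into a decision tree then requires the pruning and selection criteria that the paper discusses.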




Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Siciliano, R. (1998). Exploratory Versus Decision Trees. In: Payne, R., Green, P. (eds) COMPSTAT. Physica, Heidelberg. https://doi.org/10.1007/978-3-662-01131-7_10

  • DOI: https://doi.org/10.1007/978-3-662-01131-7_10

  • Publisher Name: Physica, Heidelberg

  • Print ISBN: 978-3-7908-1131-5

  • Online ISBN: 978-3-662-01131-7
