Exploratory Versus Decision Trees

  • Conference paper
COMPSTAT

Abstract

Tree-structured methods based on recursive partitioning provide a powerful tool both for exploring the structure of data and for predicting the outcomes of new cases. Attention is given to partitioning algorithms; in particular, this paper recalls two-stage segmentation. The step from exploratory trees to decision trees requires simplification methods and statistical criteria that define the final rule for classifying or predicting unseen cases. Alternative strategies are considered, resulting either in the selection of one method over the others or in the definition of a compromise among them.
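The splitting step of recursive partitioning can be illustrated with a minimal sketch: an exhaustive CART-style search, in the spirit of Breiman et al. (1984), for the binary split of a single numeric predictor that maximises the decrease in Gini impurity. This is not the paper's two-stage segmentation algorithm (which first ranks predictors, then searches splits); the function names `gini` and `best_split` are illustrative assumptions, not from the paper.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys):
    """Exhaustive search over thresholds on one numeric predictor,
    returning the (threshold, impurity_decrease) pair that maximises
    the weighted decrease in Gini impurity of the two child nodes."""
    n = len(xs)
    parent = gini(ys)
    best_t, best_dec = None, 0.0
    for t in sorted(set(xs))[:-1]:  # candidate cut points
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        dec = parent - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)
        if dec > best_dec:
            best_t, best_dec = t, dec
    return best_t, best_dec
```

Growing an exploratory tree amounts to applying this search recursively to each child node; turning it into a decision tree then requires the pruning and selection criteria that the paper discusses.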




Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Siciliano, R. (1998). Exploratory Versus Decision Trees. In: Payne, R., Green, P. (eds) COMPSTAT. Physica, Heidelberg. https://doi.org/10.1007/978-3-662-01131-7_10

  • DOI: https://doi.org/10.1007/978-3-662-01131-7_10

  • Publisher Name: Physica, Heidelberg

  • Print ISBN: 978-3-7908-1131-5

  • Online ISBN: 978-3-662-01131-7
