Abstract
Tree-structured methods based on recursive partitioning provide a powerful tool for exploring the structure of data and for predicting the outcomes of new cases. This paper reviews partitioning algorithms, recalling in particular two-stage segmentation. Moving from exploratory trees to decision trees requires simplification methods and statistical criteria that define the final rule for classifying or predicting unseen cases. Alternative strategies are considered, which result either in the selection of one method over the others or in the definition of a compromise among them.
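To make the idea of two-stage segmentation concrete, the sketch below illustrates the general scheme in Python: in the first stage each predictor is ranked by the overall impurity reduction of the partition it induces, and only in the second stage is an exhaustive binary-split search run on the winning predictor. This is a minimal illustration, not the authors' exact algorithm; the Gini index, the function names, and the restriction to categorical predictors are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def predictor_reduction(x, y):
    """Stage 1 score: impurity reduction from partitioning y by all levels of x."""
    n = len(y)
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi)
    return gini(y) - sum(len(g) / n * gini(g) for g in groups.values())

def best_binary_split(x, y):
    """Stage 2: exhaustive search over binary groupings of the levels of x."""
    levels = sorted(set(x))
    n, total = len(y), gini(y)
    best_left, best_gain = None, -1.0
    for r in range(1, len(levels)):
        for left in combinations(levels, r):
            left = set(left)
            yl = [yi for xi, yi in zip(x, y) if xi in left]
            yr = [yi for xi, yi in zip(x, y) if xi not in left]
            gain = total - (len(yl) / n * gini(yl) + len(yr) / n * gini(yr))
            if gain > best_gain:
                best_left, best_gain = left, gain
    return best_left, best_gain

def two_stage_split(X, y):
    """Pick the predictor first, then search binary splits only within it."""
    best_var = max(X, key=lambda v: predictor_reduction(X[v], y))
    left_levels, gain = best_binary_split(X[best_var], y)
    return best_var, left_levels, gain
```

The computational payoff of this scheme is that the expensive exhaustive split search is run on one predictor rather than on all of them; the first-stage score acts as a fast screening criterion.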
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
Siciliano, R. (1998). Exploratory Versus Decision Trees. In: Payne, R., Green, P. (eds) COMPSTAT. Physica, Heidelberg. https://doi.org/10.1007/978-3-662-01131-7_10
DOI: https://doi.org/10.1007/978-3-662-01131-7_10
Publisher Name: Physica, Heidelberg
Print ISBN: 978-3-7908-1131-5
Online ISBN: 978-3-662-01131-7