Ordinal Forests

Abstract

The ordinal forest method is a random forest–based prediction method for ordinal response variables. Ordinal forests allow prediction using both low-dimensional and high-dimensional covariate data and can additionally be used to rank covariates with respect to their importance for prediction. An extensive comparison study reveals that ordinal forests tend to outperform competitors in terms of prediction performance. Moreover, the covariate importance measure currently used by ordinal forests discriminates influential covariates from noise covariates at least as well as the measures used by competitors. Several further important properties of the ordinal forest algorithm are studied in additional investigations. The rationale underlying ordinal forests, namely using optimized score values in place of the class values of the ordinal response variable, is in principle applicable to any regression method for continuous outcomes, not only to the random forests considered in the ordinal forest method.
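The idea of replacing ordinal class values with numeric scores can be illustrated with a minimal sketch. This is not the ordinal forest algorithm: the score values below are fixed by hand rather than optimized, a simple nearest-neighbour average stands in for the regression method, and the function name `fit_predict_ordinal`, the class labels, and the data are all hypothetical.

```python
from statistics import mean


def fit_predict_ordinal(train_x, train_y, test_x, scores, k=3):
    """Predict ordinal classes by regressing on numeric scores.

    scores maps each ordinal class label to a numeric score. The scores
    are an assumption chosen by hand here; the ordinal forest method
    optimizes them instead.
    """
    # Replace each class label by its numeric score.
    train_scores = [scores[label] for label in train_y]
    predictions = []
    for x in test_x:
        # Stand-in regression method: mean score of the k nearest
        # training points on a single covariate.
        nearest = sorted(range(len(train_x)),
                         key=lambda i: abs(train_x[i] - x))[:k]
        predicted_score = mean(train_scores[i] for i in nearest)
        # Map the continuous prediction back to the class whose score
        # is closest to it.
        predictions.append(
            min(scores, key=lambda label: abs(scores[label] - predicted_score))
        )
    return predictions


# Toy data: one covariate, three ordered classes.
train_x = [0.1, 0.2, 0.3, 1.0, 1.1, 1.2, 2.0, 2.1, 2.2]
train_y = ["low"] * 3 + ["medium"] * 3 + ["high"] * 3
# Unequal spacing encodes that "high" is further from "medium"
# than "medium" is from "low".
scores = {"low": 0.0, "medium": 1.0, "high": 3.0}

print(fit_predict_ordinal(train_x, train_y, [0.15, 1.05, 2.3], scores))
```

Because the scores only enter through the response, the nearest-neighbour step could be swapped for any other regression method for continuous outcomes without changing the rest of the sketch.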

Acknowledgments

The author thanks Giuseppe Casalicchio for proofreading and comments and Jenny Lee for language corrections. This work was supported by the German Science Foundation (DFG-Einzelförderung BO3139/6-1 to Anne-Laure Boulesteix).

Author information

Corresponding author

Correspondence to Roman Hornung.

About this article

Cite this article

Hornung, R. Ordinal Forests. J Classif 37, 4–17 (2020). https://doi.org/10.1007/s00357-018-9302-x

Keywords

  • Prediction
  • Ordinal response variable
  • Covariate importance ranking
  • Random forest