A Survey on ROC-based Ordinal Regression

Abstract

Ordinal regression can be seen as a special case of preference learning in which the class labels associated with data instances take values from a finite ordered set. In such a setting, the classes usually carry a linguistic interpretation that humans attach in order to subdivide the data into a number of preference bins. In this chapter, we give a general survey of ordinal regression from a machine learning point of view. In particular, we elaborate on some important connections with ROC analysis that were recently introduced by the present authors. First, we discuss the important role of an underlying ranking function in ordinal regression models, as well as its impact on the performance evaluation of such models. Subsequently, we describe a new ROC-based performance measure that directly evaluates the underlying ranking function, and we place it in the more general context of ROC analysis as the volume under an r-dimensional ROC surface (VUS) for the general case of r classes. Furthermore, we discuss the scalability of this measure and show that it can be computed very efficiently for large samples. Finally, we present a kernel-based learning algorithm that optimizes VUS as a specific case of structured support vector machines.
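The abstract does not spell out how VUS is estimated, so the following Python sketch rests on an assumption: that the empirical VUS is the fraction of r-tuples (one instance drawn from each class) that the ranking function orders strictly in agreement with the class order. The function names `vus_naive` and `vus_fast`, and the convention that labels are integers 1..r, are hypothetical and chosen only for illustration. The second function shows one way such a count might be obtained after a single sort instead of enumerating all tuples; it is not presented as the chapter's own algorithm.

```python
from itertools import product


def _class_sizes(labels, r):
    """Count instances per class; labels are assumed to be integers 1..r."""
    sizes = [0] * (r + 1)
    for y in labels:
        sizes[y] += 1
    return sizes


def vus_naive(scores, labels, r):
    """Enumerate every r-tuple (one instance per class) and count the
    tuples whose scores are strictly increasing in class order."""
    groups = [[s for s, y in zip(scores, labels) if y == k]
              for k in range(1, r + 1)]
    total = 1
    for g in groups:
        total *= len(g)
    if total == 0:
        raise ValueError("every class needs at least one instance")
    correct = sum(
        all(a < b for a, b in zip(t, t[1:])) for t in product(*groups)
    )
    return correct / total


def vus_fast(scores, labels, r):
    """Same quantity via one sort and a dynamic-programming pass.

    chains[k] holds the number of strictly increasing chains that use one
    instance from each of the classes 1..k among the instances seen so far.
    """
    sizes = _class_sizes(labels, r)
    total = 1
    for k in range(1, r + 1):
        total *= sizes[k]
    if total == 0:
        raise ValueError("every class needs at least one instance")

    order = sorted(zip(scores, labels))
    chains = [0] * (r + 1)
    chains[0] = 1  # the empty chain
    i = 0
    while i < len(order):
        # Handle tied scores as one batch so that tied instances
        # never extend each other (strict inequality is required).
        pending = [0] * (r + 1)
        j = i
        while j < len(order) and order[j][0] == order[i][0]:
            y = order[j][1]
            pending[y] += chains[y - 1]
            j += 1
        for k in range(1, r + 1):
            chains[k] += pending[k]
        i = j
    return chains[r] / total
```

Under this assumed definition, r = 2 gives the fraction of correctly ordered (class 1, class 2) pairs, which is the familiar AUC estimate without half credit for ties. The naive version grows with the product of the class sizes, whereas the sorted pass needs one sort plus a single sweep over the instances.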

Notes

  1. Unlike Nakas and Yiannoutsos [48], who mainly place ROC analysis in a medical decision-making perspective, we focus on its use in a machine learning context. The approach in [22] is limited to the three-class case.

  2. The MOSEK package can be freely downloaded for noncommercial use from www.mosek.com.

References

  1. S. Agarwal, T. Graepel, R. Herbrich, S. Har-Peled, D. Roth, Generalization bounds for the area under the ROC curve. J. Mach. Learn. Res. 6, 393–425 (2005)

  2. A. Agresti, Categorical Data Analysis, 2nd edn. (Wiley, 2002)

  3. K. Ataman, N. Street, Y. Zhang, Learning to rank by maximizing AUC with linear programming, in Proceedings of the IEEE International Joint Conference on Neural Networks (Vancouver, BC, Canada, 2006), pp. 123–129

  4. C.J.C. Burges, A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov. 2(2), 121–167 (1998)

  5. Z. Cao, T. Qin, T. Liu, M. Tsai, H. Li, Learning to rank: from pairwise approach to listwise approach, in Proceedings of the International Conference on Machine Learning (Corvallis, OR, USA, 2007), pp. 129–136

  6. K. Cao-Van, Supervised Ranking, from semantics to algorithms. PhD thesis, Ghent University, Belgium, 2003

  7. W. Chu, Z. Ghahramani, Gaussian processes for ordinal regression. J. Mach. Learn. Res. 6, 1019–1041 (2005)

  8. W. Chu, Z. Ghahramani, Preference learning with Gaussian processes, in Proceedings of the International Conference on Machine Learning (Bonn, Germany, 2005), pp. 137–144

  9. W. Chu, S. Keerthi, New approaches to support vector ordinal regression, in Proceedings of the International Conference on Machine Learning (Bonn, Germany, 2005), pp. 321–328

  10. W. Chu, S. Keerthi, Support vector ordinal regression. Neural Comput. 19(3), 792–815 (2007)

  11. S. Clémençon, N. Vayatis, Ranking the best instances. J. Mach. Learn. Res. 8, 2671–2699 (2007)

  12. W. Cohen, R. Schapire, Y. Singer, Learning to order things, in Advances in Neural Information Processing Systems, vol. 10 (MIT, Vancouver, Canada, 1998), pp. 451–457

  13. C. Cortes, M. Mohri, AUC optimization versus error rate minimization, in Advances in Neural Information Processing Systems, vol. 16 (MIT, Vancouver, Canada, 2003), pp. 313–320

  14. C. Cortes, M. Mohri, Confidence intervals for the area under the ROC curve, in Advances in Neural Information Processing Systems, vol. 17 (MIT, Vancouver, Canada, 2004), pp. 305–312

  15. C. Cortes, M. Mohri, A. Rastogi, Magnitude-preserving ranking algorithms, in Proceedings of the International Conference on Machine Learning (Corvallis, OR, USA, 2007), pp. 169–176

  16. K. Crammer, Y. Singer, Pranking with ranking, in Proceedings of the Conference on Neural Information Processing Systems (Vancouver, Canada, 2001), pp. 641–647

  17. N. Cristianini, J. Shawe-Taylor, An Introduction to Support Vector Machines (Cambridge University Press, 2000)

  18. B. De Baets, H. De Meyer, Transitivity frameworks for reciprocal relations: cycle-transitivity versus FG-transitivity. Fuzzy Sets Syst. 152, 249–270 (2005)

  19. B. De Baets, H. De Meyer, B. De Schuymer, S. Jenei, Cyclic evaluation of transitivity of reciprocal relations. Soc. Choice Welfare 26, 217–238 (2006)

  20. B. De Schuymer, H. De Meyer, B. De Baets, Cycle-transitive comparison of independent random variables. J. Multivar. Anal. 96, 352–373 (2005)

  21. B. De Schuymer, H. De Meyer, B. De Baets, S. Jenei, On the cycle-transitivity of the dice model. Theory Decis. 54, 164–185 (2003)

  22. S. Dreiseitl, L. Ohno-Machado, M. Binder, Comparing three-class diagnostic tests by three-way ROC analysis. Med. Decis. Mak. 20, 323–331 (2000)

  23. E. Frank, M. Hall, A simple approach to ordinal classification, in Proceedings of the European Conference on Machine Learning (Freiburg, Germany, 2001), pp. 146–156

  24. Y. Freund, R. Iyer, R. Schapire, Y. Singer, An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res. 4, 933–969 (2003)

  25. J. Fürnkranz, Round robin classification. J. Mach. Learn. Res. 2, 723–747 (2002)

  26. J. Fürnkranz, E. Hüllermeier, Pairwise preference learning and ranking. Lect. Notes Comput. Sci. 2837, 145–156 (2003)

  27. M. Gönen, G. Heller, Concordance probability and discriminatory power in proportional hazards regression. Biometrika 92(4), 965–970 (2005)

  28. D. Hand, R. Till, A simple generalization of the area under the ROC curve for multiple class problems. Mach. Learn. 45, 171–186 (2001)

  29. J. Hanley, B. McNeil, The meaning and use of the area under a receiver operating characteristics curve. Radiology 143, 29–36 (1982)

  30. J. Hanley, B. McNeil, A method of comparing receiver operating characteristics curves derived from the same class. Radiology 148, 839–843 (1983)

  31. S. Har-Peled, D. Roth, D. Zimak, Constraint classification: A new approach to multi-class classification and ranking, in Advances in Neural Information Processing Systems, vol. 15 (MIT, Vancouver, Canada, 2002), pp. 785–792

  32. E. Harrington, Online ranking/collaborative filtering using the perceptron algorithm, in Proceedings of the 20th International Conference on Machine Learning (Washington, USA, 2003), pp. 250–257

  33. T. Hastie, R. Tibshirani, Classification by pairwise coupling. Ann. Stat. 26(2), 451–471 (1998)

  34. M. Heller, B. Schnörr, Learning sparse representations by non-negative matrix factorization and sequential cone programming. J. Mach. Learn. Res. 7, 1385–1407 (2006)

  35. R. Herbrich, T. Graepel, K. Obermayer, Large margin rank boundaries for ordinal regression, in Advances in Large Margin Classifiers, ed. by A. Smola, P. Bartlett, B. Schölkopf, D. Schuurmans (MIT, 2000), pp. 115–132

  36. A. Herschtal, B. Raskutti, Optimizing area under the ROC curve using gradient descent, in Proceedings of the International Conference on Machine Learning (Bonn, Germany, 2005), pp. 49–57

  37. J. Higgins, Introduction to Modern Nonparametric Statistics (Duxbury, 2004)

  38. E. Hüllermeier, J. Hühn, Is an ordinal class structure useful in classifier learning? Int. J. Data Min. Model. Manag. 1(1), 45–67 (2009)

  39. T. Joachims, A support vector method for multivariate performance measures, in Proceedings of the International Conference on Machine Learning (Bonn, Germany, 2005), pp. 377–384

  40. J. Kelley, The cutting plane method for convex programs. J. Soc. Ind. Appl. Math. 9, 703–712 (1960)

  41. S. Kramer, G. Widmer, B. Pfahringer, M. Degroeve, Prediction of ordinal classes using regression trees. Fundam. Informaticae 24, 1–15 (2000)

  42. G. Lanckriet, N. Cristianini, P. Bartlett, L. El Ghaoui, M. Jordan, Learning the kernel matrix with semidefinite programming. J. Mach. Learn. Res. 5, 27–72 (2004)

  43. E.L. Lehmann, Nonparametrics: Statistical Methods Based on Ranks (Holden Day, 1975)

  44. S. Lievens, B. De Baets, K. Cao-Van, A probabilistic framework for the design of instance-based supervised ranking algorithms in an ordinal setting. Ann. Oper. Res. 163, 115–142 (2008)

  45. H. Lin, L. Li, Large-margin thresholded ensembles for ordinal regression: Theory and practice. Lect. Notes Comput. Sci. 4264, 319–333 (2006)

  46. R. Luce, P. Suppes, Handbook of Mathematical Psychology, chapter Preference, Utility and Subjective Probability (Wiley, 1965), pp. 249–410

  47. P. McCullagh, Regression models for ordinal data. J. R. Stat. Soc. B 42(2), 109–142 (1980)

  48. C. Nakas, C. Yiannoutsos, Ordered multiple-class ROC analysis with continuous measurements. Stat. Med. 22, 3437–3449 (2004)

  49. M. Öztürk, A. Tsoukiàs, Ph. Vincke, Preference modelling, in Multiple Criteria Decision Analysis. State of the Art Surveys, ed. by J. Figueira, S. Greco, M. Ehrgott (Springer, 2005), pp. 27–71

  50. J. Platt, N. Cristianini, J. Shawe-Taylor, Large margin DAGs for multiclass classification. Adv. Neural Inf. Process. Syst. 12, 547–553 (2000)

  51. R. Potharst, J.C. Bioch, Decision trees for ordinal classification. Intell. Data Process. 4(2) (2000)

  52. J.D. Rennie, N. Srebro, Loss functions for preference levels: Regression with discrete, ordered labels, in Proceedings of the IJCAI Multidisciplinary Workshop on Advances in Preference Handling (Edinburgh, Scotland, 2005), pp. 180–186

  53. R. Rifkin, A. Klautau, In defense of one-versus-all classification. J. Mach. Learn. Res. 5, 101–143 (2004)

  54. B. Schölkopf, A. Smola, Learning with Kernels, Support Vector Machines, Regularisation, Optimization and Beyond (MIT, 2002)

  55. A. Shashua, A. Levin, Ranking with large margin principle: Two approaches, in Advances in Neural Information Processing Systems, vol. 16 (MIT, Vancouver, Canada, 2003), pp. 937–944

  56. V. Torra, J. Domingo-Ferrer, J.M. Mateo-Sanz, M. Ng, Regression for ordinal variables without underlying continuous variables. Inf. Sci. 176, 465–476 (2006)

  57. I. Tsochantaridis, T. Joachims, T. Hofmann, Y. Altun, Large margin methods for structured and interdependent output variables. J. Mach. Learn. Res. 6, 1453–1484 (2005)

  58. G. Tutz, K. Hechenbichler, Aggregating classifiers with ordinal response structure. J. Stat. Comput. Simul. 75(5) (2004)

  59. W. Waegeman, B. De Baets, On the ERA ranking representability of multi-class classifiers. Artif. Intell. (2009), submitted

  60. W. Waegeman, B. De Baets, L. Boullart, Learning a layered graph with a maximal number of paths connecting source and sink, in Proceedings of the ICML Workshop on Constrained Optimization and Structured Output Spaces (Corvallis, OR, USA, 2007)

  61. W. Waegeman, B. De Baets, L. Boullart, Learning layered ranking functions with structured support vector machines. Neural Netw. 21(10), 1511–1523 (2008)

  62. W. Waegeman, B. De Baets, L. Boullart, On the scalability of ordered multi-class ROC analysis. Comput. Stat. Data Anal. 52, 3371–3388 (2008)

  63. W. Waegeman, B. De Baets, L. Boullart, ROC analysis in ordinal regression learning. Pattern Recognit. Lett. 29, 1–9 (2008)

  64. J. Xu, Y. Cao, H. Li, Y. Huang, Cost-sensitive learning of SVM for ranking, in Proceedings of the 17th European Conference on Machine Learning (Berlin, Germany, 2006), pp. 833–840

  65. L. Yan, R. Dodier, M. Mozer, R. Wolniewicz, Optimizing classifier performance via an approximation to the Wilcoxon-Mann-Whitney statistic, in Proceedings of the International Conference on Machine Learning (Washington, DC, USA, 2003), pp. 848–855

  66. S. Yu, K. Yu, V. Tresp, H. Kriegel, Collaborative ordinal regression, in Proceedings of the International Conference on Machine Learning (Pittsburgh, PA, 2006), pp. 1089–1096

Acknowledgements

Willem Waegeman has been supported by a grant from the “Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT-Vlaanderen)”. He is currently supported by the Research Foundation Flanders.

Corresponding author

Correspondence to Willem Waegeman.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Waegeman, W., Baets, B.D. (2010). A Survey on ROC-based Ordinal Regression. In: Fürnkranz, J., Hüllermeier, E. (eds) Preference Learning. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14125-6_7

  • DOI: https://doi.org/10.1007/978-3-642-14125-6_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14124-9

  • Online ISBN: 978-3-642-14125-6

  • eBook Packages: Computer Science, Computer Science (R0)