Statistical Methods & Applications, Volume 22, Issue 3, pp 381–390

Consistency of the estimator of binary response models based on AUC maximization

  • Igor Fedotenkov

Abstract

This paper examines the asymptotic properties of a binary response model estimator based on maximization of the area under the receiver operating characteristic curve (AUC). Under certain assumptions, AUC maximization is a consistent method of estimating binary response models, up to normalizations. Because the AUC is equivalent to the Mann-Whitney U statistic and the Wilcoxon rank-sum statistic, maximizing the area under the ROC curve is equivalent to maximizing those statistics. Compared with parametric methods such as logit and probit, AUC maximization relaxes the assumptions about the error distribution but imposes some restrictions on the distribution of the explanatory variables. These restrictions are easy to check because the explanatory variables are observable.
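
As a rough illustration of the equivalence mentioned in the abstract, the Python sketch below computes the AUC of a linear index in two ways: as the Mann-Whitney U statistic divided by the number of positive/negative pairs, and from the Wilcoxon rank sum of the positive class. The data `X` and `y`, the coefficient vector `beta`, and all function names are hypothetical and are not taken from the paper. The code is not the paper's estimator; it only evaluates the objective that such an estimator would maximize.

```python
from itertools import product


def auc_pairwise(scores, labels):
    """AUC as the Mann-Whitney U statistic divided by n1 * n0."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count correctly ordered (positive, negative) pairs; a tie counts one half.
    u = sum(1.0 if p > n else 0.5 if p == n else 0.0
            for p, n in product(pos, neg))
    return u / (len(pos) * len(neg))


def auc_rank_sum(scores, labels):
    """AUC from the Wilcoxon rank sum of the positive class (midranks for ties)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    n1 = sum(labels)
    n0 = len(labels) - n1
    w = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (w - n1 * (n1 + 1) / 2) / (n1 * n0)


def index_scores(beta, X):
    """Linear index x'beta for every observation (hypothetical helper)."""
    return [sum(b * x for b, x in zip(beta, row)) for row in X]


# Made-up data: two explanatory variables and a binary response.
X = [(0.2, 1.0), (1.5, -0.3), (-0.7, 0.4), (0.4, 0.8), (-1.2, -0.5), (0.1, -1.1)]
y = [1, 0, 0, 1, 0, 1]
beta = (1.0, 0.5)  # assumed coefficient vector, for illustration only

scores = index_scores(beta, X)
print(auc_pairwise(scores, y), auc_rank_sum(scores, y))

# Rescaling beta by a positive constant leaves the ranking unchanged.
scaled = index_scores(tuple(3.0 * b for b in beta), X)
print(auc_pairwise(scaled, y))
```

Because the AUC depends only on how the index ranks the observations, multiplying `beta` by any positive constant leaves it unchanged. This is one reason why coefficients obtained by maximizing the AUC can be pinned down only up to a normalization.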

Keywords

ROC, AUC maximization, Consistency, Binary response model

Mathematics Subject Classification (2010)

62G05, 62G20, 62J15

Notes

Acknowledgments

I would like to thank the participants at the 12th Symposium of Mathematics and its Applications (2009) in Timisoara. Furthermore, I wish to thank Alfredas Račkauskas, Dmitrij Celov and Irena Mikolajun for their useful comments and Steve Guttenberg for his help with the English language.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Department of Economics, University of Verona, Verona, Italy