# Consistency of the estimator of binary response models based on AUC maximization

## Abstract

This paper examines the asymptotic properties of a binary response model estimator based on maximization of the area under the receiver operating characteristic curve (AUC). Under certain assumptions, AUC maximization is a consistent method of estimating binary response models up to normalizations. Because the AUC is equivalent to the Mann-Whitney U statistic and the Wilcoxon rank test statistic, maximizing the area under the ROC curve is equivalent to maximizing those statistics. Compared with parametric methods such as logit and probit, AUC maximization relaxes the assumptions about the error distribution but imposes some restrictions on the distribution of the explanatory variables; these restrictions can be easily checked because the explanatory variables are observable.
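The equivalence between the AUC and the Mann-Whitney U statistic can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `auc_pairwise` and `auc_mann_whitney`, the simulated data, and the coefficient vector `beta` are all assumptions made for illustration. The sketch computes the AUC once as the share of correctly ordered (positive, negative) pairs and once from rank sums. It also shows that rescaling the index by a positive constant leaves the AUC unchanged, which is one way to see why coefficients can be recovered only up to a normalization.

```python
import numpy as np

def auc_pairwise(scores, y):
    """AUC as the share of (positive, negative) pairs ordered correctly; ties count 1/2."""
    pos = scores[y == 1]
    neg = scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def auc_mann_whitney(scores, y):
    """AUC as U / (n1 * n0), with U the Mann-Whitney statistic from rank sums."""
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    # Replace the ranks of tied scores by their average rank.
    for value in np.unique(scores):
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    n1 = int(np.sum(y == 1))
    n0 = int(np.sum(y == 0))
    u = ranks[y == 1].sum() - n1 * (n1 + 1) / 2
    return float(u / (n1 * n0))

# Simulated binary response data (illustrative assumption, not from the paper):
# y = 1 if x'beta + error > 0, with a non-normal (Laplace) error term.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 2))
beta = np.array([1.0, -0.5])
y = (x @ beta + rng.laplace(size=500) > 0).astype(int)

index = x @ beta
print(auc_pairwise(index, y), auc_mann_whitney(index, y))
# Rescaling the index by a positive constant preserves the ranking, hence the AUC.
print(auc_pairwise(2.0 * index, y))
```

Because the criterion depends on the index only through its ranking, any estimator built on it has to fix the scale of the coefficient vector, for example by setting one coefficient to one or the norm to one.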

## Keywords

ROC AUC maximization Consistency Binary response model

## Mathematics Subject Classification (2010)

62G05 62G20 62J15

## Notes

### Acknowledgments

I would like to thank the participants at the 12th Symposium of Mathematics and its Applications (2009) in Timisoara. Furthermore, I wish to thank Alfredas Račkauskas, Dmitrij Celov and Irena Mikolajun for their useful comments and Steve Guttenberg for his help with the English language.
