Regression analysis: likelihood, error and entropy

Full Length Paper Series B


In a regression with independent and identically distributed normal residuals, the log-likelihood function yields an empirical form of the \(\mathcal{L}^2\)-norm, whereas the normal distribution can be obtained as a solution of differential entropy maximization subject to a constraint on the \(\mathcal{L}^2\)-norm of a random variable. The \(\mathcal{L}^1\)-norm and the double exponential (Laplace) distribution are related in a similar way. These are examples of an “inter-regenerative” relationship. In fact, the \(\mathcal{L}^2\)-norm and the \(\mathcal{L}^1\)-norm are just particular cases of the general error measures introduced by Rockafellar et al. (Finance Stoch 10(1):51–74, 2006) on a space of random variables. General error measures are not necessarily symmetric with respect to ups and downs of a random variable, which is a desirable property in finance applications, where gains and losses should be treated differently. This work identifies the set of all error measures, denoted by \(\mathscr{E}\), and the set of all probability density functions (PDFs) that form “inter-regenerative” relationships (through log-likelihood and entropy maximization). It also shows that M-estimators, which arise in robust regression but, in general, are not error measures, form “inter-regenerative” relationships with all PDFs. In fact, the set of M-estimators that are error measures coincides with \(\mathscr{E}\). On the other hand, M-estimators are a particular case of L-estimators, which also arise in robust regression. The set of L-estimators that are error measures is identified: it contains \(\mathscr{E}\) and the so-called trimmed \(\mathcal{L}^p\)-norms.
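The two directions of the relationship described above can be checked numerically on a toy location model. The following is only an illustrative sketch, not code from the paper: the sample `data`, the grid search, and all function names are hypothetical, and the scale parameters are fixed at 1 for simplicity. It checks that maximizing a normal likelihood amounts to minimizing the sum of squared residuals (giving the sample mean), that maximizing a Laplace likelihood amounts to minimizing the sum of absolute residuals (giving the sample median), and, using the standard closed-form differential entropies, that the normal density has the larger entropy at equal variance while the Laplace density has the larger entropy at equal mean absolute deviation.

```python
import math

def gaussian_nll(residuals, sigma=1.0):
    """Negative log-likelihood of i.i.d. N(0, sigma^2) residuals.
    Up to constants this is the empirical squared L2-norm of the residuals."""
    n = len(residuals)
    sse = sum(r * r for r in residuals)
    return 0.5 * n * math.log(2 * math.pi * sigma ** 2) + sse / (2 * sigma ** 2)

def laplace_nll(residuals, b=1.0):
    """Negative log-likelihood of i.i.d. Laplace(0, b) residuals.
    Up to constants this is the empirical L1-norm of the residuals."""
    n = len(residuals)
    sae = sum(abs(r) for r in residuals)
    return n * math.log(2 * b) + sae / b

def fit_location(data, nll, grid):
    """Grid search for the constant c minimizing nll of the residuals x - c."""
    return min(grid, key=lambda c: nll([x - c for x in data]))

def normal_entropy(sigma):
    """Differential entropy of N(mu, sigma^2)."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

def laplace_entropy(b):
    """Differential entropy of Laplace(mu, b)."""
    return 1.0 + math.log(2 * b)

# Hypothetical sample with one large value: mean is 4.0, median is 3.0.
data = [1.0, 2.0, 3.0, 4.0, 10.0]
grid = [i / 100 for i in range(0, 1001)]

c_gauss = fit_location(data, gaussian_nll, grid)    # least-squares location
c_laplace = fit_location(data, laplace_nll, grid)   # least-absolute-deviation location

# Equal variance 1: Laplace has variance 2*b^2, so b = 1/sqrt(2).
h_normal_var = normal_entropy(1.0)
h_laplace_var = laplace_entropy(1.0 / math.sqrt(2.0))

# Equal mean absolute deviation 1: E|X| = b for Laplace and
# sigma*sqrt(2/pi) for the normal, so sigma = sqrt(pi/2).
h_laplace_mad = laplace_entropy(1.0)
h_normal_mad = normal_entropy(math.sqrt(math.pi / 2.0))

print(c_gauss, c_laplace)
print(h_normal_var > h_laplace_var, h_laplace_mad > h_normal_mad)
```

The comparison involves only two densities, so it illustrates the entropy-maximization statement rather than proving it.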


Keywords

Regression · Likelihood · Entropy · Error measure · M-estimator · L-estimator

Mathematics Subject Classification

90C90 90C25 90C15 



We are grateful to the referees for their comments and suggestions, which helped to improve the quality of the paper. The first author thanks the University of Leicester for granting him academic study leave to do this research.


  1. Alfons, A., Croux, C., Gelper, S.: Sparse least trimmed squares regression for analyzing high-dimensional large data sets. Ann. Appl. Stat. 7(1), 226–248 (2013)
  2. Bartolucci, F., Scaccia, L.: The use of mixtures for dealing with non-normal regression errors. Comput. Stat. Data Anal. 48(4), 821–834 (2005)
  3. Bernholt, T.: Computing the least median of squares estimator in time \(O(n^d)\). In: International Conference on Computational Science and Its Applications, pp. 697–706. Springer (2005)
  4. Boscovich, R.J.: De litteraria expeditione per pontificiam ditionem, et synopsis amplioris operis, ac habentur plura ejus ex exemplaria etiam sensorum impressa. Bononiensi Scientarum et Artum Instituto Atque Academia Commentarii 4, 353–396 (1757)
  5. Box, G.: Non-normality and tests on variances. Biometrika 40, 318–335 (1953)
  6. Cover, T., Thomas, J.: Elements of Information Theory. Wiley, New York (2012)
  7. Edgeworth, F.: On observations relating to several quantities. Hermathena 6(13), 279–285 (1887)
  8. Efron, B.: Regression percentiles using asymmetric squared error loss. Stat. Sin. 1(1), 93–125 (1991)
  9. Föllmer, H., Schied, A.: Stochastic Finance, 3rd edn. de Gruyter, Berlin (2011)
  10. Gauss, C.F.: Theoria motus corporum coelestium in sectionibus conicis solem ambientium. sumtibus Frid. Perthes et IH Besser (1809)
  11. Grechuk, B., Molyboha, A., Zabarankin, M.: Maximum entropy principle with general deviation measures. Math. Oper. Res. 34(2), 445–467 (2009)
  12. Grechuk, B., Molyboha, A., Zabarankin, M.: Chebyshev inequalities with law-invariant deviation measures. Probab. Eng. Inf. Sci. 24(1), 145–170 (2010)
  13. Grechuk, B., Zabarankin, M.: Schur convex functionals: Fatou property and representation. Math. Finance 22(2), 411–418 (2012)
  14. Grechuk, B., Zabarankin, M.: Inverse portfolio problem with mean-deviation model. Eur. J. Oper. Res. 234(2), 481–490 (2014)
  15. Grechuk, B., Zabarankin, M.: Sensitivity analysis in applications with deviation, risk, regret, and error measures. SIAM J. Optim. 27(4), 2481–2507 (2017)
  16. Gu, Y., Zou, H.: High-dimensional generalizations of asymmetric least squares regression and their applications. Ann. Stat. 44(6), 2661–2694 (2016)
  17. Harter, L.: The method of least squares and some alternatives: Part I. In: International Statistical Review/Revue Internationale de Statistique, pp. 147–174 (1974)
  18. Hosking, J., Balakrishnan, N.: A uniqueness result for L-estimators, with applications to L-moments. Stat. Methodol. 24, 69–80 (2015)
  19. Huber, P.: Robust estimation of a location parameter. Ann. Math. Stat. 35(1), 73–101 (1964)
  20. Huber, P.: Robust Statistics. Wiley, New York (1981)
  21. Jaynes, E.T.: Information theory and statistical mechanics (notes by the lecturer). Stat. Phys. 3 1, 181 (1963)
  22. Jouini, E., Schachermayer, W., Touzi, N.: Law invariant risk measures have the Fatou property. Adv. Math. Econ. 9, 49–71 (2006)
  23. Koenker, R., Bassett Jr., G.: Regression quantiles. Econometrica 46(1), 33–50 (1978)
  24. Krokhmal, P.: Higher moment coherent risk measures. Quant. Finance 7(4), 373–387 (2007)
  25. Kullback, S., Leibler, R.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
  26. Laplace, P.S.: Traité de mécanique céleste, vol. 2. J. B. M. Duprat, Paris (1799)
  27. Lee, W.M., Hsu, Y.C., Kuan, C.M.: Robust hypothesis tests for M-estimators with possibly non-differentiable estimating functions. Econom. J. 18(1), 95–116 (2015)
  28. Legendre, A.M.: Nouvelles méthodes pour la détermination des orbites des comètes. 1. F. Didot, Paris (1805)
  29. Lisman, J., Van Zuylen, M.: Note on the generation of most probable frequency distributions. Stat. Neerl. 26(1), 19–23 (1972)
  30. Loh, P.L.: Statistical consistency and asymptotic normality for high-dimensional robust \(M\)-estimators. Ann. Stat. 45(2), 866–896 (2017)
  31. Mafusalov, A., Uryasev, S.: CVaR (superquantile) norm: stochastic case. Eur. J. Oper. Res. 249(1), 200–208 (2016)
  32. Morales-Jimenez, D., Couillet, R., McKay, M.: Large dimensional analysis of robust M-estimators of covariance with outliers. IEEE Trans. Signal Process. 63(21), 5784–5797 (2015)
  33. Mount, D., Netanyahu, N., Piatko, C., Silverman, R., Wu, A.: On the least trimmed squares estimator. Algorithmica 69(1), 148–183 (2014)
  34. Rockafellar, R.T., Royset, J.: Measures of residual risk with connections to regression, risk tracking, surrogate models, and ambiguity. SIAM J. Optim. 25(2), 1179–1208 (2015)
  35. Rockafellar, R.T., Royset, J.: Random variables, monotone relations, and convex analysis. Math. Program. 148(1–2), 297–331 (2014)
  36. Rockafellar, R.T., Uryasev, S.: Conditional value-at-risk for general loss distributions. J. Bank. Finance 26(7), 1443–1471 (2002)
  37. Rockafellar, R.T., Uryasev, S.: The fundamental risk quadrangle in risk management, optimization and statistical estimation. Surv. Oper. Res. Manag. Sci. 18(1), 33–53 (2013)
  38. Rockafellar, R.T., Uryasev, S., Zabarankin, M.: Generalized deviations in risk analysis. Finance Stoch. 10(1), 51–74 (2006)
  39. Rockafellar, R.T., Uryasev, S., Zabarankin, M.: Risk tuning with generalized linear regression. Math. Oper. Res. 33(3), 712–729 (2008)
  40. Rousseeuw, P., Leroy, A.: Robust Regression and Outlier Detection, vol. 589. Wiley, New York (2005)
  41. Rousseeuw, P., Van Driessen, K.: Computing LTS regression for large data sets. Data Min. Knowl. Disc. 12(1), 29–45 (2006)
  42. Rousseeuw, P.J.: Least median of squares regression. J. Am. Stat. Assoc. 79, 871–880 (1984)
  43. Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423, 623–656 (1948)
  44. Xie, S., Zhou, Y., Wan, A.: A varying-coefficient expectile model for estimating value at risk. J. Bus. Econ. Stat. 32(4), 576–592 (2014)
  45. Zabarankin, M., Uryasev, S.: Statistical Decision Problems: Selected Concepts and Portfolio Safeguard Case Studies. Springer, Berlin (2014)

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society 2018

Authors and Affiliations

  1. Department of Mathematics, University of Leicester, Leicester, UK
  2. Department of Mathematical Sciences, Stevens Institute of Technology, Hoboken, USA
