Estimating and Assessing Distributional Regression

Analysing Inequalities in Germany

Part of the book series: SpringerBriefs in Statistics (BRIEFSSTATIST)

Abstract

This chapter sketches out how Structured Additive Distributional Regression relates to other regression models, such as classical linear models, quantile regression models and conditional transformation models. In addition, the chapter contains some remarks on covariate selection, model complexity and state space issues.

Essentially, all models are wrong, but some are useful.

George Box (1987)

Notes

  1.

    Interestingly enough, the advancement of stochastic concepts ran parallel to the appearance of the fictional novel on the literary scene (see Esposito 2007).

  2.

    For the cross-sectional data predominantly used in this book, OLS generally requires independent and identically distributed error terms that are exogenous. Moreover, the design matrix needs to be void of multicollinearity. Often homoscedasticity and normality are also assumed for inferential purposes. Although normality is frequently supposed, OLS does not necessarily require any assumption on the nature of the distribution of the error terms.

  3.

    Best in this context means having the lowest variance among all linear unbiased estimators.

  4.

    It should be stressed that the requirement that the estimator is unbiased is essential to the theorem. Biased estimators have been found which can have better mean-squared error (MSE) properties because they feature lower variance, e.g. estimators from ridge regression.

  5.

    Note that alternatively one could also minimise the negative of the likelihood or related forms thereof like the deviance, where one basically subtracts the likelihood of a given model from another benchmark value thought to represent a saturated model.

  6.

    It should be noted that the term convergence is not used in a strict mathematical sense, like almost sure convergence or stochastic convergence, but rather in a heavily heuristic sense. For a more formally inclined discussion of convergence and estimator properties, see, among others, White (2001).

  7.

    Fisher criticised Bayesians for assuming that uncertainties can be expressed in the form of probabilities (see Gigerenzer et al. 1989, p. 93). While it may be argued that, if that were true, this would prompt inconsistencies all over inferential statistics, it is a fair point that the operationalisation of implicit assumptions can be highly challenging, to say the least.

  8.

    Varying coefficients and, more generally, mixed models can be seen as a model class that arguably bridges the divide between the frequentist and the Bayesian paradigm. For more information on mixed models the reader is referred to Säfken (2015).

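  9.

    The direct quote from Marx (1983, p. 189) reads: “Die Gesellschaft besteht nicht aus Individuen, sondern drückt die Summe der Beziehungen, Verhältnisse aus, worin diese Individuen zueinander stehen” (in English: “Society does not consist of individuals, but expresses the sum of interrelations, the relations within which these individuals stand”).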

  10.

    I thus focus on potential labour market experience rather than actual labour market experience. This choice is grounded in the belief that experience is derived not only from employment spells but also from other life experiences such as caring for children.

  11.

    The reason for not using a continuous variable as in Mincer is that such a continuous variable would feature high point masses rather than a continuous spectrum of the distribution.

  12.

    For example, relating a normal distribution with two parameters to two variables linearly would require the estimation of 6 parameters when including a constant, while even a coarse grid of only ten analogously specified conditional quantiles to approximate the conditional distribution would already require the estimation of 30 parameters.

  13.

    For notational brevity I have excluded a distinct time-specific effect, \(b_t\). Since most economic panel databases for income analysis feature relatively few time periods, this effect can be captured by a few more linear effects in the second term of Eq. (3.17).

  14.

    ISEs can be seen either as fixed effects or as random effects. A fixed effect, in a frequentist setting, is conceived as an ultimately deterministic effect to which the estimator is thought to converge. A random effect, more akin to the Bayesian mode of thought, is conceived as a realisation of a random variable and is thus ultimately stochastic in nature. For the sake of simplicity and clarity of contrast, I will in the following assume that ISEs are estimated as fixed effects in the frequentist setting. This is warranted by the fact that fixed effects analysis is without a doubt much more frequent in panel data applications in economics than random effects analysis. The main reason for this is the emphasis put on unbiased estimators and the resultant application of the Hausman test (see Hausman 1978) which, in practice, almost invariably rejects the random effects specification on the basis that its estimated effects differ significantly from those of a fixed effects specification for the ISEs. Additionally, it may be argued that random effects lie somewhere in between including fixed effects and excluding all ISEs outright. In a random effects specification where the distribution of the random effects converges towards a Dirac delta function, the results converge towards those obtained from a specification without ISEs.

  15.

    Most software packages use transformations of the data, like the within-transformation or first differences, to speed up the estimation process dramatically (see StataCorp 2011). By virtue of the transformation, the direct estimation of the ISEs is evaded and the other linear and nonlinear effects can easily be estimated. However, evading the estimation of the ISEs does not reduce the model complexity; rather, major parts of it are shifted outside the estimation process and potentially forgotten about.

  16.

    In some way, the inclusion of ISEs can be seen as a form of kitchen sink regression, whereby any conceivably relevant variable is lumped into the regression.

  17.

    Especially in the context of income analysis, the possibility of controlling for otherwise unobserved/unobservable factors must seem like divine aid sent by Athena herself to the valiant economists given the herculean task of tackling the hydra of the labour market. Accounting for ISEs thus seems to tie down all but one of the biting and hissing heads that influence labour market outcomes such as income. Yet anyone studying Greek mythology would know of the often twofold nature of divine aid and should be wary of the potential variety of its implications. Analogously, I believe that one needs to be wary of conditioning on covariates as complex and opaque as ISEs.

  18.

    As I point out in Sect. 2.3, I believe that this iterative loop does not only include analyses with quantitative methods but should explicitly also contemplate insights from qualitative research.

  19.

    Note that I do not conceive state space to be constrained to temporally varying states, as is often done in the literature, but rather as a more general concept which allows one to capture different economic functions for different populations across time, space or any dimension by which populations may be differentiated. Nonetheless, like in most state space models, I conceive time to be the pivotal dimension. Indeed, based on my research on the development of professorial salaries (Sohn 2016), I am a strong believer in temporal dependence of income structures. This dependence may be conceived in the framework of state space models in general and hidden Markov models in particular.

  20.

    This is very similar to the problem of samples consisting only of WEIRD (Western, Educated, Industrialised, Rich, Democratic Countries) in psychology (see Bellemare et al. 2008).
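
The parameter counts in note 12 can be written out as a short piece of arithmetic. The symbols are introduced here for illustration only and do not appear in the chapter: \(p = 2\) covariates plus a constant, \(K = 2\) distribution parameters for the normal distribution, and \(Q = 10\) conditional quantiles, each with its own linear predictor.

```latex
\[
  \underbrace{K\,(p+1)}_{\text{distributional regression}} = 2 \cdot 3 = 6,
  \qquad
  \underbrace{Q\,(p+1)}_{\text{quantile grid}} = 10 \cdot 3 = 30 .
\]
```

Under these assumptions the quantile grid needs five times as many coefficients, and the gap widens with every additional quantile.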
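
The point made in note 15 about the within-transformation can be illustrated with a minimal sketch on simulated data. It is not a description of how Stata or any other package works internally. The function names (`simulate_panel`, `within_estimate`, `pooled_estimate`) and all numerical settings are assumptions chosen for illustration. The sketch estimates the slope from demeaned data, recovers the ISEs afterwards from the unit means, and computes the residuals of the model with one dummy per unit.

```python
import random


def simulate_panel(n_units=30, n_periods=5, beta=0.5, seed=1):
    """Simulate y_it = ise_i + beta * x_it + noise, with x_it correlated with ise_i."""
    rng = random.Random(seed)
    rows = []
    for unit in range(n_units):
        ise = rng.gauss(0.0, 1.0)  # individual-specific effect (ISE), hypothetical values
        for _ in range(n_periods):
            x = ise + rng.gauss(0.0, 1.0)
            y = ise + beta * x + rng.gauss(0.0, 0.1)
            rows.append((unit, x, y))
    return rows


def unit_means(rows):
    """Return {unit: (mean of x, mean of y)}."""
    totals = {}
    for unit, x, y in rows:
        sx, sy, n = totals.get(unit, (0.0, 0.0, 0))
        totals[unit] = (sx + x, sy + y, n + 1)
    return {u: (sx / n, sy / n) for u, (sx, sy, n) in totals.items()}


def within_estimate(rows):
    """Slope from unit-demeaned data; ISEs are recovered afterwards from the unit means."""
    means = unit_means(rows)
    sxy = sum((x - means[u][0]) * (y - means[u][1]) for u, x, y in rows)
    sxx = sum((x - means[u][0]) ** 2 for u, x, y in rows)
    beta_hat = sxy / sxx
    ise_hat = {u: my - beta_hat * mx for u, (mx, my) in means.items()}
    return beta_hat, ise_hat


def pooled_estimate(rows):
    """Slope of a regression with one common intercept, i.e. without ISEs."""
    n = len(rows)
    mx = sum(x for _, x, _ in rows) / n
    my = sum(y for _, _, y in rows) / n
    sxy = sum((x - mx) * (y - my) for _, x, y in rows)
    sxx = sum((x - mx) ** 2 for _, x, _ in rows)
    return sxy / sxx


rows = simulate_panel()
beta_within, ise_hat = within_estimate(rows)
beta_pooled = pooled_estimate(rows)

# Residuals of the model with one dummy per unit, using the recovered ISEs.
residuals = [(u, x, y - ise_hat[u] - beta_within * x) for u, x, y in rows]

print(f"within slope: {beta_within:.3f}, pooled slope: {beta_pooled:.3f}")
print(f"parameters implied by the within fit: {1 + len(ise_hat)}")
```

The residuals are orthogonal to the covariate and sum to zero within every unit, which are the normal equations of the regression with one dummy per unit. The transformed fit therefore still implies one parameter per unit in addition to the slope, even though those parameters never appear in the estimation step.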

References

  • Aitken AC (1936) IV.–on least squares and linear combination of observations. Proc R Soc Edinb 55:42–48. doi:10.1017/S0370164600014346

  • Arnold BC (2008) Pareto and generalized Pareto distributions. In: Chotikapanich D (ed) Modeling income distributions and Lorenz curves. Springer, New York, pp 119–145

  • Belitz C, Brezger A, Klein N, Kneib T, Lang S, Umlauf N (2015) BayesX - software for Bayesian inference in structured additive regression models: version 3.0. http://www.bayesx.org

  • Bellemare C, Kröger S, van Soest A (2008) Measuring inequity aversion in a heterogeneous population using experimental decisions and subjective probabilities. Econometrica 76(4):815–839

  • Bondell HD, Reich BJ, Wang H (2010) Noncrossing quantile regression curve estimation. Biometrika 97(4):825–838. doi:10.1093/biomet/asq048

  • Bourdieu P (1995) Sozialer Raum und “Klassen”. Leçon sur la leçon: 2 Vorlesungen, vol 500, 3rd edn. Suhrkamp, Frankfurt am Main

  • Bourdieu P, Passeron JC (2007) Die Erben: Studenten, Bildung und Kultur. UVK-Verl.-Ges, Konstanz

  • Box GEP (1976) Science and statistics. J Am Stat Assoc 71(356):791–799

  • Box GEP (1987) Empirical model-building and response surfaces. Wiley, New York

  • Box GEP, Tiao G (1973) Bayesian inference in statistical analysis. Addison-Wesley, Reading

  • Brezger A, Lang S (2006) Generalized structured additive regression based on Bayesian P-splines. Comput Stat Data Anal 50(4):967–991. doi:10.1016/j.csda.2004.10.011

  • Chernozhukov V, Fernandez-Val I, Melly B (2013) Inference on counterfactual distributions. Econometrica 81(6):2205–2268. doi:10.3982/ECTA10582

  • Cox DR (1997) The current position of statistics: a personal view. Int Stat Rev 65(3):261–290

  • Cox DR, Fitzpatrick R, Fletcher AE, Gore SM, Spiegelhalter DJ, Jones DR (1992) Quality-of-life assessment: can we keep it simple. J R Stat Soc A 155(3):353–393

  • Dagum C (1977) A new model of personal income distribution: specification and estimation. Economie Applicée 30:413–437

  • Daston L (2001) Wunder, Beweise und Tatsachen: Zur Geschichte der Rationalität, orig.-ausg. edn. Fischer-Taschenbuch-Verl., Frankfurt am Main

  • Diebold FX (2013) No hesitations 2013: a blog book. http://www.ssc.upenn.edu/~fdiebold/papers/paper117/NoHesitations2013.pdf

  • Dobb M (1973) Theories of value and distribution since Adam Smith: ideology and economic theory. Cambridge University Press, Cambridge

  • Duesenberry JS (1949) Income, saving and the theory of consumer behaviour, Galaxy Book edn (1967). Oxford University Press, New York

  • Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11(2):89–102

  • Esposito E (2007) Die Fiktion der wahrscheinlichen Realität. Suhrkamp, Frankfurt am Main

  • Fahrmeir L, Kneib T, Lang S, Marx BD (2013) Regression: models, methods and applications. Springer, Berlin

  • Fortin NM, Lemieux T, Firpo S (2011) Decomposition methods in economics. In: Ashenfelter O, Card DE (eds) Handbook of labor economics, vol 4A. North-Holland, Amsterdam, pp 1–102

  • Galton F (1889) Natural inheritance. Macmillan, London

  • Gigerenzer G, Swijtink Z, Porter T, Daston L, Beatty J, Krüger L (1989) The empire of chance: how probability changed science and everyday life. Cambridge University Press, Cambridge

  • Gillham NW (2001) A life of Sir Francis Galton: from African exploration to the birth of eugenics. Oxford University Press, New York

  • Gneiting T, Katzfuss M (2014) Probabilistic forecasting. Annu Rev Stat Appl 1:125–151

  • Hambuckers J, Kneib T, Langrock R, Sohn A (2016) A Markov-switching generalized additive model for compound Poisson processes, with applications to operational losses models. ZfS Working Paper 09/2016

  • Hausman JA (1978) Specification tests in econometrics. Econometrica 46(6):1251–1271. doi:10.2307/1913827

  • Herrnstein RJ (1971) IQ. Atl Mon 228(3):43–64

  • Hothorn T, Kneib T, Bühlmann P (2014) Conditional transformation models. J R Stat Soc Ser B (Stat Methodol) 76(1):3–27

  • Jensen A (1969) How much can we boost IQ and academic achievement. Harv Educ Rev 39(1):1–123

  • Kleiber C (1996) Dagum versus Singh-Maddala income distributions. Econ Lett 57:39–44

  • Klein N, Kneib T, Lang S (2015a) Bayesian generalized additive models for location, scale and shape for zero-inflated and over-dispersed count data. J Am Stat Assoc 110(509):405–419. doi:10.1080/01621459.2014.912955

  • Klein N, Kneib T, Lang S, Sohn A (2015b) Bayesian structured additive distributional regression with an application to regional income inequality in Germany. Ann Appl Stat 9(2):1024–1052. doi:10.1214/15-AOAS823

  • Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge

  • Koenker R, Bassett G (1978) Regression quantiles. Econometrica 46(1):33–50

  • Lancaster T (2004) An introduction to modern Bayesian econometrics. Blackwell, Oxford

  • Langrock R, Kneib T, Sohn A, DeRuiter S (2015a) Nonparametric inference in hidden Markov models using P-splines. Biometrics 71(2):520–528

  • Langrock R, Michelot T, Sohn A, Kneib T (2015b) Semiparametric stochastic volatility modelling using penalized splines. Comput Stat 30(2):517–537

  • Machado J, Mata J (2005) Counterfactual decomposition of changes in wage distributions using quantile regression. J Appl Econ 20(4):445–465

  • Marx K (1983) Ökonomische Manuskripte 1857/1858. In: Institut für Marxismus-Leninismus beim ZK der SED (ed) MEW, vol 42. Dietz, Berlin, pp 3–769

  • Melly B (2005) Public-private sector wage differentials in Germany: evidence from quantile regression. Empir Econ 30(2):505–520

  • Mincer J (1974) Schooling, experience, and earnings. National Bureau of Economic Research and distributed by Columbia University Press, New York

  • Oexle OG (2007) Krise des Historismus, Krise der Wirklichkeit: Wissenschaft, Kunst und Literatur 1880–1932. Vandenhoeck & Ruprecht, Göttingen

  • Pareto V (1897) Cours d’Economie Politique. In: Bousquet GH, Busino G (eds) New Edition (1964). Librairie Droz, Geneva

  • Peters J, Langbein J, Roberts G (forthcoming) Policy evaluation, randomized controlled trials and external validity - a systematic review. Econ Lett

  • Petty W (1899) Political arithmetic. In: Hull CHH (ed) The economic writings of Sir William Petty, together with The Observations upon Bills of Mortality, more probably by Captain John Graunt. Cambridge University Press, Cambridge, pp 233–313

  • Poincaré H (1902) La Science et l’Hypothèse. Flammarion, Paris

  • Pudney S (1999) On some statistical methods for modelling the incidence of poverty. Oxf Bull Econ Stat 61(3):385–408

  • R Core Team (2012) R: a language and environment for statistical computing. http://www.R-project.org

  • Rennies H, Kneib T (2015) Structural equation models for dealing with spatial confounding. In: Proceedings of the 30th international workshop on statistical modelling, volume 2, pp 231–234

  • Reulen H (2015) Modelling combined transition-type effects in multi-state models. PhD thesis, Georg-August-Universität Göttingen, Göttingen

  • Rigby RA, Stasinopoulos DM (2005) Generalized additive models for location, scale and shape. J R Stat Soc Ser C (Appl Stat) 54(3):507–554

  • Robert CP (2007) The Bayesian choice: from decision-theoretic foundations to computational implementation, 2nd edn. Springer texts in statistics. Springer, New York

  • Rothe C, Wied D (2013) Misspecification testing in a class of conditional distributional models. J Am Stat Assoc 108(501):314–324. doi:10.1080/01621459.2012.736903

  • Rue H, Held L (2005) Gaussian Markov random fields: theory and applications. Chapman & Hall/CRC, Boca Raton

  • Säfken B (2015) Model choice and variable selection in mixed and semiparametric models. PhD thesis, Georg-August-Universität Göttingen, Göttingen. http://d-nb.info/1069664928/34

  • Schulze Waltrup L, Sobotka F, Kneib T, Kauermann G (2015) Expectile and quantile regression - David and Goliath? Stat Modell 15(5):433–456. doi:10.1177/1471082X14561155

  • Singh SK, Maddala GS (1976) A function for size distribution of incomes. Econometrica 44:963–970

  • Sobotka F, Kneib T (2012) Geoadditive expectile regression. Comput Stat Data Anal 56(4):755–767. doi:10.1016/j.csda.2010.11.015

  • Sohn A (2016) Poor university professors? The relative earnings decline of German professors during the 20th century. Scand Econ Hist Rev 64(2):84–102. doi:10.1080/03585522.2016.1175374

  • StataCorp (2011) Stata statistical software. http://www.stata.com

  • Tableman M, Kim JS (2004) Survival analysis using S: analysis of time-to-event data. Chapman & Hall/CRC, Boca Raton, Fla

  • Vianelli S (1983) The family of normal and lognormal distributions of order r. Metron 41:3–10

  • Waldmann E, Kneib T, Yue YR, Lang S, Flexeder C (2013) Bayesian semiparametric additive quantile regression. Stat Modell 13(3):223–252. doi:10.1177/1471082X13480650

  • Waldmann E, Sobotka F, Kneib T (forthcoming) Bayesian regularisation in geoadditive expectile regression. Stat Comput

  • White H (2001) Asymptotic theory for econometricians. Academic Press, San Diego

  • Wood SN (2006) Generalized additive models. Chapman & Hall, Boca Raton

  • Wooldridge J (2011) Introductory econometrics, 5th edn. South-Western Cengage Learning, Mason

Author information

Corresponding author

Correspondence to Alexander Silbersdorff.

Copyright information

© 2017 The Author(s)

About this chapter

Cite this chapter

Silbersdorff, A. (2017). Estimating and Assessing Distributional Regression. In: Analysing Inequalities in Germany. SpringerBriefs in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-65331-0_3
