Swiss Journal of Economics and Statistics, Volume 152, Issue 1, pp 49–80

A Simple Method for Predicting Distributions by Means of Covariates with Examples from Poverty and Health Economics

  • Jing Dai
  • Stefan Sperlich
  • Walter Zucchini
Open Access


We present an integration-based procedure for predicting the distribution f of an indicator of interest in situations where, in addition to the sample data, one has access to covariates that are available for the entire population. The proposed method, based on ideas similar to those used in the literature on policy evaluation, provides an alternative to existing simulation and imputation methods. It is very simple to apply, flexible, requires no additional assumptions, and does not involve the inclusion of artificial random terms. It therefore yields reproducible estimates and allows for valid inference. It also provides a tool for future predictions, scenarios and ex-ante impact evaluation. We illustrate our procedure by predicting income distributions in a case with sample selection, and by predicting both current and future doctor visits. We find that our approach substantially outperforms other commonly used procedures.
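
The integration idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, purely for illustration, that the conditional distribution of the indicator given a single covariate is Gaussian around a linear regression line, and the toy data, parameter values and function names (`fit_linear_gaussian`, `predict_cdf`) are all hypothetical. The conditional distribution is fitted on the sample, and the population distribution is then predicted by averaging the fitted conditional CDF over the covariates of every population member. No random terms are drawn in the prediction step.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fit_linear_gaussian(xs, ys):
    """OLS fit of y = a + b*x plus residual std. dev. (assumed model)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    sigma = math.sqrt(rss / (n - 2))
    return a, b, sigma


def predict_cdf(y, population_x, a, b, sigma):
    """Average the fitted conditional CDF F(y | x) over all population covariates."""
    total = sum(normal_cdf((y - a - b * x) / sigma) for x in population_x)
    return total / len(population_x)


# Toy data (hypothetical): the covariate is known for the whole population,
# the outcome only for a sample of 200 units.
rng = random.Random(0)
population_x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
sample_x = population_x[:200]
sample_y = [1.0 + 2.0 * x + rng.gauss(0.0, 0.5) for x in sample_x]

a, b, sigma = fit_linear_gaussian(sample_x, sample_y)
grid = [-6.0 + 0.5 * k for k in range(33)]  # y values from -6 to 10
cdf_values = [predict_cdf(y, population_x, a, b, sigma) for y in grid]
```

Because the prediction step is a deterministic average rather than a simulation, repeating it on the same data returns the same `cdf_values`, which is the reproducibility property the abstract emphasises. A richer conditional model could replace the Gaussian linear one without changing the averaging step.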


predicting distributions; missing values; household expenditures; income distribution; health economics; impact evaluation


C1 C4 I32 I15 



Copyright information

© Swiss Society of Economics and Statistics 2016

Authors and Affiliations

  1. Universität Kassel, Kassel, Germany
  2. Université de Genève, Geneva School of Economics and Management, Genève, Switzerland
  3. Department of Economic Sciences, Georg-August Universität, Göttingen, Germany
