Abstract
This paper focuses on the estimation of the concentration curve of a finite population, when data are collected according to a complex sampling design with different inclusion probabilities. A (design-based) Hájek type estimator for the Lorenz curve is proposed, and its asymptotic properties are studied. Then, a resampling scheme able to approximate the asymptotic law of the Lorenz curve estimator is constructed. Applications are given to the construction of (i) a confidence band for the Lorenz curve, (ii) confidence intervals for the Gini concentration ratio, and (iii) a test for Lorenz dominance. The merits of the proposed resampling procedure are evaluated through a simulation study.
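As a rough illustration of what a Hájek-type (weight-normalised) Lorenz curve estimator computes, the following sketch weights each sampled unit by the reciprocal of its inclusion probability and normalises by the sum of the weights. The function names `lorenz_knots`, `hajek_lorenz` and `hajek_gini` are hypothetical, and the code is a generic weighted construction under the assumption of known, strictly positive inclusion probabilities and a positive weighted income total; it is not taken from the paper and may differ in detail from the estimator defined there.

```python
import numpy as np

def lorenz_knots(y, pi):
    """Return the knots (cumulative population share, cumulative income share).

    y  : sampled values (e.g. incomes)
    pi : first-order inclusion probabilities of the sampled units (assumed known)
    """
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(pi, dtype=float)      # design weights 1 / pi_i
    order = np.argsort(y)                      # sort units by increasing y
    y, w = y[order], w[order]
    # Normalising by the sum of the weights is what makes this "Hájek-type".
    cum_pop = np.concatenate(([0.0], np.cumsum(w))) / w.sum()
    cum_inc = np.concatenate(([0.0], np.cumsum(w * y))) / np.sum(w * y)
    return cum_pop, cum_inc

def hajek_lorenz(y, pi, p):
    """Estimated Lorenz curve at the points p in [0, 1].

    Linear interpolation between knots is the integral of the weighted
    step-function quantile, divided by the weighted mean.
    """
    cum_pop, cum_inc = lorenz_knots(y, pi)
    return np.interp(p, cum_pop, cum_inc)

def hajek_gini(y, pi):
    """Gini concentration ratio: 1 - 2 * (area under the estimated curve)."""
    cum_pop, cum_inc = lorenz_knots(y, pi)
    # Trapezoid rule over the knots (exact for a piecewise-linear curve).
    area = np.sum(np.diff(cum_pop) * (cum_inc[1:] + cum_inc[:-1]) / 2.0)
    return 1.0 - 2.0 * area
```

For example, `hajek_lorenz([1, 3], [0.25, 0.75], [0.75])` evaluates the curve at the 75% population share when the low-income unit carries three times the weight of the high-income unit.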
References
Anderson G (1996) Nonparametric tests of stochastic dominance in income distribution. Econometrica 64:1183–1193
Antal E, Tillé Y (2011) A direct bootstrap method for complex sampling designs from a finite population. J Am Stat Assoc 106(494):534–543
Barabesi L, Diana G, Perri PF (2016) Linearization of inequality indices in the design-based framework. Statistics 50:1161–1172
Barrett GF, Donald SG, Bhattacharya D (2014) Consistent nonparametric tests for Lorenz dominance. J Bus Econ Stat 32:1–13
Bhattacharya D (2005) Asymptotic inference from multi-stage samples. J Econom 126:145–171
Bhattacharya D (2007) Inference on inequality from household survey data. J Econom 137:674–707
Bickel PJ, Freedman D (1981) Some asymptotic theory for the bootstrap. Ann Stat 9:1196–1216
Boistard H, Lopuhaä R, Ruiz-Gazen A (2017) Functional central limit theorems for single-stage sampling designs. Ann Stat 45:1728–1758
Chauvet G (2007) Méthodes de bootstrap en population finie. Ph.D. Dissertation, Laboratoire de statistique d’enquêtes, CREST-ENSAI, Université de Rennes 2
Conti PL, Di Iorio A (2018) Analytic inference in finite populations via resampling, with applications to confidence intervals and testing for independence. Preprint arXiv:1809.08035 (submitted for publication)
Conti PL, Marella D (2015) Inference for quantiles of a finite population: asymptotic vs. resampling results. Scand J Stat 42:545–561
Csörgő M, Csörgő S, Horváth L (1986) An asymptotic theory for empirical reliability and concentration processes. Springer, Berlin
Davidson R (2009) Reliable inference for the Gini index. J Econom 150:30–40
Gastwirth JL (1972) The estimation of Lorenz curve and Gini index. Rev Econ Stat 54:306–316
Giorgi GM (1999) Income inequality measurement: the statistical approach. In: Silber J (ed) Handbook of income inequality measurement. Kluwer Academic Publishers, Boston
Giorgi GM, Gigliarano C (2017) The Gini concentration index: a review of the inference literature. J Econ Surv 31:1130–1148
Goldie CM (1977) Convergence theorems for empirical Lorenz curves and their inverses. Adv Appl Probab 9:765–791
Hájek J (1964) Asymptotic theory of rejective sampling with varying probabilities from a finite population. Ann Math Stat 35:1491–1523
Horvitz DG, Thompson DJ (1952) A generalization of sampling without replacement from a finite universe. J Am Stat Assoc 47:663–685
Langel M, Tillé Y (2013) Variance estimation of the Gini index: revisiting a result several times published. J R Stat Soc Ser A 176:521–540
Leadbetter MR, Weissner JH (1969) On continuity and other analytic properties of stochastic process sample functions. Proc Am Math Soc 22:291–294
Lifshits MA (1982) On the absolute continuity of distributions of functionals of random processes. Theory Probab Appl 27:600–607
Marella D, Vicard P (2018) PC complex: PC algorithm for complex survey data. Working Paper n. 240, Dipartimento di Economia - Università Roma Tre. ISSN: 2279-6916 (submitted for publication)
Massart P (1990) The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality. Ann Probab 18:1269–1283
Pfeffermann D (1993) The role of sampling weights when modeling survey data. Int Stat Rev 61:317–337
Pfeffermann D, Sverchkov M (2004) Prediction of finite population totals based on the sample distribution. Surv Methodol 30:79–92
Sen PK, Singer J (1993) Large sample methods in statistics. Chapman & Hall, London
Tillé Y (2006) Sampling algorithms. Springer, New York
van der Vaart A (1998) Asymptotic statistics. Cambridge University Press, Cambridge
Zheng B (2002) Testing Lorenz curves with non-simple random samples. Econometrica 70:1235–1243
Acknowledgements
Funding was provided by Sapienza Università di Roma (C26A144TFX - Nuove metodologie di ricampionamento per indagini complesse con applicazioni alla stima di misure di disuguaglianza; C26A15W8EK - Un nuovo approccio all’imputazione singola e multipla).
Appendix
Proof of Proposition 2
Suppose that \(y < t\). From (25) it is not difficult to see that
Assumption C1 implies that
so that from (71) it is not difficult to see that
C being an appropriate constant. Inequality (72) also holds when \(y>t\). Hence, in terms of the process \({\mathcal {B}}^H\) introduced above we may write
Inequality (73) and the Gaussianity of \({\mathcal {B}}^H (t) - {\mathcal {B}}^H (y)\), in turn, imply that
Observing that \(P ( {\mathcal {B}}^H (0) =0) = P ({\mathcal {B}}^H (1) =0) =1\), Proposition 2 now follows from (74) and Leadbetter and Weissner (1969). \(\square \)
Proof of Proposition 6
Let
be the (resampling) d.f. of \(Z^{*}_{n,m}\) in (48). By the Dvoretzky–Kiefer–Wolfowitz inequality (cf. Massart 1990), we first have
Using the first Borel–Cantelli lemma, and taking into account that \(R^{*}_{n} (z)\) converges uniformly to \(P ( \sup _p \vert {\mathcal {L}}^H (p) \vert \le z )\), (51) follows immediately. Statement (52) follows from (51) and the absolute continuity of the distribution of \(\sup _p \vert {\mathcal {L}}^H (p) \vert \) (cf. Lifshits 1982). \(\square \)
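For reference, the generic form of the Dvoretzky–Kiefer–Wolfowitz inequality with Massart's constant can be sketched as follows. The symbols \(M\) (number of independent replicates), \(\widehat{R}_M\) (their empirical d.f.) and \(R\) (their common d.f.) are illustrative notation introduced here, not the paper's, and the display is the standard textbook statement rather than the paper's own numbered formula. For every \(\varepsilon > 0\),

\[ P \left( \sup _z \left| \widehat{R}_M (z) - R (z) \right| > \varepsilon \right) \le 2 \exp \left( - 2 M \varepsilon ^2 \right) . \]

Since the right-hand side is summable in \(M\) for each fixed \(\varepsilon \), the first Borel–Cantelli lemma yields \(\sup _z \vert \widehat{R}_M (z) - R (z) \vert \rightarrow 0\) almost surely as \(M \rightarrow \infty \), which is the type of argument used in the proof above.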
Proof of Proposition 7
The proofs of (57) and (58) are similar to that of Proposition 6. As for (59), it is a consequence of Theorem 2.5.5 in Sen and Singer (1993, pp. 90–91). \(\square \)
Cite this article
Conti, P.L., Di Iorio, A., Guandalini, A. et al. On the estimation of the Lorenz curve under complex sampling designs. Stat Methods Appl 29, 1–24 (2020). https://doi.org/10.1007/s10260-019-00478-6