Abstract
In this paper we consider estimation of models popular in efficiency and productivity analysis (such as the stochastic frontier model, the truncated regression model, etc.) via the local maximum likelihood method, generalizing this method to allow for discrete as well as continuous regressors. We provide asymptotic theory and evidence from simulations, and illustrate the method with an empirical example. Our methodology and theory can also be adapted to other models where a likelihood of the unknown functions can be used to identify and estimate the underlying model. The simulation results indicate the flexibility of the approach and its good performance in various complex scenarios, even with moderate sample sizes.
Notes
We also tried the least squares cross-validation (LSCV) method and the results were very similar. Interestingly, yet not so surprisingly, our simulations showed that LSCV appeared to be somewhat more robust for relatively small samples and much faster to optimize for large samples, while the MLCV method sometimes gave a better fit for relatively large samples.
Recall that the total conditional variance of \(\varepsilon \) is given by \({\mathrm{Var}}(v-u|x,z)=\sigma _{v}^{2}(x,z)+\sigma _{u}^{2}(x,z)(\pi -2)/\pi \). Note also that here we used local likelihood estimation with linear approximation for \(r(x,z), \log \sigma _{u}^{2}(x,z)\), and \(\log \sigma _{v}^{2}(x,z)\).
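As a quick numerical sanity check of this variance formula (a minimal simulation sketch, not taken from the paper's code; the parameter values are illustrative), one can draw \(v\) and \(u\) directly and compare the sample variance of \(v-u\) with \(\sigma _{v}^{2}+\sigma _{u}^{2}(\pi -2)/\pi \):

```python
import numpy as np

# Monte Carlo check of Var(v - u) = s_v^2 + s_u^2 * (pi - 2) / pi
# for v ~ N(0, s_v^2) and half-normal u = |N(0, s_u^2)|.
rng = np.random.default_rng(0)
s_v, s_u = 0.5, 1.0          # illustrative values
n = 1_000_000
v = rng.normal(0.0, s_v, n)
u = np.abs(rng.normal(0.0, s_u, n))
eps = v - u                  # composed error

theoretical = s_v**2 + s_u**2 * (np.pi - 2) / np.pi
print(np.var(eps), theoretical)  # the two agree closely for large n
```

The \((\pi -2)/\pi \) factor appears because the half-normal \(u\) has mean \(\sigma _{u}\sqrt{2/\pi }\), which is subtracted out when computing its variance.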
E.g., the MLCV used to obtain the results presented in the NW, SW, NW and NE panels took about 0.2, 0.1, 4.5 and 1.5 hours, respectively, on a desktop with an Intel Xeon E5620 CPU @ 2.40 GHz (two processors), while the parametric MLE took about a second.
In the simulations, the MLCV-based optimal bandwidth for the continuous regressor was usually around \(0.15\) for \(n=1,000\) and around \(0.25\) for \(n=100\).
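The paper's MLCV criterion targets the full local likelihood; as a rough illustration of the same leave-one-out idea, here is a generic likelihood cross-validation for a kernel density bandwidth (a sketch only, not the paper's implementation; the function name, data, and grid are ours):

```python
import numpy as np

def loo_log_likelihood(x, h):
    """Leave-one-out log-likelihood of a Gaussian-kernel density
    estimate with bandwidth h (generic MLCV-style criterion)."""
    n = len(x)
    d = (x[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * d**2) / (np.sqrt(2 * np.pi) * h)
    np.fill_diagonal(k, 0.0)          # leave observation i out
    f_loo = k.sum(axis=1) / (n - 1)   # \hat f_{-i}(x_i)
    return np.log(f_loo).sum()

rng = np.random.default_rng(1)
x = rng.normal(size=500)
grid = np.linspace(0.05, 1.0, 40)
scores = [loo_log_likelihood(x, h) for h in grid]
h_mlcv = grid[int(np.argmax(scores))]  # bandwidth maximizing the criterion
```

In the paper's setting the criterion is evaluated on the local likelihood of the frontier model rather than a density, and discrete-regressor bandwidths \(w_{j}\) are chosen jointly, but the leave-one-out structure is the same.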
Acknowledgments
All authors acknowledge financial support from the ARC Discovery Grant DP130101022 and the CEPA of the School of Economics of The University of Queensland (Australia); from the “Interuniversity Attraction Pole”, Phase VII (No. P7/06) of the Belgian Science Policy (Belgium); from the INRA-GREMAQ, Toulouse (France); and from the NRF Grant funded by the Korean government (MEST), No. 20100017437 (listed here in alphabetical order). The authors also thank their colleagues and the audiences of the many conferences and seminars where this work has been presented. The authors alone, and not the above-mentioned institutions or people, are responsible for the views expressed.
Appendix: Technical details
In the conditions below and in the proof of Theorem 3.1, \(\Vert \mathbf{v}\Vert \) denotes the usual \(\ell _{2}\)-norm when \(\mathbf{v}\) is a vector, and the Frobenius (Hilbert–Schmidt) norm when \(\mathbf{v}\) is a matrix. Define \({\varvec{\psi }}(\mathbf{s}|\mathbf{x},\mathbf{z})=E\left[ \mathbf{g}_{1}(Y,{\varvec{\theta }}(\mathbf{x},\mathbf{z})+\mathbf{s}\,|\,\mathbf{X}=\mathbf{x},\mathbf{Z}=\mathbf{z})\right] \) for \(\mathbf{s}\in {\mathbb {R}}^{k}\). The conditions and the proof are given for a fixed point \((\mathbf{x},\mathbf{z})\) at which we want to estimate the value of \({\varvec{\theta }}=(\theta _{1},\ldots ,\theta _{k})^{\top }\).
1.1 Regularity conditions
- (A1) For the vector of functions \(\mathbf{G}\) defined at (7.1), the equation \(\mathbf{G}({\varvec{\alpha }},\mathbf{A})=\mathbf{0}\) has the unique solution \({\varvec{\alpha }}=\mathbf{0}\) and \(\mathbf{A}=\mathbf{O}\), where \(\mathbf{0}\) is the zero vector and \(\mathbf{O}\) is the zero matrix. Also, \(E[\mathbf{g}_{1}(Y,{\varvec{\theta }}(\mathbf{X},\mathbf{Z}))|\mathbf{X},\mathbf{Z}]=\mathbf{0}\) almost surely.
- (A2) For any compact set \(\mathcal {C}\), there exists a function \(U_{1}\) such that \(\sup _{\footnotesize {\varvec{\theta }}\in \mathcal {C}}\Vert \mathbf{g}_{1}(y,{\varvec{\theta }})\Vert \le U_{1}(y)\) and \(\sup _{\Vert \mathbf{u}-\mathbf{x}\Vert \le \epsilon }E[U_{1}(Y)^{2+\delta }|\mathbf{X}=\mathbf{u}]<\infty \) for some \(\epsilon ,\,\delta >0\). Also, \(\mathbf{g}_{2}(y,{\varvec{\theta }})\) is continuous in \({\varvec{\theta }}\) for each \(y\), and there exists a function \(U_{2}(y)\) such that \(\sup _{\footnotesize {\varvec{\theta }}\in \mathcal {C}}\Vert \mathbf{g}_{2}(y,{\varvec{\theta }})\Vert \le U_{2}(y)\) for any compact set \(\mathcal {C}\) and \(\sup _{\Vert \mathbf{u}-\mathbf{x}\Vert \le \epsilon }E[U_{2}(Y)^{2}|\mathbf{X}=\mathbf{u}]<\infty \) for some \(\epsilon >0\).
- (A3) All entries of \({\varvec{\theta }}(\cdot ,\mathbf{v})\) are twice partially continuously differentiable at \(\mathbf{x}\) for all values of \(\mathbf{v}\) such that \(d(\mathbf{v},\mathbf{z})=0\) or \(1\). Also, there exists \(\epsilon >0\) such that for all \(1\le j\le k\)
$$\begin{aligned} \sup _{\Vert \mathbf{u}-\mathbf{x}\Vert \le \epsilon ,\mathbf{v}\in \mathcal {D}}\Big \Vert \frac{\partial }{\partial \mathbf{u}}\theta _{j}(\mathbf{u},\mathbf{v})\Big \Vert <\infty . \end{aligned}$$
- (A4) All entries of \({\varvec{\rho }}(\cdot ,\mathbf{v})\) are continuous at \(\mathbf{x}\) for all values of \(\mathbf{v}\) such that \(d(\mathbf{v},\mathbf{z})=0\) or \(1\), and \({\varvec{\rho }}(\mathbf{x},\mathbf{z})\) is positive definite.
- (A5) The density function \(f(\cdot ,\mathbf{v})\) is continuous at \(\mathbf{x}\) for all values of \(\mathbf{v}\) such that \(d(\mathbf{v},\mathbf{z})=0\) or \(1\), and \(f(\mathbf{x},\mathbf{z})>0\).
- (A6) All entries of \({\varvec{\tau }}(\cdot ,\mathbf{z})\) are continuous at \(\mathbf{x}\).
- (A7) For any compact set \(\mathcal {C}\), it holds that \(\sup _{\mathbf{s}\in \mathcal {C}}\Vert {\varvec{\psi }}(\mathbf{s}|\mathbf{x}+\mathbf{u},\mathbf{z})-{\varvec{\psi }}(\mathbf{s}|\mathbf{x},\mathbf{z})\Vert \rightarrow 0\) as \(\Vert \mathbf{u}\Vert \rightarrow 0\).
The first part of assumption (A1) is essential for likelihood-based methods: it holds if the logarithm of the conditional density, \(\log g(y,{\varvec{\theta }})\), is strictly convex in \({\varvec{\theta }}\), as is typically assumed for likelihood-based methods. The second part of (A1) is also standard; it is simply the first-order Bartlett identity. The two conditions of (A2) serve the stochastic expansion and the asymptotic normality of the estimator: the stochastic expansion requires only the first condition with \(\delta =0\) together with the second condition, while the asymptotic normality requires the higher moment condition on \(U_{1}\). The first part of (A3) is typical for nonparametric smoothing and yields the bias expansion of the estimator; the second part of (A3) handles the terms involving \(w_{j}\) in that expansion. Assumptions (A4)–(A6) are used to obtain the leading bias and variance of the estimator. The last assumption, (A7), is required, along with (A2), for the stochastic expansion of the estimator.
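The second part of (A1), the first-order Bartlett identity, can be obtained by the standard argument of differentiating \(\int g(y,{\varvec{\theta }})\,dy=1\) under the integral sign (an illustrative derivation, under the assumption that \(g\) denotes the conditional density and \(\mathbf{g}_{1}=\partial \log g/\partial {\varvec{\theta }}\) its score):

$$\begin{aligned} \mathbf{0}=\frac{\partial }{\partial {\varvec{\theta }}}\int g(y,{\varvec{\theta }})\,dy =\int \frac{\partial }{\partial {\varvec{\theta }}}g(y,{\varvec{\theta }})\,dy =\int \mathbf{g}_{1}(y,{\varvec{\theta }})\,g(y,{\varvec{\theta }})\,dy, \end{aligned}$$

which, evaluated at \({\varvec{\theta }}={\varvec{\theta }}(\mathbf{X},\mathbf{Z})\) and read conditionally on \((\mathbf{X},\mathbf{Z})\), gives \(E[\mathbf{g}_{1}(Y,{\varvec{\theta }}(\mathbf{X},\mathbf{Z}))|\mathbf{X},\mathbf{Z}]=\mathbf{0}\).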
1.2 Proof of Theorem 3.1
Hereafter, \({\varvec{\theta }}\) denotes the true function. We also let \({\varvec{\Theta }}\) denote the matrix of the partial derivatives of the true vector function, that is, \({\varvec{\Theta }}_{jl}(\mathbf{x},\mathbf{z})=\partial \theta _{j}(\mathbf{x},\mathbf{z})/\partial x_{l}\), where \(\theta _{j}\) is the \(j\)th component function of \({\varvec{\theta }}\) and \(x_{l}\) is the \(l\)th coordinate of \(\mathbf{x}\). Define, for a given \((\mathbf{x},\mathbf{z})\),
The function \(\tilde{{\varvec{\theta }}}(\mathbf{u},\mathbf{v})\) is an approximation of \({\varvec{\theta }}(\mathbf{u},\mathbf{v})\) for \(\mathbf{u}\) near \(\mathbf{x}\) and for \(\mathbf{v}\) near \(\mathbf{z}\), which is linear in the direction of \(\mathbf{x}\), while constant in the direction of \(\mathbf{z}\). Define \(\mathbf{l}(\mathbf{u})=(1,\mathbf{u}^{\top })^{\top }\) for \(\mathbf{u}\in {\mathbb {R}}^{p}\), and
for \({\varvec{\alpha }}\in {\mathbb {R}}^{k}\) and \(\mathbf{A}\) being a \((k\times p)\)-matrix, where \(\otimes \) denotes the Kronecker product. Note that \(\mathbf{G}\) is a vector of \(k(p+1)\) multivariate functions. This is the population version of
The function \(\mathbf{G}_{n}\) is obtained if we differentiate \({\tilde{L}}_{n}({\varvec{\alpha }},\mathbf{A}){:=}L_{n}({\varvec{\theta }}(\mathbf{x},\mathbf{z})+{\varvec{\alpha }},{\varvec{\Theta }}(\mathbf{x},\mathbf{z})+\mathbf{A}\mathbf{H}^{-1})\) with respect to \({\varvec{\alpha }}\) and \(\mathbf{A}\), where \(L_{n}\) is defined at (2.1). The top \(k\) entries of \(\mathbf{G}_{n}\) are the partial derivatives \(\partial {\tilde{L}}_{n}({\varvec{\alpha }},\mathbf{A})/\partial \alpha _{1},\partial {\tilde{L}}_{n}({\varvec{\alpha }},\mathbf{A})/\partial \alpha _{2},\ldots ,\partial {\tilde{L}}_{n}({\varvec{\alpha }},\mathbf{A})/\partial \alpha _{k}\), and the next \(k\) entries are \(\partial {\tilde{L}}_{n}({\varvec{\alpha }},\mathbf{A})/\partial A_{11},\partial {\tilde{L}}_{n}({\varvec{\alpha }},\mathbf{A})/\partial A_{21},\ldots ,\partial {\tilde{L}}_{n}({\varvec{\alpha }},\mathbf{A})/\partial A_{k1}\), and the last \(k\) entries are \(\partial {\tilde{L}}_{n}({\varvec{\alpha }},\mathbf{A})/\partial A_{1p},\partial {\tilde{L}}_{n}({\varvec{\alpha }},\mathbf{A})/\partial A_{2p},\ldots ,\partial {\tilde{L}}_{n}({\varvec{\alpha }},\mathbf{A})/\partial A_{kp}\), where we write \({\varvec{\alpha }}=(\alpha _{1},\ldots ,\alpha _{k})^{\top }\) and \(\mathbf{A}=(A_{ij})\). Define
Then, \((\hat{{\varvec{\alpha }}},\hat{\mathbf{A}})\) is the solution of the equation \(\mathbf{G}_{n}({\varvec{\alpha }},\mathbf{A})=\mathbf{0}\).
We claim that, for any compact set \(\mathcal {C}\) of \(({\varvec{\alpha }},\mathbf{A})\), one has
These two properties imply the uniform convergence in probability of \(\mathbf{G}_{n}({\varvec{\alpha }},\mathbf{A})\) to \(\mathbf{G}({\varvec{\alpha }},\mathbf{A})\) over any compact set \(\mathcal {C}\). By the first part of assumption (A1), we conclude that all entries of \(\hat{{\varvec{\alpha }}}(\mathbf{x},\mathbf{z})\) and \(\hat{\mathbf{A}}(\mathbf{x},\mathbf{z})\) converge to zero in probability. This allows us to further expand \(\mathbf{G}_{n}(\hat{{\varvec{\alpha }}},\hat{\mathbf{A}})=\mathbf{0}\) around \(({\varvec{\alpha }},\mathbf{A})=(\mathbf{0},\mathbf{O})\), the unique solution of \(\mathbf{G}({\varvec{\alpha }},\mathbf{A})=\mathbf{0}\). Define
This is obtained by differentiating \(\mathbf{G}_{n}({\varvec{\alpha }},\mathbf{A})\) with respect to \({\varvec{\alpha }}\) and \(\mathbf{A}\). Let \({\varvec{\upsilon }}(\hat{{\varvec{\alpha }}},\hat{\mathbf{A}})\) denote the \(k(p+1)\)-vector obtained by concatenating the entries of \(\hat{{\varvec{\alpha }}}\) and \(\hat{\mathbf{A}}\), that is, \({\varvec{\upsilon }}(\hat{{\varvec{\alpha }}},\hat{\mathbf{A}})^{\top }=(\hat{{\varvec{\alpha }}}^{\top },\hat{\mathbf{A}}_{1}^{\top },\ldots ,\hat{\mathbf{A}}_{p}^{\top })\), where \(\hat{\mathbf{A}}=[\hat{\mathbf{A}}_{1},\ldots ,\hat{\mathbf{A}}_{p}]\). Then it follows that, for some \(({\varvec{\alpha }}^{*},\mathbf{A}^{*})\) with \(\Vert ({\varvec{\alpha }}^{*},\mathbf{A}^{*})\Vert \le \Vert (\hat{{\varvec{\alpha }}},\hat{\mathbf{A}})\Vert \),
For \(\mathbf{J}_{n}({\varvec{\alpha }},\mathbf{A})\) we will show that, for any compact set \(\mathcal {C}\),
Combined with the second part of assumption (A2), this entails
To see this, note that the second part of assumption (A2) implies that, for a given \(\delta >0\), there exists \(\varepsilon >0\) such that, for sufficiently large \(n\), \(\Vert E\mathbf{J}_{n}({\varvec{\alpha }},\mathbf{A})-E\mathbf{J}_{n}(\mathbf{0},\mathbf{O})\Vert \le \delta \) for all \(({\varvec{\alpha }},\mathbf{A})\) with \(\Vert ({\varvec{\alpha }},\mathbf{A})\Vert \le \varepsilon \). This, the consistency of \((\hat{{\varvec{\alpha }}},\hat{\mathbf{A}})\), and (7.5) together establish (7.6). Define a diagonal matrix \(\mathbf{M}\) of dimension \((p+1)\) whose first entry equals \(1\) and whose remaining entries all equal \(\mu _{2}\). We claim
The expansions (7.4), (7.6) and (7.7) give
where \(\mathbf{1}_{p+1}\) denotes the \((p+1)\)-dimensional unit vector such that \(\mathbf{1}_{p+1}^{\top }=(1,0,\ldots ,0)\).
Now, we derive the first-order properties of \(\mathbf{G}_{n}(\mathbf{0},\mathbf{O})\). For \(\mathbf{Z}^{i}=\mathbf{z}\), \(\Lambda _{\mathbf{w}}(\mathbf{Z}^{i},\mathbf{z})=1\). For \(\mathbf{Z}^{i}\) with \(d(\mathbf{Z}^{i},\mathbf{z})=1\), we have \(\Lambda _{\mathbf{w}}(\mathbf{Z}^{i},\mathbf{z})=w_{j}\) for some \(j\). Those \(\mathbf{Z}^{i}\) with \(d(\mathbf{Z}^{i},\mathbf{z})\ge 2\) have a contribution of order \(O_{p}(w^{*2})\) to \(\mathbf{G}_{n}(\mathbf{0},\mathbf{O})\), where \(w^{*}=\max _{1\le j\le d}w_{j}\). Thus,
The expected value of the first term in (7.9) has the following expansion due to the second part of the assumption (A1) and the assumptions (A2)–(A5):
By the properties of the multivariate kernel \(K\), we can further approximate \(E(\mathbf{T}_{1})\) by
where \({\varvec{\beta }}_{\mathbf{H}}(\mathbf{x},\mathbf{z})\) is the \(k\)-vector whose \(j\)th entry equals \({\mathrm{tr}}({\varvec{\theta }}_{j}''(\mathbf{x},\mathbf{z})\mathbf{H}^{2})\). One can similarly obtain an approximation of \({\mathrm{var}}(\mathbf{T}_{1})\). In fact,
where \(\mathbf{D}\) is a \((p+1)\)-dimensional diagonal matrix whose first diagonal entry equals \(\int K^{2}(\mathbf{u})\, d\mathbf{u}\) and the next \(p\) diagonal entries are \(\int u_{j}^{2}K^{2}(\mathbf{u})\, d\mathbf{u},\,1\le j\le p\).
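The behavior of the discrete kernel noted above — \(\Lambda _{\mathbf{w}}=1\) when \(d=0\), \(\Lambda _{\mathbf{w}}=w_{j}\) when \(d(\mathbf{Z}^{i},\mathbf{z})=1\), and contributions of order \(O(w^{*2})\) when \(d\ge 2\) — can be illustrated with a product-form kernel (a sketch under the assumption that \(\Lambda _{\mathbf{w}}\) takes the usual product form; the helper name and values are ours):

```python
import numpy as np

# Sketch (assumed product form, consistent with the properties used in
# the proof): Lambda_w(z_i, z) = prod_j w_j^{1{z_ij != z_j}}, where
# d(z_i, z) counts the coordinates in which z_i and z differ.
def discrete_kernel(z_i, z, w):
    mismatch = np.asarray(z_i) != np.asarray(z)
    return float(np.prod(np.where(mismatch, w, 1.0)))

w = np.array([0.1, 0.2, 0.3])      # illustrative discrete bandwidths
z = (0, 1, 0)
print(discrete_kernel((0, 1, 0), z, w))  # d = 0  ->  1.0
print(discrete_kernel((1, 1, 0), z, w))  # d = 1  ->  w_1 = 0.1
print(discrete_kernel((1, 0, 0), z, w))  # d = 2  ->  w_1 * w_2, O(w*^2)
```

Observations differing from \(\mathbf{z}\) in two or more discrete coordinates thus receive weight that is quadratically small in \(w^{*}\), which is why they drop out of the first-order analysis.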
Next, we consider the term \(\mathbf{T}_{2}\). It contributes only its expectation \(E(\mathbf{T}_{2})\) to the first-order properties of \(\mathbf{G}_{n}(\mathbf{0},\mathbf{O})\): \({\mathrm{var}}(\mathbf{T}_{2})\) is negligible relative to \({\mathrm{var}}(\mathbf{T}_{1})\) because of the additional factors \(w_{j}\), which tend to zero as \(n\) tends to infinity. For an expansion of \(E(\mathbf{T}_{2})\), note that the following approximation holds, due to the second part of (A3), uniformly for \(\mathbf{v}\in \mathcal {D}\)
for \(\mathbf{u}\) near \(\mathbf{x}\). With this and using the assumptions (A4) and (A5) we get
Asymptotic normality of \(\mathbf{T}_{1}\) follows from standard techniques and the first part of assumption (A2). The theorem now follows from basic properties of Kronecker products. It remains to prove (7.2), (7.3), (7.5) and (7.7); among these, (7.7) can be proved along the lines of the expansion derived for \(E(\mathbf{T}_{1})\).
We prove (7.2) first. We write simply \(\mathbf{l}^{i}\) for \(\mathbf{l}(\mathbf{H}^{-1}(\mathbf{X}^{i}-\mathbf{x}))\), \(\mathbf{g}_{1}^{i}({\varvec{\alpha }},\mathbf{A})\) for \(\mathbf{g}_{1}(Y^{i},\tilde{{\varvec{\theta }}}(\mathbf{X}^{i},\mathbf{Z}^{i})+{\varvec{\alpha }}+\mathbf{A}\mathbf{H}^{-1}(\mathbf{X}^{i}-\mathbf{x}))\), \(K^{i}\) for \(K_{\mathbf{H}}(\mathbf{X}^{i}-\mathbf{x})\) and \(\Lambda ^{i}\) for \(\Lambda _{\mathbf{w}}(\mathbf{Z}^{i},\mathbf{z})\). Define \({\varvec{\xi }}^{i}({\varvec{\alpha }},\mathbf{A})=(\mathbf{l}^{i}\otimes \mathbf{g}_{1}^{i}({\varvec{\alpha }},\mathbf{A}))K^{i}\Lambda ^{i}\). Then, we can write \(\mathbf{G}_{n}({\varvec{\alpha }},\mathbf{A})=n^{-1}\sum _{i=1}^{n}{\varvec{\xi }}^{i}({\varvec{\alpha }},\mathbf{A})\). We want to get an exponential bound for a large deviation of the centered \(\sqrt{n|\mathbf{H}|/\log n}\mathbf{G}_{n}({\varvec{\alpha }},\mathbf{A})\) for each fixed \(({\varvec{\alpha }},\mathbf{A})\). Since \({\varvec{\xi }}^{i}({\varvec{\alpha }},\mathbf{A})\) are not bounded, we employ a truncation technique. Since \(\Lambda ^{i}\le 1\) for all \(1\le i\le n\) and from the first part of the assumption (A2), we obtain that for any compact set \(\mathcal {C}\)
The right-hand side of (7.11) has expectation of order \(O(n^{-1/2})\), since the conditional second moment of \(U_{1}(Y)\) given \(\mathbf{X}=\mathbf{u}\) is bounded locally uniformly for \(\mathbf{u}\) near \(\mathbf{x}\); see the first part of (A2). This implies that the left-hand side of (7.11) is of order \(O_{p}(n^{-1/2})\). Similarly, we get \(E[{\varvec{\xi }}({\varvec{\alpha }},\mathbf{A})I(\Vert {\varvec{\xi }}({\varvec{\alpha }},\mathbf{A})\Vert >\sqrt{n})]=O(n^{-1/2})\) uniformly for \(({\varvec{\alpha }},\mathbf{A})\) in any compact set. These considerations reduce the proof of (7.2) to that for the truncated version \(\mathbf{G}_{n}^{*}({\varvec{\alpha }},\mathbf{A}){:=}n^{-1}\sum _{i=1}^{n}\{{\varvec{\xi }}^{i}({\varvec{\alpha }},\mathbf{A})I[\Vert {\varvec{\xi }}^{i}({\varvec{\alpha }},\mathbf{A})\Vert \le \sqrt{n}]\}\). By a simple application of the Markov inequality, and since \(|\mathbf{H}|E\Vert {\varvec{\xi }}({\varvec{\alpha }},\mathbf{A})\Vert ^{2}\) is bounded, say by \(c\), by the first part of assumption (A2), we get
for any fixed \({\varvec{\alpha }}\) and \(\mathbf{A}\). Since \(\mathbf{G}_{n}\) is Lipschitz continuous of order \(1\) with a Lipschitz constant \(O_{p}(1)\) by the second part of the assumption (A2), the exponential bound (7.13) concludes the proof of (7.2).
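The building block \({\varvec{\xi }}^{i}({\varvec{\alpha }},\mathbf{A})=(\mathbf{l}^{i}\otimes \mathbf{g}_{1}^{i}({\varvec{\alpha }},\mathbf{A}))K^{i}\Lambda ^{i}\) used in this proof can be made concrete with a short numerical sketch (all numbers below are illustrative, not from the paper):

```python
import numpy as np

# Sketch: one summand xi^i = (l^i ⊗ g_1^i) K^i Lambda^i of G_n,
# for p = 2 continuous regressors and k = 3 parameter functions.
p, k = 2, 3
u = np.array([0.4, -0.2])            # H^{-1}(X^i - x), illustrative
l_i = np.concatenate(([1.0], u))     # l(u) = (1, u^T)^T, length p + 1
g1_i = np.array([0.5, -1.0, 2.0])    # score vector g_1, illustrative
K_i, Lam_i = 0.8, 0.1                # continuous and discrete kernel weights

xi_i = np.kron(l_i, g1_i) * K_i * Lam_i   # length k(p + 1) = 9
# The first k entries correspond to alpha; the remaining kp entries
# correspond to the columns of A, matching the ordering of G_n above.
print(xi_i.shape)  # (9,)
```

Averaging such summands over \(i=1,\ldots ,n\) gives \(\mathbf{G}_{n}({\varvec{\alpha }},\mathbf{A})\), whose zero is the local maximum likelihood estimator.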
Next, we prove (7.3). By the assumption (A7), we obtain
uniformly for \(({\varvec{\alpha }},\mathbf{A})\) in any compact set. This completes the proof of (7.3). The proof of (7.5) is similar to that of (7.2). For this proof, one may use continuity of \(\mathbf{g}_{2}(y,{\varvec{\theta }})\) in \({\varvec{\theta }}\) and the following exponential inequality for the truncated version of \(\mathbf{J}_{n}\), denoted by \(\mathbf{J}_{n}^{*}\), constructed in the same way as \(\mathbf{G}_{n}^{*}\): for any \(\varepsilon >0\) it holds that
for any fixed \({\varvec{\alpha }}\) and \(\mathbf{A}\), where \(c\) is the same positive constant as at (7.13).
Park, B.U., Simar, L. & Zelenyuk, V. Categorical data in local maximum likelihood: theory and applications to productivity analysis. J Prod Anal 43, 199–214 (2015). https://doi.org/10.1007/s11123-014-0394-y
Keywords
- Stochastic frontier models
- Truncated regression
- Local maximum likelihood
- Nonparametric smoothing
- Categorical variables