# Varying Coefficient Models

**DOI:** https://doi.org/10.1057/978-1-349-95189-5_2501

## Abstract

Varying coefficient models offer a compromise between fully nonparametric and parametric models: they give the response coefficients of standard regression models enough flexibility to uncover hidden structures in the data without running into the curse of dimensionality.

### Keywords

Functional coefficient models; Heteroskedasticity; Least squares; Linear regression models; Maximum likelihood; Nonparametric estimation; Parameter heterogeneity; Random coefficients model; Smooth coefficient models; Tuning variables; Varying coefficient models

One of the most interesting forms of nonlinear regression models is the varying coefficient model (VCM). Unlike the linear regression model, in which the coefficients are constant, the VCM, introduced by Hastie and Tibshirani (1993), allows the regression coefficients to vary systematically and smoothly in more than one dimension. It is worth noting the distinction between the VCM and the so-called random coefficients model, which assumes that the coefficients vary non-systematically (randomly). Versions of the VCM are encountered in the literature as functional coefficient models (see Cai et al. 2000b) and smooth coefficient models (see Li et al. 2002).

VCMs are very useful tools in applied work in economics as they can be used to model parameter heterogeneity in a general way. For example, Durlauf et al. (2001) estimate a version of the Solow model that allows the parameters for each country to vary as functions of initial income. This work is extended in Kourtellos (2005), who finds parameter dependence on initial literacy, initial life expectancy, expropriation risk and ethnolinguistic fractionalization. Li et al. (2002) use the above smooth coefficient model to estimate the production function of the non-metal mineral industry in China. Stengos and Zacharias (2006) use the same model to examine an intertemporal hedonic model of the personal computer market, where the coefficients of the hedonic regression were unknown functions of time. Hong and Lee (2003) forecast the nonlinearity in the conditional mean of exchange rate changes using a VCM that allows the autoregressive coefficients to vary with investment positions. Ahmad et al. (2005) apply the VCM in the estimation of a production function in China’s manufacturing industry to show that the marginal productivity of labour and capital depends on the firm’s R&D values. Mamuneas et al. (2006) study the effect of human capital on total factor productivity in an empirical growth framework. In what follows we present the basic structure of the standard VCM specification as it appears in the literature and then proceed to discuss certain aspects of estimation and some of its recent generalizations.

## Basic Specification

The standard VCM takes the form

\[ y_i = \beta(z_i)^{\prime} X_i + u_i, \qquad (1) \]

with *E*(*u*_{i}|*X*_{i}) = 0, where *X*_{i} = (1, *x*_{i2}, ..., *x*_{ip})′ is a *p*-dimensional vector of slope regressors and *β*(*z*_{i})′ = (*β*_{1}(*z*_{i1}), *β*_{2}(*z*_{i2}), ..., *β*_{p}(*z*_{ip})) is a *p*-dimensional vector of varying coefficients, which take the form of unknown smooth functions of *z*_{i1}, *z*_{i2}, ..., *z*_{ip}, respectively. Notice that *β*_{1}(*z*_{i1}) is a varying intercept that measures the direct relationship between the tuning variable and the dependent variable in a nonparametric way. We refer to the *z*_{ij} as tuning variables. Each can be a scalar, such as time, or a vector of dimension *q*. The coefficient functions map the tuning variables into a set of local regression coefficients, so that the effect of *X*_{i} on *y*_{i} is not constant but varies smoothly with the tuning variables. A common situation in the literature arises when *z*_{j} is the same for all *j*.

It is worth noting that the VCM (1) is a very flexible and rich family of models. One reason is that the additive separable structure of (1) offers a useful compromise relative to general high-dimensional nonparametric regression, which is known to suffer from the curse of dimensionality. This allows for nonparametric estimation even when the conditioning regressor space is high-dimensional. Another is that it nests many well-known models as special cases. If *β*_{j}(*z*_{ij}) = *β*_{j} for all *j*, then we are dealing with the usual linear model. If *β*_{j}(*z*_{ij}) = *β*_{j}*z*_{ij} for some *j*, we simply have the interaction term *β*_{j}*x*_{ij}*z*_{ij} entering the regression function. If *x*_{ij} = *c* (a constant) or if *z*_{ij} = *x*_{ij} for all *j* = 1, …, *p*, then the model takes the generalized additive form where the additive components are unknown functions (see Hastie and Tibshirani 1990; Linton and Nielsen 1995).
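The nesting can be made explicit by substituting each restriction into the model. The sketch below assumes the scalar form of eq. (1), *y*_{i} = Σ_{j} *β*_{j}(*z*_{ij})*x*_{ij} + *u*_{i}; the function names *f*_{j} are introduced here only for illustration.

```latex
\begin{align*}
\beta_j(z_{ij}) = \beta_j \ \text{for all } j
  &\;\Longrightarrow\; y_i = \sum_{j=1}^{p} \beta_j x_{ij} + u_i
  && \text{(linear model)} \\
\beta_j(z_{ij}) = \beta_j z_{ij} \ \text{for some } j
  &\;\Longrightarrow\; \beta_j(z_{ij})\, x_{ij} = \beta_j x_{ij} z_{ij}
  && \text{(interaction term)} \\
x_{ij} = c \ \text{for all } j
  &\;\Longrightarrow\; y_i = \sum_{j=1}^{p} f_j(z_{ij}) + u_i, \quad f_j(z) = c\,\beta_j(z)
  && \text{(additive in } z\text{)} \\
z_{ij} = x_{ij} \ \text{for all } j
  &\;\Longrightarrow\; y_i = \sum_{j=1}^{p} f_j(x_{ij}) + u_i, \quad f_j(x) = \beta_j(x)\,x
  && \text{(additive in } x\text{)}
\end{align*}
```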

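Taking the tuning variable *z* to be common to all coefficients, at a given point *z*_{0} the functions *β*_{j}(*z*), *j* = 1, …, *p*, are approximated by local linear polynomials *β*_{j}(*z*) ≈ *c*_{j0} + *c*_{j1}(*z* – *z*_{0}) for *z* in a neighborhood of *z*_{0}. This leads to the following weighted local least squares problem:

\[ \min_{\{c_{j0},\, c_{j1}\}} \sum_{i=1}^{n} \Big[ y_i - \sum_{j=1}^{p} \big\{ c_{j0} + c_{j1}(z_i - z_0) \big\} x_{ij} \Big]^2 K_h(z_i - z_0) \qquad (2) \]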

for a given kernel function *K* and bandwidth *h*, where *K*_{h}(·) = *K*(·/*h*)/*h*. While this method is simple, it is implicitly assumed that the functions *β*_{j}(*z*) possess the same degrees of smoothness and hence can be approximated equally well in the same interval. Fan and Zhang (1999) allow for different degrees of smoothness for different coefficient functions by proposing a two-stage method. This is similar in spirit to what Huang and Shen (2004) do for global smoothers using regression splines but allowing each coefficient function to have different (global) smoothing parameters.
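The weighted local least squares fit described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the estimator of any particular paper: the Gaussian kernel, the hand-picked bandwidth, the function names and the simulated data are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumed choice)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def local_linear_vcm(y, X, z, z0, h):
    """Local linear estimate of beta(z0) for y_i = beta(z_i)'X_i + u_i.

    y: (n,) responses; X: (n, p) regressors with a leading column of ones;
    z: (n,) scalar tuning variable; z0: evaluation point; h: bandwidth.
    """
    dz = z - z0
    w = gaussian_kernel(dz / h) / h               # K_h(z_i - z0)
    D = np.hstack([X, X * dz[:, None]])           # columns for c_j0, then c_j1
    sw = np.sqrt(w)                               # weighted LS via sqrt-weights
    coef, *_ = np.linalg.lstsq(D * sw[:, None], y * sw, rcond=None)
    return coef[: X.shape[1]]                     # c_j0 estimates beta_j(z0)

# Simulated illustration (all values hypothetical).
rng = np.random.default_rng(0)
n = 2000
z = rng.uniform(0.0, 1.0, n)
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = np.sin(2 * np.pi * z) + (1.0 + z**2) * x + 0.1 * rng.normal(size=n)

beta_hat = local_linear_vcm(y, X, z, z0=0.5, h=0.1)
print(beta_hat)  # the true coefficients at z0 = 0.5 are (0, 1.25)
```

Because a single bandwidth `h` is shared by all coefficient functions, the sketch inherits the limitation noted above: it implicitly treats every *β*_{j} as equally smooth.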

An attractive alternative to local polynomial estimation is a global smoothing method based on general series methods such as polynomial splines and trigonometric approximation (see Ahmad et al. 2005; Huang et al. 2004; Huang and Shen 2004; Xue and Yang 2006a). All these papers emphasize the computational savings from having to solve only one minimization problem. Ahmad et al. (2005) stress the efficiency gains of the series approach over a kernel-based approach when one allows for conditional heteroskedasticity. We should note that inference for the estimated coefficients will differ across choices of approximation, and the asymptotic properties of such estimators are generally not easy to obtain.
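To illustrate why a series estimator needs only one minimization, the sketch below expands each coefficient function in a simple power basis and solves a single least squares problem. The power basis stands in for the splines or trigonometric terms used in the cited papers; the basis choice, the degree, the function names and the simulated data are assumptions made for illustration.

```python
import numpy as np

def series_vcm_fit(y, X, z, degree=4):
    """Global series fit: each beta_j(z) is a polynomial of the given degree in z."""
    n, p = X.shape
    B = np.vander(z, degree + 1, increasing=True)        # basis 1, z, ..., z^degree
    D = (X[:, :, None] * B[:, None, :]).reshape(n, -1)   # products x_ij * B_k(z_i)
    theta, *_ = np.linalg.lstsq(D, y, rcond=None)        # one minimization problem
    return theta.reshape(p, degree + 1)                  # row j: basis coefficients of beta_j

def series_vcm_predict(theta, z_new):
    """Evaluate the fitted coefficient functions at the points z_new."""
    B = np.vander(np.atleast_1d(z_new), theta.shape[1], increasing=True)
    return B @ theta.T                                   # shape (len(z_new), p)

# Simulated illustration (all values hypothetical).
rng = np.random.default_rng(0)
n = 2000
z = rng.uniform(0.0, 1.0, n)
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = np.exp(z) + (1.0 + z**2) * x + 0.1 * rng.normal(size=n)

theta_hat = series_vcm_fit(y, X, z, degree=4)
beta_grid = series_vcm_predict(theta_hat, np.array([0.25, 0.5, 0.75]))
print(beta_grid)  # row 1 approximates (exp(0.5), 1.25)
```

Once `theta_hat` is available, the coefficient functions can be evaluated anywhere without refitting, which is the computational saving relative to solving a local problem at every evaluation point.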

Although the model was initially developed for i.i.d. data, it has been extended to time series data by Chen and Tsay (1993), Cai et al. (2000b), Huang and Shen (2004), and Cai (2007) for strictly stationary processes with different mixing conditions. The coefficient functions then typically become functions of time and/or lagged values of the dependent variable. It is worth noting that estimation issues such as bandwidth selection are similar to those in the i.i.d. case (see Cai 2007). The varying coefficient model has also been employed to analyse longitudinal data (see Brumback and Rice 1998; Hoover et al. 1998; Huang et al. 2004).

## The Partially Linear Varying Coefficient Model

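A special case of eq. (1), in which the varying coefficients depend on a common tuning variable *z*_{i}, is the partially linear VCM. Here some of the coefficients are constants (independent of *z*_{i}). In that case, eq. (1) can be rewritten as

\[ y_i = W_i \gamma + \beta(z_i)^{\prime} X_i + u_i, \qquad (3) \]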

where *W*_{i} is the *i*th observation on a (1 × *q*) vector of additional regressors that enter the regression function linearly. The estimation of this model requires some special treatment as the partially linear structure may allow for efficiency gains, since the linear part can be estimated at a much faster rate, namely, \( \sqrt{n} \).

The partially linear VCM has been studied by Zhang et al. (2002), Xia et al. (2004), Ahmad et al. (2005), and Fan and Huang (2005). Zhang et al. (2002) suggest a two-step procedure in which the coefficients of the linear part are estimated in the first step by local polynomial fitting with an initial small bandwidth chosen by cross-validation (see Hoover et al. 1998). In other words, the approach is based on under-smoothing in the first stage. These local estimates are then averaged to yield the final first-step estimates of the linear part, which are used to redefine the dependent variable and return to the environment of eq. (1), where local smoothers can be applied as described above. Alternatively, Xia et al. (2004) separate the estimation of *γ* from that of *β*(*z*_{i}) by noting that the former can be estimated globally and the latter locally. This is what they call a ‘semi-local least squares procedure’, and they achieve a more efficient estimate of *γ* without under-smoothing, using standard bandwidth selection methods. Once *γ* has been estimated, the linear part can again be used to redefine the dependent variable and return to the environment of eq. (1).
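The two-step idea can be sketched as follows. This is a simplified reading of the procedure as described above, not the authors' exact algorithm: both bandwidths are fixed by hand rather than chosen by cross-validation, the kernel is Gaussian, and all function names and simulated values are hypothetical.

```python
import numpy as np

def local_linear(y, D, z, z0, h):
    """Kernel-weighted local linear fit; returns the level coefficients at z0."""
    dz = z - z0
    sw = np.sqrt(np.exp(-0.5 * (dz / h) ** 2))    # sqrt of Gaussian kernel weights
    A = np.hstack([D, D * dz[:, None]])
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[: D.shape[1]]

def two_step_partially_linear(y, W, X, z, z_grid, h_small, h):
    """Step 1: fit all coefficients locally at every z_i with a small bandwidth
    and average the W-coefficients. Step 2: refit the VCM on y - W @ gamma_hat."""
    q = W.shape[1]
    D = np.hstack([W, X])
    gamma_local = np.array([local_linear(y, D, z, zi, h_small)[:q] for zi in z])
    gamma_hat = gamma_local.mean(axis=0)          # averaged first-step estimates
    y_star = y - W @ gamma_hat                    # redefined dependent variable
    beta_hat = np.array([local_linear(y_star, X, z, z0, h) for z0 in z_grid])
    return gamma_hat, beta_hat

# Simulated illustration (all values hypothetical); the true gamma is 2.
rng = np.random.default_rng(0)
n = 400
z = rng.uniform(0.0, 1.0, n)
x = rng.normal(size=n)
W = rng.normal(size=(n, 1))
X = np.column_stack([np.ones(n), x])
y = 2.0 * W[:, 0] + np.sin(2 * np.pi * z) + (1.0 + z**2) * x + 0.1 * rng.normal(size=n)

gamma_hat, beta_hat = two_step_partially_linear(
    y, W, X, z, z_grid=np.array([0.5]), h_small=0.05, h=0.1
)
print(gamma_hat, beta_hat)
```

Averaging over all *n* local fits is what lets the constant coefficients be estimated more precisely than any single local estimate, while the second step is an ordinary VCM fit on the adjusted response.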

More recently, Fan and Huang (2005) use a profile least squares approach to provide a simple and useful method for estimating (3). They construct a Wald test and a profile likelihood ratio test for the parametric component, and the two tests share similar sampling properties. More importantly, they show that the asymptotic distribution of the profile likelihood ratio test under the null is independent of nuisance parameters and follows an asymptotic *χ*^{2} distribution. They also propose a generalized likelihood ratio test statistic to test whether certain parametric functions fit the nonparametric varying coefficients. This hypothesis test includes testing for the significance of the slope variables *X* (zero coefficients) and for the homogeneity of the model (constant coefficients). Other work on specification testing includes Li et al. (2002), Cai et al. (2000b), Cai (2007), and Yang et al. (2006), all of which rely mainly on bootstrapping in their implementation.

## Generalizations and Extensions

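A natural generalization of (1) relates the conditional mean of *y*_{i} to the index *m*(*X*_{i}, *Z*_{i}) = *β*(*z*_{i})′*X*_{i} via some given link function *g*(·):

\[ E(y_i \mid X_i, Z_i) = g\big( \beta(z_i)^{\prime} X_i \big). \qquad (4) \]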

This generalization is known as the generalized varying coefficient model and was originally proposed by Hastie and Tibshirani (1993). Cai et al. (2000a) study this model using local polynomial techniques and propose an efficient one-step local maximum likelihood estimator. Notice that if *g*(·) is the normal CDF, then (4) generalizes the standard tool of the discrete choice literature, namely the probit model.
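For the probit case, a local likelihood fit can be sketched as a kernel-weighted probit with a local linear index. The sketch assumes a binary outcome with *E*(*y*_{i}|*X*_{i}, *z*_{i}) = Φ(*β*(*z*_{i})′*X*_{i}), and it uses plain iterated Fisher scoring rather than the one-step estimator of Cai et al. (2000a). The Gaussian kernel, the bandwidth, the iteration count, the function names and the simulated data are all assumptions made for illustration.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(t):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + _erf(t / math.sqrt(2.0)))

def norm_pdf(t):
    """Standard normal density."""
    return np.exp(-0.5 * t**2) / math.sqrt(2.0 * math.pi)

def local_probit_vcm(y, X, z, z0, h, n_iter=25):
    """Local linear probit estimate of beta(z0) by kernel-weighted Fisher scoring."""
    dz = z - z0
    k = norm_pdf(dz / h) / h                      # kernel weights K_h(z_i - z0)
    D = np.hstack([X, X * dz[:, None]])           # local linear index design
    c = np.zeros(D.shape[1])
    for _ in range(n_iter):
        eta = D @ c
        mu = np.clip(norm_cdf(eta), 1e-8, 1.0 - 1e-8)
        dens = norm_pdf(eta)
        var = mu * (1.0 - mu)
        score = D.T @ (k * dens * (y - mu) / var)
        info = (D * (k * dens**2 / var)[:, None]).T @ D
        c = c + np.linalg.solve(info, score)      # Fisher scoring update
    return c[: X.shape[1]]                        # level terms estimate beta(z0)

# Simulated illustration (all values hypothetical).
rng = np.random.default_rng(1)
n = 5000
z = rng.uniform(0.0, 1.0, n)
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
index = np.sin(2 * np.pi * z) + (1.0 + z**2) * x
y = (rng.uniform(size=n) < norm_cdf(index)).astype(float)

beta_hat = local_probit_vcm(y, X, z, z0=0.5, h=0.1)
print(beta_hat)  # the true coefficients at z0 = 0.5 are (0, 1.25)
```

Binary outcomes carry less information per observation than a continuous response, so the sketch uses a larger sample than the least squares examples to obtain comparably stable local estimates.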

A further generalization allows the coefficients to depend on several tuning variables *z*_{l}, *l* = 1, 2, …, *q*. Although Hastie and Tibshirani (1993) proposed a back-fitting algorithm to estimate the varying coefficient functions, they did not provide any asymptotic justification. The most notable advance in this context has been by Xue and Yang (2006a), who propose a generalization of the VCM in (1) in which the varying coefficients have an additive structure in the tuning variables, thereby avoiding the curse of dimensionality.

Under mixing conditions, Xue and Yang (2006a) propose local polynomial marginal integration estimators, while Xue and Yang (2006b) study this model using polynomial splines.
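As a sketch of what an additive coefficient structure means, each coefficient can be written as a constant plus a sum of univariate functions of the individual tuning variables. The symbols *α*_{j0} and *α*_{jl} are notation assumed here for illustration and need not match that of Xue and Yang.

```latex
\begin{align*}
\beta_j(z_{i1}, \dots, z_{iq}) &= \alpha_{j0} + \sum_{l=1}^{q} \alpha_{jl}(z_{il}),
  \qquad j = 1, \dots, p, \\
y_i &= \sum_{j=1}^{p} \Big\{ \alpha_{j0} + \sum_{l=1}^{q} \alpha_{jl}(z_{il}) \Big\} x_{ij} + u_i .
\end{align*}
```

Each *α*_{jl} is a function of a single variable, so only one-dimensional smoothing is required even when *q* is large. Identification would typically require a normalization such as *E*[*α*_{jl}(*z*_{il})] = 0.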

Finally, Cai et al. (2006) have shifted the discussion to consider a structural VCM. They examine the case of endogenous slope regressors, and propose a two-stage IV procedure based on local linear estimation procedures in both stages. We believe that this line of research is fruitful for economic applications.

## Conclusion

VCMs have increasingly been employed as useful tools that allow for a compromise between fully nonparametric and parametric models. This compromise provides the flexibility needed to uncover hidden structures that underlie the response coefficients of standard regression models without running into the curse of dimensionality. More importantly, the structure of the VCM, which allows the regression coefficients to vary with a tuning variable, is very appealing in many economic applications, for it has a natural interpretation in terms of non-constant marginal effects.

### Bibliography

- Ahmad, I., S. Leelahanon, and Q. Li. 2005. Efficient estimation of a semiparametric partially varying linear model. *Annals of Statistics* 33: 258–283.
- Brumback, B., and J. Rice. 1998. Smoothing spline models for the analysis of nested and crossed samples of curves. *Journal of the American Statistical Association* 93: 961–976.
- Cai, Z. 2007. Trending time-varying coefficient time series models with serially correlated errors. *Journal of Econometrics* 136: 163–188.
- Cai, Z., J. Fan, and R. Li. 2000a. Efficient estimation and inferences for varying-coefficient models. *Journal of the American Statistical Association* 95: 888–902.
- Cai, Z., J. Fan, and Q. Yao. 2000b. Functional coefficient regression models for nonlinear time series models. *Journal of the American Statistical Association* 95: 941–956.
- Cai, Z., M. Das, H. Xiong, and Z. Wu. 2006. Functional coefficient instrumental variables models. *Journal of Econometrics* 133: 207–241.
- Chen, R., and R. Tsay. 1993. Functional coefficient autoregressive models. *Journal of the American Statistical Association* 88: 298–308.
- Durlauf, S., A. Kourtellos, and A. Minkin. 2001. The local Solow growth model. *European Economic Review* 45: 928–940.
- Fan, J. 1992. Design-adaptive nonparametric regression. *Journal of the American Statistical Association* 87: 998–1004.
- Fan, J., and I. Gijbels. 1996. *Local polynomial modelling and its applications*. London: Chapman and Hall.
- Fan, J., and T. Huang. 2005. Profile likelihood inferences on semiparametric varying-coefficient partially linear models. *Bernoulli* 11: 1031–1057.
- Fan, J., and W. Zhang. 1999. Statistical estimation in varying-coefficient models. *Annals of Statistics* 27: 1491–1518.
- Hastie, T., and R. Tibshirani. 1990. *Generalized additive models*. New York: Chapman and Hall.
- Hastie, T., and R. Tibshirani. 1993. Varying coefficient models. *Journal of the Royal Statistical Society, Series B* 55: 757–796.
- Hong, Y., and T.-H. Lee. 2003. Inference on predictability of foreign exchange rates via generalized spectrum and nonlinear time series models. *The Review of Economics and Statistics* 85: 1048–1062.
- Hoover, D., C. Rice, C. Wu, and L. Yang. 1998. Nonparametric smoothing estimates of time-varying coefficient models with longitudinal data. *Biometrika* 85: 809–822.
- Huang, J., and H. Shen. 2004. Functional coefficient regression models for nonlinear time series: A polynomial spline approach. *Scandinavian Journal of Statistics* 31: 515–534.
- Huang, J., C. Wu, and L. Zhou. 2004. Polynomial spline estimation and inference for varying coefficient models with longitudinal data. *Statistica Sinica* 14: 763–788.
- Kourtellos, A. 2005. Modeling parameter heterogeneity in cross-country growth regression models. Mimeo, Department of Economics, University of Cyprus.
- Li, Q., C. Huang, D. Li, and T. Fu. 2002. Semiparametric smooth coefficient models. *Journal of Business and Economic Statistics* 20: 412–422.
- Linton, O., and J. Nielsen. 1995. A kernel method of estimating structural nonparametric regression based on marginal integration. *Biometrika* 82: 93–100.
- Mamuneas, T., A. Savvides, and T. Stengos. 2006. Economic development and the return to human capital: A smooth coefficient semiparametric approach. *Journal of Applied Econometrics* 21: 111–132.
- Stengos, T., and E. Zacharias. 2006. Intertemporal pricing and price discrimination: A semiparametric hedonic analysis of the personal computer market. *Journal of Applied Econometrics* 21: 371–386.
- Stone, C. 1977. Consistent nonparametric regression. *Annals of Statistics* 5: 595–620.
- Xia, Y., W. Zhang, and H. Tong. 2004. Efficient estimation for semivarying-coefficient models. *Biometrika* 91: 661–681.
- Xue, L., and L. Yang. 2006a. Estimation of semiparametric additive coefficient model. *Journal of Statistical Planning and Inference* 136: 2506–2534.
- Xue, L., and L. Yang. 2006b. Additive coefficient modeling via polynomial spline. *Statistica Sinica* 16: 1423–1446.
- Yang, L., B. Park, L. Xue, and W. Härdle. 2006. Estimation and testing for varying coefficients in additive models with marginal integration. *Journal of the American Statistical Association* 101: 1212–1227.
- Zhang, W., S.-Y. Lee, and X. Song. 2002. Local polynomial fitting in semivarying coefficient model. *Journal of Multivariate Analysis* 82: 166–188.