
Penalized Relative Error Estimation of a Partially Functional Linear Multiplicative Model

  • Conference paper

Part of the book series: Contributions to Statistics (CONTRIB. STAT.)

Abstract

Functional data have become increasingly popular with the rapid technological developments in data collection and storage. In this study, we consider both scalar and functional predictors for a positive scalar response under the partially linear multiplicative model. A loss function based on relative errors is adopted, which provides a useful alternative to classic methods such as least squares. Penalization is used to detect the true structure of the model. The proposed method can not only identify the significant scalar variables but also select the basis functions (onto which the functional variable is projected) that contribute to the response. Both estimation and selection consistency properties are rigorously established. A simulation study is conducted to investigate the finite-sample performance of the proposed method. We analyze the Tecator data to demonstrate the application of the proposed method.
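The criterion studied here combines the two one-sided relative errors |Y − Ŷ|/Y and |Y − Ŷ|/Ŷ of the multiplicative model. A minimal sketch of this idea, assuming a single scalar predictor and a coarse grid search (both are illustrative simplifications, not the estimation procedure of the paper):

```python
import math
import random

# Simulate from a one-predictor multiplicative model Y = exp(b*x) * eps, eps > 0.
# The single-predictor setup and the grid search below are illustrative
# assumptions only, not the authors' algorithm.
random.seed(0)
b_true = 0.8
n = 200
xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
ys = [math.exp(b_true * x + random.gauss(0.0, 0.1)) for x in xs]

def relative_error_loss(b):
    """Sum of the two one-sided relative errors over the sample."""
    return sum(abs(1.0 - math.exp(b * x) / y) + abs(1.0 - y * math.exp(-b * x))
               for x, y in zip(xs, ys))

# Minimize over a coarse grid on [0, 2] (illustrative only).
b_hat = min((k / 100.0 for k in range(201)), key=relative_error_loss)
print(b_hat)  # close to b_true = 0.8
```

Unlike least squares on Y itself, this loss is scale-free in the response, which is why it is natural for positive responses generated multiplicatively.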


References

  1. Bühlmann, P., & van de Geer, S. (2011). Statistics for high-dimensional data: Methods, theory and applications. Berlin/Heidelberg: Springer.

  2. Chen, K., Guo, S., Lin, Y., & Ying, Z. (2010). Least absolute relative error estimation. Journal of the American Statistical Association, 105(491), 1104–1112.

  3. Chen, K., Lin, Y., Wang, Z., & Ying, Z. (2016). Least product relative error estimation. Journal of Multivariate Analysis, 144, 91–98.

  4. Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360.

  5. Ferraty, F., Hall, P., & Vieu, P. (2010). Most-predictive design points for functional data predictors. Biometrika, 97(4), 807–824.

  6. Knight, K. (1998). Limiting distributions for L1 regression estimators under general conditions. The Annals of Statistics, 26(2), 755–770.

  7. Kong, D., Xue, K., Yao, F., & Zhang, H. H. (2016). Partially functional linear regression in high dimensions. Biometrika, 103(1), 147–159.

  8. Li, Y., & Hsing, T. (2010). Deciding the dimension of effective dimension reduction space for functional and high-dimensional data. The Annals of Statistics, 38(5), 3028–3062.

  9. Lian, H. (2013). Shrinkage estimation and selection for multiple functional regression. Statistica Sinica, 23, 51–74.

  10. Narula, S. C., & Wellington, J. F. (1977). Prediction, linear regression and the minimum sum of relative errors. Technometrics, 19(2), 185–190.

  11. Park, H., & Stefanski, L. (1998). Relative-error prediction. Statistics & Probability Letters, 40(3), 227–236.

  12. Shin, H. (2009). Partial functional linear regression. Journal of Statistical Planning and Inference, 139(10), 3405–3418.

  13. Wang, J. L., Chiou, J. M., & Müller, H. G. (2016). Functional data analysis. Annual Review of Statistics and Its Application, 3, 257–295.

  14. Wang, Z., Chen, Z., & Wu, Y. (2016). A relative error estimation approach for single index model. Preprint. arXiv:1609.01553.

  15. Yao, F., Müller, H. G., & Wang, J. L. (2005). Functional data analysis for sparse longitudinal data. Journal of the American Statistical Association, 100(470), 577–590.

  16. Zhang, Q., & Wang, Q. (2013). Local least absolute relative error estimating approach for partially linear multiplicative model. Statistica Sinica, 23(3), 1091–1116.

  17. Zhang, T., Zhang, Q., & Li, N. (2016). Least absolute relative error estimation for functional quadratic multiplicative model. Communications in Statistics – Theory and Methods, 45(19), 5802–5817.

  18. Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429.


Acknowledgements

We thank the organizers for the invitation. The study was partly supported by the Fundamental Research Funds for the Central Universities (20720171064), the National Natural Science Foundation of China (11561006, 11571340), Research Projects of Colleges and Universities in Guangxi (KY2015YB171), Open Fund Project of Guangxi Colleges and Universities Key Laboratory of Mathematics and Statistical Model (2016GXKLMS005), and National Bureau of Statistics of China (2016LD01). Ahmed's research is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).


Corresponding author

Correspondence to Shuangge Ma.


Appendix


Denote \(\delta =(\theta^{\top }, \beta^{\top })^{\top }\) and \(\hat {\delta }=(\hat {\theta }^{\top }, \hat {\beta }^{\top } )^{\top }\). To facilitate the proof of the theorems, we introduce Lemmas 1–3; the proofs of Lemmas 1 and 2 can be found in [7], and the proof of Lemma 3 can be found in [9].

Lemma 1

Under (C1), (C2), and (C5), we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle n^{-1}\sum_{i=1}^{n}(\hat{\eta}_{ij}\hat{\eta}_{ik}-\eta_{ij}\eta_{ik})=O_{p}(j^{\alpha/2+1}n^{-1/2} +k^{\alpha/2}n^{-1/2}),\\ &\displaystyle n^{-1}\sum_{i=1}^{n}(\hat{\eta}_{ij}-\eta_{ij})z_{ik}=O_{p}(j^{\alpha/2+1}n^{-1/2} ). \end{array} \end{aligned} $$

Lemma 2

Under (C1)–(C5), we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} n^{-1}\sum_{i=1}^{n}\sum_{j=K+1}^{\infty}\eta_{ij}\beta_{j}z_{il}=O_{p}(n^{-1/2}). \end{array} \end{aligned} $$

Lemma 3

Under (C1) and (C2), we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle |\hat{\eta}_{ij}-\eta_{ij}|=O_{p}(K^{\alpha+1}/\sqrt{n}),\\ &\displaystyle \|\hat{\phi}_{j}-\phi_{j}\|=O_{p}(K^{2\alpha+2}/n). \end{array} \end{aligned} $$

Proof of Theorem 1

Let \(\alpha _{n}=\sqrt {K/n}\). We first show that \(\|\hat {\delta }-\delta \|=O_{p}(\alpha _{n})\). Following [4], it is sufficient to show that, for any given ε > 0, there exists a large constant C, such that

$$\displaystyle \begin{aligned} \begin{array}{rcl} P\{ \inf_{\|u\|=C}Q_{n}(\delta+\alpha_{n}u)>Q_{n}(\delta) \}\geq 1-\varepsilon, \end{array} \end{aligned} $$

where the infimum is taken over the ball \(\delta_{C}=\{\delta+\alpha_{n}u : \|u\|=C\}\) for C > 0. This implies that, with probability at least 1 − ε, there exists a local minimizer in the ball \(\delta_{C}\). Hence, the consistency of \(\hat{\delta}\) is established.

After some simplification, we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} \psi_{n}(u)&\displaystyle =&\displaystyle Q_{n}(\delta+\alpha_{n}u)-Q_{n}(\delta)\\ &\displaystyle =&\displaystyle \sum_{i=1}^{n}\{|1-Y_{i}^{-1}\exp\{\hat{W}_{i}^{\top}(\delta+\alpha_{n}u)\}| - |1-Y_{i}^{-1}\exp(\hat{W}_{i}^{\top}\delta)|\}\\ &\displaystyle &\displaystyle + \sum_{i=1}^{n}\{|1-Y_{i}\exp\{-\hat{W}_{i}^{\top}(\delta+\alpha_{n}u)\}|- |1-Y_{i}\exp\{-\hat{W}_{i}^{\top}\delta\}|\}\\ &\displaystyle &\displaystyle +n\sum_{l=1}^{q}\{\xi_{l}|\theta_{l}+u_{l}\alpha_{n}|-\xi_{l}|\theta_{l}|\} +n\sum_{j=1}^{K}\{ \lambda_{j}|\beta_{j}+u_{j+q}\alpha_{n}|- \lambda_{j}|\beta_{j}|\} \\ &\displaystyle \equiv&\displaystyle I_{1}+I_{2}+I_{3}+I_{4}. \end{array} \end{aligned} $$
(6)

For \(I_{1}\), by applying the identity in [6],

$$\displaystyle \begin{aligned} \begin{array}{rcl} |x-y|-|x|=-y[I(x>0)-I(x<0)]+2\int_{0}^{y}[I(x\leq s)-I(x\leq0)]ds, \end{array} \end{aligned} $$

which is valid for x ≠ 0, we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} I_{1}&\displaystyle =&\displaystyle -\sum_{i=1}^{n}w_{1i}[I(1-Y_{i}^{-1}\exp(\hat{W}_{i}^{\top}\delta)>0) - I(1-Y_{i}^{-1}\exp(\hat{W}_{i}^{\top}\delta)<0)]\\ &\displaystyle &\displaystyle +2\sum_{i=1}^{n}\int_{0}^{w_{1i}}[I(1-Y_{i}^{-1}\exp(\hat{W}_{i}^{\top}\delta)\leq s)-I(1-Y_{i}^{-1}\exp(\hat{W}_{i}^{\top}\delta)\leq0)]ds\\ &\displaystyle \equiv &\displaystyle I_{11}+I_{12}, \end{array} \end{aligned} $$

where

$$\displaystyle \begin{aligned} \begin{array}{rcl} w_{1i}=Y_{i}^{-1}\{\exp(\hat{W}_{i}^{\top}(\delta+\alpha_{n}u)) -\exp(\hat{W}_{i}^{\top}\delta)\}. \end{array} \end{aligned} $$
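As a quick sanity check, the identity from [6] used above can be verified numerically; the sketch below approximates the signed integral from 0 to y by a midpoint Riemann sum (the helper name and step count are illustrative choices):

```python
def knight_rhs(x, y, m=20000):
    """Right-hand side of Knight's identity:
    -y[I(x>0)-I(x<0)] + 2*int_0^y [I(x<=s)-I(x<=0)] ds,
    with the (signed) integral approximated by a midpoint Riemann sum."""
    step = y / m
    integral = sum(((1 if x <= (k + 0.5) * step else 0) - (1 if x <= 0 else 0))
                   for k in range(m)) * step
    return -y * ((1 if x > 0 else 0) - (1 if x < 0 else 0)) + 2.0 * integral

# The identity holds for any x != 0, including negative y.
for x, y in [(0.5, 2.0), (0.5, -1.0), (-0.3, -1.0), (1.2, 0.4)]:
    lhs = abs(x - y) - abs(x)
    assert abs(lhs - knight_rhs(x, y)) < 1e-2
```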

By Taylor expansion, we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} I_{11} &\displaystyle =&\displaystyle \sum_{i=1}^{n}\alpha_{n}u^{\top}\hat{W}_{i}Y_{i}^{-1}\exp(\hat{W}_{i}^{\top}\delta) sgn(1- Y_{i}^{-1}\exp(\hat{W}_{i}^{\top}\delta))\\ &\displaystyle &\displaystyle +\alpha_{n}^{2}u^{\top}\{\sum_{i=1}^{n}\hat{W}_{i}\hat{W}_{i}^{\top}Y_{i}^{-1}\exp(\xi_{i}^{[1]}) sgn(1- Y_{i}^{-1}\exp(\hat{W}_{i}^{\top}\delta))\}u/2\\ &\displaystyle \equiv &\displaystyle I_{111}+I_{112}, \end{array} \end{aligned} $$

where \(\xi _{i}^{[1]}\) lies between \(\hat {W}_{i}^{\top }(\delta +\alpha _{n}u)\) and \(\hat {W}_{i}^{\top }\delta \).

For \(I_{111}\), by Lemma 2, we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} I_{111}&\displaystyle =&\displaystyle \sum_{i=1}^{n}\alpha_{n}u^{\top}\hat{W}_{i}\frac{1}{\varepsilon_{i}} sgn(1- Y_{i}^{-1}\exp(\hat{W}_{i}^{\top}\delta))\\ &\displaystyle &\displaystyle +\sum_{i=1}^{n}\alpha_{n}^{2}u^{\top}\frac{1}{\varepsilon_{i}}\hat{W}_{i} \hat{W}_{i}^{\top}u\, sgn(1- Y_{i}^{-1}\exp(\hat{W}_{i}^{\top}\delta))+o_{p}(1)\\ &\displaystyle \equiv &\displaystyle I_{1111}+I_{1112}+o_{p}(1). \end{array} \end{aligned} $$

It follows directly from Lemma 1 and (C5) that \(I_{1111}=O_{p}(1)\|\alpha_{n}u\|\). Moreover, under (C5), \(I_{1112}=\alpha _{n}^{2}u^{\top }D_{2}u/2\) and \(I_{112}=\alpha _{n}^{2}u^{\top }D_{2}u/2+o_{p}(1)\|\alpha _{n}u\|{ }^{2}\).

Hence,

$$\displaystyle \begin{aligned} \begin{array}{rcl} I_{11}=O_{p}(1)\|\alpha_{n}u\|+\alpha_{n}^{2}u^{\top}D_{2}u/2+o_{p}(1)\|u\|. \end{array} \end{aligned} $$

For \(I_{12}\), denote \(c_{1i} = \exp (\hat {W}_{i}^{\top }(\delta+\alpha_{n}u)-W_{i}^{\top }\delta -H)\), \(c_{2i} = \exp (\hat {W}_{i}^{\top }\delta -W_{i}^{\top }\delta -H)\), and \(\tau = s\varepsilon_{i}\); then

$$\displaystyle \begin{aligned} \begin{array}{rcl} I_{12}&\displaystyle =&\displaystyle 2\sum_{i=1}^{n}\int_{0}^{w_{1i}}[I(1-\varepsilon_{i}^{-1}c_{2i}\leq s)-I(1-\varepsilon_{i}^{-1}c_{2i}\leq0)]ds\\ &\displaystyle =&\displaystyle 2\sum_{i=1}^{n}\int_{0}^{c_{1i}-c_{2i}}\varepsilon_{i}^{-1}[I(\varepsilon_{i}\leq c_{2i}+\tau) -I(\varepsilon_{i}\leq c_{2i})]d\tau\\ &\displaystyle =&\displaystyle 2\sum_{i=1}^{n}\int_{0}^{c_{1i}-c_{2i}}E_{\varepsilon|X}\{\varepsilon_{i}^{-1}[I(\varepsilon_{i} \leq c_{2i}+\tau)-I(\varepsilon_{i}\leq c_{2i})]\}d\tau+o_{p}(1)\\ &\displaystyle =&\displaystyle 2\sum_{i=1}^{n}\int_{0}^{c_{1i}-c_{2i}}E_{\varepsilon|X}\{[I(\varepsilon_{i}\leq c_{2i}+\tau) -I(\varepsilon_{i}\leq c_{2i})]\}d\tau\\ &\displaystyle &\displaystyle +2\sum_{i=1}^{n}\int_{0}^{c_{1i}-c_{2i}}E_{\varepsilon|X}\{(\varepsilon_{i}^{-1}-1) [I(\varepsilon_{i}\leq c_{2i}+\tau)-I(\varepsilon_{i}\leq c_{2i})]\}d\tau+o_{p}(1)\\ &\displaystyle =&\displaystyle \alpha_{n}^{2}u^{\top}\sum_{i=1}^{n}\hat{W}_{i}\hat{W}_{i}^{\top}u[1+o_{p}(1)]\\ &\displaystyle =&\displaystyle \alpha_{n}^{2}u^{\top}D_{2}u+o_{p}(1)\|\alpha_{n}u\|{}^{2}. \end{array} \end{aligned} $$

Therefore, we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} I_{1}=O_{p}(1)\|\alpha_{n}u\|+\alpha_{n}^{2}u^{\top}D_{2}u/2+o_{p}(1)\|u\|. \end{array} \end{aligned} $$
(7)

Similarly, it can be shown that

$$\displaystyle \begin{aligned} \begin{array}{rcl} I_{2}=O_{p}(1)\|\alpha_{n}u\|+\alpha_{n}^{2}u^{\top}D_{1}u/2+o_{p}(1)\|u\|. \end{array} \end{aligned} $$
(8)

Adopting the approach in [6], we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle |\theta_{l}+u_{l}\alpha_{n}|-|\theta_{l}| \rightarrow \alpha_{n}\{u_{l}sgn(\theta_{l})I(\theta_{l}\neq 0)+|u_{l}|I(\theta_{l}= 0)\},\\ &\displaystyle |\beta_{j}+u_{j+q}\alpha_{n}|-|\beta_{j}| \rightarrow \alpha_{n}\{u_{j+q}sgn(\beta_{j})I(\beta_{j}\neq 0)+|u_{j+q}|I(\beta_{j}= 0)\}. \end{array} \end{aligned} $$
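The two limits above are the one-sided directional derivatives of the absolute value: \((|\theta+u\alpha|-|\theta|)/\alpha \rightarrow u\, sgn(\theta)\) when θ ≠ 0 and \(\rightarrow |u|\) when θ = 0. A small finite-difference check (the helper name and step size are arbitrary illustrative choices):

```python
def abs_directional_derivative(theta, u, alpha=1e-6):
    """Finite-difference approximation of lim_{alpha->0+} (|theta + u*alpha| - |theta|)/alpha."""
    return (abs(theta + u * alpha) - abs(theta)) / alpha

# theta != 0: the limit is u * sgn(theta)
assert abs(abs_directional_derivative(1.5, -2.0) - (-2.0)) < 1e-6
assert abs(abs_directional_derivative(-0.7, 3.0) - (-3.0)) < 1e-6
# theta == 0: the limit is |u|, which is why zero coefficients pick up a |u| penalty term
assert abs(abs_directional_derivative(0.0, -2.5) - 2.5) < 1e-9
```

This nondifferentiability at zero is exactly what allows the penalty terms \(I_{3}\) and \(I_{4}\) to dominate at zero coefficients.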

According to (C7), we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} I_{3}>O(1)\|u\| \end{array} \end{aligned} $$
(9)

and

$$\displaystyle \begin{aligned} \begin{array}{rcl} I_{4}>O(1)\|u\|. \end{array} \end{aligned} $$
(10)

Combining (6)–(10), for a sufficiently large C, we have \(\psi_{n}(u)>0\) with probability at least 1 − ε. Therefore, \(\|\hat {\delta }-\delta \|=O_{p}(\sqrt {K/n})\).

Next we show \(\|\hat {\gamma }(t)-\gamma (t)\|=o_{p}(1)\). In fact,

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle \|\hat{\gamma}(t)-\gamma(t)\| \leq \sum_{j=1}^{K}\|\hat{\beta}_{j}\hat{\phi}_{j}(t)-\beta_{j}\phi_{j}(t)\|+\| \sum_{j>K}\beta_{j}\phi_{j}(t)\|\\ &\displaystyle \leq &\displaystyle \sum_{j=1}^{K}[|\hat{\beta}_{j}-\beta_{j}|\|\phi_{j}(t)\|+|\hat{\beta}_{j}-\beta_{j}|\|\hat{\phi}_{j}(t)-\phi_{j}(t)\| +|\beta_{j}|\|\hat{\phi}_{j}(t)-\phi_{j}(t)\|]\\ &\displaystyle &\displaystyle +\| \sum_{j>K}\beta_{j}\phi_{j}(t)\|\\ &\displaystyle \equiv&\displaystyle F_{1}+F_{2}. \end{array} \end{aligned} $$

By the result of the first part and Lemma 3, we can obtain

$$\displaystyle \begin{aligned} \begin{array}{rcl} F_{1}=O_{p}(\frac{K^{2\alpha+3}}{n}). \end{array} \end{aligned} $$

By the square integrability of γ(t), we have \(F_{2}=o_{p}(1)\). From the above results and condition (C4), we obtain

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle \|\hat{\gamma}(t)-\gamma(t)\| =o_{p}(1). \end{array} \end{aligned} $$

This completes the proof of Theorem 1. □

Proof of Theorem 2

For j ∈ A and k ∈ B, the consistency result in Theorem 1 indicates that \(\hat {\theta }\rightarrow \theta \) and \(\hat {\beta }\rightarrow \beta \) in probability. Therefore, \(P(j\in \hat {A}_{n})\rightarrow 1\) and \(P(k\in \hat {B}_{n})\rightarrow 1\). It then suffices to show that \(P ( j\in \hat {A}_{n} ) \rightarrow 0\) for all \(j\notin A\) and \(P ( k\in \hat {B}_{n} ) \rightarrow 0\) for all \(k\notin B\).

For any \(j\in \hat {A}_{n}\) and \(k\in \hat {B}_{n}\), we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} 0=\frac{\partial \psi_{n}(u)}{\partial u}&\displaystyle =&\displaystyle \sum_{i=1}^{n}\{|1-Y_{i}^{-1}\exp\{\hat{W}_{i}^{\top}(\delta+\alpha_{n}u)\}| - |1-Y_{i}^{-1}\exp(\hat{W}_{i}^{\top}\delta)|\}\hat{W}_{ijk}\\ &\displaystyle &\displaystyle + \sum_{i=1}^{n}\{|1-Y_{i}\exp\{-\hat{W}_{i}^{\top}(\delta+\alpha_{n}u)\}|- |1-Y_{i}\exp\{-\hat{W}_{i}^{\top}\delta\}|\}\hat{W}_{ijk}\\ &\displaystyle &\displaystyle +\sum_{i=1}^{n}\{|1-Y_{i}^{-1}\exp\{\hat{W}_{i}^{\top}(\delta+\alpha_{n}u)\}| - |1-Y_{i}^{-1}\exp(\hat{W}_{i}^{\top}\delta)|\}\hat{W}_{ijk}\hat{W}_{i}^{\top}u\\ &\displaystyle &\displaystyle + \sum_{i=1}^{n}\{|1-Y_{i}\exp\{-\hat{W}_{i}^{\top}(\delta+\alpha_{n}u)\}|- |1-Y_{i}\exp\{-\hat{W}_{i}^{\top}\delta\}|\}\hat{W}_{ijk}\hat{W}_{i}^{\top}u\\ &\displaystyle &\displaystyle +n\alpha_{n}\xi_{j}sgn(\theta_{j})+n\alpha_{n}\lambda_{k}sgn(\beta_{k})\\ &\displaystyle \equiv&\displaystyle N_{1}+N_{2}+N_{3}+N_{4}+N_{5}+N_{6}. \end{array} \end{aligned} $$

By arguments similar to those for Theorem 1, we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} N_{k}=O_{p}(1), k=1,\cdots,4. \end{array} \end{aligned} $$

According to (C8), we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} N_{k}\rightarrow \infty, k=5,6. \end{array} \end{aligned} $$

Consequently, we must have \(P(j\in \hat {A}_{n})\rightarrow 0\) and \(P(k\in \hat {B}_{n})\rightarrow 0\) for all \(j\notin A\) and \(k\notin B\). This completes the proof of Theorem 2. □


Copyright information

© 2019 Springer Nature Switzerland AG


Cite this paper

Zhang, T., Huang, Y., Zhang, Q., Ma, S., Ahmed, S.E. (2019). Penalized Relative Error Estimation of a Partially Functional Linear Multiplicative Model. In: Ahmed, S., Carvalho, F., Puntanen, S. (eds) Matrices, Statistics and Big Data. IWMS 2016. Contributions to Statistics. Springer, Cham. https://doi.org/10.1007/978-3-030-17519-1_10
