Summary
A new method for nonparametric regression data analysis, which analyzes the sensitivity to abnormally large perturbations with the Principal Hessian Directions (PHD) method (Li 1992), is introduced, combining the merits of effective dimension reduction and visualization. We develop techniques for detecting perturbed points without knowledge of the functional form of the regression model when a small percentage of observations takes abnormally large values. The main feature of the proposed method is the estimation of the deviation angle of the PHD direction. The basic idea is to recursively trim out the perturbed points that cause the larger directional deviations. Our multiple-trimming method consistently reduces the pattern ambiguity of the geometric shape information about the regression surface. Several simulations with empirical results are reported.
References
Chaudhuri, P., Huang, M. C., Loh, W. Y. and Yao, R. (1994), “Piecewise-polynomial regression trees,” Statistica Sinica, 4, 143–167.
Cheng, C. S. and Li, K. C. (1995), “A study of the method of Principal Hessian Direction for analysis of data from designed experiments,” Statistica Sinica, 5, 617–639.
Cook, R. D. (1998), “Principal Hessian Directions Revisited,” (with discussion), J. Amer. Stat. Assoc., 93, 84–100.
Cook, R. D. and Weisberg, S. (1989), “Regression diagnostics with dynamic graphics,” (with discussion), Technometrics, 31, 277–308.
Duan, N. and Li, K. C. (1991), “Slicing regression: A link-free regression method,” Ann. Statist., 19, 505–530.
Filliben, J. J. and Li, K. C. (1997), “A systematic approach to the analysis of complex interaction patterns in two-level factorial designs,” Technometrics, 39, 286–297.
Hall, P. and Li, K. C. (1993), “On almost linearity of low dimensional projections from high dimensional data,” Ann. Statist., 21, 867–889.
Hsing, T. and Carroll, R. J. (1992), “An asymptotic theory for sliced inverse regression,” Ann. Statist., 20, 1040–1061.
Kurt, E., Anthony, R. and Herbert, S. W. (1979), Statistical Methods for Digital Computers, Vol. 3, John Wiley.
Li, K. C. (1991), “Sliced inverse regression for dimension reduction,” (with discussion), J. Amer. Stat. Assoc., 86, 316–342.
Li, K. C. (1992), “On principal Hessian directions for data visualization and dimension reduction: another application of Stein’s lemma,” J. Amer. Stat. Assoc., 87, 1025–1039.
Li, K. C., Lue, H. H. and Chen, C. H. (2000), “Interactive Tree-structured Regression via Principal Hessian Directions,” J. Amer. Stat. Assoc., 95, 547–560.
Lue, H. H. (1994), “Principal-Hessian-direction-based regression trees,” unpublished Ph.D. dissertation, Department of Math., University of California, Los Angeles.
Tierney, L. (1990), LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic Graphics, New York: John Wiley & Sons.
Weisberg, S. (1985), Applied Linear Regression, John Wiley.
Acknowledgment
This research was supported in part by the National Science Council, R.O.C. grant #NSC86-2115-M-130-002.
Appendix
A. Proof of Lemma 3.1
To proceed with this proof, a straightforward expression for the residuals with case i deleted leads to \(\widehat{r}_j^{\left( i \right)} = {y_j} - x_j^\prime {\left( {{\rm{X}}_{\left( i \right)}^\prime {{\rm{X}}_{\left( i \right)}}} \right)^{ - 1}}{\rm{X}}_{\left( i \right)}^\prime {y_{\left( i \right)}}\), where X is a full-rank matrix with n rows and (p+1) columns (a column of ones included), X(i) is the (n−1) by (p+1) matrix obtained from X by deleting the ith row, and \(x_i^\prime \) is the ith row of X. To simplify this expression, we apply the equality
$$\left( {\rm{X}}_{\left( i \right)}^\prime {\rm{X}}_{\left( i \right)} \right)^{-1} = \left( {\rm{X}}^\prime {\rm{X}} \right)^{-1} + {{\left( {\rm{X}}^\prime {\rm{X}} \right)^{-1} x_i x_i^\prime \left( {\rm{X}}^\prime {\rm{X}} \right)^{-1}} \over {1 - h_{ii}}}$$
(see Weisberg 1985), where \(h_{ii} = x_i^\prime ({\rm{X}}^\prime {\rm{X}})^{-1} x_i\). As a result, we have
$$\widehat{r}_j^{\left( i \right)} = {\widehat{r}_j} + {{{h_{ji}}} \over {1 - {h_{ii}}}}{\widehat{r}_i},$$
thus completing the proof.
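The identity in Lemma 3.1 can be checked numerically. The sketch below (all names are ours, not the paper's code) fits a leave-one-out regression directly and compares its residuals with the formula \(\widehat{r}_j^{(i)} = \widehat{r}_j + h_{ji}\widehat{r}_i/(1-h_{ii})\):

```python
# Numerical check of Lemma 3.1 on simulated data (illustrative names).
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # design with intercept
y = rng.normal(size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T        # hat matrix
r = y - H @ y                               # full-sample residuals

i = 5                                       # case to delete
Xi, yi = np.delete(X, i, axis=0), np.delete(y, i)
beta_i = np.linalg.lstsq(Xi, yi, rcond=None)[0]
r_del = y - X @ beta_i                      # residuals with case i deleted

# Lemma 3.1: the same residuals from full-sample quantities only
r_lemma = r + H[:, i] * r[i] / (1.0 - H[i, i])
print(np.max(np.abs(r_del - r_lemma)))      # difference at floating-point noise level
```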
B. Proof of Corollary 3.1
To obtain this result, without loss of generality we assume \(\overline {\bf{x}} = 0\). It is straightforward to express \({\widehat{\rm{\Sigma }}_{{r^{\left( i \right)}}{{\bf{x}}_{\left( i \right)}}{{\bf{x}}_{\left( i \right)}}}} = {1 \over {n - 1}}\left( {\sum\nolimits_{j = 1}^n {\widehat{r}_j^{\left( i \right)}{{\bf{x}}_j}{\bf{x}}_j^\prime - \widehat{r}_i^{\left( i \right)}{{\bf{x}}_i}{\bf{x}}_i^\prime } } \right)\), where xj is a p-dimensional random vector for j = 1, ⋯, n. To simplify this expression, we apply Lemma 3.1. Then we have
$$\widehat{\rm{\Sigma}}_{r^{\left( i \right)}{\bf{x}}_{\left( i \right)}{\bf{x}}_{\left( i \right)}} = {n \over {n - 1}}\left( \widehat{\rm{\Sigma}}_{r{\bf{xx}}} + \widehat{r}_i^{\left( i \right)}\,\widehat{\rm{\Sigma}}_{\widetilde{h}_i{\bf{xx}}} \right),$$
thus completing the proof.
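The corollary's matrix identity, \(\widehat\Sigma_{r^{(i)}{\bf x}_{(i)}{\bf x}_{(i)}} = \frac{n}{n-1}(\widehat\Sigma_{r{\bf xx}} + \widehat{r}_i^{(i)}\widehat\Sigma_{\widetilde{h}_i{\bf xx}})\), is exact and can also be verified numerically. In this sketch (illustrative names; the 1/n weighted-covariance convention follows the definitions in the appendix) the left-hand side is built from a genuine leave-one-out fit:

```python
# Numerical check of the identity behind Corollary 3.1 (illustrative names).
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 3
Z = rng.normal(size=(n, p))
X = np.column_stack([np.ones(n), Z])
y = rng.normal(size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T
r = y - H @ y
i = 7
Xi, yi = np.delete(X, i, axis=0), np.delete(y, i)
beta_i = np.linalg.lstsq(Xi, yi, rcond=None)[0]
r_del = y - X @ beta_i                      # residuals with case i deleted
ri_del = r_del[i]                           # deleted residual r_i^(i)

xc = Z - Z.mean(axis=0)                     # centered predictors (x-bar = 0 WLOG)
lhs = (np.einsum('j,jk,jl->kl', r_del, xc, xc)
       - ri_del * np.outer(xc[i], xc[i])) / (n - 1)

Sig_rxx = np.einsum('j,jk,jl->kl', r, xc, xc) / n
h_tilde = H[:, i] - np.eye(n)[:, i]         # i-th column of H - I
Sig_hxx = np.einsum('j,jk,jl->kl', h_tilde, xc, xc) / n
rhs = n / (n - 1) * (Sig_rxx + ri_del * Sig_hxx)
print(np.max(np.abs(lhs - rhs)))            # floating-point noise only
```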
C. Proof of Theorem 3.1
To obtain this result, without loss of generality we assume that \({\widehat{\rm{\Sigma }}_{\bf{x}}} = {\rm{I}}\) and \(\left\| {{{\widehat{b}}_{rj}}} \right\| = 1\). Let \({\rm{\Delta }}{\widehat{b}_{rj}}\) be the vector component of \(\widehat{b}_{rj}^{\left( i \right)}\) orthogonal to \({{{\widehat{b}}_{rj}}}\); then there exists a constant c such that \(c\widehat{b}_{rj}^{\left( i \right)} = {\widehat{b}_{rj}} + {\rm{\Delta }}{\widehat{b}_{rj}}\) and \(\cos \widehat\theta _{rj}^{(i)} = (c\widehat{b}_{rj}^{(i)},{\widehat{b}_{rj}})/\left\| {c\widehat{b}_{rj}^{(i)}} \right\|\left\| {{{\widehat{b}}_{rj}}} \right\|\), where (·, ·) denotes the inner product. Suppose that \(\theta _{rj}^{\left( i \right)}\) is positive; then it can be shown that \(\cos \widehat\theta _{rj}^{\left( i \right)} = 1/{\left( {1 + {{\left\| {{\rm{\Delta }}{{\widehat{b}}_{rj}}} \right\|}^2}} \right)^{{1 \over 2}}}\), \(\sin \widehat\theta _{rj}^{\left( i \right)} = \left\| {{\rm{\Delta }}{{\widehat{b}}_{rj}}} \right\|/{\left( {1 + {{\left\| {{\rm{\Delta }}{{\widehat{b}}_{rj}}} \right\|}^2}} \right)^{{1 \over 2}}}\), and \(\widehat\theta _{rj}^{\left( i \right)} = \left\| {{\rm{\Delta }}{{\widehat{b}}_{rj}}} \right\| + O\left( {{{\left\| {{\rm{\Delta }}{{\widehat{b}}_{rj}}} \right\|}^3}} \right)\). Thus we derive \(\widehat\theta _{rj}^{\left( i \right)} \approx \left\| {{\rm{\Delta }}{{\widehat{b}}_{rj}}} \right\|\).
We proceed with this proof by evaluating the differential \({d{{\widehat{b}}_{rj}}}\) of the eigenvector \({{{\widehat{b}}_{rj}}}\) of the sample weighted covariance matrix \({\widehat{\rm{\Sigma }}_{r{\bf{xx}}}}\) with eigenvalue \({\widehat\lambda _j}\), which is defined as
$$d{\widehat{b}_{rj}} = \sum\limits_{k \ne j} {{{\widehat{b}_{rk}^\prime \left( d{{\widehat{\rm{\Sigma}}}_{r{\bf{xx}}}} \right){{\widehat{b}}_{rj}}} \over {{{\widehat\lambda }_j} - {{\widehat\lambda }_k}}}\,{{\widehat{b}}_{rk}}}$$
for j, k = 1, ⋯, p (see Kurt et al. 1979). To asymptotically obtain the deviation angle estimate \(\widehat\theta _{rj}^{\left( i \right)}\) by applying \({\rm{\Delta }}{\widehat\Sigma _{r{\bf{xx}}}} \approx \widehat{r}_i^{\left( i \right)}{\widehat\Sigma _{\widetilde{{h_i}}{\bf{xx}}}}\) from Corollary 3.1 as n is sufficiently large, we can verify that
\(\widehat\theta _{rj}^{\left( i \right)} \approx \left| {\widehat{r}_i^{\left( i \right)}} \right|{c_j}\left( {\bf{x}} \right)\), where the scalar function \({c_j}\left( {\bf{x}} \right) = {\left( {\sum\nolimits_{k \ne j} {{{\left( {{{\widehat{b}_{rj}^\prime {{\widehat{\rm{\Sigma }}}_{{{\widetilde{h}}_i}{\bf{xx}}}}{{\widehat{b}}_{rk}}} \over {{{\widehat\lambda }_j} - {{\widehat\lambda }_k}}}} \right)}^2}} } \right)^{{1 \over 2}}}\). Similarly, suppose that \(\theta _{rj}^{\left( i \right)}\) is negative; then \(\widehat\theta _{rj}^{\left( i \right)} \approx - \left| {\widehat{r}_i^{\left( i \right)}} \right|{c_j}\left( {\bf{x}} \right)\). As a result, we obtain
$$\widehat\theta _{rj}^{\left( i \right)} \approx {\rm{sign}}\left( \theta _{rj}^{\left( i \right)} \right)\left| {\widehat{r}_i^{\left( i \right)}} \right|{c_j}\left( {\bf{x}} \right),$$
where sign(z) = 1 if z > 0, −1 if z < 0, and 0 otherwise, thus completing the proof of this theorem.
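The first-order eigenvector expansion underlying this proof can be checked against a finite perturbation. The sketch below (our own illustrative code, not the paper's) compares the predicted angle \(\|\Delta\widehat{b}_{rj}\|\) with the actual angle between the jth eigenvectors of a symmetric matrix and a slightly perturbed copy:

```python
# Finite-perturbation check of the first-order eigenvector expansion
# d b_j = sum_{k!=j} (b_k' dA b_j)/(lam_j - lam_k) b_k  (illustrative names).
import numpy as np

rng = np.random.default_rng(2)
p = 4
A = rng.normal(size=(p, p)); A = (A + A.T) / 2   # symmetric matrix
E = rng.normal(size=(p, p)); E = (E + E.T) / 2   # symmetric perturbation direction
eps = 1e-4

lam, B = np.linalg.eigh(A)
lam2, B2 = np.linalg.eigh(A + eps * E)

j = 1
# predicted first-order deviation angle ||Delta b_j||
coef = np.array([B[:, k] @ E @ B[:, j] / (lam[j] - lam[k])
                 for k in range(p) if k != j])
theta_pred = eps * np.linalg.norm(coef)

# actual angle between the j-th eigenvectors (sign-aligned)
c = abs(B[:, j] @ B2[:, j])
theta_act = np.arccos(min(c, 1.0))
print(theta_pred, theta_act)                     # agree to first order in eps
```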
D. Algorithmic Description
We present the technical details of the algorithm for the absolute deviation angle estimate \(\left| {\widehat\theta _{rj}^{\left( i \right)}} \right|\) for fixed j. Suppose that the data consist of n observations, yi and xi, i = 1, ⋯, n.
1.
We begin by fitting a multiple linear regression of y against x and then constructing the eigenvalue decomposition
$${\widehat{\rm{\Sigma }}_{r{\bf{xx}}}}{\widehat{b}_{rj}} = {\widehat\lambda _j}{\widehat{\rm{\Sigma }}_{\bf{x}}}{\widehat{b}_{rj}}$$
to obtain the estimated r-based PHD directions \({\widehat{b}_{rj}}\) and eigenvalues \({\widehat\lambda _j}\), j = 1, ⋯, p. The hat matrix H = X(X′X)−1X′ can be obtained from the fitted values for the multiple linear regression of each column vector of the identity matrix I against x. The vector \({\widetilde{h}_i}\) is the ith column of the difference between H and I. The deleted residual \(\widehat{r}_i^{\left( i \right)}\) is equal to \({\widehat{r}_i}{\rm{/}}\left( {1 - {h_{ii}}} \right)\).
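Step 1 above can be sketched in NumPy as follows. This is our own illustrative code (function and variable names are not the paper's); the generalized eigenproblem is solved by whitening with \(\widehat\Sigma_{\bf x}^{-1/2}\), one standard way to implement it:

```python
# Sketch of Step 1: residuals, weighted covariance, and the generalized
# eigenproblem  Sig_rxx b = lam Sig_x b  via whitening (illustrative names).
import numpy as np

def phd_directions(x, y):
    n, p = x.shape
    X = np.column_stack([np.ones(n), x])       # design with intercept
    H = X @ np.linalg.inv(X.T @ X) @ X.T       # hat matrix
    r = y - H @ y                              # OLS residuals
    r_del = r / (1.0 - np.diag(H))             # deleted residuals r_i^(i)

    xc = x - x.mean(axis=0)
    Sig_x = xc.T @ xc / n
    Sig_rxx = np.einsum('j,jk,jl->kl', r, xc, xc) / n

    # whiten: symmetric inverse square root of Sig_x
    w, U = np.linalg.eigh(Sig_x)
    S = U @ np.diag(w ** -0.5) @ U.T
    lam, V = np.linalg.eigh(S @ Sig_rxx @ S)
    B = S @ V                                  # columns: r-based PHD directions
    return lam, B, H, r, r_del

rng = np.random.default_rng(3)
x = rng.normal(size=(100, 3))
y = x[:, 0] ** 2 + 0.1 * rng.normal(size=100)
lam, B, H, r, r_del = phd_directions(x, y)
# each column b of B satisfies Sig_rxx b = lam Sig_x b up to numerical error
```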
2.
For each i, we compute the scalar function cj(x) using the results in Step 1:
$${c_j}({\bf{x}}) = {\left( {\sum\limits_{k \ne j} {{{\left( {{{\widehat{b}_{rj}^\prime {{\widehat{\rm{\Sigma }}}_{{{\tilde h}_i}{\bf{xx}}}}{{\widehat{b}}_{rk}}} \over {{{\widehat\lambda }_j} - {{\widehat\lambda }_k}}}} \right)}^2}} } \right)^{{1 \over 2}}}$$where \({\widehat{\rm{\Sigma }}_{{{\tilde h}_i}{\bf{xx}}}} = {1 \over n}{\sum\nolimits_{j = 1}^n {{{\widetilde{h}}_{ij}}\left( {{{\bf{x}}_j} - \overline {\bf{x}} } \right)\left( {{{\bf{x}}_j} - \overline {\bf{x}} } \right)} ^\prime }\), and \({{{\widetilde{h}}_{ij}}}\) is the jth component of \({{{\widetilde{h}}_i}}\). Therefore, we obtain the absolute deviation angle estimate
$$\left| {\widehat\theta _{rj}^{\left( i \right)}} \right| \approx \left| {\widehat{r}_i^{\left( i \right)}} \right|{c_j}\left( {\bf{x}} \right)$$
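Step 2 can be sketched likewise. The self-contained function below (illustrative names; it repeats the Step-1 computations so it runs on its own) returns one \(\left|\widehat\theta_{rj}^{(i)}\right|\) per case; cases with the largest angles are the candidates for trimming:

```python
# Sketch of Step 2: c_j(x) and |theta_rj^(i)| for every case i (illustrative names).
import numpy as np

def deviation_angles(x, y, j=0):
    n, p = x.shape
    X = np.column_stack([np.ones(n), x])
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    r = y - H @ y
    r_del = r / (1.0 - np.diag(H))             # deleted residuals r_i^(i)

    xc = x - x.mean(axis=0)
    Sig_x = xc.T @ xc / n
    Sig_rxx = np.einsum('m,mk,ml->kl', r, xc, xc) / n
    w, U = np.linalg.eigh(Sig_x)
    S = U @ np.diag(w ** -0.5) @ U.T
    lam, V = np.linalg.eigh(S @ Sig_rxx @ S)   # Step 1: directions and eigenvalues
    B = S @ V

    theta = np.empty(n)
    for i in range(n):
        h_tilde = H[:, i] - np.eye(n)[:, i]    # i-th column of H - I
        Sig_h = np.einsum('m,mk,ml->kl', h_tilde, xc, xc) / n
        c_j = np.sqrt(sum(((B[:, j] @ Sig_h @ B[:, k]) / (lam[j] - lam[k])) ** 2
                          for k in range(p) if k != j))
        theta[i] = abs(r_del[i]) * c_j         # |theta_rj^(i)|
    return theta

rng = np.random.default_rng(4)
x = rng.normal(size=(60, 3))
y = x[:, 0] ** 2 + 0.1 * rng.normal(size=60)
theta = deviation_angles(x, y)   # one absolute deviation angle estimate per case
```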
Heng-Hui, L. A study of sensitivity analysis on the method of Principal Hessian Directions. Computational Statistics 16, 109–130 (2001). https://doi.org/10.1007/s001800100054