Threshold selection for multivariate heavy-tailed data

Abstract

Regular variation is often used as the starting point for modeling multivariate heavy-tailed data. A random vector is regularly varying if and only if its radial part R is regularly varying and is asymptotically independent of the angular part Θ as R goes to infinity. The conditional limiting distribution of Θ given that R is large characterizes the tail dependence of the random vector, and hence its estimation is the primary goal in applications. A typical strategy is to look at the angular components of the data for which the radial parts exceed some threshold. While a large class of methods has been proposed to model the angular distribution from these exceedances, the choice of threshold has scarcely been discussed in the literature. In this paper, we describe a procedure for choosing the threshold by formally testing the independence of R and Θ using a measure of dependence called distance covariance. We generalize the limit theorem for distance covariance to our unique setting and propose an algorithm which selects the threshold for R. This algorithm incorporates a subsampling scheme that is also applicable to weakly dependent data. Moreover, it avoids the heavy computational cost of calculating the distance covariance, a typical limitation of this measure. The performance of our method is illustrated on both simulated and real data.
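As a rough illustration of the procedure described above, the following minimal sketch computes the sample distance covariance (in the V-statistic form of Székely et al. 2007) between the radial parts and the angular parts of the observations whose radii exceed a candidate threshold. It is only a schematic analogue, not the authors' implementation: the function names are hypothetical, and the statistic actually used in the paper (its Eq. 9) is a weighted characteristic-function version applied to the rescaled radii R/r_n.

```python
# Schematic sketch (not the authors' code): distance covariance between the
# radial and angular parts of the exceedances over a candidate threshold r.
import numpy as np

def dcov2(x, y):
    """Squared sample distance covariance (V-statistic form of
    Szekely, Rizzo and Bakirov, 2007)."""
    x = x.reshape(len(x), -1)
    y = y.reshape(len(y), -1)
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)    # pairwise distances
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=2)
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()  # double centering
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    return (A * B).mean()

def exceedance_dcov(X, r):
    """Distance covariance between R and Theta over the exceedances {R > r}."""
    R = np.linalg.norm(X, axis=1)        # radial parts
    keep = R > r
    Theta = X[keep] / R[keep, None]      # angular parts on the unit sphere
    return dcov2(R[keep], Theta)
```

In the paper, such a dependence measure is monitored over a grid of candidate thresholds and its null distribution is calibrated by subsampling; the snippet above only isolates the single-threshold computation.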

References

  • Bickel, P.J., Wichura, M.J.: Convergence criteria for multiparameter stochastic processes and some applications. Ann. Math. Statist. 42, 1656–1670 (1971)

  • Billingsley, P.: Probability and Measure, 3rd edition. Wiley, New York (1995)

  • Davis, R.A., Mikosch, T.: The extremogram: a correlogram for extreme events. Bernoulli 15(4), 977–1009 (2009)

  • Davis, R.A., Mikosch, T., Cribben, I.: Towards estimating extremal serial dependence via the bootstrapped extremogram. J. Econometrics 170(1), 142–152 (2012)

  • Davis, R.A., Matsui, M., Mikosch, T., Wan, P.: Applications of distance covariance to time series. arXiv:1606.05481 (2017)

  • de Haan, L., de Ronde, J.: Sea and wind: multivariate extremes at work. Extremes 1, 7–46 (1998)

  • Doukhan, P.: Mixing: Properties and Examples. Springer, New York (1994)

  • Feuerverger, A.: A consistent test for bivariate dependence. Internat. Statist. Rev. 61(3), 419–433 (1993)

  • Fryzlewicz, P.: Wild binary segmentation for multiple change-point detection. Ann. Statist. 42(6), 2243–2281 (2014)

  • Jeon, S., Smith, R.L.: Dependence structure of spatial extremes using threshold approach. arXiv:1209.6344v1 (2014)

  • Joe, H., Smith, R.L., Weissman, I.: Bivariate threshold methods for extremes. J. Roy. Statist. Soc. Ser. B 54(1), 171–183 (1992)

  • Mallik, A., Sen, B., Banerjee, M., Michailidis, G.: Threshold estimation based on a p-value framework in dose-response and regression settings. Biometrika 98(4), 887–900 (2011)

  • Page, E.S.: Continuous inspection schemes. Biometrika 41(1), 100–115 (1954)

  • Resnick, S.I.: Hidden regular variation, second order regular variation and asymptotic independence. Extremes 5, 303–336 (2002)

  • Resnick, S.I.: Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer, New York (2007)

  • Stǎricǎ, C.: Multivariate extremes for models with constant conditional correlations. J. Empir. Finan. 6, 515–553 (1999)

  • Székely, G.J., Rizzo, M.L., Bakirov, N.K.: Measuring and testing dependence by correlation of distances. Ann. Statist. 35, 2769–2794 (2007)

Acknowledgements

The foreign exchange rate data are obtained from OANDA through the R package ‘qrmtools’. We would like to thank Bodhisattva Sen for helpful discussions. We would also like to thank the editor and referees for their many constructive and insightful comments.

Corresponding author

Correspondence to Phyllis Wan.

Additional information

This research was supported in part by MURI ARO (grant no. W911NF-12-10385).

Appendices

Appendix A: Proof of Theorem 1

Note that, from the definition of the empirical distance covariance in Eq. 9, the integrand can be expressed as

$$\begin{array}{@{}rcl@{}} C_n(s,t) &=& \frac{1}{n\hat p_n}\sum\limits_{j = 1}^n e^{isR_j/r_n + it^T\mathbf{\Theta}_j}\mathbf{1}_{\{R_j>r_n\}} \\ && \quad - \frac{1}{n\hat p_n}\sum\limits_{j = 1}^n e^{isR_j/r_n}\mathbf{1}_{\{R_j>r_n\}} \frac{1}{n\hat p_n}\sum\limits_{k = 1}^n e^{it^T\mathbf{\Theta}_k}\mathbf{1}_{\{R_k>r_n\}} \\ &=& \frac{1}{n\hat p_n}\sum\limits_{j = 1}^n \left( e^{isR_j/r_n} - \varphi_{{\frac{R}{r_n}}|r_n}(s)\right) \left( e^{it^T\mathbf{\Theta}_j} - \varphi_{{\Theta}|r_n}(t)\right) \mathbf{1}_{\{R_j>r_n\}} \\ &&-\, \frac{1}{n\hat p_n}\sum\limits_{j = 1}^n \left( e^{isR_j/r_n} - \varphi_{{\frac{R}{r_n}}|r_n}(s)\right)\mathbf{1}_{\{R_j>r_n\}} \ \frac{1}{n\hat p_n}\sum\limits_{k = 1}^n \left( e^{it^T\mathbf{\Theta}_k} - \varphi_{{\Theta}|r_n}(t)\right) \mathbf{1}_{\{R_k>r_n\}}. \end{array} $$

Writing \( U_{jn} \,=\, \left (e^{isR_{j}/r_{n}} - \varphi _{{\frac {R}{r_{n}}}|r_{n}}(s)\right )\mathbf {1}_{\{R_{j}>r_{n}\}}\), \(V_{jn} \,=\, \left (e^{it^{T}\mathbf {\Theta }_{j}} - \varphi _{{\Theta }|r_{n}}(t)\right )\mathbf {1}_{\{R_{j}>r_{n}\}}\), we have

$$C_{n}(s,t) = \frac{p_{n}}{\hat{p}_{n}}\,\frac{1}{n} \sum\limits_{j = 1}^{n} \frac{U_{jn}V_{jn}}{p_{n}} -\left( \frac{p_{n}}{\hat{p}_{n}}\right)^{2}\,\frac{1}{n}\sum\limits_{j = 1}^{n} \frac{U_{jn}}{p_{n}} \,\frac{1}{n} \sum\limits_{k = 1}^{n} \frac{V_{kn}}{p_{n}}. $$

Since \({\mathbb E} U_{jn} = {\mathbb E} V_{jn} = 0\) and \({\mathbb E} U_{jn}V_{jn}/p_{n} = \varphi _{{\frac {R}{r_{n}}},{\Theta }|r_{n}}(s,t)- \varphi _{{\frac {R}{r_{n}}}|r_{n}}(s)\varphi _{{\Theta }|r_{n}}(t)\), it is convenient to mean-correct the summands and obtain

$$\begin{array}{@{}rcl@{}} C_n(s,t) &=& \frac{p_n}{\hat{p}_n}\,\frac{1}{n} \sum\limits_{j = 1}^n \left( \frac{U_{jn}V_{jn}}{p_n} - \left( \varphi_{{\frac{R}{r_n}},{\Theta}|r_n}(s,t) - \varphi_{{\frac{R}{r_n}}|r_n}(s)\varphi_{{\Theta}|r_n}(t)\right)\right)\\ &&-\left( \frac{p_n}{\hat{p}_n}\right)^2\,\frac{1}{n}\sum\limits_{j = 1}^n \frac{U_{jn}}{p_n} \,\frac{1}{n} \sum\limits_{k = 1}^n \frac{V_{kn}}{p_n} \\ && + \frac{p_n}{\hat{p}_n}\,(\varphi_{{\frac{R}{r_n}},{\Theta}|r_n}(s,t) - \varphi_{{\frac{R}{r_n}}|r_n}(s)\varphi_{{\Theta}|r_n}(t)) \\ &=:& \left( \frac{p_n}{\hat{p}_n} \right) \tilde{E}_1 - \left( \frac{p_n}{\hat{p}_n} \right)^2 \tilde{E}_{21} \tilde{E}_{22} + \left( \frac{p_n}{\hat{p}_n} \right) \tilde{E}_3 \\ &=:& \left( \frac{p_n}{\hat{p}_n} \right) \tilde{E}_1 - \left( \frac{p_n}{\hat{p}_n} \right)^2 \tilde{E}_{2} + \left( \frac{p_n}{\hat{p}_n} \right) \tilde{E}_3 \end{array} $$

Note that \( \tilde {E}_{1}, \tilde {E}_{21}, \tilde {E}_{22}\) are averages of iid zero-mean random variables and \(\tilde {E}_{3}\) is non-random. We first prove the second part of Theorem 1. The first part of Theorem 1 follows easily in a similar fashion.

Proof

(Proof of Theorem 1(2))

In order to show (14), it suffices to establish that

$$ n\hat p_n {\int}_{{\mathbb R}^{d + 1}} \left( \frac{p_n}{\hat{p}_n} \right)^2 |\tilde{E}_1|^2 \mu(ds,dt) \stackrel{d}{\rightarrow} {\int}_{{\mathbb R}^{d + 1}} |Q(s,t)|^2\mu(ds,dt), $$
(25)

and

$$ \left|n\hat p_nT_n - n\hat p_n {\int}_{{\mathbb R}^{d + 1}} \left( \frac{p_n}{\hat{p}_n} \right)^2 |\tilde{E}_1|^2\mu(ds,dt)\right| \stackrel{p}{\rightarrow} 0, $$
(26)

where Eq. 26 follows from

$$ n\hat p_n {\int}_{{\mathbb R}^{d + 1}} \left( \frac{p_n}{\hat{p}_n} \right)^2|\tilde E_2|^2\mu(ds,dt) + n\hat p_n {\int}_{{\mathbb R}^{d + 1}} \left( \frac{p_n}{\hat{p}_n} \right)^2|\tilde E_3|^2\mu(ds,dt) \stackrel{p}{\rightarrow} 0. $$
(27)

Notice that

$$\begin{array}{@{}rcl@{}} {\mathbb E} \left|\frac{\hat p_n}{{p}_n}-1\right|^2 &=& {\mathbb E}\left|\frac{1}{n} \sum\limits_{j = 1}^n\left( \frac{ \mathbf{1}_{\{R_j>r_n\}}}{p_n} -1\right)\right|^2 = \frac{1}{n} {\mathbb E} \left|\frac{\mathbf{1}_{\{R_1>r_n\}}}{p_n} -1\right|^2\\ &\le& \frac{1}{np_n} O(1) + \frac{1}{n} O(1) \to 0. \end{array} $$

Hence \(\hat p_{n}/p_{n} \stackrel {p}{\rightarrow } 1\), and to prove Eqs. 25 and 27 it is equivalent to show that

$$ n p_n {\int}_{{\mathbb R}^{d + 1}} |\tilde E_1|^2\mu(ds,dt) \stackrel{d}{\rightarrow} {\int}_{{\mathbb R}^{d + 1}} |Q(s,t)|^2\mu(ds,dt) $$
(28)

and

$$ n p_n {\int}_{{\mathbb R}^{d + 1}} |\tilde E_2|^2\mu(ds,dt) + n p_n {\int}_{{\mathbb R}^{d + 1}} |\tilde E_3|^2\mu(ds,dt) \stackrel{p}{\rightarrow} 0. $$
(29)

We will show the convergence (28) in Proposition 1. By Eq. 13,

$$n p_{n} {\int}_{{\mathbb R}^{d + 1}} |\tilde E_{3}|^{2}\mu(ds,dt) \to 0. $$

so that Eq. 29 holds provided

$$ n p_n {\int}_{{\mathbb R}^{d + 1}} |\tilde E_2|^2\mu(ds,dt)\stackrel{p}{\rightarrow} 0, $$
(30)

which follows in a similar fashion as Proposition 1.□

Proposition 1

Assume μ satisfies

$${\int}_{{\mathbb R}^{d + 1}} (1\wedge |s|^{\beta})(1\wedge |t|^{2}) \mu(ds,dt) <\infty, $$

and that \(np_{n}\to\infty\) as n → ∞, then

$$ n p_n {\int}_{{\mathbb R}^{d + 1}} |\tilde E_1|^2\mu(ds,dt) \stackrel{d}{\rightarrow} {\int}_{{\mathbb R}^{d + 1}} |Q(s,t)|^2\mu(ds,dt), $$

where Q is a centered Gaussian process with covariance function (15).

Proof

(Proof of Proposition 1)

We first show that

$$ \sqrt{n p_n} \tilde E_1 \overset{d}\to Q(s,t), \quad \text{on } \mathcal{C}({\mathbb R}^{d + 1}) $$
(31)

which is implied by the convergence of the finite-dimensional distributions of \(\sqrt {n p_{n}} \tilde E_{1}(s,t)\) together with its tightness on \(\mathcal {C}({\mathbb R}^{d + 1})\).

Write

$$\sqrt{n p_{n}} \tilde E_{1} = \frac{1}{\sqrt{n}} \sum\limits_{j = 1}^{n} \left( \frac{U_{jn}V_{jn}}{\sqrt{p_{n}}} - \sqrt{p_{n}} (\varphi_{{\frac{R}{r_{n}}},{\Theta}|r_{n}}(s,t)- \varphi_{{\frac{R}{r_{n}}}|r_{n}}(s)\varphi_{{\Theta}|r_{n}}(t))\right) =: \frac{1}{\sqrt n} \sum\limits_{j = 1}^{n} Y_{jn}, $$

where the \(Y_{jn}\)’s are iid random variables with mean 0. For fixed (s,t), note that

$$\text{Var}(Y_{1n}) = {\mathbb E}|Y_{1n}|^{2} \,=\, \frac{{\mathbb E}|U_{1n}V_{1n}|^{2}}{p_{n}} (1+o(1)) \,=\, \frac{{\mathbb E}\mathbf{1}_{\{R_{1}>r_{n}\}}}{p_{n}}O(1) \, < \infty. $$

On the other hand, for any δ > 0,

$${\mathbb E}|Y_{1n}|^{2+\delta} \,=\, \frac{{\mathbb E}|U_{1n}V_{1n}|^{2+\delta}}{p_{n}^{1+\delta/2}} (1+o(1)) \,\le\, c\frac{{\mathbb E}\mathbf{1}_{\{R_{1}>r_{n}\}}}{p_{n}^{1+\delta/2}}(1+o(1)) = O(p_{n}^{-\delta/2}) $$
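Both estimates above use only that the two factors defining \(U_{1n}V_{1n}\) have modulus at most 2 on \(\{R_{1}>r_{n}\}\):

$$|U_{1n}V_{1n}| \le 4\,\mathbf{1}_{\{R_{1}>r_{n}\}}, \quad \text{so} \quad {\mathbb E}|U_{1n}V_{1n}|^{2+\delta} \le 4^{2+\delta}\,{\mathbb E}\mathbf{1}_{\{R_{1}>r_{n}\}} = c\,p_{n}, $$

and dividing by \(p_{n}^{1+\delta/2}\) gives the order \(O(p_{n}^{-\delta/2})\) displayed above.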

Then we can apply the central limit theorem for triangular arrays by checking the Lyapounov condition (see, e.g., Billingsley (1995)) for the Yjn’s:

$$\frac{ {\sum}_{j = 1}^{n} {\mathbb E}|Y_{jn}|^{2+\delta}}{\left( \text{Var}\left( {\sum}_{j = 1}^{n} Y_{jn}\right)\right)^{\frac{2+\delta}{2}}} = \frac{O(np_{n}^{-\frac{\delta}{2}})}{{n^{1+\frac{\delta}{2}}\text{Var}(Y_{1n})^{1+\frac{\delta}{2}}}} = O((np_{n})^{-\frac{\delta}{2}} ) \to 0. $$

It follows easily that for fixed (s,t),

$$\sqrt{n p_{n}} \tilde E_{1} \stackrel{d}{\rightarrow} Q(s,t). $$

The convergence of the finite-dimensional distributions can be obtained using the Cramér-Wold device, and the covariance function can be verified through direct calculation.

We now show the tightness of \(\sqrt {np_{n}} \tilde E_{1}\). Note that

$$\begin{array}{@{}rcl@{}} \tilde E_{1}(s,t) &=& \frac{1}{n}\sum\limits_{j = 1}^n \frac{\left( e^{isR_j/r_n} - \varphi_{{\frac{R}{r_n}}|r_n}(s)\right) \left( e^{it^T\mathbf{\Theta}_j} - \varphi_{{\Theta}|r_n}(t)\right) \mathbf{1}_{\{R_j>r_n\}}}{p_n} \\ && \quad- \left( \varphi_{{\frac{R}{r_n}},{\Theta}|r_n}(s,t)- \varphi_{{\frac{R}{r_n}}|r_n}(s)\varphi_{{\Theta}|r_n}(t)\right) \\ &=& \left( \frac{1}{n}\sum\limits_{j = 1}^n \frac{e^{isR_j/r_n+it^T\mathbf{\Theta}_j} \mathbf{1}_{\{R_j>r_n\}}}{p_n} - \varphi_{{\frac{R}{r_n}},{\Theta}|r_n}(s,t)\right) \\ && \quad - \left( \frac{1}{n}\sum\limits_{j = 1}^n \frac{e^{isR_j/r_n} \mathbf{1}_{\{R_j>r_n\}}}{p_n} - \varphi_{{\frac{R}{r_n}}|r_n}(s)\right)\varphi_{{\Theta}|r_n}(t) \\ && \qquad - \left( \frac{1}{n}\sum\limits_{j = 1}^n \frac{e^{it^T\mathbf{\Theta}_j} \mathbf{1}_{\{R_j>r_n\}}}{p_n} - \varphi_{{\Theta}|r_n}(t)\frac{\hat p_n}{p_n}\right) \varphi_{{\frac{R}{r_n}}|r_n}(s) \\ &=:& \tilde E_{11} + \tilde E_{12} + \tilde E_{13}. \end{array} $$

Without loss of generality, we show the tightness for \(\sqrt {np_{n}} \tilde E_{11}\); that of \(\sqrt {np_{n}} \tilde E_{12}\) and \(\sqrt {np_{n}} \tilde E_{13}\) follows from the same argument.

First we introduce some notation following Bickel and Wichura (1971). Fix \((s,t),(s^{\prime },t^{\prime }) \in {\mathbb R}^{d + 1}\) where \(s < s^{\prime}\) and \(t_k < t_k^{\prime}\) for \(k = 1,\ldots,d\). Let B be the subset of \({\mathbb R}^{d + 1}\) of the form

$$B:= \left( (s,t),(s^{\prime},t^{\prime})\right] = (s,s^{\prime}] \times \prod\limits_{k = 1}^{d} (t_{k},t_{k}^{\prime}] \subset {\mathbb R}^{d + 1}. $$

For ease of notation, we suppress the dependence of B on \((s,t)\), \((s^{\prime},t^{\prime})\).

$$\begin{array}{@{}rcl@{}} \tilde E_{11}(B) &:=&\frac{1}{n}\sum\limits_{j = 1}^n \sum\limits_{z_0 = 0,1} \sum\limits_{z_1 = 0,1} {\cdots} \sum\limits_{z_d = 0,1} (-1)^{d + 1-{\sum}_j z_j} \\ && \qquad\tilde E_{11}\left( s+z_0(s^{\prime}-s), t_1+z_1(t_1^{\prime}-t_1) ,\ldots, t_d+z_d(t_d^{\prime}-t_d)\right). \end{array} $$

By a sufficient condition of Theorem 3 of Bickel and Wichura (1971), the tightness of \(\sqrt {np_{n}} \tilde E_{11}\) is implied if the following statement holds for any \((s,t)\), \((s^{\prime},t^{\prime})\) and corresponding B,

$${\mathbb E}|\sqrt{np_n}\tilde E_{11}(B)|^2 \le c|s-s^{\prime}|^{\beta} \prod\limits_{k = 1}^d|t_k-t_k^{\prime}|^{\beta}, \quad \text{for some }\beta>1. $$

It follows that

$$\begin{array}{@{}rcl@{}} &&{\mathbb E} \left| \sqrt{n p_n} \left( \tilde E_{11}(B)\right) \right|^2 \\ &=& np_n {\mathbb E}\left| \sum\limits_{z_0 = 0,1}{\cdots} \sum\limits_{z_d = 0,1}(-1)^{d + 1-{\sum}_j {z_j}}\frac{1}{n}\sum\limits_{j = 1}^n e^{i(s+z_0(s^{\prime}-s)){R_j}/r_n}\prod\limits_{k = 1}^d e^{i(t_k+z_k(t^{\prime}_k-t_k)){\Theta}_{jk}}\frac{ \mathbf{1}_{\{R_j>r_n\}}}{p_n} \right.\\ && \quad\quad\quad \left.- \sum\limits_{z_0 = 0,1}{\cdots} \sum\limits_{z_d = 0,1}(-1)^{d + 1-\sum\limits_j {z_j}}{\mathbb E} \left[\left( e^{i(s+z_0(s^{\prime}-s))R/r_n}\right)\prod\limits_{k = 1}^d e^{i(t_k+z_k(t^{\prime}_k-t_k)){\Theta}_{k}}\frac{ \mathbf{1}_{\{R>r_n\}}}{p_n} \right]\right|^2\\ &=& np_n {\mathbb E}\left| \frac{1}{n}\sum\limits_{j = 1}^n (e^{isR_j/r_n} - e^{is^{\prime}R_j/r_n})\prod\limits_{k = 1}^d (e^{it_k{\Theta}_{jk}} - e^{it^{\prime}_k{\Theta}_{jk}})\frac{ \mathbf{1}_{\{R_j>r_n\}}}{p_n} \right.\\ && \quad\quad\quad \left.- {\mathbb E} \left[(e^{isR/r_n} - e^{is^{\prime}R/r_n})\prod\limits_{k = 1}^d (e^{it_k{\Theta}_k} - e^{it^{\prime}_k{\Theta}_k})\frac{ \mathbf{1}_{\{R>r_n\}}}{p_n}\right]\right|^2 \\ &=& p_n \text{Var}\left( (e^{isR/r_n} - e^{is^{\prime}R/r_n})\prod\limits_{k = 1}^d (e^{it_k{\Theta}_{k}} - e^{it^{\prime}_k{\Theta}_{k}})\frac{ \mathbf{1}_{\{R>r_n\}}}{p_n} \right) \\ &\le& {\mathbb E} \left[\left|(e^{isR/r_n} - e^{is^{\prime}R/r_n})\prod\limits_{k = 1}^d (e^{it_k{\Theta}_{k}} - e^{it^{\prime}_k{\Theta}_{k}})\right|^2\middle| R>r_n\right]. \end{array} $$
(32)

From a Taylor series argument,

$$|e^{ix} - e^{ix^{\prime}}|^{2} \le c\,(1\wedge|x-x^{\prime}|^{2}) \le c\,(1\wedge|x-x^{\prime}|^{\beta}) \le c|x-x^{\prime}|^{\beta}, \quad \text{for any } \beta\in(0,2]. $$
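For completeness, the displayed chain follows from two elementary bounds: squaring

$$|e^{ix} - e^{ix^{\prime}}| = |e^{i(x-x^{\prime})} - 1| \le 2\wedge|x-x^{\prime}|, $$

and then using \(1\wedge u^{2} \le 1\wedge u^{\beta} \le u^{\beta}\) for \(u\ge0\) and \(\beta\in(0,2]\), applied with \(u = |x-x^{\prime}|\).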

Hence for any β ∈ (1, 2 ∧ α),

$$\begin{array}{@{}rcl@{}} {\mathbb E} \left| \sqrt{n p_n} \tilde E_{11}(B) \right|^2 &\le& c|s-s^{\prime}|^{\beta} \prod\limits_{k = 1}^d|t_k-t_k^{\prime}|^{\beta} {\mathbb E}\left[(R/r_n)^{\beta} \prod\limits_{k = 1}^d|{\Theta}_k|^{\beta}|R>r_n\right] \\ &<& c|s-s^{\prime}|^{\beta} \prod\limits_{k = 1}^d|t_k-t_k^{\prime}|^{\beta}, \end{array} $$

since the \(|{\Theta}_k|^{\beta}\)’s are bounded and \({\sup _{n}}{\mathbb E}[(R/r_{n})^{\beta }|R>r_{n}]<\infty \) by the regular variation assumption. This proves the tightness.

Define the bounded set

$$K_{\delta} = \{(s,t)|\ \delta < |s|<1/\delta,\ \delta <|t| <1/\delta\}, \quad \text{for } \delta<0.5. $$

Then, using (31), we have from the continuous mapping theorem,

$$ np_n{\int}_{K_{\delta}} |\tilde E_1|^2 \mu(ds,dt) \stackrel{d}{\rightarrow} {\int}_{K_{\delta}} |Q(s,t)|^2 \mu(ds,dt). $$
(33)

On the other hand, for any β < 2 ∧ α, we have

$$\begin{array}{@{}rcl@{}} &&{\mathbb E}|\sqrt{n p_n} \tilde E_1|^2 \\ &=& np_n {\mathbb E} \left|\frac{1}{n} \sum\limits_{j = 1}^n \left( \frac{U_{jn}V_{jn}}{{p_n}} - {\mathbb E}\left[\frac{U_{jn}V_{jn}}{{p_n}}\right]\right)\right|^2 \\ &\le& \frac{{\mathbb E}|U_{jn}V_{jn} - {\mathbb E} U_{jn}V_{jn}|^2}{p_n} \\ &\le& \frac{c\, {\mathbb E}|U_{jn}V_{jn}|^2}{p_n} \\ &= & \frac{c\,{\mathbb E}\left[\left|e^{isR_j/r_n} - \varphi_{{\frac{R}{r_n}}|r_n}(s)\right|^2\left|e^{it^T\mathbf{\Theta}_j} - \varphi_{{\Theta}|r_n}(t) \right|^2 \mathbf{1}_{\{R_j>r_n\}}\right]}{p_n} \\ &\le & { \frac{c\,{\mathbb E}\left[\left( \left|e^{isR_j/r_n} -1\right|^2+\left| \varphi_{{\frac{R}{r_n}}|r_n}(s)-1 \right|^2\right) \left( \left|e^{it^T\mathbf{\Theta}_j}-1\right|^2 + \left|\varphi_{{\Theta}|r_n}(t)-1 \right|^2\right) \mathbf{1}_{\{R_j>r_n\}}\right]}{p_n} }\\ &\le & { \frac{c\,{\mathbb E}\left[\left( 1 \wedge \left|\frac{sR_j}{r_n}\right|^2 + {\mathbb E}\left[1 \wedge \left|\frac{sR_j}{r_n}\right|^2 |\frac{R}{r_n}>1\right] \right)\left( 1\wedge|t\mathbf{\Theta}_j|^2 + {\mathbb E}\left[1\wedge|t\mathbf{\Theta}_j|^2|\frac{R}{r_n}>1\right]\right) \mathbf{1}_{\{R_j>r_n\}}\right]}{p_n} }\\ &\le & { \frac{c\,{\mathbb E}\left[\left( 1 \wedge \left|\frac{sR_j}{r_n}\right|^{\beta} + {\mathbb E}\left[1 \wedge \left|\frac{sR_j}{r_n}\right|^{\beta} |\frac{R}{r_n}>1\right] \right)\left( 1\wedge|t\mathbf{\Theta}_j|^2 + {\mathbb E}\left[1\wedge|t\mathbf{\Theta}_j|^2|\frac{R}{r_n}>1\right]\right) \mathbf{1}_{\{R_j>r_n\}}\right]}{p_n} }\\ &\le & \frac{c\,{\mathbb E}\left[\left( 1 \wedge |s|^{\beta} \right) \left( \left|\frac{R_j}{r_n}\right|^{\beta} + {\mathbb E}\left[\left|\frac{R}{r_n}\right|^{\beta} |\frac{R}{r_n}>1\right] \right)\left( 1\wedge|t|^2\right) \mathbf{1}_{\{R_j>r_n\}}\right]}{p_n} \\ &\le& c\,{\mathbb E}\left[\left( 1 \wedge |s|^{\beta} (|R_j/r_n|^{\beta} + {\mathbb E}[|R/r_n|^{\beta} |R>r_n]) \right)\left( 1\wedge|t|^2\right) |R>r_n\right] \\ &\le& c(1\wedge |s|^{\beta})(1\wedge |t|^{2}). \end{array} $$
(34)

Therefore for any 𝜖 > 0,

$$\begin{array}{@{}rcl@{}} \lim\limits_{\delta\to 0}\limsup\limits_{n\to\infty} {\mathbb P}\left[np_n{\int}_{K^c_{\delta}} |\tilde E_1|^2 \mu(ds,dt) >\epsilon\right] &\le& \frac{1}{\epsilon} \lim\limits_{\delta\to 0}\limsup\limits_{n\to\infty} {\int}_{K^c_{\delta}} {\mathbb E}|\sqrt{np_n}\tilde E_1|^2 \mu(ds,dt) \\ &\le& \frac{1}{\epsilon} \lim\limits_{\delta\to 0}\limsup\limits_{n\to\infty} {\int}_{K^c_{\delta}} c (1\wedge |s|^{\beta})(1\wedge |t|^{2}) \mu(ds,dt) \to 0 \end{array} $$

by the dominated convergence theorem. This combined with Eq. 33 shows the convergence of \( np_{n}\int |\tilde E_{1}|^{2} \mu (ds,dt)\) to \(\int |Q(s,t)|^{2} \mu (ds,dt)\), and hence completes the proof of the proposition. □

Proof

(Proof of Theorem 1(2) (cont.))

Now it remains to show (30). Similar to the proof of Proposition 1, we can show that

$$\sqrt{np_{n}} \tilde E_{21} \stackrel{d}{\rightarrow} Q^{\prime} $$

for a centered Gaussian process \(Q^{\prime}\), and

$$\tilde E_{22} \stackrel{p}{\rightarrow} 0. $$

Hence

$$\sqrt{np_{n}} \tilde E_{2} = \sqrt{np_{n}} \tilde E_{21}\tilde E_{22} \stackrel{p}{\rightarrow} 0. $$

The argument then follows similarly from the continuous mapping theorem and bounding the tail integrals.□

Proof (Proof of Theorem 1(1))

Similar to the proof of Theorem 1(2), it suffices to show that

$$ \int|\tilde E_i|^2\mu(ds,dt) \stackrel{p}{\rightarrow} 0,\quad i = 1,2,3. $$
(35)

The convergence (35) for i = 1, 2 follows trivially from the more general results (28) and (30) in the proof of Theorem 1(2). Hence it suffices to show

$$ \int|\tilde E_3|^2\mu(ds,dt) \to 0, $$
(36)

where we recall that \(\tilde E_{3}:=\varphi _{{\frac {R}{r_{n}}},{\Theta }|r_{n}}(s,t)- \varphi _{{\frac {R}{r_{n}}}|r_{n}}(s)\varphi _{{\Theta }|r_{n}}(t)\) is non-random.

Let \(P_{{\frac {R}{r_{n}}},{\Theta }|r_{n}}(\cdot ) = P\left [\left (\frac {R}{r_{n}},\mathbf {\Theta }\right ) \in \cdot | \frac {R}{r_{n}} > 1 \right ] \) and \(P_{{\frac {R}{r_{n}}}|r_{n}},P_{{\Theta }|r_{n}}\) be the corresponding marginal measures. Then from Eq. 3,

$$P_{{\frac{R}{r_{n}}},{\Theta}|r_{n}} - P_{{\frac{R}{r_{n}}}|r_{n}}P_{{\Theta}|r_{n}} {\,\stackrel{v}{\rightarrow}\, \nu_{\alpha} \times S-\nu_{\alpha} \times S = 0}, $$

and hence for fixed (s,t),

$$\tilde E_{3}(s,t) = \int e^{is{T} + it^{T}\mathbf{\Theta}} \,(P_{{\frac{R}{r_{n}}},{\Theta}|r_{n}} - P_{{\frac{R}{r_{n}}}|r_{n}}P_{{\Theta}|r_{n}})(d{T},d\mathbf{\Theta}) \to 0. $$

For any β < 2 ∧ α, using the same argument in Eq. 34,

$$|\tilde E_{3}|^{2} = \left|\frac{{\mathbb E} U_{jn}V_{jn}}{p_{n}}\right|^{2} \le c(1\wedge |s|^{\beta})(1\wedge |t|^{2}). $$

Then Eq. 36 follows from Eq. 12 and dominated convergence. This concludes the proof.□

Appendix B: Proof of Theorem 2

Following the same notation and steps as in the proof of Theorem 1 in Appendix A, it suffices to prove the following convergences for the mixing case:

$$ \frac{\hat p_n}{p_n} \stackrel{p}{\rightarrow}1, $$
(37)
$$ n p_n {\int}_{{\mathbb R}^{d + 1}} |\tilde E_1|^2\mu(ds,dt) \stackrel{d}{\rightarrow} {\int}_{{\mathbb R}^{d + 1}} |Q^{\prime}(s,t)|^2\mu(ds,dt) $$
(38)

and

$$ n p_n {\int}_{{\mathbb R}^{d + 1}} |\tilde E_2|^2\mu(ds,dt)\stackrel{p}{\rightarrow} 0. $$
(39)

We prove (37) and (38) in Propositions 2 and 3, respectively. The proof of Eq. 39 follows in a similar fashion. The proofs of both propositions rely on the following lemma.

Throughout this proof we make use of the result that if \(\{Z_t\}\) is stationary and α-mixing with mixing coefficients \(\{\alpha_h\}\), then

$$ |\text{cov}(Z_0,Z_h)| \le c \alpha_h^{\delta} \left( {\mathbb E}|Z_0|^{2/(1-\delta)}\right)^{1-\delta}, \quad \text{for any } \delta \in (0,1), $$
(40)

see Section 1.2.2, Theorem 3(a) of Doukhan (1994).

Lemma 1

Let \(\{\textbf{X}_t\}\) be a multivariate stationary time series that is regularly varying and α-mixing with mixing coefficients \(\{\alpha_h\}\). For a sequence \(r_n\), set \(p_n = {\mathbb P}(\|\textbf{X}_0\| > r_n)\). Let \(f_1, f_2\) be bounded functions which vanish outside \(\overline {{\mathbb R}}^{d}\backslash B_{1}(\bf 0)\), where \(B_{1}(\bf 0)\) is the open unit ball \(\{\textbf{x}\,|\,\|\textbf{x}\| < 1\}\), with sets of discontinuity of measure zero. Set

$$S_{n}^{(i)} = \sum\limits_{t = 1}^{n} \left( f_{i}\left( \frac{\textbf{X}_{t}}{r_{n}}\right) - {\mathbb E} f_{i}\left( \frac{\textbf{X}_{0}}{r_{n}}\right) \right), \quad i = 1,2. $$

Assume that condition (M) holds for \(\{\alpha_h\}\) and \(\{r_n\}\). Then

$$ \frac{1}{\sqrt{np_n}} (S_n^{(1)},S_n^{(2)})^T \stackrel{d}{\rightarrow} N(\bf0,{\Sigma}), $$
(41)

where the covariance matrix \({\Sigma} = [{\Sigma}_{ij}]_{i,j = 1,2} = [\sigma^2(f_i,f_j)]_{i,j = 1,2}\) with

$$ \sigma^2(f_1,f_2):= \sigma^2_0(f_1,f_2) + 2 \sum\limits_{h = 1}^{\infty} \sigma_h^2(f_1,f_2) $$
(42)

and

$$ \sigma_h^2(f_1,f_2) = \int f_1f_2 d\mu_h,\quad h\ge0. $$
(43)

In particular,

$$\frac{1}{{np_{n}}} (S_{n}^{(1)},S_{n}^{(2)})^{T} \stackrel{p}{\rightarrow} \bf0. $$

The proof of Lemma 1 is provided after the proofs of the propositions.

Proposition 2

Assume that condition (M) holds, then

$$\frac{\hat p_{n}}{{p}_{n}} \stackrel{p}{\rightarrow}1. $$

Proof

We have

$$\frac{\hat p_{n}}{{p}_{n}}-1 = \frac{1}{n} \sum\limits_{j = 1}^{n}\left( \frac{ \mathbf{1}_{\{R_{j}>r_{n}\}}}{p_{n}} -1\right) = \frac{1}{np_{n}} \sum\limits_{j = 1}^{n} (\mathbf{1}_{\{R_{j}>r_{n}\}} - p_{n}). $$

Apply Lemma 1 to \(f(\textbf{x}) = \mathbf{1}_{\{\|\textbf{x}\|> 1\}}\) and the result follows. □

Proposition 3

Assume that condition (M) holds, and that μ and \(\{r_n\}\) satisfy (12) and (13), respectively, then

$$n p_{n} {\int}_{{\mathbb R}^{d + 1}} |\tilde E_{1}|^{2}\mu(ds,dt) \stackrel{d}{\rightarrow} {\int}_{{\mathbb R}^{d + 1}} |Q^{\prime}(s,t)|^{2}\mu(ds,dt), $$

where \(Q^{\prime}\) is a centered Gaussian process.

Proof

Let us first establish the convergence of \(\sqrt {np_{n}}\tilde E_{1}(s,t)\) for fixed (s,t). Take

$$\begin{array}{@{}rcl@{}} f_1(\textbf{x}) &=& \text{Re}\left\{\left( e^{is\|\textbf{x}\|}-{\mathbb E} [e^{is\|\textbf{x}\|}|\|\textbf{x}\|>1]\right) \left( e^{it\textbf{x}/\|\textbf{x}\|} - {\mathbb E} [e^{it\textbf{x}/\|\textbf{x}\|}|\|\textbf{x}\|>1]\right) {\bf1}_{\|\textbf{x}\|>1}\right\} \\ f_2(\textbf{x}) &=& \text{Im}\left\{ \left( e^{is\|\textbf{x}\|}-{\mathbb E} [e^{is\|\textbf{x}\|}|\|\textbf{x}\|>1]\right) \left( e^{it\textbf{x}/\|\textbf{x}\|} - {\mathbb E} [e^{it\textbf{x}/\|\textbf{x}\|}|\|\textbf{x}\|>1]\right) {\bf1}_{\|\textbf{x}\|>1}\right\}. \end{array} $$

Then from Lemma 1,

$$\frac{1}{\sqrt{np_{n}}} (S_{n}^{(1)},S_{n}^{(2)})^{T} = \sqrt{np_{n}} (\text{Re}\{\tilde E_{1}(s,t)\}, \text{Im}\{\tilde E_{1}(s,t)\}) \stackrel{d}{\rightarrow} N(\bf0,{\Sigma}), $$

where the covariance structure \({\Sigma}\) can be derived from Eqs. 42 and 43. This implies that

$$\sqrt{np_{n}} \tilde E_{1}(s,t) \stackrel{d}{\rightarrow} Q^{\prime}(s,t), $$

where \(Q^{\prime}(s,t)\) is a zero-mean complex normal process with covariance matrix \({\Sigma}_{11} + {\Sigma}_{22}\) and relation matrix \({\Sigma}_{11} - {\Sigma}_{22} + i({\Sigma}_{12} + {\Sigma}_{21})\).

The convergence of the finite-dimensional distributions of \(\sqrt {n\hat p_{n}} \tilde {E}_{1}\) to those of \(Q^{\prime}(s,t)\) follows using the Cramér-Wold device, and we omit the calculation of the covariance structure. The tightness condition for the functional convergence follows the same arguments as in the proof of Proposition 1 via Bickel and Wichura (1971), with the equality (32) replaced by a variance calculation of the sum of α-mixing components using the inequality (40); condition (33) is verified through the same argument. This completes the proof of Proposition 3. □

Proof (of Lemma 1)

The proof follows that of Theorem 3.2 in Davis and Mikosch (2009). Here we sketch the argument and detail only the parts that differ from their proof.

By the vague convergence in Eq. 20, we have

  i)
    $$\frac{1}{p_{n}} {\mathbb E} \left[f_{i} \left( \frac{\textbf{X}_{0}}{r_{n}}\right)\right] \,\to\, \int f_{i} d\mu_{0} \quad\text{and} \quad \frac{1}{p_{n}} {\mathbb E} \left[f_{i}^{2} \left( \frac{\textbf{X}_{0}}{r_{n}}\right)\right] \,\to\, \int {f^{2}_{i}} d\mu_{0}; $$
  ii)
    $$\frac{1}{p_{n}} \text{Var} \left[f_{i} \left( \frac{\textbf{X}_{0}}{r_{n}}\right)\right] \,=\, \frac{1}{p_{n}} {\mathbb E} \left[f_{i}^{2} \left( \frac{\textbf{X}_{0}}{r_{n}}\right)\right] - p_{n} \left( \frac{1}{p_{n}}{\mathbb E} \left[f_{i} \left( \frac{\textbf{X}_{0}}{r_{n}}\right)\right]\right)^{2} \,\to\, \int {f^{2}_{i}} d\mu_{0} = {\sigma^{2}_{0}}(f_{i},f_{i}); $$
  iii)
    $$\begin{array}{@{}rcl@{}} \frac{1}{p_n} \text{cov} \left[f_i \left( \frac{\textbf{X}_0}{r_n}\right), f_j\left( \frac{\textbf{X}_h}{r_n}\right)\right] & \to& \int f_i f_j d\mu_h \,=\, \sigma^2_h(f_i,f_j). \end{array} $$

Let us first consider the marginal convergence of \(\frac {1}{\sqrt {np_{n}}}S_{n}^{(i)}\) for i = 1, 2. Without loss of generality, we suppress the dependency on i and set

$$Y_{tn} := f\left( \frac{\textbf{X}_{t}}{r_{n}}\right) - {\mathbb E} f\left( \frac{\textbf{X}_{0}}{r_{n}}\right).$$

Then

$$\frac{1}{p_{n}} \text{Var} \left[Y_{1n}\right] \to {\sigma^{2}_{0}}(f,f) \quad \text{and} \quad \frac{1}{p_{n}} \text{cov} \left( Y_{1n}, Y_{(h + 1)n}\right) \to{\sigma^{2}_{h}}(f,f). $$

We also have the following two results for \(|\text{cov}(Y_{1n}, Y_{(h+1)n})|\):

$$\begin{array}{@{}rcl@{}} && \lim\limits_{h\to\infty}\limsup\limits_{n\to\infty}\sum\limits_{j=h}^{l_n} \frac{1}{p_n}|\text{cov}(Y_{1n}, Y_{(j + 1)n})| \\ &\le& \lim\limits_{h\to\infty}\limsup\limits_{n\to\infty}\sum\limits_{j=h}^{l_n} \frac{1}{p_n}{\mathbb E}\left|f \left( \frac{\textbf{X}_0}{r_n}\right) f\left( \frac{\textbf{X}_j}{r_n}\right) \right| + \sum\limits_{j=h}^{l_n}\frac{1}{p_n}{\mathbb E}\left( \left|f \left( \frac{\textbf{X}_0}{r_n}\right) \right|\right)^2 \\ &\le& \lim\limits_{h\to\infty}\limsup\limits_{n\to\infty}\sum\limits_{j=h}^{l_n} \frac{c}{p_n}{\mathbb E}\left( {\bf1}_{\{\|\textbf{X}_0\|>r_n\}}\right)\left( {\bf1}_{\{\|\textbf{X}_j\|>r_n\}}\right) + \sum\limits_{j=h}^{l_n}\frac{c}{p_n}\left( {\mathbb E}{\bf1}_{\{\|\textbf{X}_0\|>r_n\}}\right)^2 \\ &\le& \lim\limits_{h\to\infty}\limsup\limits_{n\to\infty}\sum\limits_{j=h}^{l_n} \frac{c}{p_n}\mathbb P\left( \|\textbf{X}_0\|>r_n, \|\textbf{X}_j\|>r_n\right) + cl_np_n \\ &=&0 \end{array} $$
(44)

from condition (22), and

$$\begin{array}{@{}rcl@{}} \lim\limits_{n\to\infty}\sum\limits_{j=l_n}^{\infty} \frac{1}{p_n}|\text{cov}(Y_{1n}, Y_{(j + 1)n})|&\le& \lim\limits_{n\to\infty}\sum\limits_{j=l_n}^{\infty} \frac{1}{p_n}\left| \text{cov} \left[f \left( \frac{\textbf{X}_0}{r_n}\right), f\left( \frac{\textbf{X}_j}{r_n}\right)\right] \right| \\ &\le& \lim\limits_{n\to\infty}\sum\limits_{j=l_n}^{\infty} \frac{1}{p_n}\alpha_j^{\delta} \left( {\mathbb E}\left| f \left( \frac{\textbf{X}_0}{r_n}\right)\right|^{2/(1-\delta)}\right)^{1-\delta} \\ &\le& \lim\limits_{n\to\infty}\sum\limits_{j=l_n}^{\infty} \frac{c}{p_n}\alpha_j^{\delta} \left( {\mathbb E}\left( {\bf1}_{\{\|\textbf{X}_0\|>r_n\}}\right)^{2/(1-\delta)}\right)^{1-\delta} \\ &\le& \lim\limits_{n\to\infty}\sum\limits_{j=l_n}^{\infty} c\alpha_j^{\delta} p_n^{-\delta} \\ &=& 0 \end{array} $$
(45)

from condition (21).

We apply the same technique of small/large blocks as used in Davis and Mikosch (2009). Let \(m_n\) and \(l_n\) be the sizes of the big and small blocks, respectively, where \(l_n \ll m_n \ll n\). Let \(I_{kn} = \{(k-1)m_n + 1,\ldots,km_n\}\) and \(J_{kn} = \{(k-1)m_n + 1,\ldots,(k-1)m_n + l_n\}\), \(k = 1,\ldots,n/m_n\), be the index sets of the big and small blocks, respectively. Set \(\tilde{I}_{kn} = I_{kn}\backslash J_{kn}\), i.e., \(\tilde{I}_{kn}\) are the big blocks with the first \(l_n\) observations removed. For simplicity, we set \(m_n := 1/p_n\) and assume that the number of big blocks \(n/m_n = np_n\) is integer-valued. The non-integer case can be handled without additional difficulties.
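To make the blocking indices concrete, here is a small illustrative sketch (a hypothetical helper, not taken from the paper) that constructs \(I_{kn}\), \(J_{kn}\) and \(\tilde{I}_{kn}\) for given block sizes:

```python
# Illustrative only: build the big/small block index sets used in the proof,
# with 1-based time indices as in the text.
def blocks(n, m, l):
    """Return a list of (I_k, J_k, I_k_tilde) for k = 1, ..., n // m."""
    out = []
    for k in range(1, n // m + 1):
        I = list(range((k - 1) * m + 1, k * m + 1))   # big block of size m
        J = I[:l]                                     # leading small block of size l
        I_tilde = I[l:]                               # big block with the small block removed
        out.append((I, J, I_tilde))
    return out

# Example: n = 20, m = 5, l = 2 gives 4 big blocks of size 5, each starting
# with a small block of size 2 and leaving a trimmed block of size 3.
```

In the proof, \(m = m_n = 1/p_n\) and \(l = l_n\), so there are \(np_n\) big blocks.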

Denote

$$S_{n}(B) := \sum\limits_{t\in B} Y_{tn}, $$

then

$$\sum\limits_{t = 1}^{n} Y_{tn} = S_{n}(1:n) = \sum\limits_{k = 1}^{np_{n}} S_{n}(I_{kn}) = \sum\limits_{k = 1}^{np_{n}} S_{n}(\tilde{I}_{kn}) + \sum\limits_{k = 1}^{np_{n}} S_{n}(J_{kn}). $$

Let \(\{\tilde{S}_{n}(\tilde{I}_{kn})\}_{k = 1,\ldots,np_{n}}\) be iid copies of \(S_{n}(\tilde{I}_{1n})\). To prove the convergence of \(\frac{1}{\sqrt{np_{n}}}S_{n}(1:n)\), it suffices to show the following:

$$ \frac{1}{\sqrt{np_{n}}}\sum\limits_{k = 1}^{np_{n}} \tilde{S}_{n}(\tilde{I}_{kn}) \text{ and } \frac{1}{\sqrt{np_{n}}}\sum\limits_{k = 1}^{np_{n}} S_{n}(\tilde{I}_{kn}) \text{ have the same limiting distribution}, $$
(46)
$$ \frac{1}{\sqrt{np_n}}\sum\limits_{k = 1}^{np_n} S_n(J_{kn}) \overset{p}\to 0, $$
(47)

and

$$ \frac{1}{\sqrt{np_n}}\sum\limits_{k = 1}^{np_n} \tilde{S}_n(\tilde{I}_{kn}) \overset{d}\to N(0,\sigma^2(f,f)). $$
(48)

The statement (46) holds if

$$ np_n \alpha_{l_n} \to 0, \quad \text{as } n\to\infty. $$
(49)

This follows from the same argument in equation (6.2) in Davis and Mikosch (2009).

For condition (47), it suffices to show that

$$\frac{1}{{np_{n}}}\text{Var}\left( \sum\limits_{k = 1}^{np_{n}} S_{n}(J_{kn})\right) \to 0. $$

Note that

$$\frac{1}{{np_{n}}}\text{Var}\left( \sum\limits_{k = 1}^{np_{n}} S_{n}(J_{kn})\right) \le \text{Var}(S_{n}(J_{1n}))+ 2 \sum\limits_{h = 1}^{np_{n}-1} (1-\frac{h}{np_{n}}) |\text{cov}(S_{n}(J_{1n}),S_{n}(J_{(h + 1)n}))| =:P_{1}+P_{2}. $$

We have

$$\begin{array}{@{}rcl@{}} \limsup\limits_{n\to\infty} P_1 &=& \limsup\limits_{n\to\infty} \text{Var}\left( \sum\limits_{j = 1}^{l_n}Y_{jn}\right) \\ &\le& \limsup\limits_{n\to\infty} l_np_n \left( \frac{\text{Var}(Y_{1n})}{p_n} + 2 \sum\limits_{j = 1}^{l_n-1}(1-\frac{j}{l_n}) \frac{|\text{cov}(Y_{1n},Y_{(j + 1)n})|}{p_n}\right) \\ &\le& \limsup\limits_{n\to\infty} l_np_n\frac{\text{Var}(Y_{1n})}{p_n} + \lim\limits_{h\to\infty}\limsup\limits_{n\to\infty} 2l_np_n \sum\limits_{j = 1}^{h-1} \frac{|\text{cov}(Y_{1n},Y_{(j + 1)n})|}{p_n} \\ && \quad+ \lim\limits_{h\to\infty}\limsup\limits_{n\to\infty} 2l_np_n\sum\limits_{j=h}^{l_n-1} \frac{|\text{cov}(Y_{1n},Y_{(j + 1)n})|}{p_n} \\ &=& 0 \end{array} $$

where the last step follows from dominated convergence and Eq. 44. And for the other term,

$$\begin{array}{@{}rcl@{}} P_2 &\le& 2\sum\limits_{h = 1}^{np_n-1} \sum\limits_{s\in J_{1n}}\sum\limits_{t \in J_{(h + 1)n}}\left| \text{cov}(Y_{sn},Y_{tn})\right| \\ &\le& 2 \sum\limits_{h = 1}^{np_n-1} l_n \sum\limits_{k=h/p_n-l_n + 1}^{h/p_n} \left| \text{cov}(Y_{1n},Y_{(k + 1)n})\right| \\ &\le& 2l_np_n\sum\limits_{k = 1/p_n-l_n + 1}^{\infty}\frac{\left| \text{cov}(Y_{1n},Y_{(k + 1)n})\right| }{p_n}\\ &\le& 2l_np_n\sum\limits_{k=l_n + 1}^{\infty}\frac{\left| \text{cov}(Y_{1n},Y_{(k + 1)n})\right| }{p_n} \to 0. \end{array} $$

Note that \(1/p_n = m_n\) is the size of the big blocks \(I_{kn}\) and \(1/p_n - l_n + 1 = m_n - l_n + 1\) is the distance between consecutive small blocks \(J_{kn}\) and \(J_{(k+1)n}\). The last limit follows from Eq. 45.

To finish the proof, we need to establish the central limit theorem in Eq. 48. Note that the \(\tilde{S}_{n}(\tilde{I}_{kn})\)’s are iid with \({\mathbb E}\tilde{S}_{n}(\tilde{I}_{1n}) = 0\). We now calculate the variance. Recall that \(1/p_n - l_n\) is the size of \(\tilde{I}_{1n}\), the big block with the small block removed. Then

$$\begin{array}{@{}rcl@{}} \text{Var} \left( \tilde{S}_n(\tilde{I}_{1n})\right) &=& \text{Var}\left( \sum\limits_{j = 1}^{1/p_n-l_n}Y_{jn}\right) \\ &=& (\frac{1}{p_n}-l_n) \text{Var}(Y_{jn}) + 2 \sum\limits_{k = 1}^{1/p_n-l_n-1}(1/p_n-l_n-k) \text{cov}(Y_{1n},Y_{(k + 1)n}) \\ &=&(\frac{1}{p_n}-l_n) \text{Var}(Y_{jn}) + 2 \left( \sum\limits_{k = 1}^{h}+ \sum\limits_{k=h + 1}^{l_n} + \sum\limits_{k=l_n + 1}^{1/p_n-l_n-1}\right)\left( 1-\frac{l_n+k}{1/p_n}\right) \frac{1}{p_n}\text{cov}(Y_{1n},Y_{(k + 1)n}) \\ &:=& I_0 + I_1 + I_2 + I_3. \end{array} $$

Here

$$\lim\limits_{n\to\infty} I_{0} = \lim\limits_{n\to\infty}(1-l_{n}p_{n})\frac{1}{p_{n}} \text{Var}(Y_{jn}) = {\sigma_{0}^{2}}(f,f), $$

and

$$\lim\limits_{n\to\infty} I_{1} = \lim\limits_{n\to\infty} 2\sum\limits_{k = 1}^{h}\left( 1-p_{n}(l_{n}+k)\right) \frac{\text{cov}(Y_{1n},Y_{(k + 1)n})}{p_{n}} = 2 \sum\limits_{k = 1}^{h} {\sigma^{2}_{k}}(f,f). $$

We also have

$$\lim\limits_{h\to\infty}\limsup\limits_{n\to\infty}|I_{2}| \le \lim\limits_{h\to\infty}\limsup\limits_{n\to\infty}\sum\limits_{k=h + 1}^{l_{n}}\frac{|\text{cov}(Y_{1n},Y_{(k + 1)n})|}{p_{n}} = 0 $$

from Eq. 44, and

$$\lim\limits_{n\to\infty}|I_{3}| \le \lim\limits_{n\to\infty}\sum\limits_{k=l_{n}}^{\infty}\frac{|\text{cov}(Y_{1n},Y_{(k + 1)n})|}{p_{n}} = 0 $$

from Eq. 45. Therefore

$$\lim\limits_{n\to\infty}\text{Var} \left( \tilde{S}_{n}(\tilde{I}_{1n})\right) = \lim\limits_{n\to\infty} I_{0} + \lim\limits_{h\to\infty}\lim\limits_{n\to\infty} I_{1} = {\sigma_{0}^{2}}(f,f) + 2 \sum\limits_{k = 1}^{\infty} {\sigma^{2}_{k}}(f,f) =: \sigma^{2}(f,f) $$

as defined. To show that this infinite sum converges, it suffices to show that

$$\sum\limits_{h = 1}^{\infty} \mu_{h}(\left\{(\textbf{x},\textbf{x}^{\prime})|\|\textbf{x}\|>1,\|\textbf{x}^{\prime}\|>1\right\}) < \infty. $$

This follows from Eq. 22 in condition (M), for if

$$\sum\limits_{h = 1}^{\infty} \mu_{h}(\left\{(\textbf{x},\textbf{x}^{\prime})|\|\textbf{x}\|>1,\|\textbf{x}^{\prime}\|>1\right\}) = \infty, $$

then

$$\begin{array}{@{}rcl@{}} \limsup\limits_{n\to\infty} \frac{1}{p_n} \sum\limits_{j=h}^{l_n} {\mathbb P}(\|\textbf{X}_0\|>r_n, \|\textbf{X}_j\|>r_n) &\ge& \liminf\limits_{n\to\infty} \sum\limits_{j=h}^{l_n} {\mathbb P}(\|\textbf{X}_0\|>r_n, \|\textbf{X}_j\|>r_n | \|\textbf{X}_0\|>r_n) \\ &\ge& \sum\limits_{j=h}^{\infty} \mu_j(\left\{(\textbf{x},\textbf{x}^{\prime})|\|\textbf{x}\|>1,\|\textbf{x}^{\prime}\|>1\right\}) = \infty, \end{array} $$

which leads to a contradiction.

To apply the central limit theorem, we verify Lindeberg’s condition,

$$\begin{array}{@{}rcl@{}} {\mathbb E} \left[(\tilde{S}_n(\tilde{I}_{1n}))^2 \mathbf1_{\{|\tilde{S}_n(\tilde{I}_{1n})|>\epsilon\sqrt{np_n}\}} \right] &\le& {\mathbb E} \left[\left( \sum\limits_{j = 1}^{1/p_n-l_n}Y_{jn}\right)^2 \mathbf1_{\{|\tilde{S}_n(\tilde{I}_{1n})|>\epsilon\sqrt{np_n}\}} \right] \\ &\le& {\mathbb E} \left[c(1/p_n-l_n)^2 \mathbf1_{\{|\tilde{S}_n(\tilde{I}_{1n})|>\epsilon\sqrt{np_n}\}} \right] \\ &\le& c \frac{1}{p_n^2} {\mathbb P}\left[{|\tilde{S}_n(\tilde{I}_{1n})|>\epsilon\sqrt{np_n}}\right] \\ &\le& c\frac{1}{p_n^2} \frac{\text{Var}\left[\tilde{S}_n(\tilde{I}_{1n})\right]}{\epsilon^2np_n} \,=\,O(\frac{1}{np_n^3}) \to 0. \end{array} $$

This completes the proof for the convergence of \(\frac {1}{\sqrt {np_{n}}}S_{n}(1:n)\).

The joint convergence of \(\frac{1}{\sqrt{np_{n}}} (S_{n}^{(1)},S_{n}^{(2)})^{T}\) follows from the same line of argument together with the Cramér-Wold device. In particular,

$$\frac{1}{np_{n}} \text{cov}\left( S_{n}^{(i)},S_{n}^{(j)}\right) \to \sigma^{2}(f_{i},f_{j}), \quad i,j = 1,2. $$

This completes the proof of the lemma. □

Remark 1

Lemma 1 is itself a general result of independent interest. The result can be extended to functions \(f_i\) defined on \(\overline {{\mathbb R}}^{d}\backslash \{\bf 0\}\) with compact support. In this case, condition (22) should be modified to

$$\lim\limits_{h\to\infty} \limsup\limits_{n\to\infty} \frac{1}{p_{n}} \sum\limits_{j=h}^{l_{n}} {\mathbb P}(\|\textbf{X}_{0}\|>\epsilon r_{n}, \|\textbf{X}_{j}\|>\epsilon r_{n}) = 0 $$

for some 𝜖 > 0, where \(\text {support}(f)\subseteq \overline {{\mathbb R}}^{d}\backslash B_{\epsilon }(\bf 0)\). Also, as seen during the proof of the lemma, the conditions on \(p_n\), \(l_n\), and \(\alpha_h\) can be further relaxed.

Cite this article

Wan, P., Davis, R.A. Threshold selection for multivariate heavy-tailed data. Extremes 22, 131–166 (2019). https://doi.org/10.1007/s10687-018-0316-x
