Convergence rates of wavelet density estimation for negatively dependent sample

  • Junlian Xu
Open Access


In this paper, linear and nonlinear wavelet estimators are defined for a density in a Besov space based on a negatively dependent random sample, and their upper bounds on \(L^{p}\) (\(1\leq p<\infty \)) risk are provided.


Keywords: Wavelet estimator; Density function; Negatively dependent; Besov space

1 Introduction

Random variables \(X_{1}, X_{2}, \ldots , X_{n}\) are said to be negatively dependent (ND), if for any \(x_{1}, x_{2},\ldots , x_{n} \in \mathbb{R}\),
$$ \mathbf{P}(X_{1}\leq x_{1}, X_{2}\leq x_{2},\dots ,X_{n}\leq x_{n}) \leq \prod _{i=1}^{n} \mathbf{P}(X_{i}\leq x_{i}), $$
and
$$ \mathbf{P}(X_{1}> x_{1}, X_{2}> x_{2}, \dots ,X_{n}> x_{n})\leq \prod_{i=1}^{n} \mathbf{P}(X_{i}>x_{i}). $$
The definition was introduced by Bozorgnia et al. [3]. Further discussion and related concepts can be found in [2, 10]. ND random variables are very useful in reliability theory and applications. Because of these wide applications, the notion of ND random variables has received increasing attention, and a series of useful results has been established (see [13, 14, 15, 16]). Hence, we consider density estimation for ND random variables in this paper.
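The two defining inequalities can be checked exactly for small discrete distributions. The following is a minimal sketch, not part of the original paper: the function name `is_negatively_dependent` and the two toy distributions are illustrative assumptions. It tests both orthant inequalities at every threshold that matters for a finite support, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def is_negatively_dependent(pmf):
    """Check both ND orthant inequalities for a finite joint pmf.

    pmf maps outcome tuples (x_1, ..., x_n) to exact probabilities.
    For a finite support it suffices to test thresholds at the support
    values of each coordinate plus one value below the minimum.
    """
    n = len(next(iter(pmf)))
    grids = []
    for i in range(n):
        vals = sorted({pt[i] for pt in pmf})
        grids.append([vals[0] - 1] + vals)
    for thresholds in product(*grids):
        joint_le = sum(p for pt, p in pmf.items()
                       if all(v <= t for v, t in zip(pt, thresholds)))
        joint_gt = sum(p for pt, p in pmf.items()
                       if all(v > t for v, t in zip(pt, thresholds)))
        prod_le = Fraction(1)
        prod_gt = Fraction(1)
        for i, t in enumerate(thresholds):
            prod_le *= sum(p for pt, p in pmf.items() if pt[i] <= t)
            prod_gt *= sum(p for pt, p in pmf.items() if pt[i] > t)
        if joint_le > prod_le or joint_gt > prod_gt:
            return False
    return True


half = Fraction(1, 2)
# (X1, X2) = (B, 1 - B) with B ~ Bernoulli(1/2): the coordinates move in opposite directions.
antithetic = {(0, 1): half, (1, 0): half}
# (X1, X2) = (B, B): the coordinates move together.
comonotone = {(0, 0): half, (1, 1): half}

print(is_negatively_dependent(antithetic))
print(is_negatively_dependent(comonotone))
```

In this toy setting the pair (B, 1 - B) satisfies the definition, while the pair (B, B) violates the first inequality at \(x_{1}=x_{2}=0\).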

For density estimation, Donoho et al. [6] defined wavelet estimators and showed their convergence rates on \(L^{p}\)-loss, when \(X_{1}, X_{2}, \ldots , X_{n}\) are independent. They found that the convergence rate of the nonlinear estimator is better than that of the linear one. In many cases, however, the random variables \(X_{1}, X_{2}, \ldots , X_{n}\) are dependent. Doosti et al. [8] proposed a linear wavelet estimator and evaluated its \(L^{p}\) (\(1\leq p<\infty \)) risks for negatively associated (NA) random variables. Soon afterwards, these results were extended to negatively dependent sequences [7]. Chesneau [4] and Liu [12] also considered density estimation for NA samples. Kou [11] defined linear and nonlinear wavelet estimators for mixing data and obtained their convergence rates.

Motivated by the above work, this paper will estimate the unknown density function f from a sequence of ND data \(X_{1}, X_{2}, \ldots , X_{n}\). We shall define wavelet estimators and give their upper bounds on \(L^{p}\)-loss. It turns out that our results reduce to Donoho’s classical theorems in [6], when the random sample is independent.

We establish our results on Besov spaces on a compact subset of the real line \(\mathbb{R}\). As usual, the Sobolev spaces with integer exponents are defined as
$$ W_{r}^{n}(\mathbb{R}):=\bigl\{ f\in L^{r}( \mathbb{R}), f^{(n)}\in L^{r}( \mathbb{R})\bigr\} $$
with \(\|f\|_{W_{r}^{n}}:=\|f\|_{r}+\|f^{(n)}\|_{r}\). Then \(L^{r}( \mathbb{R})\) can be considered as \(W_{r}^{0}(\mathbb{R})\). For \(1\leq r,q\leq \infty \) and \(s=n+\alpha \) with \(\alpha \in (0,1]\), a Besov space on \(\mathbb{R}\) means
$$ B^{s}_{r,q}(\mathbb{R}):=\bigl\{ f\in W_{r}^{n}( \mathbb{R}), \bigl\Vert t^{-\alpha } \omega _{r}^{2} \bigl(f^{(n)},t\bigr) \bigr\Vert _{q}^{*}< \infty \bigr\} $$
with the norm \(\|f\|_{srq}:=\|f\|_{W_{r}^{n}}+\|t^{-\alpha }\omega _{r}^{2}(f^{(n)},t)\|_{q}^{*}\), where \(\omega _{r}^{2}(f,t):=\sup_{|h|\leq t}\|f(\cdot +2h)-2f(\cdot +h)+f(\cdot )\|_{r}\) stands for the smoothness modulus of f and
$$ \Vert h \Vert _{q}^{*}= \textstyle\begin{cases} (\int _{0}^{\infty } \vert h(t) \vert ^{q}\frac{dt}{t})^{\frac{1}{q}}, & \text{if } 1\leq q< \infty ; \\ \operatorname{ess}\sup _{t} \vert h(t) \vert , & \text{if } q=\infty . \end{cases} $$
We always assume \(f\in B^{s}_{r,q}(\mathbb{R}, L)=\{f\in B^{s}_{r,q}( \mathbb{R}), f \text{ is a probability density and } \|f\|_{srq}\leq L\}\) with \(L>0\). Let \(\phi \in C_{0}^{t}(\mathbb{R})\) be an orthonormal scaling function with \(t>\max \{s,1\}\). Then ϕ is a function of bounded variation (BV). The corresponding wavelet function is denoted by ψ. It is well known that \(\{\phi _{J,k}, \psi _{j,k}, j\geq J, k\in \mathbb{Z}\}\) constitutes an orthonormal basis of \(L^{2}(\mathbb{R})\), where \(\phi _{J,k}(x):=2^{\frac{J}{2}} \phi (2^{J}x-k)\) and \(\psi _{j,k}(x):=2^{\frac{j}{2}}\psi (2^{j}x-k)\), as in wavelet analysis [5]. Then for each \(f\in L^{2}(\mathbb{R})\), with \(\alpha _{J,k}=\int f(x)\phi _{J,k}(x)\,dx\) and \(\beta _{j,k}=\int f(x)\psi _{j,k}(x)\,dx\), we have
$$ f(x)=\sum_{k\in \mathbb{Z}}\alpha _{J,k}\phi _{J,k}(x)+\sum_{j\geq J}\sum _{k\in \mathbb{Z}}\beta _{j,k}\psi _{j,k}(x). $$

Here and in what follows, \(A\lesssim B\) denotes \(A\leq C B\) for some constant \(C>0\); \(A\gtrsim B\) means \(B\lesssim A\); \(A\sim B\) stands for both \(A\lesssim B\) and \(B\lesssim A\). The following theorems are needed in our discussion:

Theorem 1.1

(Härdle et al. [9])

Let \(f\in L^{r}(\mathbb{R})\) (\(1\leq r\leq \infty \)), \(\alpha _{J,k}=\int f(x) \phi _{J,k}(x)\,dx\) and \(\beta _{j,k}=\int f(x)\psi _{j,k}(x)\,dx\). The following assertions are equivalent:

  (i) \(f\in B^{s}_{r,q}(\mathbb{R})\), \(s>0\), \(1\leq q\leq \infty \);

  (ii) \(\{2^{js}\|P_{j}f-f\|_{r}\}_{j\geq 0}\in l^{q}\) with \(P_{j}f:=\sum_{k\in \mathbb{Z}}\alpha _{j,k}\phi _{j,k}\);

  (iii) \(\|\alpha _{J\cdot }\|_{r}+\|\{2^{j(s+\frac{1}{2}-\frac{1}{r})} \|\beta _{j\cdot }\|_{r}\}_{j\geq 0}\|_{q}<+\infty \).

$$ \Vert f \Vert _{srq}\sim \bigl\Vert \bigl(2^{js} \Vert P_{j}f-f \Vert _{r}\bigr)_{j\geq 0} \bigr\Vert _{q}\sim \Vert \alpha _{J\cdot } \Vert _{r}+ \bigl\Vert \bigl\{ 2^{j(s+\frac{1}{2}-\frac{1}{r})} \Vert \beta _{j\cdot } \Vert _{r}\bigr\} _{j\geq 0} \bigr\Vert _{q}. $$

Theorem 1.2

(Härdle et al. [9])

Let \(\theta _{\phi }(x):=\sum_{k}|\phi (x-k)|\) and \(\operatorname{ess} \sup_{x}\theta _{\phi }(x)<\infty \). Then for \(\lambda =\{ \lambda _{k}\}\in l^{r}(\mathbb{Z})\) and \(1\leq r\leq \infty \),
$$ \biggl\Vert \sum_{k\in \mathbb{Z}}\lambda _{k}\phi _{jk} \biggr\Vert _{r}\sim 2^{j( \frac{1}{2}-\frac{1}{r})} \Vert \lambda \Vert _{r}. $$

Negatively dependent random variables possess the following property which will be used in this paper.

Theorem 1.3

(Bozorgnia et al. [3])

Let \(X_{1},\ldots ,X_{n}\) be a sequence of ND random variables and let \(A_{1},\ldots ,A_{m}\) be some pairwise disjoint nonempty subsets of \(\{1,\ldots ,n\}\) with \(\alpha _{i}=\sharp (A_{i})\), where \(\sharp (A)\) denotes the number of elements in the set A. If \(f_{i}: \mathbb{R} ^{\alpha _{i}}\rightarrow \mathbb{R}\) (\(i=1,\ldots ,m\)) are m coordinatewise nondecreasing (nonincreasing) functions, then \(f_{1}(X_{i}, i\in A_{1}), \ldots , f_{m}(X_{i}, i\in A_{m})\) are also ND. In particular, for any \(t_{i}\geq 0\) (or \(t_{i}\leq 0\)), \(1\leq i\leq n\),
$$ \mathbf{E}\Biggl[\exp \Biggl(\sum_{i=1}^{n}t_{i}X_{i} \Biggr)\Biggr]\leq \prod_{i=1}^{n} \mathbf{E} \bigl[\exp (t_{i}X_{i})\bigr]. $$
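The exponential-moment inequality can be seen in closed form on a toy ND pair. The sketch below is an illustration only, with hypothetical function names: it takes \((X_{1},X_{2})=(B,1-B)\) with B a Bernoulli(1/2) variable, for which the product of the marginal moment generating functions minus the joint one equals \((1-e^{t_{1}})(1-e^{t_{2}})/4\). This difference is nonnegative exactly when \(t_{1}\) and \(t_{2}\) have the same sign, which matches the sign condition in the theorem.

```python
import math


def joint_mgf(t1, t2):
    """E exp(t1*X1 + t2*X2) for (X1, X2) = (B, 1 - B), B ~ Bernoulli(1/2)."""
    # B = 0 gives (X1, X2) = (0, 1); B = 1 gives (X1, X2) = (1, 0).
    return 0.5 * math.exp(t2) + 0.5 * math.exp(t1)


def product_mgf(t1, t2):
    """E exp(t1*X1) * E exp(t2*X2); both marginals are Bernoulli(1/2)."""
    m1 = 0.5 + 0.5 * math.exp(t1)
    m2 = 0.5 + 0.5 * math.exp(t2)
    return m1 * m2


for t1, t2 in [(0.5, 2.0), (-1.0, -3.0), (1.0, -1.0)]:
    print(t1, t2, joint_mgf(t1, t2) <= product_mgf(t1, t2))
```

The last pair has mixed signs and is included to show that the inequality can fail outside the sign condition.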

2 Linear estimators

In this section, we shall give a linear wavelet estimator for a density function \(f(x)\) in a Besov space.

The linear wavelet estimator of \(f(x)\) is defined as follows:
$$ \hat{f}^{\mathrm{lin}}_{n}(x)=\sum _{k\in K_{0}}\hat{\alpha }_{j_{0},k} \phi _{j_{0},k}(x), $$
where \(K_{0}=\{k\in \mathbb{Z}, \operatorname{supp} f\cap \operatorname{supp} \phi _{j_{0},k}\neq \emptyset \}\),
$$ \hat{\alpha }_{j_{0},k}=\frac{1}{n}\sum _{i=1}^{n}\phi _{j_{0},k}(X _{i}). $$
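To make the definition concrete, here is a minimal sketch of the linear estimator in Python. It assumes the Haar scaling function \(\phi =\mathbf{1}_{[0,1)}\), chosen only because it gives closed-form coefficients; Haar does not satisfy the smoothness requirement \(\phi \in C_{0}^{t}(\mathbb{R})\) with \(t>\max \{s,1\}\) used in this paper, and the names `alpha_hat` and `f_lin` are hypothetical.

```python
import math
from collections import Counter


def alpha_hat(sample, j0):
    """Empirical coefficients alpha_hat[k] = (1/n) * sum_i phi_{j0,k}(X_i) for Haar phi.

    phi_{j0,k}(x) = 2^{j0/2} if k <= 2^{j0} x < k + 1, and 0 otherwise, so each
    observation contributes to exactly one index k = floor(2^{j0} X_i).
    """
    n = len(sample)
    scale = 2.0 ** (j0 / 2)
    counts = Counter(math.floor((2 ** j0) * x) for x in sample)
    return {k: scale * c / n for k, c in counts.items()}


def f_lin(x, coeffs, j0):
    """Evaluate sum_k alpha_hat[k] * phi_{j0,k}(x); only k = floor(2^{j0} x) contributes."""
    k = math.floor((2 ** j0) * x)
    return coeffs.get(k, 0.0) * 2.0 ** (j0 / 2)


sample = [0.1, 0.2, 0.6, 0.9]
coeffs = alpha_hat(sample, j0=1)
print(coeffs)
print(f_lin(0.25, coeffs, j0=1))
```

With the Haar choice the estimator is a histogram with bin width \(2^{-j_{0}}\), so the resolution level \(j_{0}\) plays the role of a bandwidth.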

The following inequalities play important roles in this paper.

Lemma 2.1

(Rosenthal’s inequality, see Asadian et al. [1])

Let \(X_{1},\ldots ,X_{n}\) be a sequence of ND random variables, which satisfy \(\mathbf{E}X_{i}=0\) and \(\mathbf{E}|X_{i}|^{p}< \infty \), where \(i=1,\dots ,n\). Then
$$\begin{aligned}& \mathbf{E}\Biggl( \Biggl\vert \sum_{i=1}^{n}X_{i} \Biggr\vert ^{p}\Biggr)\lesssim \sum_{i=1} ^{n}\mathbf{E} \vert X_{i} \vert ^{p}+\Biggl( \sum_{i=1}^{n}\mathbf{E}X_{i}^{2} \Biggr)^{ \frac{p}{2}}, \quad p\geq 2, \\& \mathbf{E}\Biggl( \Biggl\vert \sum_{i=1}^{n}X_{i} \Biggr\vert ^{p}\Biggr)\leq \Biggl(\sum_{i=1} ^{n}\mathbf{E}X_{i}^{2}\Biggr)^{\frac{p}{2}}, \quad 0< p\leq 2. \end{aligned}$$

Lemma 2.2

Let \(X_{1}, X_{2}, \ldots , X _{n}\) be ND random variables and let the density function f be bounded and compactly supported with support length less than \(H>0\). Then for \(\hat{\alpha }_{{j_{0}},k}\) defined by (2) we have
$$ \mathbf{E} \vert \hat{\alpha }_{{j_{0}},k}-\alpha _{{j_{0}},k} \vert ^{p}\lesssim n^{-\frac{p}{2}} $$
for \(1\leq p<\infty \) and \(2^{j_{0}}\leq n\).


Proof

By the definition of \(\hat{\alpha }_{{j_{0}},k}\), one has
$$ \mathbf{E} \vert \hat{\alpha }_{{j_{0}},k}-\alpha _{{j_{0}},k} \vert ^{p} =\frac{1}{n ^{p}}\mathbf{E} \Biggl\vert \sum_{i=1}^{n}\bigl[\phi _{{j_{0}},k}(X_{i})- \alpha _{{j_{0}},k}\bigr] \Biggr\vert ^{p}. $$
Let \(\xi _{i}:=\phi _{{j_{0}},k}(X_{i})-\alpha _{{j_{0}},k}\) (\(i=1,2, \ldots ,n\)). Clearly,
$$ \mathbf{E} \Biggl\vert \sum _{i=1}^{n}\bigl[\phi _{{j_{0}},k}(X_{i})- \alpha _{{j_{0}},k}\bigr] \Biggr\vert ^{p} =\mathbf{E} \Biggl\vert \sum_{i=1}^{n}\xi _{i} \Biggr\vert ^{p}. $$
One can choose a scaling function ϕ which is a function of bounded variation, and write \(\phi :=\tilde{\phi }-\bar{\phi }\), where ϕ̃ and ϕ̄ are bounded, nonnegative and nondecreasing functions. Define
$$ \tilde{\alpha }_{{j_{0}},k}:= \int \tilde{\phi }_{{j_{0}},k}(x)f(x)\,dx, \qquad \bar{\alpha }_{{j_{0}},k}:= \int \bar{\phi }_{{j_{0}},k}(x)f(x)\,dx, $$
$$ \tilde{\xi }_{i}:=\tilde{\phi }_{{j_{0}},k}(X_{i})- \tilde{\alpha } _{{j_{0}},k}, \qquad \bar{\xi }_{i}:=\bar{\phi }_{{j_{0}},k}(X_{i})-\bar{\alpha }_{{j_{0}},k}. $$
Then \(\alpha _{{j_{0}},k}=\tilde{\alpha }_{{j_{0}},k}-\bar{\alpha } _{{j_{0}},k}\), \(\xi _{i}=\tilde{\xi }_{i}-\bar{\xi }_{i}\) and
$$ \mathbf{E} \Biggl\vert \sum _{i=1}^{n}\xi _{i} \Biggr\vert ^{p} =\mathbf{E} \Biggl\vert \sum_{i=1}^{n}( \tilde{\xi }_{i}-\bar{\xi }_{i}) \Biggr\vert ^{p}. $$
It is easy to see that \(\mathbf{E}\tilde{\xi }_{i}=0\) and that the random variables \(\tilde{\xi }_{1}, \ldots , \tilde{\xi }_{n}\) are ND, due to the monotonicity of ϕ̃ and Theorem 1.3. To apply Rosenthal’s inequality, one first shows the inequality
$$ \mathbf{E} \vert \tilde{\xi }_{i} \vert ^{m}\lesssim 2^{\frac{(m-2)j_{0}}{2}} $$
for \(m\geq 2\). In fact,
$$ \mathbf{E} \vert \tilde{\xi }_{i} \vert ^{m} =\mathbf{E} \bigl\vert \tilde{\phi }_{j_{0},k}(X _{i})-\tilde{\alpha }_{{j_{0}},k} \bigr\vert ^{m} \lesssim \mathbf{E} \bigl\vert \tilde{\phi }_{j_{0},k}(X_{i}) \bigr\vert ^{m}+ \vert \tilde{\alpha }_{{j_{0}},k} \vert ^{m}. $$
Note that \(|\tilde{\phi }_{j_{0},k}(x)|\lesssim 2^{\frac{{j_{0}}}{2}}\). Then for \(m\geq 2\),
$$\begin{aligned} \mathbf{E} \bigl\vert \tilde{\phi }_{j_{0},k}(X_{i}) \bigr\vert ^{m} =&\mathbf{E}\bigl[ \bigl\vert \tilde{\phi }_{j_{0},k}(X_{i}) \bigr\vert ^{2} \bigl\vert \tilde{\phi }_{j_{0},k}(X_{i}) \bigr\vert ^{m-2}\bigr] \\ \lesssim &2^{\frac{(m-2){j_{0}}}{2}}\mathbf{E} \bigl\vert \tilde{\phi }_{j_{0},k} ^{2}(X_{i}) \bigr\vert . \end{aligned}$$
Note that \(f\in B^{s}_{r,q}(\mathbb{R},L)\subseteq B^{s-\frac{1}{r}} _{\infty ,q}(\mathbb{R},L)\). Then \(\|f\|_{\infty }\leq L\). Using \(\tilde{\phi }\in L^{2}(\mathbb{R})\), one knows that
$$ \mathbf{E} \bigl\vert \tilde{\phi }_{j_{0},k}^{2}(X_{i}) \bigr\vert \lesssim \int ( \tilde{\phi }_{j_{0},k})^{2}(x)f(x)\,dx = \int \bigl\vert \tilde{\phi }(x-k) \bigr\vert ^{2}f \bigl(2^{-j_{0}}x\bigr)\,dx \lesssim 1, $$
and \(|\tilde{\alpha }_{{j_{0}},k}|=|\int f(x)\tilde{\phi }_{j_{0},k}(x)\,dx| \lesssim 1\) because \(\operatorname{supp} f\) is contained in some interval I with length \(|I|\leq H\). This, together with (8) and (7), leads to (6).
By Rosenthal’s inequality with \(1\leq p\leq 2\),
$$ \mathbf{E} \Biggl\vert \sum_{i=1}^{n} \tilde{\xi }_{i} \Biggr\vert ^{p}\leq \Biggl[\sum _{i=1}^{n}\mathbf{E}(\tilde{\xi }_{i})^{2} \Biggr]^{\frac{p}{2}} \lesssim n^{\frac{p}{2}}. $$
Similarly, \(\mathbf{E}|\sum_{i=1}^{n}\bar{\xi }_{i}|^{p}\lesssim n^{\frac{p}{2}}\). Combining this with (5), one has
$$ \mathbf{E} \Biggl\vert \sum _{i=1}^{n}\xi _{i} \Biggr\vert ^{p}\lesssim \mathbf{E} \Biggl\vert \sum _{i=1}^{n}\tilde{\xi }_{i} \Biggr\vert ^{p} +\mathbf{E} \Biggl\vert \sum_{i=1}^{n} \bar{\xi }_{i} \Biggr\vert ^{p}\lesssim n^{\frac{p}{2}}. $$
Substituting (9) into (4), one obtains
$$ \mathbf{E} \Biggl\vert \sum_{i=1}^{n} \bigl[\phi _{{j_{0}},k}(X_{i})- \alpha _{{j_{0}},k}\bigr] \Biggr\vert ^{p} \lesssim n^{\frac{p}{2}}. $$
This with (3) shows that for \(1\leq p\leq 2\),
$$ \mathbf{E} \vert \hat{\alpha }_{{j_{0}},k}-\alpha _{{j_{0}},k} \vert ^{p} \lesssim \frac{1}{n ^{p}}\times n^{\frac{p}{2}}= n^{-\frac{p}{2}}. $$
When \(2\leq p<\infty \), Rosenthal’s inequality and (6) show that
$$ \mathbf{E} \Biggl\vert \sum_{i=1}^{n} \tilde{\xi }_{i} \Biggr\vert ^{p}\lesssim \sum _{i=1}^{n}\mathbf{E} \vert \tilde{\xi }_{i} \vert ^{p} +\Biggl[\sum _{i=1} ^{n}\mathbf{E}(\tilde{\xi }_{i})^{2} \Biggr]^{\frac{p}{2}}\lesssim n2^{\frac{(p-2)j _{0}}{2}}+n^{\frac{p}{2}}. $$
Similarly, \(\mathbf{E}|\sum_{i=1}^{n}\bar{\xi }_{i}|^{p}\lesssim n2^{\frac{(p-2)j_{0}}{2}}+n^{\frac{p}{2}}\). Hence \(\mathbf{E}|\sum_{i=1}^{n}\xi _{i}|^{p}\lesssim n2^{\frac{(p-2)j_{0}}{2}}+n^{ \frac{p}{2}}\). Furthermore, it follows from (4), (3) and \(2^{j_{0}}\leq n \) that
$$ \mathbf{E} \vert \hat{\alpha }_{{j_{0}},k}-\alpha _{{j_{0}},k} \vert ^{p} \lesssim \frac{1}{n ^{p}}\bigl[n2^{\frac{(p-2)j_{0}}{2}}+n^{\frac{p}{2}} \bigr]\lesssim n^{- \frac{p}{2}}. $$
Combining this with (10), one concludes the desired inequality of the lemma. □
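The last step of the proof for \(2\leq p<\infty \) relies on the condition \(2^{j_{0}}\leq n\). Spelled out as a short worked step, using only quantities already defined in the lemma:
$$ n2^{\frac{(p-2)j_{0}}{2}}=n\bigl(2^{j_{0}}\bigr)^{\frac{p-2}{2}}\leq n\cdot n^{\frac{p-2}{2}}=n^{\frac{p}{2}}, \quad \text{so that}\quad \frac{1}{n^{p}}\bigl[n2^{\frac{(p-2)j_{0}}{2}}+n^{\frac{p}{2}}\bigr]\leq 2n^{-\frac{p}{2}}. $$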

Theorem 2.1

Let \(f(x)\in B^{s}_{r,q}( \mathbb{R},L)\) (\(s>\frac{1}{r}\), \(r, q \geq 1\)) and let \(\hat{f}^{\mathrm{lin}} _{n}\) be defined by (1). Under the conditions of Lemma 2.2, for each \(1\leq p<\infty \), one has
$$ \sup _{f\in B^{s}_{r,q}(\mathbb{R},L)}\mathbf{E} \bigl\Vert \hat{f}^{\mathrm{lin}} _{n}-f \bigr\Vert ^{p}_{p}\lesssim n^{-\frac{s^{\prime }p}{2s^{\prime }+1}}, $$
where \(s^{\prime }:=s-(\frac{1}{r}-\frac{1}{p})_{+}\) and \(x_{+}= \max (x, 0)\).


Proof

Since
$$ \mathbf{E} \bigl\Vert \hat{f}^{\mathrm{lin}}_{n}-f \bigr\Vert _{p}^{p} \lesssim \Vert P_{{j_{0}}}f-f \Vert _{p}^{p}+\mathbf{E} \bigl\Vert \hat{f}^{\mathrm{lin}}_{n}-P_{{j_{0}}}f \bigr\Vert _{p}^{p}, $$
it is sufficient to estimate \(\|P_{{j_{0}}}f-f\|_{p}^{p}\) and \(\mathbf{E}\|\hat{f}^{\mathrm{lin}}_{n}-P_{{j_{0}}}f\|_{p}^{p}\).
When \(r\leq p\), \(s^{\prime }=s-(\frac{1}{r}-\frac{1}{p})_{+}=s- \frac{1}{r}+\frac{1}{p}\) and \(B^{s}_{r,q}(\mathbb{R})\subset B^{s^{ \prime }}_{p,q}(\mathbb{R})\), one has
$$\begin{aligned} \sup _{f\in B^{s}_{r,q}(\mathbb{R},L)} \Vert P_{{j_{0}}}f-f \Vert _{p} ^{p} \lesssim & \sup _{f\in B^{s^{\prime }}_{p,q}(\mathbb{R},L)} \Vert P_{{j_{0}}}f-f \Vert _{p}^{p}. \end{aligned}$$
By the approximation theorem in Besov spaces and from Theorem 9.4 in [9], one gets
$$ \Vert P_{{j_{0}}}f-f \Vert _{p}^{p}\lesssim 2^{-{j_{0}}s^{\prime }p}. $$
When \(r>p\), because both f and ϕ have compact supports, one can assume that \(\operatorname{supp} (P_{j_{0}}f-f)\subseteq I\) with \(|I|\leq H\). Then the Hölder inequality shows
$$ \Vert P_{{j_{0}}}f-f \Vert _{p}^{p}= \int _{I} \bigl\vert P_{{j_{0}}}f(y)-f(y) \bigr\vert ^{p}\,dy\lesssim \Vert P_{{j_{0}}}f-f \Vert _{r}^{p}. $$
Since \(f\in B^{s}_{r,q}(\mathbb{R},L)\), one knows \(\|P_{{j_{0}}}f-f\| _{r}\lesssim 2^{-{j_{0}}s}\). Hence, by the preceding display, \(\|P_{{j_{0}}}f-f\|_{p}^{p} \lesssim 2^{-{j_{0}}sp}\). Note that \(s^{\prime }=s\) for \(r>p\). Then \(\|P_{j_{0}}f-f\|_{p}^{p}\lesssim 2^{-{j_{0}}s^{\prime }p}\). This, together with (12), shows that for \(1\leq p<\infty \),
$$ \Vert P_{{j_{0}}}f-f \Vert _{p}^{p}\lesssim 2^{-{j_{0}}s^{\prime }p}. $$
Next, one estimates \(\mathbf{E}\|\hat{f}^{\mathrm{lin}}_{n}-P_{{j_{0}}}f\|^{p} _{p}\). It is easy to see that
$$ \hat{f}^{\mathrm{lin}}_{n}-P_{{j_{0}}}f=\sum _{k\in K_{0}}(\hat{\alpha }_{ {j_{0}},k}-\alpha _{{j_{0}},k}) \phi _{{j_{0}},k} $$
by the definitions of \(\hat{f}^{\mathrm{lin}}_{n}\) and \(P_{{j_{0}}}f\). Furthermore,
$$ \bigl\Vert \hat{f}^{\mathrm{lin}}_{n}-P_{{j_{0}}}f \bigr\Vert ^{p}_{p}\lesssim 2^{{j_{0}}p( \frac{1}{2}-\frac{1}{p})} \sum _{k\in K_{0}} \vert \hat{\alpha }_{{j_{0}},k}- \alpha _{{j_{0}},k} \vert ^{p} $$
due to Theorem 1.2. Let \(|K_{0}|\) denote the number of elements in \(K_{0}\). Then \(|K_{0}|\sim 2^{j_{0}}\), because \(K_{0}:=\{k\in \mathbb{Z}, \operatorname{supp} f\cap \operatorname{supp} \phi _{{j_{0}},k}\neq \emptyset \}\) and f, ϕ have compact supports. This, together with Lemma 2.2, leads to
$$ \mathbf{E} \bigl\Vert \hat{f}^{\mathrm{lin}}_{n}-P_{{j_{0}}}f \bigr\Vert ^{p}_{p} \lesssim 2^{\frac{ {j_{0}}p}{2}} \mathbf{E} \vert \hat{\alpha }_{{j_{0}},k}-\alpha _{{j_{0}},k} \vert ^{p} \lesssim \biggl(\frac{2^{{j_{0}}}}{n}\biggr)^{\frac{p}{2}}. $$
Substituting (13) and (14) into (11), one obtains
$$ \mathbf{E} \bigl\Vert \hat{f}^{\mathrm{lin}}_{n}-f \bigr\Vert _{p}^{p} \lesssim \biggl( \frac{2^{{j_{0}}}}{n} \biggr)^{\frac{p}{2}}+2^{-{j_{0}}s^{\prime }p}. $$
Taking \(2^{{j_{0}}}\sim n^{\frac{1}{2s^{\prime }+1}}\), the desired conclusion follows. □
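The choice of \(j_{0}\) balances the stochastic term \((2^{j_{0}}/n)^{p/2}\) against the approximation term \(2^{-j_{0}s'p}\). The sketch below checks this balance numerically; the function names are hypothetical, the level \(j_{0}\) is treated as a real number for simplicity, and all constants hidden in \(\lesssim \) are set to 1.

```python
import math


def s_prime(s, r, p):
    """s' = s - (1/r - 1/p)_+ as in Theorem 2.1."""
    return s - max(1.0 / r - 1.0 / p, 0.0)


def balanced_level(n, s, r, p):
    """Real-valued j0 solving 2^{j0} = n^{1/(2s'+1)}."""
    return math.log2(n) / (2.0 * s_prime(s, r, p) + 1.0)


def risk_terms(n, j0, s, r, p):
    """Return (stochastic term, approximation term) of the risk bound, constants omitted."""
    sp = s_prime(s, r, p)
    stochastic = (2.0 ** j0 / n) ** (p / 2.0)
    approximation = 2.0 ** (-j0 * sp * p)
    return stochastic, approximation


def linear_rate(n, s, r, p):
    """The rate n^{-s'p/(2s'+1)} stated in Theorem 2.1."""
    sp = s_prime(s, r, p)
    return n ** (-sp * p / (2.0 * sp + 1.0))


n, s, r, p = 1024, 2.0, 2.0, 2.0
j0 = balanced_level(n, s, r, p)
print(j0, risk_terms(n, j0, s, r, p), linear_rate(n, s, r, p))
```

For these example values both terms and the stated rate coincide, which is the balance the proof exploits.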

3 Nonlinear estimators

In this part, we will give a nonlinear wavelet estimator for \(f(x)\), which is better than the linear one in some cases. The nonlinear (hard thresholding) wavelet estimator is defined as follows:
$$ \hat{f}^{\mathrm{non}}_{n}(y):=\sum _{k\in K_{0}}\hat{\alpha }_{j_{0},k} \phi _{j_{0},k}(y)+ \sum_{j=j_{0}}^{j_{1}}\sum _{k\in K _{j}}\hat{\beta }_{j,k}^{*}\psi _{j,k}(y). $$
Here \(K_{0}=\{k\in \mathbb{Z}, \operatorname{supp} f\cap \operatorname{supp} \phi _{j_{0},k}\neq \emptyset \}\), \(K_{j}=\{k\in \mathbb{Z}, \operatorname{supp} f \cap \operatorname{supp} \psi _{j,k}\neq \emptyset \}\),
$$ \hat{\alpha }_{j_{0},k}=\frac{1}{n}\sum _{i=1}^{n}\phi _{j_{0},k}(X _{i}) \quad \text{and} \quad \hat{\beta }_{j,k}= \frac{1}{n}\sum_{i=1}^{n}\psi _{j,k}(X_{i}) $$
where \(\hat{\beta }_{j,k}^{*}=\hat{\beta }_{j,k}\mathcal{X}_{\{| \hat{\beta }_{j,k}|>\lambda \}}\) with threshold \(\lambda =c\sqrt{\frac{j}{n}}\), \(\mathcal{X}_{A}\) denotes the indicator function of the set A, and the constant c is determined later by s, r, p and L.
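The hard-thresholding rule keeps an empirical wavelet coefficient only when it exceeds the level-dependent threshold \(c\sqrt{j/n}\). A minimal sketch follows, assuming the Haar wavelet for illustration (it does not meet the smoothness assumption on ϕ and ψ in this paper) and treating the constant `c` as a user-supplied input; all function names are hypothetical.

```python
import math


def haar_psi(x):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def beta_hat(sample, j, k):
    """Empirical coefficient (1/n) * sum_i psi_{j,k}(X_i), psi_{j,k}(x) = 2^{j/2} psi(2^j x - k)."""
    scale = 2.0 ** (j / 2)
    return sum(scale * haar_psi((2 ** j) * x - k) for x in sample) / len(sample)


def hard_threshold(coef, j, n, c):
    """Keep coef if |coef| > c * sqrt(j / n); otherwise set it to zero."""
    lam = c * math.sqrt(j / n)
    return coef if abs(coef) > lam else 0.0


sample = [0.1, 0.2, 0.6, 0.9]
b = beta_hat(sample, j=1, k=0)
print(b, hard_threshold(b, j=1, n=len(sample), c=1.0))
```

Small coefficients are mostly noise of order \(n^{-1/2}\), so discarding them removes variance at the cost of a bias that is controlled in the proof of Theorem 3.1.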

For the wavelet coefficients, we can get the following lemma whose proof is very similar to that of Lemma 2.2 and so we omit it.

Lemma 3.1

Let \(\hat{\beta }_{j,k}\) be defined by (16). Then under the assumptions of Lemma 2.2,
$$ \mathbf{E} \vert \hat{\beta }_{j,k}-\beta _{j,k} \vert ^{p}\lesssim n^{- \frac{p}{2}} $$
for \(1\leq p<\infty \) and \(2^{j}\leq n\).

To prove Lemma 3.3, we need an important inequality.

Lemma 3.2

(Bernstein’s inequality)

Let \(X_{1},\ldots ,X_{n}\) be a sequence of ND random variables such that \(\mathbf{E}(X_{i})=0\), \(\mathbf{E}(X^{2}_{i})=\sigma ^{2}\) and \(|X_{i}|\leq M<\infty \) (\(i=1,\dots ,n\)). Then for each \(v>0\),
$$ \mathbb{P}\Biggl( \Biggl\vert \frac{1}{n}\sum _{i=1}^{n}X_{i} \Biggr\vert >v\Biggr)\leq 2\exp \biggl(-\frac{nv ^{2}}{2(\sigma ^{2}+\frac{vM}{3})} \biggr). $$
The above inequality is well known when \(X_{1}, \ldots , X_{n}\) are independent; see Theorem C.1 on page 241 in [9]. Checking the details shows that the same inequality holds for ND samples. In fact, because Theorem C.1 is a direct corollary of Lemma C.1 (page 239), it suffices to prove that lemma for the ND case. Note that
$$ \mathbf{E}\Biggl[\exp \Biggl(\sum_{i=1}^{n}tX_{i} \Biggr)\Biggr]=\prod_{i=1}^{n}\mathbf{E} \bigl[ \exp (tX_{i})\bigr] $$
for an independent sample \(X_{1}, \ldots ,X _{n}\), while
$$ \mathbf{E}\Biggl[\exp \Biggl(\sum_{i=1}^{n}tX_{i} \Biggr)\Biggr]\leq \prod_{i=1}^{n}\mathbf{E} \bigl[ \exp (tX_{i})\bigr] $$
for ND samples, according to Theorem 1.3. Then we only need to replace the equality
$$ \exp (-{\lambda }t)\mathbf{E}\Biggl[\exp \Biggl(\sum _{i=1}^{n}tX_{i}\Biggr)\Biggr]=\exp \Biggl\{ -\Biggl[ {\lambda } t-\sum_{i=1}^{n}\log \mathbf{E}\bigl(e^{tX_{i}}\bigr)\Biggr]\Biggr\} $$
by the inequality
$$ \exp (-{\lambda }t)\mathbf{E}\Biggl[\exp \Biggl(\sum _{i=1}^{n}tX_{i}\Biggr)\Biggr]\leq \exp \Biggl\{ -\Biggl[{\lambda } t-\sum_{i=1}^{n} \log \mathbf{E}\bigl(e^{tX_{i}}\bigr)\Biggr]\Biggr\} $$
on page 240 (lines 8–9), in order to complete the proof of Lemma C.1 when \(X_{1}, \ldots , X_{n}\) are ND.
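For a sense of scale, the right-hand side of Bernstein's inequality can be evaluated directly. The helper below is a hypothetical illustration that simply computes the stated bound \(2\exp (-\frac{nv^{2}}{2(\sigma ^{2}+vM/3)})\); it makes no claim about how tight the bound is for any particular ND sample.

```python
import math


def bernstein_bound(n, v, sigma2, M):
    """Right-hand side of Bernstein's inequality: 2 * exp(-n v^2 / (2 (sigma^2 + v M / 3)))."""
    return 2.0 * math.exp(-n * v * v / (2.0 * (sigma2 + v * M / 3.0)))


# The bound decays exponentially in the sample size n for a fixed deviation v.
for n in (100, 1000, 10000):
    print(n, bernstein_bound(n, v=0.1, sigma2=1.0, M=1.0))
```

For small n the bound can exceed 1 and is then uninformative; it becomes useful once \(nv^{2}\) is large compared with \(\sigma ^{2}+vM/3\).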

Lemma 3.3

Let \(\hat{\beta }_{j,k}\) be given by (16). Under the assumptions of Lemma 2.2, if \(j2^{j}\leq n\), then for each \(\omega >0\) there exists \(c>0\) such that
$$ \mathbb{P}\biggl( \vert \hat{\beta }_{j,k}-\beta _{j,k} \vert >\lambda =c\sqrt{ \frac{j}{n}}\biggr)\lesssim 2^{-\omega j}. $$


Proof

It is easy to see that
$$ \vert \hat{\beta }_{j,k}-\beta _{j,k} \vert = \frac{1}{n} \Biggl\vert \sum_{i=1}^{n} \bigl[ \psi _{j,k}(X_{i})-\beta _{j,k}\bigr] \Biggr\vert . $$
Hence
$$ I:=\mathbb{P}\bigl( \vert \hat{\beta }_{j,k}-\beta _{j,k} \vert >\lambda \bigr) =\mathbb{P}\Biggl( \frac{1}{n} \Biggl\vert \sum _{i=1}^{n}\bigl[\psi _{j,k}(X_{i})- \beta _{j,k}\bigr] \Biggr\vert > \lambda \Biggr). $$
In order to estimate I, denote \(\eta _{i}:=\psi _{j,k}(X_{i})-\beta _{j,k}\) (\(i=1,2,\ldots ,n\)). Then
$$ I=\mathbb{P}\Biggl(\frac{1}{n} \Biggl\vert \sum _{i=1}^{n}\eta _{i} \Biggr\vert >\lambda \Biggr). $$
Since ψ is a function of BV, \(\psi :=\tilde{\psi }-\bar{\psi }\), where ψ̃ and ψ̄ are bounded, nonnegative and nondecreasing functions. Denote
$$ \tilde{\beta }_{j,k}:= \int \tilde{\psi }_{j,k}(x)f(x)\,dx, \qquad \bar{\beta }_{j,k}:= \int \bar{\psi }_{j,k}(x)f(x)\,dx, $$
$$ \tilde{\eta }_{i}:=\tilde{\psi }_{j,k}(X_{i})- \tilde{\beta }_{j,k}, \qquad \bar{\eta }_{i}:=\bar{\psi }_{j,k}(X_{i})-\bar{\beta }_{j,k}. $$
Then \(\beta _{j,k}=\tilde{\beta }_{j,k}-\bar{\beta }_{j,k}\), \(\eta _{i}= \tilde{\eta }_{i}-\bar{\eta }_{i}\) and
$$\begin{aligned} I =&\mathbb{P} \Biggl(\frac{1}{n} \Biggl\vert \sum_{i=1}^{n}(\tilde{\eta } _{i}- \bar{\eta }_{i}) \Biggr\vert >{\lambda } \Biggr) \\ \leq &\mathbb{P} \Biggl(\frac{1}{n} \Biggl\vert \sum _{i=1}^{n} \tilde{\eta }_{i} \Biggr\vert >\frac{\lambda }{2} \Biggr) +\mathbb{P} \Biggl(\frac{1}{n} \Biggl\vert \sum_{i=1}^{n}\bar{\eta }_{i} \Biggr\vert >\frac{ \lambda }{2} \Biggr). \end{aligned}$$
Note that \(\tilde{\eta }_{1}, \ldots , \tilde{\eta }_{n}\) are ND thanks to the monotonicity of ψ̃ and Theorem 1.3. On the other hand, \(\mathbf{E}\tilde{\eta }_{i}=0\), \(\mathbf{E}(\tilde{\eta }_{i})^{2} \lesssim 1\) and \(|\tilde{\eta }_{i}|\lesssim 2^{\frac{j}{2}}\). Using Bernstein’s inequality, one obtains that
$$\begin{aligned} \mathbb{P} \Biggl(\frac{1}{n} \Biggl\vert \sum _{i=1}^{n}\tilde{\eta }_{i} \Biggr\vert > \frac{ \lambda }{2}=\frac{c}{2}\sqrt{\frac{j}{n}} \Biggr)\leq 2\exp \biggl(-\frac{c ^{2}j}{C(1+c\sqrt{\frac{j2^{j}}{n}})} \biggr) \end{aligned}$$
for some fixed constant \(C>0\). Due to \(j2^{j}\leq n\), one can take \(c>0\) such that \(\frac{c^{2}}{C(1+c)}\geq \omega \) and
$$ \mathbb{P} \Biggl(\frac{1}{n} \Biggl\vert \sum _{i=1}^{n}\tilde{\eta }_{i} \Biggr\vert >\frac{ \lambda }{2} \Biggr)\lesssim 2^{-\omega j}. $$
Similarly, \(\mathbb{P} (\frac{1}{n} |\sum_{i=1}^{n}\bar{ \eta }_{i}|>\frac{\lambda }{2} )\lesssim 2^{-\omega j}\). This, with (18) and (17), leads to
$$ I=\mathbb{P} \Biggl(\frac{1}{n} \Biggl\vert \sum _{i=1}^{n}\eta _{i} \Biggr\vert >\lambda \Biggr)\lesssim 2^{-\omega j}. $$
The desired conclusion follows. □
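The proof only asserts that a suitable c exists. One explicit choice solves \(c^{2}=\omega C(1+c)\) for its positive root. The sketch below is illustrative: the constant `C` stands for the unspecified fixed constant from the Bernstein step and must be supplied by the user, and the function names are hypothetical.

```python
import math


def threshold_constant(omega, C):
    """Positive root of c^2 - omega*C*c - omega*C = 0, so that c^2 / (C (1 + c)) = omega."""
    a = omega * C
    return 0.5 * (a + math.sqrt(a * a + 4.0 * a))


def tail_bound(j, n, c, C):
    """The bound 2 * exp(-c^2 j / (C (1 + c sqrt(j 2^j / n)))) from the proof."""
    return 2.0 * math.exp(-c * c * j / (C * (1.0 + c * math.sqrt(j * 2.0 ** j / n))))


omega, C, n = 2.0, 3.0, 4096
c = threshold_constant(omega, C)
for j in range(1, 9):  # every level with j * 2^j <= n = 4096
    print(j, tail_bound(j, n, c, C), 2.0 * 2.0 ** (-omega * j))
```

Since \(j2^{j}\leq n\) gives \(\sqrt{j2^{j}/n}\leq 1\), the exponent is at most \(-\omega j\), and \(e^{-\omega j}\leq 2^{-\omega j}\) yields the claimed decay.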

Theorem 3.1

Let \(f(x)\in B^{s}_{r,q}( \mathbb{R},L)\) (\(s>\frac{1}{r}\), \(r, q \geq 1\)), and let \(\hat{f}^{\mathrm{non}} _{n}\) be defined by (15). Under the assumptions of Lemma 2.2, for each \(1\leq p<\infty \), \(s^{\prime }:=s-(\frac{1}{r}-\frac{1}{p})_{+}\) and \(x_{+}=\max (x, 0)\), there exist \(\theta _{i}\in \mathbb{R}\) (\(i=1, 2, 3\)) such that
$$ \sup _{f\in B^{s}_{r,q}(\mathbb{R},L)}\mathbf{E} \bigl\Vert \hat{f}^{\mathrm{non}} _{n}-f \bigr\Vert _{p}^{p} \lesssim \textstyle\begin{cases} (\ln n)^{{\theta }_{1}}n^{-\frac{sp}{2s+1}} , & \frac{p}{2s+1}< r< p, \\ (\ln n)^{\theta _{2}}(\frac{\ln n}{n})^{\frac{s'p}{2(s-1/r)+1}} , & r= \frac{p}{2s+1}, \\ (\ln n)^{\theta _{3}}(\frac{\ln n}{n})^{\frac{s'p}{2(s-1/r)+1}}, & r< \frac{p}{2s+1}. \end{cases} $$


Proof

Clearly,
$$ \hat{f}_{n}^{\mathrm{non}}-f=\bigl(\hat{f}^{\mathrm{lin}}_{n}-P_{{j_{0}}}f \bigr)+(P_{j_{1}+1}f-f)+ \sum_{j=j_{0}}^{j_{1}} \sum_{k\in K_{j}}\bigl(\hat{\beta } ^{*}_{jk}- \beta _{jk}\bigr)\psi _{jk}. $$
Then
$$ \mathbf{E} \bigl\Vert \hat{f}_{n}^{\mathrm{non}}-f \bigr\Vert ^{p}_{p}\lesssim T_{1}+T_{2}+T _{3}, $$
where \(T_{1}:=\mathbf{E}\|\hat{f}^{\mathrm{lin}}_{n}-P_{{j_{0}}}f\|^{p}_{p}\), \(T _{2}:=\|P_{j_{1}+1}f-f\|^{p}_{p}\) and \(T_{3}:=\mathbf{E}\|\sum_{j=j_{0}}^{j_{1}}\sum_{k\in K_{j}}(\hat{\beta }^{*} _{jk}-\beta _{jk})\psi _{jk}\|^{p}_{p}\). By (13) and (14),
$$ T_{1}\lesssim \biggl(\frac{2^{{j_{0}}}}{n} \biggr)^{\frac{p}{2}} \quad \text{and} \quad T_{2}\lesssim 2^{-j_{1}s'p}. $$
For estimating \(T_{3}\), one uses Minkowski and Jensen’s inequalities to get
$$ \Biggl\Vert \sum_{j=j_{0}}^{j_{1}}\sum _{k\in K_{j}}\bigl(\hat{\beta } ^{*}_{j,k}-\beta _{j,k}\bigr)\psi _{j,k} \Biggr\Vert ^{p}_{p} \leq (j_{1}-j_{0}+1)^{p-1} \sum _{j=j_{0}}^{j_{1}} \biggl\Vert \sum _{k\in K_{j}}\bigl(\hat{\beta }^{*}_{j,k}- \beta _{j,k}\bigr)\psi _{j,k} \biggr\Vert ^{p}_{p}. $$
This, together with Theorem 1.2, leads to
$$ T_{3}\leq (j_{1}-j_{0}+1)^{p-1} \mathbf{E}\sum_{j=j_{0}}^{j_{1}}2^{j( \frac{p}{2}-1)} \biggl(\sum_{k\in K_{j}} \bigl\vert \hat{\beta }^{*}_{j,k}-\beta _{j,k} \bigr\vert ^{p} \biggr). $$
Since \(\hat{\beta }^{*}_{j,k}=\delta ^{H}(\hat{\beta }_{j,k},\lambda )\),
$$\begin{aligned} \bigl\vert \hat{\beta }^{*}_{j,k}-\beta _{j,k} \bigr\vert ^{p} =& \vert \hat{\beta }_{j,k}-\beta _{j,k} \vert ^{p}[\mathcal{X}_{\{ \vert \hat{\beta }_{j,k} \vert >\lambda , \vert \beta _{j,k} \vert < \frac{\lambda }{2}\}} + \mathcal{X}_{\{ \vert \hat{\beta }_{j,k} \vert >\lambda , \vert \beta _{j,k} \vert \geq \frac{\lambda }{2}\}}] \\ &{}+ \vert \beta _{j,k} \vert ^{p}[\mathcal{X}_{\{ \vert \hat{\beta }_{j,k} \vert \leq \lambda , \vert \beta _{j,k} \vert >2\lambda \}} +\mathcal{X}_{\{ \vert \hat{\beta }_{j,k} \vert \leq \lambda , \vert \beta _{j,k} \vert \leq 2\lambda \}}]. \end{aligned}$$
Consequently,
$$\begin{aligned} T_{3} \lesssim & (j_{1}-j_{0}+1)^{p-1} \Biggl\{ \mathbf{E} \sum_{j=j _{0}}^{j_{1}}2^{j(\frac{p}{2}-1)} \sum_{k\in K_{j}} \vert \hat{\beta }_{j,k}-\beta _{j,k} \vert ^{p}[\mathcal{X}_{\{ \vert \hat{\beta }_{j,k} \vert > \lambda , \vert \beta _{j,k} \vert < \frac{\lambda }{2}\}} \\ &{}+\mathcal{X}_{\{ \vert \hat{\beta }_{j,k} \vert >\lambda , \vert \beta _{j,k} \vert \geq \frac{ \lambda }{2}\}}]+\mathbf{E}\sum_{j=j_{0}}^{j_{1}}2^{j( \frac{p}{2}-1)} \sum_{k\in K_{j}} \vert \beta _{j,k} \vert ^{p}[\mathcal{X} _{\{ \vert \hat{\beta }_{j,k} \vert \leq \lambda , \vert \beta _{j,k} \vert > 2\lambda \}} \\ &{}+\mathcal{X}_{\{ \vert \hat{\beta }_{j,k} \vert \leq \lambda , \vert \beta _{j,k} \vert \leq 2\lambda \}}]\Biggr\} . \end{aligned}$$
When \(|\hat{\beta }_{jk}|>\lambda \) and \(|\beta _{jk}|< \frac{\lambda }{2}\), one has \(|\hat{\beta }_{jk}-\beta _{jk}|\geq |\hat{\beta }_{jk}|-|\beta _{jk}|> \frac{\lambda }{2}\), so that
$$ \mathcal{X}_{\{|\hat{\beta }_{jk}|>\lambda ,|\beta _{jk}| < \frac{\lambda }{2}\}} \leq \mathcal{X}_{\{|\hat{\beta }_{jk}-\beta _{jk}| >\frac{\lambda }{2}\}}. $$
Similarly, when \(|\hat{\beta }_{jk}|\leq \lambda \) and \(|\beta _{jk}|> 2\lambda \), \(|\hat{\beta }_{jk}|\leq \lambda <\frac{|\beta _{jk}|}{2}\). Hence,
$$ \vert \hat{\beta }_{jk}-\beta _{jk} \vert \geq \vert \beta _{jk} \vert - \vert \hat{\beta }_{jk} \vert > \frac{ \vert \beta _{jk} \vert }{2}> \lambda \quad \text{and} \quad \vert \beta _{jk} \vert < 2 \vert \hat{\beta }_{jk}-\beta _{jk} \vert . $$
Therefore,
$$ \vert \beta _{jk} \vert ^{p}\mathcal{X}_{\{ \vert \hat{\beta }_{jk} \vert \leq \lambda , \vert \beta _{jk} \vert > 2 \lambda \}}\lesssim \vert \hat{\beta }_{jk}-\beta _{jk} \vert ^{p} \mathcal{X}_{\{ \vert \hat{\beta }_{jk}-\beta _{jk} \vert >\frac{\lambda }{2}\}}. $$
Then (22) reduces to
$$ T_{3}\lesssim T_{31}+T_{32}+T_{33}, $$
where
$$\begin{aligned}& T_{31}:=(j_{1}-j_{0}+1)^{p-1}\mathbf{E} \sum_{j=j_{0}}^{j_{1}}2^{j( \frac{p}{2}-1)}\sum _{k\in K_{j}} \vert \hat{\beta }_{j,k}-\beta _{j,k} \vert ^{p} \mathcal{X}_{\{ \vert \hat{\beta }_{j,k}-\beta _{j,k} \vert >\frac{\lambda }{2}\}}, \\& T_{32}:=(j_{1}-j_{0}+1)^{p-1}\mathbf{E} \sum_{j=j_{0}}^{j_{1}}2^{j( \frac{p}{2}-1)}\sum _{k\in K_{j}} \vert \hat{\beta }_{j,k}-\beta _{j,k} \vert ^{p} \mathcal{X}_{\{ \vert \beta _{j,k} \vert \geq \frac{\lambda }{2}\}} \end{aligned}$$
and \(T_{33}:=(j_{1}-j_{0}+1)^{p-1} \sum_{j=j_{0}}^{j_{1}}2^{j( \frac{p}{2}-1)}\sum_{k\in K_{j}}|\beta _{j,k}|^{p}\mathcal{X} _{\{|\beta _{j,k}| \leq 2\lambda \}}\).
In order to estimate \(T_{31}\), let \(q, q'>1\) satisfy \(\frac{1}{q}+ \frac{1}{q'}=1\). Then Hölder’s inequality shows that
$$\begin{aligned} T_{31} \leq &(j_{1}-j_{0}+1)^{p-1}\sum _{j=j_{0}}^{j_{1}}2^{j( \frac{p}{2}-1)}\sum _{k\in K_{j}}\bigl[\mathbf{E} \vert \hat{\beta }_{j,k}- \beta _{j,k} \vert ^{qp}\bigr]^{\frac{1}{q}} \bigl[ \mathbf{E}(\mathcal{X}_{\{ \vert \hat{\beta }_{j,k}-\beta _{j,k} \vert > \frac{\lambda }{2}\}})^{q'}\bigr]^{ \frac{1}{q'}} \\ \leq & (j_{1}-j_{0}+1)^{p-1}\sum _{j=j_{0}}^{j_{1}}2^{j( \frac{p}{2}-1)}\sum _{k\in K_{j}}\bigl(\mathbf{E} \vert \hat{\beta }_{j,k}- \beta _{j,k} \vert ^{qp}\bigr)^{\frac{1}{q}} \biggl[P\biggl( \vert \hat{\beta }_{j,k}-\beta _{j,k} \vert > \frac{ \lambda }{2}\biggr)\biggr]^{\frac{1}{q'}}. \end{aligned}$$
This, together with Lemmas 3.1 and 3.3, leads to
$$\begin{aligned} T_{31} \lesssim & (j_{1}-j_{0}+1)^{p-1} \sum_{j=j_{0}}^{j_{1}}2^{j( \frac{p}{2}-1)}2^{j}n^{-\frac{p}{2}}2^{-\frac{\omega j}{q'}}=(j_{1}-j _{0}+1)^{p-1}n^{-\frac{p}{2}}\sum _{j=j_{0}}^{j_{1}}2^{j( \frac{p}{2}-\frac{\omega }{q'})} \\ \lesssim & (j_{1}-j_{0}+1)^{p-1}n^{-\frac{p}{2}}2^{j_{0}(\frac{p}{2}-\frac{ \omega }{q'})} \leq (j_{1}-j_{0}+1)^{p-1}n^{-\frac{p}{2}}2^{\frac{j _{0}p}{2}} \end{aligned}$$
by choosing ω such that \(\frac{p}{2}<\frac{\omega }{q'}\).
It is easy to see that \(\|\beta _{j\cdot }\|_{r}\lesssim 2^{-j(s+ \frac{1}{2}-\frac{1}{r})}\) thanks to Theorem 1.1. Combining this with Lemma 3.1 and \(\mathcal{X}_{\{|\beta _{j,k}|\geq \frac{\lambda }{2}\}} \leq (\frac{|\beta _{j,k}|}{\frac{\lambda }{2}})^{r}\), one has
$$\begin{aligned} T_{32} \lesssim &(j_{1}-j_{0}+1)^{p-1} \sum_{j=j_{0}}^{j_{1}} 2^{j( \frac{p}{2}-1)} \sum _{k\in K_{j}}n^{-\frac{p}{2}} \biggl\vert \frac{\beta _{j,k}}{\frac{ \lambda }{2}} \biggr\vert ^{r} \\ \lesssim & (j_{1}-j_{0}+1)^{p-1}n^{-\frac{p}{2}} \sum_{j=j_{0}} ^{j_{1}} \lambda ^{-r}2^{j(\frac{p-r}{2}-rs)}. \end{aligned}$$
Similarly, it can be shown that
$$\begin{aligned} T_{33} \leq & (j_{1}-j_{0}+1)^{p-1} \sum_{j=j_{0}}^{j_{1}}2^{j( \frac{p}{2}-1)}\sum _{k\in K_{j}} \vert \beta _{j,k} \vert ^{p} \biggl(\frac{2 \lambda }{ \vert \beta _{j,k} \vert }\biggr)^{p-r} \\ \lesssim &(j_{1}-j_{0}+1)^{p-1}\sum _{j=j_{0}}^{j_{1}}{\lambda }^{p-r}2^{j(\frac{p-r}{2}-rs)} \end{aligned}$$
due to \(r< p\) and \(\mathcal{X}_{\{|\beta _{j,k}|\leq 2\lambda \}}\leq (\frac{2 \lambda }{|\beta _{j,k}|})^{p-r}\). Now choose \(j_{0}\) and \(j_{1}\) such that
$$ 2^{j_{0}}\sim \textstyle\begin{cases} [(\ln n)^{\frac{p-r}{r}}n]^{\frac{1}{2s+1}}, & r>\frac{p}{2s+1}, \\ n^{\frac{1-2/p}{2(s-1/r)+1}} , & r\leq \frac{p}{2s+1}, \end{cases}\displaystyle \quad \text{and}\quad 2^{j_{1}} \sim \textstyle\begin{cases} n^{\frac{s}{s'(2s+1)}}, & r>\frac{p}{2s+1}, \\ (n/\ln n)^{\frac{1}{2(s-1/r)+1}} , & r\leq \frac{p}{2s+1}. \end{cases} $$
Then \(j_{0}< j_{1}\), \(j_{1}-j_{0}\sim \ln n\) and for \(j_{0}\leq j\leq j _{1}\), \(\lambda :=c\sqrt{\frac{j}{n}}\sim c\sqrt{\frac{\ln n}{n}}\). Moreover, (24) and (25) reduce to
$$ T_{32}\lesssim (j_{1}-j_{0}+1)^{p-1}n^{\frac{r-p}{2}}( \ln n)^{- \frac{r}{2}} \bigl[2^{j_{0}\xi }\mathcal{X}_{\{\xi < 0\}} +(j_{1}-j_{0}+1) \mathcal{X}_{\{\xi =0\}}+2^{j_{1}\xi } \mathcal{X}_{\{\xi >0\}}\bigr] $$
and
$$ T_{33}\lesssim (j_{1}-j_{0}+1)^{p-1} \biggl(\frac{\ln n}{n}\biggr)^{\frac{p-r}{2}}\bigl[2^{j _{0}\xi } \mathcal{X}_{\{\xi < 0\}}+ (j_{1}-j_{0}+1) \mathcal{X}_{\{ \xi =0\}}+2^{j_{1}\xi }\mathcal{X}_{\{\xi >0\}}\bigr], $$
where \(\xi =\frac{p-r}{2}-rs\).
Note that \(\xi \geq 0\) holds if and only if \(r\leq \frac{p}{2s+1}\). Then substituting (26) into (23), (27) and (28), one obtains
$$ T_{3}\lesssim T_{31}+T_{32}+T_{33} \lesssim \textstyle\begin{cases} (\ln n)^{{\theta }_{1}}n^{-\frac{sp}{2s+1}} , & \frac{p}{2s+1}< r< p, \\ (\ln n)^{\theta _{2}}(\frac{\ln n}{n})^{\frac{s'p}{2(s-1/r)+1}} , & r= \frac{p}{2s+1}, \\ (\ln n)^{\theta _{3}}(\frac{\ln n}{n})^{\frac{s'p}{2(s-1/r)+1}}, & r< \frac{p}{2s+1}. \end{cases} $$
Similarly, it is easy to check that
$$ T_{1}+T_{2}\lesssim \textstyle\begin{cases} n^{-\frac{sp}{2s+1}} , & \frac{p}{2s+1}< r< p, \\ (\frac{\ln n}{n})^{\frac{s'p}{2(s-1/r)+1}}, & r\leq \frac{p}{2s+1} \end{cases} $$
by (26) and (21). Finally, the desired conclusion (19) follows from (20), (29), and (30). □
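To get a feel for the resolution levels used in the proof, the following minimal Python sketch evaluates the choices of \(j_{0}\), \(j_{1}\) and \(\lambda \) for concrete values of \(n\), \(s\), \(r\), \(p\). It is an illustration only, not part of the argument. The function names `resolution_levels` and `threshold` are hypothetical, the relation \(\sim \) only fixes the levels up to constants (rounding down is one possible convention), the constant `c = 1.0` is a placeholder, and the sketch assumes \(s'=s-1/r+1/p\) for \(r< p\).

```python
import math


def resolution_levels(n, s, r, p):
    """Return integers (j0, j1) with 2**j0 and 2**j1 of the orders chosen in the proof.

    Assumes r < p and s > 1/r. The symbol ~ in the text hides constants,
    so taking the floor of log2 is only one possible convention.
    """
    s_prime = s - 1.0 / r + 1.0 / p  # assumption: s' = s - 1/r + 1/p when r < p
    log_n = math.log(n)
    if r > p / (2 * s + 1):
        two_j0 = (log_n ** ((p - r) / r) * n) ** (1 / (2 * s + 1))
        two_j1 = n ** (s / (s_prime * (2 * s + 1)))
    else:
        two_j0 = n ** ((1 - 2 / p) / (2 * (s - 1 / r) + 1))
        two_j1 = (n / log_n) ** (1 / (2 * (s - 1 / r) + 1))
    return math.floor(math.log2(two_j0)), math.floor(math.log2(two_j1))


def threshold(j, n, c=1.0):
    """Level-dependent threshold lambda = c * sqrt(j / n); c is a placeholder constant."""
    return c * math.sqrt(j / n)


if __name__ == "__main__":
    n = 10**6
    j0, j1 = resolution_levels(n, s=2, r=1, p=2)
    for j in range(j0, j1 + 1):
        print(j, threshold(j, n))
```

For moderate sample sizes the two levels are close together, since \(j_{1}-j_{0}\sim \ln n\) grows slowly; the relation \(j_{0}< j_{1}\) is an asymptotic statement and may fail for small \(n\) under this rounding convention.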

Remark 3.1

Theorems 2.1 and 3.1 show that our results are consistent with those in [6] for independent samples.

Remark 3.2

In [7], Doosti and Chaubey obtained the convergence rate \(n^{-\frac{s'p}{2s'+1}}\) for ND samples, which is slightly weaker than the rate \(n^{-\frac{sp}{2s+1}}\) in Theorem 3.1 for \(r< p\) (note that \(s'< s\) when \(r< p\)).
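
The comparison of the two rates can be made explicit. The following worked example assumes \(s'=s-1/r+1/p\) for \(r< p\), so that \(s'< s\); the numerical values are chosen purely for illustration. The map \(g(x)=\frac{xp}{2x+1}\) satisfies
$$ g'(x)=\frac{p}{(2x+1)^{2}}>0 \quad (x>0), \qquad \text{hence}\quad \frac{s'p}{2s'+1}=g\bigl(s'\bigr)< g(s)=\frac{sp}{2s+1}. $$
For instance, with \(s=2\), \(r=1\), \(p=2\) one has \(s'=\frac{3}{2}\) and \(\frac{p}{2s+1}=\frac{2}{5}< r< p\), so that
$$ \frac{s'p}{2s'+1}=\frac{3}{4}< \frac{4}{5}=\frac{sp}{2s+1}, $$
that is, the rate \(n^{-3/4}\) is compared with \(n^{-4/5}\) (up to logarithmic factors).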



Acknowledgements

The author would like to thank three referees and the editor for their important comments and suggestions, which substantially improved the manuscript.

Authors’ contributions

JX completed this work. The author read and approved the final manuscript.


Funding

This work is supported by the National Natural Science Foundation of China (Nos. 11626034 and 61602010), and the Projection of Baoji University of Arts and Sciences (No. ZK16051).

Competing interests

The author declares no competing interests.


References

  1. Asadian, N., Fakoor, V., Bozorgnia, A.: Rosenthal’s type inequalities for negatively orthant dependent random variables. J. Iran. Stat. Soc. 5, 69–75 (2006)
  2. Block, H.W., Savits, T.H., Shaked, M.: Some concepts of negative dependence. Ann. Probab. 10, 765–772 (1982)
  3. Bozorgnia, A., Patterson, R.F., Taylor, R.L.: Limit Theorems for ND. University of Georgia, Athens (1993)
  4. Chesneau, C., Dewan, I., Doosti, H.: Wavelet linear density estimation for associated stratified size-biased sample. J. Nonparametr. Stat. 2, 429–445 (2012)
  5. Daubechies, I.: Ten Lectures on Wavelets. SIAM, Philadelphia (1992)
  6. Donoho, D.L., Johnstone, I.M., Kerkyacharian, G., Picard, D.: Density estimation by wavelet thresholding. Ann. Stat. 2, 508–539 (1996)
  7. Doosti, H., Chaubey, Y.P.: Wavelet linear density estimation for negatively dependent random variables. Curr. Dev. Theory Appl. Wavelets 1(1), 57–64 (2007)
  8. Doosti, H., Fakoor, V., Chaubey, Y.P.: Wavelet linear density estimation for negative associated sequences. J. Indian Stat. Assoc. 44, 127–136 (2006)
  9. Härdle, W., Kerkyacharian, G., Picard, D., Tsybakov, A.: Wavelets, Approximations, and Statistical Applications. Lecture Notes in Statistics. Springer, Berlin (1998)
  10. Joag-dev, K., Proschan, F.: Negative association of random variables with application. Ann. Stat. 11, 286–295 (1983)
  11. Kou, J.K., Guo, H.J.: Wavelet density estimation for mixing and size-biased data. J. Inequal. Appl. 2018, 189 (2018)
  12. Liu, Y.M., Xu, J.L.: Wavelet density estimation for negatively associated stratified size-biased sample. J. Nonparametr. Stat. 26, 537–554 (2014)
  13. Newman, C.M.: Asymptotic independence and limit theorems for positively and negatively dependent random variables. Inequa. Stat. Probab. 5, 127–140 (1984)
  14. Sung, S.H.: A note on the complete convergence for weighted sums of negatively dependent random variables. J. Inequal. Appl. 2012, 158 (2012)
  15. Wu, Q.Y.: Complete convergence for negatively dependent sequences of random variables. J. Inequal. Appl. 2010, Article ID 507293 (2010)
  16. Zhang, L.X.: Rosenthal’s inequalities for independent and negatively dependent random variables under sub-linear expectations with applications. Sci. China Math. 59(4), 751–768 (2016)

Copyright information

© The Author(s) 2019

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. School of Mathematics and Information Science, Baoji University of Arts and Sciences, Baoji, P.R. China
