# Jackknife variance estimation for general two-sample statistics and applications to common mean estimators under ordered variances


## Abstract

We study the jackknife variance estimator for a general class of two-sample statistics. As a concrete application, we consider samples with a common mean but possibly different, ordered variances, as arising in various fields such as interlaboratory experiments, field studies, or the analysis of sensor data. Estimators for the common mean under ordered variances typically employ random weights, which depend on the sample means and the unbiased variance estimators. They take different forms depending on whether or not the sample estimators are in agreement with the order constraints, which complicates even basic analyses such as estimating their variance. We propose to use the jackknife, whose consistency is established for general smooth two-sample statistics induced by continuously Gâteaux- or Fréchet-differentiable functionals and, more generally, for asymptotically linear two-sample statistics, allowing us to study a large class of common mean estimators. Furthermore, it is shown that the common mean estimators under consideration satisfy a central limit theorem (CLT). We investigate the accuracy of the resulting confidence intervals by simulations and illustrate the approach by analyzing several data sets.

## Keywords

Common mean · Data science · Central limit theorem · Gâteaux derivative · Fréchet differentiability · Graybill–Deal estimator · Jackknife · Order constraint · Resampling

## 1 Introduction

We study the jackknife variance estimation methodology for a wide class of two-sample statistics, including asymptotically linear statistics and statistics induced by differentiable two-sample statistical functionals, thus extending the well-studied one-sample results; see Efron and Stein (1981), Shao and Wu (1989), and Steland (2015) and the discussion below. Comparing two samples by some statistic is a classical statistical design and is also widely applied to analyze massive data arising in data science, e.g., when exploring such data by analyzing subsets. Our specific motivation comes from the following classical common mean estimation problem: in many applications, several samples of measurements with common mean but typically with different degrees of uncertainty (in the sense of variances) are drawn. This setting generally arises when using competing measurement systems of different quality or if the factor variable defining the samples affects the dispersion. For example, when checking the octane level in gasoline at pump stations, inspectors use cheap hand-held devices to collect many low-precision measurements and send only a few samples to government laboratories for detailed analyses of high precision. In both cases, the same mean octane level is measured, but the variance and the shape of the distribution may differ. The issue of samples with common mean but heterogeneous, ordered variances also arises in big data applications, e.g., when processing data from image sensors, see Degerli (2000) and Lin (2010), or accelerometers, see Cemer (2011), as used in smartphones or specialized measurement systems. Here, the thermo-mechanical (or Brownian) noise represents a major source of noise and depends on temperature. Therefore, in the presence of a constant signal, samples taken under different conditions exhibit different variances, and the order constraint is related to temperature.
It is worth mentioning that the general statistical problem of how to combine estimators from different samples dates back to the works of Fisher (1932), Tippett (1931), and Cochran (1937); cf. the discussion given in Keller and Olkin (2004). In the present article, we study the classical problem of estimating the common mean in the presence of ordered variances and propose to use a two-sample jackknife variance estimator to assess the uncertainty.

The jackknife is easy to use and feasible for big data problems, since its computational costs are substantially lower than those of other techniques such as the bootstrap. Indeed, our simulations indicate that it also provides substantially higher accuracy of confidence intervals than the bootstrap for the common mean estimation problem. We, therefore, extend the jackknife methodology for smooth statistical functionals to a general two-sample framework and establish a new result, which holds as long as the statistic of interest can be approximated by a linear statistic. This result goes beyond the case of smooth statistics induced by continuously differentiable functionals and allows us to treat a large class of common mean estimators.

Since the jackknife has not yet been studied for two-sample settings, we establish its consistency and asymptotic unbiasedness for possibly nonlinear but asymptotically linear two-sample statistics. We introduce a specific new jackknife variance estimator for two samples with possibly unequal sample sizes \(n_1\) and \(n_2\), which is based on \(n_1 + n_2\) leave-one-out replicates. In addition, for equal sample sizes, we study an alternative procedure which generates replicates by leaving out pairs of observations. Both jackknife estimators are shown to be weakly consistent and asymptotically unbiased. These general results allow us to show that the jackknife consistently estimates the variance of a large class of common mean estimators, including many of those proposed in the literature. They are, however, also interesting in their own right: first, because we provide conditions which are easier to verify in cases where the statistic of interest is not induced by a smooth statistical functional, e.g., due to discontinuities as arising in the common mean estimation problem; second, because we provide a proof using elementary arguments, avoiding the calculus of differentiable statistical functionals; and finally, because our approach shows that the pseudo-values are consistent estimates of the summands of the asymptotically equivalent linear statistic. In addition to these results, we also extend the known consistency results for continuously Gâteaux- and Fréchet-differentiable statistical functionals from one-sample to two-sample settings, resulting in a comprehensive treatment of the two-sample jackknife methodology.
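The \(n_1 + n_2\) leave-one-out scheme just described can be sketched in a few lines. This is a minimal illustration of the general idea, not the paper's exact definition; the function name and the Tukey-type scaling of the pseudo-value variance are our choices and may differ from the normalization used later.

```python
import numpy as np

def jackknife_variance_two_sample(x1, x2, statistic):
    """Leave-one-out jackknife variance estimate for a two-sample statistic.

    Produces n1 + n2 replicates by deleting one observation at a time
    from either sample, then applies a Tukey-type scaling (an assumption;
    the paper's exact normalization may differ).
    """
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n = len(x1) + len(x2)
    reps = [statistic(np.delete(x1, i), x2) for i in range(len(x1))]
    reps += [statistic(x1, np.delete(x2, j)) for j in range(len(x2))]
    reps = np.asarray(reps)
    return (n - 1) / n * np.sum((reps - reps.mean()) ** 2)
```

For a linear statistic such as the pooled sample mean, this reduces to the classical identity: the jackknife estimate equals the pooled sample variance divided by \(n\).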

For the common mean estimation problem, which we study in depth as a non-trivial application, several common mean estimators have been discussed in the literature. For unequal variances, the Graybill–Deal (GD) estimator may be used, which weights the sample averages with the inverse unbiased variance estimates. If, however, an order constraint on the variances is imposed, several estimators have been proposed which dominate the GD estimator; see especially Nair (1982), Elfessi and Pal (1992), Chang and Shinozaki (2008), and Chang et al. (2012). Those estimators are given by convex combinations of the sample means with weights that additionally depend on whether the ordering of the sample variances is in agreement with the order constraint on the variances.

When random weights are used, even the calculation of the variance of such a common mean estimator is a concern and has only been studied under the assumption of Gaussian samples for certain special cases such as the GD estimator. As a way out, we propose to employ the jackknife variance estimator of Quenouille (1949) and Tukey (1958), which basically calculates the sample variance of a pseudo-sample obtained from leave-one-out replicates of the statistic of interest. It has been extensively studied for one-sample problems by Efron and Stein (1981) and Shao and Wu (1989), the latter for asymptotically linear statistics, and recently by Steland (2015) for the case of vertically weighted sample averages. Compared to other approaches, the jackknife variance estimator has the advantage of not producing underbiased estimates, cf. (Efron 1982, p. 42) and Efron and Stein (1981). It has also made its way into recent textbooks devoted to computer-age statistics and data science (see Efron and Hastie 2016).

A further issue studied in this paper is the asymptotic distribution of common mean estimators. We show that common mean estimators using random weights are asymptotically normal under fairly weak conditions, which are satisfied by the estimators studied in the literature. Combining this result with the jackknife variance estimators allows us to construct asymptotic confidence intervals and to test statistical hypotheses about the common mean. In Steland (2017), this approach was applied to real data from photovoltaics, compared with other methods, and investigated by data-driven simulations using distributions arising in that field. It was found that confidence intervals based on the proposed methodology have notably higher accuracy in terms of the coverage probability, thus providing a real-world example where the approach improves upon existing ones. In this paper, we broaden these data-driven results by a simulation study investigating distributions with different tail behavior.

The organization of the paper is as follows. Section 2 studies the jackknife variance estimator. Two-sample statistics induced by Gâteaux- and Fréchet-differentiable functionals are studied as well as asymptotically linear two-sample statistics. In Sect. 3, we introduce and review the common mean estimation problem for unequal and ordered variances for two samples and discuss related results from the literature. Section 4 presents the results about the jackknife variance estimation for common mean estimators with random weights and provides a central limit theorem. Finally, Sect. 5 presents the simulations and analyzes three data sets from physics, technology, and social information, to illustrate the approach.

## 2 The jackknife for two-sample statistics

The Quenouille–Tukey jackknife, see Quenouille (1949), Tukey (1958) and Miller (1974), is a simple and effective resampling technique for bias and variance estimation, see also the monograph Efron (1982). For a large class of one-sample statistics, the consistency of the jackknife variance estimator has been studied in depth by Shao and Wu (1989) and recently in Steland (2015). Lee (1991) studied jackknife variance estimation for a one-way random-effects model by simulations. In the present section, we extend the jackknife to quite general two-sample statistics. To the best of our knowledge, the results of this section are new and extend the existing theoretical investigations.

We shall first discuss the case of *asymptotically linear* two-sample statistics considering the cases of unequal and equal sample sizes separately, as it turns out that for equal sample sizes one may define a simpler jackknife variance estimator. Then, we study the case of two-sample statistics induced by a two-sample differentiable statistical functional extending the known results for the one-sample setting.

*asymptotically linear*.

*asymptotic variance of*\(U_n\). By (1), \(T_n\) inherits its asymptotic variance denoted \(\sigma ^2(T)\) from \(L_n\), and thus we obtain

### 2.1 Unequal samples sizes

*i*th observation is denoted by

### Theorem 2.1

*Assume that* \(T_n\) *satisfies* (1), (2), *and* (3). *If, in addition, the remainder term satisfies the second-moment conditions* (8) *and* (9), *as* \(n \rightarrow \infty\), *then the following assertions hold*.

- (i) *For each* \(1 \le i \le n\): $$\begin{aligned} E({\widehat{ \xi }}_i - \xi _i)^2 = o(1), \end{aligned}$$ *where* \(\xi _i = h_1(X_i)\) *if* \(1\le i \le n_1\), *and* \(\xi _i = h_2(X_i)\) *if* \(n_1+1\le i \le n\).
- (ii) \({\widehat{ \sigma }}^2_n(T)\) *is a consistent and asymptotically unbiased estimator for the asymptotic variance* \(\sigma ^2(T)\) *of* \(T_n\), *that is*, $$\begin{aligned} \left| \frac{ {\widehat{ \sigma }}^2_n(T) }{ \sigma ^2(T) } - 1 \right| \rightarrow 0, \end{aligned}$$ *in probability, as* \(\min (n_1,n_2) \rightarrow \infty\), *and* $$\begin{aligned} \left| \frac{ E{\widehat{ \sigma }}^2_n(T) }{ \sigma ^2(T) } - 1 \right| \rightarrow 0, \end{aligned}$$ *as* \(\min (n_1,n_2) \rightarrow \infty\). *The associated jackknife variance estimator of* \({\mathrm{Var\,}}( T_n )\) *shares the above consistency properties.*

### *Remark 2.1*

Observe that examples for which (8) and (9) hold are \(h_i(x) = x - \mu _i\) (arithmetic means) and \(h_i(x) = (x-\mu _i)^2 - \sigma _i^2\) (sample variances, as verified in the Appendix). The conditions on the second moment of the remainder term in (8) and (9) can be interpreted as measures of smoothness of \(T_n\). They have also been employed by Shao and Wu (1989), cf. their Theorem 1, to study jackknife variance estimation for one-sample statistics, but our proof is quite different from the methods of proof used there. In particular, we show that, by virtue of condition (9), the summands of the asymptotic linearization \(L_n\), i.e., the random variables \(h_1(X_i)\), \(i = 1, \dots , n_1\), and \(h_2(X_i)\), \(i = n_1 + 1 ,\dots , n_1+n_2\), can be estimated consistently. To the best of our knowledge, this interesting and useful result has not yet been established in the literature.

### *Proof*

*i*th observation, say \(i \le n_1\), and calculating the statistic from the resulting sample of size \(n-1\), we have the following:

*last* observation of the first sample is omitted. In the same vein, when omitting an arbitrary observation of the second sample, the resulting decomposition of the statistic is equal in distribution to the decomposition obtained when omitting the last observation of the second sample. In particular, the second moment of the remainder term corresponding to \(T_{n,-i}\) does not depend on

*i*. By (1), we can write \(T_n\) as follows:

*i*th observation. This means that \(T_{n,-i} = L_{n,-i} + R_{n,-i}\) with \(L_{n,-i} = \frac{n_1-1}{n-1} \frac{1}{n_1-1} \sum _{j=1, j \not = i}^{n_1} h_1(X_j) + \frac{n_2}{n-1} \frac{1}{n_2} \sum _{j=n_1+1}^{n-1} h_2(X_j)\), if \(1 \le i \le n_1\) and \(L_{n,-i} = \frac{n_1}{n-1} \frac{1}{n_1} \sum _{j=1}^{n_1} h_1(X_j) + \frac{n_2-1}{n-1} \frac{1}{n_2-1} \sum _{j=n_1+1, j \not =i}^{n} h_2(X_j)\), if \(n_1+1 \le i \le n\), and \(R_{n,-i} = T_{n,-i} - L_{n,-i}\) for \(i = 1, \dots , n\). By assumption, \(n^2E(R_{n,-i})^2 = o(1)\) holds for all \(i = 1, \dots , n\). To show the validity of the jackknife variance estimator, we shall focus on the first sample and put \(\xi _i = h_1(X_i)\), \(i = 1, \dots , n_1\). By definition (6), the leave-one-out pseudo-values are then given by

### 2.2 Equal sample sizes \(N = n_1 = n_2\)

*N*pairs:

*N*pairs of random vectors explicit in our notation. Let

*leave-one-pair-out*statistics and define the

*leave-one-pair-out*pseudo-values:

### Theorem 2.2

*For equal sample sizes, the jackknife estimators* (12) *and* (13) *are consistent and asymptotically unbiased.*
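The leave-one-pair-out construction behind Theorem 2.2 deletes the pair \((X_{1i}, X_{2i})\) in each replicate. A minimal sketch (the function name is ours; the exact definition of the estimators (12) and (13) is given in the text):

```python
import numpy as np

def jackknife_variance_paired(x1, x2, statistic):
    """Leave-one-pair-out jackknife for equal sample sizes N = n1 = n2."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    N = len(x1)
    assert len(x2) == N, "leave-one-pair-out requires equal sample sizes"
    # N replicates, each omitting the i-th pair from both samples
    reps = np.array([statistic(np.delete(x1, i), np.delete(x2, i))
                     for i in range(N)])
    return (N - 1) / N * np.sum((reps - reps.mean()) ** 2)
```

For the linear statistic \((\overline{X}_1 + \overline{X}_2)/2\), the pseudo-values reduce to leave-one-out means of \(\xi_i = \frac{1}{2}[X_{1i} + X_{2i}]\), so the estimator coincides with the ordinary one-sample jackknife applied to the \(\xi_i\).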

### *Proof*

*n*. The asymptotic variance of \(T_N\) is now given by the following:

### *Remark 2.2*

*d* jackknife variance estimator. Let \(S_{N,r}\) be the collection of subsets of \(\{ 1, \dots , N \}\) which have size \(r = N-d\). For \(s = \{ i_1, \dots , i_r \} \in S_{N,r}\), let \(T_N^{(s)} = T_N( \xi _{i_1}, \dots , \xi _{i_r} )\). The delete-*d* jackknife variance estimator is then defined by the following:

### Corollary 2.1

*Suppose that* \(N E( R_N^2 ) = o(1)\) *and* \(d = d_N\) *satisfies*

*and* \(r = N-d \rightarrow \infty\). *Then, the delete-d jackknife variance estimator* \({\widehat{ \sigma }}_{J(d)}^2\) *is consistent in the sense that* \(N {\widehat{ \sigma }}_{J(d)}^2 - \sigma ^2 = o_P(1)\), *and asymptotically unbiased in the sense that* \(N E( {\widehat{ \sigma }}_{J(d)}^2 ) - \sigma ^2 = o(1)\), *as* \(N \rightarrow \infty\).

### *Proof*

We have the decomposition \(T_N = L_N + R_N\), where the linear statistic \(L_N\) can be written as \(L_N = \frac{1}{N} \sum _{i=1}^N \xi _i\) with \(\xi _i = \frac{1}{2} [ h_1(X_{1i}) + h_2(X_{2i}) ]\), \(i = 1, \dots , N\). The proof of Theorem 1 in Shao and Wu (1989) uses the representation \(N {\widehat{ \sigma }}_{J(d)}^2 = \frac{N r}{d {N \atopwithdelims ()d}} \sum _s (L(s) - U_s)^2\), where \(L(s) = \frac{1}{r} \sum _{i \in s} \xi _i - \frac{1}{N} \sum _{i=1}^N \xi _i\) and \(U_s = R_N - R_{N,s}\) with \(R_{N,s}\) the remainder term associated with \(T_N^{(s)}\); it then only refers to the properties of *L*(*s*) and \(U_s\) and not to the original definition of \(T_N\). Since the linear term has the same form as in Shao and Wu (1989), the proof carries over directly, and the stated conditions are sufficient to ensure (Shao and Wu 1989, (3.3)), cf. Corollary 1 therein. \(\square\)
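The delete-*d* jackknife of Remark 2.2 can be written down directly for small *N* by enumerating all size-\(r\) subsets; the following sketch (function name ours) uses the scaling \(r/(d \binom{N}{d})\) from the representation in the proof above. In practice one would sample subsets at random rather than enumerate all \(\binom{N}{d}\) of them.

```python
import numpy as np
from itertools import combinations
from math import comb

def delete_d_jackknife_variance(xi, statistic, d):
    """Delete-d jackknife variance estimator, enumerating all subsets.

    Sketch for a one-sample statistic of the values xi; enumerating all
    C(N, N-d) subsets is only feasible for small N and d.
    """
    xi = np.asarray(xi, float)
    N = len(xi)
    r = N - d
    reps = np.array([statistic(xi[list(s)])
                     for s in combinations(range(N), r)])
    # scaling r / (d * C(N, d)) as in the delete-d jackknife
    return r / (d * comb(N, d)) * np.sum((reps - reps.mean()) ** 2)
```

For the sample mean, the estimator equals \(s^2/N\) exactly for every \(d\), so \(d = 1\) and \(d = 2\) agree.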

### 2.3 Two-sample Gâteaux and Fréchet-differentiable functionals

Let us first study the case of statistics induced by sufficiently smooth statistical functionals. Let \({\mathcal {F}}\) be the (convex) set of distribution functions on \({\mathbb {R}}\) and let \({\mathcal {D}}= \{ (G_1,G_2) -(H_1,H_2) : (G_1, G_2), (H_1,H_2) \in {\mathcal {F}}^2 \}\) be the linear space associated with \({\mathcal {F}}^2 = {\mathcal {F}}\times {\mathcal {F}}\). \(\delta _x\), \(x \in {\mathbb {R}}\), denotes the Dirac measure at *x* with distribution function \(\delta _x(z) = {\mathbf {1}}( x \le z )\), \(z \in {\mathbb {R}}\).

Denote by \({\widehat{ F }}_{n_i}^{(i)}(x) = \frac{1}{n_i} \sum _{j=1}^{n_i} {\mathbf {1}}( X_{ij} \le x )\), \(x \in {\mathbb {R}}\), the empirical distribution function of the *i*th sample, \(i = 1, 2\). A functional \(T: {\mathcal {F}}\times {\mathcal {F}}\rightarrow {\mathbb {R}}\) induces a two-sample statistic, namely \(T_n = T( {\widehat{ F }}_{n_1}^{(1)}, {\widehat{ F }}_{n_2}^{(2)} )\). A special case of interest is when *T* is *additive* with respect to the distribution functions, i.e., \(T_n = T_1( {\widehat{ F }}_{n_1}^{(1)} ) + T_2( {\widehat{ F }}_{n_2}^{(2)} )\) for two one-sample functionals \(T_1, T_2 : {\mathcal {F}}\rightarrow {\mathbb {R}}\), which we shall call components in what follows.

We will first consider statistics induced by a Gâteaux-differentiable statistical functional. Here, additional assumptions are required to obtain consistency of the jackknife variance estimator. Fréchet differentiability is a stronger notion and implies the consistency of the jackknife.

### Definition 2.1

*T* is called *continuously Gâteaux differentiable* at \((G_1, G_2)\), if, for any sequence \(t_k \rightarrow 0\) and for all sequences \(( G_1^{(k)}, G_2^{(k)} )\), \(k \ge 1\), with \(\max _{i=1,2} \Vert G_i^{(k)} - G_i \Vert _\infty \rightarrow 0\), as \(k \rightarrow \infty\):

If *T* is additive with (continuously) differentiable components \(T_1\) and \(T_2\), then \(L_{(G_1,G_2)}( H_1, H_2 ) = L^{(1)}_{G_1}(H_1) + L^{(2)}_{G_2}(H_2)\), where \(L^{(i)}_{G_i}\) is the Gâteaux derivative of \(T_i\), \(i = 1, 2\). In general, the linearization of the two-sample statistic \(T( {\widehat{ F }}_{n_1}^{(1)}, {\widehat{ F }}_{n_2}^{(2)} )\) induced by a (continuously) Gâteaux differentiable functional \(T(F_1,F_2)\) is given by the following:

The assumption that *T* is Gâteaux differentiable is too weak to entail a central limit theorem. Therefore, from now on, we assume, as in Shao (1993), where the one-sample setting is studied, that \(T_n = T( {\widehat{ F }}_{n_1}^{(1)}, {\widehat{ F }}_{n_2}^{(2)} )\) is asymptotically normal:

The following lemma shows that, under weak conditions, the linearization \(L_n^{(T)}\) is asymptotically as required in the previous sections and clarifies the relationship between the functions \(\psi _1, \psi _2\) in (16) and the functions \(h_1, h_2\) in (2).

### Lemma 2.1

*Let* \(T_n\) *be a two-sample statistic induced by an additive functional. If* \(E( \psi _i^2( X_{i1} ) ) < \infty\) *and* \(n_i/n \rightarrow \lambda _i\), \(i = 1, 2\), *then*

*where* \(E( \sqrt{n} R_n^{(T)} )^2 = o(1)\), *as* \(n \rightarrow \infty\), *with* \(\psi _i = \lambda _i h_i\), \(i = 1, 2\). *Therefore, one obtains the representation* (2) *when putting* \(h_i(x) = \lambda _i^{-1} \psi _i(x)\), \(x \in {\mathbb {R}}\), \(i = 1, 2\).

### *Proof*

### Theorem 2.3

*Assume that* *T* *is a continuously Gâteaux differentiable two-sample functional with*

*and*

*Then,*

*as* \(n_i \rightarrow \infty\), \(i = 1, 2\), *and therefore*:

*as* \(n \rightarrow \infty\), *provided* \(n_i/n \rightarrow \lambda _i\), \(i = 1, 2\).

### *Proof*

Since *T* is continuously Gâteaux differentiable, we may conclude that

### Corollary 2.2

*Under the conditions of Theorem* 2.3, *the jackknife variance estimator defined in* (7) *is consistent for* \(\sigma ^2(T) = \lambda _1^2 {\mathrm{Var\,}}( h_1( X_{11} )) + \lambda _2^2 {\mathrm{Var\,}}( h_2( X_{21} ))\), *since*

*as* \(n \rightarrow \infty\), *for* \(i = 1, 2\).

### *Proof*

For an additive functional, \(L_{(F_1,F_2)}(H_1, H_2) = L_{F_1}^{(1)}( H_1 ) + L_{F_2}^{(2)}( H_2 )\), such that \(Z_1 = L_{F_1}( \delta _{X_{11}} - F_1 ) = \psi _1( X_{11} )\), and hence, \({\mathrm{Var\,}}( Z_1 ) = {\mathrm{Var\,}}( \psi _1( X_{11} ) )\). As shown in Lemma 2.1, \(\psi _1 = \lambda _1 h_1\), where \(h_1\) is as in (2), leading to \({\widehat{ \tau }}_1^2 {\mathop {\rightarrow }\limits ^{a.s.}} \lambda _1^{-2} {\mathrm{Var\,}}(Z_1) = {\mathrm{Var\,}}( h_1(X_{11} ) )\). \(\square\)

To summarize, under Gâteaux differentiability, the jackknife variance estimator works, provided that asymptotic normality holds. Let us now study the case of a two-sample statistic induced by a Fréchet-differentiable statistical functional.

### Definition 2.2

*T* is called *continuously Fréchet-differentiable* at \((G_1,G_2) \in {\mathcal {F}}\times {\mathcal {F}}\) with respect to \(\rho _1\), if

*T* is Fréchet-differentiable at \((G_1, G_2)\), i.e., there is a linear functional \(L_{(G_1,G_2)}\) on \({\mathcal {D}}\), such that, for all \((C_1,C_2) \in {\mathcal {C}}= \{ A \subset {\mathcal {D}}: A \ \text {bounded} \}\):

Proper examples for the choice of \(\rho _1\), e.g., to ensure that sample quantiles are continuously Fréchet-differentiable with respect to \(\rho _1\), are discussed in Shao (1993).

### Theorem 2.4

*Suppose that* *T* *is a continuously Fréchet-differentiable statistical functional with respect to a metric* \(\rho _1\). *Assume that*

*as* \(n_i \rightarrow \infty\), \(i = 1, 2\), *and*

*a.s., for* \(i = 1, 2\). *Then, the assertions of Theorem* 2.3 *and Corollary* 2.2 *hold true*.

### *Proof*

*T*is continuously Fréchet-differentiable:

## 3 Review of estimation under ordered variances

*random*weights forming a convex combination as follows:

*c*, Kubokawa has given a sufficient condition on \(n_1\) and \(n_2\), so that \({\widehat{ \mu }}(\gamma _{\psi })\) is closer to \(\mu\) than \(\overline{X}_1\); see also Pitman (1937). Such estimators have also been studied by several authors assuming Gaussian samples (see Brown and Cohen (1974) and Bhattacharya (1980), amongst others).

When there is an order restriction between the two variances, Mehta and Gurland (1969) proposed three convex-combination estimators for small samples and compared the efficiencies of the proposed estimators with that of the GD estimator. When order constraints on the variances apply, the question arises whether one can improve upon the above proposals. There is a rich literature on the general problem of estimation for constrained parameter spaces, and we refer the reader to the monograph of van Eeden (2006).

**Assumption**(\(\Gamma\)): \(X_{ij} \sim F_i\), \(1 \le j \le n_i\), \(i = 1,2\), are independent random samples with common means \(\mu = \mu _1 = \mu _2\) and arbitrary variances \(\sigma _1^2\) and \(\sigma _2^2\). The random weights \(\gamma _n = \gamma _n( n_1/n, n_2/n, {\widetilde{ S }}_1^2, {\widetilde{ S }}_2^2, \overline{X}_1, \overline{X}_2 )\) are either of the form:

### *Example 3.1*

### *Remark 3.1*

As discussed above in greater detail, we have in mind the case of ordered variances and our results aim at contributing to the problem of common mean estimation under ordered variances. However, since it turns out that many results also hold for unequal variances without order restriction, we omit the order restriction in Assumption \((\Gamma )\).
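Estimators satisfying Assumption \((\Gamma)\) are convex combinations of the sample means with random weights. As an illustration of this structure only (not one of the order-restricted estimators from the literature), a Graybill–Deal-type weight based on the unbiased variance estimates can be coded as follows; the function name is ours:

```python
import numpy as np

def common_mean_estimate(x1, x2):
    """Convex combination of the sample means with a random weight.

    Uses Graybill-Deal-type inverse-variance weights for illustration;
    the order-restricted estimators replace the weight below by weight
    functions whose form depends on whether the ordering of the sample
    variances agrees with the order constraint.
    """
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2 = len(x1), len(x2)
    s1, s2 = x1.var(ddof=1), x2.var(ddof=1)  # unbiased variance estimates
    gamma = (n1 / s1) / (n1 / s1 + n2 / s2)  # random weight in (0, 1)
    return gamma * x1.mean() + (1.0 - gamma) * x2.mean()
```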

### *Remark 3.2*

Additional assumptions on the underlying distributions will be stated where needed.

### *Remark 3.3*

*N*instead of

*n*. Furthermore, we may and will redefine \(\gamma _N\) as well as \(\gamma ^\le\) and \(\gamma ^>\) to be functions of

## 4 Variance estimation and asymptotic distribution theory for common mean estimators

As already indicated in the introduction, there is a lack of results about the estimation of the variance of the common mean estimators discussed in the literature. For the case of normal populations, Nair (1980) calculated the variance of the GD estimator for two populations and Voinov (1984) extended the result to the case of several samples. Mehta and Gurland (1969) gave formulas for the variances of their common mean estimators. The issue of unbiased estimation of the variance for the GD estimator has been studied in Voinov (1984) and Sinha (1985). All of those results, however, heavily rely on the normal assumption.

Therefore, we propose to use the nonparametric jackknife variance estimator studied in the previous section, which is applicable to a wide class of common mean estimators defined by convex combinations of the sample means with random weights. We shall show that the jackknife is weakly consistent and asymptotically unbiased under fairly weak conditions, without requiring normally distributed observations.

The consistency of the jackknife is now established by invoking the key results of the previous section, namely by proving that the common mean estimator is asymptotically linear with an appropriate remainder term. For simplicity of exposition, we state and prove the result for the case of equal sample sizes. The proof only uses elementary arguments, but it is long and technical. It is, therefore, provided in Appendix B.

### Theorem 4.1

*Suppose that* \(X_{ij} \sim F_i\), \(j = 1, \dots , N\), *are two i.i.d. samples with distribution functions* \(F_1, F_2\) *satisfying* \(E( X_{11}^{12} ) < \infty\), \(E( X_{21}^{12} ) < \infty\). *Then, the following assertions hold*.

- (i)
*We have*$$\begin{aligned} {\widehat{ \mu }}_N(\gamma ) - \mu&= \left[ (\nabla [ \gamma ^\le (\theta ) \mu ] + \nabla [(1-\gamma ^\le (\theta )) \mu ]) {\mathbf {1}}_{\{ \sigma _1^2 < \sigma _2^2 \}} \right. \\&\left. \quad + (\nabla [\gamma ^>(\theta ) \mu ] + \nabla [(1-\gamma ^>(\theta )) \mu ]) {\mathbf {1}}_{\{ \sigma _1^2 > \sigma _2^2 \}} \right] ( {\widehat{ \theta }}_N - \theta ) + R_N^\gamma , \end{aligned}$$*for some remainder term*\(R_N^\gamma\)*with*\(N E(R_N^\gamma )^2 = O(1/N)\)*and*\(N^2 E( R_N - R_{N-1} )^2 = o(1)\),*as*\(N \rightarrow \infty\). - (ii)
*For ordered variances: if* \(\sigma _1^2 < \sigma _2^2\), *it holds that* $$\begin{aligned} {\widehat{ \mu }}_N(\gamma ) - \mu = \gamma (\theta ) ( \overline{X}_1 - \mu ) + (1-\gamma (\theta )) ( \overline{X}_2 - \mu ) + R_N, \end{aligned}$$*and for*\(\sigma _1^2 > \sigma _2^2\):$$\begin{aligned} {\widehat{ \mu }}_N(\gamma ) - \mu = (1-\gamma (\theta )) ( \overline{X}_1 - \mu ) + \gamma (\theta ) ( \overline{X}_2 - \mu ) + R_N, \end{aligned}$$*with*\(N E( R_N^2 ) = O(1/N)\). - (iii)
*The jackknife variance estimator is consistent*:$$\begin{aligned} \frac{{\widehat{ \sigma }}_N^2( {\widehat{ \mu }}_N )}{\sigma ^2(\gamma )} = 1 + o_P(1), \end{aligned}$$*as*\(N \rightarrow \infty\),*and asymptotically unbiased*:$$\begin{aligned} \frac{E( {\widehat{ \sigma }}_N^2( {\widehat{ \mu }}_N ))}{\sigma ^2(\gamma )} = 1 + o(1), \end{aligned}$$*as*\(N \rightarrow \infty\),*and the same applies to*\({\widehat{ \sigma }}_N^2(\gamma )\).

### *Remark 4.1*

### Theorem 4.2

*Under Assumption* \((\Gamma )\) *and* (28), *the common mean estimator* \({\widehat{ \mu }}_n(\gamma )\) *satisfies the central limit theorem*

*as* \(n \rightarrow \infty\), *where the asymptotic variance is given by*

### *Proof*

### *Remark 4.2*

Under the assumptions of Theorem 4.1, the CLT follows directly from the asymptotic representation as a linear statistic shown there.

## 5 Simulations and data analysis

We investigated the proposed method by simulations and analyzed three data sets to illustrate both the method and its wide applicability. The data come from the fields of natural science (physics), technology (quality engineering), and social information.

### 5.1 Simulation study

The simulation study aims at comparing the accuracy of the jackknife with that of the bootstrap, taking the computational costs into account. Bootstrapping is a commonly used tool. Recall that the nonparametric bootstrap draws *B* random samples of sizes \(n_1\) and \(n_2\) from the given data sets and then estimates the variance of a statistic, the GD estimator in our study, by the sample variance of the *B* replicates of that statistic. We consider the balanced design where \(N = n_1 = n_2\). In this case, the computational costs of the bootstrap, measured as the number of times the statistic has to be evaluated, are equal to the costs of the jackknife if \(B = N\). In our study, we investigate the cases \(N = 25, 50, 75\) and \(B = 100, 200, \dots , 1000\), such that the computational costs of the bootstrap relative to those of the jackknife range from a factor of 4/3 to 40.

The simulations consider normally distributed data (model 1), the *t*(5)-distribution (model 2) as a distribution with fat tails, the \(U(-5,5)\)-distribution (model 3) as a short-tailed law, and \(\Gamma (a,b)\)-distributed samples with mean *ab* and variance \(a b^2\) (models 4 and 5), leading to skewed distributions. For model 4, observations distributed as \(\mu + \sigma _i ( \Gamma (a,\sigma _i) - a \sigma _i)\), \(i = 1, 2\), were simulated with \(a = 1.5\). Model 5 uses \(a = 2.5\). The common mean equals \(\mu = 10\). The results are provided in Table 1, which shows the coverage probabilities of the confidence intervals calculated from the corresponding variance estimate for a nominal confidence level of \(95\%\). Each entry is based on \(S = 20,000\) runs.
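A scaled-down version of this experiment for model 1 can be sketched as follows. The GD estimator, a Tukey-type jackknife standard error, the chosen variances, and the much smaller number of runs than the \(S = 20{,}000\) used for Table 1 are all our assumptions for illustration, so the estimated coverage is only a rough check.

```python
import numpy as np

rng = np.random.default_rng(1)

def gd_estimate(x1, x2):
    # Graybill-Deal estimator: inverse-variance weighted sample means
    n1, n2 = len(x1), len(x2)
    s1, s2 = x1.var(ddof=1), x2.var(ddof=1)
    w1 = (n1 / s1) / (n1 / s1 + n2 / s2)
    return w1 * x1.mean() + (1 - w1) * x2.mean()

def jackknife_se(x1, x2):
    # leave-one-out over both samples, Tukey-type scaling (an assumption)
    n = len(x1) + len(x2)
    reps = [gd_estimate(np.delete(x1, i), x2) for i in range(len(x1))]
    reps += [gd_estimate(x1, np.delete(x2, j)) for j in range(len(x2))]
    reps = np.asarray(reps)
    return np.sqrt((n - 1) / n * np.sum((reps - reps.mean()) ** 2))

# Model 1: normal samples with common mean mu = 10, unequal variances
mu, N, S = 10.0, 50, 1000
hits = 0
for _ in range(S):
    x1 = rng.normal(mu, 1.0, N)
    x2 = rng.normal(mu, 2.0, N)
    hits += abs(gd_estimate(x1, x2) - mu) <= 1.96 * jackknife_se(x1, x2)
coverage = hits / S  # should be close to the nominal level 0.95
```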

Although the accuracy of the bootstrap intervals improves somewhat with *B*, they remain inferior even for \(B = 1000\). The results, therefore, demonstrate that jackknifing is a highly efficient tool when it comes to calculating confidence intervals on a large scale where computational costs matter, as is the case when analyzing big data.

Table 1. Accuracy of the jackknife and the bootstrap variance estimators (for *B* bootstrap replications) in terms of the coverage probability for the confidence level 0.95

Model | N | Jack | *B* = 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 | 1000 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 25 | 0.946 | 0.935 | 0.938 | 0.932 | 0.937 | 0.933 | 0.938 | 0.936 | 0.938 | 0.940 | 0.937 |
1 | 50 | 0.951 | 0.941 | 0.942 | 0.943 | 0.944 | 0.943 | 0.946 | 0.946 | 0.945 | 0.944 | 0.941 |
1 | 75 | 0.951 | 0.942 | 0.943 | 0.945 | 0.947 | 0.945 | 0.946 | 0.945 | 0.946 | 0.945 | 0.944 |
2 | 25 | 0.949 | 0.934 | 0.936 | 0.934 | 0.935 | 0.937 | 0.938 | 0.937 | 0.935 | 0.939 | 0.935 |
2 | 50 | 0.949 | 0.941 | 0.942 | 0.941 | 0.940 | 0.943 | 0.943 | 0.942 | 0.944 | 0.942 | 0.941 |
2 | 75 | 0.949 | 0.941 | 0.945 | 0.943 | 0.946 | 0.943 | 0.944 | 0.946 | 0.944 | 0.943 | 0.946 |
3 | 25 | 0.946 | 0.940 | 0.940 | 0.940 | 0.941 | 0.937 | 0.937 | 0.940 | 0.941 | 0.943 | 0.942 |
3 | 50 | 0.948 | 0.941 | 0.944 | 0.946 | 0.947 | 0.944 | 0.945 | 0.943 | 0.946 | 0.945 | 0.944 |
3 | 75 | 0.948 | 0.946 | 0.947 | 0.944 | 0.947 | 0.948 | 0.949 | 0.950 | 0.949 | 0.948 | 0.946 |
4 | 25 | 0.921 | 0.908 | 0.915 | 0.912 | 0.911 | 0.910 | 0.913 | 0.916 | 0.912 | 0.913 | 0.912 |
4 | 50 | 0.933 | 0.928 | 0.926 | 0.930 | 0.924 | 0.931 | 0.931 | 0.931 | 0.926 | 0.928 | 0.928 |
4 | 75 | 0.940 | 0.932 | 0.934 | 0.933 | 0.937 | 0.937 | 0.935 | 0.937 | 0.933 | 0.934 | 0.934 |
5 | 25 | 0.930 | 0.918 | 0.919 | 0.920 | 0.918 | 0.919 | 0.917 | 0.922 | 0.923 | 0.920 | 0.920 |
5 | 50 | 0.938 | 0.932 | 0.929 | 0.933 | 0.936 | 0.931 | 0.934 | 0.935 | 0.932 | 0.932 | 0.935 |
5 | 75 | 0.942 | 0.935 | 0.936 | 0.938 | 0.940 | 0.938 | 0.939 | 0.936 | 0.937 | 0.939 | 0.939 |
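The intervals above pair a point estimate with a variance estimate and a normal quantile. As a minimal sketch (not the paper's implementation), the following combines the Graybill–Deal estimator with a delete-one two-sample jackknife for one simulated data set from model 1; the per-sample scaling \((n_i - 1)/n_i\) is one common form of the jackknife for unequal sample sizes and is an assumption here:

```python
import numpy as np

def graybill_deal(x, y):
    """Graybill-Deal estimator: sample means weighted by n_i / s_i^2."""
    w1 = len(x) / x.var(ddof=1)
    w2 = len(y) / y.var(ddof=1)
    return (w1 * x.mean() + w2 * y.mean()) / (w1 + w2)

def jackknife_var(stat, x, y):
    """Delete-one jackknife variance for a two-sample statistic.

    Leave-one-out replicates are formed within each sample; the per-sample
    sums of squares are scaled by (n_i - 1) / n_i and added.
    """
    rep1 = np.array([stat(np.delete(x, i), y) for i in range(len(x))])
    rep2 = np.array([stat(x, np.delete(y, j)) for j in range(len(y))])
    v1 = (len(x) - 1) / len(x) * np.sum((rep1 - rep1.mean()) ** 2)
    v2 = (len(y) - 1) / len(y) * np.sum((rep2 - rep2.mean()) ** 2)
    return v1 + v2

# model 1: two normal samples with a common mean and different variances
rng = np.random.default_rng(1)
x = rng.normal(10, 1.0, size=25)
y = rng.normal(10, 2.0, size=25)

est = graybill_deal(x, y)
se = np.sqrt(jackknife_var(graybill_deal, x, y))
ci = (est - 1.96 * se, est + 1.96 * se)  # nominal 95% interval
```

The simulation study repeats exactly this construction over many runs and reports the fraction of intervals covering \(\mu = 10\).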

### 5.2 Physics: acceleration due to gravity

In physics, the common mean model frequently applies when observing or measuring a time-invariant physical phenomenon. As an example, we analyze the Heyl and Cook measurements of the acceleration due to gravity, Heyl and Cook (1936). Two of those data sets, taken from Cressie (1997), are given by \(x_1 = (78, 78, 78, 86, 87, 81, 73, 67, 75, 82, 83)'\) and \(x_2 = (84, 86, 85, 82, 77, 76, 80, 83, 81, 78, 78, 78)'\).

The *F* test for homogeneity of variances accepts the null hypothesis of equal variances. The common mean assumption is clearly satisfied, since the same invariant physical phenomenon is measured, and systematic errors can be excluded due to the great care taken in the experiments. The Graybill–Deal estimator for these data is \({\widehat{ \mu }}^{(GD)} = 80.26123\). To estimate its variance and construct confidence intervals that remain valid under non-normal underlying distributions, we used the asymptotic approach based on the CLT, i.e., (31), and the jackknife for unequal sample sizes. Table 2 provides those estimates and the resulting \(95\%\) confidence intervals for the common mean.

Table 2. Common mean estimation using the GD estimator: standard errors and confidence intervals using asymptotics and the jackknife, respectively

Method | Est. sd of \({\widehat{ \mu }}^{(GD)}\) | Confidence interval | Width |
---|---|---|---|
Asymptotics | 0.8455307 | [78.60399, 81.91847] | 3.31448 |
Jackknife | 0.8492987 | [78.5966, 81.92585] | 3.329251 |

Table 3. Common mean estimation using Nair’s estimator: standard errors and confidence intervals using asymptotics and the jackknife, respectively

Method | Est. sd of \({\widehat{ \mu }}^{(N)}\) | Confidence interval | Width |
---|---|---|---|
Asymptotics | 1.279913 | [77.31746, 82.33472] | 5.017259 |
Jackknife | 0.9752919 | [77.91451, 81.73765] | 3.823144 |

Since the variance estimates \(s_1^2 = 34.09091\) and \(s_2^2 = 11.15152\) are not in the assumed order, Nair’s estimator uses the deterministic weights \(n_1/n = 11/23\) and \(n_2/n = 12/23\), which differ substantially from the stochastic weights used by the GD estimator. The jackknife variance estimate provides a tighter confidence interval than the asymptotic approach in this case.
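The point estimates quoted above can be reproduced directly from the two samples; a short sketch, assuming the GD weights \(n_i/s_i^2\) and the sample-size fallback weights \(n_i/n\) for Nair's estimator as described in the text:

```python
import numpy as np

# Heyl-Cook gravity data (Cressie 1997)
x1 = np.array([78, 78, 78, 86, 87, 81, 73, 67, 75, 82, 83], dtype=float)
x2 = np.array([84, 86, 85, 82, 77, 76, 80, 83, 81, 78, 78, 78], dtype=float)
n1, n2 = len(x1), len(x2)
s1, s2 = x1.var(ddof=1), x2.var(ddof=1)  # 34.09091 and 11.15152

# Graybill-Deal: stochastic weights proportional to n_i / s_i^2
w1, w2 = n1 / s1, n2 / s2
gd = (w1 * x1.mean() + w2 * x2.mean()) / (w1 + w2)  # 80.26123

# Nair: deterministic weights n_i / n, used here because s1 > s2
# violates the assumed order
nair = (n1 * x1.mean() + n2 * x2.mean()) / (n1 + n2)
```

The GD value matches \({\widehat{ \mu }}^{(GD)} = 80.26123\) reported above; the Nair value is the sample-size-weighted pooled mean.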

### 5.3 Technology: chip production data

*p* values are \(< 10^{-3}\), and some outliers are present, which are, however, not treated in the present analysis. The sample means are \(\overline{x} = 6.293254\) and \(\overline{y} = 6.292667\) with estimated standard deviations \(s_x = 0.003785844\) and \(s_y = 0.004962341\). Table 4 provides the results when assuming \(\sigma _1^2 < \sigma _2^2\).

Table 4. Nair’s estimator for the chip data: standard errors and confidence intervals using asymptotics and the jackknife, respectively

Method | Est. sd of \({\widehat{ \mu }}^{(N)}\) | Confidence interval | Width |
---|---|---|---|
Asymptotics | 0.0001942891 | [6.292657, 6.293419] | 0.0007616134 |
Jackknife | 0.0001941049 | [6.292658, 6.293418] | 0.000760891 |

Table 5. Results for Nair’s estimator for the chip data when switching the samples

Method | Est. sd of \({\widehat{ \mu }}^{(N)}\) | Confidence interval | Width |
---|---|---|---|
Asymptotics | 0.0002921816 | [6.292388, 6.293533] | 0.001145352 |
Jackknife | 0.0001984456 | [6.292571, 6.293349] | 0.0007779067 |

### 5.4 Social information: Japanese child data

In Japan, the strength of 8-year-old boys and girls was investigated in the six prefectures (Aomori, Iwate, Miyagi, Akita, Yamagata, and Fukushima) of the Touhoku region on Honshu, the largest island of Japan, and in Hokkaido, the prefecture located in the north of Japan. The observations for the boys are 52.55, 54.08, 54.25, 52.92, 56.31, 53.63, 52.52 and those for the girls 52.95, 55.72, 56.14, 54.24, 58.19, 55.32, 54.45. For these data sets, the Graybill–Deal estimator is given by 54.34878.
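As a quick check, the quoted Graybill–Deal value can be reproduced from the listed observations, again assuming the weights \(n_i/s_i^2\) with the unbiased variance estimates:

```python
import numpy as np

# Japanese child strength data: one observation per prefecture
boys = np.array([52.55, 54.08, 54.25, 52.92, 56.31, 53.63, 52.52])
girls = np.array([52.95, 55.72, 56.14, 54.24, 58.19, 55.32, 54.45])

# GD weights n_i / s_i^2, weighting each sample mean by its precision
w_b = len(boys) / boys.var(ddof=1)
w_g = len(girls) / girls.var(ddof=1)
gd = (w_b * boys.mean() + w_g * girls.mean()) / (w_b + w_g)  # 54.34878
```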

Table 6. Results for the GD common mean estimator for the Japanese child data: standard errors and confidence intervals using asymptotics and the jackknife

Method | Est. sd of \({\widehat{ \mu }}^{(GD)}\) | Confidence interval | Width |
---|---|---|---|
Asymptotics | 0.3921168 | [53.58023, 55.11733] | 1.537098 |
Jackknife | 0.6874476 | [53.00139, 55.69618] | 2.694794 |

Table 7. The Nair estimator for the Japanese child data: standard errors and confidence intervals using asymptotics and the jackknife

Method | Est. sd of \({\widehat{ \mu }}^{(N)}\) | Confidence interval | Width |
---|---|---|---|
Asymptotics | 0.5919936 | [53.35898, 55.67960] | 2.320615 |
Jackknife | 0.5593932 | [53.42288, 55.61570] | 2.192821 |

## Notes

### Acknowledgements

This work was supported by JSPS Kakenhi grants #JP26330047 and #JP8K11196. Parts of this paper were written during research visits of the first author at Mejiro University, Tokyo; he thanks the university for its warm hospitality. Both authors thank Hideo Suzuki, Keio University at Yokohama, for invitations to his research seminar, and Nobuo Shinozaki, Takahisa Iida, Shun Matsuura, and the participants for comments and discussion. The authors gratefully acknowledge the support of Prof. Takenori Takahashi, Mejiro University and Keio University Graduate School, and Akira Ogawa, Mejiro University, for providing and discussing the chip manufacturing data. They would also like to thank the anonymous referees for their helpful comments.

## References

- Bhattacharya, C. (1980). Estimation of a common mean and recovery of interblock information. *The Annals of Statistics*, *8*, 205–211.
- Brown, L. D., & Cohen, A. (1974). Point and confidence interval estimation of a common mean and recovery of interblock information. *The Annals of Statistics*, *2*, 963–976.
- Cemer, I. (2011). Noise measurement. Sensors online. https://www.sensorsmag.com/embedded/noise-measurement.
- Chang, Y.-T., Oono, Y., & Shinozaki, N. (2012). Improved estimators for the common mean and ordered means of two normal distributions with ordered variances. *Journal of Statistical Planning and Inference*, *142*(9), 2619–2628.
- Chang, Y.-T., & Shinozaki, N. (2008). Estimation of linear functions of ordered scale parameters of two gamma distributions under entropy loss. *Journal of the Japan Statistical Society*, *38*(2), 335–347.
- Chang, Y.-T., & Shinozaki, N. (2015). Estimation of two ordered normal means under modified Pitman nearness criterion. *Annals of the Institute of Statistical Mathematics*, *67*, 863–883. https://doi.org/10.1007/s10463-014-0479-4.
- Cochran, W. (1937). Problems arising in the analysis of a series of similar experiments. *JASA*, *4*, 172–175.
- Cressie, N. (1997). Jackknifing in the presence of inhomogeneity. *Technometrics*, *39*(1), 45–51.
- Degerli, Y. (2000). Analysis and reduction of signal readout circuitry temporal noise in CMOS image sensors for low light levels. *IEEE Transactions on Electron Devices*, *47*(5), 949–962.
- Efron, B. (1982). *The jackknife, the bootstrap and other resampling plans* (Vol. 38). CBMS-NSF regional conference series in applied mathematics. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM).
- Efron, B., & Hastie, T. (2016). *Computer age statistical inference: Algorithms, evidence, and data science* (Vol. 5). Institute of Mathematical Statistics (IMS) monographs. New York: Cambridge University Press.
- Efron, B., & Stein, C. (1981). The jackknife estimate of variance. *The Annals of Statistics*, *9*(3), 586–596.
- Efron, B., & Tibshirani, R. J. (1993). *An introduction to the bootstrap* (Vol. 57). Monographs on statistics and applied probability. New York: Chapman and Hall.
- Elfessi, A., & Pal, N. (1992). A note on the common mean of two normal populations with order restrictions in location-scale families. *Communications in Statistics—Theory and Methods*, *21*(11), 3177–3184.
- Fisher, R. (1932). *Statistical methods for research workers* (4th ed.). London: Oliver and Boyd.
- Heyl, P., & Cook, G. (1936). The value of gravity in Washington. *Journal of Research of the U.S. Bureau of Standards*, *17*, 805–839.
- Keller, T., & Olkin, I. (2004). Combining correlated unbiased estimators of the mean of a normal distribution. *A Festschrift for Herman Rubin*, *45*, 218–227.
- Kubokawa, T. (1989). Closer estimation of a common mean in the sense of Pitman. *Annals of the Institute of Statistical Mathematics*, *41*(3), 477–484.
- Lee, Y. (1991). Jackknife variance estimators in the one-way random effects model. *Annals of the Institute of Statistical Mathematics*, *43*(4), 707–714.
- Lin, D. (2010). Quantified temperature effect in a CMOS image sensor. *IEEE Transactions on Electron Devices*, *57*(2), 422–428.
- Mehta, J., & Gurland, J. (1969). Combinations of unbiased estimators of the mean which consider inequality of unknown variances. *JASA*, *64*(327), 1042–1055.
- Nair, K. (1980). Variance and distribution of the Graybill–Deal estimator of the common mean of two normal populations. *The Annals of Statistics*, *8*(1), 212–216.
- Nair, K. (1982). An estimator of the common mean of two normal populations. *Journal of Statistical Planning and Inference*, *6*, 119–122.
- Pitman, E. (1937). The closest estimates of statistical parameters. *Proceedings of the Cambridge Philosophical Society*, *33*, 212–222.
- Quenouille, M. (1949). Approximate tests of correlation in time series. *Mathematical Proceedings of the Cambridge Philosophical Society*, *11*, 68–84.
- Shao, J. (1993). Differentiability of statistical functionals and consistency of the jackknife. *The Annals of Statistics*, *21*(1), 61–75.
- Shao, J., & Wu, C. F. (1989). A general theory for jackknife variance estimation. *The Annals of Statistics*, *17*, 1176–1197.
- Sinha, B. (1985). Unbiased estimation of the variance of the GD estimator of the common mean of several normal populations. *Canadian Journal of Statistics*, *13*(3), 243–247.
- Steland, A. (2015). Vertically weighted averages in Hilbert spaces and applications to imaging: Fixed sample asymptotics and efficient sequential two-stage estimation. *Sequential Analysis*, *34*(3), 295–323.
- Steland, A. (2017). Fusing photovoltaic data for improved confidence intervals. *AIMS Energy*, *5*, 113–136.
- Tippett, L. (1931). *The method of statistics*. London: Williams and Norgate.
- Tukey, J. W. (1958). Bias and confidence in not quite large samples (abstract). *Annals of Mathematical Statistics*, *29*, 614.
- van Eeden, C. (2006). *Restricted parameter space estimation problems*. Lecture notes in statistics. Berlin: Springer.
- Voinov, V. (1984). Variance and its unbiased estimator for the common mean of several normal populations. *Sankhya: The Indian Journal of Statistics, Series B*, *46*, 291–300.