1 Introduction

Synthesis of diagnostic test accuracy studies is the most common medical application of multivariate meta-analysis; we refer the interested reader to the surveys by Jackson et al. (2011), Mavridis and Salanti (2013), and Ma et al. (2016). These data have two important properties. The first is that the estimated sensitivities and specificities are typically negatively associated across studies, because studies that adopt a less stringent criterion for declaring a test positive yield higher sensitivities and lower specificities (Jackson et al. 2011). The second is the substantial between-study heterogeneity in sensitivities and specificities (Chu et al. 2012).

Nikoloulopoulos (2015a), to deal with the aforementioned properties, proposed a copula mixed model for bivariate meta-analysis of diagnostic test accuracy studies and made the argument for moving to copula random effects models. This general model includes the generalized linear mixed model (Chu and Cole 2006; Arends et al. 2008) as a special case and can also operate on the original scale of sensitivity and specificity.

Chen et al. (2016b, 2017) proposed a composite likelihood (CL) method for estimation of the Sarmanov beta-binomial model (Chu et al. 2012) and the generalized linear mixed model (hereafter GLMM), respectively. Note in passing that both models are special cases of a copula mixed model (Nikoloulopoulos 2015a). The composite likelihood can be derived conveniently under the assumption of independence between the random effects. The CL method has been recommended by Chen et al. (2016b, 2017) to overcome practical ‘issues’ in the joint likelihood inference such as computational difficulty caused by a double integral in the joint likelihood function, and restriction to bivariate normality.

However,

  (a) The CL method is well established as a surrogate alternative to maximum likelihood when the joint likelihood is too difficult to compute (Varin et al. 2011), which is apparently not the case in the synthesis of diagnostic test accuracy studies. The general model in Nikoloulopoulos (2015a) includes the GLMM as a special case, and its numerical evaluation has been implemented in the package CopulaREMADA (Nikoloulopoulos 2016) within the open source statistical environment R (R Core Team 2015). Chen et al. (2016b, 2017) restrict themselves to SAS PROC NLMIXED, which is a general routine for random effects models and thus offers limited capacity.

  (b) The random effects distribution of a copula mixed model can be expressed via copulas other than the bivariate normal, which allow for flexible dependence modelling rather than assuming simple linear correlation structures, normality and tail independence.

The contribution of this paper is to examine the merit of the CL method in the context of diagnostic test accuracy studies and compare it to the ML method in Nikoloulopoulos (2015a). The remainder of the paper proceeds as follows. Section 2 summarizes the copula mixed model for diagnostic test accuracy studies. Section 3 discusses both maximum and composite likelihood for estimation of the model parameters. Section 4 contains small-sample efficiency calculations to compare the two methods. Section 5 presents applications of the likelihood estimation methods to several data frames with diagnostic studies. We conclude with some discussion in Sect. 6.

2 The copula mixed model

We first introduce the notation used in this paper. The focus is on two-level (within-study and between-studies) cluster data. The data are \((y_{ij}, n_{ij}),\, i = 1, \ldots ,N,\, j=1,2\), where j is an index for the within-study measurements and i is an index for the individual studies. The data, for study i, can be summarized in a \(2\times 2\) table with the number of true positives (\(y_{i1}\)), true negatives (\(y_{i2}\)), false negatives (\(n_{i1}-y_{i1}\)), and false positives (\(n_{i2}-y_{i2}\)).

The within-study model assumes that the number of true positives \(Y_{i1}\) and true negatives \(Y_{i2}\) are conditionally independent and binomially distributed given \(\mathbf {X}=\mathbf {x}\), where \(\mathbf {X}=(X_1,X_2)\) denotes the bivariate latent pair of (transformed) sensitivity and specificity. That is

$$\begin{aligned} Y_{i1}|X_{1}=x_1\sim & {} \text{ Binomial }\Bigl (n_{i1},l^{-1}(x_1)\Bigr );\nonumber \\ Y_{i2}|X_{2}=x_2\sim & {} \text{ Binomial }\Bigl (n_{i2},l^{-1}(x_2)\Bigr ), \end{aligned}$$
(1)

where \(l(\cdot )\) is a link function.

The stochastic representation of the between-studies model takes the form

$$\begin{aligned} \Bigl (F\bigl (X_1;l(\pi _1),\delta _1\bigr ),F\bigl (X_2;l(\pi _2),\delta _2\bigr )\Bigr ) \sim C(\cdot ;\theta ), \end{aligned}$$
(2)

where \(C(\cdot ;\theta )\) is a parametric family of copulas with dependence parameter \(\theta \) and \(F(\cdot ;l(\pi ),\delta )\) is the cdf of the univariate distribution of the random effect. The copula parameter \(\theta \) is a parameter of the random effects model, and it is separated from the univariate parameters. The univariate parameters \(\pi _1\) and \(\pi _2\) are the meta-analytic parameters for the sensitivity and specificity, and \(\delta _1\) and \(\delta _2\) express the between-study variabilities. The models in (1) and (2) together specify a copula mixed model (Nikoloulopoulos 2015a) with joint likelihood

$$\begin{aligned}&L(\pi _1,\pi _2,\delta _1,\delta _2,\theta )\nonumber \\&\quad =\prod _{i=1}^N\int _{0}^{1}\int _{0}^{1} \prod _{j=1}^2g\Bigl (y_{ij};n_{ij},l^{-1}\bigl (F^{-1}(u_j;l(\pi _j), \delta _j)\bigr )\Bigr )c(u_1,u_2;\theta )du_1du_2,\qquad \quad \end{aligned}$$
(3)

where \(c(u_1,u_2;\theta )=\partial ^2 C(u_1,u_2;\theta )/\partial u_1\partial u_2\) is the copula density and \(g\bigl (y;n,\pi \bigr )=\left( {\begin{array}{c}n\\ y\end{array}}\right) \pi ^y(1-\pi )^{n-y},\quad y=0,1,\ldots ,n,\quad 0<\pi <1,\) is the binomial probability mass function (pmf). The choices of the \(F\bigl (\cdot ;l(\pi ),\delta \bigr )\) and l are given in Table 1.

Table 1 The choices of the \(F\bigl (\cdot ;l(\pi ),\delta \bigr )\) and l in the copula mixed model

3 Estimation methods

3.1 Maximum likelihood method

Estimation of the model parameters \((\pi _1,\pi _2,\delta _1,\delta _2,\theta )\) can be approached by the standard maximum likelihood (ML) method, by maximizing the logarithm of the joint likelihood in (3). For mixed models with a joint likelihood of the form (3), numerical evaluation of the joint pmf is easily done with the following steps (Nikoloulopoulos 2015a):

  1. Calculate Gauss–Legendre quadrature points \(\{u_q: q=1,\ldots ,n_q\}\) and weights \(\{w_q: q=1,\ldots ,n_q\}\) in terms of standard uniform distribution (Stroud and Secrest 1966). Our comparisons with more quadrature points show that \(n_q=15\) is adequate with good precision to at least four decimal places (Nikoloulopoulos 2015a, Appendix).

  2. Convert from independent uniform random variables \(\{u_{q_1}: q_1=1,\ldots ,n_q\}\) and \(\{u_{q_2}: q_2=1,\ldots ,n_q\}\) to dependent uniform random variables \(\{u_{q_1}: q_1=1,\ldots ,n_q\}\) and \(\{C^{-1}(u_{q_2}|u_{q_1};\theta ): q_1=q_2=1,\ldots ,n_q\}\) that have distribution \(C(\cdot ;\theta )\). The inverse of the conditional distribution \(C(v|u;\theta )=\partial C(u,v;\theta )/\partial u\) corresponding to the copula \(C(\cdot ;\theta )\) is used to achieve this.

  3. Numerically evaluate the joint pmf

    $$\begin{aligned} \int _{0}^{1}\int _{0}^{1} \prod _{j=1}^2g\Bigl (y_{ij};n_{ij},l^{-1}\bigl (F^{-1}(u_j;l(\pi _j),\delta _j)\bigr )\Bigr )c(u_1,u_2;\theta )du_1du_2 \end{aligned}$$

in a double sum:

$$\begin{aligned}&\sum _{q_1=1}^{n_q}\sum _{q_2=1}^{n_q}w_{q_1}w_{q_2} g\Bigl (y_{i1};n_{i1},l^{-1}\bigl (F^{-1}(u_{q_1};l(\pi _1),\delta _1)\bigr )\Bigr )\nonumber \\&\quad \times \, g\Bigl (y_{i2};n_{i2},l^{-1}\bigl (F^{-1}(C^{-1}(u_{q_2}|u_{q_1};\theta );l(\pi _2),\delta _2)\bigr )\Bigr ). \end{aligned}$$

The inverse conditional copula cdfs \(C^{-1}(v|u;\theta )\) are given in Table 2 for the sufficient list of parametric families of copulas for meta-analysis of diagnostic test accuracy studies in Nikoloulopoulos (2015a, b). Since the copula parameter \(\theta \) of each family has a different range, in the sequel we reparametrize the families via their Kendall’s \(\tau \), which is comparable across families.
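As an illustration only, the three steps above can be sketched in Python for one particular case: the BVN copula with normal margins and logit link. This is not the CopulaREMADA implementation; all function names are hypothetical, the quadrature nodes are obtained by a plain Newton iteration on the Legendre polynomial, and the expression used for \(C^{-1}(v|u;\theta )\) is the standard closed form for the BVN copula.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def legendre(n, x):
    """Return P_n(x) and its derivative using the three-term recurrence."""
    p_prev, p = 1.0, x
    for j in range(2, n + 1):
        p_prev, p = p, ((2 * j - 1) * x * p - (j - 1) * p_prev) / j
    return p, n * (x * p - p_prev) / (x * x - 1.0)

def gauss_legendre_01(n_q):
    """Step 1: Gauss-Legendre nodes and weights rescaled from (-1, 1) to (0, 1)."""
    nodes, weights = [], []
    for k in range(1, n_q + 1):
        x = math.cos(math.pi * (k - 0.25) / (n_q + 0.5))  # starting value for Newton
        for _ in range(100):
            p, dp = legendre(n_q, x)
            step = p / dp
            x -= step
            if abs(step) < 1e-14:
                break
        _, dp = legendre(n_q, x)
        nodes.append(0.5 * (x + 1.0))
        # 2 / ((1 - x^2) P_n'(x)^2), halved for the rescaling to (0, 1)
        weights.append(1.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def binom_pmf(y, n, p):
    return math.comb(n, y) * p**y * (1.0 - p) ** (n - y)

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def bvn_cond_inverse(v, u, theta):
    """Step 2: C^{-1}(v | u; theta) for the BVN copula."""
    z = (math.sqrt(1.0 - theta**2) * STD_NORMAL.inv_cdf(v)
         + theta * STD_NORMAL.inv_cdf(u))
    return STD_NORMAL.cdf(z)

def joint_pmf(y1, n1, y2, n2, pi1, pi2, sigma1, sigma2, theta, n_q=15):
    """Step 3: double-sum approximation of the joint pmf for one study."""
    u, w = gauss_legendre_01(n_q)
    total = 0.0
    for u1, w1 in zip(u, w):
        x1 = logit(pi1) + sigma1 * STD_NORMAL.inv_cdf(u1)   # F^{-1}(u1; l(pi1), sigma1)
        g1 = binom_pmf(y1, n1, expit(x1))
        for u2, w2 in zip(u, w):
            v = bvn_cond_inverse(u2, u1, theta)              # dependent uniform
            x2 = logit(pi2) + sigma2 * STD_NORMAL.inv_cdf(v)
            total += w1 * w2 * g1 * binom_pmf(y2, n2, expit(x2))
    return total
```

A simple sanity check of such a sketch is that the approximated joint pmf, summed over all \((y_1,y_2)\), equals one up to rounding error, because the quadrature weights sum to one and each binomial pmf sums to one at every node.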

Table 2 Parametric families of bivariate copulas and their Kendall’s \(\tau \) as a strictly increasing function of the copula parameter \(\theta \)

3.2 Composite likelihood method

The composite likelihood method assumes independence between the random effects. Hence, it is identical for any copula mixed model, since all the parametric families of copulas in Table 2 contain the independence copula as a special case. This subsection summarizes the composite likelihood estimating equations and the asymptotic covariance matrix for the estimator that solves them in the context of diagnostic test accuracy studies.

3.2.1 Composite likelihood estimator

Chen et al. (2016b, 2017) proposed the composite likelihood method for estimation of the copula mixed model with beta and normal margins, respectively. Composite likelihood is a surrogate likelihood which leads asymptotically to unbiased estimating equations obtained by the derivatives of the composite log-likelihoods. Estimation of the model parameters can be approached by solving the marginal estimating equations or equivalently by maximizing the sum of composite (univariate) likelihoods.

By using composite likelihood the authors are assuming between-study independence in sensitivities and specificities and thus the joint likelihood in (3) reduces to:

$$\begin{aligned} L(\pi _1,\pi _2,\delta _1,\delta _2)= & {} \prod _{i=1}^N\int _{0}^{1}\int _{0}^{1} \prod _{j=1}^2g\Bigl (y_{ij};n_{ij},l^{-1}\bigl (F^{-1}(u_j;l(\pi _j),\delta _j)\bigr )\Bigr )du_1du_2\nonumber \\= & {} L_1(\pi _1,\delta _1)L_2(\pi _2,\delta _2), \end{aligned}$$
(4)

where \(L_j(\pi _j,\delta _j)=\prod _{i=1}^N\int _{0}^{1}g\Bigl (y_{ij};n_{ij},l^{-1}\bigl (F^{-1}(u_j;l(\pi _j),\delta _j)\bigr )\Bigr )du_j\), since under the independence assumption the copula density \(c(\cdot )\) is equal to 1. Note that the joint likelihood reduces to the product of two univariate likelihoods and the evaluation of univariate integrals; thus, the computational effort (if any) is reduced. In particular, for beta margins the univariate likelihoods \(L_j,\,j=1,2\) have a closed form since

$$\begin{aligned} \int _{0}^{1}g\Bigl (y;n,F^{-1}(u;\pi ,\gamma )\Bigr )du= \int _{0}^{1}g(y;n,x)\,dF(x;\pi ,\gamma )=h(y;n,\pi ,\gamma ), \end{aligned}$$

where

$$\begin{aligned} h(y;n,\pi ,\gamma )= & {} \left( {\begin{array}{c}n\\ y\end{array}}\right) \frac{B\Bigl (y+\pi /\gamma -\pi ,n-y+(1 - \pi )(1 - \gamma )/\gamma \Bigr )}{B\Bigl (\pi /\gamma -\pi ,(1 - \pi )(1 - \gamma )/\gamma \Bigr )},\\&y=0,1,\ldots ,n,\, 0<\pi ,\gamma < 1, \end{aligned}$$

is the pmf of a Beta-Binomial(\(n,\pi ,\gamma \)) distribution.
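A minimal Python sketch of this closed form is given below, written on the log scale for numerical stability. The function names are hypothetical and the code is meant only to illustrate the \((\pi ,\gamma )\) parametrization; it is not the implementation in xmeta.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(y, n, pi, gamma):
    """h(y; n, pi, gamma) in the (pi, gamma) parametrization of the text."""
    a = pi / gamma - pi                      # equals pi * (1 - gamma) / gamma
    b = (1.0 - pi) * (1.0 - gamma) / gamma
    log_h = (math.log(math.comb(n, y))
             + log_beta(y + a, n - y + b) - log_beta(a, b))
    return math.exp(log_h)

def cl_loglik_beta(y, n, pi, gamma):
    """log L_j(pi, gamma): sum of beta-binomial log-pmfs over the studies."""
    return sum(math.log(beta_binomial_pmf(yi, ni, pi, gamma))
               for yi, ni in zip(y, n))
```

With this parametrization the mean of the distribution is \(n\pi \), which gives a quick check of the sketch; maximizing `cl_loglik_beta` separately for \(j=1\) and \(j=2\) with any general-purpose optimizer would then give the CL estimates under beta margins.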

Composite likelihood estimates can be obtained by maximizing the logarithm of the joint likelihood in (4) over the univariate parameters. The efficiency of the composite likelihood estimates has been studied in a series of papers (Varin 2008; Varin et al. 2011). However, CL ignores the dependence in the estimation of the univariate marginal parameters; thus, its efficiency is expected to deteriorate as the dependence increases.

3.2.2 Asymptotic covariance matrix–inverse Godambe

Let \(\varvec{\alpha }=(\pi ,\delta )\). The asymptotic covariance matrix for the CL estimator \((\varvec{\alpha }_1,\varvec{\alpha }_2)\), also known as the inverse Godambe information matrix (Godambe 1991), is

$$\begin{aligned} \mathbf {V}=\begin{pmatrix} \mathbf {I}_{11}^{-1} &{}\quad \mathbf {I}_{11}^{-1}\mathbf {I}_{12}\mathbf {I}_{22}^{-1}\\ (\mathbf {I}_{11}^{-1}\mathbf {I}_{12}\mathbf {I}_{22}^{-1})^\top &{} \quad \mathbf {I}_{22}^{-1} \end{pmatrix}, \end{aligned}$$
(5)

where \(\mathbf {I}_{jj}=E\bigl [-\partial ^2 \log L_j(\varvec{\alpha }_j)/\partial \varvec{\alpha }_j^2\bigr ],\,j=1,2\) and \(\mathbf {I}_{12}=E\Bigl [\frac{\partial \log L_1(\varvec{\alpha }_1)}{\partial \varvec{\alpha }_1}\frac{\partial \log L_2(\varvec{\alpha }_2)}{\partial \varvec{\alpha }_2}^\top \Bigr ].\) For more information, including the observed inverse Godambe information matrix, we refer the reader to Chen et al. (2016b, 2017).
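The form of (5) can be read off from the usual sandwich expression for composite likelihood estimators. The following is a sketch only: the symbols \(\mathbf {H}\) (expected negative Hessian of the composite log-likelihood) and \(\mathbf {J}\) (covariance matrix of the composite score) are introduced here for illustration, and the univariate margins are assumed to be correctly specified, so that the variance of each marginal score equals \(\mathbf {I}_{jj}\). Since the composite log-likelihood is \(\log L_1(\varvec{\alpha }_1)+\log L_2(\varvec{\alpha }_2)\), the matrix \(\mathbf {H}\) is block diagonal, and

$$\begin{aligned} \mathbf {V}=\mathbf {H}^{-1}\mathbf {J}\mathbf {H}^{-1},\quad \mathbf {H}=\begin{pmatrix} \mathbf {I}_{11} &{} \mathbf {0}\\ \mathbf {0} &{} \mathbf {I}_{22} \end{pmatrix},\quad \mathbf {J}=\begin{pmatrix} \mathbf {I}_{11} &{} \mathbf {I}_{12}\\ \mathbf {I}_{12}^\top &{} \mathbf {I}_{22} \end{pmatrix}. \end{aligned}$$

Multiplying out the blocks gives \(\mathbf {I}_{jj}^{-1}\mathbf {I}_{jj}\mathbf {I}_{jj}^{-1}=\mathbf {I}_{jj}^{-1}\) on the diagonal and \(\mathbf {I}_{11}^{-1}\mathbf {I}_{12}\mathbf {I}_{22}^{-1}\) off the diagonal, which is the matrix in (5).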

4 Small- and moderate-sample efficiency–misspecification of the univariate distribution of the random effect

In this section an extensive simulation study with two different scenarios is conducted (a) to assess the performance of the CL and ML methods, and (b) to investigate in detail the effect of the misspecification of the parametric margin of the random effects distribution. The CL method assumes the independence copula, and its focus is on marginal parameters and apparently not the choice of the copula. Hence, in the simulations we only investigate the effect of the misspecification of the parametric margin of the random effects distribution. We refer the interested reader to Nikoloulopoulos (2015a) for a study on the misspecification of the parametric family of copulas of the random effects distribution.

We use the simulation process in Nikoloulopoulos (2015a) and set the univariate parameters and disease prevalence to mimic the telomerase data in Sect. 5. The details are given below:

  1. Simulate the study size n from a shifted gamma distribution, i.e. \(n\sim \text{ sGamma }(\alpha =1.2,\beta =0.01,\text{ lag }=30)\), and round off to the nearest integer.

  2. Simulate \((u_1,u_2)\) from a parametric family of copulas \(C(\cdot ;\tau )\); \(\tau \) is converted to the dependence parameter \(\theta \) via the relations in Table 2.

  3. Convert to beta or normal realizations via \(x_j=l^{-1}\Bigl (F_j^{-1}\bigl (u_j,l(\pi _j),\delta _j\bigr )\Bigr )\) for \(j=1,2\).

  4. Draw the number of diseased \(n_{1}\) from a B(n, 0.534) distribution.

  5. Set \(n_2=n-n_1\) and generate \(y_j\) from a \(B(n_j,x_j)\) for \(j=1,2\).
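The simulation steps above can be sketched in Python for the BVN copula with normal margins and logit link. This is an illustrative sketch under stated assumptions, not the code used for the reported results: the function name is hypothetical, \(\beta =0.01\) is read as a rate parameter (so the scale is 100), "lag" is read as an additive shift of 30, and \(\tau \) is converted to the BVN parameter via \(\theta =\sin (\pi \tau /2)\).

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def simulate_study(pi_true, sigma_true, tau, rng, prevalence=0.534):
    """Simulate (n1, y1, n2, y2) for one study: BVN copula, normal margins, logit link."""
    # Study size: gamma draw shifted by 30 and rounded.
    # Assumption: beta = 0.01 is a rate, so the scale passed to gammavariate is 1 / 0.01.
    n = round(rng.gammavariate(1.2, 1.0 / 0.01) + 30)
    # Kendall's tau -> BVN copula parameter, then a correlated pair of normal scores.
    theta = math.sin(math.pi * tau / 2.0)
    z1 = rng.gauss(0.0, 1.0)
    z2 = theta * z1 + math.sqrt(1.0 - theta**2) * rng.gauss(0.0, 1.0)
    # Copula sample (u1, u2), clipped away from 0 and 1 so that inv_cdf is defined.
    eps = 1e-12
    u = [min(max(STD_NORMAL.cdf(z), eps), 1.0 - eps) for z in (z1, z2)]
    # Study-specific sensitivity and specificity on the probability scale.
    x = [expit(logit(p) + s * STD_NORMAL.inv_cdf(uj))
         for p, s, uj in zip(pi_true, sigma_true, u)]
    # Diseased / non-diseased split, then binomial counts as sums of Bernoulli draws.
    n1 = sum(rng.random() < prevalence for _ in range(n))
    n2 = n - n1
    y1 = sum(rng.random() < x[0] for _ in range(n1))
    y2 = sum(rng.random() < x[1] for _ in range(n2))
    return n1, y1, n2, y2

rng = random.Random(2018)  # arbitrary seed, for reproducibility only
studies = [simulate_study((0.79, 0.91), (0.43, 1.83), -0.5, rng) for _ in range(10)]
```

Each element of `studies` is one simulated \(2\times 2\) table in the form \((n_1,y_1,n_2,y_2)\); a meta-analysis with \(N=10\) studies corresponds to one such list.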

In the first scenario the simulated data are generated from the BVN copula mixed model with normal margins, logit link (the resulting model is the same as the GLMM) and true marginal parameters \((\pi _1,\pi _2,\sigma _1,\sigma _2)=(0.79,0.91,0.43,1.83)\), while in the second scenario the simulated data are generated from the BVN copula mixed model with beta margins and true marginal parameters \((\pi _1,\pi _2,\gamma _1,\gamma _2)=(0.76,0.81,0.03,0.28)\). The number of studies is set to \(N=10\) and \(N=20\) to represent a relatively small and moderate meta-analysis, and the Kendall’s \(\tau \) association between study-specific sensitivity and specificity is set to \(\tau =-0.5\) and \(\tau =-0.8\) to represent moderate and strong negative dependence.

Table 3 Times of non-convergence out of \(10^4\) simulations for the CL and ML methods under different marginal choices in both simulated scenarios
Table 4 Biases, root mean square errors (RMSE) and standard deviations (SD), along with the square root of the average theoretical variances (\(\sqrt{\bar{V}}\)) for ML and CL estimates under different margins
Table 5 Biases, root mean square errors (RMSE) and standard deviations (SD), along with the square root of the average theoretical variances (\(\sqrt{\bar{V}}\)) for ML and CL estimates under different margins

As stated in Chen et al. (2016b, 2017) one advantage of the CL method is that the problem of non-convergence is avoided, so we also report on the non-convergence of different methods in Table 3. To summarize the simulated data, we report in Tables 4 and 5 the resultant biases, root mean square errors (RMSE), and standard deviations (SD), along with average theoretical variances, for the ML and CL estimates of the univariate parameters under different marginal choices, based on iterations in which all four competing approaches converged. Following Chen et al. (2016b) we also summarize the diagnostic odds ratio, that is \(\hbox {dOR}={\frac{\pi _1}{(1-\pi _1)}}/{\frac{(1-\pi _2)}{\pi _2}}\). Clearly, this is a function of the univariate parameters; its value ranges from zero to infinity, with a higher value indicating better discriminatory power.
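As a small worked example, the dOR can be computed from the two univariate parameters as below. The sketch assumes the usual definition of the diagnostic odds ratio, namely the odds of a positive test among the diseased divided by the odds of a positive test among the non-diseased; the function name is hypothetical.

```python
def diagnostic_odds_ratio(pi1, pi2):
    """dOR from sensitivity pi1 and specificity pi2 (usual definition, assumed here)."""
    odds_positive_diseased = pi1 / (1.0 - pi1)
    odds_positive_healthy = (1.0 - pi2) / pi2
    return odds_positive_diseased / odds_positive_healthy

# True values implied by the marginal parameters of the two simulated scenarios.
dor_scenario_1 = diagnostic_odds_ratio(0.79, 0.91)   # approximately 38.04
dor_scenario_2 = diagnostic_odds_ratio(0.76, 0.81)   # 13.5 up to rounding
```

A test with sensitivity and specificity both equal to 0.5 has dOR equal to 1, i.e. no discriminatory power.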

Conclusions from the values in the tables are as follows:

  • The CL method is nearly as efficient as the ‘gold standard’ ML method.

  • The meta-analytic ML and CL estimates and SDs are not robust to the margin misspecification.

  • The ML method has negligible non-convergence issues.

  • The CL method in Chen et al. (2017) has a non-convergence rate between 10% and 16%. Note in passing that the convergence problem is not because of the method itself, but because of the current implementation in Chen et al. (2016a), which incorporates a general R routine for univariate random effect models.

  • The CL method in Chen et al. (2016b) has no non-convergence issues at all as expected since the \(\log L\) has a closed form.

The simulation results indicate that for both methods the effect of misspecifying the marginal choice is substantial for both the univariate parameters and the parameters that are functions of them, such as the dOR. This is in line with Nikoloulopoulos (2015a, b) for the ML method. Here we also show that the CL method is not robust to the misspecification of the margin. This agrees with the conclusions of Xu and Reid (2011) and Ogden (2016), who argue that if the marginal distribution of the random effects is misspecified, then the CL estimator no longer retains robustness. This has not been revealed before, since Chen et al. (2017) and Chen et al. (2016b) focused solely on a normal and a beta margin, respectively, and did not study the effect of misspecification of the marginal random effect distribution. The focus in the CL method is on marginal parameters and their functions (e.g. dOR). Since these are univariate inferences, all that matters, as regards the bias, is the univariate model.

Table 6 Estimated parameters, standard errors (SE) and log-likelihood values using the ML and CL methods for bladder cancer data

5 Tumour markers for bladder cancer

In this section we illustrate the methods with data from the published meta-analyses in Glas et al. (2003), also analysed in Chen et al. (2016b). These meta-analyses deal with the most common urological cancer, that is bladder cancer. Several diagnostic markers are assessed, including cytology (\(N=26\)), which has been the classical marker for detecting bladder cancer since 1945 and is not expensive compared with the reference standard (that is the cystoscopy procedure), but lacks diagnostic sensitivity. The other markers under investigation to give a better sensitivity are NMP22 (\(N=14\)), BTA (\(N=6\)), BTASTAT (\(N=8\)), telomerase (\(N=10\)), and BTATRAK (\(N=5\)).

For all the meta-analyses, we fit the copula mixed model for all different choices of parametric families of copulas and margins. Sufficient choices of copulas are BVN, Frank, Clayton, and the rotated versions of the latter (Table 2). These families have different strengths of tail behaviour; for more details see Nikoloulopoulos (2015a, b). We use the log-likelihood at estimates as a rough diagnostic measure for goodness of fit between the models and summarize the choice of the copula and margin with the largest log-likelihood, along with the GLMM (BVN copula mixed model with normal margins) as a benchmark. We also estimate the model parameters with the CL method under the assumption of both normal (CL-norm) and beta (CL-beta) margins. In Table 6 we report the resulting maximized ML and CL log-likelihoods, estimates, and standard errors.

5.1 NMP22

The log-likelihoods show that a copula mixed model with rotated by 270\(^\circ \) Clayton copula and beta margins provides the best fit and the estimates of sensitivity \(\pi _1\) and specificity \(\pi _2\) are smaller under this assumption. The CL method performs well since the estimated \(\tau \) is weak and not significantly different from zero.

5.2 BTA

The log-likelihoods show that a copula mixed model with rotated by 180\(^\circ \) Clayton copula and normal margins provides the best fit. Chen et al. (2016b) previously restricted themselves to beta margins; thus, the sensitivity \(\pi _1\) and dOR were overestimated (CL-beta).

5.3 BTASTAT

The log-likelihoods show that a BVN copula mixed model with beta margins provides the best fit and the estimates of specificity \(\pi _2\) and dOR are smaller and larger, respectively, under this assumption.

5.4 Telomerase

Nikoloulopoulos (2015a) has previously analysed these data to illustrate the copula mixed model when there exists negative perfect dependence, and thus there is only one copula: the countermonotonic copula. This is a limiting case for all the parametric families of copulas, when the dependence parameter is fixed to the left boundary of its parameter space. Both models agree on the estimated sensitivity \(\hat{\pi }_1\), but the estimate of specificity \(\hat{\pi }_2\) is larger under the standard GLMM. The log-likelihood is \(-50.37\) for normal margins and \(-51.14\) for beta margins, and thus a normal margin seems to be a better fit for the data. In this example the CL method overestimates the dOR, since it ignores the perfect negative dependence in the estimation of the parameters.

5.5 BTATRAK

The log-likelihoods show that a copula mixed model with rotated by 270\(^\circ \) Clayton copula and beta margins provides the best fit. Note that the CL-norm estimate of the between-study variance \(\sigma _1^2\) was approximately zero; thus, for this case the standard errors are unreliable as the between-study variance parameter estimate is on the boundary of the parameter space.

5.6 Cytology

The log-likelihoods show that a copula mixed model with rotated by 90\(^\circ \) Clayton copula and beta margins provides the best fit. All models agree on the estimated sensitivity \(\hat{\pi }_1\), but the estimated specificity \(\pi _2\) and dOR are smaller when beta margins are assumed. The CL method performs well on the estimation of the univariate parameters and their functions since the estimated \(\tau \) is weak and not significantly different from zero.

6 Discussion

In this paper we have challenged claims made in Chen et al. (2016b, 2017) about the advantages of using a composite likelihood in meta-analysis of diagnostic test accuracy studies, in terms of convergence and robustness to model misspecification. The usual reason for using a composite likelihood does not apply here, because the full likelihood is straightforward to compute. We have demonstrated that the copula mixed model in Nikoloulopoulos (2015a) does not suffer from computational problems or convergence issues. Nikoloulopoulos (2015a) proposed a numerically stable ML estimation technique based on Gauss–Legendre quadrature; the crucial step is to convert from independent to dependent quadrature points. Furthermore, it has been shown that the secondary motivation of robustness of the CL method is not retained in this context if the marginal distributions are misspecified. Hence, it is a digression to use the CL methods in Chen et al. (2016b, 2017) for estimation in meta-analysis of diagnostic test accuracy studies, as apparently there is neither computational difficulty in the calculation of the bivariate log-likelihood nor robustness to the misspecification of the marginal distribution of the random effects. These conclusions hold in any context where clinical trials or observational studies report more than a single binary outcome.

Furthermore, in Chen et al. (2016b, 2017) the main inference is univariate such as the overall sensitivity or specificity or their functions as a single measure of diagnostic accuracy, e.g. the diagnostic odds ratio (dOR). The dOR for many cases is not useful since it cannot distinguish the ability to detect individuals with disease from the ability to identify healthy individuals (Chen et al. 2017). Whenever the balance between false negative and false positive rates is of immediate importance, both the prevalence and the conditional error rates of the test have to be taken into consideration to make a balanced decision; hence, the dOR is less useful, as it does not distinguish between the two types of diagnostic mistake (Glas et al. 2003a).

Fig. 1 Contour plots (predictive region) and quantile regression curves from the best fitted copula mixed model for the bladder cancer data. Red and green lines represent the quantile regression curves \(x_1:=\widetilde{x}_1(x_2,q)\) and \(x_2:=\widetilde{x}_2(x_1,q)\), respectively; for \(q=0.5\) solid lines and for \(q\in \{0.01,0.99\}\) dotted lines (confidence region). In case of BTATRAK and telomerase the predictive and confidence region are meaningless since the Kendall’s \(\tau \) association is close to \(-1\). In this case all the quantile regression curves almost coincide, and hence, we depict only the median regression curve for each model. In case of BTA the axes are in logit scale since we also plot the estimated contour plot of the random effects distribution as predictive region; this has been estimated for the logit pair of (Sensitivity, Specificity) (colour figure online)

In fact, if interest lies only in the overall sensitivity and specificity, then the overall test accuracy across studies will not be clearly defined. Different studies use different thresholds for a positive test result; thus, the overall sensitivity and specificity do not make sense. Instead, some form of the summary receiver operating characteristic (SROC) curve makes much more sense and will help decision makers to assess the actual diagnostic accuracy of a diagnostic test. In an era of evidence-based medicine, decision makers need high-quality procedures such as the SROC curves to support decisions about whether or not to use a diagnostic test in a specific clinical situation and, if so, which test.

An SROC curve is deduced for the copula mixed model in Nikoloulopoulos (2015a) through a median regression curve of \(X_1\) on \(X_2\). For the copula mixed model, the model parameters (including dependence parameters), the choice of the copula, and the choice of the margin affect the shape of the SROC curve (Nikoloulopoulos 2015a). However, there is no a priori reason to regress \(X_1\) on \(X_2\) instead of the other way around, so Nikoloulopoulos (2015a) also provides a median regression curve of \(X_2\) on \(X_1\). Apparently, while there is a unique definition of the ROC curve within a study with fixed accuracy, there is no unique definition of the SROC curve across multiple studies with different accuracies (Rücker and Schumacher 2010). As Arends et al. (2008) have pointed out, none of the SROC curves proposed in the literature can be interpreted as an average ROC. Rücker and Schumacher (2009) stated that instead of summarizing data using an SROC, it might be preferable to give confidence regions. Hence, in addition to using just median regression curves, Nikoloulopoulos (2015a) proposed quantile regression curves with a focus on high (\(q=0.99\)) and low quantiles (\(q = 0.01\)), which are strongly associated with the upper and lower tail dependence imposed from each parametric family of copulas. These can be seen as confidence regions of the median regression SROC curve. Amongst the parametric families of copulas in Table 2 the tail dependence varies and is a property to consider when choosing amongst different families of copulas as it affects the shape of SROC curves (Nikoloulopoulos 2015a). Finally, Nikoloulopoulos (2015a), to preserve the nature of a bivariate response instead of a univariate response along with a covariate, proposed to plot the estimated contour of the random effects distribution. The contour plot can be seen as the predictive region of the estimated pair of sensitivity and specificity. The prediction region of the copula mixed model does not depend on the assumption of bivariate normality of the random effects and has a non-elliptical shape.
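To make the construction concrete, the sketch below evaluates a quantile regression curve of \(X_1\) on \(X_2\) for the BVN copula with normal margins and logit link. It assumes that the curve is the conditional \(q\)-quantile of \(X_1\) given \(X_2=x_2\) under the fitted copula, i.e. \(u_1=C^{-1}(q|u_2;\theta )\) mapped back through the margin and the link; the function name is hypothetical and the code is not taken from CopulaREMADA.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def quantile_curve_x1(x2, q, pi1, pi2, sigma1, sigma2, theta):
    """q-th quantile of sensitivity x1 given specificity x2 (BVN copula, normal margins)."""
    # Normal score of u2 = F(l(x2); l(pi2), sigma2).
    z2 = (logit(x2) - logit(pi2)) / sigma2
    # Normal score of u1 = C^{-1}(q | u2; theta) for the BVN copula.
    z1 = math.sqrt(1.0 - theta**2) * STD_NORMAL.inv_cdf(q) + theta * z2
    # Back to the probability scale through the margin and the inverse link.
    return expit(logit(pi1) + sigma1 * z1)

# Median (q = 0.5) and outer (q = 0.01, 0.99) curves on a grid of specificities;
# the parameter values are illustrative, not estimates from the paper.
grid = [0.50 + 0.05 * k for k in range(10)]
curves = {q: [quantile_curve_x1(x2, q, 0.79, 0.91, 0.43, 1.83, -0.7) for x2 in grid]
          for q in (0.01, 0.5, 0.99)}
```

With a negative dependence parameter the median curve is decreasing in the specificity, which is the familiar trade-off shape of an SROC curve; under independence (\(\theta =0\)) it is flat at \(\pi _1\), which illustrates why a method that sets the dependence to zero cannot produce a meaningful SROC curve.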

Figure 1 demonstrates these curves and summary operating points (a pair of average sensitivity and specificity) with a confidence and a predictive region from the best fitted copula mixed model for all the meta-analyses in Sect. 5. Neither of the CL methods in Chen et al. (2016b, 2017) can be used to produce the SROC curves, since the dependence parameters affect the shape of the SROC curve and these are set to independence by definition. Note in passing that the CL method in Chen et al. (2017) can provide a confidence region, but this is restricted to the elliptical shape.

Nevertheless, the additional feature of having to estimate the association amongst the random effects in ML estimation has been found to require larger sample sizes than in CL estimation where this parameter is set to independence. The application example includes cases with an adequate number of individual studies. For meta-analyses with fewer studies the CL methods in Chen et al. (2016b, 2017) can be recommended if a bivariate copula mixed model is nearly non-identifiable (or has a flat log-likelihood) and the estimation of an average operating point (summary sensitivity and specificity) is of interest instead of an SROC curve.

7 Software

The R package CopulaREMADA (Nikoloulopoulos 2016) has been used to produce the ML estimates (along with their SE) of the parameters from the copula mixed models and plot the SROC curves and summary operating points (a pair of average sensitivity and specificity) with a confidence and a predictive region. The R package xmeta (Chen et al. 2016a) has been used to produce the CL estimates (along with their SE) of the parameters from both methods in Chen et al. (2016b, 2017).