Comparison of Bayesian and classical methods in the analysis of cluster randomized controlled trials with a binary outcome: The Community Hypertension Assessment Trial (CHAT)
Abstract
Background
Cluster randomized trials (CRTs) are increasingly used to assess the effectiveness of interventions to improve health outcomes or prevent diseases. However, the efficiency and consistency of using different analytical methods in the analysis of binary outcomes have received little attention. We described and compared various statistical approaches in the analysis of CRTs using the Community Hypertension Assessment Trial (CHAT) as an example. The CHAT study was a cluster randomized controlled trial aimed at investigating the effectiveness of pharmacy-based blood pressure (BP) clinics led by peer health educators, with feedback to family physicians (CHAT intervention) against the Usual Practice model (Control), on the monitoring and management of BP among older adults.
Methods
We compared three cluster-level and six individual-level statistical analysis methods in the analysis of binary outcomes from the CHAT study. The three cluster-level analysis methods were: i) unweighted linear regression, ii) weighted linear regression, and iii) random-effects meta-regression. The six individual-level analysis methods were: i) standard logistic regression, ii) robust standard errors approach, iii) generalized estimating equations, iv) random-effects meta-analytic approach, v) random-effects logistic regression, and vi) Bayesian random-effects regression. We also investigated the robustness of the estimates after the adjustment for the cluster-level and individual-level covariates.
Results
Among all the statistical methods assessed, the Bayesian random-effects logistic regression method yielded the widest 95% interval estimate for the odds ratio and consequently led to the most conservative conclusion. However, the results remained robust under all methods, showing sufficient evidence in support of the hypothesis of no effect for the CHAT intervention against the Usual Practice control model for management of blood pressure among seniors in primary care. The individual-level standard logistic regression is the least appropriate method in the analysis of CRTs because it ignores the correlation of the outcomes for the individuals within the same cluster.
Conclusion
We used data from the CHAT trial to compare different methods for analysing data from CRTs. Using different methods to analyse CRTs provides a good approach to assess the sensitivity of the results to enhance interpretation.
Keywords
Prior Distribution, Generalize Estimate Equation, Blood Pressure Clinic, Generalize Estimate Equation Model, Paired Cluster
Background
Cluster randomized trials (CRTs) are increasingly used in the assessment of the effectiveness of interventions to improve health outcomes or prevent diseases [1]. The units of randomization for such trials are groups or clusters such as family practices, families, hospitals, or entire communities rather than individuals themselves. CRT designs are used to evaluate the effectiveness of not only group interventions but also individual interventions where group-level effects are relevant. CRTs may lead to substantially reduced statistical efficiency compared to trials that randomize the same number of individuals [2]. They may also produce selection bias since the allocation arm that the subject receives is often known in advance [3]. However, in practice, CRT designs have several attractive features that may outweigh these disadvantages. Cluster randomization minimizes the likelihood of contamination between the intervention and the control arms. In addition, the nature of the intervention itself may dictate its application as the optimal strategy [4].
The main consequence of a cluster design is that the outcomes for subjects within the same cluster cannot be assumed to be independent. This is because the subjects within the same cluster are more likely to be similar to each other than those from different clusters. This leads to a reduction in statistical efficiency due to clustering, i.e. the design effect. The design effect is a function of the variance inflation factor (VIF), given by $1 + (\bar{m} - 1)\rho$, where $\bar{m}$ denotes the average cluster size and ρ is a measure of intracluster correlation, interpretable as the correlation between any two responses in the same cluster [2, 5]. Considering the two components of the variation in the outcome, between-cluster and within-cluster variations, ρ may also be interpreted as the proportion of overall variation in outcome that can be accounted for by the between-cluster variation.
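As a small worked illustration of the variance inflation factor, 1 + (m̄ − 1)ρ, the sketch below computes the VIF and the corresponding effective sample size. The cluster size of 55 and the total of 1540 patients come from the CHAT design; the intracluster correlation of 0.05 is a hypothetical value chosen only for illustration, and the function names are ours.

```python
def variance_inflation_factor(mean_cluster_size: float, icc: float) -> float:
    """Return VIF = 1 + (m_bar - 1) * rho."""
    if mean_cluster_size < 1:
        raise ValueError("mean cluster size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("intracluster correlation must lie in [0, 1]")
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n_total: int, mean_cluster_size: float, icc: float) -> float:
    """Number of independent observations carrying the same information."""
    return n_total / variance_inflation_factor(mean_cluster_size, icc)


# 28 practices x 55 patients; rho = 0.05 is an assumed value, not a CHAT estimate.
vif = variance_inflation_factor(55, 0.05)
ess = effective_sample_size(1540, 55, 0.05)
print(f"VIF = {vif:.2f}, effective sample size = {ess:.0f}")
```

Under this assumed ρ, the 1540 clustered patients would carry roughly the information of a few hundred independent patients, which is why ignoring clustering overstates precision.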
These principles are well established in the design of CRTs, especially when there are implications for the sample size planning. In statistical analysis, it has long been recognized that ignoring the clustering effect will increase the chance of obtaining statistically significant but spurious findings [6]. Although many papers have compared analytical methods for CRTs with binary outcomes over the last decade, none have investigated the Bayesian model in the analysis of CRTs in detail. In particular, comparison of the random-effects meta-analytic approach with other methods for the analysis of matched-pair CRTs [7] has not been done. In this paper, we compare various statistical approaches in the analysis of CRTs using the Community Hypertension Assessment Trial (CHAT) as an example. The CHAT study is a multicentre randomized controlled trial using blocked stratified, matched-pair cluster randomization. We also explored in detail the application of the Bayesian random-effects model in the analysis of CRTs. In particular, we investigated the impact of different prior distributions on the estimate of the treatment effect.
Methods
Overview of the CHAT study
The CHAT study was a cluster randomized controlled trial aimed at investigating the effectiveness of pharmacy-based blood pressure (BP) clinics led by peer health educators, with feedback to family physicians (FP) on the monitoring and management of BP among older adults [8]. The participants of the trial included the FP practices, patients, pharmacies and peer health educators. Eligible FPs were non-academic, full-time practitioners with regular family practices in terms of size and case-mix, and were able to provide an electronic roster that included a mailing address of their patients 65 years and older. FPs who worked in walk-in clinics or emergency departments, were about to retire or worked part-time, had fewer than 50 patients 65 years or older, or had a specialized practice profile were excluded from the study. Eligible patients were 65 years or older at the beginning of the study, considered by their FPs to be regular patients, community-dwelling and able to leave their homes to attend the community-based pharmacy sessions. To ensure that the results would be generalizable to patients in other FP practices, the trial had very few exclusion criteria for patients.
The study design was a multicentre randomized controlled trial using blocked stratified, matched-pair cluster randomization. Family practices were the unit of randomization. Eligible practices were stratified according to (1) the median number of patients in the practice with adequate BP control and (2) the median number of patients aged 65 years and older, and matched according to centers. The trial started in 2003 with 28 FPs practising in Ottawa and Hamilton randomly selected from the eligible FPs. Fourteen were randomly allocated to the intervention (pharmacy BP clinics) and 14 to the control group (no BP clinics offered). Fifty-five eligible patients were randomly selected from each FP roster. Therefore, 1540 patients participated in the study.
All eligible patients in both the intervention and control groups received usual health service at their FP's office. Patients in the practices allocated to the intervention group were invited to visit the community BP clinics. Peer health educators assisted patients to measure their BP and record their readings on a form that also asked about cardiovascular risk factors. Research nurses, assisted by the FP office staff, conducted the baseline and end-of-trial (12 months after the randomization) audits of the health records of the 1540 patients (55 per practice) who participated in the study. These data were collected to determine the effect of the intervention.
Outcomes
The primary outcome of the CHAT study was a binary outcome, with "1" indicating that the patient's BP was controlled at the end of the trial and "0" otherwise. We defined the patient's BP as controlled if:

the BP reading was available in the patient's chart at the end of the trial and the systolic BP ≤ 140 mmHg and diastolic BP ≤ 90 mmHg for patients without diabetes or target organ damage, or

the systolic BP ≤ 130 mmHg and diastolic BP ≤ 80 mmHg for patients with diabetes or target organ damage.
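The definition above can be sketched as a small function. This is only an illustration: the function and argument names are hypothetical, and it assumes that a missing chart reading is coded as "0" (not controlled) and that the availability requirement applies to both patient groups.

```python
from typing import Optional


def bp_controlled(systolic: Optional[float],
                  diastolic: Optional[float],
                  diabetes_or_organ_damage: bool) -> int:
    """Return 1 if BP is controlled under the stated thresholds, else 0."""
    # Assumption: no reading in the chart at the end of the trial -> not controlled.
    if systolic is None or diastolic is None:
        return 0
    # Stricter targets for patients with diabetes or target organ damage.
    if diabetes_or_organ_damage:
        systolic_limit, diastolic_limit = 130, 80
    else:
        systolic_limit, diastolic_limit = 140, 90
    return int(systolic <= systolic_limit and diastolic <= diastolic_limit)


print(bp_controlled(138, 85, False))  # within 140/90
print(bp_controlled(138, 85, True))   # above 130/80
```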
Secondary outcomes of the CHAT study included 'BP monitored', frequency of BP monitoring, systolic BP reading, and diastolic BP reading. The analyses presented in this paper are based on the primary outcome only. The analysis of secondary outcomes will be the subject of another paper reporting the trial results.
Statistical methods
The analysis of CRTs may be based on aggregated data from each cluster or on individual-level data, which correspond to the cluster-level and the individual-level analysis methods, respectively. The adjustment for individual-level covariates may be applied only in the individual-level analysis, while the adjustment for cluster-level covariates may be applied in both the cluster-level and the individual-level analyses. In this paper, the random-effects meta-regression method was performed using STATA Version 8.2 (College Station, TX). Other standard analyses were performed using SAS Version 9.0 (Cary, NC). The Bayesian analysis was performed using WinBUGS Version 1.4. The results from classical analyses for binary outcomes are reported as odds ratio (OR) and corresponding 95% confidence interval (CI). The results from the Bayesian method are reported as posterior estimate and corresponding 95% credible interval (CrI). CrIs are the Bayesian analog of confidence intervals. The reporting of the results follows the CONSORT (Consolidated Standards of Reporting Trials) statement guidelines for reporting cluster-randomized trials [9] and the ROBUST guideline [10] for reporting Bayesian analysis.
Cluster-level analysis methods
Unweighted linear regression
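The unweighted linear regression treats the log odds of the outcome estimated from each cluster as a continuous outcome:

$\log\left(\frac{p_{i}}{1 - p_{i}}\right) = x_{i}\beta + u_{i}$

where p_{i} denotes the proportion of patients with BP controlled in cluster/FP i, x_{i} denotes the vector of covariates (intervention groups and centers), β represents the effect of the covariates on the log odds scale, and u_{i} represents the cluster-level random effect. The u_{i} here is assumed to follow a normal distribution with a zero mean and a constant variance. In this method, each cluster/FP is given equal weight when estimating the regression coefficient β. We implemented this model using SAS proc glm.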
Weighted linear regression
The weighted linear regression method [12] has the same model expression as the unweighted linear regression method. It treats the log odds estimated from each cluster as the outcome, and treatment group as one of the explanatory variables. The weight was defined as the inverse variance of the log odds, i.e. w_{i} = 1/var_{i} for FP i. Compared to the unweighted linear regression, in which all cluster estimates are weighted equally, the weighted linear regression gives clusters with higher precision more weight, and therefore a larger contribution in estimating the treatment effect. We implemented this model using SAS proc glm.
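The difference between the unweighted and the weighted cluster-level regressions can be sketched as follows. This is a minimal illustration, not the SAS proc glm analysis: the cluster counts are hypothetical, the only covariate is the treatment arm (the centre covariate is omitted), and the variance of each cluster's log odds is approximated by 1/events + 1/non-events, which is our assumption about how var_{i} could be obtained.

```python
import math


def cluster_log_odds(events: int, n: int):
    """Return (log odds, approximate variance of the log odds) for one cluster."""
    non_events = n - events
    return math.log(events / non_events), 1.0 / events + 1.0 / non_events


def wls_slope(x, y, w):
    """Slope of a weighted least-squares fit of y on x with an intercept."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


# Hypothetical clusters: (arm, patients with BP controlled, cluster size).
clusters = [(1, 35, 55), (1, 28, 55), (1, 40, 55),
            (0, 30, 55), (0, 25, 55), (0, 33, 55)]
arm = [float(a) for a, _, _ in clusters]
stats = [cluster_log_odds(e, n) for _, e, n in clusters]
log_odds = [s[0] for s in stats]
variances = [s[1] for s in stats]

# Unweighted: every cluster gets weight 1. Weighted: w_i = 1 / var_i.
beta_unweighted = wls_slope(arm, log_odds, [1.0] * len(clusters))
beta_weighted = wls_slope(arm, log_odds, [1.0 / v for v in variances])
or_unweighted = math.exp(beta_unweighted)
or_weighted = math.exp(beta_weighted)
print(f"unweighted OR = {or_unweighted:.2f}, weighted OR = {or_weighted:.2f}")
```

With a single binary covariate, the unweighted slope is simply the difference between the arm means of the cluster log odds, while the weighted slope is the difference between the inverse-variance weighted means, which is why weighting can shift the point estimate as well as its precision.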
Random-effects meta-regression
In the random-effects meta-regression, however, the u_{i} is assumed to follow a normal distribution on the log odds scale with a zero mean and an uncertain variance, which represents the between-cluster variance and can be estimated when fitting the model. We implemented this model using STATA metareg.
Individual-level analyses
We used six individual-level statistical methods that extend the standard logistic regression methods by adding specific strategies to handle the clustering of the data, and therefore are valid for analyzing clustered data.
Standard logistic regression
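The standard logistic regression model can be written as

$\operatorname{logit}(\pi_{ij}) = \log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = x_{ij}\beta$

where x_{ij} denotes the vector of covariates (BP controlled at baseline, intervention groups etc.) for patient j in the cluster/FP i; y_{ij} is the binary outcome indicating if the BP is controlled for patient j in the cluster/FP i; and π_{ij} = Pr(y_{ij} = 1 | x_{ij}).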
The standard logistic model assumes that data from different patients are independent. Since this assumption is not valid for the correlated data, it is not valid for analyzing cluster randomized trials. We implemented this model using SAS proc genmod.
Robust standard errors
The robust standard error method [14, 16] gives the same point estimates as the standard logistic regression, since both assume independent data when estimating the treatment effect. However, in the robust standard errors method, the standard errors for all the estimates are obtained using the 'Huber sandwich estimator', which can be used to estimate the variance of the maximum likelihood estimate when the underlying model is incorrect or the model assumption is wrong [17]. It is often used for clustered data [18]. We implemented this model using SAS proc genmod.
Generalized estimating equations
The generalized estimating equations (GEE) [14, 19, 20] method permits the specification of a working correlation matrix that accounts for the form of withincluster correlation of the outcomes. In the analysis of CRTs, we generally assume that there is no logical ordering for individuals within a cluster, i.e. the individuals within the same cluster are equally correlated. In this case, an exchangeable correlation matrix should be used. We implemented this model using SAS proc genmod.
Though the sandwich standard error estimator is consistent even when the underlying model is specified incorrectly, it tends to underestimate the standard error of the regression coefficient when the number of clusters is not large enough [21, 22]. Furthermore, the estimate of standard error is highly variable when the number of clusters is too small. In this paper, we employed two methods proposed by Ukoumunne [23] to correct this bias. Both methods can be used when there are equal numbers of clusters in each arm and no covariate adjustment. In the first method, modified GEE (1), the bias of the sandwich standard error is corrected by multiplying it by $\sqrt{J/(J-1)}$, where J is the number of clusters in each arm. In the second method, modified GEE (2), the increased variability of the sandwich standard error estimator is accounted for by building the confidence interval for the treatment effect based on the quantiles from the t-distribution with 2(J − 1) degrees of freedom.
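The two small-sample corrections can be illustrated with a few lines of arithmetic. The sketch below assumes that the multiplier in modified GEE (1) is √(J/(J − 1)), back-calculates an approximate sandwich standard error from the reported unadjusted GEE result (OR 1.14, 95% CI 0.72 to 1.80), and hard-codes the 97.5% quantile of the t-distribution with 26 degrees of freedom (about 2.0555, taken from standard tables) rather than computing it. Because the inputs are rounded, the output is approximate.

```python
import math

J = 14                      # clusters per arm in CHAT
Z_975 = 1.96                # normal quantile used for the standard GEE interval
T_975_DF26 = 2.0555         # t quantile with 2 * (J - 1) = 26 df (table value)

log_or = math.log(1.14)
# Approximate sandwich SE recovered from the rounded GEE interval (0.72, 1.80).
se_sandwich = (math.log(1.80) - math.log(0.72)) / (2 * Z_975)


def ci_for_or(log_or: float, se: float, quantile: float):
    """Confidence interval for the odds ratio from a log-scale estimate."""
    return math.exp(log_or - quantile * se), math.exp(log_or + quantile * se)


# Modified GEE (1): inflate the SE, keep the normal quantile.
se_corrected = se_sandwich * math.sqrt(J / (J - 1))
ci_modified_1 = ci_for_or(log_or, se_corrected, Z_975)

# Modified GEE (2): keep the SE, use the t quantile instead.
ci_modified_2 = ci_for_or(log_or, se_sandwich, T_975_DF26)

print("modified GEE (1):", tuple(round(v, 2) for v in ci_modified_1))
print("modified GEE (2):", tuple(round(v, 2) for v in ci_modified_2))
```

Both corrections widen the interval only slightly with 14 clusters per arm, and the resulting limits are in line with the modified GEE rows of the results table.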
Random-effects meta-analytic approach
This method is appropriate only for CRTs with a matched-pair design [2]. If we assume that the data from each paired cluster arise from a meta-analysis of independent randomized controlled clinical trials, then we can apply the traditional random-effects meta-analysis method to pool the results from all the pairs [13]. The random-effects meta-analytic approach for analysing CRTs consists of two steps. First, the treatment effect is estimated for each paired cluster. Second, the overall treatment estimator is calculated as a weighted average of the paired cluster estimates, where weights are the inverse of the estimated variances of treatment effects of the paired clusters. We implemented this model using SAS proc genmod and proc mixed.
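The two steps can be sketched as below. This is an illustration under stated assumptions rather than the SAS implementation: the pair-level data are hypothetical, the pair-specific treatment effect is taken to be the log odds ratio from a 2x2 table, and the between-pair variance is estimated with the DerSimonian-Laird moment estimator, which the text does not name.

```python
import math


def pair_log_or(trt_events, trt_n, ctl_events, ctl_n):
    """Step 1: log odds ratio and its variance for one matched pair."""
    a, b = trt_events, trt_n - trt_events
    c, d = ctl_events, ctl_n - ctl_events
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def dersimonian_laird(effects, variances):
    """Step 2: random-effects pooled estimate, its SE and tau^2."""
    k = len(effects)
    if k < 2:
        raise ValueError("at least two pairs are needed")
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)      # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2


# Hypothetical pairs: (intervention events, n, control events, n).
pairs = [(32, 55, 30, 55), (28, 55, 31, 55), (36, 55, 29, 55), (30, 55, 27, 55)]
estimates = [pair_log_or(*p) for p in pairs]
pooled, se, tau2 = dersimonian_laird([e for e, _ in estimates],
                                     [v for _, v in estimates])
low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"pooled OR = {math.exp(pooled):.2f} ({low:.2f}, {high:.2f}), tau^2 = {tau2:.3f}")
```

When the estimated between-pair variance is zero, the weights reduce to the inverse within-pair variances described in the text.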
Random-effects logistic regression
The random-effects logistic regression [15, 19] is a special kind of hierarchical linear model. Compared to the standard logistic regression, the random-effects logistic regression includes a cluster-level random effect in the model, which is assumed to follow a normal distribution with a zero mean and an unknown variance τ^{2} (the between-cluster variance); τ^{2} is estimated in the regression. We adopted the pseudo-likelihood estimation algorithm of Wolfinger and O'Connell [24], which allows an overdispersion parameter to be estimated. Compared to the Bayesian model, the CI for the treatment effect from this method is narrower since it is based on estimated constant variance components without allowance for uncertainty [25]. In practice, it may be difficult to assess the validity of the model assumption that the cluster-level random effects follow a normal distribution. We implemented this model using SAS macro glimmix.
Bayesian random-effects regression
The Bayesian random-effects regression model [26] has the same format as the traditional random-effects logistic regression. However, it is based on different assumptions about the variance of the cluster-level random effect. The Bayesian approach treats the variance of the random effect τ^{2} as an unknown parameter, while the traditional regression approach treats it as a constant. In the Bayesian approach, the uncertainty of τ^{2} is taken into account by assuming a prior distribution that represents the researcher's prior belief or external information about τ^{2}. The observed data are represented by a likelihood function, which is used to update the researcher's prior belief and obtain the final results. The final results are presented as the posterior distribution.
When applying the Bayesian model, it is essential to state in advance the source and structure of the prior distributions that are proposed for the principal analysis [27, 28]. In our Bayesian analysis, we assumed a non-informative uniform prior distribution with lower and upper bounds of 0 and 10, respectively, to minimize the influence of the researcher's prior belief or external information on the observed data. Consequently, the result from the Bayesian approach should be comparable to the results from the classical statistical methods. We also assumed that the prior distribution for all the coefficients follows a normal distribution with a mean of zero and precision 1.0E-6. The total number of iterations to obtain the posterior distribution for each end point was 500,000, the burn-in number was 10,000, and the seed was 314159. Non-convergence of the Markov chain was evaluated by examining the estimated Monte Carlo error for posterior distributions and dynamic trace plots, time series plots, density plots and autocorrelation plots.
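To make the combination of prior and likelihood concrete, the sketch below writes out an unnormalised log posterior for a random-effects logistic model on cluster-level counts. It is not the WinBUGS model used in the analysis, which is not shown in the text. It assumes that the uniform prior with bounds 0 and 10 is placed on the standard deviation of the random effect (the text does not say whether the bounds apply to the standard deviation or to the variance), that the normal prior on the coefficients has mean zero and precision 1.0E-6, and that the only covariate is the treatment arm. All names are hypothetical.

```python
import math


def log_posterior(alpha, beta, sigma, u, arm, events, sizes,
                  coef_precision=1.0e-6, sigma_upper=10.0):
    """Unnormalised log posterior: log prior + log likelihood."""
    # Uniform(0, sigma_upper) prior on sigma (an assumption, see lead-in):
    # constant inside the support, zero density outside it.
    if not 0.0 < sigma < sigma_upper:
        return -math.inf
    # Vague normal priors on intercept and treatment effect
    # (precision 1.0E-6 corresponds to a standard deviation of 1000).
    log_prior = -0.5 * coef_precision * (alpha ** 2 + beta ** 2)
    # Cluster-level random effects: u_i ~ Normal(0, sigma^2).
    log_prior += sum(-math.log(sigma) - 0.5 * (ui / sigma) ** 2 for ui in u)
    # Binomial likelihood with logit(p_i) = alpha + beta * arm_i + u_i,
    # dropping the binomial coefficient, which does not depend on parameters.
    log_lik = 0.0
    for ui, xi, yi, ni in zip(u, arm, events, sizes):
        eta = alpha + beta * xi + ui
        log_lik += yi * eta - ni * math.log1p(math.exp(eta))
    return log_prior + log_lik


# Hypothetical data for four clusters (two per arm).
arm = [1, 1, 0, 0]
events = [32, 29, 30, 27]
sizes = [55, 55, 55, 55]
value = log_posterior(0.1, 0.1, 0.5, [0.0, 0.0, 0.0, 0.0], arm, events, sizes)
print(f"log posterior at the chosen point: {value:.2f}")
```

An MCMC sampler such as the one in WinBUGS draws from the distribution defined by a function of this kind; the 95% credible interval for the odds ratio is then read off the sampled values of exp(beta).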
Impact of priors for Bayesian analysis
Even though the researcher's subjective prior beliefs, which are expressed as prior distribution functions, can be updated by the likelihood function of the observed data, misspecification of priors has an impact on the posterior in some cases. To verify the robustness of the results from the Bayesian random-effects logistic regression, we evaluated the impact of different prior distributions of the variance parameter in the analysis of the primary outcome, BP controlled, without adjustment for any covariates.
Comparison of the Impact of Different Priors on Bayesian Model
Outcome: BP controlled (unadjusted for covariates)

| Type of Prior | Prior distribution | Odds Ratio | 95% CI |
|---|---|---|---|
| | Uniform (0, 1) | 1.11 | (0.64, 1.92) |
| | Uniform (0, 5) | 1.09 | (0.61, 1.94) |
| Non-informative | Uniform (0, 10) | 1.09 | (0.61, 1.94) |
| | Uniform (0, 50) | 1.09 | (0.61, 1.94) |
| | Uniform (0, 100) | 1.09 | (0.61, 1.94) |
| Non-informative and Conjugate | IGamma (0.001, 0.001) | 1.11 | (0.63, 1.94) |
| | IGamma (0.01, 0.01) | 1.11 | (0.63, 1.95) |
| | IGamma (0.1, 0.1) | 1.12 | (0.64, 1.95) |
Results
Since the data collection was based on chart review, there were very few missing values for the CHAT study. Demographic information and health conditions were balanced between the two study arms at baseline. Of the 1540 patients who were included, there were 41% (319/770) male patients in the control group and 44% (339/769) male patients in the intervention group. At the beginning of the trial, the mean age of the patients was 74.36 with a standard deviation (SD) of 6.22 in the control group, and 74.16 with SD of 6.14 in the intervention group. In the intervention and control group, 55% (425/770) and 55% (420/770) of patients had BP controlled at baseline; 57% (437/770) and 53% (409/770) of patients had BP controlled at the end of the trial.
In analyzing the binary primary outcome of the CHAT trial (BP controlled), the estimates differed across the statistical methods. However, the estimates obtained from all nine methods showed no significant difference between the intervention and the control groups in improving the patients' BP.
Comparison of Nine Methods with and without Adjustment for Covariates

| Unit of Analysis | Method of Analysis | Unadjusted OR | Unadjusted 95% CI | Adjusted OR | Adjusted 95% CI |
|---|---|---|---|---|---|
| Cluster | Unweighted Regression | 1.05 | (0.59, 1.87) | 1.05 | (0.60, 1.84) |
| | Weighted Regression | 1.27 | (0.81, 1.99) | 1.27 | (0.82, 1.96) |
| | Random-effects Meta-Regression | 1.05 | (0.60, 1.85) | 1.05 | (0.61, 1.82) |
| Individual | Standard Logistic Regression | 1.14 | (0.93, 1.39) | 1.17 | (0.95, 1.44) |
| | Robust Standard Error | 1.14 | (0.72, 1.80) | 1.17 | (0.79, 1.73) |
| | Generalized Estimating Equations ** | 1.14 | (0.72, 1.80) | 1.15 | (0.76, 1.72) |
| | Modified GEE (1) *** | 1.14 | (0.71, 1.83) | | |
| | Modified GEE (2) **** | 1.14 | (0.71, 1.84) | | |
| | Random-effects Meta-Analysis | 1.09 | (0.68, 1.74) | 1.12 | (0.73, 1.70) |
| | Random-effects Logistic Regression | 1.10 | (0.65, 1.86) | 1.13 | (0.71, 1.80) |
| | Bayesian Random-effects Regression | 1.12 | (0.64, 1.95) | 1.13 | (0.68, 1.87) |
Discussion
Summary of Key Findings
We applied three cluster-level and six individual-level approaches to analyse the results of the CHAT study. We also employed two methods to correct the bias of the sandwich standard error estimator from the GEE model. Among all the analytic approaches, only the individual-level standard logistic regression was inappropriate, since it does not account for the between-cluster variation. As a result, it tends to underestimate the standard error of the treatment effect and its p-value, and might therefore exaggerate the treatment effect. All the other methods handle the clustering by different techniques, and therefore were appropriate. All but the weighted regression method yielded similar point estimates of the treatment effect. This is not surprising since weighting can potentially affect the location of the estimate as well as the precision. The Bayesian random-effects logistic regression yielded the widest 95% interval estimate. This was due to the fact that the Bayesian random-effects logistic regression incorporates the uncertainty of all parameters. The 95% confidence intervals for the treatment effect from the two modified GEE models are slightly wider than that from the GEE model. Adjusting for important covariates that are correlated with the outcome increased the precision and reduced the ICC. This is consistent with the finding from Campbell for the analysis of cluster trials in family medicine with a continuous outcome [30]. By adjusting for important covariates, we are able to control for the effect of imbalances in baseline risk factors and reduce unexplained variation.
In general, it is important to note that for logistic regression, the population-averaged model (fitted using GEE) and the cluster-specific model (fitted using random-effects models) are in fact estimating different population models. This is covered in detail by Campbell [31] and was first discussed by Neuhaus and Jewell [32]. Thus, we would not expect the estimates for the GEE and the random-effects logistic regression to be exactly the same. However, they are related through the ICC [31]. In our case, the estimates from the two models are similar since the ICC in the CHAT study is relatively small.
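One way to see how the population-averaged and cluster-specific estimates are linked is the well-known approximation β_marginal ≈ β_conditional / √(1 + c²σ²), with c = 16√3/(15π), where σ² is the variance of the normally distributed random intercept. The sketch below applies it; this approximation and the latent-scale definition of the ICC, ρ = σ²/(σ² + π²/3), are our assumptions for illustration and are not necessarily the relationship or the ICC definition used in [31]. The input values are hypothetical.

```python
import math

C_SQUARED = (16.0 * math.sqrt(3.0) / (15.0 * math.pi)) ** 2  # about 0.346


def variance_from_latent_icc(icc: float) -> float:
    """Random-intercept variance implied by a latent-scale ICC."""
    if not 0.0 <= icc < 1.0:
        raise ValueError("ICC must lie in [0, 1)")
    return icc * (math.pi ** 2 / 3.0) / (1.0 - icc)


def marginal_from_conditional(beta_conditional: float, sigma2: float) -> float:
    """Approximate population-averaged log OR from a cluster-specific log OR."""
    return beta_conditional / math.sqrt(1.0 + C_SQUARED * sigma2)


beta_conditional = math.log(1.5)          # hypothetical cluster-specific OR of 1.5
for icc in (0.01, 0.05, 0.20):            # hypothetical ICC values
    sigma2 = variance_from_latent_icc(icc)
    beta_marginal = marginal_from_conditional(beta_conditional, sigma2)
    print(f"ICC = {icc:.2f}: marginal OR is about {math.exp(beta_marginal):.3f}")
```

The attenuation towards the null is negligible for a small ICC and grows as the ICC increases, which is consistent with the similar GEE and random-effects estimates observed when the ICC is small.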
Sensitivity analysis and simulation study
Several sensitivity analyses can be considered for CRTs. First, since different methods yield different results, and very few methodological studies provide guidance on determining which method is the best, comparing the results from different methods might help researchers to draw a safer conclusion, though the marginal odds ratio estimated by the GEE and the conditional odds ratio estimated from random-effects models may be interpreted differently [33]. Second, sensitivity analysis can be used to investigate the sensitivity of the conclusions to different model assumptions. For example, in the random-effects model, we assume that the cluster-level random effects follow a normal distribution on the log odds scale. However, a sensitivity analysis can be carried out by allowing empirical investigation of the distribution of the random effects. Finally, a sensitivity analysis can also indicate which parameter values are reasonable to use in the model.
The Bayesian analysis incorporates different sources of information in the model. However, a disadvantage of this technique is that the results of the analysis are dependent on the choice of prior distributions. We performed more analyses to assess the sensitivity of the results to different prior distributions representing weak information (i.e. non-informative priors) relative to the trial data, and the results remained robust.
A simulation study by Austin [34] suggested that the statistical power of GEE is the highest among the t-test, Wilcoxon rank sum test, permutation test, adjusted chi-square test and logistic random-effects model for the analysis of CRTs. However, researchers should be cautioned about the limitations of the GEE method. First, when the number of clusters is small, the estimate of variance produced under GEE could be biased [21, 22], particularly if the number of clusters is less than 20 [35]. In this case, correction for the bias would be necessary. Second, the research on goodness-of-fit tests for the GEE application still faces some challenges [36]. Third, Ukoumunne et al [23] compared the accuracy of the estimation and the confidence interval coverage from three cluster-level methods (the unweighted cluster-level mean difference, weighted cluster-level mean difference and cluster-level random-effects linear regression) and the GEE model in the analysis of binary outcomes from a CRT. Their results showed that the cluster-level methods performed well for trials with a sufficiently large number of subjects in each cluster and a small ICC. The GEE model led to some bias of the sandwich standard error estimator when the number of clusters was relatively small. However, this bias could be corrected by multiplying the sandwich standard error by $\sqrt{J/(J-1)}$, where J is the number of clusters in each arm, or by building the confidence interval for the treatment effect based on the quantiles from the t-distribution with 2(J − 1) degrees of freedom. With these corrections, the GEE was found to have good properties and would be generally preferred in practice over the cluster-level methods since both cluster-level and individual-level confounders can be adjusted for.
Conclusion
We used data from the CHAT trial to compare different methods for analysing data from CRTs. Among all the statistical methods, the Bayesian analysis gave the largest standard error for the treatment effect and the widest 95% interval, and therefore provided the most conservative evidence. However, the results remained robust under all methods, showing sufficient evidence in support of the hypothesis of no effect for the CHAT intervention against the Usual Practice control model for management of blood pressure among seniors in primary care. Our analysis reinforces the importance of building sensitivity analyses to support the primary analysis of trial data, so as to assess the impact of different model assumptions on the results. Nonetheless, we cannot infer from these analyses which method is superior in the analysis of CRTs with binary outcomes. Further research based on simulation studies is required to provide better insights into the comparability of the methods in terms of statistical power for designing CRTs.
Notes
Acknowledgements
The CHAT trial was funded by a grant from the Canadian Institutes of Health Research (CIHR). Dr. Lehana Thabane is a clinical trials mentor for CIHR. We thank the reviewers for insightful comments that improved the presentation of the manuscript.
References
1. Campbell MK, Grimshaw JM: Cluster randomised trials: time for improvement. The implications of adopting a cluster design are still largely being ignored. BMJ. 1998, 317 (7167): 1171-1172.
2. Donner A, Klar N: Design and Analysis of Cluster Randomisation Trials in Health Research. 2000, London: Arnold.
3. Torgerson DJ: Diabetes education: Selection bias in cluster trial. BMJ. 2008, 336 (7644): 573. 10.1136/bmj.39514.402535.80.
4. The COMMIT Research Group: Community Intervention Trial for Smoking Cessation (COMMIT): I. cohort results from a four-year community intervention. Am J Public Health. 1995, 85 (2): 183-192. 10.2105/AJPH.85.2.183.
5. Donner A, Klar N: Pitfalls of and controversies in cluster randomization trials. Am J Public Health. 2004, 94 (3): 416-422. 10.2105/AJPH.94.3.416.
6. Cornfield J: Randomization by group: a formal analysis. Am J Epidemiol. 1978, 108 (2): 100-102.
7. Ukoumunne OC, Gulliford MC, Chinn S, Sterne JA, Burney PG: Methods for evaluating area-wide and organisation-based interventions in health and health care: a systematic review. Health Technol Assess. 1999, 3 (5): iii-92.
8. Community Hypertension Assessment Trial. [http://www.chapprogram.ca]
9. Campbell MK, Elbourne DR, Altman DG, CONSORT group: CONSORT statement: extension to cluster randomised trials. BMJ. 2004, 328 (7441): 702-708. 10.1136/bmj.328.7441.702.
10. Sung L, Hayden J, Greenberg ML, Koren G, Feldman BM, Tomlinson GA: Seven items were identified for inclusion when reporting a Bayesian analysis of a clinical study. J Clin Epidemiol. 2005, 58 (3): 261-268. 10.1016/j.jclinepi.2004.08.010.
11. Peters TJ, Richards SH, Bankhead CR, Ades AE, Sterne JA: Comparison of methods for analysing cluster randomized trials: an example involving a factorial design. Int J Epidemiol. 2003, 32 (5): 840-846. 10.1093/ije/dyg228.
12. Draper NR, Smith H: Applied regression analysis. 1998, New York: Wiley, 3rd edition.
13. Thompson SG, Sharp SJ: Explaining heterogeneity in meta-analysis: a comparison of methods. Stat Med. 1999, 18 (20): 2693-2708. 10.1002/(SICI)1097-0258(19991030)18:20<2693::AID-SIM235>3.0.CO;2-V.
14. Dobson AJ: An introduction to generalized linear models. 2002, Boca Raton: Chapman & Hall/CRC, 2nd edition.
15. Hosmer DW, Lemeshow S: Applied logistic regression. 2000, New York; Toronto: Wiley, 2nd edition.
16. Huber PJ: Robust statistics. 1981, New York: Wiley.
17. Long JS, Ervin LH: Using heteroscedastic consistent standard errors in the linear regression model. American Statistician. 2000, 54: 795-806. 10.2307/2685594.
18. White H: A heteroscedastic-consistent covariance matrix estimator and a direct test of heteroskedasticity. Econometrica. 1980, 48 (4): 817-838. 10.2307/1912934.
19. McCullagh P, Nelder J: Generalized linear models. 1989, London; New York: Chapman and Hall, 2nd edition.
20. Zeger SL, Liang KY, Albert PS: Models for longitudinal data: a generalized estimating equation approach. Biometrics. 1988, 44 (4): 1049-1060. 10.2307/2531734.
21. Prentice RL: Correlated binary regression with covariates specific to each binary observation. Biometrics. 1988, 44 (4): 1033-1048. 10.2307/2531733.
22. Mancl LA, DeRouen TA: A covariance estimator for GEE with improved small-sample properties. Biometrics. 2001, 57 (1): 126-134. 10.1111/j.0006-341X.2001.00126.x.
23. Ukoumunne OC, Carlin JB, Gulliford MC: A simulation study of odds ratio estimation for binary outcomes from cluster randomized trials. Stat Med. 2007, 26 (18): 3415-3428. 10.1002/sim.2769.
24. Wolfinger RD, O'Connell M: Generalized linear models: a pseudo-likelihood approach. Journal of Statistical Computation and Simulation. 1993, 48: 233-243. 10.1080/00949659308811554.
25. Omar RZ, Thompson SG: Analysis of a cluster randomized trial with binary outcome data using a multi-level model. Stat Med. 2000, 19 (19): 2675-2688. 10.1002/1097-0258(20001015)19:19<2675::AID-SIM556>3.0.CO;2-A.
26. Gelman A: Bayesian data analysis. 2004, Boca Raton, Fla.: Chapman & Hall/CRC, 2nd edition.
27. Spiegelhalter DJ: Bayesian methods for cluster randomized trials with continuous responses. Stat Med. 2001, 20 (3): 435-452. 10.1002/1097-0258(20010215)20:3<435::AID-SIM804>3.0.CO;2-E.
28. Turner RM, Omar RZ, Thompson SG: Bayesian methods of analysis for cluster randomized trials with binary outcome data. Stat Med. 2001, 20 (3): 453-472. 10.1002/1097-0258(20010215)20:3<453::AID-SIM803>3.0.CO;2-L.
29. Gelman A: Prior distributions for variance parameters in hierarchical models. Bayesian Analysis. 2006, 1 (3): 515-533.
30. Campbell MJ: Cluster randomized trials in general (family) practice research. Stat Methods Med Res. 2000, 9 (2): 81-94. 10.1191/096228000676246354.
31. Campbell MJ, Donner A, Klar N: Developments in cluster randomized trials and Statistics in Medicine. Stat Med. 2007, 26 (1): 2-19. 10.1002/sim.2731.
32. Neuhaus JM, Jewell NP: A geometric approach to assess bias due to omitted covariates in generalized linear models. Biometrika. 1993, 80 (4): 807-815. 10.1093/biomet/80.4.807.
33. FitzGerald PE, Knuiman MW: Use of conditional and marginal odds-ratios for analysing familial aggregation of binary data. Genet Epidemiol. 2000, 18 (3): 193-202. 10.1002/(SICI)1098-2272(200003)18:3<193::AID-GEPI1>3.0.CO;2-W.
34. Austin PC: A comparison of the statistical power of different methods for the analysis of cluster randomization trials with binary outcomes. Stat Med. 2007, 26 (19): 3550-3565. 10.1002/sim.2813.
35. Horton NJ, Lipsitz SR: Review of software to fit generalized estimating equation regression models. American Statistician. 1999, 53: 160-169. 10.2307/2685737.
36. Ballinger GA: Using Generalized Estimating Equations for Longitudinal Data Analysis. Organizational Research Methods. 2004, 7 (2): 127-150. 10.1177/1094428104263672.
Prepublication history
The prepublication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/9/37/prepub
Copyright information
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.