In their recent Letter to the Editor, Cohen et al. (2016) expressed concerns about our Bayesian reanalysis of the Waalkes et al. (2014) data on arsenic-induced lung tumors in CD1 mice. Our independent analysis of the Waalkes et al. (2014) and Tokar et al. (2011, 2012) data showed that low-dose inorganic arsenic exposures increase lung tumor incidences in CD1 male mice. Our conclusions were consistent with the findings of Waalkes et al. (2014), which have subsequently been called into question by Cohen et al. (2014, 2015). Cohen et al. (2016) raised several concerns and sought clarification of our analyses; however, none of their concerns is sufficient to refute our conclusion that arsenic-related lung tumors in CD1 mice result from the doses administered in the Waalkes study. A number of the concerns raised by Cohen et al. were already discussed in the analysis report (Burgoon and Druwe 2016; https://github.com/DataSciBurgoon/arsenic_mouse_lung_tumor_reanalysis) that we published alongside our original letter (Druwe and Burgoon 2016). Nevertheless, in the following we offer rebuttals to each of their three concerns.

Cohen et al.’s first concern was that we did not include the mathematical formulas used or the rationale for their selection. The formulas and rationale were published, along with the raw data needed to repeat the analysis, as a Jupyter notebook, which we referenced in the first letter and again above. A Jupyter notebook is a web-based document that contains both computer code (e.g., code run in R) and rich text elements such as figures and equations. It is human-readable, and the code it contains can be executed in place. In this way, we have made all of our analyses and data completely transparent and accessible in our analysis report.

Cohen et al.’s second concern was that we did not provide a rationale for using the Bernoulli distribution. The Bernoulli distribution is described in the statistical literature as the standard distribution for binary outcomes, such as success/failure, heads/tails, or cancer/no cancer (Freedman et al. 2007; Kruske 2011). The Waalkes and Tokar data are a classic binary case of tumor/no tumor, which makes the Bernoulli distribution the simplest choice, with the fewest parameters.
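To make the tumor/no tumor framing concrete, the following is a minimal Python sketch of a Bernoulli likelihood and its conjugate Beta update for a single dose group. It is illustrative only: the outcome list is hypothetical (not data from Waalkes or Tokar), the uniform Beta(1, 1) prior is an assumption rather than the prior used in our report, and the function names are ours.

```python
from math import log


def bernoulli_log_likelihood(outcomes, theta):
    """Log-likelihood of binary outcomes (1 = tumor, 0 = no tumor).

    theta is the single parameter of the model: the tumor incidence.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    return sum(log(theta) if y == 1 else log(1.0 - theta) for y in outcomes)


def beta_posterior(outcomes, prior_a=1.0, prior_b=1.0):
    """Conjugate update: Beta(a, b) prior -> Beta(a + tumors, b + non-tumors)."""
    tumors = sum(outcomes)
    return prior_a + tumors, prior_b + len(outcomes) - tumors


# HYPOTHETICAL group of 10 animals, 3 with a lung tumor (illustration only).
outcomes = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

post_a, post_b = beta_posterior(outcomes)   # assumed uniform Beta(1, 1) prior
posterior_mean = post_a / (post_a + post_b)

print(f"posterior: Beta({post_a:g}, {post_b:g}), mean incidence {posterior_mean:.3f}")
```

The point of the sketch is that each animal contributes one binary observation and the model has one unknown, the incidence, which is why no simpler likelihood is available for data of this kind.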

Cohen et al.’s third concern was that our region of practical equivalence (ROPE) was arbitrary. We provided a rationale for the ROPE analysis, which compares the ROPE with the 95 % highest density interval (HDI) of the posterior, in the report referenced in the previous letter and again above. We stated, “Expanding the ROPE slightly, such as to ±0.10, still would not bring the 95 % HDI within the ROPE. We do not believe that a ±10 % rope is justified, when dealing with tumor incidences in the control population that are approximately 28 %. Specifically, we do not believe it is biologically plausible that a tumor incidence that spans from 18 to 38 % can be considered practically equivalent, whereas one that spans from 23 to 33 % is.”
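The logic of comparing an HDI with a ROPE can be sketched in a few lines of Python. This is not the code from our report (which is in the notebook cited above); the Beta parameters below are hypothetical stand-ins for the posterior incidences in a control and a treated group, and the helper names are assumptions made for illustration.

```python
import math
import random


def highest_density_interval(samples, mass=0.95):
    """Narrowest interval that contains `mass` of the posterior samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = min(n, math.ceil(mass * n))  # number of samples inside the interval
    best_start = min(
        range(n - window + 1),
        key=lambda start: ordered[start + window - 1] - ordered[start],
    )
    return ordered[best_start], ordered[best_start + window - 1]


def rope_decision(interval, rope):
    """Classify an HDI relative to a ROPE on the difference in incidence."""
    low, high = interval
    rope_low, rope_high = rope
    if rope_low <= low and high <= rope_high:
        return "inside"    # practically equivalent to no difference
    if high < rope_low or low > rope_high:
        return "outside"   # credibly different from no difference
    return "overlaps"      # undecided


rng = random.Random(0)  # fixed seed so the sketch is repeatable
N_DRAWS = 20000

# HYPOTHETICAL posteriors for tumor incidence (illustration only, not study data).
control = [rng.betavariate(15, 37) for _ in range(N_DRAWS)]
treated = [rng.betavariate(29, 23) for _ in range(N_DRAWS)]
differences = [t - c for t, c in zip(treated, control)]

interval = highest_density_interval(differences)
decision_narrow = rope_decision(interval, (-0.05, 0.05))
decision_wide = rope_decision(interval, (-0.10, 0.10))

print(f"95% HDI of difference: ({interval[0]:.3f}, {interval[1]:.3f})")
print(f"ROPE +/-0.05: {decision_narrow}; ROPE +/-0.10: {decision_wide}")
```

Widening the ROPE changes only the second argument to `rope_decision`, which is why the choice of ROPE width can be examined separately from the posterior itself.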

We also wish to address Cohen et al.’s contention that the Haseman criteria should apply because a common tumor requires a more stringent statistical approach (p < 0.01 vs. p < 0.05). In the classical sense, this means a smaller alpha value (accepting a smaller false-positive rate). In the Bayesian ROPE construct, “more stringent” translates into a larger between-group mean difference; this is equivalent to requiring a larger numerator in a t test. With respect to the stringency of the Bayesian ROPE method we used, we tested two null hypotheses, and in the second of these constructs we observed a mean difference of around 26 %, or an odds ratio of 3.12 (Fig. 20 in the report cited above). Speaking as toxicologists, we find it hard to believe that anyone would be required to be more stringent than demonstrating an odds ratio of 3.12 or a mean difference of around 26 %. Such a requirement is simply not scientifically reasonable, practical, or prudent.
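For readers less familiar with how an odds ratio relates to a difference in incidence, the short Python sketch below converts between the two. The 28 % baseline is the approximate control incidence quoted above and is used here only as an illustrative point value. Because the figures in our report are summaries of posterior distributions, this point calculation is not expected to reproduce them exactly. The function names are ours.

```python
def odds(p):
    """Odds corresponding to a probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_treated, p_control):
    """Odds of a tumor in the treated group relative to the control group."""
    return odds(p_treated) / odds(p_control)


def treated_incidence_from_odds_ratio(p_control, ratio):
    """Incidence in the treated group implied by a control incidence and an odds ratio."""
    treated_odds = ratio * odds(p_control)
    return treated_odds / (1.0 + treated_odds)


# Illustrative point values only: ~28 % control incidence, odds ratio of 3.12.
p_control = 0.28
p_treated = treated_incidence_from_odds_ratio(p_control, 3.12)
difference = p_treated - p_control

print(f"implied treated incidence: {p_treated:.3f}")
print(f"implied difference in incidence: {difference:.3f}")
```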

Lastly, Cohen et al. stated that we did not use “historical data” or consider the Charles River CD1 mouse data in our analysis. These data were considered, and it was clear to us that they are not exchangeable, and thus not usable, for a number of reasons (e.g., studies were performed in both the USA and Europe, in contract as well as industrial laboratories, under varying environmental conditions). By contrast, the Tokar and Waalkes et al. studies were performed under the same general conditions. Our determination of historical controls is in line with current historical control protocols at EPA and with best practice protocols used by the wider scientific community (BMD technical guidance 2012; EPA Cancer Guidelines 2005; Keenan et al. 2009).

Our detailed analysis independently addressed each issue raised by Cohen et al. We allowed the data to speak for themselves, in an unbiased way. We firmly stand by our independent conclusion, based on the published experimental data of Waalkes et al. (2014) and Tokar et al. (2011, 2012), that low-dose inorganic arsenic exposures increase lung tumor incidences in CD1 male mice.