Recent Research Projects by the FDA’s Pharmacology and Toxicology Statistics Team

Nonclinical Statistics for Pharmaceutical and Biotechnology Industries

Abstract

In addition to regular review work, the Pharmacology and Toxicology Statistics Team in CDER/FDA is actively engaged in a number of research projects. In this chapter we summarize some of our recent investigations and findings.

We conducted a simulation study (discussed in Sect. 12.2) to evaluate the increase in Type 2 error that results when some non-statistical scientists within the agency adopt decision criteria more stringent than those we have recommended for determining statistically significant carcinogenicity findings in long-term rodent bioassays. In many cases, the probability of a Type 2 error is inflated by a factor of 1.5 or more.

A second simulation study (Sect. 12.3) has found that both the Type 1 and Type 2 error rates are highly sensitive to experimental design. In particular, designs using a dual vehicle control group are more powerful than designs using the same number of animals but a single vehicle control group, but this increase in power comes at the expense of a greatly inflated Type 1 error rate.

Since the column totals of the tables of permutations of animals to treatment groups cannot be presumed to be fixed, the exact methods used in the Cochran-Armitage test are not applicable to the poly-k test for trend. Section 12.4 presents simple examples showing all possible permutations of animals, and procedures for computing the probabilities of the individual permutations to obtain the exact p-values. Section 12.5 builds on this by proposing an exact ratio poly-k test method using samples of possible permutations of animals. The proposed ratio poly-k test does not assume fixed column sums and uses the procedure in Bieler and Williams (Biometrics 49(3):793–801, 1993) to obtain the null variance estimate of the adjusted quantal tumor response estimate. Results of simulations show that the modified exact poly-3 method has similar sizes and levels of power compared to the method proposed in Mancuso et al. (Biometrics 58:403–412, 2002) that also uses samples of permutations but uses the binomial null variance estimate of the adjusted response rates and is based on the assumption of fixed column sums.
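
To make the role of the adjusted group sizes concrete, the following is a minimal sketch of the poly-k adjustment itself. The abstract does not spell out the weighting, so the sketch assumes the standard form associated with Bailer and Portier (1988): an animal that develops the tumor or survives to the end of the study counts fully, while a tumor-free animal dying at week t of a T-week study counts as (t/T)^k. The names `Animal`, `poly_k_weight`, and `adjusted_rate`, and the default of 104 study weeks, are illustrative assumptions and not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Animal:
    death_week: float  # week of death or sacrifice
    has_tumor: bool    # tumor of the type of interest found at necropsy


def poly_k_weight(animal: Animal, study_weeks: float = 104.0, k: float = 3.0) -> float:
    """Fractional contribution of one animal to its group's adjusted size."""
    if animal.has_tumor or animal.death_week >= study_weeks:
        return 1.0
    # Tumor-free early death: down-weight by the fraction of the study survived.
    return (animal.death_week / study_weeks) ** k


def adjusted_rate(group: List[Animal], study_weeks: float = 104.0, k: float = 3.0) -> float:
    """Poly-k adjusted tumor rate: tumor count divided by the adjusted group size."""
    n_star = sum(poly_k_weight(a, study_weeks, k) for a in group)
    tumors = sum(1 for a in group if a.has_tumor)
    return tumors / n_star if n_star > 0 else 0.0


# Hypothetical four-animal group, for illustration only.
example_group = [
    Animal(death_week=104, has_tumor=False),  # weight 1
    Animal(death_week=104, has_tumor=True),   # weight 1
    Animal(death_week=52, has_tumor=False),   # weight (52/104)^3 = 0.125
    Animal(death_week=78, has_tumor=True),    # weight 1 (tumor bearing)
]
example_rate = adjusted_rate(example_group)  # 2 / 3.125 = 0.64
```

Because each weight travels with the animal, reassigning animals to different treatment groups changes the adjusted group sizes, which is one way to see why the column totals cannot be treated as fixed under permutation.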

Bayesians attempt to model not only the statistical data-generating process, as in frequentist statistics, but also knowledge about the parameters governing that process. Section 12.6 includes a short review of possible reasons for adopting a Bayesian approach, and examples of survival and carcinogenicity analyses.

The article reflects the views of the authors and should not be construed to represent FDA’s views or policies.

Notes

  1. Although this tendency is not absolute. See the discussion in footnote 8 on page 37.

  2. The NTP partition divides the 104-week study into the following subintervals: 0–52 weeks, 53–78 weeks, 79–92 weeks, 93–104 weeks, and terminal sacrifice.

  3. Evaluating the conservativeness or anti-conservativeness of the joint test (the simultaneous combination of the trend test and the pairwise comparison test) under the agency practice is more complicated, because the two tests are not independent: the pairwise comparison test uses a subset (one half) of the data used in the trend test. Theoretically, if the trend test and the pairwise comparison test were independent and were conducted at the 0.005 and 0.01 levels of significance, respectively, for the effect on a common tumor type, then the nominal level of significance of the joint test would be 0.005 × 0.010 = 0.00005. Some of the attained Type I error levels of the joint tests are larger than 0.00005 because of the dependence between the two tests applied simultaneously. To evaluate this nominal rate directly would require estimation of the association between the two tests. Results of the simulation study, as expected, show that the attained levels of Type I error (1 − retention probability under the simulation conditions in which there is no drug effect on tumor prevalence) are smaller than those of the trend test alone.

  4. See, for instance, the discussion of osteosarcomas and osteomas in female mice in Center for Drug Evaluation and Research (2103).

  5. This calculation assumes that all endpoints are independent. This assumption is not strictly true, especially when considering combinations of endpoints. However, it is reasonable to assume that the endpoints are close enough to being independent that the resulting estimate of the GFPR is accurate enough for our purposes.

  6. In fact, if four independent experiments are to have a combined study-wise false positive rate of 10%, then it suffices for them individually to have GFPRs of \(1 - (1 - 0.1)^{1/4} = 0.026\). However, since it is not practical to calibrate the GFPR so precisely, there is no practical distinction between target GFPRs of 0.025 and 0.026.

  7. More sophisticated models might treat \(\mathcal{T}\) as higher dimensional. For example, in the simulation study in Sect. 12.2, the parameter space \(\mathcal{T}\) is two dimensional, with the two dimensions representing the background prevalence rate and the tumor onset time. (Although Eq. (12.1) has three independent parameters, not counting the dose response parameter D, the parameters A and B are not varied independently; see Table 12.3.)

  8. It is a familiar result for asymptotic tests that simply increasing the sample size improves power while maintaining the Type 1 error rate at the nominal level. (This principle also applies to rare event data, except that exact tests are frequently over-conservative, meaning that increases in the sample size can actually increase the Type 1 error rate even while keeping the rate below the nominal level, and that power can sometimes decrease as the sample size increases; see Chernick and Liu 2002.) However, the inclusion of large numbers of extra animals is an inelegant (and expensive) way to shift the ROC curve; we are interested in modifications to the experimental design that leave the overall number of animals unchanged.

  9. The observation that the trend test is strongly over-conservative for rare tumors is not at odds with the finding in Dinse (1985) that the trend test is not over-conservative for tumors with a background prevalence rate of 5% or 20%. As the expected number of tumors increases, one expects exact tests to converge to the asymptotic tests, and the LFPRs to converge to the nominal value of α.

  10. For this reason, the converse effect is of much less concern; misclassification of rare tumors as common is positively associated with large p-values, so the cases where misclassification occurs are unlikely to be significant at even the rare tumor thresholds.

  11. The actual nominal value of the joint test is hard to evaluate. See footnote 3 on page 27.

  12. Although it is to be hoped that in the longer term the use of the SEND data standard (Clinical Data Interchange Standards Consortium (CDISC) 2011) will enable the more efficient construction of large historical control databases.

  13. For computational reasons, these calculations use the lifetime tumor incidence rate rather than the background prevalence rate used elsewhere in this chapter.

  14. Insofar as we know what is typical. These spectra should only be taken to represent a range of plausible scenarios, and not assumed to be in any way definitive.

  15. Portier et al. (1986) recommend k = 3, although other values have been investigated (Gebregziabher and Hoel 2009; Moon et al. 2003). However, as noted in Gebregziabher and Hoel (2009), it appears that the tests are largely insensitive to the choice of k.

  16. For this reason it is often called a semiparametric model.

  17. It is interesting to note that this model implies that the odds of tumorigenesis are proportional to \(t_{j}^{\gamma _{i}}\), which (when the probability of tumorigenesis is low) is essentially equivalent to the poly-k assumption discussed elsewhere in this chapter.
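
The error-rate arithmetic in notes 3 and 6 can be checked with a few lines of code. The sketch below is only an illustration under the independence assumption that those notes state (and that note 5 describes as not strictly true); the function names are hypothetical and do not come from the chapter.

```python
from typing import Iterable


def per_experiment_rate(studywise_rate: float, n_experiments: int) -> float:
    """False positive rate each of n independent experiments may have so that
    the chance of at least one false positive equals studywise_rate (note 6)."""
    return 1.0 - (1.0 - studywise_rate) ** (1.0 / n_experiments)


def combined_rate(rates: Iterable[float]) -> float:
    """Probability of at least one false positive across independent tests."""
    p_none = 1.0
    for r in rates:
        p_none *= 1.0 - r
    return 1.0 - p_none


def joint_nominal_level(alpha_trend: float, alpha_pairwise: float) -> float:
    """Nominal level of a joint test that requires BOTH tests to reject,
    valid only if the two tests are independent (note 3)."""
    return alpha_trend * alpha_pairwise


gfpr = per_experiment_rate(0.10, 4)         # about 0.026
round_trip = combined_rate([gfpr] * 4)      # recovers 0.10
joint = joint_nominal_level(0.005, 0.010)   # 0.00005
```

Since the pairwise test reuses part of the data used by the trend test, the product in `joint_nominal_level` is best read as a reference value under independence rather than as the attained error rate of the joint test.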

References

  • Ahn H, Kodell R (1995) Estimation and testing of tumor incidence rates in experiments lacking cause-of-death data. Biom J 37:745–765

  • Armitage P (1955) Tests for linear trends in proportions and frequencies. Biometrics 11(3): 375–386

  • Bailer AJ, Portier CJ (1988) Effects of treatment-induced mortality and tumor-induced mortality on tests for carcinogenicity in small samples. Biometrics 44(2):417–431

  • Baldrick P, Reeve L (2007) Carcinogenicity evaluation: comparison of tumor data from dual control groups in the CD-1 mouse. Toxicol Pathol 35(4):562–575

  • Bergman CL, Adler RR, Morton DG, Regan KS, Yano BL (2003) Recommended tissue list for histopathologic examination in repeat-dose toxicity and carcinogenicity studies: a proposal of the Society of Toxicologic Pathology (STP). Toxicol Pathol 31(2):252–253

  • Bernardo JM (1979) Reference posterior distributions for Bayesian inference. J R Stat Soc Ser B Methodol 41(2):113–147

  • Bernardo JM, Smith AFM (1994) Bayesian statistics. Wiley, Chichester

  • Bieler GS, Williams RL (1993) Ratio estimates, the delta method, and quantal response tests for increased carcinogenicity. Biometrics 49(3):793–801

  • Center for Drug Evaluation and Research (2005) Reviewer guidance: conducting a clinical safety review of a new product application and preparing a report on the review. United States Food and Drug Administration

  • Center for Drug Evaluation and Research (2103) Pharmacology review: NDA 205437 (Otezla). Technical report, US Food and Drug Administration. http://www.accessdata.fda.gov/drugsatfda_docs/nda/2014/205437Orig1s000PharmR.pdf

  • Chang J, Ahn H, Chen J (2000) On sequential closed testing procedures for a comparison of dose groups with a control. Commun Stat Theory Methods 29:941–956

  • Chernick MR, Liu CY (2002) The saw-toothed behavior of power versus sample size and software solutions. Am Stat 56(2):149–155

  • Clinical Data Interchange Standards Consortium (CDISC) (2011) Standard for exchange of nonclinical data implementation guide: nonclinical studies version 3.0

  • De Iorio M, Johnson WO, Müller P, Rosner GL (2009) Bayesian nonparametric nonproportional hazards survival modeling. Biometrics 65(3):762–771. doi:10.1111/j.1541-0420.2008.01166.x

  • Dinse GE (1985) Testing for a trend in tumor prevalence rates: I. nonlethal tumors. Biometrics 41(3):751

  • Fairweather WR, Bhattacharyya A, Ceuppens PR, Heimann G, Hothorn LA, Kodell RL, Lin KK, Mager H, Middleton BJ, Slob W, Soper KA, Stallard N, Venture J, Wright J (1998) Biostatistical methodology in carcinogenicity studies. Drug Inf J 32:401–421

  • Gebregziabher M, Hoel D (2009) Applications of the poly-k statistical test to life-time cancer bioassay studies. Hum Ecol Risk Assess 15(5):858–875

  • Giknis MLA, Clifford CB (2004) Compilation of spontaneous neoplastic lesions and survival in Crl:CD® rats from control groups. Charles River Laboratories, Worcester

  • Giknis MLA, Clifford CB (2005) Spontaneous neoplastic lesions in the Crl:CD-1(ICR) mouse in control groups from 18 month and 2 year studies. Charles River Laboratories, Worcester

  • Haseman J (1983) A reexamination of false-positive rates for carcinogenesis studies. Fundam Appl Toxicol 3(4):334–343

  • Haseman J (1984) Statistical issues in the design, analysis and interpretation of animal carcinogenicity studies. Environ Health Perspect 58:385–392

  • Haseman J, Winbush J, O’Donnel M (1986) Use of dual control groups to estimate false positive rates in laboratory animal carcinogenicity studies. Fundam Appl Toxicol 7:573–584

  • Haseman JK, Hailey JR, Morris RW (1998) Spontaneous neoplasm incidences in Fischer 344 rats and B6C3F1 mice in two-year carcinogenicity studies: a National Toxicology Program update. Toxicol Pathol 26(3):428–441

  • Heimann G, Neuhaus G (1998) Permutational distribution of the log-rank statistic under random censorship with applications to carcinogenicity assays. Biometrics 54:168–184

  • Jackson MT (2015) Improving the power of long term rodent bioassays by adjusting the experimental design. Regul Toxicol Pharmacol 72(2):231–342. http://dx.doi.org/10.1016/j.yrtph.2015.04.011

  • Jara A, Hanson T, Quintana F, Mueller P, Rosner G (2014) Package DPpackage. http://www.mat.puc.cl/~ajara

  • Kodell R, Ahn H (1997) An age-adjusted trend test for the tumor incidence rate for multiple-sacrifice experiments. Biometrics 53:1467–1474

  • Kodell R, Chen J, Moore G (1994) Comparing distributions of time to onset of disease in animal tumorigenicity experiments. Commun Stat Theory Methods 23:959–980

  • Lin KK (1995) A regulatory perspective on statistical methods for analyzing new drug carcinogenicity study data. Bio/Pharm Q 1(2):19–20

  • Lin KK (1997) Control of overall false positive rates in animal carcinogenicity studies of pharmaceuticals. Presentation, 1997 FDA Forum on Regulatory Science, Bethesda MD

  • Lin KK (1998) CDER/FDA formats for submission of animal carcinogenicity study data. Drug Inf J 32:43–52

  • Lin KK (2000a) Carcinogenicity studies of pharmaceuticals. In: Chow SC (ed) Encyclopedia of biopharmaceutical statistics, 3rd edn. CRC Press, Boca Raton, pp 88–103

  • Lin KK (2000b) Progress report on the guidance for industry for statistical aspects of the design, analysis, and interpretation of chronic rodent carcinogenicity studies of pharmaceuticals. J Biopharm Stat 10(4):481–501

  • Lin KK, Ali MW (1994) Statistical review and evaluation of animal carcinogenicity studies of pharmaceuticals. In: Buncher CR, Tsay JY (eds) Statistics in the pharmaceutical industry, 2nd edn. Marcel Dekker, New York

  • Lin KK, Ali MW (2006) Statistical review and evaluation of animal carcinogenicity studies of pharmaceuticals. In: Buncher CR, Tsay JY (eds) Statistics in the pharmaceutical industry, 3rd edn. Chapman & Hall, Boca Raton, pp 17–54

  • Lin KK, Rahman MA (1998) Overall false positive rates in tests for linear trend in tumor incidence in animal carcinogenicity studies of new drugs. J Biopharm Stat 8(1):1–15

  • Lin KK, Thomson SF, Rahman MA (2010) The design and statistical analysis of toxicology studies. In: Jagadeesh G, Murthy S, Gupta Y, Prakash A (eds) Biomedical research: from ideation to publications, 1st edn. Wolters Kluwer, New Delhi

  • Lunn D, Thomas A, Best N, Spiegelhalter D (2000) WinBUGS - a Bayesian modelling framework: concepts, structure, and extensibility. Stat Comput 10:325–337

  • Mancuso J, Ahn H, Chen J, Mancuso J (2002) Age-adjusted exact trend tests in the event of rare occurrences. Biometrics 58:403–412

  • Moon H, Ahn H, Kodell RL, Lee JJ (2003) Estimation of k for the poly-k test with application to animal carcinogenicity studies. Stat Med 22(16):2619–2636

  • Peto R, Pike MC, Day NE, Gray RG, Lee PN, Parish S, Peto J, Richards S, Wahrendorf J (1980) Guidelines for simple, sensitive significance tests for carcinogenic effects in long-term animal experiments. IARC Monogr Eval Carcinog Risk Chem Hum Suppl NIL (2 Suppl):311–426

  • Portier C, Hoel D (1983) Optimal design of the chronic animal bioassay. J Toxicol Environ Health 12(1):1–19

  • Portier C, Hedges J, Hoel D (1986) Age-specific models of mortality and tumor onset for historical control animals in the National Toxicology Program's carcinogenicity experiments. Cancer Res 46:4372–4378

  • R Core Team (2012) R: A language and environment for statistical computing. http://www.R-project.org/

  • Rahman MA, Lin KK (2008) A comparison of false positive rates of Peto and poly-3 methods for long-term carcinogenicity data analysis using multiple comparison adjustment method suggested by Lin and Rahman. J Biopharm Stat 18(5):949–958

  • Rahman MA, Lin KK (2009) Design and analysis of chronic carcinogenicity studies of pharmaceuticals in rodents. In: Peace KE (ed) Design and analysis of clinical trials with time-to-event endpoints. Chapman & Hall/CRC Biostatistics series. Taylor & Francis, Boca Raton

  • Rahman MA, Lin KK (2010) Statistics in pharmacology. In: Jagadeesh G, Murthy S, Gupta Y, Prakash A (eds) Biomedical research: from ideation to publications, 1st edn. Wolters Kluwer, New Delhi

  • Rahman MA, Tiwari RC (2012) Pairwise comparisons in the analysis of carcinogenicity data. Health 4:910–918

  • US Food and Drug Administration—Center for Drug Evaluation and Research (2001) Guidance for industry: statistical aspects of the design, analysis, and interpretation of chronic rodent carcinogenicity studies of pharmaceuticals. US Department of Health and Human Services, unfinalized – draft only

  • Westfall P, Soper K (1998) Weighted multiplicity adjustments for animal carcinogenicity tests. J Biopharm Stat 8(1):23–44

Acknowledgements

The authors would like to acknowledge the support of Yi Tsong, the director of Division of Biometrics 6, in FDA/CDER/OTS/OB, while researching the work contained in this chapter.

Author information

Corresponding author

Correspondence to Karl K. Lin.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Lin, K.K., Jackson, M.T., Min, M., Rahman, M.A., Thomson, S.F. (2016). Recent Research Projects by the FDA’s Pharmacology and Toxicology Statistics Team. In: Zhang, L. (ed) Nonclinical Statistics for Pharmaceutical and Biotechnology Industries. Statistics for Biology and Health. Springer, Cham. https://doi.org/10.1007/978-3-319-23558-5_12
