Multi-gene Expression-based Statistical Approaches to Predicting Patients’ Clinical Outcomes and Responses

  • Feng Cheng
  • Sang-Hoon Cho
  • Jae K. Lee
Part of the Methods in Molecular Biology book series (MIMB, volume 620)


Gene expression profiling techniques now enable scientists to obtain a genome-wide picture of cellular functions in various human disease mechanisms, and they have also proven extremely valuable in forecasting patients’ prognosis and therapeutic responses. A wide range of multivariate techniques has been applied to such expression profiling data in biomedical applications, both to identify expression biomarkers that are highly associated with patients’ clinical outcomes and to train multi-gene prediction models that can forecast various human disease outcomes and drug toxicities. We provide here a brief overview of some of these approaches, succinctly summarizing relevant basic concepts, statistical algorithms, and several practical applications. We also introduce our recent in vitro molecular expression-based algorithm, the so-called COXEN technique, which uses specialized gene profile signatures as a Rosetta Stone for translating information between two different biological systems or populations.

Key words

Multivariate analysis, gene expression profiling, COXEN, classification, toxicogenomics



This work was supported in part by National Institutes of Health grant R01HL081690 to JKL.



Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Feng Cheng (1)
  • Sang-Hoon Cho (2)
  • Jae K. Lee (2)
  1. Department of Biophysics, University of Virginia, Charlottesville, USA
  2. Department of Public Health Sciences, University of Virginia, Charlottesville, USA
