Stepped Linear Regression to Accurately Assess Statistical Significance in Batch Confounded Differential Expression Analysis

  • Juntao Li
  • Jianhua Liu
  • R. Krishna Murthy Karuturi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4983)


Batch effects in microarray experiments may lead to a systematic shift in expression measurements from one batch to another. This poses a great challenge when batches are confounded with the biological groups of interest, especially in estimating statistical significance, i.e. the false discovery rate (FDR). Even widely used, well-tailored methods such as SAM are not immune to the effects of batch confounding. We propose a stepped linear regression (SLR) method, in the context of SAM, that re-estimates the expected statistics and the FDR in two-class analysis in order to nullify batch effects and identify the genes that are genuinely significant. SLR is equally applicable to other similar methods and to multi-group differential expression analysis.
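The problem described above can be illustrated with a small simulation. The sketch below is not the SLR method, whose steps are not given in this abstract. It only shows, under stated assumptions, how a batch shift that is fully confounded with the class labels inflates a SAM-style relative-difference statistic for genes that have no biological effect. The sample layout, the batch shift distribution, the fudge constant `s0`, and all function names are assumptions made for illustration.

```python
import random

GROUPS = [0, 0, 0, 0, 1, 1, 1, 1]  # two biological classes, four samples each
BATCHES = GROUPS                   # assumption: batch is fully confounded with class


def relative_difference(x, labels, s0=0.1):
    """SAM-style statistic: mean difference over (pooled standard error + s0)."""
    a = [v for v, g in zip(x, labels) if g == 0]
    b = [v for v, g in zip(x, labels) if g == 1]
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    se = ((1 / na + 1 / nb) * ss / (na + nb - 2)) ** 0.5
    return (mb - ma) / (se + s0)


def simulate_null_stats(n_genes, batch_sd, seed=0):
    """Statistics for genes with no biological effect, only noise plus a batch shift."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_genes):
        shift = rng.gauss(0.0, batch_sd)  # hypothetical gene-specific batch shift
        x = [rng.gauss(0.0, 1.0) + shift * b for b in BATCHES]
        stats.append(relative_difference(x, GROUPS))
    return stats


def mean_abs(values):
    return sum(abs(v) for v in values) / len(values)


clean = mean_abs(simulate_null_stats(500, batch_sd=0.0))
confounded = mean_abs(simulate_null_stats(500, batch_sd=1.0))
print(f"mean |d| without batch effect:         {clean:.2f}")
print(f"mean |d| with confounded batch effect: {confounded:.2f}")
```

Because every simulated gene is null, any increase in the average magnitude of the statistic in the second run comes from the batch shift alone. This is the kind of distortion that would mislead a significance estimate if it were read as a class difference.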


Keywords: Differential expression, SAM, Microarray, Batch effect





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Juntao Li (1)
  • Jianhua Liu (2)
  • R. Krishna Murthy Karuturi (1)
  1. Computational and Mathematical Biology
  2. Systems Biology, Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), Republic of Singapore
