Cross-Platform Analysis with Binarized Gene Expression Data

  • Salih Tuna
  • Mahesan Niranjan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5780)


With the widespread use of microarray technology as a potential diagnostic tool, the comparison of results obtained on different platforms is of interest. Inference methods designed using data collected on one platform are unlikely to work directly on measurements taken from a different type of array. We report on this cross-platform transfer problem and show that working with transcriptome representations at binary numerical precision, similar to the gene expression bar code method, helps circumvent the variability across platforms in several cancer classification tasks. We compare our approach with a recent machine learning method designed specifically for shifting distributions, i.e., problems in which the training and testing data are not drawn from identical probability distributions, and show superior performance in three of the four problems in which a direct comparison was possible.
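To illustrate why a binary representation might reduce platform variability, here is a minimal sketch in Python. It is not the authors' procedure: the abstract does not say how expression values are binarized, so the per-gene median threshold, the function names `binarize` and `simulate_platforms`, and the simulated data are all assumptions made for illustration. The second "platform" is modeled as a gene-specific positive gain and offset applied to the first, which is far simpler than real cross-platform differences.

```python
import random
from statistics import median


def binarize(samples):
    """Binarize a samples-by-genes matrix (list of rows).

    A value becomes 1 if it exceeds that gene's median across the samples
    of the same platform, else 0. The median threshold is an assumption
    chosen for illustration, not a method stated in the abstract.
    """
    n_genes = len(samples[0])
    thresholds = [median(row[g] for row in samples) for g in range(n_genes)]
    return [
        [1 if row[g] > thresholds[g] else 0 for g in range(n_genes)]
        for row in samples
    ]


def simulate_platforms(n_samples=7, n_genes=4, seed=0):
    """Return two hypothetical measurements of the same samples.

    Platform B is platform A distorted by a gene-specific positive gain
    and an offset (an assumed, deliberately simple distortion model).
    """
    rng = random.Random(seed)
    platform_a = [
        [rng.uniform(0.0, 10.0) for _ in range(n_genes)]
        for _ in range(n_samples)
    ]
    gains = [rng.uniform(0.5, 3.0) for _ in range(n_genes)]
    offsets = [rng.uniform(-2.0, 2.0) for _ in range(n_genes)]
    platform_b = [
        [gains[g] * row[g] + offsets[g] for g in range(n_genes)]
        for row in platform_a
    ]
    return platform_a, platform_b


platform_a, platform_b = simulate_platforms()
bits_a = binarize(platform_a)
bits_b = binarize(platform_b)

# The raw values differ between platforms, but the binary profiles agree
# because a positive gain and an offset preserve the ordering within each gene.
print("raw values equal:   ", platform_a == platform_b)
print("binary values equal:", bits_a == bits_b)
```

Under this assumed distortion model, a classifier trained on the binary profiles from one platform would see inputs of the same form from the other. Real arrays also differ in probe design, noise, and nonlinear response, so the agreement would be only approximate in practice.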


Keywords: cross-platform analysis, binary gene expression, classification



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Salih Tuna (1)
  • Mahesan Niranjan (1)
  1. School of Electronics and Computer Science, ISIS Research Group, University of Southampton, UK
