Employing Gene Set Top Scoring Pairs to Identify Deregulated Pathway-Signatures in Dilated Cardiomyopathy from Integrated Microarray Gene Expression Data

  • Aik Choon Tan
Part of the Methods in Molecular Biology book series (MIMB, volume 802)


It is well accepted that a set of genes must act in concert to drive various cellular processes. However, under different biological phenotypes, not all members of a gene set will participate in a biological process. Hence, it is useful to construct a discriminative classifier by focusing on the core members (a subset) of a highly informative gene set. Such analyses can reveal which subsets of the same gene set correspond to different biological phenotypes. In this study, we propose the Gene Set Top Scoring Pairs (GSTSP) approach, which exploits the simple yet powerful relative expression reversal concept at the gene set level to achieve these goals. To illustrate the usefulness of GSTSP, we applied this method to five different human heart failure gene expression data sets. We take advantage of the direct data integration feature of the GSTSP approach to combine two data sets, identify a discriminative gene set from >190 predefined gene sets, and evaluate the predictive power of the GSTSP classifier derived from this informative gene set on three independent test sets (79.31% test accuracy). The discriminative gene pairs identified in this study may provide new biological understanding of the disturbed pathways that are involved in the development of heart failure. The GSTSP methodology is general in purpose and is applicable to a variety of phenotypic classification problems using gene expression data.
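The relative expression reversal idea can be illustrated with a minimal sketch. It assumes the usual top scoring pair formulation from the relative expression literature: for genes i and j, the score is the absolute difference between the two classes in the frequency with which gene i is expressed below gene j, with candidate pairs restricted to one gene set. The function names (`pair_scores`, `top_pair`, `predict`) and the toy data are hypothetical, and the chapter's actual GSTSP procedure may differ in its details.

```python
from itertools import combinations

import numpy as np


def pair_scores(expr, labels, gene_set):
    """Score every gene pair (i, j) drawn from one gene set.

    expr     : array of shape (n_genes, n_samples)
    labels   : array of 0/1 class labels, one per sample
    gene_set : row indices of the genes in the set (hypothetical input)
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    scores = {}
    for i, j in combinations(gene_set, 2):
        less = expr[i] < expr[j]            # ordering of the pair in each sample
        p0 = less[labels == 0].mean()       # frequency of "i below j" in class 0
        p1 = less[labels == 1].mean()       # frequency of "i below j" in class 1
        scores[(i, j)] = abs(p0 - p1)       # large when the ordering reverses
    return scores


def top_pair(expr, labels, gene_set):
    """Return the highest-scoring pair within the gene set."""
    scores = pair_scores(expr, labels, gene_set)
    return max(scores, key=scores.get)


def predict(expr, labels, pair, sample):
    """Classify a new sample from the ordering of a single gene pair."""
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    i, j = pair
    less = expr[i] < expr[j]
    p0 = less[labels == 0].mean()
    p1 = less[labels == 1].mean()
    observed = sample[i] < sample[j]
    # Vote for the class in which the observed ordering is more frequent.
    if p1 > p0:
        return 1 if observed else 0
    return 0 if observed else 1


# Toy data (made up for illustration): 4 genes, 3 samples per class.
expr = np.array([
    [1, 2, 1, 5, 6, 5],   # gene 0: low in class 0, high in class 1
    [4, 5, 4, 2, 1, 2],   # gene 1: high in class 0, low in class 1
    [3, 1, 3, 3, 0, 3],   # gene 2: weakly informative
    [9, 0, 9, 0, 9, 0],   # gene 3: not in the gene set
])
labels = np.array([0, 0, 0, 1, 1, 1])
gene_set = [0, 1, 2]

scores = pair_scores(expr, labels, gene_set)
best = top_pair(expr, labels, gene_set)
```

Because the rule depends only on the ordering of two genes within a sample, it is unaffected by monotone per-sample transformations of the expression values, which is what makes directly pooling data sets from different studies plausible.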

Key words

Gene set analysis; Top scoring pairs; Relative expression classifier; Microarray; Gene expression



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Division of Medical Oncology, Department of Medicine, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, USA
