Clustering: A Novel Meta-Analysis Approach for Differentially Expressed Gene Detection

  • Agaz Hussain Wani
  • H. L. Shashirekha
Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 14)


Analysis of gene expression data obtained from microarray experiments serves various biological purposes, such as identifying differentially expressed genes, classifying diseases, and predicting patient survival rates. However, microarray experiments typically have small sample sizes, which limits the statistical power of any single-study analysis. To overcome this problem, researchers increasingly rely on meta-analysis, an integrated analysis of existing data from different but related independent studies. Microarray data reveal that genes are typically expressed in functionally related patterns, which suggests clustering as an alternative technique for grouping genes into relatively homogeneous clusters, such as differentially expressed and non-differentially expressed. In this paper, we explore the k-means clustering technique for meta-analysis of gene expression data to find differentially expressed genes. We perform a comparative analysis of k-means clustering and validate its results against various statistical meta-analysis techniques; the results show clustering to be a robust alternative technique for meta-analysis of gene expression data.
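The idea of clustering genes into differentially expressed and non-differentially expressed groups across studies can be illustrated with a minimal sketch. The abstract does not state which features are clustered, so the choices below are assumptions made only for illustration: each gene is represented by its absolute Welch t-statistic (case versus control) in each study, k is fixed at 2, and the cluster with the larger centroid is read as the differentially expressed one. The function names (welch_t, two_means, de_gene_indices) and the toy data are hypothetical and are not taken from the paper.

```python
import statistics


def welch_t(case, control):
    """Welch t-statistic for one gene in one study (case vs. control)."""
    se2 = (statistics.variance(case) / len(case)
           + statistics.variance(control) / len(control))
    return (statistics.mean(case) - statistics.mean(control)) / se2 ** 0.5


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(points, iters=100):
    """Lloyd-style k-means with k = 2; returns (labels, centroids)."""
    # Assumption: deterministic start from the points with the smallest
    # and largest coordinate sums, instead of a random initialisation.
    centroids = [list(min(points, key=sum)), list(max(points, key=sum))]
    labels = None
    for _ in range(iters):
        new_labels = [
            min((0, 1), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [statistics.mean(col) for col in zip(*members)]
    return labels, centroids


def de_gene_indices(studies):
    """Indices of genes falling in the higher-effect cluster.

    `studies` is a list of studies; each study is a list with one
    (case_values, control_values) pair per gene, in the same gene order.
    """
    n_genes = len(studies[0])
    # One feature vector per gene: |t| in each study (illustrative choice).
    features = [[abs(welch_t(*study[g])) for study in studies]
                for g in range(n_genes)]
    labels, centroids = two_means(features)
    de_cluster = max((0, 1), key=lambda c: sum(centroids[c]))
    return [g for g, lab in enumerate(labels) if lab == de_cluster]


# Toy data: two studies, four genes, three case and three control samples.
study_a = [([5.0, 6.0, 7.0], [1.0, 2.0, 3.0]),
           ([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]),
           ([2.0, 3.0, 4.0], [2.0, 3.0, 5.0]),
           ([1.0, 3.0, 2.0], [2.0, 1.0, 3.0])]
study_b = [([6.0, 7.0, 9.0], [2.0, 2.0, 3.0]),
           ([2.0, 2.0, 3.0], [1.0, 3.0, 2.0]),
           ([3.0, 4.0, 4.0], [3.0, 5.0, 4.0]),
           ([2.0, 1.0, 2.0], [1.0, 2.0, 2.0])]

print(de_gene_indices([study_a, study_b]))
```

In this toy input only the first gene shows a large case-versus-control shift in both studies, so it is the one that separates from the rest. On real data the feature choice, the value of k, and the initialisation would all need to be justified and checked against established meta-analysis statistics.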


Meta-analysis, Gene expression, Microarray analysis, Clustering



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Department of Computer Science, Mangalore University, Mangalore, India
