Algorithmic Approach for Removing the Redundancy in Diabetic Gene Categories Based on Semantic Similarity and Gene Expression Data

  • Atul Kumar
  • D. Jeya Sundara Sharmila
Original Research Article


Despite considerable advances in gene expression microarray technology, the main hindrance in analyzing microarray data remains the small number of samples relative to the number of factors measured, which makes it difficult to reveal actual gene functionality and to extract valuable information from the data. Analyzing gene expression data can indicate the factors that are differentially expressed in diseased tissue. Most of these genes play no part in causing the disease of interest, so identifying the disease-causing genes can reveal not just the cause of the disease but also its pathogenic mechanism. Many gene selection methods can remove irrelevant genes, but most do not adequately remove redundant genes from microarray data, and this redundancy increases the computational cost and decreases the classification accuracy. Combining gene expression data with gene ontology information can help determine this redundancy, which can then be removed using the algorithm presented in this work. The gene list obtained after the sequential steps of the algorithm can be analyzed further to obtain the most deterministic genes responsible for type 2 diabetes.
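The abstract does not spell out the individual steps of the algorithm, so the following is only a minimal sketch of one way a greedy redundancy filter could combine the two signals it names: Pearson correlation of expression profiles and gene ontology semantic similarity. The function name `greedy_redundancy_filter`, the `relevance` scores, both thresholds, and the assumption that a pairwise semantic similarity matrix has already been computed are all hypothetical and are not taken from the paper.

```python
import numpy as np


def greedy_redundancy_filter(expr, sem_sim, relevance,
                             corr_thr=0.8, sim_thr=0.8):
    """Return indices of genes kept after a greedy redundancy pass.

    expr      : array (genes x samples) of expression values
    sem_sim   : array (genes x genes) of precomputed GO semantic
                similarity in [0, 1] (assumed input, hypothetical)
    relevance : array (genes,) of per-gene relevance scores
                (hypothetical, e.g. from a prior gene selection step)
    """
    expr = np.asarray(expr, dtype=float)
    sem_sim = np.asarray(sem_sim, dtype=float)
    relevance = np.asarray(relevance, dtype=float)

    # Absolute Pearson correlation between gene rows.
    # Note: a gene with constant expression yields NaN here.
    corr = np.abs(np.corrcoef(expr))

    # Visit genes from most to least relevant.
    order = np.argsort(-relevance)

    kept = []
    for g in order:
        # A gene counts as redundant only if it is BOTH strongly
        # co-expressed with and semantically similar to a kept gene.
        redundant = any(
            corr[g, k] >= corr_thr and sem_sim[g, k] >= sim_thr
            for k in kept
        )
        if not redundant:
            kept.append(int(g))
    return kept
```

Requiring both conditions is a design choice made for this sketch: two genes with correlated expression but unrelated ontology annotations are kept, on the assumption that they may carry different functional information.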


Microarray technology, Gene expression, Diabetes, Greedy algorithm, Gene ontology, Semantic similarity, Pearson correlation, GEO database



Copyright information

© International Association of Scientists in the Interdisciplinary Areas and Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Department of Bioinformatics, Karunya University, Coimbatore, India
  2. Department of Nanosciences and Technology, Tamil Nadu Agriculture University, Coimbatore, India
