Abstract
Methylation-based analysis is an active research topic. Researchers generally work with 5-methylcytosine (5-mC) samples, since 5-mC is the most stable methylated cytosine variant and its impact on various diseases is well established. Recent studies, however, have observed that other cytosine variants (e.g., 5-hmC) also have a high impact on those diseases. In this chapter, we first describe these cytosine variants. In the second part, we describe a framework for identifying co-methylated gene modules in a methylation profile containing multiple cytosine variants (viz., 5-hmC and 5-mC samples). We first determine significant genes using a statistical method. The weighted topological overlap matrix (weighted TOM) measure and the average linkage method are then applied consecutively to the resulting significant genes. Next, the dynamic tree cut method with color thresholding is used to identify co-methylated gene modules. The resulting gene modules are validated biologically through KEGG pathway and gene ontology analyses. Moreover, the regulatory transcription factors (TFs) and targeting miRNAs connected with the genes in the different modules are identified, and further biological validation is carried out on them. Finally, other popular module-based and correlation-based computational methodologies and their applications are briefly reviewed.
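The clustering step named in the abstract (weighted TOM followed by average linkage) can be illustrated with a minimal Python sketch. It is not the chapter's implementation. The function names `weighted_tom` and `find_modules`, the Pearson-correlation adjacency, and the soft-threshold power `beta=6` are assumptions made for illustration. SciPy has no dynamic tree cut, so the sketch cuts the tree into a fixed number of clusters instead, which is a simpler substitute for the method the chapter uses.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def weighted_tom(data, beta=6):
    """Weighted topological overlap for a (genes x samples) array.

    Assumption: adjacency is |Pearson correlation| ** beta (unsigned network).
    """
    adjacency = np.abs(np.corrcoef(data)) ** beta
    np.fill_diagonal(adjacency, 0.0)
    connectivity = adjacency.sum(axis=1)      # k_i = sum_u a_iu
    shared = adjacency @ adjacency            # l_ij = sum_u a_iu * a_uj
    min_k = np.minimum.outer(connectivity, connectivity)
    tom = (shared + adjacency) / (min_k + 1.0 - adjacency)
    np.fill_diagonal(tom, 1.0)
    return tom


def find_modules(data, beta=6, n_modules=2):
    """Average-linkage clustering on the TOM dissimilarity (1 - TOM).

    Assumption: a fixed cluster count replaces dynamic tree cut.
    """
    dissimilarity = 1.0 - weighted_tom(data, beta)
    # Remove floating-point asymmetry and tiny negative values.
    dissimilarity = np.clip((dissimilarity + dissimilarity.T) / 2.0, 0.0, None)
    np.fill_diagonal(dissimilarity, 0.0)
    tree = linkage(squareform(dissimilarity, checks=False), method="average")
    return fcluster(tree, t=n_modules, criterion="maxclust")


if __name__ == "__main__":
    # Synthetic profile: two groups of five genes, each following its own pattern.
    rng = np.random.default_rng(0)
    pattern_a = np.tile([1.0, -1.0], 10)
    pattern_b = np.repeat([1.0, -1.0], 10)
    genes = np.vstack([pattern_a] * 5 + [pattern_b] * 5)
    genes = genes + 0.1 * rng.normal(size=genes.shape)
    print(find_modules(genes))
```

In practice the input would be the methylation values of the statistically significant genes only, and the soft-threshold power would be chosen from the data rather than fixed in advance.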
Acknowledgments
The authors wish to thank Prof. Sanghamitra Bandyopadhyay, Director, Indian Statistical Institute, Kolkata, India, and Dr. Hemant J. Purohit, Chief Scientist and Head, Environmental Genomics Division, National Environmental Engineering Research Institute (NEERI), CSIR, Nagpur, India.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Mallik, S., Maulik, U. (2018). Module-Based Knowledge Discovery for Multiple-Cytosine-Variant Methylation Profile. In: Purohit, H., Kalia, V., More, R. (eds) Soft Computing for Biological Systems. Springer, Singapore. https://doi.org/10.1007/978-981-10-7455-4_10
DOI: https://doi.org/10.1007/978-981-10-7455-4_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7454-7
Online ISBN: 978-981-10-7455-4
eBook Packages: Biomedical and Life Sciences (R0)