Abstract
Motivation: The organization of DNA into chromatin enables the dynamic regulation and packaging of genetic information. The core histones of the nucleosome, the basic repeating unit of chromatin, are subject to various post-translational modifications such as acetylation, methylation, phosphorylation and ubiquitination. These modifications affect chromatin structure and, along with DNA methylation, regulate gene transcription. The goal of this study was to determine whether patterns of these modifications are related to different categories of genomic features and, if so, whether the patterns have predictive value.
Results: We used publicly available ChIP-chip data for several types of histone modification (methylation and acetylation) and for DNA methylation in Arabidopsis thaliana, and applied a machine learning approach (a support vector machine) to show that the patterns of these modifications differ markedly among categories of genomic features (protein-coding genes, RNA genes, pseudogenes and transposable elements). These patterns can be used to distinguish the types of genomic features. DNA methylation and H3K4me3 emerged as the features with the most discriminative power. From our analysis of Arabidopsis, we predicted 33 novel genomic features, whose existence was also supported by analysis of RNA-seq experiments. In summary, we present a novel approach that can be used to discriminate and detect different categories of genomic features based on their patterns of chromatin modification and DNA methylation.
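The classification idea in the Results can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not give the kernel, the parameters or the feature encoding, so the example swaps in a small two-class linear SVM trained by batch sub-gradient descent. The data are synthetic, and the two signal profiles (low DNA methylation with high H3K4me3 for a "protein-coding-like" class, the reverse for a "transposon-like" class) are assumptions made only for illustration. All function and variable names are hypothetical.

```python
import random


def make_toy_regions(n_per_class, seed=0):
    """Build synthetic regions described by (DNA methylation, H3K4me3) in [0, 1].

    These values are invented for illustration; they are not ChIP-chip data.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_per_class):
        # Assumed "protein-coding-like" profile: low DNA methylation, high H3K4me3.
        X.append([rng.uniform(0.0, 0.3), rng.uniform(0.6, 1.0)])
        y.append(+1)
        # Assumed "transposon-like" profile: high DNA methylation, low H3K4me3.
        X.append([rng.uniform(0.6, 1.0), rng.uniform(0.0, 0.3)])
        y.append(-1)
    return X, y


def decision_value(w, b, x):
    """Signed score of region x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Batch sub-gradient descent on the L2-regularised hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Only points inside the margin (or misclassified) contribute.
            if yi * decision_value(w, b, xi) < 1:
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (protein-coding-like) or -1 (transposon-like)."""
    return +1 if decision_value(w, b, x) >= 0 else -1


X, y = make_toy_regions(20)
w, b = train_linear_svm(X, y)
accuracy = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(X)
print("weights (DNA methylation, H3K4me3):", w, "bias:", b)
print("training accuracy:", accuracy)
```

In a linear model of this kind, the sign and size of each weight give a rough indication of how much a mark contributes to the separation. That is one simple way to think about "discriminative power", though it may differ from the measure used in the study.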
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Srivastava, A., Zhang, X., LaMarca, S., Cai, L., Malmberg, R.L. (2013). Patterns of Chromatin-Modifications Discriminate Different Genomic Features in Arabidopsis. In: Cai, Z., Eulenstein, O., Janies, D., Schwartz, D. (eds) Bioinformatics Research and Applications. ISBRA 2013. Lecture Notes in Computer Science, vol 7875. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38036-5_9
DOI: https://doi.org/10.1007/978-3-642-38036-5_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38035-8
Online ISBN: 978-3-642-38036-5
eBook Packages: Computer Science, Computer Science (R0)