Semi-nonparametric Modeling of Topological Domain Formation from Epigenetic Data

  • Conference paper
Algorithms in Bioinformatics (WABI 2015)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9289)

Abstract

Hi-C experiments capturing the 3D genome architecture have led to the discovery of topologically-associated domains (TADs), which form an important part of 3D genome organization and appear to play a role in gene regulation and other functions. Several histone modifications have each been suggested as possible explanations of TAD formation, but their combinatorial effects on domain formation remain poorly understood at a global scale. Here, we propose nTDP, a convex semi-nonparametric approach based on Bernstein polynomials, to explore the joint effects of histone markers on TAD formation and to predict TADs solely from histone data. We find a small subset of modifications to be predictive of TADs across species. Using our trained model, we are able to predict TADs across different species and cell types without Hi-C data, suggesting that the effect of these modifications is conserved. This work provides the first comprehensive joint model of the effect of histone markers on domain formation.

References

  1. Bach, F.R.: Exploring large feature spaces with hierarchical multiple kernel learning. In: Advances in Neural Information Processing Systems, pp. 105–112 (2009)

  2. Baù, D., Marti-Renom, M.A.: Structure determination of genomic domains by satisfaction of spatial restraints. Chromosome Res. 19(1), 25–35 (2011)

  3. Bednarz, P., Wilczyński, B.: Supervised learning method for predicting chromatin boundary associated insulator elements. J. Bioinform. Computat. Biol. 12(06), 1442006 (2014)

  4. Bernstein, B.E., et al.: The NIH roadmap epigenomics mapping consortium. Nat. Biotechnol. 28(10), 1045–1048 (2010)

  5. Bickmore, W.A., van Steensel, B.: Genome architecture: Domain organization of interphase chromosomes. Cell 152(6), 1270–1284 (2013)

  6. Dixon, J.R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J.S., Ren, B.: Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485(7398), 376–380 (2012)

  7. ENCODE Project Consortium, et al.: An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012)

  8. Ernst, J., Kellis, M.: ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9(3), 215–216 (2012)

  9. Filippova, D., Patro, R., Duggal, G., Kingsford, C.: Identification of alternative topological domains in chromatin. Alg. Mol. Biol. 9(1), 14 (2014)

  10. Gibcus, J.H., Dekker, J.: The hierarchy of the 3D genome. Mol. Cell 49(5), 773–782 (2013)

  11. Guelen, L., et al.: Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions. Nature 453(7197), 948–951 (2008)

  12. Ho, J.W., et al.: Comparative analysis of metazoan chromatin organization. Nature 512(7515), 449–452 (2014)

  13. Hoffman, M.M., Buske, O.J., Wang, J., Weng, Z., Bilmes, J.A., Noble, W.S.: Unsupervised pattern discovery in human chromatin structure through genomic segmentation. Nat. Methods 9, 473–476 (2012)

  14. Hou, C., Li, L., Qin, Z.S., Corces, V.G.: Gene density, transcription, and insulators contribute to the partition of the Drosophila genome into physical domains. Mol. Cell 48(3), 471–484 (2012)

  15. Le, T.B.K., et al.: High-resolution mapping of the spatial organization of a bacterial chromosome. Science 342(6159), 731–734 (2013)

  16. Libbrecht, M.W., Ay, F., Hoffman, M.M., Gilbert, D.M., Bilmes, J.A., Noble, W.S.: Joint annotation of chromatin state and chromatin conformation reveals relationships among domain types and identifies domains of cell type-specific expression. Genome Res. 25, 544–557 (2015)

  17. Lieberman-Aiden, E., et al.: Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326(5950), 289–293 (2009)

  18. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Math. Program. 45(1–3), 503–528 (1989)

  19. McKay Curtis, S., Ghosh, S.K., et al.: A variable selection approach to monotonic regression with Bernstein polynomials. J. Appl. Stat. 38(5), 961–976 (2011)

  20. Meilă, M.: Comparing clusterings–an information based distance. J. Multivar. Anal. 98(5), 873–895 (2007)

  21. Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT Press, Cambridge (2012)

  22. Nora, E.P., et al.: Segmental folding of chromosomes: a basis for structural and regulatory chromosomal neighborhoods? BioEssays 35(9), 818–828 (2013)

  23. Phillips-Cremins, J.E., Sauria, M.E., Sanyal, A., Gerasimova, T.I., Lajoie, B.R., Bell, J.S., Ong, C.T., Hookway, T.A., Guo, C., Sun, Y., Bland, M.J., Wagstaff, W., Dalton, S., McDevitt, T.C., Sen, R., Dekker, J., Taylor, J., Corces, V.G.: Architectural protein subclasses shape 3D organization of genomes during lineage commitment. Cell 153(6), 1281–1295 (2013)

  24. Rao, S.S., et al.: A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159(7), 1665–1680 (2014)

  25. Sefer, E., Duggal, G., Kingsford, C.: Deconvolution of ensemble chromatin interaction data reveals the latent mixing structures in cell subpopulations. In: Przytycka, T.M. (ed.) RECOMB 2015. LNCS, vol. 9029, pp. 293–308. Springer, Heidelberg (2015)

  26. Sefer, E., Kingsford, C.: Metric labeling and semi-metric embedding for protein annotation prediction. In: Bafna, V., Sahinalp, S.C. (eds.) RECOMB 2011. LNCS, vol. 6577, pp. 392–407. Springer, Heidelberg (2011)

  27. Sexton, T., Yaffe, E., Kenigsberg, E., Bantignies, F., Leblanc, B., Hoichman, M., Parrinello, H., Tanay, A., Cavalli, G.: Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148(3), 458–472 (2012)

  28. Tolhuis, B., Palstra, R.J., Splinter, E., Grosveld, F., de Laat, W.: Looping and interaction between hypersensitive sites in the active \(\beta \)-globin locus. Mol. Cell 10(6), 1453–1465 (2002)

  29. Wahba, G.: Spline models for observational data, vol. 59. SIAM (1990)

  30. Yaffe, E., Tanay, A.: Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture. Nat. Genet. 43(11), 1059–1065 (2011)

  31. Zhou, J., Troyanskaya, O.G.: Global quantitative modeling of chromatin factor interactions. PLoS Comput. Biol. 10(3), e1003525 (2014)


Funding

This research is funded in part by the Gordon and Betty Moore Foundation's Data-Driven Discovery Initiative through Grant GBMF4554 to Carl Kingsford, by the US NSF (1256087, 1319998), and by the US NIH (HG006913, HG007104). C.K. received support as an Alfred P. Sloan Research Fellow.

Author information

Corresponding author

Correspondence to Emre Sefer.

Appendix

Following [19], the second derivative of \(f^{p}_{m}\) that appears in \(R(f^{p}_{m})\) can be written explicitly as in (18):

$$\begin{aligned} \frac{\partial ^{2} f^{p}_{m}(x,\mathbf {w^{p}_{m}})}{\partial x^{2}} = A(A-1) \sum _{i=0}^{A-2} \left( w^{p}_{m}[i+2]-2w^{p}_{m}[i+1]+w^{p}_{m}[i] \right) \binom{A-2}{i} x^{i}(1-x)^{A-2-i} \end{aligned}$$
(18)

which turns \(R(f^{p}_{m})\) into (19):

$$\begin{aligned} \int _{0}^{1} \left( \frac{\partial ^{2} f^{p}_{m}(x)}{\partial x^{2}}\right) ^{2} dx = {} & A^{2}(A-1)^{2} \sum _{i=0}^{A} \sum _{j=0}^{A} w^{p}_{m}[i]\, w^{p}_{m}[j] \\ & \times \Bigg ( \sum _{q=\overline{e}_{i}}^{\min (i,2)} \sum _{r=\overline{e}_{j}}^{\min (j,2)} (-1)^{q+r} \binom{2}{q} \binom{2}{r} T^{i-q}_{j-r}(x) \Bigg ) \end{aligned}$$
(19)

where \(\overline{e}_{k} = \max (0,2-A+k)\) for an index \(k \in \{i, j\}\), \(T^{i-q}_{j-r}(x)\) is defined in (20), and \(\beta (i+j-q-r+1,2A-3-i-j+q+r)\) is the beta function:

$$\begin{aligned} T^{i-q}_{j-r}(x) = \binom{A-2}{i-q} \binom{A-2}{j-r} \underbrace{\int _{0}^{1} x^{i-q}(1-x)^{A-2-i+q} x^{j-r}(1-x)^{A-2-j+r} dx}_{\beta (i+j-q-r+1, 2A-3-i-j+q+r)} \end{aligned}$$
(20)
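
The formulas (18) to (20) can be checked numerically. The following is a minimal sketch, not the authors' implementation. It assumes the standard Bernstein form \(f(x) = \sum_{i=0}^{A} w[i] \binom{A}{i} x^{i}(1-x)^{A-i}\), which is not restated in this appendix, and it sums over all ordered index pairs \((i, j)\) from 0 to \(A\), which is what squaring (18) term by term gives. All function names (`bernstein`, `second_derivative`, `beta_int`, `R_closed_form`, `R_simpson`) are hypothetical and chosen only for illustration.

```python
from math import comb, factorial


def bernstein(x, w):
    """Assumed form: f(x) = sum_i w[i] * C(A, i) * x^i * (1 - x)^(A - i), A = len(w) - 1."""
    A = len(w) - 1
    return sum(w[i] * comb(A, i) * x**i * (1 - x) ** (A - i) for i in range(A + 1))


def second_derivative(x, w):
    """Right-hand side of (18): scaled second differences of the weights."""
    A = len(w) - 1
    return A * (A - 1) * sum(
        (w[i + 2] - 2 * w[i + 1] + w[i]) * comb(A - 2, i) * x**i * (1 - x) ** (A - 2 - i)
        for i in range(A - 1)
    )


def beta_int(a, b):
    """Beta function B(a, b) for positive integers a and b, via factorials."""
    return factorial(a - 1) * factorial(b - 1) / factorial(a + b - 1)


def R_closed_form(w):
    """Integral of (f'')^2 over [0, 1] using (19) and (20), over all ordered pairs (i, j)."""
    A = len(w) - 1
    if A < 2:
        return 0.0  # constant and linear polynomials have zero second derivative
    total = 0.0
    for i in range(A + 1):
        for j in range(A + 1):
            for q in range(max(0, 2 - A + i), min(i, 2) + 1):
                for r in range(max(0, 2 - A + j), min(j, 2) + 1):
                    a, b = i - q, j - r  # indices into the degree A-2 basis
                    t = comb(A - 2, a) * comb(A - 2, b) * beta_int(a + b + 1, 2 * A - 3 - a - b)
                    total += w[i] * w[j] * (-1) ** (q + r) * comb(2, q) * comb(2, r) * t
    return A**2 * (A - 1) ** 2 * total


def R_simpson(w, n=1000):
    """Composite Simpson estimate of the same integral; n must be even."""
    h = 1.0 / n

    def g(x):
        return second_derivative(x, w) ** 2

    s = g(0.0) + g(1.0)
    s += 4 * sum(g((2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2 * sum(g(2 * k * h) for k in range(1, n // 2))
    return s * h / 3


if __name__ == "__main__":
    weights = [0.0, 0.8, -0.3, 1.2, 0.5]  # arbitrary illustrative weights, A = 4
    print(R_closed_form(weights), R_simpson(weights))
```

Comparing `R_closed_form` with `R_simpson` on a few weight vectors is a quick way to catch index or sign slips in an implementation of the closed form. For large \(A\), the factorial-based beta function may be better replaced by a log-gamma formulation.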

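\(R(f^{p}_{m})\) is convex in the weights \(\mathbf {w^{p}_{m}}\): by (19) it is a quadratic form in the weights, and because it equals the integral of a squared function it is non-negative for every weight vector, so the quadratic form is positive semidefinite.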
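
The convexity claim can be illustrated with a small numerical check. This is a sketch under the same assumption of the standard Bernstein form with weights \(w[0..A]\), not the authors' code. It writes \(R\) as a quadratic form in the second differences \(d[i] = w[i+2]-2w[i+1]+w[i]\) from (18), using \(\beta(a+b+1, 2A-3-a-b) = (a+b)!\,(2A-4-a-b)!/(2A-3)!\) for integer arguments, and then evaluates the convexity gap \(t R(w_1) + (1-t) R(w_2) - R(t w_1 + (1-t) w_2)\), which should never be negative. The names `R_quadratic` and `convexity_gap` are hypothetical.

```python
import random
from math import comb, factorial


def R_quadratic(w):
    """Integral of (f'')^2 over [0, 1], as a quadratic form in the second differences of w."""
    A = len(w) - 1
    d = [w[i + 2] - 2 * w[i + 1] + w[i] for i in range(A - 1)]  # empty when A < 2
    total = 0.0
    for a, da in enumerate(d):
        for b, db in enumerate(d):
            # Beta(a + b + 1, 2A - 3 - a - b) for integer arguments
            beta = factorial(a + b) * factorial(2 * A - 4 - a - b) / factorial(2 * A - 3)
            total += da * db * comb(A - 2, a) * comb(A - 2, b) * beta
    return A**2 * (A - 1) ** 2 * total


def convexity_gap(w1, w2, t):
    """t*R(w1) + (1-t)*R(w2) - R(t*w1 + (1-t)*w2); non-negative when R is convex."""
    mix = [t * u + (1 - t) * v for u, v in zip(w1, w2)]
    return t * R_quadratic(w1) + (1 - t) * R_quadratic(w2) - R_quadratic(mix)


if __name__ == "__main__":
    random.seed(0)  # fixed seed so the illustration is repeatable
    w1 = [random.uniform(-1, 1) for _ in range(6)]
    w2 = [random.uniform(-1, 1) for _ in range(6)]
    print(convexity_gap(w1, w2, 0.3))
```

For a quadratic form the gap equals \(t(1-t)\,R(w_1 - w_2)\), so a non-negative \(R\) is enough for convexity. A random check like this is only a sanity test and does not replace the argument in the text.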

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sefer, E., Kingsford, C. (2015). Semi-nonparametric Modeling of Topological Domain Formation from Epigenetic Data. In: Pop, M., Touzet, H. (eds) Algorithms in Bioinformatics. WABI 2015. Lecture Notes in Computer Science, vol 9289. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48221-6_11

  • DOI: https://doi.org/10.1007/978-3-662-48221-6_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-48220-9

  • Online ISBN: 978-3-662-48221-6

  • eBook Packages: Computer Science, Computer Science (R0)
