Deconvolution of Ensemble Chromatin Interaction Data Reveals the Latent Mixing Structures in Cell Subpopulations

  • Emre Sefer
  • Geet Duggal
  • Carl Kingsford
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9029)

Abstract

Chromosome conformation capture (3C) experiments provide a window into the spatial packing of a genome in three dimensions within the cell. This structure has been shown to be highly correlated with gene regulation, cancer mutations, and other genomic functions. However, 3C provides mixed measurements on a population of typically millions of cells, each with a different genome structure due to the fluidity of the genome and differing cell states. Here, we present several algorithms to deconvolve these measured 3C matrices into estimations of the contact matrices for each subpopulation of cells and relative densities of each subpopulation. We formulate the problem as that of choosing matrices and densities that minimize the Frobenius distance between the observed 3C matrix and the weighted sum of the estimated subpopulation matrices. Results on HeLa 5C and mouse and bacteria Hi-C data demonstrate the methods’ effectiveness. We also show that domain boundaries from deconvolved matrices are often more enriched or depleted for regulatory chromatin markers when compared to boundaries from convolved matrices.
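The objective described in the abstract can be illustrated with a small sketch. The code below is not the authors' algorithm: it only evaluates the Frobenius distance between an observed matrix and a density-weighted sum of candidate subpopulation matrices, and, for the special case of two fixed candidate matrices, recovers the best mixing density in closed form. The function names, the toy matrices, and the assumption that the two densities are non-negative and sum to 1 are all illustrative choices that do not come from the paper.

```python
from math import sqrt


def weighted_sum(matrices, densities):
    """Return sum_c densities[c] * matrices[c] as a nested list."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(d * m[i][j] for m, d in zip(matrices, densities)) for j in range(cols)]
        for i in range(rows)
    ]


def frobenius_distance(observed, matrices, densities):
    """Frobenius norm of (observed - weighted sum of the candidate matrices)."""
    mixed = weighted_sum(matrices, densities)
    return sqrt(
        sum((o - x) ** 2 for ro, rx in zip(observed, mixed) for o, x in zip(ro, rx))
    )


def best_two_class_density(observed, a, b):
    """Density w in [0, 1] minimizing ||observed - (w*a + (1-w)*b)||_F.

    Assumption (ours, not the paper's): exactly two candidate matrices,
    with densities w and 1 - w. The objective is a convex quadratic in w,
    so the unconstrained minimizer is clipped to the interval [0, 1].
    """
    num = den = 0.0
    for ro, ra, rb in zip(observed, a, b):
        for o, x, y in zip(ro, ra, rb):
            num += (o - y) * (x - y)
            den += (x - y) ** 2
    if den == 0.0:
        return 0.5  # a and b are identical, so every density fits equally well
    return min(1.0, max(0.0, num / den))


# Hypothetical toy contact matrices for two subpopulations.
A = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
B = [[1, 0, 0], [0, 1, 1], [0, 1, 1]]

# Simulate an ensemble measurement as a 70% / 30% mixture of A and B.
observed = weighted_sum([A, B], [0.7, 0.3])

w = best_two_class_density(observed, A, B)
residual = frobenius_distance(observed, [A, B], [w, 1.0 - w])
```

In this toy setting the candidate matrices are known in advance, so only the density is estimated. The problem stated in the abstract is harder because the subpopulation matrices themselves must be chosen together with their densities.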

Keywords

Mean Absolute Error, Class Density, Nucleic Acid Research, Prior Weight, Ensemble Matrix
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Computational Biology Department, Carnegie Mellon University, Pittsburgh, USA
