Topological Data Analysis of Single-Cell Hi-C Contact Maps

  • Mathieu Carrière
  • Raúl Rabadán
Conference paper
Part of the Abel Symposia book series (ABEL, volume 15)

Abstract

Due to recent breakthroughs in high-throughput sequencing, it is now possible to use chromosome conformation capture (CCC) to understand the three-dimensional conformation of DNA at the whole-genome level, and to characterize it with so-called contact maps. This is very useful, since many biological processes, such as DNA transcription, are correlated with DNA folding. However, the methods for analyzing such conformations still lack mathematical guarantees and statistical power. To handle this issue, we propose to use the Mapper, a standard tool of Topological Data Analysis (TDA) that efficiently encodes the inherent continuity and topology of the biological processes underlying the data, in the form of a graph with features such as branches and loops. In this article, we show how recent statistical techniques developed in TDA for the Mapper algorithm can be extended and leveraged to formally define and statistically quantify the presence of topological structures coming from biological phenomena, such as the cell cycle, in datasets of CCC contact maps.
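
To make the idea of a graph with loops concrete, the following is a minimal sketch of a one-dimensional Mapper in plain Python. It is not the pipeline used in the paper: the function names (`connected_components`, `mapper_graph`, `count_loops`), the interval cover, the fixed-radius single-linkage clustering, and the toy circle standing in for a cyclic process such as the cell cycle are all assumptions made for illustration, and the statistical quantification discussed in the abstract is not attempted.

```python
import math
from itertools import combinations

def connected_components(indices, points, eps):
    """Single-linkage clusters: two points are linked if their distance is <= eps."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, stack = {seed}, [seed]
        while stack:
            i = stack.pop()
            near = {j for j in remaining if math.dist(points[i], points[j]) <= eps}
            remaining -= near
            cluster |= near
            stack.extend(near)
        clusters.append(frozenset(cluster))
    return clusters

def mapper_graph(points, filter_values, n_intervals, overlap, eps):
    """Cover the filter range with overlapping intervals, cluster each preimage,
    and connect clusters that share at least one point."""
    lo, hi = min(filter_values), max(filter_values)
    # Interval length chosen so that n_intervals intervals exactly span [lo, hi].
    length = (hi - lo) / (n_intervals - overlap * (n_intervals - 1))
    step = length * (1 - overlap)
    nodes = []
    for k in range(n_intervals):
        a = lo + k * step
        b = hi if k == n_intervals - 1 else a + length  # avoid round-off at the end
        idx = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(connected_components(idx, points, eps))
    edges = [(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]]
    return nodes, edges

def count_loops(n_nodes, edges):
    """First Betti number of a graph: edges - nodes + connected components."""
    parent = list(range(n_nodes))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    components = n_nodes
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return len(edges) - n_nodes + components

# Toy data (an assumption, not real contact maps): 200 points on a unit circle,
# with the x-coordinate used as the filter function.
circle = [(math.cos(2 * math.pi * i / 200), math.sin(2 * math.pi * i / 200))
          for i in range(200)]
nodes, edges = mapper_graph(circle, [p[0] for p in circle],
                            n_intervals=5, overlap=0.3, eps=0.1)
print(len(nodes), len(edges), count_loops(len(nodes), edges))
```

With these parameters the toy circle yields a graph that is a single cycle, which is the kind of loop feature the abstract associates with periodic phenomena; on real data the outcome depends strongly on the choice of filter, cover, and clustering parameters.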

Notes

Acknowledgements

This work has been funded by NIH grants (U54 CA193313 and U54 CA209997) and Chan Zuckerberg Initiative pilot grant.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Systems Biology, Columbia University Irving Medical Center, New York, USA