RetroSpect, a New Method of Measuring Gene Regulatory Evolution Rates Using Co-mapping of Genomic Functional Features with Transposable Elements

  • Daniil Nikitin
  • Maxim Sorokin
  • Victor Tkachev
  • Andrew Garazha
  • Alexander Markov
  • Anton Buzdin (email author)
Chapter

Abstract

Transposable elements (TEs) are selfish genetic sequences that proliferate in host genomes by inserting copies of themselves at new genomic locations. TEs reside in the genomes of all groups of living organisms. Host cells may recruit TE sequences to serve as regulatory sites for neighboring genes. These regulatory sites can be transcription factor binding sites (TFBS), histone modification loci, DNase I hypersensitivity sites, etc. Insertion of a TE in the neighborhood of a gene shifts the balance of regulatory sequences that control that gene. The more regulatory sites that can be identified within gene-proximal TEs, the faster the regulation of that gene is expected to have evolved. We proposed a method for measuring evolutionary rates of gene regulation based on relative quantitation of regulatory sites located within TEs near gene transcriptional start sites. It allows regulatory evolution to be interrogated in organisms with TE-rich genomes. This method, termed RetroSpect, was first applied to study human gene evolution using co-mapping of TFBS with human retroelements (REs). REs are a subgroup of TEs that were active in mammals before and after their radiation. We characterized human genes and molecular pathways either enriched or deficient in RE-linked TFBS regulation for 563 transcription factors in thirteen human cell lines. We found that the major groups enriched in RE-linked regulation are involved in gene control by microRNAs, olfaction, color vision, fertilization, cellular immune response, amino acid and fatty acid metabolism, and detoxification. The deficient groups are involved in protein translation, RNA transcription and processing, chromatin organization, and molecular signaling.
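
The idea of "relative quantitation of regulatory sites located within TEs" near a transcriptional start site can be illustrated with a minimal sketch. It is not the RetroSpect implementation: the abstract does not give the scoring formula, so the sketch assumes the simplest possible score, the fraction of TFBS in a gene's neighborhood that overlap a TE. The function name `te_linked_fraction`, the 5000 bp window, and the toy coordinates are all hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def te_linked_fraction(
    tss: int,
    tfbs: List[Interval],
    tes: List[Interval],
    window: int = 5000,  # hypothetical neighborhood half-width, in bp
) -> float:
    """Fraction of TFBS near a TSS that overlap a transposable element.

    This ratio is an assumed stand-in for "relative quantitation";
    the published method may normalize differently.
    """
    neighborhood = (tss - window, tss + window)
    # Keep only binding sites that fall in the gene neighborhood.
    nearby = [site for site in tfbs if overlaps(site, neighborhood)]
    if not nearby:
        return 0.0
    # Count the nearby sites that co-map with at least one TE.
    in_te = [site for site in nearby if any(overlaps(site, te) for te in tes)]
    return len(in_te) / len(nearby)


# Toy data: three TFBS lie near the TSS, two of them inside TEs.
example_tfbs = [(9000, 9020), (9500, 9520), (12000, 12020), (30000, 30020)]
example_tes = [(8900, 9100), (11900, 12500)]
example_score = te_linked_fraction(10000, example_tfbs, example_tes)
```

Under this assumed score, a gene with a higher fraction would be read as having more TE-derived regulation, which the abstract interprets as faster regulatory evolution. A real analysis would work genome-wide on interval files with a dedicated tool rather than nested loops.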

Keywords

Genome evolution; Gene regulation; Human genetics; Transcription factor binding sites; Transposable elements; Retrotransposons; Molecular pathways; ChIP-seq; Omics approach in evolutionary biology

Notes

Acknowledgements

We acknowledge Amazon and Microsoft Azure grants for cloud-based computations, which helped us to complete this study. We thank the Oncobox/OmicsWay research program in machine learning and digital oncology for providing access to software and pathway databases. The authors (A.B. and M.S.) were supported by the Russian Science Foundation grant no. 18-15-00061.

Conflicts of Interest

The authors declare that they have no competing interests.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Daniil Nikitin (1, 2, 4)
  • Maxim Sorokin (1, 3)
  • Victor Tkachev (2)
  • Andrew Garazha (2)
  • Alexander Markov (4)
  • Anton Buzdin (1, 2, 3), email author

  1. I.M. Sechenov First Moscow State Medical University, Moscow, Russia
  2. Omicsway Corp., Walnut, USA
  3. Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
  4. Faculty of Biology, Moscow State University, Moscow, Russia