RetroSpect, a New Method of Measuring Gene Regulatory Evolution Rates Using Co-mapping of Genomic Functional Features with Transposable Elements
Abstract
Transposable elements (TEs) are selfish genetic sequences that proliferate in host genomes by inserting copies of themselves at new genomic locations. TEs reside in the genomes of all groups of living organisms. TE sequences may be recruited by host cells to serve as regulatory sites for neighboring genes. These regulatory sites can be transcription factor binding sites (TFBS), histone modification loci, DNase I hypersensitivity sites, etc. Insertion of a TE in the neighborhood of a gene shifts the balance of regulatory sequences that control that gene. The more regulatory sites that can be identified within gene-proximal TEs, the faster the regulation of that gene is expected to have evolved. We propose a method for measuring evolutionary rates of gene regulation based on relative quantitation of regulatory sites located within TEs next to gene transcriptional start sites. It allows regulatory evolution to be interrogated in organisms with TE-rich genomes. The method, termed RetroSpect, was first applied to study human gene evolution by co-mapping TFBS with human retroelements (REs). REs are a subgroup of TEs that were active in mammals before and after their radiation. We characterized human genes and molecular pathways either enriched or deficient in RE-linked TFBS regulation for 563 transcription factors in thirteen human cell lines. We found that the major groups enriched in RE-linked regulation are involved in gene control by microRNAs, olfaction, color vision, fertilization, cellular immune response, amino acid and fatty acid metabolism, and detoxification. The deficient groups were involved in protein translation, RNA transcription and processing, chromatin organization, and molecular signaling.
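The idea of relative quantitation can be illustrated with a minimal sketch. It is not the RetroSpect implementation: the window size, the simple overlap rule, the per-gene fraction used as a score, and all names (`Interval`, `te_linked_fraction`) are assumptions made for illustration only. For one gene, the sketch counts the TFBS intervals near the transcriptional start site and reports the share of them that overlap a TE.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Interval:
    """Genomic interval; start is inclusive, end is exclusive."""
    chrom: str
    start: int
    end: int


def overlap(a: Interval, b: Interval) -> int:
    """Number of bases shared by two intervals (0 if none)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def te_linked_fraction(
    tss_chrom: str,
    tss_pos: int,
    tfbs: List[Interval],
    tes: List[Interval],
    window: int = 5000,  # hypothetical neighborhood half-width, in bp
) -> Optional[float]:
    """Share of TSS-proximal TFBS that overlap any TE.

    Returns None when no TFBS falls in the neighborhood, since the
    ratio is undefined in that case.
    """
    neighborhood = Interval(tss_chrom, max(0, tss_pos - window), tss_pos + window)
    near = [site for site in tfbs if overlap(site, neighborhood) > 0]
    if not near:
        return None
    in_te = [site for site in near if any(overlap(site, te) > 0 for te in tes)]
    return len(in_te) / len(near)


# Toy data, invented for the example.
tfbs_sites = [
    Interval("chr1", 100, 120),      # inside the TE below
    Interval("chr1", 900, 920),      # near the TSS, outside any TE
    Interval("chr1", 20000, 20020),  # too far from the TSS
    Interval("chr2", 100, 120),      # different chromosome
]
te_annotations = [Interval("chr1", 90, 200)]

score = te_linked_fraction("chr1", 1000, tfbs_sites, te_annotations, window=1000)
print(score)  # 0.5: one of the two TSS-proximal sites lies within a TE
```

Under these assumptions, a higher fraction for a gene would correspond to a larger contribution of TE-derived sites to its regulation. A quadratic scan over all TEs is used here for brevity; genome-scale data would call for an interval index.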
Keywords
Genome evolution; Gene regulation; Human genetics; Transcription factor binding sites; Transposable elements; Retrotransposons; Molecular pathways; ChIP-seq; Omics approach in evolutionary biology
Notes
Acknowledgements
We acknowledge Amazon and Microsoft Azure grants for cloud-based computations, which helped us to complete this study. We thank the Oncobox/OmicsWay research program in machine learning and digital oncology for providing access to software and pathway databases. The authors (A.B. and M.S.) were supported by the Russian Science Foundation grant no. 18-15-00061.
Conflicts of Interest
The authors declare that they have no competing interests.