Abstract
Species-specific DNA regions are segments that are unique to a species or highly dissimilar relative to closely related species. Their discovery is important because they allow the localization of evolutionary traits that are often related to novel functionalities and, sometimes, diseases.
We have detected distinct DNA regions specific to the modern human, when compared to a high-quality Neanderthal genome sequence obtained from a bone of a Siberian woman. The bone is around 50,000 years old and the raw DNA data totals more than 418 GB. Since the amount of data required to localize such events efficiently is very large, it is not practical to store the model in a table or hash table. Thus, we propose a probabilistic method to map and visualize those regions. The time complexity of the method is linear. The computational tool is available at http://pratas.github.io/chester.
The results, computed in approximately two days using a single CPU core, show several regions documented as absent in the Neanderthal, namely genes associated with the brain (neurotransmitters and synapses), hearing, blood, fertility and the immune system. However, they also show several undocumented regions that may express new functions linked with the evolution of the modern human.
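The abstract does not name the probabilistic structure behind the method. As a minimal sketch, assuming a Bloom filter stands in for the hash table, the Python below indexes every k-mer of a reference sequence and then flags the k-mers of a target sequence that the filter reports as absent. The names `BloomFilter`, `kmers` and `absent_mask`, the SHA-256 double hashing, and the default sizes are all hypothetical choices made for illustration; they are not taken from the chester tool.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with several hash positions per item (illustrative)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(item.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(item))


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def absent_mask(reference, target, k, num_bits=1 << 20, num_hashes=4):
    """Return one boolean per target k-mer: True if the filter says it is
    not in the reference. One pass over each sequence, so time is linear."""
    bf = BloomFilter(num_bits, num_hashes)
    for km in kmers(reference, k):
        bf.add(km)
    return [km not in bf for km in kmers(target, k)]


if __name__ == "__main__":
    mask = absent_mask("ACGTACGTAC", "ACGTTTTTAC", k=4)
    print("".join("#" if absent else "." for absent in mask))
```

In such a design, memory is fixed by `num_bits` regardless of how many k-mers are inserted. The cost is a small false-positive rate: a k-mer that is truly absent from the reference may occasionally be reported as present, whereas a k-mer that is present is never reported as absent.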
Acknowledgments
We thank Martin Kircher for very helpful comments and explanations, and Cláudio Teixeira for providing computational infrastructure. This work was funded by FEDER (Programa Operacional Factores de Competitividade - COMPETE) and by National Funds through the FCT - Foundation for Science and Technology, in the context of the projects UID/CEC/00127/2013, UID/BIM/04501/2013, PTCD/EEI-SII/6608/2014 and the grant SFRH/BPD/111148/2015 to RMS.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Pratas, D., Hosseini, M., Silva, R.M., Pinho, A.J., Ferreira, P.J.S.G. (2017). Visualization of Distinct DNA Regions of the Modern Human Relatively to a Neanderthal Genome. In: Alexandre, L., Salvador Sánchez, J., Rodrigues, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2017. Lecture Notes in Computer Science, vol 10255. Springer, Cham. https://doi.org/10.1007/978-3-319-58838-4_26
DOI: https://doi.org/10.1007/978-3-319-58838-4_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58837-7
Online ISBN: 978-3-319-58838-4
eBook Packages: Computer Science (R0)