Comp-D: a program for comprehensive computation of D-statistics and population summaries of reticulated evolution
Abstract
Patterson’s D-statistic and its five-taxon derivatives are important phylogenetic methods for quantifying reticulated evolution, yet their application is limited by the lack of a single, comprehensive program that efficiently performs all necessary calculations from the file formats of common phylogenetic and population genetic programs. To increase accessibility for a broad range of researchers, we present a user-friendly program (COMP-D) that provides flexibility for incorporating heterozygous sites, implements multiple statistical methods, and aggregates results from multiple tests. Program augmentations also facilitate the detection of population-level introgression. COMP-D provides a threefold increase in speed relative to comparable software. It is implemented in C++ and released under the GNU General Public License v3.0. Source code is available for Linux/Mac OS X from: https://github.com/stevemussmann/Comp-D_MPI.
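For readers unfamiliar with the statistic, the following is a minimal sketch of the four-taxon calculation, D = (ABBA − BABA) / (ABBA + BABA), written in Python for brevity. It is not taken from the COMP-D source code: the function name `patterson_d` and its string-based input are hypothetical, and the sketch simply skips ambiguous or heterozygous sites, whereas COMP-D offers several ways of incorporating them. Significance testing is also omitted.

```python
def patterson_d(p1: str, p2: str, p3: str, outgroup: str) -> float:
    """Return Patterson's D for four aligned sequences (P1, P2, P3, outgroup).

    Illustrative only: names and input format are assumptions, not COMP-D's API.
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    abba = 0
    baba = 0
    for a, b, c, o in zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper()):
        # Simplification: skip gaps, missing data, and ambiguity
        # (heterozygous) codes rather than weighting them.
        if not {a, b, c, o} <= valid:
            continue
        # The outgroup defines the ancestral allele "A";
        # P3 must carry the derived allele "B" for the site to be informative.
        if c == o:
            continue
        if a == o and b == c:
            abba += 1  # site pattern A B B A
        elif b == o and a == c:
            baba += 1  # site pattern B A B A

    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba)
```

Under this convention, a positive value indicates an excess of ABBA sites (derived alleles shared between P2 and P3), and a negative value indicates an excess of BABA sites (shared between P1 and P3). A value near zero is the expectation when discordance is caused by incomplete lineage sorting alone.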
Keywords
RADseq, Introgression, SNP analysis, Next-generation sequencing
Notes
Acknowledgements
The Arkansas High Performance Computing Center (AHPCC) provided technical assistance and computational resources. Tyler K. Chafin and Bradley T. Martin promoted software development by testing an early version of the program. This research was conducted in partial fulfillment of the Ph.D. degree in Biological Sciences at the University of Arkansas (SMM). It was supported by generous University of Arkansas endowments: the Bruker Professorship in Life Sciences (MRD), the 21st Century Chair in Global Change Biology (MED), and a Doctoral Academy Fellowship (SMM). Three anonymous reviewers provided comments that greatly improved the manuscript.
Compliance with ethical standards
Conflict of interest
The authors have nothing to disclose.