The plough is one of the most ancient and most valuable of man's inventions; but long before he existed the land was in fact regularly ploughed, and still continues to be thus ploughed by earth-worms. It may be doubted whether there are many other animals which have played so important a part in the history of the world, as have these lowly organised creatures.

Charles Darwin, The formation of vegetable mould through the actions of worms, with observations on their habits, pg. 313 [1]

Earthworms (Crassiclitellata) constitute a diverse group of primarily terrestrial, burrowing annelids comprising 6000+ extant species in 18 families and found on all continents except Antarctica. Most earthworm species live in soil, but some live in decaying logs, leaf litter, stream mud and riverbanks, as well as arboreal (e.g., epiphytic root masses) and even marine littoral habitats. Charles Darwin famously extolled the importance of earthworms as terrestrial ecosystem engineers, churning and aerating the soil with their burrows as well as burying and processing large fragments of organic matter and making their nutrients available to plants. Large-scale engineering by earthworms has recently been documented in South America [2] and may occur elsewhere. Even apart from their direct agricultural importance as soil processors, earthworms have a substantial economic impact—epigeic (leaf litter/compost-dwelling) species are used to process food waste (vermiculture), larger species are sold as bait for fish, and some earthworm species are considered delicacies and are sold for human consumption. Earthworms are prey items for many other species, including planarians, leeches, mollusks, insects, amphibians, lizards, snakes, birds and mammals, and thus serve as a crucial link in numerous terrestrial food webs. Many earthworms are considered invasive; approximately one-third of all earthworm species in North America are introduced from Europe and Asia [3, 4]. As invasive earthworms spread in recently glaciated and otherwise earthworm-free forests in North America, they affect many microbial, plant and invertebrate species that have come to rely on large amounts of undisturbed leaf material [5].

Widespread distribution and limited dispersal abilities make earthworms a promising model of historical biogeographic patterns at a global scale. Indeed, speculation about earthworm biogeography has a long history, perhaps unusually attractive to history-of-science enthusiasts. Early ideas about earthworm distributions relied on dubious land bridge hypotheses (review in [6]). Apocryphal lore has it that J.W. Michaelsen (a great Clitellata taxonomist of the late 19th and early 20th centuries; e.g., [7]) and Alfred Wegener were office neighbors in Hamburg, Germany for a time. Michaelsen [8] cited Wegener’s hypothesis of continental drift [9] as providing considerable explanatory power for the distributions of earthworms, and named an amphi-Atlantic genus after him (Wegeneriella Michaelsen 1933). Despite Michaelsen’s contribution, speculation about land bridges continued to pervade the earthworm biogeographic literature.

Earthworms have a very poor fossil record, and specialists have long disagreed about directions of character evolution within the group. Early earthworm phylogenies were highly intuitive (cf. [10]) and shed little light on earthworm historical biogeography. Earthworm phylogenetic understanding has progressed slowly since these initial attempts. The few applications of cladistic analysis, such as Jamieson’s (1988) morphological study [11], yielded mixed conclusions, and the first use of molecular data [12] overturned many of the morphology-based hypotheses. James and Davidson [13] included a broader gene (16S, 18S and 28S ribosomal RNA genes) and taxon sampling of Crassiclitellata and several outgroups and were able to reinterpret many morphological changes defining the families of crassiclitellates, proposing new hypotheses of morphological evolution and rehabilitating older ones.

Although James and Davidson [13] clarified many aspects of earthworm phylogeny, relationships among several major groups remain poorly supported. Fortunately, the advent of low-cost, high-throughput sequencing methods has revolutionized the study of higher-level relationships across the tree of life, allowing researchers to bring dozens to thousands of genes to bear on previously intractable questions. To test previous hypotheses of relationships among earthworms and provide a robust framework for historical biogeographic inference and studies of character evolution, we generated transcriptomic data from representatives of nearly all major extant lineages of Crassiclitellata and performed a series of analyses to infer relationships among the major lineages of earthworms.


Taxon sampling

A total of forty taxa (33 crassiclitellates, one moniligastrid and six outgroup taxa) were sampled for this study (Table 1). James and Davidson [13] used representatives of several clitellate taxa as outgroups for their analysis of crassiclitellate phylogeny based on 18S data, but only used an enchytraeid for most other analyses (including combined analyses of multiple loci). Their 18S Bayesian phylogeny (Fig. 1 in [13]) suggested that Haplotaxidae s. str. (represented by Haplotaxis gordioides) was sister to Metagynophora (Crassiclitellata + Moniligastridae), and our preliminary analyses of a broader sample of clitellate transcriptomes also suggested that members of Haplotaxidae are the closest extant relatives of Metagynophora (not shown). Haplotaxidae, with its currently recognized eight genera, is no longer considered to be monophyletic and has long been regarded as a “dustbin” for slender, primitive-looking clitellates [10, 14,15,16,17]. We chose representatives of four haplotaxid species, Lumbriculidae (Lumbriculus variegatus) and Propappidae (Propappus volki) as outgroups; P. volki was used to root the phylogeny. No leeches or branchiobdellidans were used in this study, for two reasons. First, previous work [13, 18] and preliminary analyses including several leech and branchiobdellidan transcriptomes supported a clade comprising Lumbriculidae, Branchiobdellida and Hirudinea. Second, all available leech and branchiobdellidan transcriptomes showed appreciably longer branch lengths on preliminary ML trees than did all other clitellates. Sampling only the relatively short-branch Lumbriculus variegatus allows this outgroup clade to be represented while avoiding potential confounding factors due to branch-length heterogeneity.

Table 1 Collection locality, museum location of voucher specimen, museum catalog number, SRA project number, number of Illumina reads, number of Trinity contigs and number of HaMSTr ortholog groups represented for each of the thirty-seven transcriptomes generated in this study
Fig. 1
figure 1

PhyloBayes 50%-majority-rule consensus phylogram for the 75% data set (59 loci, 16,458 amino acid characters, CAT-GTR model, 500-generation burn-in; see text for details). Posterior probabilities are shown at nodes; nodes without values have posterior probabilities of 1.0. Members of Metagynophora (Moniligastridae + Crassiclitellata) are highlighted in bold font; with Drawida sp. representing Moniligastridae. Transcriptomes downloaded from the Sequence Read Archive are labeled “(SRA)”. Crassiclitellate taxa are color coded by biogeographic region; Dichogaster saliens and Pontodrilus litoralis are cosmopolitan species. Dates for nodes labeled 1 and 2 were estimated with PhyloBayes; see text for details

The crassiclitellate samples represented all extant crassiclitellate families but one (Biwadrilidae) and at least 28 genera. Transcriptomes for thirty-one crassiclitellate taxa and all six outgroup taxa were generated as part of this study, and two additional crassiclitellate transcriptomes were assembled as described below from data in the Sequence Read Archive (SRA; for Hormogaster elisae (PRJNA196484) and Eisenia andrei (DRX021555). A transcriptome was also generated for a representative of Moniligastridae (Drawida sp.). Voucher specimens are deposited at the North Carolina Museum of Natural Sciences (NCSM), the Swedish Museum of Natural History (SMNH) and the Western Australian Museum (WAM) (Table 1).

Molecular techniques

Total RNA was extracted from RNAlater®-preserved samples using the Ambion RNAqueous®-Micro Total RNA Isolation kit. First-strand cDNA was constructed using the SMART® cDNA Library Construction Kit (Clontech Laboratories, Inc.), replacing the included 3′ primer with the Cap-TRSA-CV oligo [19]. We amplified double-stranded cDNA using the Advantage® 2 PCR Kit (Clontech Laboratories, Inc.). To minimize the risk of contamination, extractions and cDNA construction were performed in small batches of four tissue samples or fewer, and the workstation and tools were cleaned with bleach between each set of extractions. Where possible, we avoided sampling the external body surface and the gut to limit the potential for contamination from epibionts and gut contents (e.g., prey items and microorganisms).

Non-normalized cDNA libraries were sent to Hudson Alpha Institute for Biotechnology, Huntsville, Alabama USA for library preparation and 2 × 100–bp paired-end sequencing on an Illumina HiSeq 2000. Approximately one-sixth of a lane was used for each taxon.

Sequence assembly and processing

Raw PE Illumina reads were digitally normalized using khmer ( -C 30 -k 20 -N 4 -× 2.5e9) [20] and assembled using the October 5, 2012 release of Trinity [21]. We used TransDecoder ( to find open reading frames and translate nucleotide sequences into amino acid sequences that were at least 100 amino acids in length.

Dataset construction

Translated data for all 40 taxa were searched against the Lophotrochozoa pHMMs in HaMStR v.13.2.3 [22] using Helobdella robusta as reference species. We set HaMStR to output all sequences that fulfilled the reciprocity requirement and then used a custom script to generate FASTA-formatted files for each orthogroup that included all sequences and deleted duplicated contigs. Each orthogroup was then aligned with MAFFT (L-INS-i) [23].

One of the major difficulties in phylogenomic analysis—particularly when dealing with transcriptome data—is orthology assessment. Most animals harbor paralogous copies of many genes, but standard molecular phylogenetic analyses assume that data sampled from each taxon for each locus are orthologs. Failure to distinguish orthologs from paralogs can cause major problems in phylogenetic inference [24]. Given this, we used a tree-based approach to remove likely paralogs from our alignments. We inferred a maximum-likelihood (ML) tree for each aligned orthogroup with FastTreeMP [25] (under the –slow and –gamma settings), and used PhyloTreePruner [26] to screen each of the resulting trees. In PhyloTreePruner, nodes on each ML tree with SH-like local support values <0.7 were collapsed into polytomies, and the largest subtree was retained where each taxon was represented by either no sequences or only one sequence, unless all sequences for a given taxon formed part of a clade or part of the same polytomy (in which case, all were retained). Sequences falling outside this maximally inclusive subtree were assumed to be paralogs and were deleted from the data set. If multiple in-paralogs were initially retained, all but the longest sequence were subsequently deleted by PhyloTreePruner. This returned an alignment for each orthogroup that included (at most) a single, putatively orthologous sequence for each taxon. PhyloTreePruner was used to retain only orthogroups found in at least 25% (10 taxa), 50% (20 taxa), 75% (30 taxa) and 100% (40 taxa) of transcriptomes. All loci were subsequently realigned with MAFFT (L-INS-i). FASconCAT [27] was then used to concatenate orthogroups. The script ( was used to find the best-fitting amino-acid substitution model for each orthogroup (for downstream analyses using TreSpEx; see below) and for each concatenated data matrix. We chose not to use any automated alignment filtering methods (e.g., GBlocks [28]), due to concerns about their efficacy in improving phylogenetic inference [29].

Distantly related outgroups may be problematic for phylogenomic inference [30]. We used two approaches to explore the effect of outgroup sampling on estimates of ingroup relationships. First, we deleted Lumbriculus variegatus and Propappus volki (the two most distant outgroups in terms of summed branch length to the base of Crassiclitellata across analyses) and “?Haplotaxidae sp.” (a conspicuously long outgroup branch) from the set of transcriptomes prior to processing with the approach outlined above, leaving a total of 37 taxa. Following the approach outlined above, we used PhyloTreePruner to only retain orthogroups found in at least 25, 50 and 75% of the taxa (in this case, 10, 19 and 28 taxa, respectively). Second, we deleted only “?Haplotaxidae sp.” from the original set of transcriptomes, leaving a total of 39 taxa. For this data set, we processed the transcriptomes as described above, but used PhyloTreePruner to only retain orthogroups found in ≥75% of the taxa (i.e., 30 taxa). To assess the influence of sites with high percentages of gaps/missing data on our inferences, we produced two concatenated “no ?Haplotaxidae sp.” 75% data matrices. For one, we did no additional filtering. For the other, we used TrimAl v1.2 [31] to remove all sites comprising >50% gaps from each individual orthogroup alignment prior to concatenation and model testing. Amounts of missing data per taxon were calculated using TREE-PUZZLE 4.3 [32] for all matrices.

All data matrices, ML tree files, custom scripts and supplementary figures are available via the Dryad Digital Repository (

Long-branch effects and compositional heterogeneity

Differences in substitution rates and nucleotide/amino acid composition among lineages constitute two well-known confounding factors in phylogenetic analysis [33,34,35,36]. To assess potential impact of these factors on our inferences, TreSpEx.v1.1 [37] was used to calculate three measures of branch-length heterogeneity—the average patristic distance (PD), the standard deviation of the tip-to-root distance and the LB score (the mean pairwise PD of a taxon to all other taxa in the tree relative to the average pairwise PD over all taxa [37])—for each locus. Any single-gene alignment that had a value equal to or greater than 1.5 times the interquartile range above the median for any of these three indices was eliminated. Remaining loci were evaluated with BaCoCa v. 1.104r [38]. Data partitions (loci) with a p-value of less than 0.05 for a chi-square test of homogeneity were eliminated, as were all loci that were 1.5 times the interquartile range above the median RCFV value. RCFV measures the absolute deviation from the mean for each amino acid and taxon, in this case summed across taxa for each partition (locus); higher RCFVs indicate a higher degree of compositional heterogeneity in that partition [39]. TreSpEx and BaCoCa filtering was not applied to the 100% data set, which was already quite small in terms of number of loci (Table 2).

Table 2 Characteristics of all data matrices analyzed in this study

Maximum Likelihood (ML) analyses

Partitioned maximum-likelihood (ML) analyses were conducted with RAxML versions 8.1.24 and 8.2.3 [40] on CIPRES [41] with 1000 rapid bootstrap replicates, using the following options: -f a -x < random number seed for rapid bootstrapping; unique for each analysis > −p < random number seed for initial parsimony inferences; unique for each analysis > −# 1000 -m PROTGAMMA < amino acid model > −s < inputfile> − n < outputfile> (Table 2). Best-fitting amino acid substitution models were inferred for each locus and applied to each locus in RAxML by adding “-q < partitionfile>” to the command listed above. Identical random number seeds for rapid bootstrapping and parsimony inferences were used for the two “no ?Haplotaxidae sp.” 75% matrices (one that was not cleaned with TrimAl and one from which sites with >50% gaps were removed, both filtered with TreSpEx and BaCoCa) to allow a direct comparison of tree topologies for these two matrices.

We used SuperQ v.1.1 [42] to visualize topological conflict among loci for the 25, 50 and 75% unfiltered data sets. SuperQ rescales the partial, unrooted ML gene trees for each data set to produce comparable branch lengths, decomposes the trees into weighted quartet trees and employs the QNet algorithm to produce a split network from the quartet trees. We used the Gurobi optimizer to calculate initial split weights and optimize the weights under the “balanced” objective function. We used SplitsTree v.4.14.4 [43] to visualize the resulting networks.

Bayesian Inference (BI) analyses

Site-heterogeneous Bayesian Inference (BI) analyses of the 25, 50, and 75% data sets and for the two filtered “no ?Haplotaxidae sp.” 75% matrices (one that was not cleaned with TrimAl and one from which sites with >50% gaps were removed) were conducted with PhyloBayes-MPI v1.5a [44] under the CAT-GTR model with two independent chains and gamma-distributed rates on CIPRES. Analyses were allowed to run for up to 168 h (the CIPRES limit), constant sites were removed, and four categories were used for the discrete gamma distribution. Convergence checks were conducted automatically every 1800 s and analyses were terminated early if after a burn-in of 500 cycles, the minimum effective size exceeded 50, and the “maxdiff” value between chains was less than 0.1. For runs that terminated due to reaching the time limit, convergence of parameter estimates and topologies across chains was assessed by evaluating the basecomp and tracecomp files produced by PhyloBayes and via visual inspection of trace files in Tracer v1.6 [45].

Topology tests

Tree topologies recovered in our analyses contradicted previous hypotheses regarding the monophyly of Dichogaster (see below). The Shimodaira-Hasegawa and approximately unbiased tests [46, 47] are often used to evaluate particular topological hypotheses (including at least one hypothesis chosen a posteriori), but these tests are actually designed to evaluate whether all topologies in a plausible set of topologies are equally good explanations of the data, rather than to compare specific alternative topologies [48]. Fortunately, the parametric bootstrapping (SOWH) test [48, 49] and Bayesian topology tests [50] are both appropriate in this context.

We used SOWHAT [51] to perform SOWH tests to test Dichogaster monophyly. SOWH tests require two ML analyses—an unconstrained analysis and an analysis in which the topology is constrained to match a particular alternative hypothesis. The difference in likelihoods between the trees resulting from each analysis (δ) constitutes the test statistic for the SOWH test. The ML topology and branch lengths from the constrained analysis are then used to simulate a large number of data sets using the model parameter estimates for the constrained ML topology and original data. We provided SOWHAT with a Dichogaster monophyly constraint (forcing monophyly of the three Dichogaster transcriptomes) in Newick format and a reduced data set in which three distant/long-branch outgroup taxa (Propappus volki, Lumbriculus variegatus and ?Haplotaxidae sp.) were removed, retaining only orthogroups found in at least 28 of the transcriptomes, emulating the 75% data set described above. SOWHAT called Seq-Gen 1.3.2 [52] to simulate 100 data sets and RAxML 8.2.8 [40] to infer topologies for each simulated data set in an unconstrained and constrained ML analysis. SOWHAT calculates confidence intervals around a SOWH test p-value after addition of each replicate to determine if the sample size of the test was adequate.

For Bayesian topology tests, we used the posterior sample of trees generated in the PhyloBayes CAT-GTR analysis of the 75% data set to estimate posterior model odds for alternative topological hypotheses, following suggestions by Bergsten et al. [50]. We calculated posterior model odds by dividing the frequency of trees in the post burn-in sample of trees that support one hypothesis (e.g., Dichogaster is not monophyletic) by the frequency of trees that support the alternative hypothesis (e.g., Dichogaster is monophyletic; all three Dichogaster transcriptomes form a clade).

Divergence time estimation

Unfortunately, the dearth of fossils that can be attributed to earthworms [53, 54] presents a challenge for estimating divergence times, but there are some relevant fossils as well as some previous dating studies on earthworms. Putative earthworm trace fossils (burrows or casts) have been recovered from the Triassic [55], with possible body fossils in the Paleocene [56]. Possible clitellate body fossils have been recovered from Permian deposits [57], and fossil leech cocoons are known from the late Triassic [58]. Finally, a molecular study of hormogastrid earthworms (calibrated using the separation of the Corso-Sardinian microplate from continental Europe) suggests that they radiated in the Late Cretaceous [59]; if this is correct, the common ancestor of all crassiclitellates must have arisen much earlier.

These fossils and inferences give us a set of calibration points that we can use to estimate dates for key divergences within our phylogenies. We performed dating analyses for three data matrices: the unfiltered 75% data set (including ?Haplotaxidae sp.) and two versions of the 75% data matrix that did not include ?Haplotaxidae sp. (one with all sites and the other with sites containing >50% gaps removed, both filtered with TreSpEx and BaCoCa as described above) in PhyloBayes 3.3f [60]. In each case, we used the CAT-GTR PhyloBayes majority rule consensus tree for each data matrix as a fixed topology. We ran four independent chains for each data set, sampling every ten cycles, under the CAT-GTR substitution model with gamma-distributed rates, a lognormal autocorrelated relaxed clock model and a uniform prior on divergence times.

We used three calibration points/ranges in our analyses—the oldest known leech cocoon fossil (201 Mya) [58], the divergence of Hormogastridae (67–97 Mya) [59] and a minimum age estimate for crown-group Annelida of 520 Mya (based on the earliest known—probably stem-group—polychaetes from the Sirius Passet deposit of North Greenland; [61,62,63,64]). Though we did not include leeches in our analyses, previous studies have supported a sister-group relationship between leeches and their allies (branchiobdellidans and Acanthobdella) and Lumbriculidae [13, 18], providing a minimum age for divergence of the Lumbiculidae + Hirudinea clade and Crassiclitellata based on the earliest fossil cocoons attributable to leeches. We used 67 Mya as a minimum age and 97 Mya as a maximum age for the deepest divergence within Hormogastridae as represented in our data matrices [59] (the node subtending Hemigastrodrilus monicae and Vignysa popi/Hormogaster elisae; a recent phylogenomic study of Hormogastridae [65] corroborates this pattern of relationships). Finally, we argue that a minimum age of crown-group Annelida (520 Mya) is suitable as a maximum age constraint for the root of our phylogeny, because no evidence of clitellates is known prior to the Permian, and the root of our phylogeny is deeply nested within Clitellata, which is itself deeply nested within the annelid crown group.

The calibration for the divergence between Lumbriculidae and Hirudinea (201 mya) was treated as a hard upper bound, with the lower bound modeled as a truncated Cauchy distribution (p = 0.1 and c = 1). We placed uniform priors of 67–97 mya and 201–520 mya on the Hormogastridae divergence and the root node, respectively. Convergence was assessed with estimated sample sizes and visual inspection of parameter traces in Tracer v1.6. To assess whether the priors conditional on our calibrations match our intended prior distributions, we ran PhyloBayes under the prior and our calibrations using the F81 model without rate variation across sites (these model parameters do not factor into the prior over divergence times) and visually inspected the results.

We focused on divergence times for two nodes in our phylogeny that separated Northern and Southern Hemisphere subclades—1) a node separating Kynotus pittarelli (Madagascar) and a clade comprising Sparganophilus sp. and Komarekiona eatoni (both found in eastern North America) and 2) a node separating a Northern Hemisphere clade comprising Lutodrilus (North America) and Lumbricoidea (Criodrilidae, Hormogastridae, Lumbricidae) (Europa and Asia) and a primarily Southern Hemisphere clade comprising representatives of Almidae, Acanthodrilidae, Eudrilidae, Glossoscolecidae, Megascolecidae, Microchaetidae and Ocnerodrilidae (Africa, Australia, New Zealand and South America). We hypothesized that these divergences may be due to vicariance during the breakup of Pangaea starting in the late Triassic to early Jurassic (~200–185 Mya) [66, 67]; divergence time estimation using molecular data allows a test of this hypothesis.

Ideally, we would also infer dates using a Bayesian method such as BEAST [68], but preliminary analyses suggested that the computational demands of inferring divergence times for our data in this manner would be prohibitive.


Transcriptomes for thirty-one crassiclitellates, one moniligastrid, and six outgroup taxa were generated as part of this study (Table 1; Additional file 1: Figure S1) and are available from the SRA at NCBI under BioProject accession number PRJNA362879. We added publicly available transcriptome data for two additional crassiclitellates—Hormogaster elisae (PRJNA196484) and Eisenia andrei (PRJDB3115)—for a total of forty transcriptomes. Four gene sets from these transcriptomes were analyzed, reflecting different levels of gene occupancy.

Ingroup relationships

We filtered our data matrices to attempt to account for several issues known to cause problems in phylogenomic analysis. First, we built four data sets representing different levels of missing data, ranging from a data matrix with a high number of genes but also a high amount of missing data (i.e., the 25% data set) to a data matrix with a very low number of genes and very little missing data (i.e., the 100% data set) (Table 2). Second, we attempted to improve substitution model fit by partitioning our matrices by orthogroup (gene) and inferring best-fitting substitution models for each gene and also by using a site-heterogeneous model (CAT-GTR) in PhyloBayes. Third, we eliminated subsets of loci that showed high levels of branch-length heterogeneity (which could be due to either the presence of previously undetected paralogs or substitutional rate differences among taxa) and amino-acid compositional heterogeneity with TreSpEx and BaCoCa; trees for one set of genes that passed through this filter generally show low levels of branch-length heterogeneity (Additional file 2: Figure S2). Amounts of missing data per taxon varied widely among taxa and across matrices, ranging from a low value of 1.57% (Hormogaster elisae, 75% no ?Haplotaxidae sp., gappy sites removed matrix) to a high value of 93.32% (Alma sp., 25% unfiltered matrix) (Table 3). Alma sp. had the most missing data, followed by Fimoscolex sp. and Glossoscolex sp. (Table 3; Additional file 1: Figure S1).

Table 3 Percent missing data per taxon across all matrices analyzed in this study, calculated using TREE-PUZZLE 5.3

Across this array of data matrices and analyses, inferred patterns of relationships within Crassiclitellata were largely congruent and well supported (Figs. 1 and 2). Bayesian analyses under the CAT-GTR model were attempted for the unfiltered 25, 50, and 75% data sets and the two filtered “no ?Haplotaxidae sp.” 75% matrices in PhyloBayes, but examination of PhyloBayes output for the 25% data set in Tracer confirmed that it did not converge. For the 75% data set, two chains ran for an average of 17,890 cycles; for the 50% data set, two chains ran for an average of 8915 cycles. The filtered “no ?Haplotaxidae sp.” 75% matrix with all sites ran for an average of 18,465 cycles; the filtered “no ?Haplotaxidae sp.” 75% matrix with >50% gap sites deleted ran for an average of 29,205 cycles (Additional file 3: Figure S3). The largest discrepancy observed across all bipartitions was <0.1, the maximum discrepancy between the chains was ~0.3 and the effective sample sizes for all parameters were >50, all suggesting an acceptable PhyloBayes run for the 75% data set (the same was true for the analysis of the 50% data set, except that the maximum discrepancy was <0.5). The 75 and 50% PhyloBayes majority-rule consensus topologies are almost identical except for the position of Drawida sp. (recovered as sister to the clade comprising all crassiclitellates except Kynotus, Komarekiona and Sparganophilus in the 50% tree), so only the 75% PhyloBayes tree is shown (Fig. 1).

Fig. 2
figure 2

a Strict consensus tree of ML trees for unfiltered, filtered (with TreSpEx and BaCoCa) and deleted outgroup 75, 50 and 25% data matrices and two 75% data matrices from which ?Haplotaxidae sp. was deleted (one with all sites included, the other with sites with >50% gaps deleted) (eleven analyses/trees total). Numbers at nodes highlight well-supported discrepancies across analyses. Taxa in bold black or gray were not included in the “deleted outgroup” analyses. b Resolutions of conflicting clades across eleven analysis/matrix combinations, with ML bootstrap values or ranges shown

Within Crassiclitellata, several clades were consistently recovered, including a clade comprising Kynotidae (Madagascar), Sparganophilidae and Komarekionidae (both found in the southern and eastern United States) as sister to the rest of Crassiclitellata. All analyses revealed a deep split between two major clades, one with a largely Northern Hemisphere (Laurasian) distribution (Lumbricoidea sensu James and Davidson [12]; Lumbricidae, Hormogastridae, Criodrilidae and Lutodrilus multivesiculatus) and the other with a primarily Southern Hemisphere (Gondwanan) distribution (Microchaetidae, Rhinodrilidae, Almidae, Glossoscolecidae, Eudrilidae and Megascolecoidea sensu James and Davidson [12]) (note that Rhinodrilidae was mistakenly given the name Pontoscolecidae in James and Davidson [13]; this error was subsequently corrected [69].

Data partitioning, deleting long-branch outgroup taxa (e.g., ?Haplotaxidae sp.) and removal of loci that showed variable rates of change in different lineages or signs of compositional heterogeneity had little impact on trees resulting from analyses of the 25, 50, and 75% data sets. Relationships recovered in analyses of the 100% data set were poorly supported, likely reflecting the very small size of this data set (7 loci, <2000 amino acids; Table 2), and will not be discussed in detail. Across the 25, 50, and 75% data sets, relationships differed in only two ingroup clades. In one of these cases (relationships among Maoridrilus, Parachilota and Acanthodrilidae sp.), bootstrap support for any particular resolution was low across all analyses (Fig. 2). In the other, support for Hormogastridae (represented here by Hemigastrodrilus, Vignysa and Hormogaster) and a Scherotheca + Eisenia pairing increased as the number of genes (and amount of missing data) increased (Fig. 2). Bootstrap support for both of these clades was >90% across all data sets from which one or more distant/long-branch outgroup taxa were deleted. With the exception of these cases, all ingroup relationships were identical across all trees, whether or not ?Haplotaxidae sp. was included, sites containing >50% gaps were deleted, the data were filtered with TreSpEx and BaCoCa or analyzed with a site-heterogeneous model in PhyloBayes.

The split networks produced with SplitsTree from the SuperQ lists of weighted splits generally reflect the tree-like structure recovered in concatenated ML and Bayesian analyses, but also show substantial incongruence among loci in all three unfiltered data sets (Fig. 3; the three supernetworks are very similar, so only the supernetwork for the 75% data set is shown).

Fig. 3
figure 3

Quartet-based supernetwork based on all single-gene ML trees for the 75% data set (49 genes). Geographically delineated subclades of the main crassiclitellate clade (the clade subtending node 2 in Fig. 1) are circled

Outgroups and basal relationships

Despite the high stability and levels of bootstrap support for relationships within the ingroup across analyses, positions of some outgroup taxa and the lone representative of Moniligastridae (Drawida) varied among data sets and analyses (Fig. 2). Based on previous analyses [13], we expected Propappus volki to be a suitable distant outgroup to root our phylogeny, with Lumbriculidae and Haplotaxidae forming successively closer outgroups to Crassiclitellata. However, most analyses failed to recover this pattern of relationships. Partitioned ML analyses of the 25% through 75% data matrices supported a paraphyletic Crassiclitellata and Metagynophora (due to the inclusion of Pelodrilus sp., an alleged haplotaxid) as well as a doubly paraphyletic Haplotaxidae (due to the inclusion of both Metagynophora and Lumbriculus variegatus) (Fig. 2). Recovery of topologies in which Lumbriculus variegatus is more closely related to Metagynophora than Haplotaxis gordioides is strains credulity; Lumbriculus is a member of Lumbriculidae, a clitellate group that previous molecular and morphological phylogenetic reconstructions (18S data; [13, 18]) suggest is more closely related to leeches (Hirudinida) than to haplotaxids or crassiclitellates.

Our outgroup sampling was designed to test crassiclitellate monophyly and root Crassiclitellata, not to infer deep-level relationships among major clitellate taxa. As such, unexpected relationships among outgroups may not be surprising, but failure to recover Crassiclitellata is of greater concern—in some trees (e.g., based on partitioned ML analyses of the 25 and 50% data sets), Drawida (Moniligastridae) was found to be nested within Crassiclitellata, usually as sister to a clade comprising all earthworms except Komarekiona, Kynotus and Sparganophilus (Fig. 2). Removal of outlier loci detected by TreSpEx and BaCoCa did not consistently recover expected relationships among the outgroup taxa, nor did it consistently yield a monophyletic Crassiclitellata across data sets (Fig. 2b).

Elimination of potentially problematic loci is one way to explore the impact of systematic bias and possibly improve inferences; elimination of potentially problematic taxa is another. Inclusion of distant outgroups can perturb phylogenomic analyses, particularly with respect to basal ingroup relationships [30]. Cursory visual inspection of our trees revealed that one of the haplotaxids in our data sets—?Haplotaxidae sp.—is a rather long-branch taxon, and this could be confounding our results. To test this, we eliminated ?Haplotaxidae sp. alone, or the two (putatively) most distant outgroup taxa in our data sets (Propappus and Lumbriculus) and ?Haplotaxidae sp. Unfortunately, despite the seemingly positive impact of outgroup deletion on inference of some ingroup relationships (see above), analyses of these matrices failed to clarify basal crassiclitellate relationships, usually yielding trees in which either Pelodrilus sp. or Drawida sp. was weakly supported as sister to the Komarekiona + Kynotus + Sparganophilus clade (Fig. 2).

By contrast, PhyloBayes analysis of the unfiltered 75% data set recovered both a monophyletic Crassiclitellata and a monophyletic Metagynophora, though the posterior probability of Crassiclitellata was low (0.61) (Fig. 1). Assuming Crassiclitellata and Metagynophora are, indeed, monophyletic, our PhyloBayes results suggest that accounting for site-specific substitution processes, if computationally feasible, rather than simply partitioning by gene, can yield improved inferences.

Topology tests

In the SOWH test of Dichogaster monophyly, the observed δ test statistic was 4148.083, and Dichogaster monophyly was rejected (p-value <0.01, 95% confidence interval = 0.03621669–0). No trees in the post burn-in sample of 2578 trees from PhyloBayes include a monophyletic Dichogaster, making the posterior model odds in favor of a non-monophyletic Dichogaster infinite.

Divergence times

We ran PhyloBayes under the prior for 370,000+ cycles for the unfiltered 75% matrix, and visual inspection of the output suggests that the induced prior distributions for the root and nodes of interest are non-informative. The dating analysis of the 75% data matrix ran for an average (across four independent chains) of 21,500 cycles, the analysis of the “no ?Haplotaxidae sp.” 75% data matrix, filtered with TreSpEx and BaCoCa with no sites deleted, ran for an average of 16,660 cycles, and the analysis for the “no ?Haplotaxidae sp.” 75% data matrix, filtered with TreSpEx and BaCoCa with sites containing >50% gaps deleted, ran for an average of 18,000 cycles. For all three analyses, inspection of the four chains in Tracer suggested that a 10% burn-in was appropriate, and all ESS values were above 200. The consensus tree topologies differ slightly across the three analyses, most notably in that Drawida sp. is recovered as sister to all crassiclitellates in the unfiltered 75% consensus tree (Fig. 2), but it is recovered as sister to the Komarekiona + Kynotus + Sparganophilus clade in the consensus trees for the 75% “no ?Haplotaxidae sp.” analyses (Additional file 3: Figure S3). Date estimates for each node across all four independent chains were within 2% of each other, so only results from the first chain are reported for each data matrix (Table 4); chronograms are presented in Additional file 4: Figure S4.

Table 4 PhyloBayes divergence time estimates (mean ± standard error) in millions of years ago for three data matrices for two key nodes in the earthworm radiation—the node separating Kynotus from Sparganophilus + Komarekiona (node 1, Fig. 1) and the node separating the Northern Hemisphere clade comprising Lutodrilus and Lumbricoidea and the clade comprising Southern Hemisphere representatives of several families (node 2, Fig. 1)


Crassiclitellata systematics

The convergence of results from multiple approaches on a consistent topology (Figs. 1, 2 and 4) provides us with a strong framework for understanding earthworm phylogeny and evolutionary relationships. Our data lend substantial support to revisions of the classical, intuitive, understandings of earthworm phylogeny proposed by James and Davidson [13]. Among the more striking revisions supported here is the placement of Kynotidae (endemic to Madagascar) within a group also containing the exclusively North American families Komarekionidae and Sparganophilidae, rather than sister to or nested within Microchaetidae (South Africa) [70]. Unfortunately, we were unable to obtain suitable material of Biwadrilus bathybates, the sole representative of the monotypic family Biwadrilidae from Japan, for transcriptome sequencing; previous work suggests B. bathybates may be sister to Kynotidae [13], a relationship that, if supported, reflects a paleogeography no longer obvious from Pangaean or post-Pangaea continental configurations.

Fig. 4
figure 4

Summary tree of relationships among the taxa included in this study, highlighting the names of relevant higher taxa. The topology shown is a strict consensus of all trees recovered for the 25, 50 and 75% data matrices generated in this study (filtered with TreSpEx and BaCoCa and unfilitered, with and without long-branch outgroups, etc.), with families and major taxa highlighted. The two numbered nodes were the focus of divergence time estimation. See Fig. 2 and text for details

A second important revision to classical earthworm thinking supported by the current study is the placement of the monotypic families Lutodrilidae and Criodrilidae as successive sister taxa to the Hormogastridae + Lumbricidae clade. Both Lutodrilidae and Criodrilidae are aquatic, as is Almidae, and these three families show strong morphological similarities of body form (quadrangular tail segments), color (dusky gray with blue-green in the head segments) and clitellum length (extraordinarily long, tens of segments rather than the usual 3 to 10 or so seen in most terrestrial earthworms). The finding that the two closest relatives of the clade comprising the predominant earthworms of Europe (Lumbricidae and Hormogastridae) are aquatic suggests a possible aquatic ancestor for European earthworms. Typically, aquatic earthworms lack dorsal pores, but most members of Lumbricidae have them, as do members of the crown clade Megascolecoidea (represented here by representatives of Dichogaster, Maoridrilus, Parachilota and Pontodrilus, along with a thus-far-unidentified acanthodrilid from Madagascar, Place Kabary 2 sp.). Microchaetidae through Eudrilidae and Ocnerodrilidae (represented here by Kerriona) lack dorsal pores, with rare exceptions in the last family [71], indicating that dorsal pores probably evolved independently at least twice.

In the current study, placement of the only member of Ocnerodrilidae, Kerriona sp. Graciosa 1, as sister to a clade composed of Dichogaster saliens (Benhamiidae) and the acanthodriline Place Kabary 2 sp. is unusual and, if validated with a larger sampling of Ocnerodrilidae, would be a major change in the systematics of Megascolecoidea. Traditionally Ocnerodrilidae is considered to be close to acanthodriline earthworms (Acanthodrilidae, Benhamiidae, and “Octochaetidae”), because they share similar male reproductive apparatuses composed of prostate glands associated with the male gonopores (cf. [21,22,23]). They are also morphologically similar in a few respects to the African Eudrilidae.

The status of Dichogaster is uncertain from the present results, perhaps largely due to the inclusion of New World species, which have not been included in previous phylogenetic efforts. Two of the three sampled Dichogaster species, from Guadeloupe (French West Indies) and from an arboreal epiphyte root mass north of Manaus, Brazil (Dichogaster green tree worm), are clearly separated from D. saliens, historically endemic to Africa, and both a SOWH test and a Bayesian topology test strongly reject Dichogaster monophyly. The latter species has previously been included in a highly supported African and south Pacific Dichogaster clade, within the also highly supported Benhamiinae [12]. Morphologically, the New and Old World Dichogaster species share many derived characters, but differ on a few points [72]. The geographic distribution of the genus (equatorial Africa, north Neotropics, northern South America, South Pacific) remains enigmatic in the absence of a well resolved and more broadly sampled phylogeny of Benhamiinae.

The classically defined Glossoscolecidae was separated into Rhinodrilidae (“Pontoscolecidae”) and a restricted Glossoscolecidae based on a weakly supported node in the topology recovered in [12]. That node had Almidae intervening between the two families. Our results confirm that node with strong support, suggesting that Almidae is probably secondarily aquatic given that Glossoscolecidae and Rhinodrilidae are predominantly terrestrial. We hypothesize that the common ancestor of Almidae and Rhinodrilidae occurred at a time when paleocontinents made possible the occupation of South American, African and Asian landmasses; South America would seem to be the most probable area of origin for Almidae.

The current study confirmed relationships within Lumbricoidea put forth by [12], and resolved an outstanding conflict about Hormogastridae, which was found to be monophyletic in [65] but paraphyletic or unresolved due to the placement of Hemigastrodrilus in [12]. Although analyses of the 75% data set support paraphyly of Hormogastridae, analyses of the 25 and 50% data sets, as well as all “deleted outgroup” data sets, return a monophyletic Hormogastridae (Fig. 2).

Despite the consistent topological patterns seen across all analyses, supernetwork visualization revealed high levels of interlocus conflict (Fig. 3). Some regions of high incongruence—e.g., near the base of Crassiclitellata—are unsurprising, given that concatenated analyses of different data sets recover different relationships in this region of the tree. However, the networks also show a higher level of conflict among loci along the backbone of the Southern Hemisphere subclade than in the Northern Hemisphere group. The reasons for this are unclear, but more taxa were sampled from the Southern Hemisphere clade, and branches in this group on both the network (Fig. 3) and, less obviously, on the PhyloBayes tree (Fig. 1) are generally longer.

Pangaean earthworms?

No known earthworm fossils exist. Although several ichnofossils have been attributed to earthworm-like organisms, these traces provide little or no concrete information about the clade membership of the author of any hole, burrow, fecal material or other fossilized biostructure made by an elongated soft-bodied invertebrate. However, we can make some inferences about the age of earthworm clades based on the biology and distributions of extant earthworm species and the results of our dating analyses. First, transoceanic movement of adult crassiclitellates seems unlikely except for a few cases where species have become salt-water tolerant inhabitants of marine littoral zones (e.g., Pontodrilus litoralis). Transoceanic dispersal of earthworms is nonetheless a possibility over geological time scales—such dispersal events have been inferred for other subterranean terrestrial animals (e.g., amphisbaenians; [73]), earthworm cocoons may be dispersed via rafting or by birds, and earthworms are known from many islands. Second, current earthworm distributions show a high degree of congruence with post-Pangaean continental movements [6, 70]. Third, current earthworm distributions generally show high degrees of local endemicity in topographically complex landscapes, and even in non-complex areas in some lowland tropical forests [74].

Relative ages of New Zealand earthworm clades are comparable to those of continental earthworm faunas [75], and multiple sister-group relationships span large distributional gaps (e.g., New Zealand-Madagascar and trans-Pacific relationships between Australia and North and Central America). Lumbricidae (Eurasia, North America) has been estimated to be about 125 million years old [76] using biogeographic calibrations, while the split between Lumbricoidea and earthworm families on the branch leading to Megascolecoidea was previously estimated at about 200 MYA, the Triassic-Jurassic boundary, coinciding with the separation of Laurasia from Gondwana [77]. The latter split is also present in our trees, with comparable taxon sampling. Our date estimates for this node are somewhat more recent (ranging from 161 to 186 mya, depending on the data matrix; Table 4), but the standard errors on these estimates (±21- ± 23 my) are substantial. However, deletion of the long-branch outgroup taxon ?Haplotaxidae sp. yielded earlier divergence times for this node (~178 mya with >50% gap sites excluded, ~186 mya with all sites) that are more concordant with the breakup of Pangaea.

Recovery of a sister-group relationship between a Laurasian clade and a Gondwanan clade is not unprecedented; similar patterns have been seen in crayfish (Astacoidea and Parastacoidea) [78], dragonflies (Petaluridae) [79], stoneflies (Arctoperlaria and Antarctoperlaria) [80], mayflies (Ephemerelloidea) [81] and squeak beetles (Hygrobiidae) [82]. Within Lumbricidae, the split between European Eisenia and a North American clade containing Eisenoides and others (not sampled in this study) may be consistent with the final separation of the two continents at ~72 MYA [77].

The clade containing Komarekiona, Sparganophilidae, Kynotidae and Biwadrilidae (the latter not sampled in this study) also shows some sign of a northern continent / southern continent split, which also suggests a Pangaean distribution. As above, our divergence time estimate for the split between Kynotus and Komarekiona + Sparganophilus using the 75% data set (165 ± 22 mya) is more recent than we might expect if the Pangaean hypothesis was true. Once again, however, estimates based on the “no ?Haplotaxidae sp.” data matrices (178 and 186 mya) are deeper in time, and the congruence in divergence time estimates for the two focal nodes is noteworthy (for the “no ?Haplotaxidae sp.” data matrices, the mean estimates are within one million years of each other). These are very small families, two from North America and one each from Madagascar and Japan, respectively. There is little evidence to lead us to a hypothesis about the geographic location of the ancestor of Crassiclitellata. Moniligastridae, the sister group of Crassiclitellata represented in this study by Drawida, is now only found in South and East Asia. It is extremely diverse in India, but less so elsewhere in Asia as far east as Borneo [83] and Mindoro Island, Philippines (James, unpublished data), an Asian crustal fragment. Based on this distribution, the Moniligastridae–Crassiclitellata divergence would seem most likely to have occurred in a southern landmass.

Our dating analyses seem to be broadly consistent with the hypothesis that the two major “north-south” divergences within Crassiclitellata were caused by the breakup of Pangaea, but they do not constitute a particularly strong test. Additional data and more thorough dating analyses will be required to provide a more rigorous test of the Pangaean breakup hypothesis.

There remain several unanswered questions about the evolutionary history of Clitellata. Within the former, for example, we do not yet have a clear picture of the sister group to Crassiclitellata, nor have we robust support for crassiclitellate monophyly using the data presented here. The shared presence of a multi-layered clitellum remains the strongest evidence for crassiclitellate monophyly, but the possibility of multiple origins of this trait cannot be disregarded. Ongoing phylogenomic work on Clitellata as a whole should shed substantial light on this question.


This study clarifies earthworm phylogeny and evolution, supporting several recently proposed revisions to our understanding of earthworm relationships and resolving others, most notably including 1) placement of Kynotidae (Madagascar) with a group containing the North American taxa Komarekionidae and Sparganophilidae, 2) a clade comprising Lutodrilidae, Criodrilidae, Hormogastridae and Lumbricidae, 3) Dichogaster paraphyly, 4) affirmation of a restricted Glossoscolecidae and 5) Hormgastridae monophyly. Recovery of two major clades, each consisting of a Northern Hemisphere subclade and a Southern Hemisphere subclade, suggested a major role for vicariance (specifically, the breakup of Pangaea during the Mesozoic) in earthworm phylogeny and biogeography. Divergence time estimation provided additional support for this hypothesis, dating the north-south splits within each major clade to ~161–185 Mya.