10.5061/DRYAD.8MP17
Fernández, Rosa
Harvard University
Edgecombe, Gregory D.
Natural History Museum
Giribet, Gonzalo
Harvard University
Data from: Exploring phylogenetic relationships within Myriapoda and the
effects of matrix composition and occupancy on phylogenomic reconstruction
Dryad
dataset
2016
Symphyla
Diplopoda
Gene tree
node calibration
missing data
Chilopoda
2016-05-02T13:17:16Z
2016-05-02T13:17:16Z
en
https://doi.org/10.1093/sysbio/syw041
6080322 bytes
2
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Myriapods, including the diverse and familiar centipedes and millipedes,
are one of the dominant terrestrial arthropod groups. Although molecular
evidence has shown that Myriapoda is monophyletic, its internal phylogeny
remains contentious and understudied, especially when compared to those of
Chelicerata and Hexapoda. Until now, efforts have focused on taxon
sampling (e.g., by including a handful of genes from many species) or on
maximizing matrix size (e.g., by including hundreds or thousands of genes
in just a few species), but a phylogeny maximizing sampling at both levels
remains elusive. In this study, we analyzed 40 Illumina transcriptomes
representing 3 of the 4 myriapod classes (Diplopoda, Chilopoda, and
Symphyla); 25 transcriptomes were newly sequenced to maximize
representation at the ordinal level in Diplopoda and at the family level
in Chilopoda. Ten supermatrices were constructed to explore the effect of
several potential phylogenetic biases (e.g., rate of evolution,
heterotachy) at 3 levels of gene occupancy per taxon (50%, 75%, and 90%).
Analyses based on maximum likelihood and Bayesian mixture models retrieved
monophyly of each myriapod class, and resulted in 2 alternative
phylogenetic positions for Symphyla, as sister group to Diplopoda +
Chilopoda, or closer to Diplopoda, the latter hypothesis having been
traditionally supported by morphology. Within centipedes, all orders were
well supported, but 2 deep nodes remained in conflict in the different
analyses despite dense taxon sampling at the family level. Relationships
among centipede orders in all analyses conducted with the most complete
matrix (90% occupancy) are at odds not only with the sparser but more
gene-rich supermatrices (75% and 50% supermatrices) and with the matrices
optimizing phylogenetic informativeness or most conserved genes, but also
with previous hypotheses based on morphology, development, or other
molecular data sets. Our results indicate that a high percentage of
ribosomal proteins in the most complete matrices, in conjunction with
distance from the root, can act in concert to compromise the estimated
relationships within the ingroup. We discuss the implications of these
findings in the context of the ever more prevalent quest for completeness
in phylogenomic studies.
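The occupancy thresholds mentioned above (50%, 75%, and 90%) can be illustrated with a minimal sketch. It assumes that occupancy means the fraction of sampled taxa in which a gene is present, and that a gene enters a supermatrix when its occupancy meets the threshold. The function name, the toy taxa, and the toy genes are hypothetical and are not taken from the deposited data or the authors' pipeline.

```python
from typing import Dict, List, Set


def filter_genes_by_occupancy(
    presence: Dict[str, Set[str]],
    taxa: List[str],
    threshold: float,
) -> List[str]:
    """Return the genes present in at least `threshold` (0..1) of `taxa`.

    `presence` maps a gene name to the set of taxa in which it was found.
    This is an illustrative definition of occupancy, not the authors' code.
    """
    if not taxa:
        return []
    taxon_set = set(taxa)
    kept = []
    for gene, present_in in sorted(presence.items()):
        # Count only taxa that belong to the sampled set.
        occupancy = len(present_in & taxon_set) / len(taxon_set)
        if occupancy >= threshold:
            kept.append(gene)
    return kept


# Toy data (hypothetical): four taxa, four genes with decreasing coverage.
taxa = ["t1", "t2", "t3", "t4"]
presence = {
    "geneA": {"t1", "t2", "t3", "t4"},
    "geneB": {"t1", "t2", "t3"},
    "geneC": {"t1", "t2"},
    "geneD": {"t1"},
}

# One gene list per occupancy level; stricter thresholds keep fewer genes.
matrices = {t: filter_genes_by_occupancy(presence, taxa, t) for t in (0.5, 0.75, 0.9)}
for level, genes in matrices.items():
    print(f"{level:.0%} occupancy: {genes}")
```

Under this definition the gene sets are nested: every gene kept at 90% is also kept at 75% and 50%, which is why the sparser matrices are the more gene-rich ones.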
Figure S1. Phylogenetic hypothesis of the interrelationships of
Pauropoda, Symphyla, Diplopoda and Chilopoda inferred from a 4-gene data
set using ExaBayes.
FigS1_tree_with_pauropoda.pdf
Figure S2. Phylogenetic informativeness profile of the genes included in
supermatrices III (a), VI (b) and V (c) during the interval of time
corresponding to the diversification of all myriapod clades (as delimited
by blue lines in the ultrametric tree in (a)).
FigS2.pdf
Figure S3. Phylogenetic hypothesis of Myriapoda based on 232 morphological
characters coded for both extant and extinct species (see Material and
Methods for further details). Strict consensus of 2515 shortest cladograms
(420 steps). Fossil taxa are marked with a dagger symbol.
FigS3.pdf
Figure S4. Topologies recovered in the SAW analyses to test for LBA
between a) Symphyla and Polyxenida, and b) Scutigeromorpha and
Craterostigmomorpha.
FigS4_new.pdf
Figure S5. BaCoCa results showing compositional homogeneity values per
taxon and per gene in supermatrices II and III (results in supermatrix I
not shown due to a lack of resolution related to the high number of genes
included in the matrix).
FigS5_Comp_Homog.pdf
Figure S6. Supernetwork representation of quartets derived from individual
ML gene trees, for the genes concatenated in supermatrices I (a), II (b)
and III (c). Phylogenetic conflict is represented by reticulations and
short branches.
FigS6.pdf
Sources of codings and dates for fossils and supplementary references
Supplementary_Fossil codings.pdf