10.5061/DRYAD.RS449
Brown, Joseph W.
University of Michigan-Ann Arbor
Smith, Stephen A.
University of Michigan-Ann Arbor
Data from: The past sure is tense: on interpreting phylogenetic divergence
time estimates
Dryad
dataset
2017
Triassic
Angiospermae
information content
Fossil record
divergence time estimation
diptych
BEAST
marginal priors
National Science Foundation
https://ror.org/021nxhr62
NSF AVATOL Grant 1207915
2017-09-08T13:42:46Z
2017-09-08T13:42:46Z
en
https://doi.org/10.1093/sysbio/syx074
2876969 bytes
1
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Divergence time estimation — the calibration of a phylogeny to geological
time — is an integral first step in modelling the tempo of biological
evolution (traits and lineages). However, despite increasingly
sophisticated methods to infer divergence times from molecular genetic
sequences, the estimated ages of many nodes across the tree of life
contrast significantly and consistently with timeframes conveyed by the
fossil record. This is perhaps best exemplified by crown angiosperms,
where molecular clock (Triassic) estimates predate the oldest (Early
Cretaceous) undisputed angiosperm fossils by tens of millions of years or
more. While the incompleteness of the fossil record is a common concern,
issues of data limitation and model inadequacy are viable (if
underexplored) alternative explanations. In this vein, Beaulieu et al.
(2015) convincingly demonstrated how methods of divergence time inference
can be misled by both (i) extreme state-dependent molecular substitution
rate heterogeneity and (ii) biased sampling of representative major
lineages. These results demonstrate the impact of (potentially common)
model violations. Here, we suggest another potential challenge: that the
configuration of the statistical inference problem (i.e., the parameters,
their relationships, and associated priors) alone may preclude the
reconstruction of the paleontological timeframe for the crown age of
angiosperms. We demonstrate, through sampling from the joint prior (formed
by combining the tree (diversification) prior with the calibration
densities specified for fossil-calibrated nodes), that with no data present
at all, an Early Cretaceous age for crown angiosperms is rejected (i.e., has
essentially zero probability). More worrisome, however, is that for the 24
nodes calibrated by fossils, almost all have indistinguishable marginal
prior and posterior age distributions when employing routine lognormal
fossil calibration priors. These results indicate that there is inadequate
information in the data to overrule the joint prior. Given that these
calibrated nodes are strategically placed in disparate regions of the
tree, they act to anchor the tree scaffold, and so the posterior inference
for the tree as a whole is largely determined by the pseudo-data present
in the (often arbitrary) calibration densities. We recommend, as for any
Bayesian analysis, that marginal prior and posterior distributions be
carefully compared, especially for parameters of direct interest. This
recommendation is not novel. However, given how rarely such checks are
carried out in evolutionary biology, it bears repeating. Ideally such
practices will become customarily integrated into both the peer review
process and the standard workflow of conscientious
scientists. Finally, we note that the results presented here do not refute
the biological modelling concerns identified by Beaulieu et al. (2015).
Both sets of issues remain apposite to the goals of accurate divergence
time estimation, and only by considering them in tandem can we move
forward more confidently.
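The recommended comparison of marginal prior and posterior distributions can be sketched as follows. This is a minimal illustration only, not the authors' workflow: it assumes the sampled ages of a single node (in Ma) have already been extracted from a prior-only run and a posterior run into two lists of numbers. The function name `overlap_coefficient`, the bin count, and the synthetic ages are hypothetical choices made for the example. An overlap near 1 suggests the sequence data added little information beyond the joint prior for that node.

```python
import random


def overlap_coefficient(prior, posterior, bins=50):
    """Histogram-based overlap of two samples (1.0 = identical, 0.0 = disjoint)."""
    lo = min(min(prior), min(posterior))
    hi = max(max(prior), max(posterior))
    if hi == lo:
        # All samples are the same value, so the distributions coincide.
        return 1.0
    width = (hi - lo) / bins

    def proportions(sample):
        # Fraction of the sample falling in each of the shared bins.
        counts = [0] * bins
        for x in sample:
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
        return [c / len(sample) for c in counts]

    p = proportions(prior)
    q = proportions(posterior)
    # Sum of the bin-wise minimum of the two sets of proportions.
    return sum(min(a, b) for a, b in zip(p, q))


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic node ages in Ma, for illustration only (not from this dataset).
    prior_ages = [rng.gauss(230.0, 10.0) for _ in range(5000)]
    similar_ages = [rng.gauss(230.0, 10.0) for _ in range(5000)]
    shifted_ages = [rng.gauss(140.0, 5.0) for _ in range(5000)]
    print(overlap_coefficient(prior_ages, similar_ages))  # high: data add little
    print(overlap_coefficient(prior_ages, shifted_ages))  # low: data move the estimate
```

A histogram overlap is only one of several reasonable summaries. Overlaying the two densities, or comparing their credible intervals, serves the same purpose.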
Fig. S1: Crown angiosperm age trace
Figure S1: Traces of the age of the crown angiosperm node across replicate
analyses (burnin phase only) for Beaulieu et al. (2015) (top) and Magallón
et al. (2015) (bottom) data sets. Top: prior (lognormal priors), orange
(n = 2); posterior (lognormal priors), blue (n = 3); prior (exponential
priors), red (n = 2); posterior (exponential priors), green (n = 3).
Bottom: prior (no angiosperm constraint), orange (n = 4); posterior (no
angiosperm constraint), blue (n = 4); prior (all constraints), green
(n = 4); posterior (all constraints), red (n = 3; note that this is mostly
obscured by the green prior trace). All analyses were initialized with the
age of crown angiosperms set at ∼140 Ma.
FigS1.pdf
Posterior-landPlants.4gene.original.lognormal.constraints
BEAST xml file to carry out the posterior analysis of angiosperm divergence
times using the original lognormal fossil calibration priors.
Prior-landPlants.4gene.original.lognormal.constraints
BEAST xml file to carry out the prior analysis of angiosperm divergence
times using the original lognormal fossil calibration priors.
Posterior-landPlants.4gene.extreme_exponential_constraints
BEAST xml file to carry out the posterior analysis of angiosperm divergence
times using the extreme exponential fossil calibration priors.
Posterior-landPlants.4gene.uniform_spermatophytes
BEAST xml file to carry out the posterior analysis of angiosperm divergence
times using the uniform fossil calibration priors.
Prior-landPlants.4gene.uniform_spermatophytes
BEAST xml file to carry out the prior analysis of angiosperm divergence
times using the uniform fossil calibration priors.
Prior-landPlants.4gene_extreme_exponential_constraints
BEAST xml file to carry out the prior analysis of angiosperm divergence
times using the extreme exponential fossil calibration priors.