10.5061/DRYAD.XWDBRV1C5
Janzen, Thijs
0000-0002-4162-1140
University of Groningen
Estimating the time since admixture from phased and unphased molecular data
Dryad
dataset
2020
FOS: Biological sciences
2021-10-29T00:00:00Z
2021-10-29T00:00:00Z
en
https://doi.org/10.5281/zenodo.5602758
https://doi.org/10.5281/zenodo.5602760
81923227 bytes
8
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
After admixture, recombination breaks down genomic blocks of contiguous
ancestry. The breakdown of these blocks forms a new 'molecular
clock' that ticks at a much faster rate than the mutation clock,
enabling accurate dating of admixture events in the recent past. However,
existing theory on the breakdown of these blocks, or the accumulation of
delineations between blocks, so-called 'junctions', has mostly been
limited to using regularly spaced markers on phased data. Here, we present
an extension to the theory of junctions using the Ancestral Recombination
Graph that describes the expected number of junctions for any distribution
of markers along the genome. Furthermore, we provide a new framework to
infer the time since admixture using unphased data. We demonstrate both
the phased and unphased methods on simulated data and show that our new
extensions have improved accuracy with respect to previous methods,
especially for smaller population sizes and more ancient admixture times.
Lastly, we demonstrate the applicability of our method on three empirical
datasets, including lab crosses of yeast (Saccharomyces cerevisiae) and two
case studies of hybridization in swordtail fish and Populus trees.
This dataset contains all code used for the manuscript, including code to simulate
underlying data, analyze empirical data, and also code to visualize the
results and reproduce the figures from the main text. The code is
organized per figure and each folder is named according to the associated
figure. See associated Zenodo Related Works for code. Each folder
typically contains 3 files:
- data.zip contains the underlying (often simulated) data
- simulate.R contains the scripts used to generate data.zip
- plot_figure.R contains the code to summarise the simulated data and
  create the figure found in the main text
In some cases, additional scripts can be found to analyze the
data, in particular for the three empirical datasets. The raw data used in
the empirical data analysis was not included here, but can be found in the
references used, or in the additional README files in the relevant
folders. Furthermore, the code for the junctions R package was amended as
well; the package can be installed either from CRAN using
install.packages("junctions") or from the
local tar file included in this Dryad repository. For completeness, the
Supplementary file available via the journal's website has been added
here as well.
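
As a rough illustration of the installation and per-figure workflow described above, the steps might look like the following in R. This is a minimal sketch, not the authors' documented procedure: the tar file name and the figure folder name are placeholders, and it assumes that plot_figure.R reads the contents of data.zip from its own folder once unzipped. Check the folders and README files for the actual names and paths.

```r
# Option 1: install the released junctions package from CRAN.
install.packages("junctions")

# Option 2: install from the tar file shipped with this repository.
# ASSUMPTION: "junctions.tar.gz" is a placeholder file name; substitute
# the actual name of the tar file found in the download.
local_tarball <- "junctions.tar.gz"
if (file.exists(local_tarball)) {
  install.packages(local_tarball, repos = NULL, type = "source")
}

# Reproduce one figure from its folder.
# ASSUMPTION: "figure_1" is a hypothetical folder name, and the plotting
# script is assumed to expect the unzipped data next to it.
figure_dir <- "figure_1"
if (dir.exists(figure_dir)) {
  # Unpack the (often simulated) data underlying the figure.
  unzip(file.path(figure_dir, "data.zip"), exdir = figure_dir)
  # Run the plotting script with the figure folder as working directory.
  source(file.path(figure_dir, "plot_figure.R"), chdir = TRUE)
}
```

Running simulate.R instead of unzipping data.zip would regenerate the data from scratch, which may take considerably longer than reusing the archived results.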