10.5061/DRYAD.SK2PD88
Martin, Simon H.
University of Cambridge
Davey, John W.
University of York
Salazar, Camilo
Del Rosario University
Jiggins, Chris D.
University of Cambridge
Data from: Recombination rate variation shapes barriers to introgression
across butterfly genomes
Dryad
dataset
2019
Heliconius timareta
Heliconius cydno
recombination
Heliconius melpomene
2019-02-15T15:20:52Z
2019-02-15T15:20:52Z
en
https://doi.org/10.1371/journal.pbio.2006288
736381664 bytes
2
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Hybridisation and introgression can dramatically alter the relationships
among groups of species, leading to phylogenetic discordance across the
genome and between populations. Introgression can also erode species
differences over time, but selection against introgression at certain loci
acts to maintain post-mating species barriers. Theory predicts that
species barriers made up of many loci throughout the genome should lead to
a broad correlation between introgression and recombination rate, which
determines the extent to which selection on deleterious foreign alleles
will affect neutral alleles at physically linked loci. Here we describe
the variation in genealogical relationships across the genome among three
species of Heliconius butterflies: H. melpomene, H. cydno and H. timareta,
using whole genomes of 92 individuals, and ask whether this variation can
be explained by heterogeneous barriers to introgression. We find that
species relationships vary predictably at the chromosomal scale. By
quantifying recombination rate and admixture proportions, we then show
that rates of introgression are predicted by variation in recombination
rate. This implies that species barriers are highly polygenic, with
selection acting against introgressed alleles across most of the genome.
In addition, long chromosomes, which have lower recombination rates,
produce stronger barriers on average than short chromosomes. Finally, we
find a consistent difference between two species pairs on either side of
the Andes, which suggests differences in the architecture of the species
barriers. Our findings illustrate how the combined effects of
hybridisation, recombination and natural selection, acting at multitudes
of loci over long periods, can dramatically sculpt the phylogenetic
relationships among species.

VCF: SNP Set 1 (92 individuals)
VCF file for SNP Set 1 described in the paper. Although reads were mapped
to the Hmel2 scaffolds, the coordinate system has been converted to
chromosomes according to the Hmel2.5 scaffolding. Genotypes of females on
chromosome 21 are given as haploid. This version was updated in May 2020
to correct a misordering of the sample labels.
bar92.DP8MP4BIMAC2HET75.hapFem.minimal.corrected.vcf.gz

VCF: SNP Set 2 (92 individuals)
VCF file for SNP Set 2 described in the paper. Although reads were mapped
to the Hmel2 scaffolds, the coordinate system has been converted to
chromosomes according to the Hmel2.5 scaffolding. Genotypes of females on
chromosome 21 are given as haploid. This version was updated in May 2020
to correct a misordering of the sample labels.
bar92.DP8MP9BIMAC2HET75.hapFem.minimal.corrected.vcf.gz

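Because female genotypes on chromosome 21 are given as haploid, the two
VCF files mix haploid and diploid calls at those sites. The following is a
minimal sketch, not the authors' code, of how downstream allele counting
might account for this. It relies only on the standard VCF GT syntax
(alleles separated by "/" or "|", "." for missing) and assumes biallelic
sites; the function names are hypothetical.

```python
import re


def parse_gt(gt):
    """Parse a VCF GT string such as "0/1", "1|1", "1" or "./.".

    Returns (ploidy, alt_count). alt_count is None when any allele
    is missing. Assumes biallelic sites (hypothetical helper).
    """
    alleles = re.split(r"[/|]", gt)
    if any(a == "." for a in alleles):
        return len(alleles), None
    return len(alleles), sum(1 for a in alleles if a != "0")


def alt_allele_frequency(gts):
    """Alternate allele frequency over a list of GT strings.

    Haploid calls contribute one allele copy and diploid calls two,
    so haploid females are not double counted.
    """
    called = 0
    alt = 0
    for gt in gts:
        ploidy, n_alt = parse_gt(gt)
        if n_alt is None:
            continue  # skip missing genotypes
        called += ploidy
        alt += n_alt
    return alt / called if called else None
```

For example, a site with the genotypes "0/1", "1", "0" and "./." has four
called allele copies, two of which are alternate.
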
Whole genome distance matrix for 92 Heliconius individuals
Pairwise absolute genetic distances among the 92 individuals computed
using SNP Set 1, further filtered to retain only SNPs separated by at
least 1 kb. This file was used as input for SplitsTree to generate the
network in Figure 1B.
bar92.DP8MP4BIMAC2HET75dist1K.dist

PCA results (90 Heliconius individuals)
Eigenstrat SmartPCA results for 90 individuals (outgroups excluded),
presented in Figure 1B.
bar90.DP8MP9BIMAC2HET75.hapFem.eigenstrat.PCA.evec

Topology weightings from Twisst (42 files)
Twisst data underlying figures 2 & 3 and supplementary figures S1-S4. 42
files are provided, two for each chromosome. Files ending in
".weights.tsv.gz" give the topology weightings for the fifteen possible
topologies (see Figure 2), with each line representing a different 50 SNP
window. Files ending in ".data.tsv" give the chromosome and coordinates of
each window.
twisst_data.tar.gz

Population recombination rates from LDhelmet (84 files)
Maximum likelihood estimates of the population recombination rate for each
100 kb window. 84 files are provided: one for each chromosome in each of
four populations (Heliconius cydno, H. timareta, H. melpomene (west) and
H. melpomene (east)).
LDhelmet_data.tar.gz

Admixture proportion data (fd) (14 files)
Estimated admixture proportions (fd) along with related estimates, used to
generate figures 3 and 4 and supplementary figures S7, S8 and S11-S13. 14
files are provided, covering different population sets as described in
supplementary figure S6. File names indicate the populations used in the
order P1, P2, P3 and Outgroup (always "num"). Files of two different
window sizes are provided: 100 kb windows, sliding in increments of 20 kb,
with a minimum of 1000 genotyped sites per window required ("w100m1s20");
and 20 kb windows, sliding in increments of 5 kb, with a minimum of 300
genotyped sites per window required ("w20m300s5"). Two files are also
provided giving estimates for whole chromosomes.
ABBA_BABA_data.tar.gz

Simulation results (3 files)
Simulated topology weighting results presented in S2 Fig. are provided in
the files "sim_weights_means_S2_Fig.tsv" (mean values) and
"sim_weights_fixed_S2_Fig.tsv" (fixed values). The three lines in each
file represent the three scenarios shown in S2 Fig. Simulated values for
divergence and admixture statistics presented in S5 Fig. are provided in
the file "div_admix_stats_polarized.50k_100_inst.mean.tsv". The "Scenario"
column refers to the three scenarios shown on the left in S5 Fig.
sim_data.tar.gz

Crossover recombination rate
Crossover recombination rate estimated for 100 kb windows based on the
linkage maps of Davey et al. (Evol. Lett. 1, 138–154, 2017).
crossover_rec_rates_Davey2017.tsv

Proportion CDS
Proportion of CDS per 100 kb window based on the Hmel2 genome annotation.
Coordinates are converted to chromosomes based on the Hmel2.5 scaffolding.
prop_CDS.tsv
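
The sliding-window scheme described for the admixture proportion files
(for example, 100 kb windows moved in 20 kb increments, with a minimum
number of genotyped sites required per window) can be illustrated with a
minimal sketch. This is not the authors' pipeline: the function names are
hypothetical, coordinates are assumed to be 1-based and inclusive, and
truncating the last window at the chromosome end is an assumption, since
the record does not say how chromosome ends were handled.

```python
def sliding_windows(seq_length, window, step):
    """Yield 1-based inclusive (start, end) windows along a sequence.

    The final window is truncated at seq_length (an assumption,
    not stated in the dataset description).
    """
    start = 1
    while start <= seq_length:
        end = min(start + window - 1, seq_length)
        yield start, end
        if end == seq_length:
            break  # the sequence end has been reached
        start += step


def filter_windows(windows, site_counts, min_sites):
    """Keep windows whose genotyped-site count is at least min_sites."""
    return [w for w, n in zip(windows, site_counts) if n >= min_sites]


# Example: 100 kb windows sliding by 20 kb on a hypothetical 250 kb sequence.
windows = list(sliding_windows(250_000, 100_000, 20_000))
```

Consecutive windows overlap whenever the step is smaller than the window
size, so estimates from neighbouring windows are not independent.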