10.5061/DRYAD.1RN8PK0PZ
Suurväli, Jaanus
0000-0003-0133-7011
University of Cologne
Whiteley, Andrew R.
University of Montana
Zheng, Yichen
University of Cologne
Gharbi, Karim
Earlham Institute
Leptin, Maria
University of Cologne
Wiehe, Thomas
University of Cologne
Data from: The laboratory domestication of zebrafish: from diverse
populations to inbred substrains
Dryad
dataset
2019
genetic variants
wild populations
laboratory strains
Inbreeding
Deutsche Forschungsgemeinschaft
https://ror.org/018mejw64
SPP1819
National Science Foundation
https://ror.org/021nxhr62
DEB-1652278
National Science Foundation
https://ror.org/021nxhr62
IOS-1257562
2019-12-08T00:00:00Z
2019-12-08T00:00:00Z
en
https://doi.org/10.1111/j.1365-294X.2011.05272.x
https://doi.org/10.1093/molbev/msu325
https://doi.org/10.1534/genetics.114.169284
https://doi.org/10.1093/molbev/msz289
972185492 bytes
5
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
We know from human genetic studies that practically all aspects of biology
are strongly influenced by the genetic background, as reflected in the
advent of ‘personalized medicine’. Yet, with few exceptions, this is not
taken into account when using laboratory populations as animal model
systems for research in these fields. Laboratory strains of zebrafish
(Danio rerio) are widely used for research in vertebrate developmental
biology, behaviour and physiology, for modelling diseases, and for testing
pharmaceutic compounds in vivo. However, all of these strains are derived
from artificial bottleneck events and therefore are likely to represent
only a fraction of the genetic diversity present within the species. Here
we use Restriction site-Associated DNA sequencing (RAD-seq) to genetically
characterize wild populations of zebrafish from India, Nepal and
Bangladesh, and to compare them to previously published data on four
common laboratory strains. We measure nucleotide diversity,
heterozygosity and allele frequency spectra, and find that wild zebrafish
are much more diverse than laboratory strains. Further, in wild zebrafish
there is a clear signal of GC-biased gene conversion that is missing in
laboratory strains. We also find that zebrafish populations in Nepal and
Bangladesh are most distinct from all other strains studied, making them
an attractive subject for future studies of zebrafish population genetics
and molecular ecology. Finally, isolates of the same strains kept in
different laboratories show a pattern of ongoing differentiation into
genetically distinct substrains. Together, our findings broaden the basis
for future genetic, physiological, pharmaceutic and evolutionary studies
in Danio rerio.
All data in this dataset originate from RAD-sequencing of wild and
laboratory zebrafish genomic DNA digested with SbfI. Sequences were
mapped to GRCz11 using BWA-MEM (version 0.7.17-r1188). Samtools
(version 1.9) was used to filter out unmapped reads and non-primary
alignments. Variant calling was performed with Stacks (version 2.4).
Variants were annotated using Ensembl Variant Effect Predictor.
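The read-filtering step can be illustrated with a minimal sketch. It assumes that "non-primary" covers both secondary and supplementary alignments; the record does not state the exact options used, so the mask and the function names below are illustrative only and are not the authors' command.

```python
# SAM flag bits as defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800

# Assumption: "non-primary" means secondary plus supplementary records.
EXCLUDE_MASK = UNMAPPED | SECONDARY | SUPPLEMENTARY


def keep_alignment(flag: int, exclude_mask: int = EXCLUDE_MASK) -> bool:
    """Return True if none of the excluded flag bits are set."""
    return (flag & exclude_mask) == 0


def filter_sam_lines(lines):
    """Yield header lines and the alignment records that pass the filter."""
    for line in lines:
        if line.startswith("@"):
            yield line  # header lines are passed through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        if keep_alignment(int(fields[1])):  # column 2 of SAM is FLAG
            yield line
```

Under this assumption the combined mask has the value 2308, which is the number one would pass to an exclusion option such as `samtools view -F`.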
The allele in the "reference" column of the .vcf was determined
in the associated study itself and does not always correspond to the
reference genome.

Populations:
CHT - Wild fish from Chittagong, Bangladesh
KHA - Wild fish from Khair Khola, Nepal
UT - Wild fish from Uttarbhag, India
CB - Wild derived fish from Cooch Behar, India
AB, EKW, Nadia, TU, WIK - laboratory strains

Files in the dataset:
wildlabdanio.vcf.gz - polymorphic sites identified in data from wild
and laboratory zebrafish
wildlabdanio.sumstats.tsv - output file from the populations module of
Stacks v2.4
wildlabdanio.vep.tsv - functional annotation of variants in the .vcf
wildlabdanio.vep_summary.html - summary of the functional annotation
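
As a starting point for working with the variant file, the following minimal sketch reads a gzipped VCF using only the Python standard library. It relies on the standard VCF column order (CHROM, POS, ID, REF, ALT); the function names are hypothetical, and nothing here is specific to how the dataset was produced.

```python
import gzip
from collections import Counter


def iter_vcf_records(path):
    """Yield (chrom, pos, ref, alts) for each record in a gzipped VCF."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and the column header
            fields = line.rstrip("\n").split("\t")
            chrom, pos, _id, ref, alt = fields[:5]
            # ALT may list several alleles separated by commas.
            yield chrom, int(pos), ref, alt.split(",")


def count_sites_per_chromosome(path):
    """Count the number of variant records on each chromosome."""
    return Counter(chrom for chrom, _pos, _ref, _alts in iter_vcf_records(path))
```

For example, `count_sites_per_chromosome("wildlabdanio.vcf.gz")` would tally the polymorphic sites per chromosome. Because the REF allele in this file was determined in the associated study, it should not be assumed to match the GRCz11 base at that position.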