10.5061/DRYAD.962
Palsson, Arnar
University of Iceland
Rouse, Ann
Riley-Berger, Rebecca
Dworkin, Ian
Gibson, Greg
North Carolina State University
Data from: Nucleotide variation in the Egfr locus of Drosophila melanogaster
Dryad
dataset
2009
2009-10-17T01:48:48Z
2009-10-17T01:48:48Z
en
https://doi.org/10.1534/genetics.104.026252
7769135 bytes
1
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
The Epidermal growth factor receptor is an essential gene with diverse
pleiotropic roles in development throughout the animal kingdom. Analysis
of sequence diversity in 10.9 kb covering the complete coding region and
6.4 kb of potential regulatory regions in a sample of 250 alleles from
three populations of Drosophila melanogaster suggests that the intensity
of different population genetic forces varies along the locus. A total of
238 independent common SNPs and 20 indel polymorphisms were detected, with
just six common replacements affecting >1475 amino acids, four of
which are in the short alternate first exon. Sequence diversity is lowest
in a 2-kb portion of intron 2, which is also highly conserved in
comparison with D. simulans and D. pseudoobscura. Linkage disequilibrium
decays to background levels within 500 bp of most sites, so haplotypes are
generally restricted to up to 5 polymorphisms. The two North American
samples from North Carolina and California have diverged in allele
frequency at a handful of individual SNPs, but a Kenyan sample is both
more divergent and more polymorphic. The effect of sample size on
inference of the roles of population structure, uneven recombination, and
weak selection in patterning nucleotide variation in the locus is
discussed.
Supplementary GenBank File: GenBank File 17571116 (Egfr Sequence). File: EGFRgenbank.doc
Supplementary Figure 1: Fu and Li’s D* statistic across the EGFR. File: PRBDG_SF1.pdf
Supplementary Table 6: Excel file with complete sequence of each of the alleles. File: EGFRalleles.zip
Supplementary Figure 3: Conserved 30 bp sequence motif in D. melanogaster. File: PRBDG_SF3.pdf
Supplementary Figure 2: Polymorphism and Divergence in the CN repeat. File: PRBDG_SF2.pdf
Supplementary Figure 4: LD plot for all common polymorphisms. File: PRBDG_SF4.pdf
Supplementary Table 1: List of Primers used in sequencing. File: PRBDG_ST1.pdf
Supplementary Table 2: List of Amino Acid Replacement variants in DER. File: PRBDG_ST2.pdf
Supplementary Table 3: Effect of sample size on estimation of nucleotide diversity. File: PRBDG_ST3.pdf
Supplementary Table 4: Length distribution of microsatellite alleles. File: PRBDG_ST4.pdf
Supplementary Table 5: Effect of sample size on estimation of Fst statistics. File: PRBDG_ST5.pdf