10.5061/DRYAD.TN6652T
Epstein, Brendan
University of Minnesota
Abou-Shanab, Reda A. I.
Shameldsin, Abdelaal
Taylor, Margaret R.
University of Minnesota
Guhlin, Joseph
University of Minnesota
Burghardt, Liana T.
University of Minnesota
Nelson, Matthew
University of Minnesota
Sadowsky, Michael J.
University of Minnesota
Tiffin, Peter
University of Minnesota
Abou-Shanab, Reda A. I.
University of Minnesota
Data from: Genome-wide association analyses in the model rhizobium Ensifer
meliloti
Dryad
dataset
2018
association mapping
Medicago truncatula
phenotypic variation
Ensifer meliloti
BSLMM
Nitrogen fixation
Chip heritability
Holocene
National Science Foundation
https://ror.org/021nxhr62
IOS-1237993, IOS-1724993
2018-10-15T20:17:22Z
2018-10-15T20:17:22Z
en
https://doi.org/10.1128/msphere.00386-18
1931052710 bytes
1
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Genome-wide association studies (GWAS) can identify genetic variants
responsible for naturally occurring and quantitative phenotypic variation
and therefore provide a powerful complement to approaches that rely on de
novo mutations for characterizing gene function. Although bacteria should
be amenable to GWAS, few GWAS have been conducted on bacteria, and the
extent to which non-independence among genomic variants (e.g. linkage
disequilibrium, LD) and the genetic architecture of phenotypic traits will
affect GWAS performance is unclear. We apply association analyses to
identify candidate genes underlying variation in 20 biochemical, growth,
and symbiotic phenotypes among 153 strains of Ensifer meliloti. For 10
traits we find genotype-phenotype associations that are stronger than
expected by chance, with the candidates in relatively small linkage
groups, indicating that LD does not preclude resolving association
candidates to relatively small genomic regions. The significant candidates
show an enrichment for single-nucleotide polymorphisms (SNPs) over gene
presence-absence variation (PAV), and for five traits, candidates are
enriched in large linkage groups, a possible signature of epistasis. Many
of the variants most strongly associated with symbiosis phenotypes were in
genes previously known to be involved in nitrogen fixation or nodulation.
For other traits, apparently strong associations were not stronger than
the range of associations detected in permuted data. In sum, our data show
that GWAS in bacteria may be a powerful tool for characterizing genetic
architecture and identifying genes responsible for phenotypic variation;
however, careful evaluation of candidates is necessary to avoid false
signals of association.
Master README: Lists location of most important data and results and describes the analysis workflow.
NCBI accession numbers (ncbi.zip)
Strain Information (strain_info.zip): Metadata for the strains included in the study.
Popgen Analyses (popgen.zip): Results and code for population genetic summaries.
Figure & Table code (figs_tables.zip)
Phenotype Data (pheno_data.zip): All phenotype data, both raw and processed.
Reference Genome (reference_genome.zip): Copy of reference genome sequence and annotation.
LD Grouping Analysis (ld_groups.zip): Code for and results from LD grouping.
PAVs (pav_calls.zip): Presence-absence variant calling results and code.
Single-variant Association Analyses (gwas.zip): Code for and results from the single-variant association analyses with GEMMA.
Model Selection (model_selection.zip): Code to run forward model selection and results for real and permuted data.
Genome-wide PVE analyses (bslmm.zip): Code that used GEMMA to get BSLMM and LMM chip-heritability estimates and summaries of results.
De novo Assemblies (assembly.zip): De novo assemblies and annotation used for identifying PAVs.
Raw SNP and Indel Calls (snp_calls.zip): Raw variant calls from FreeBayes based on alignment of reads to USDA1106.
Variant Calls For Analysis (variant_filtering_for_analysis.zip): SNP and PAV calls combined and filtered on missingness. Starting point for association and LD grouping analyses.
World