10.5061/DRYAD.TH0RJ
Bank, Claudia
Instituto Gulbenkian de Ciência
Matuszewski, Sebastian
École Polytechnique Fédérale de Lausanne
Hietpas, Ryan T.
Eli Lilly and Company
Jensen, Jeffrey D.
École Polytechnique Fédérale de Lausanne
Data from: On the (un)predictability of a large intragenic fitness landscape
Dryad
dataset
2017
deep mutational scanning
fitness landscape
2017-11-04T00:00:00Z
2017-11-04T00:00:00Z
en
https://doi.org/10.1073/pnas.1612676113
173622 bytes
1
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
The study of fitness landscapes, which aims at mapping genotypes to
fitness, is receiving ever-increasing attention. Novel experimental
approaches combined with next-generation sequencing (NGS) methods enable
accurate and extensive studies of the fitness effects of mutations,
allowing us to test theoretical predictions and improve our understanding
of the shape of the true underlying fitness landscape and its implications
for the predictability and repeatability of evolution. Here, we present a
uniquely large multiallelic fitness landscape comprising 640 engineered
mutants that represent all possible combinations of 13 amino acid-changing
mutations at 6 sites in the heat-shock protein Hsp90 in Saccharomyces
cerevisiae under elevated salinity. Despite a prevalent pattern of
negative epistasis in the landscape, we find that the global fitness peak
is reached via four positively epistatic mutations. Combining traditional
and extending recently proposed theoretical and statistical approaches, we
quantify features of the global multiallelic fitness landscape. Using
subsets of the data, we demonstrate that extrapolation beyond a known part
of the landscape is difficult owing to both local ruggedness and amino
acid-specific epistatic hotspots and that inference is additionally
confounded by the nonrandom choice of mutations for experimental fitness
landscapes.
Yeast Hsp90 fitness landscape from deep mutational scanning
These data were obtained from 1611 engineered mutations in yeast Hsp90
exposed to high salinity, using the EMPIRIC approach. This file contains
the deep mutational scanning data from both replicates. Columns A-E are
named unambiguously. The numbers in line 1 of columns F-N indicate the
sampling times of replicate 2 in generations; the numbers in line 1 of
columns O-W indicate the sampling times of replicate 1.
data.csv
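
The column layout described above can be sketched in Python as follows. This is a minimal illustration, not code distributed with the dataset: the function names `split_columns` and `read_layout` are hypothetical, and it assumes that data.csv is comma-separated, that line 1 is the header row, and that spreadsheet columns A-W correspond to the first 23 fields of that row.

```python
import csv
import io


def split_columns(header):
    """Group a header row by the layout given in the description.

    Assumed mapping of spreadsheet letters to 0-based positions:
    A-E -> 0..4 (named columns), F-N -> 5..13 (replicate 2),
    O-W -> 14..22 (replicate 1).
    """
    if len(header) < 23:
        raise ValueError("expected at least 23 columns (A-W)")
    return {
        "named": header[0:5],          # columns A-E
        "replicate_2": header[5:14],   # columns F-N, sampling times
        "replicate_1": header[14:23],  # columns O-W, sampling times
    }


def read_layout(text):
    """Parse CSV text and return the grouped header from line 1."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    return split_columns(header)
```

To apply it to the real file, the text can be read with `open("data.csv", newline="").read()` and passed to `read_layout`. The sampling times in line 1 are returned as strings, exactly as stored in the header, so they would need converting to numbers before any time-series analysis.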