10.5061/DRYAD.66T1G1JXS
Mostafavi, Hakhamanesh
0000-0002-1060-2844
Columbia University
Harpak, Arbel
0000-0002-3655-748X
Columbia University
Agarwal, Ipsita
0000-0001-8537-0008
Columbia University
Conley, Dalton
0000-0002-5174-7222
Princeton University
Pritchard, Jonathan
Stanford University
Przeworski, Molly
Columbia University
Variable prediction accuracy of polygenic scores within an ancestry group
Dryad
dataset
2019
Polygenic scores
genome-wide association studies (GWAS)
National Institute of General Medical Sciences
https://ror.org/04q48ey07
GM121372
National Human Genome Research Institute
https://ror.org/00baak391
HG008140
Robert Wood Johnson Foundation
https://ror.org/02ymmdj85
84337817
Simons Foundation
https://ror.org/01cmst727
633313
2020-02-24T00:00:00Z
2020-02-24T00:00:00Z
en
https://doi.org/10.7554/eLife.48376
18879556766 bytes
3
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Fields as diverse as human genetics and sociology are increasingly using
polygenic scores based on genome-wide association studies (GWAS) for
phenotypic prediction. However, recent work has shown that polygenic
scores have limited portability across groups of different genetic
ancestries, restricting the contexts in which they can be used reliably
and potentially creating serious inequities in future clinical
applications. Using the UK Biobank data, we demonstrate that even within a
single ancestry group (i.e., when there are negligible differences in
linkage disequilibrium or in causal allele frequencies), the prediction
accuracy of polygenic scores can depend on characteristics such as the
socio-economic status, age, or sex of the individuals in whom the GWAS and
the prediction were conducted, as well as on the GWAS design. Our findings
highlight both the complexities of interpreting polygenic scores and
underappreciated obstacles to their broad use.
This repository contains summary statistics for all association tests
(including GWAS and effect re-estimations for sets of pre-ascertained
SNPs) that were performed in this study. The directory
"gwas_by_sample_characteristics" stores data corresponding to
Figures 1 and 2, Appendix-figures 1-5 and 13-15, and Appendix-table 2. The
directory "standard_vs_sibling_gwas" stores data corresponding
to Figure 3 and Appendix-figures 11, 12, and 16. Additional README files can
be found within each directory.