Title | Demographic history and rare allele sharing among human populations. |
Publication Type | Journal Article |
Year of Publication | 2011 |
Authors | Gravel, S; Henn, BM; Gutenkunst, RN; Indap, AR; Marth, GT; Clark, AG; Yu, F; Gibbs, RA; Bustamante, CD |
Corporate Authors | 1000 Genomes Project |
Journal | Proc Natl Acad Sci U S A |
Volume | 108 |
Issue | 29 |
Pagination | 11983-8 |
Date Published | 2011 Jul 19 |
ISSN | 1091-6490 |
Keywords | Demography; Evolution, Molecular; Gene Frequency; Genes; Genetic Drift; Genetic Variation; Genetics, Population; Genomics; Humans; Models, Genetic; Racial Groups; Sequence Analysis, DNA |
Abstract | High-throughput sequencing technology enables population-level surveys of human genomic variation. Here, we examine the joint allele frequency distributions across continental human populations and present an approach for combining complementary aspects of whole-genome, low-coverage data and targeted high-coverage data. We apply this approach to data generated by the pilot phase of the Thousand Genomes Project, including whole-genome 2-4× coverage data for 179 samples from HapMap European, Asian, and African panels as well as high-coverage target sequencing of the exons of 800 genes from 697 individuals in seven populations. We use the site frequency spectra obtained from these data to infer demographic parameters for an Out-of-Africa model for populations of African, European, and Asian descent and to predict, by a jackknife-based approach, the amount of genetic diversity that will be discovered as sample sizes are increased. We predict that the number of discovered nonsynonymous coding variants will reach 100,000 in each population after ∼1,000 sequenced chromosomes per population, whereas ∼2,500 chromosomes will be needed for the same number of synonymous variants. Beyond this point, the number of segregating sites in the European and Asian panel populations is expected to overcome that of the African panel because of faster recent population growth. Overall, we find that the majority of human genomic variable sites are rare and exhibit little sharing among diverged populations. Our results emphasize that replication of disease association for specific rare genetic variants across diverged populations must overcome both reduced statistical power because of rarity and higher population divergence. |
DOI | 10.1073/pnas.1019276108 |
Alternate Journal | Proc Natl Acad Sci U S A |
PubMed ID | 21730125 |
PubMed Central ID | PMC3142009 |
Grant List | R01 HL072904 / HL / NHLBI NIH HHS / United States; R01 HL072810-06 / HL / NHLBI NIH HHS / United States; R01 HL072810-05A1 / HL / NHLBI NIH HHS / United States; R01 HL072810-03 / HL / NHLBI NIH HHS / United States; R01 HL072810-07 / HL / NHLBI NIH HHS / United States; R01 HG003229 / HG / NHGRI NIH HHS / United States; U54 HG003273 / HG / NHGRI NIH HHS / United States; R01 HL072904-08 / HL / NHLBI NIH HHS / United States; R01 HL072810-09 / HL / NHLBI NIH HHS / United States; R01 HL072810-01 / HL / NHLBI NIH HHS / United States; 085532 / WT_ / Wellcome Trust / United Kingdom; R01 HL072810-08 / HL / NHLBI NIH HHS / United States; R01 HL072810-02 / HL / NHLBI NIH HHS / United States; R01 HL072810-04 / HL / NHLBI NIH HHS / United States; R01 HL072810 / HL / NHLBI NIH HHS / United States; G1000758 / MRC_ / Medical Research Council / United Kingdom; 090532 / WT_ / Wellcome Trust / United Kingdom |