Mapping and characterization of structural variation in 17,795 human genomes.

Title: Mapping and characterization of structural variation in 17,795 human genomes.
Publication Type: Journal Article
Year of Publication: 2020
Authors: Abel HJ, Larson DE, Regier AA, Chiang C, Das I, Kanchi KL, Layer RM, Neale BM, Salerno WJ, Reeves C, Buyske S, Matise TC, Muzny DM, Zody MC, Lander ES, Dutcher SK, Stitziel NO, Hall IM
Corporate Authors: NHGRI Centers for Common Disease Genomics
Journal: Nature
Volume: 583
Issue: 7814
Pagination: 83-89
Date Published: 2020 07
ISSN: 1476-4687
Keywords: Alleles; Case-Control Studies; Epigenesis, Genetic; Female; Gene Dosage; Genetic Variation; Genetics, Population; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Male; Molecular Sequence Annotation; Quantitative Trait Loci; Racial Groups; Software; Whole Genome Sequencing
Abstract

A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.

DOI: 10.1038/s41586-020-2371-0
Alternate Journal: Nature
PubMed ID: 32460305
PubMed Central ID: PMC7547914
Grant List:
HHSN268201100001I / HL / NHLBI NIH HHS / United States
U01 HG007419 / HG / NHGRI NIH HHS / United States
R35 GM118335 / GM / NIGMS NIH HHS / United States
U01 DK062413 / DK / NIDDK NIH HHS / United States
U24 AG021886 / AG / NIA NIH HHS / United States
U01 HG007416 / HG / NHGRI NIH HHS / United States
P50 AG008702 / AG / NIA NIH HHS / United States
HHSN268201100046C / HL / NHLBI NIH HHS / United States
N01HC65236 / HL / NHLBI NIH HHS / United States
N01HC65234 / HL / NHLBI NIH HHS / United States
P01 CA033619 / CA / NCI NIH HHS / United States
U54 HG003079 / HG / NHGRI NIH HHS / United States
R01 ES015794 / ES / NIEHS NIH HHS / United States
U01 HG007417 / HG / NHGRI NIH HHS / United States
UM1 HG008895 / HG / NHGRI NIH HHS / United States
HHSN268201100004I / HL / NHLBI NIH HHS / United States
U24 AG056270 / AG / NIA NIH HHS / United States
HHSN268201100003C / WH / WHI NIH HHS / United States
U01 HG007376 / HG / NHGRI NIH HHS / United States
N01HC65235 / HL / NHLBI NIH HHS / United States
UM1 HG008901 / HG / NHGRI NIH HHS / United States
R01 HL135156 / HL / NHLBI NIH HHS / United States
N01HC65233 / HL / NHLBI NIH HHS / United States
HHSN268201700002C / HL / NHLBI NIH HHS / United States
HHSN268201700001I / HL / NHLBI NIH HHS / United States
UM1 HG008853 / HG / NHGRI NIH HHS / United States
N01HC65237 / HL / NHLBI NIH HHS / United States
HHSN271201100004C / AG / NIA NIH HHS / United States
U24 AG026395 / AG / NIA NIH HHS / United States
R01 GM059290 / GM / NIGMS NIH HHS / United States
HHSN268201100002C / WH / WHI NIH HHS / United States
R01 AG041797 / AG / NIA NIH HHS / United States
R01 HL113315 / HL / NHLBI NIH HHS / United States
HHSN268201700005C / HL / NHLBI NIH HHS / United States
HHSN268201700001C / HL / NHLBI NIH HHS / United States
R01 HL128439 / HL / NHLBI NIH HHS / United States
HHSN268201700003C / HL / NHLBI NIH HHS / United States
U01 CA098758 / CA / NCI NIH HHS / United States
R01 HL117004 / HL / NHLBI NIH HHS / United States
P60 MD006902 / MD / NIMHD NIH HHS / United States
P30 DK020572 / DK / NIDDK NIH HHS / United States
HHSN268201100003I / HL / NHLBI NIH HHS / United States
HHSN268201100002I / HL / NHLBI NIH HHS / United States
R21 ES024844 / ES / NIEHS NIH HHS / United States
HHSN268201700002I / HL / NHLBI NIH HHS / United States
HHSN268201700005I / HL / NHLBI NIH HHS / United States
UM1 HG008898 / HG / NHGRI NIH HHS / United States
P01 DK046763 / DK / NIDDK NIH HHS / United States
P30 DK052574 / DK / NIDDK NIH HHS / United States
U24 HG008956 / HG / NHGRI NIH HHS / United States
U01 CA136792 / CA / NCI NIH HHS / United States
HHSN268201700003I / HL / NHLBI NIH HHS / United States
U01 HG007397 / HG / NHGRI NIH HHS / United States
R37 CA054281 / CA / NCI NIH HHS / United States
HHSN268201100001C / WH / WHI NIH HHS / United States
HHSN268201100004C / WH / WHI NIH HHS / United States
U01 DK062431 / DK / NIDDK NIH HHS / United States
RL5 GM118984 / GM / NIGMS NIH HHS / United States
R01 MD010443 / MD / NIMHD NIH HHS / United States