Assessing structural variation in a personal genome-towards a human reference diploid genome.

Title: Assessing structural variation in a personal genome-towards a human reference diploid genome.
Publication Type: Journal Article
Year of Publication: 2015
Authors: English, AC; Salerno, WJ; Hampton, OA; Gonzaga-Jauregui, C; Ambreth, S; Ritter, DI; Beck, CR; Davis, CF; Dahdouli, M; Ma, S; Carroll, A; Veeraraghavan, N; Bruestle, J; Drees, B; Hastie, A; Lam, ET; White, S; Mishra, P; Wang, M; Han, Y; Zhang, F; Stankiewicz, P; Wheeler, DA; Reid, JG; Muzny, DM; Rogers, J; Sabo, A; Worley, KC; Lupski, JR; Boerwinkle, E; Gibbs, RA
Journal: BMC Genomics
Volume: 16
Pagination: 286
Date Published: 2015 Apr 11
ISSN: 1471-2164
Keywords: Computational Biology; Databases, Genetic; Diploidy; Genome, Human; Genomic Structural Variation; Humans; Sequence Analysis, DNA; Software
Abstract

BACKGROUND: Characterizing large genomic variants is essential to expanding the research and clinical applications of genome sequencing. While multiple data types and methods are available to detect these structural variants (SVs), they remain less characterized than smaller variants because of SV diversity, complexity, and size. These challenges are exacerbated by the experimental and computational demands of SV analysis. Here, we characterize the SV content of a personal genome with Parliament, a publicly available consensus SV-calling infrastructure that merges multiple data types and SV detection methods.

RESULTS: We demonstrate Parliament's efficacy via integrated analyses of whole-genome array comparative genomic hybridization, short-read next-generation sequencing, long-read (Pacific BioSciences RSII), long-insert (Illumina Nextera), and whole-genome architecture (BioNano Irys) data from the personal genome of a single subject (HS1011). From this genome, Parliament identified 31,007 genomic loci between 100 bp and 1 Mbp that are inconsistent with the hg19 reference assembly. Of these loci, 9,777 are supported as putative SVs by hybrid local assembly, long-read PacBio data, or multi-source heuristics. These SVs span 59 Mbp of the reference genome (1.8%) and include 3,801 events identified only with long-read data. The HS1011 data and complete Parliament infrastructure, including a BAM-to-SV workflow, are available on the cloud-based service DNAnexus.
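
The counts quoted in the RESULTS paragraph can be put in proportion with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the "implied reference size" is simply back-calculated from the rounded 59 Mbp and 1.8% figures, so it is an assumption and not a number reported by the authors.

```python
# Figures quoted in the RESULTS paragraph of the abstract.
candidate_loci = 31_007    # loci (100 bp to 1 Mbp) inconsistent with hg19
supported_svs = 9_777      # loci supported as putative SVs
long_read_only = 3_801     # SVs identified only with long-read data
sv_span_mbp = 59           # reference sequence spanned by the SVs, in Mbp
sv_span_fraction = 0.018   # the same span as a fraction of the reference (1.8%)


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of candidate loci that are supported as putative SVs.
supported_pct = percent(supported_svs, candidate_loci)

# Share of supported SVs that were found only with long-read data.
long_read_only_pct = percent(long_read_only, supported_svs)

# Assumption: reference size implied by the two rounded span figures.
implied_reference_mbp = round(sv_span_mbp / sv_span_fraction)

print(f"Supported loci: {supported_pct}% of candidates")
print(f"Long-read-only SVs: {long_read_only_pct}% of supported SVs")
print(f"Implied reference size: about {implied_reference_mbp} Mbp")
```

Under these inputs, roughly a third of the candidate loci are supported as putative SVs, and well over a third of those supported SVs depend on long-read data alone, which is the basis for the abstract's emphasis on long-read SV discovery.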

CONCLUSIONS: HS1011 SV analysis reveals the limits and advantages of multiple sequencing technologies, specifically the impact of long-read SV discovery. With the full Parliament infrastructure, the HS1011 data constitute a public resource for novel SV discovery, software calibration, and personal genome structural variation analysis.

DOI: 10.1186/s12864-015-1479-3
Alternate Journal: BMC Genomics
PubMed ID: 25886820
PubMed Central ID: PMC4490614
Grant List:
U54HG003273 / HG / NHGRI NIH HHS / United States
K12 GM084897 / GM / NIGMS NIH HHS / United States
U54 HG006542 / HG / NHGRI NIH HHS / United States
R01NS058529 / NS / NINDS NIH HHS / United States
U54 HG003273 / HG / NHGRI NIH HHS / United States
T15 LM007093 / LM / NLM NIH HHS / United States
R01 NS058529 / NS / NINDS NIH HHS / United States
U54HD006542 / HD / NICHD NIH HHS / United States
