Title | A robust benchmark for detection of germline large deletions and insertions. |
Publication Type | Journal Article |
Year of Publication | 2020 |
Authors | Zook, JM, Hansen, NF, Olson, ND, Chapman, L, Mullikin, JC, Xiao, C, Sherry, S, Koren, S, Phillippy, AM, Boutros, PC, Sahraeian, SMohammad E, Huang, V, Rouette, A, Alexander, N, Mason, CE, Hajirasouliha, I, Ricketts, C, Lee, J, Tearle, R, Fiddes, IT, Barrio, AMartinez, Wala, J, Carroll, A, Ghaffari, N, Rodriguez, OL, Bashir, A, Jackman, S, Farrell, JJ, Wenger, AM, Alkan, C, Soylev, A, Schatz, MC, Garg, S, Church, G, Marschall, T, Chen, K, Fan, X, English, AC, Rosenfeld, JA, Zhou, W, Mills, RE, Sage, JM, Davis, JR, Kaiser, MD, Oliver, JS, Catalano, AP, Chaisson, MJP, Spies, N, Sedlazeck, FJ, Salit, M |
Journal | Nat Biotechnol |
Volume | 38 |
Issue | 11 |
Pagination | 1347-1355 |
Date Published | 2020 Nov |
ISSN | 1546-1696 |
Keywords | Diploidy, Genomic Structural Variation, Germ-Line Mutation, Humans, INDEL Mutation, Molecular Sequence Annotation, Sequence Analysis, DNA |
Abstract | New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution and comprehensiveness. To help translate these methods to routine research and clinical practice, we developed a sequence-resolved benchmark set for identification of both false-negative and false-positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies. The final benchmark set contains 12,745 isolated, sequence-resolved insertion (7,281) and deletion (5,464) calls ≥50 base pairs (bp). The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.51 Gbp and 5,262 insertions and 4,095 deletions supported by ≥1 diploid assembly. We demonstrate that the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked- and long-read sequencing and optical mapping. |
DOI | 10.1038/s41587-020-0538-8 |
Alternate Journal | Nat Biotechnol |
PubMed ID | 32541955 |
PubMed Central ID | PMC8454654 |
Grant List | 9999-NIST / ImNIST / Intramural NIST DOC / United States; R01 AI151059 / AI / NIAID NIH HHS / United States |
Similar Publications
Inverted triplications formed by iterative template switches generate structural variant diversity at genomic disorder loci. Cell Genom. 2024;4(7):100590.
Unveiling novel genetic variants in 370 challenging medically relevant genes using the long read sequencing data of 41 samples from 19 global populations. Mol Genet Genomics. 2024;299(1):65.
Genetic diversity of 1,845 rhesus macaques improves genetic variation interpretation and identifies disease models. Nat Commun. 2024;15(1):5658.