Title | A diploid assembly-based benchmark for variants in the major histocompatibility complex. |
Publication Type | Journal Article |
Year of Publication | 2020 |
Authors | Chin, C-S, Wagner, J, Zeng, Q, Garrison, E, Garg, S, Fungtammasan, A, Rautiainen, M, Aganezov, S, Kirsche, M, Zarate, S, Schatz, MC, Xiao, C, Rowell, WJ, Markello, C, Farek, J, Sedlazeck, FJ, Bansal, V, Yoo, B, Miller, N, Zhou, X, Carroll, A, Martinez Barrio, A, Salit, M, Marschall, T, Dilthey, AT, Zook, JM |
Journal | Nat Commun |
Volume | 11 |
Issue | 1 |
Pagination | 4794 |
Date Published | 2020 Sep 22 |
ISSN | 2041-1723 |
Keywords | Benchmarking, Cell Line, Diploidy, Genetic Variation, Genome, Human, Haplotypes, Humans, Major Histocompatibility Complex |
Abstract | Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assembly is particularly useful - the Major Histocompatibility Complex (MHC). Here, we develop a human genome benchmark derived from a diploid assembly for the openly-consented Genome in a Bottle sample HG002. We assemble a single contig for each haplotype, align them to the reference, call phased small and structural variants, and define a small variant benchmark for the MHC, covering 94% of the MHC and 22368 variants smaller than 50 bp, 49% more variants than a mapping-based benchmark. This benchmark reliably identifies errors in mapping-based callsets, and enables performance assessment in regions with much denser, complex variation than regions covered by previous benchmarks. |
DOI | 10.1038/s41467-020-18564-9 |
Alternate Journal | Nat Commun |
PubMed ID | 32963235 |
PubMed Central ID | PMC7508831 |
Grant List | R01 HG010149 / HG / NHGRI NIH HHS / United States; U01 AI090905 / AI / NIAID NIH HHS / United States |
Similar Publications
DNA Methylation-Derived Immune Cell Proportions and Cancer Risk in Black Participants. Cancer Res Commun. 2024;4(10):2714-2723.
Whole genomes of Amazonian uakari monkeys reveal complex connectivity and fast differentiation driven by high environmental dynamism. Commun Biol. 2024;7(1):1283.
StratoMod: predicting sequencing and variant calling errors with interpretable machine learning. Commun Biol. 2024;7(1):1316.