Title | Chromosome-scale, haplotype-resolved assembly of human genomes. |
Publication Type | Journal Article |
Year of Publication | 2021 |
Authors | Garg, S, Fungtammasan, A, Carroll, A, Chou, M, Schmitt, A, Zhou, X, Mac, S, Peluso, P, Hatas, E, Ghurye, J, Maguire, J, Mahmoud, M, Cheng, H, Heller, D, Zook, JM, Moemke, T, Marschall, T, Sedlazeck, FJ, Aach, J, Chin, C-S, Church, GM, Li, H |
Journal | Nat Biotechnol |
Volume | 39 |
Issue | 3 |
Pagination | 309-312 |
Date Published | 2021 Mar |
ISSN | 1546-1696 |
Keywords | Algorithms; Chromosomes, Human; Genome, Human; Haplotypes; Heterozygote; Humans; Polymorphism, Single Nucleotide |
Abstract | Haplotype-resolved or phased genome assembly provides a complete picture of genomes and their complex genetic variations. However, current algorithms for phased assembly either do not generate chromosome-scale phasing or require pedigree information, which limits their application. We present a method named diploid assembly (DipAsm) that uses long, accurate reads and long-range conformation data for single individuals to generate a chromosome-scale phased assembly within 1 day. Applied to four public human genomes, PGP1, HG002, NA12878 and HG00733, DipAsm produced haplotype-resolved assemblies with minimum contig length needed to cover 50% of the known genome (NG50) up to 25 Mb and phased ~99.5% of heterozygous sites at 98-99% accuracy, outperforming other approaches in terms of both contiguity and phasing completeness. We demonstrate the importance of chromosome-scale phased assemblies for the discovery of structural variants (SVs), including thousands of new transposon insertions, and of highly polymorphic and medically important regions such as the human leukocyte antigen (HLA) and killer cell immunoglobulin-like receptor (KIR) regions. DipAsm will facilitate high-quality precision medicine and studies of individual haplotype variation and population diversity. |
DOI | 10.1038/s41587-020-0711-0 |
Alternate Journal | Nat Biotechnol |
PubMed ID | 33288905 |
PubMed Central ID | PMC7954703 |
Grant List | R01 HG010040 / HG / NHGRI NIH HHS / United States; RM1 HG008525 / HG / NHGRI NIH HHS / United States; U01 HG010971 / HG / NHGRI NIH HHS / United States; K99 HG010906 / HG / NHGRI NIH HHS / United States; UM1 HG008898 / HG / NHGRI NIH HHS / United States |