Title | Utility of long-read sequencing for All of Us. |
Publication Type | Journal Article |
Year of Publication | 2024 |
Authors | Mahmoud, M, Huang, Y, Garimella, K, Audano, PA, Wan, W, Prasad, N, Handsaker, RE, Hall, S, Pionzio, A, Schatz, MC, Talkowski, ME, Eichler, EE, Levy, SE, Sedlazeck, FJ |
Journal | Nat Commun |
Volume | 15 |
Issue | 1 |
Pagination | 837 |
Date Published | 2024 Jan 29 |
ISSN | 2041-1723 |
Keywords | Genome, Human, High-Throughput Nucleotide Sequencing, Humans, INDEL Mutation, Population Health, Sequence Analysis, DNA |
Abstract | The All of Us (AoU) initiative aims to sequence the genomes of over one million Americans from diverse ethnic backgrounds to improve personalized medical care. In a recent technical pilot, we compare the performance of traditional short-read sequencing with long-read sequencing in a small cohort of samples from the HapMap project and two AoU control samples representing eight datasets. Our analysis reveals substantial differences in the ability of these technologies to accurately sequence complex medically relevant genes, particularly in terms of gene coverage and pathogenic variant identification. We also consider the advantages and challenges of using low coverage sequencing to increase sample numbers in large cohort analysis. Our results show that HiFi reads produce the most accurate results for both small and large variants. Further, we present a cloud-based pipeline to optimize SNV, indel and SV calling at scale for long-reads analysis. These results lead to widespread improvements across AoU. |
DOI | 10.1038/s41467-024-44804-3 |
Alternate Journal | Nat Commun |
PubMed ID | 38281971 |
PubMed Central ID | PMC10822842 |
Grant List | OT2 OD026556 / OD / NIH HHS / United States; U2C OD023196 / OD / NIH HHS / United States; OT2 OD025315 / OD / NIH HHS / United States; OT2 OD026551 / OD / NIH HHS / United States; U24 OD023121 / OD / NIH HHS / United States; OT2 OD026552 / OD / NIH HHS / United States; OT2 OD026549 / OD / NIH HHS / United States; OT2 OD025337 / OD / NIH HHS / United States; OT2 OD025277 / OD / NIH HHS / United States; OT2 OD026555 / OD / NIH HHS / United States; OT2 OD026550 / OD / NIH HHS / United States; OT2 OD026553 / OD / NIH HHS / United States; OT2 OD023205 / OD / NIH HHS / United States; OT2 OD025276 / OD / NIH HHS / United States; OT2 OD026554 / OD / NIH HHS / United States; U24 OD023163 / OD / NIH HHS / United States; OT2 OD023206 / OD / NIH HHS / United States; OT2 OD002748 / OD / NIH HHS / United States; U24 OD023176 / OD / NIH HHS / United States; OT2 OD026548 / OD / NIH HHS / United States; OT2 OD026557 / OD / NIH HHS / United States; OT2 OD002751 / OD / NIH HHS / United States |
Similar Publications
DNA Methylation-Derived Immune Cell Proportions and Cancer Risk in Black Participants. Cancer Res Commun. 2024;4(10):2714-2723.
StratoMod: predicting sequencing and variant calling errors with interpretable machine learning. Commun Biol. 2024;7(1):1316.
Identification of allele-specific KIV-2 repeats and impact on Lp(a) measurements for cardiovascular disease risk. BMC Med Genomics. 2024;17(1):255.