Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome.

Title: Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome.
Publication Type: Journal Article
Year of Publication: 2019
Authors: Wenger, AM, Peluso, P, Rowell, WJ, Chang, P-C, Hall, RJ, Concepcion, GT, Ebler, J, Fungtammasan, A, Kolesnikov, A, Olson, ND, Töpfer, A, Alonge, M, Mahmoud, M, Qian, Y, Chin, C-S, Phillippy, AM, Schatz, MC, Myers, G, DePristo, MA, Ruan, J, Marschall, T, Sedlazeck, FJ, Zook, JM, Li, H, Koren, S, Carroll, A, Rank, DR, Hunkapiller, MW
Journal: Nat Biotechnol
Volume: 37
Issue: 10
Pagination: 1155-1162
Date Published: 2019 10
ISSN: 1546-1696
Keywords: Base Sequence; DNA, Circular; Genetic Variation; Genome, Human; Haplotypes; High-Throughput Nucleotide Sequencing; Humans; Sequence Analysis, DNA
Abstract

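The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions <50 bp (indels) and 95.99% for structural variants. Our CCS method matches or exceeds the ability of short-read sequencing to detect small variants and structural variants. We estimate that 2,434 discordances are correctable mistakes in the 'genome in a bottle' (GIAB) benchmark set. Nearly all (99.64%) variants can be phased into haplotypes, further improving variant detection. De novo genome assembly using CCS reads alone produced a contiguous and accurate genome with a contig N50 of >15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads.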

DOI: 10.1038/s41587-019-0217-9
Alternate Journal: Nat. Biotechnol.
PubMed ID: 31406327
PubMed Central ID: PMC6776680
Grant List: R01 HG006677 / HG / NHGRI NIH HHS / United States
R01 HG010040 / HG / NHGRI NIH HHS / United States