A SNP discovery method to assess variant allele probability from next-generation resequencing data.

Title: A SNP discovery method to assess variant allele probability from next-generation resequencing data.
Publication Type: Journal Article
Year of Publication: 2010
Authors: Shen Y, Wan Z, Coarfa C, Drabek R, Chen L, Ostrowski EA, Liu Y, Weinstock GM, Wheeler DA, Gibbs RA, Yu F
Journal: Genome Res
Volume: 20
Issue: 2
Pagination: 273-80
Date Published: 2010 Feb
ISSN: 1549-5469
Keywords: Algorithms; Alleles; Bayes Theorem; Computer Simulation; Genome, Bacterial; Logistic Models; Polymorphism, Single Nucleotide; Sequence Analysis, DNA; Software; Staphylococcus aureus
Abstract

Accurate identification of genetic variants from next-generation sequencing (NGS) data is essential for immediate large-scale genomic endeavors such as the 1000 Genomes Project, and is crucial for further genetic analysis based on the discoveries. The key challenge in single nucleotide polymorphism (SNP) discovery is to distinguish true individual variants (occurring at a low frequency) from sequencing errors (often occurring at frequencies orders of magnitude higher). Therefore, knowledge of the error probabilities of base calls is essential. We have developed Atlas-SNP2, a computational tool that detects and accounts for systematic sequencing errors caused by context-related variables in a logistic regression model learned from training data sets. Subsequently, it estimates the posterior error probability for each substitution through a Bayesian formula that integrates prior knowledge of the overall sequencing error probability and the estimated SNP rate with the results from the logistic regression model for the given substitutions. The estimated posterior SNP probability can be used to distinguish true SNPs from sequencing errors. Validation results show that Atlas-SNP2 achieves a false-positive rate of lower than 10%, with an approximately 5% or lower false-negative rate.
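The two-stage idea in the abstract (a logistic regression that scores each substitution, followed by a Bayesian update with a prior SNP rate) can be illustrated with a minimal sketch. This is not the Atlas-SNP2 formula, which the abstract does not give. The function names, the covariates, the coefficients, and the assumption that reads are independent are all hypothetical and chosen only for illustration.

```python
import math


def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def error_probability(covariates, coefficients, intercept):
    """Hypothetical logistic model: probability that one substitution is a
    sequencing error, given context covariates (e.g. base quality)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return logistic(z)


def posterior_snp_probability(error_probs, prior_snp):
    """Toy two-hypothesis Bayes update (an assumption, not the paper's formula).

    error_probs: per-read error probabilities for reads carrying the variant,
                 each strictly between 0 and 1.
    prior_snp:   prior probability that the site is a SNP, strictly in (0, 1).
    Reads are treated as independent.
    """
    # Hypothesis "SNP": every variant base was read correctly.
    log_snp = math.log(prior_snp) + sum(math.log(1.0 - p) for p in error_probs)
    # Hypothesis "error": every variant base is a sequencing error.
    log_err = math.log(1.0 - prior_snp) + sum(math.log(p) for p in error_probs)
    # Normalize in log space to avoid underflow with many reads.
    m = max(log_snp, log_err)
    num = math.exp(log_snp - m)
    return num / (num + math.exp(log_err - m))


if __name__ == "__main__":
    # Three variant-supporting reads, each with a 1% estimated error probability,
    # and an assumed prior SNP rate of 1 in 1000.
    print(posterior_snp_probability([0.01, 0.01, 0.01], prior_snp=0.001))
```

In this sketch a single uninformative read (error probability 0.5) leaves the posterior equal to the prior, while several low-error reads supporting the same substitution push it toward 1. A site would be called a SNP when the posterior exceeds a chosen cutoff.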

DOI: 10.1101/gr.096388.109
Alternate Journal: Genome Res
PubMed ID: 20019143
PubMed Central ID: PMC2813483
Grant List:
U54 HG003273 / HG / NHGRI NIH HHS / United States
1U01HG005211-0109 / HG / NHGRI NIH HHS / United States
5U54HG003273 / HG / NHGRI NIH HHS / United States
