Title | The distribution and mutagenesis of short coding INDELs from 1,128 whole exomes. |
Publication Type | Journal Article |
Year of Publication | 2015 |
Authors | Challis, D, Antunes, L, Garrison, E, Banks, E, Evani, US, Muzny, DM, Poplin, R, Gibbs, RA, Marth, G, Yu, F |
Journal | BMC Genomics |
Volume | 16 |
Issue | 1 |
Pagination | 143 |
Date Published | 2015 Feb 28 |
ISSN | 1471-2164 |
Keywords | Computational Biology, Exome, Genome, Human, High-Throughput Nucleotide Sequencing, Human Genome Project, Humans, INDEL Mutation, Machine Learning, Mutagenesis |
Abstract | BACKGROUND: Identifying insertion/deletion polymorphisms (INDELs) with high confidence has been intrinsically challenging in short-read sequencing data. Here we report our approach for improving INDEL calling accuracy by using a machine learning algorithm to combine call sets generated with three independent methods, and by leveraging the strengths of each individual pipeline. Utilizing this approach, we generated a consensus exome INDEL call set from a large dataset generated by the 1000 Genomes Project (1000G), maximizing both the sensitivity and the specificity of the calls. RESULTS: This consensus exome INDEL call set features 7,210 INDELs, from 1,128 individuals across 13 populations included in the 1000 Genomes Phase 1 dataset, with a false discovery rate (FDR) of about 7.0%. CONCLUSIONS: In our study we further characterize the patterns and distributions of these exonic INDELs with respect to density, allele length, and site frequency spectrum, as well as the potential mutagenic mechanisms of coding INDELs in humans. |
DOI | 10.1186/s12864-015-1333-7 |
Alternate Journal | BMC Genomics |
PubMed ID | 25765891 |
PubMed Central ID | PMC4352271 |
Grant List | R01 HG004719 / HG / NHGRI NIH HHS / United States; 5U54HG003273 / HG / NHGRI NIH HHS / United States; U01 HG006513 / HG / NHGRI NIH HHS / United States; U01 HG005211 / HG / NHGRI NIH HHS / United States; R01HG004719 / HG / NHGRI NIH HHS / United States; U54 HG003273 / HG / NHGRI NIH HHS / United States; U01HG006513 / HG / NHGRI NIH HHS / United States; 1U01HG005211 / HG / NHGRI NIH HHS / United States; R01 HG008115 / HG / NHGRI NIH HHS / United States; 1R01HG008115 / HG / NHGRI NIH HHS / United States |
Similar Publications
DNA Methylation-Derived Immune Cell Proportions and Cancer Risk in Black Participants. Cancer Res Commun. 2024;4(10):2714-2723.
StratoMod: predicting sequencing and variant calling errors with interpretable machine learning. Commun Biol. 2024;7(1):1316.
Identification of allele-specific KIV-2 repeats and impact on Lp(a) measurements for cardiovascular disease risk. BMC Med Genomics. 2024;17(1):255.