Title | Estimating population genetic parameters and comparing model goodness-of-fit using DNA sequences with error. |
Publication Type | Journal Article |
Year of Publication | 2010 |
Authors | Liu, X, Fu, Y-X, Maxwell, TJ, Boerwinkle, E |
Journal | Genome Res |
Volume | 20 |
Issue | 1 |
Pagination | 101-9 |
Date Published | 2010 Jan |
ISSN | 1549-5469 |
Keywords | Angiopoietin-Like Protein 4, Angiopoietins, Base Sequence, Black or African American, Black People, Computational Biology, Computer Simulation, Genetics, Population, Humans, Likelihood Functions, Models, Genetic, Mutation, Polymorphism, Single Nucleotide, White People |
Abstract | It is known that sequencing error can bias estimation of evolutionary or population genetic parameters. This problem is more prominent in deep resequencing studies because of their large sample size n, and a higher probability of error at each nucleotide site. We propose a new method based on the composite likelihood of the observed SNP configurations to infer population mutation rate θ = 4N_eμ, population exponential growth rate R, and error rate ε, simultaneously. Using simulation, we show the combined effects of the parameters, θ, n, ε, and R on the accuracy of parameter estimation. We compared our maximum composite likelihood estimator (MCLE) of θ with other θ estimators that take into account the error. The results show the MCLE performs well when the sample size is large or the error rate is high. Using parametric bootstrap, composite likelihood can also be used as a statistic for testing the model goodness-of-fit of the observed DNA sequences. The MCLE method is applied to sequence data on the ANGPTL4 gene in 1832 African American and 1045 European American individuals. |
DOI | 10.1101/gr.097543.109 |
Alternate Journal | Genome Res |
PubMed ID | 19952140 |
PubMed Central ID | PMC2798822 |
Grant List | P50 GM065509 / GM / NIGMS NIH HHS / United States; 5P50GM065509 / GM / NIGMS NIH HHS / United States |
Similar Publications
Inverted triplications formed by iterative template switches generate structural variant diversity at genomic disorder loci. Cell Genom. 2024;4(7):100590.
Unveiling novel genetic variants in 370 challenging medically relevant genes using the long read sequencing data of 41 samples from 19 global populations. Mol Genet Genomics. 2024;299(1):65.
Genetic diversity of 1,845 rhesus macaques improves genetic variation interpretation and identifies disease models. Nat Commun. 2024;15(1):5658.