Title | Rare variants analysis using penalization methods for whole genome sequence data. |
Publication Type | Journal Article |
Year of Publication | 2015 |
Authors | Yazdani, A, Yazdani, A, Boerwinkle, E |
Journal | BMC Bioinformatics |
Volume | 16 |
Pagination | 405 |
Date Published | 2015 Dec 04 |
ISSN | 1471-2105 |
Keywords | Algorithms, Atherosclerosis, Genetic Association Studies, Genetic Variation, Genome, Human, High-Throughput Nucleotide Sequencing, Humans, Linkage Disequilibrium, Phenotype, Principal Component Analysis |
Abstract | BACKGROUND: Availability of affordable and accessible whole genome sequencing for biomedical applications poses a number of statistical challenges and opportunities, particularly related to the analysis of rare variants and sparseness of the data. Although efforts have been devoted to addressing these challenges, the performance of statistical methods for rare variants analysis still needs further consideration. RESULT: We introduce a new approach that applies restricted principal component analysis with convex penalization and then selects the best predictors of a phenotype by a concave penalized regression model, while estimating the impact of each genomic region on the phenotype. Using simulated data, we show that the proposed method maintains good power for association testing while keeping the false discovery rate low under a variety of genetic architectures. Illustrative data analyses reveal encouraging results for this method in comparison with other commonly applied methods for rare variants analysis. CONCLUSION: By taking into account linkage disequilibrium and sparseness of the data, the proposed method improves power and controls the false discovery rate compared to other commonly applied methods for rare variant analyses. |
DOI | 10.1186/s12859-015-0825-4 |
Alternate Journal | BMC Bioinformatics |
PubMed ID | 26637205 |
PubMed Central ID | PMC4670502 |
Grant List | RC2 HL102419 / HL / NHLBI NIH HHS / United States; HHSN268201100009C / / PHS HHS / United States; 5RC2HL102419 / HL / NHLBI NIH HHS / United States; HHSN268201100007C / / PHS HHS / United States; U54 HG003273 / HG / NHGRI NIH HHS / United States; HHSN268201100006C / / PHS HHS / United States; HHSN268201100005C / / PHS HHS / United States; HHSN268201100012C / / PHS HHS / United States; HHSN68201100010C / / PHS HHS / United States; HHSN268201100011C / / PHS HHS / United States; HHSN268201100008C / / PHS HHS / United States |