Title | Association studies for next-generation sequencing. |
Publication Type | Journal Article |
Year of Publication | 2011 |
Authors | Luo, L, Boerwinkle, E, Xiong, M |
Journal | Genome Res |
Volume | 21 |
Issue | 7 |
Pagination | 1099-108 |
Date Published | 2011 Jul |
ISSN | 1549-5469 |
Keywords | Angiopoietin-Like Protein 4; Angiopoietins; Computational Biology; Computer Simulation; Databases, Genetic; Genetic Variation; Genetics, Population; Genome, Human; Genome-Wide Association Study; Genotype; Humans; Models, Biological; Models, Statistical; Multivariate Analysis; Phenotype; Sequence Analysis, DNA |
Abstract | Genome-wide association studies (GWAS) have become the primary approach for identifying genes with common variants influencing complex diseases. Despite considerable progress, the common variants identified by GWAS account for only a small fraction of disease heritability and are unlikely to explain the majority of phenotypic variation in common diseases. A potential source of the missing heritability is the contribution of rare variants. Next-generation sequencing technologies will detect millions of novel rare variants, but these technologies have three defining features: identification of a large number of rare variants, a high proportion of sequence errors, and a large proportion of missing data. These features raise challenges for testing the association of rare variants with phenotypes of interest. In this study, we use a genome continuum model and functional principal components as a general principle for developing novel and powerful association analysis methods designed for resequencing data. We use simulations to calculate the type I error rates and the power of nine alternative statistics: two functional principal component analysis (FPCA)-based statistics, the multivariate principal component analysis (MPCA)-based statistic, the weighted sum statistic (WSS), the variable-threshold (VT) method, the generalized T², the collapsing method, the CMC method, and individual tests. We also examine the impact of sequence errors on their type I error rates. Finally, we apply the nine statistics to the published resequencing data set from ANGPTL4 in the Dallas Heart Study. We report that FPCA-based statistics have higher power to detect association of rare variants and a stronger ability to filter sequence errors than the other seven methods. |
DOI | 10.1101/gr.115998.110 |
Alternate Journal | Genome Res |
PubMed ID | 21521787 |
PubMed Central ID | PMC3129252 |
Grant List | P01 AR052915-01A1 / AR / NIAMS NIH HHS / United States; P50 AR054144 / AR / NIAMS NIH HHS / United States; 1R01AR057120-01 / AR / NIAMS NIH HHS / United States; R01 HL106034 / HL / NHLBI NIH HHS / United States; 1R01HL106034-01 / HL / NHLBI NIH HHS / United States; R01 AR057120 / AR / NIAMS NIH HHS / United States; P50 AR054144-01 / AR / NIAMS NIH HHS / United States; P01 AR052915 / AR / NIAMS NIH HHS / United States |
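
The abstract describes estimating type I error rates by simulation for several rare-variant statistics, one of which is the collapsing method. The following is a minimal sketch of that general idea, not the authors' implementation and not the FPCA-based statistics. It assumes the collapsing test is a Pearson chi-square test (1 degree of freedom) on carrier counts in cases versus controls. All function names, allele frequencies, and sample sizes are hypothetical choices made for illustration.

```python
import math
import random


def collapse(genotypes):
    """Return 1 for each individual carrying a minor allele at any site, else 0."""
    return [1 if any(g > 0 for g in row) else 0 for row in genotypes]


def collapsing_test(case_genotypes, control_genotypes):
    """Pearson chi-square (1 df) on carrier counts; returns (statistic, p_value)."""
    a = sum(collapse(case_genotypes))        # carriers among cases
    b = len(case_genotypes) - a              # non-carriers among cases
    c = sum(collapse(control_genotypes))     # carriers among controls
    d = len(control_genotypes) - c           # non-carriers among controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin makes the test undefined; treat it as no evidence.
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    return stat, math.erfc(math.sqrt(stat / 2.0))


def simulate_genotypes(n, freqs, rng):
    """Draw minor-allele counts (0, 1, or 2) per site, independently across sites."""
    return [[(rng.random() < f) + (rng.random() < f) for f in freqs]
            for _ in range(n)]


def type1_error(n_per_group=500, freqs=(0.005,) * 10, alpha=0.05,
                reps=500, seed=1):
    """Fraction of null replicates (same frequencies in both groups) with p < alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        cases = simulate_genotypes(n_per_group, freqs, rng)
        controls = simulate_genotypes(n_per_group, freqs, rng)
        _, p = collapsing_test(cases, controls)
        rejections += p < alpha
    return rejections / reps
```

Calling `type1_error()` returns the empirical rejection rate under the null hypothesis, which can be compared with the nominal level `alpha`. This sketch does not model sequence errors or missing data, both of which the abstract identifies as features of resequencing data.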