Title | Efficient gene-environment interaction tests for large biobank-scale sequencing studies. |
Publication Type | Journal Article |
Year of Publication | 2020 |
Authors | Wang, X, Lim, E, Liu, C-T, Sung, YJu, Rao, DC, Morrison, AC, Boerwinkle, E, Manning, AK, Chen, H |
Journal | Genet Epidemiol |
Volume | 44 |
Issue | 8 |
Pagination | 908-923 |
Date Published | 2020 Nov |
ISSN | 1098-2272 |
Keywords | Biological Specimen Banks; Body Mass Index; Computer Simulation; Exome; Exome Sequencing; Female; Gene-Environment Interaction; Humans; Linear Models; Male; Models, Genetic; Obesity; Phenotype; Quantitative Trait, Heritable; Time Factors |
Abstract | Complex human diseases are affected by genetic and environmental risk factors and their interactions. Gene-environment interaction (GEI) tests for aggregate genetic variant sets have been developed in recent years. However, existing statistical methods become rate limiting for large biobank-scale sequencing studies with correlated samples. We propose efficient Mixed-model Association tests for GEne-Environment interactions (MAGEE), for testing GEI between an aggregate variant set and environmental exposures on quantitative and binary traits in large-scale sequencing studies with related individuals. Joint tests for the aggregate genetic main effects and GEI effects are also developed. A null generalized linear mixed model adjusting for covariates but without any genetic effects is fit only once in a whole genome GEI analysis, thereby vastly reducing the overall computational burden. Score tests for variant sets are performed as a combination of genetic burden and variance component tests by accounting for the genetic main effects using matrix projections. The computational complexity is dramatically reduced in a whole genome GEI analysis, which makes MAGEE scalable to hundreds of thousands of individuals. We applied MAGEE to the exome sequencing data of 41,144 related individuals from the UK Biobank, and the analysis of 18,970 protein coding genes finished within 10.4 CPU hours. |
DOI | 10.1002/gepi.22351 |
Alternate Journal | Genet Epidemiol |
PubMed ID | 32864785 |
PubMed Central ID | PMC7754763 |
Grant List | R00 HL130593 / HL / NHLBI NIH HHS / United States; R01 HL145025 / HL / NHLBI NIH HHS / United States |