StocSum: stochastic summary statistics for whole genome sequencing studies.

Title: StocSum: stochastic summary statistics for whole genome sequencing studies.
Publication Type: Journal Article
Year of Publication: 2023
Authors: Wang N, Yu B, Jun G, Qi Q, Durazo-Arvizu RA, Lindström S, Morrison AC, Kaplan RC, Boerwinkle E, Chen H
Journal: bioRxiv
Date Published: 2023 Apr 07
Abstract

Genomic summary statistics, usually defined as single-variant test results from genome-wide association studies, have been widely used to advance the genetics field in a wide range of applications. Applications that involve multiple genetic variants also require their correlations or linkage disequilibrium (LD) information, often obtained from an external reference panel. In practice, it is usually difficult to find suitable external reference panels that represent the LD structure for underrepresented and admixed populations, or rare genetic variants from whole genome sequencing (WGS) studies, limiting the scope of applications for genomic summary statistics. Here we introduce StocSum, a novel reference-panel-free statistical framework for generating, managing, and analyzing stochastic summary statistics using random vectors. We develop various downstream applications using StocSum including single-variant tests, conditional association tests, gene-environment interaction tests, variant set tests, as well as meta-analysis and LD score regression tools. We demonstrate the accuracy and computational efficiency of StocSum using two cohorts from the Trans-Omics for Precision Medicine Program. StocSum will facilitate sharing and utilization of genomic summary statistics from WGS studies, especially for underrepresented and admixed populations.
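
The abstract does not spell out how random vectors can replace an external LD reference panel, so the following is only an illustrative sketch of one generic random-projection idea, not necessarily the construction used by StocSum. It assumes a centered genotype matrix G (samples by variants) and B random vectors R with independent standard normal entries, and forms U = GᵀR/√B. Because the expectation of RRᵀ/B is the identity, the inner product of rows j and k of U approximates the centered cross-product between variants j and k, which carries the LD information. All function names and the toy data are hypothetical.

```python
import math
import random


def stochastic_summary(genotypes, n_vectors, seed=0):
    """Return U (variants x n_vectors), where U = G^T R / sqrt(B).

    G is the column-centered genotype matrix and R holds iid N(0, 1) draws.
    This is an assumed, generic construction used for illustration only.
    """
    rng = random.Random(seed)
    n_samples = len(genotypes)
    n_variants = len(genotypes[0])
    # Center each variant (column) so cross-products behave like covariances.
    means = [sum(row[j] for row in genotypes) / n_samples for j in range(n_variants)]
    centered = [[row[j] - means[j] for j in range(n_variants)] for row in genotypes]
    scale = math.sqrt(n_vectors)
    U = [[0.0] * n_vectors for _ in range(n_variants)]
    for b in range(n_vectors):
        # One random vector with an entry per sample.
        r = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        for j in range(n_variants):
            U[j][b] = sum(centered[i][j] * r[i] for i in range(n_samples)) / scale
    return U


def approx_cross_product(U, j, k):
    """Estimate the centered cross-product of variants j and k from U alone."""
    return sum(a * b for a, b in zip(U[j], U[k]))


def exact_cross_product(genotypes, j, k):
    """Centered cross-product computed directly from individual-level data."""
    n = len(genotypes)
    mean_j = sum(row[j] for row in genotypes) / n
    mean_k = sum(row[k] for row in genotypes) / n
    return sum((row[j] - mean_j) * (row[k] - mean_k) for row in genotypes)


def toy_genotypes(n_samples=40, seed=1):
    """Hypothetical toy data: variant 1 is correlated with variant 0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        g0 = rng.choice([0, 1, 2])
        g1 = g0 if rng.random() < 0.8 else rng.choice([0, 1, 2])
        g2 = rng.choice([0, 1, 2])
        rows.append([g0, g1, g2])
    return rows
```

In this sketch only U would need to be shared, not the individual-level genotypes, and the approximation error shrinks roughly in proportion to 1/√B as the number of random vectors grows.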

DOI: 10.1101/2023.04.06.535886
Alternate Journal: bioRxiv
PubMed ID: 37066281
PubMed Central ID: PMC10104122
Grant List:
HHSN268201300005C / HL / NHLBI NIH HHS / United States
HHSN268201300004C / HL / NHLBI NIH HHS / United States
R01 HL120393 / HL / NHLBI NIH HHS / United States
HHSN268201300001C / HL / NHLBI NIH HHS / United States
75N92022D00002 / HL / NHLBI NIH HHS / United States
R01 HL092577 / HL / NHLBI NIH HHS / United States
U01 HL120393 / HL / NHLBI NIH HHS / United States
N01HC65236 / HL / NHLBI NIH HHS / United States
N01HC65235 / HL / NHLBI NIH HHS / United States
U54 HG003273 / HG / NHGRI NIH HHS / United States
N01HC65234 / HL / NHLBI NIH HHS / United States
75N92022D00004 / HL / NHLBI NIH HHS / United States
N01HC65233 / HL / NHLBI NIH HHS / United States
HHSN268201800001C / HL / NHLBI NIH HHS / United States
75N92022D00005 / HL / NHLBI NIH HHS / United States
UM1 HG008898 / HG / NHGRI NIH HHS / United States
R01 HL117626 / HL / NHLBI NIH HHS / United States
U24 HG008956 / HG / NHGRI NIH HHS / United States
75N92022D00001 / HL / NHLBI NIH HHS / United States
HHSN268201500015C / HL / NHLBI NIH HHS / United States
N01HC65237 / HL / NHLBI NIH HHS / United States
HHSN268201600033C / ES / NIEHS NIH HHS / United States
HHSN268201300003C / HG / NHGRI NIH HHS / United States
R01 HL145025 / HL / NHLBI NIH HHS / United States
75N92022D00003 / HL / NHLBI NIH HHS / United States