Dynamic Scan Procedure for Detecting Rare-Variant Association Regions in Whole-Genome Sequencing Studies.

Title: Dynamic Scan Procedure for Detecting Rare-Variant Association Regions in Whole-Genome Sequencing Studies.
Publication Type: Journal Article
Year of Publication: 2019
Authors: Li, Z, Li, X, Liu, Y, Shen, J, Chen, H, Zhou, H, Morrison, AC, Boerwinkle, E, Lin, X
Journal: Am J Hum Genet
Volume: 104
Issue: 5
Pagination: 802-814
Date Published: 2019 May 02
ISSN: 1537-6605
Keywords: Algorithms, Computational Biology, Genetic Variation, Genome, Human, Genome-Wide Association Study, Humans, Linkage Disequilibrium, Models, Genetic, Whole Genome Sequencing
Abstract

Whole-genome sequencing (WGS) studies are widely conducted to identify rare variants associated with human diseases and disease-related traits. Classical single-marker association analyses have limited power for rare variants, so variant-set-based analyses are commonly used instead. However, existing variant-set-based approaches require the genetic regions for analysis to be pre-specified; hence, they are not directly applicable to WGS data because of the large number of intergenic and intronic regions, which contain a massive number of non-coding variants. The commonly used sliding-window method requires the pre-specification of fixed window sizes, which are often unknown a priori, are difficult to specify in practice, and are limiting because the sizes of genetic-association regions are likely to vary across the genome and across phenotypes. We propose a computationally efficient and dynamic scan-statistic method (Scan the Genome [SCANG]) for analyzing WGS data; this method flexibly detects the sizes and the locations of rare-variant association regions without requiring a fixed window size to be specified in advance. The proposed method controls the genome-wise type I error rate and accounts for the linkage disequilibrium among genetic variants. It allows the detected sizes of rare-variant association regions to vary across the genome. Through extensive simulation studies that consider a wide variety of scenarios, we show that SCANG substantially outperforms several alternative methods for detecting rare-variant associations while controlling the genome-wise type I error rate. We illustrate SCANG by analyzing the WGS lipids data from the Atherosclerosis Risk in Communities (ARIC) study.

DOI: 10.1016/j.ajhg.2019.03.002
Alternate Journal: Am J Hum Genet
PubMed ID: 30982610
PubMed Central ID: PMC6507043
Grant List:
RC2 HL102419 / HL / NHLBI NIH HHS / United States
R35 CA197449 / CA / NCI NIH HHS / United States
U19 CA203654 / CA / NCI NIH HHS / United States
R01 HL113338 / HL / NHLBI NIH HHS / United States
/ RA / ARRA NIH HHS / United States
U54 HG003273 / HG / NHGRI NIH HHS / United States
HHSN268201700002C / HL / NHLBI NIH HHS / United States
HHSN268201700001I / HL / NHLBI NIH HHS / United States
HHSN268201700004I / HL / NHLBI NIH HHS / United States
HHSN268201700003I / HL / NHLBI NIH HHS / United States
HHSN268201700005C / HL / NHLBI NIH HHS / United States
HHSN268201700001C / HL / NHLBI NIH HHS / United States
HHSN268201700003C / HL / NHLBI NIH HHS / United States
HHSN268201700004C / HL / NHLBI NIH HHS / United States
P01 CA134294 / CA / NCI NIH HHS / United States
HHSN268201700002I / HL / NHLBI NIH HHS / United States
HHSN268201700005I / HL / NHLBI NIH HHS / United States
U01 HG009088 / HG / NHGRI NIH HHS / United States
UM1 HG008898 / HG / NHGRI NIH HHS / United States
