Generating Clinical Reports from Genomic Data on the Cloud-based Neptune Platform

Title: Generating Clinical Reports from Genomic Data on the Cloud-based Neptune Platform
Publication Type: Poster Session Abstract
Year of Publication: 2017
Authors: Venner, E, Leduc, MS, Bainbridge, MN, Wu, T-J, Kovar, C, Chiang, T, Murugan, M, Salerno, WJ, Muzny, D, Gibbs, RA
Date Published: 03/2017
Publisher: ACMG Annual Clinical Genetics Meeting 2017
Place Published: Phoenix
Other Numbers: Abstract Number: (368)

High-throughput clinical analysis of DNA capture panels demands automated processing to provide timely and cost-efficient reporting. To address this need, we developed Neptune, an automated analytical platform for signing out and delivering clinical reports.

Initial data intake occurs in a HIPAA-compliant environment on DNAnexus, and samples are de-identified before moving into the CLIA lab. After analysis with the Human Genome Sequencing Center’s Mercury Pipeline, Neptune’s custom annotation software identifies variants of putative clinical relevance for manual review and possible addition to a “VIP” database of clinically relevant variation. The VIP database draws on both public sources (ClinVar, literature review) and internal data sets accessed via Anton, the HGSC’s Hadoop-based data store. It currently houses 20,872 SNPs and 3,946 indels, and contains a curated set of copy number variants (CNVs) annotated with internal frequency data.
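The lookup step described above can be illustrated with a minimal sketch. It is not Neptune's actual code: the `Variant` record, the `triage` function, and the dictionary standing in for the VIP database are all hypothetical, and the sketch assumes variants are matched on exact chromosome, position, reference allele, and alternate allele.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Hypothetical minimal variant key; a real pipeline would carry more fields.
    chrom: str
    pos: int
    ref: str
    alt: str


def triage(sample_variants, vip):
    """Split a sample's variants into already-curated and needs-review lists.

    `vip` is an assumed stand-in for the VIP database: a dict mapping a
    Variant to its curated classification string.
    """
    curated, needs_review = [], []
    for variant in sample_variants:
        if variant in vip:
            curated.append((variant, vip[variant]))
        else:
            # Not yet curated: queue for manual review by a clinical geneticist.
            needs_review.append(variant)
    return curated, needs_review
```

In this sketch, every variant without a VIP entry is queued for review; the abstract indicates that Neptune first narrows candidates to those of putative clinical relevance, a filtering step not modeled here.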

Using Neptune’s manual review interface, a clinical geneticist reviews these variants and updates the VIP database accordingly. Once all variants have been categorized, Neptune extracts the reportable, pathogenic variants using the VIP set and outputs an automated clinical pre-report populated with prioritized variants (or a negative report if no relevant variants are found), descriptive text, and coverage statistics produced by the HGSC’s ExCiD software. Clinical reports integrate SNVs, indels, and CNVs. Neptune’s modular design allows it to be deployed locally, in the cloud, or as a hybrid.
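The pre-report step can be sketched in the same spirit. The function name, the report fields, and the priority ranking below are illustrative assumptions rather than Neptune's real schema; in particular, treating “likely pathogenic” as reportable and ranking it below “pathogenic” is an assumption made only to show prioritization.

```python
# Assumed ranking of reportable classifications (lower rank is listed first).
REPORTABLE_RANK = {"pathogenic": 0, "likely pathogenic": 1}


def build_pre_report(sample_id, classified_variants, coverage_stats):
    """Build a pre-report dict from (variant_label, classification) pairs.

    `coverage_stats` is passed through unchanged; in the abstract these
    statistics come from ExCiD, which is not modeled here.
    """
    reportable = [
        (label, cls) for label, cls in classified_variants
        if cls in REPORTABLE_RANK
    ]
    # sorted() is stable, so ties keep their input order.
    reportable = sorted(reportable, key=lambda item: REPORTABLE_RANK[item[1]])
    return {
        "sample": sample_id,
        # A negative report is issued when nothing reportable is found.
        "result": "positive" if reportable else "negative",
        "variants": reportable,
        "coverage": coverage_stats,
    }
```

Returning the same structure for positive and negative results keeps downstream formatting uniform; only the `result` field and the variant list differ.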

Early applications include reporting for the National Institutes of Health eMERGE network, in which more than 12,500 samples will be processed on a panel of 109 genes in less than three years, as well as the Right10k pharmacogenomics project in collaboration with the Mayo Clinic.

Citation Key: 3356