SV-STAT accurately detects structural variation via alignment to reference-based assemblies.

Title: SV-STAT accurately detects structural variation via alignment to reference-based assemblies.
Publication Type: Journal Article
Year of Publication: 2016
Authors: Davis, CF, Ritter, DI, Wheeler, DA, Wang, H, Ding, Y, Dugan, SP, Bainbridge, MN, Muzny, DM, Rao, PH, Man, T-K, Plon, SE, Gibbs, RA, Lau, CC
Journal: Source Code Biol Med
Date Published: 2016

BACKGROUND: Genomic deletions, inversions, and other rearrangements known collectively as structural variations (SVs) are implicated in many human disorders. Technologies for sequencing DNA provide a potentially rich source of information in which to detect breakpoints of structural variations at base-pair resolution. However, accurate prediction of SVs remains challenging, and existing informatics tools predict rearrangements with significant rates of false positives or negatives.

RESULTS: To address this challenge, we developed 'Structural Variation detection by STAck and Tail' (SV-STAT) which implements a novel scoring metric. The software uses this statistic to quantify evidence for structural variation in genomic regions suspected of harboring rearrangements. To demonstrate SV-STAT, we used targeted and genome-wide approaches. First, we applied a custom capture array followed by Roche/454 and SV-STAT to three pediatric B-lineage acute lymphoblastic leukemias, identifying five structural variations joining known and novel breakpoint regions. Next, we detected SVs genome-wide in paired-end Illumina data collected from additional tumor samples. SV-STAT showed predictive accuracy as high as or higher than leading alternatives. The software is freely available under the terms of the GNU General Public License version 3.

CONCLUSIONS: SV-STAT works across multiple sequencing chemistries, paired and single-end technologies, targeted or whole-genome strategies, and it complements existing SV-detection software. The method is a significant advance towards accurate detection and genotyping of genomic rearrangements from DNA sequencing data.

Alternate Journal: Source Code Biol Med
PubMed ID: 27330550
PubMed Central ID: PMC4913042
Grant List: K12 GM084897 / GM / NIGMS NIH HHS / United States
R01 CA138836 / CA / NCI NIH HHS / United States
T15 LM007093 / LM / NLM NIH HHS / United States