SVachra: a tool to identify genomic structural variation in mate pair sequencing data containing inward and outward facing reads.

Title: SVachra: a tool to identify genomic structural variation in mate pair sequencing data containing inward and outward facing reads.
Publication Type: Journal Article
Year of Publication: 2017
Authors: Hampton, OA, English, AC, Wang, M, Salerno, WJ, Liu, Y, Muzny, DM, Han, Y, Wheeler, DA, Worley, KC, Lupski, JR, Gibbs, RA
Journal: BMC Genomics
Volume: 18
Issue: Suppl 6
Pagination: 691
Date Published: 2017 Oct 03
ISSN: 1471-2164
Abstract

BACKGROUND: Characterization of genomic structural variation (SV) is essential to expanding the research and clinical applications of genome sequencing. Reliance upon short DNA fragment paired end sequencing has yielded a wealth of single nucleotide variants and small insertions-deletions contained within individual sequencing reads, at the cost of limited SV detection. Multi-kilobase DNA fragment mate pair sequencing has helped fill this gap in SV detection, but has introduced new analytic challenges that require SV detection tools specifically designed for mate pair sequencing data. Here, we introduce SVachra (Structural Variation Assessment of CHRomosomal Aberrations), a breakpoint calling program that identifies large insertions-deletions, inversions, and inter- and intra-chromosomal translocations using both the inward and outward facing read types generated by mate pair sequencing.

RESULTS: We demonstrate SVachra's utility by executing the program on large-insert (Illumina Nextera) mate pair sequencing data from the personal genome of a single subject (HS1011). An additional long-read (Pacific BioSciences RSII) data set was also generated to validate SV calls from SVachra and from the other SV calling programs used for comparison. SVachra exhibited the highest validation rate and reported the widest distribution of SV types and size ranges when compared to other SV callers.

CONCLUSIONS: SVachra is a highly specific breakpoint calling program that exhibits a less biased SV detection methodology than other callers.
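
ILLUSTRATION: The abstract distinguishes inward and outward facing read pairs but does not say how a pair is assigned to either class. The sketch below shows one common way to do so from alignment coordinates and strands. It is not SVachra's implementation: the `Read` type, the `pair_orientation` function, and the category labels are hypothetical names chosen for this example, and the rule assumes each read is described by its leftmost aligned position and its strand.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal alignment record (not a SVachra data structure)."""
    chrom: str
    pos: int           # leftmost aligned coordinate on the reference
    is_reverse: bool   # True if the read aligns to the reverse strand


def pair_orientation(a: Read, b: Read) -> str:
    """Classify a read pair by the relative orientation of its two reads."""
    if a.chrom != b.chrom:
        # Reads on different chromosomes cannot face each other.
        return "inter-chromosomal"
    # Order the reads by reference position so "left" is upstream.
    left, right = sorted((a, b), key=lambda r: r.pos)
    if left.is_reverse == right.is_reverse:
        # Both reads on the same strand.
        return "same-strand"
    if not left.is_reverse:
        # Forward read upstream, reverse read downstream: reads point toward each other.
        return "inward"
    # Reverse read upstream, forward read downstream: reads point away from each other.
    return "outward"
```

For example, `pair_orientation(Read("chr1", 100, False), Read("chr1", 5000, True))` classifies the pair as inward facing, while swapping the two strands classifies it as outward facing.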

DOI: 10.1186/s12864-017-4021-y
Alternate Journal: BMC Genomics
PubMed ID: 28984202
PubMed Central ID: PMC5629590
Grant List: U54 HG003273 / HG / NHGRI NIH HHS / United States