Everything is bigger in Texas: Pan-Structural Variation hackathon in the Cloud!

The “Structural Variant Crying Club” is pleased to announce the 4th round of Structural Variants in the Cloud Hackathon!

From October 9-12, 2022, DNAnexus will help run a bioinformatics hackathon in Houston, Texas, hosted by the Baylor College of Medicine. Folks from NCBI will also be around to help out! This hybrid event will include opportunities for participation in person or online over Zoom. Potential topics include:

  • Mapping structural variants to public databases
  • Calculating the heritability of different types of structural variants
  • CNV effect on isoform expression
  • Assembly accuracy for metagenomics
  • Quality assessment in large cohorts
  • Structural variants affecting agricultural production
  • SARS-CoV-2 variations

We're specifically looking for folks who have experience working with structural variants, complex disease, precision medicine, and similar genomic analysis. If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for large-scale genomic analyses from high-throughput experiments. The event is open to anyone selected for the hackathon (see below).


Participants will be organized into five to eight teams of five to six individuals each. These teams will build pipelines to analyze large datasets within a cloud infrastructure. The projects will be unveiled before the hackathon starts, and will build on previous NCBI-style hackathons and community projects.


After a brief organizational session, teams will spend four days addressing a challenging set of scientific problems related to a group of datasets. Participants will analyze and combine datasets in order to work on these problems. Throughout the four days, we will come together to discuss progress on each of the topics, bioinformatics best practices, coding styles, etc.

The hackathon ends with final presentations by all groups on the last day. The members of the best two groups will receive prizes from Oxford Nanopore and Pacific Biosciences.


Datasets will come from public repositories, with a focus on a number of trios produced by long-read sequencing as a base graph, short-read datasets in the Sequence Read Archive that have been ported to cloud infrastructure, and contigs derived from the above.


All pipelines and other scripts, software, and programs generated in this hackathon will be added to a public GitHub repository designed for that purpose (github.com/collaborativebioinformatics).

Manuscripts describing the design and usage of the software tools constructed by each team may be submitted to an appropriate journal such as the F1000Research hackathons channel, BMC Bioinformatics, GigaScience, Genome Research, or PLOS Computational Biology.

The outcomes of the past Hackathons have been published here:


Initial applications are due Sept. 23, 2022 by 3 p.m. CDT. Participants will be selected based on the experience and motivation they provide on the form.

The first round of accepted applicants will be notified on Sept. 30 and will have until noon CDT on Oct. 4 to confirm their participation. Please confirm only if it is highly likely that you can attend, as confirming and not attending prevents other data scientists from attending this event. Please include a monitored email address in case there are follow-up questions.

Note: Participants will need to bring their own laptop to this program. A working knowledge of scripting (e.g., Shell, Python, R) is useful but not necessary to be successful in this event. Experience with higher-level scripting or programming languages may also be useful. Participants will also have access to cloud computing infrastructure.

Applicants must be willing to commit to all four days of the event.

There will be no registration fee or cost associated with attending this event.

For more information, or with any questions, please contact Ben Busby / Fritz Sedlazeck.

Application form

Workshop participants will be selected based on registration responses.