Everything is bigger in Texas: Pan-Structural Variation hackathon in the Cloud!

The “Structural Variant Crying Club” is pleased to announce the next round of Structural Variants in the Cloud Hackathon teaming up with Pangenomes! The hackathon is now scheduled for Oct. 11-13, 2020, with a symposium on Oct. 10. 

  • Precision medicine
  • Incorporation of population data to a reference graph
  • Mapping structural variants to public databases
  • Building genome graphs representing population SVs
  • Calculating the heritability of different types of structural variants
  • CNV effect on isoform expression
  • Incorporation of public annotation databases with graphs
  • Assembly accuracy for metagenomics
  • Develop a pipeline for submission of non-graph associated read assemblies to a public sequence database
  • Quality assessment in large cohorts
  • Develop efficient query mechanisms from graph genomes
  • Assessing the benefits of graph genomes in clinical analysis

We're specifically looking for folks with experience working with structural variants, complex disease, precision medicine, and similar genomic analyses. If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for large-scale genomic analyses from high-throughput experiments. The event is open to anyone selected for the hackathon and willing to travel to BCM (see below).

STAY TUNED for a special symposium the night before (Oct. 10) to celebrate and bring together world-leading experts on SVs and graph genomes!


Participants will be organized into five to eight teams of five to six individuals each. These teams will build pipelines to analyze large datasets within a cloud infrastructure. The projects will be unveiled before the hackathon starts and will build on previous NCBI hackathons and community projects.


After a brief organizational session, teams will spend three days addressing a challenging set of scientific problems related to a group of datasets.  Participants will analyze and combine datasets in order to work on these problems. Throughout the three days, we will come together to discuss progress on each of the topics, bioinformatics best practices, coding styles, etc.


Datasets will come from public repositories. The focus will be on a number of trios produced by long-read sequencing, which will serve as a base graph, and on short-read datasets in the Sequence Read Archive that have been ported to cloud infrastructure, as well as contigs derived from these datasets.


All pipelines and other scripts, software, and programs generated in this hackathon will be added to a public GitHub repository designed for that purpose (github.com/NCBI-Hackathons).

Manuscripts describing the design and usage of the software tools constructed by each team may be submitted to an appropriate journal such as the F1000Research hackathons channel, BMC Bioinformatics, GigaScience, Genome Research, or PLOS Computational Biology.


Participants will be selected based on the experience and motivation they provide on the form.

Please confirm your acceptance only if it is highly likely that you can attend, since confirming and then not attending prevents other data scientists from attending this event. Please include a monitored email address in case there are follow-up questions.

Note: Participants will need to bring their own laptop to this program. A working knowledge of scripting (e.g., Shell, Python, R) is useful but not necessary to be successful in this event. Experience with higher-level scripting or programming languages may also be useful.

Applicants must be willing to commit to all three days of the event.

It’s unlikely that financial support for travel, lodging, or meals will be available for this event. Reduced rates will be available at nearby hotels, and details will be sent to registrants. Also, note that the hackathon may extend into the evening hours each day. Please make any necessary arrangements to accommodate this possibility.

There will be no registration fee or cost associated with attending this event.

For more information, or with any questions, please contact Ben Busby or Fritz Sedlazeck.

Application for Structural Variation in the Cloud Hackathon, Oct 11-13, 2020

