Everything is bigger in Texas: Pan-Structural Variation hackathon in the Cloud!

The “Structural Variant Crying Club” is pleased to announce the 6th round of Structural Variants in the Cloud Hackathon!

On August 28-31, DNAnexus will help run a hybrid bioinformatics hackathon in Houston, Texas, hosted by the Baylor College of Medicine. Participants can join in person or online over Zoom. Potential topics include:

Mapping structural variants to public databases
Mendelian disease discovery
Identification of somatic and mosaic variants (tumor vs normal, within tissue)
Assembly accuracy for metagenomics
Analysis of long-read RNA and comparison of variant calling
Respiratory virus variations
Structural variants affecting agricultural production
Integrating genome graphs with phenotype networks

We're specifically looking for folks who have experience in working with structural variants, complex disease, precision medicine, and similar genomic analysis. If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for large scale genomic analyses from high-throughput experiments. The event is open to anyone selected for the hackathon (see below).

Topics

Participants will be organized into five to eight teams of five to six individuals each. These teams will build pipelines to analyze large datasets within a cloud infrastructure. The projects will be announced before the hackathon starts and will build on previous NCBI-style hackathons and community projects.

Organization

After a brief organizational session, teams will spend four days addressing a challenging set of scientific problems related to a group of datasets. Participants will analyze and combine datasets in order to work on these problems. Throughout the four days, we will come together to discuss progress on each of the topics, bioinformatics best practices, coding styles, etc.

The hackathon ends with final presentations by all groups on the last day.

Datasets

Datasets will come from public repositories. The focus will be on a number of trios produced by long-read sequencing, which will serve as a base graph, and on short-read datasets in the Sequence Read Archive that have been ported to cloud infrastructure, as well as contigs derived from both.

Products

All pipelines and other scripts, software, and programs generated in this hackathon will be added to a public GitHub repository designed for that purpose (github.com/collaborativebioinformatics).

Manuscripts describing the design and usage of the software tools constructed by each team may be submitted to an appropriate journal such as the F1000Research hackathons channel, BMC Bioinformatics, GigaScience, Genome Research or PLoS Computational Biology.

The outcomes of the past Hackathons have been published here:

Application

Initial applications are due Aug. 21, 2024 by 3 p.m. CDT. Participants will be selected based on the experience and motivation they provide on the form.

If you confirm, please make sure it is highly likely you can attend, as confirming and not attending prevents other data scientists from attending this event. Please include a monitored email address, in case there are follow-up questions.

Note: Participants will need to bring their own laptop to this program. A working knowledge of scripting (e.g., Shell, Python, R) is useful but not necessary to be successful in this event. Employment of higher level scripting or programming languages may also be useful. Participants will also have access to cloud computing infrastructure.

Applicants must be willing to commit to all four days of the event.

There will be no registration fee or cost associated with attending this event.

For more information or with any questions, please contact Ben Busby or Fritz Sedlazeck.

Application Form

First Name *

Last Name *

Email address *

Your institution *

How do you plan to participate? *

In person at Texas Medical Center in Houston

Zoom

No preference

Your Time Zone

Do you have a working knowledge of the terminal/command line interface? *

Unfamiliar/no experience

Novice/basic experience

Analyst/moderate experience

Developer/expert

Your DNAnexus account ID

Your GitHub ID

Please tell us in 2-3 sentences what you want to learn

Please tell us why you want to participate:
