The HGSC was founded in 1996 under the leadership of Dr. Richard Gibbs and is a world leader in genomics. The fundamental interests of the HGSC are in advancing biology and genetics through improved genome technologies. As one of the three large-scale sequencing centers funded by the National Institutes of Health, the HGSC provides a unique opportunity to work at the cutting edge of genomic science in a state-of-the-art institution. Today, the HGSC employs approximately 200 staff and occupies more than 36,000 square feet on the 14th, 15th, and 16th floors of the Margaret M. and Albert B. Alkek Building. The HGSC is located on the southwest edge of downtown Houston, the fourth-largest city in the U.S., in the Texas Medical Center, the world's largest medical complex. The major activity of the HGSC is high-throughput DNA sequence generation and the accompanying analysis. The HGSC is also involved in developing the next generation of DNA sequencing and bioinformatics technologies that will allow greater scientific advances in the future.
This position with the Next-Generation Sequencing Informatics (NGSI) group requires a Bioinformatics Programmer with Linux/Unix command line and coding experience. As the HGSC’s Bioinformatics Core, NGSI manages the production, maintenance, and primary analysis of all genome sequencing data at the HGSC, including Illumina HiSeq X and NovaSeq informatics. NGSI also contributes to multiple clinical, Mendelian, and large cohort sequencing studies, specifically in the areas of structural variation and at-scale genomic data science. Under the direction of a senior manager, a qualified candidate will assist with running research informatics pipelines, managing data storage and delivery, and troubleshooting routine production issues.
- Support routine NGSI production tasks: storage management, pipeline troubleshooting, monitoring cluster health
- Work with NGSI production team to innovate improvements to NGS pipelines
- Support development and testing of software developed by NGSI used in running NGS pipelines
- Provide excellent customer service to other HGSC groups and outside collaborators through ticketing systems
- Bachelor's degree in Computer Science or Biological Sciences.
- Two years of experience in scientific programming with applications to biological research, or a Master's degree in Computer Science or Biological Sciences.
- Experience with Perl, HTML, CGI, relational databases, and SQL.
- Programming in a UNIX/LINUX environment.
- Experience with C++ and programming using UNIX/LINUX clusters.
- At least 1 year of hands-on experience working on Linux or Unix-based systems from the command line
- At least 1 year of programming experience with Python or Java
- Familiarity with running analyses on HPC clusters (Moab, PBS, and Torque preferred)
- NGS pipeline development
- NGS sequence analysis tools (e.g., BWA, Samtools, bedtools, bamUtils, Picard, GATK, vcftools, bcftools)
- Common genomics data formats (e.g., FASTQ, BAM, VCF, BED)
- Database and big data software (e.g., NoSQL, Hadoop, HBase)
- Statistical and visualization software (e.g., R, SAS)
- Demonstrated experience in software development or testing
- Structural variation detection methods