Drosophila pseudoobscura Genome Project

Drosophila pseudoobscura

Photo, Dr. Stephen Schaeffer, Penn State University


December 2014

Additional pre-publication PacBio P5 chemistry data from our efforts towards a finished Drosophila pseudoobscura genome is available on our public FTP site.

We hope to get as near to a finished assembly as possible and publish a manuscript describing those efforts soon after.

Please contact Stephen (Fringy) Richards if you wish to use this pre-publication data.

December 2012

We have upgraded the D. pseudoobscura assembly using Pacific Biosciences long read data.

With 24× mapped coverage of PacBio long reads, we addressed 99% of the gaps in the D. pseudoobscura assembly, closing 69% and improving 12% of all gaps.

For details, please see "Mind the Gap: Upgrading Genomes with Pacific Biosciences RS Long-Read Sequencing Technology".

December 2004

A paper reporting this work, "Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution", has been published in Genome Research (January 2005). The annotated sequence is now available in FlyBase (to query and to browse) and in GenBank. Supplemental information, including the filtered BlastZ alignments and the set of protein alignments used for dN/dS analysis, is also available from the BCM-HGSC via the BCM-HGSC data link in the sidebar.

We would like to thank everyone involved in this great project.

About the Project

The BCM-HGSC has sequenced a second Drosophila species, Drosophila pseudoobscura (Genome Research). The euchromatic portion of the genome (125 Mb) was sequenced to approximately seven-fold coverage using a whole genome shotgun approach and is being assembled using the BCM-HGSC's assembly engine, Atlas. Comparison of this assembly with the recently completed sequence of Drosophila melanogaster is expected to offer important further insights into the biology of this historic model for experimental genetics.

Project Goals

  • Generation of high quality paired-end sequencing reads from the following library sources:

    • 2kb plasmid - 1.8 million reads completed

    • 6kb plasmid - 770,000 reads

    • 40kb fosmid - 45,000 reads

    • 175kb BAC - 12,000 reads

  • Generation of assembled draft and finished high-quality sequence for 16 fosmid projects

  • Generation of 50,000 ESTs and 1,000 full-length cDNA sequences from Drosophila pseudoobscura cDNA libraries

  • Generation of a whole genome assembly using the Atlas genome assembler

  • Annotation of the Drosophila pseudoobscura genome, primarily by comparison to the finished Drosophila melanogaster genome

  • Over the years, many people have asked about the correspondence between the fosmid end-sequence reads and the clones from the fosmid library. The clone ID for each fosmid clone that was end sequenced can be found in the trace archive records for its sequences.
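The read targets in the list above can be related to the approximately seven-fold coverage quoted in About the Project through the usual relation coverage = (number of reads × mean read length) / genome size. The sketch below is only a back-of-the-envelope illustration: it assumes that every listed read contributes equally to the coverage figure, which the project description does not state, and the function names are hypothetical.

```python
# Euchromatic genome size from the project description (125 Mb).
GENOME_SIZE_BP = 125_000_000

# Read targets per library, copied from the Project Goals list.
READ_TARGETS = {
    "2kb plasmid": 1_800_000,
    "6kb plasmid": 770_000,
    "40kb fosmid": 45_000,
    "175kb BAC": 12_000,
}


def sequence_coverage(total_reads, mean_read_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Fold coverage = reads * mean usable read length / genome size."""
    return total_reads * mean_read_length_bp / genome_size_bp


def implied_read_length(total_reads, coverage, genome_size_bp=GENOME_SIZE_BP):
    """Mean usable read length (bp) needed to reach a given fold coverage."""
    return coverage * genome_size_bp / total_reads


total_reads = sum(READ_TARGETS.values())

# Assumption: all listed reads count towards the ~7x figure.
mean_len = implied_read_length(total_reads, coverage=7.0)

print(f"Total reads across libraries: {total_reads:,}")
print(f"Implied mean usable read length at 7x: {mean_len:.0f} bp")
```

Under that assumption, the listed targets would correspond to a few hundred usable bases per read; a different mean read length can be passed to `sequence_coverage` to see how the fold coverage changes.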
