Additional pre-publication PacBio P5 chemistry data from our efforts towards a finished Drosophila pseudoobscura assembly is available on our public FTP site.
We hope to get as near to a finished assembly as possible and publish a manuscript describing those efforts soon after.
Please contact Stephen (Fringy) Richards if you wish to use this pre-publication data.
We have upgraded the D. pseudoobscura assembly using Pacific Biosciences long read data.
With 24× mapped coverage of PacBio long reads, we addressed 99% of gaps, closing 69% and improving 12% of all gaps in D. pseudoobscura.
For details, please see "Mind the Gap: Upgrading Genomes with Pacific Biosciences RS Long-Read Sequencing Technology".
The paper reporting this work, "Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution", has been published in Genome Research (January 2005). The annotated sequence is now available in FlyBase (to query and to browse) and in GenBank. Supplemental information, including the filtered BlastZ alignments and the set of protein alignments used for dN/dS analysis, is also available from BCM-HGSC using the BCM-HGSC data link in the sidebar.
We would like to thank everyone involved in this great project.
About the Project
The BCM-HGSC has sequenced the second Drosophila species, Drosophila pseudoobscura (Genome Research). The euchromatic portion of the genome (125 Mb) was sequenced to approximately seven-fold coverage using a whole genome shotgun approach and is being assembled using the BCM-HGSC's assembly engine, Atlas. Comparison of this species assembly with the recently completed sequence of Drosophila melanogaster is expected to offer important further insights into the biology of this historic model for experimental genetics.
Generation of high-quality paired-end sequencing reads from the following library sources:
2kb plasmid - 1.8 million reads completed
6kb plasmid - 770,000 reads
40kb fosmid - 45,000 reads
175kb BAC - 12,000 reads
Generation of assembled draft and finished high-quality sequence for 16 fosmid projects
Generation of 50,000 ESTs and 1,000 full-length cDNA sequences from Drosophila pseudoobscura cDNA libraries
Generation of a whole genome assembly using the Atlas genome assembler
Annotation of the Drosophila pseudoobscura genome, primarily by comparison to the finished Drosophila melanogaster genome
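As a rough illustration of what the paired-end libraries listed above contribute, the sketch below estimates the physical (clone) coverage each one provides across the 125 Mb euchromatic genome. It rests on assumptions that are not stated in this page: that the listed counts are individual reads (so two reads make one clone), that every read pairs successfully, and that the nominal insert sizes (2, 6, 40 and 175 kb) are exact. The function and variable names are hypothetical.

```python
# Rough clone-coverage estimate for the paired-end libraries listed above.
# ASSUMPTIONS (not from the source): read counts are individual reads,
# every read has a mate, and nominal insert sizes are exact.

GENOME_SIZE_BP = 125_000_000  # euchromatic genome size given in the text

# library name -> (nominal insert size in bp, number of reads)
LIBRARIES = {
    "2kb plasmid": (2_000, 1_800_000),
    "6kb plasmid": (6_000, 770_000),
    "40kb fosmid": (40_000, 45_000),
    "175kb BAC": (175_000, 12_000),
}


def clone_coverage(insert_bp, reads, genome_bp=GENOME_SIZE_BP):
    """Return the fold coverage of the genome spanned by clone inserts."""
    pairs = reads / 2  # assumed: two end reads per sequenced clone
    return pairs * insert_bp / genome_bp


def coverage_table():
    """Return a dict mapping each library to its estimated clone coverage."""
    return {
        name: clone_coverage(insert_bp, reads)
        for name, (insert_bp, reads) in LIBRARIES.items()
    }


if __name__ == "__main__":
    for name, cov in coverage_table().items():
        print(f"{name}: {cov:.1f}x clone coverage")
```

Under these assumptions the plasmid libraries span the genome roughly 14 to 18 times and the fosmid and BAC libraries roughly 7 to 8 times each. Real values would be lower wherever reads fail to pair or inserts deviate from their nominal size.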
Over the years, many people have asked about the correspondence between the fosmid end-sequence reads and the clones from the fosmid library. The clone ID for each fosmid clone that was end sequenced can be found in the trace archive records for those sequences.
Access to the Data