Sheep Genome Project

Image: sheep from a research flock at the US Sheep Experiment Station near Dubois, Idaho (Image Number K4166-5).

About the Project

The BCM-HGSC contributed to the reference genome assemblies of the sheep (Ovis aries). The latest genome assembly is a de novo assembly of a Rambouillet ewe, built from Pacific Biosciences long-read sequence. The 70-fold coverage data was generated in 2015, followed by Hi-C data in 2016. This project is designed to support planned functional annotation efforts (FAANG) that are similar to earlier ENCODE annotation projects for human and model organisms. The ewe that served as donor for this reference genome also provided samples for the annotation efforts. The HGSC is producing transcript sequences using Illumina and Pacific Biosciences data.

The initial sheep assembly (Oar_v3.1), the reference genome of the Texel breed, was published by the International Sheep Genomics Consortium. BCM-HGSC contributed sequence data from a male Texel, generated with the 454 sequencing technology, which was combined with data from a female Texel to produce the reference genome sequence. A number of sheep breeds and wild sheep were also sequenced with the Illumina technology to identify variants and develop a genotyping resource.

We produced an improved Texel sheep assembly (Oar_v4.0) by generating 20-fold sequence coverage of the male Texel genome with the Pacific Biosciences long-read sequencing technology and applying the PBJelly software to produce a more contiguous genome reference. This project was funded by the USDA.


This work is funded by the International Sheep Genomics Consortium and the United States Department of Agriculture National Institute of Food and Agriculture (2013-67015-21228, 2013-67015-21372, 2017-67016-26301).

Genome Assemblies

Access to genome assemblies is provided by NCBI and CSIRO.

  • 2017: Rambouillet genome assembly (available from the BCM FTP site until it is available from NCBI).

  • 2015: Oar_v4.0 (available from NCBI) incorporated Pacific Biosciences long-read data to fill gaps in the Oar_v3.1 assembly.

  • 2012: Oar_v3.1 (available from NCBI and CSIRO) incorporated additional Illumina data, MeDIP-seq data, BAC data, and 454 data to fill gaps in the Oar_v2.0 assembly.

  • 2011: Oar_v2.0 (available from CSIRO) was an assembly of Illumina data from the ewe, followed by gap-filling by BGI using Illumina data from the ewe and the ram.

  • 2010: Oar_v1.0 (available from NCBI and CSIRO) was an assembly of the 454 data, guided by the bovine genome.

Reference Genome Sequences

The Rambouillet data is available pre-publication for use by the research community. We intend to publish an analysis of the genome assembly and the FAANG annotation assays, including those available here.

Breed and Species Sequences

Additional Resources