Honey Bee Genome Project

Image source: Louise Docker from Sydney, Australia (Lift Off- Best Viewed Large) [CC-BY-2.0], via Wikimedia Commons

About the Project

The BCM-HGSC sequenced the genome of the honey bee, Apis mellifera. The latest version, the Amel_4.5 assembly, is available in GenBank. The version 4.0 assembly was published in October 2006.

The honey bee is important in the agricultural community as a producer of honey and as a facilitator of pollination. It is a model organism for studying the following human health issues: immunity, allergic reaction, antibiotic resistance, development, mental health, longevity and diseases of the X chromosome. In addition, biologists are interested in the honey bee's social organization and behavioral traits.

Sequencing of the honey bee is jointly funded by the National Human Genome Research Institute (NHGRI) and the U.S. Department of Agriculture (USDA). Multiple drones from the same queen (strain DH4) were obtained from Danny Weaver of B. Weaver Apiaries.

All libraries were made from DNA isolated from these drones. The honey bee BAC library (CHORI-224) was prepared by Pieter de Jong and Kazutoyo Osoegawa at the Children's Hospital Oakland Research Institute.

Publications include a main paper in Nature and up to forty companion papers in Genome Research and Insect Molecular Biology. The BCM-HGSC intends to publish the analysis of the improved genome assembly.

Genomic Resources

The BCM-HGSC intends to publish an analysis of this genome. Use is covered by the Toronto, Ft. Lauderdale and Bermuda agreements (see conditions for use).

The published genome Amel_4.0 is also available.

Comparisons of cDNA sequences to the genome assemblies are available to evaluate assembly completeness and correctness by using the FTP Data link in the sidebar. See the alignments for the different assemblies (1.0, 1.1, 1.2, 2.0, 3.0).

Traces are available through the NCBI Trace Archive link and can be searched using NCBI MegaBLAST with a same-species or cross-species query.

Additional sequence data generated using the Roche 454 platform can be downloaded from the 454 Sequence Data site or searched using BLAST.

Honey Bee SNP Data

Africanized honey bee sequences were aligned to the genome assembly to identify single nucleotide polymorphisms. The sequence reads are available from the NCBI Trace Archive link in the sidebar. The SNP analysis results are also available for download using the FTP Data link.

Single nucleotide polymorphisms were also identified in the haplotypes present in the genome assembly.
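
The idea of finding SNPs by aligning reads to the assembly can be illustrated with a toy sketch. This is not the project's pipeline: the function name, the example sequences and the gapless-alignment assumption are all hypothetical, and real SNP calling also weighs base quality, read depth and alignment gaps.

```python
def find_mismatches(reference, read, offset):
    """Report candidate SNPs from one gapless read-to-reference alignment.

    Hypothetical helper for illustration only. `offset` is the 0-based
    reference position where the first base of `read` aligns.
    Returns a list of (reference_position, reference_base, read_base).
    """
    candidates = []
    for i, read_base in enumerate(read):
        ref_pos = offset + i
        if ref_pos >= len(reference):
            break  # the read runs past the end of the reference
        ref_base = reference[ref_pos]
        # Skip ambiguous bases such as N; only compare unambiguous calls.
        if read_base in "ACGT" and ref_base in "ACGT" and read_base != ref_base:
            candidates.append((ref_pos, ref_base, read_base))
    return candidates


# Made-up sequences: the read differs from the reference at position 7.
reference = "ATTTAGCATTAAGC"
read = "AGCTTTAA"
candidates = find_mismatches(reference, read, offset=4)
print(candidates)
```

In this made-up example the only difference is an A in the reference against a T in the read at reference position 7.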


Date Released | Release Name | Coverage | Comments
2011 Feb | Amel_4.5 | 8x | Scaffolding and gap filling using ABI SOLiD and Roche 454 sequence data.
2006 Mar 10 | Amel_4.0 | 7.5x | Mapped to chromosomes using improved genetic map.
2005 May 1 | Amel_3.0 | 7.5x | Created by adding repetitive reads generated by shotgun sequencing to the previous whole genome shotgun (WGS) reads. In addition, small contigs from contaminated sequences and unmerged contigs from the second potential haplotype were excluded from this assembly.
2005 Jan 20 | Amel_2.0 | 7.5x | A new assembly created by adding reads generated by shotgun sequencing of purified AT-rich genomic DNA, Fosmid clone ends, and BAC reads to the previous whole genome shotgun (WGS) reads.
2004 Jul 20 | Amel_1.2 | 6x | A new assembly created by adding reads generated by shotgun sequencing of purified AT-rich genomic DNA to the previous whole genome shotgun (WGS) reads. In earlier assemblies, some AT-rich regions of the genome had lower coverage, and this new set of reads addressed this issue.


This project was proposed to the BCM-HGSC by a group of dedicated insect biologists, headed by Gene Robinson. Following a workshop at the BCM-HGSC and a honey bee white paper, the HGSC began the project in 2002. A 6-fold coverage WGS, BAC sequence from pooled arrays, and an initial genome assembly (Amel_v1.0) were released beginning in 2003. The project was challenging because AT-rich regions were difficult to recover: the WGS data had lower coverage in these regions, and BAC data from clones showed evidence of internal deletions. Additional reads from AT-enriched DNA addressed these underrepresented regions.

The assembly Amel_4.0 was produced with the Atlas assembly system and published in a paper in Nature and companion papers in Genome Research and Insect Molecular Biology.

It includes 2.7 million reads (1.8 Gb) or 7.5x coverage of the (clonable) genome. About 97% of STSs, 98% of ESTs, and 96% of cDNAs are represented in the 231 Mb assembly.
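
The read and coverage figures above are related by simple arithmetic, sketched below as a sanity check. The constants are the rounded values quoted in the text; the derived mean read length and implied genome size are back-of-the-envelope estimates, not figures reported by the project, and the variable and function names are illustrative.

```python
# Rounded figures quoted in the text for the Amel_4.0 assembly.
READ_COUNT = 2.7e6      # number of reads
TOTAL_BASES = 1.8e9     # total sequence in the reads (1.8 Gb)
COVERAGE = 7.5          # fold coverage of the clonable genome
ASSEMBLY_SIZE = 231e6   # assembly size in bases (231 Mb)


def mean_read_length(total_bases, read_count):
    """Average number of bases per read."""
    return total_bases / read_count


def implied_genome_size(total_bases, coverage):
    """Genome size implied by coverage = total_bases / genome_size."""
    return total_bases / coverage


mean_len = mean_read_length(TOTAL_BASES, READ_COUNT)
genome_size = implied_genome_size(TOTAL_BASES, COVERAGE)
assembled_fraction = ASSEMBLY_SIZE / genome_size

print(f"mean read length: {mean_len:.0f} bases")
print(f"implied clonable genome size: {genome_size / 1e6:.0f} Mb")
print(f"assembly as fraction of implied size: {assembled_fraction:.1%}")
```

With these rounded inputs, the reads average roughly 667 bases and the implied clonable genome is about 240 Mb, so the 231 Mb assembly covers about 96% of that estimate.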

Initial SNP data was prepared using about 2,500 reads produced from a strain of Africanized bees. The data is available in dbSNP and the NCBI Trace Archive.

Analysis of the genome by a consortium of 20 labs has been completed. This produced a gene list derived from five different methods melded through the GLEAN software.
