
Announcing the release of updated (version 2) genome annotations, plus the initial release of 39 newly annotated DNA Zoo genomes!


tl;dr, the entire set of >2.9 million protein-coding genes spanning 109 mammalian genomes can be found here (see also Wasabi mirror). This set is much improved over the version 1 annotations, with the fraction of missing mammalian BUSCOs down to 5% (from 10%). We’ve called on average 28,141 genes per species (min 22,417 in Eidolon helvum; max 45,707 in Saimiri boliviensis). 95.3% of genes are assigned to 74,713 orthogroups. 9,165 species-specific genes have been assigned to 2,609 orthogroups.


All protein files, transcripts, and the GFF3 files can be found in the data release folders associated with each individual assembly. OrthoFinder summary files are found here, while the file that contains the orthogroups is found here.


What did we do differently?


Remember that in the first attempt, we used genes contained in the Swiss-Prot reference. To update, in brief, we added more reference material: non-coding RNAs and transcript evidence from other species, with a focus on adding coverage for carnivores, rodents, and primates. This additional transcript information has dramatically improved our ability to detect genes in genomes.


See this blog post for information on the original version 1 genome annotations. 


The updated MAKER control file is here, and the reference FASTA files used are located here. Given these files, the runs should be fully reproducible.


Each annotation took between 48 and 72 hours to run across 80 cores, for a total of about 450,000 core-hours!
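As a rough sanity check on that total, here is a minimal sketch of the arithmetic. It assumes that all 109 genomes were annotated in this round and that every run used the full 80 cores for its whole duration; neither assumption is stated explicitly above, so treat the bounds as illustrative.

```python
# Back-of-the-envelope core-hour estimate.
# Assumptions (not confirmed by the post): all 109 genomes were run,
# and each run held 80 cores for its entire 48-72 hour duration.
N_GENOMES = 109
CORES_PER_RUN = 80
HOURS_MIN, HOURS_MAX = 48, 72
REPORTED_TOTAL = 450_000  # "about 450,000 core-hours"


def core_hours(n_genomes: int, cores: int, hours: float) -> float:
    """Total core-hours if every genome takes `hours` on `cores` cores."""
    return n_genomes * cores * hours


low = core_hours(N_GENOMES, CORES_PER_RUN, HOURS_MIN)
high = core_hours(N_GENOMES, CORES_PER_RUN, HOURS_MAX)

# Average wall-clock time per genome implied by the reported total.
implied_mean_hours = REPORTED_TOTAL / (N_GENOMES * CORES_PER_RUN)

print(f"lower bound: {low:,} core-hours")
print(f"upper bound: {high:,} core-hours")
print(f"implied mean run time: {implied_mean_hours:.1f} hours")
```

Under these assumptions the reported figure sits near the lower end of the possible range, which would mean most runs finished closer to 48 hours than to 72.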


The Phylogeny


The phylogeny of these 109 mammals (plus the ostrich, used as an outgroup) was computed using OrthoFinder 2.4.0. The image is below, and the Newick text file is here.
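If you want to check quickly which species are in a Newick file, the sketch below pulls out the leaf names with a regular expression. The toy tree and its branch lengths are made up for illustration and are not taken from the released file. The approach assumes unquoted names without commas, colons, or parentheses; for anything more involved, a dedicated tree library is the safer choice.

```python
import re


def leaf_names(newick: str) -> list[str]:
    """Return leaf labels from a simple Newick string.

    A leaf label is the text that directly follows "(" or ",".
    Labels that follow ")" are internal-node labels or support
    values, so they are skipped. Quoted labels are not handled.
    """
    return [name.strip() for name in re.findall(r"[(,]\s*([^(),:;]+)", newick)]


# Hypothetical three-species tree; branch lengths are invented.
toy_tree = "((Equus_zebra:0.1,Equus_caballus:0.1):0.3,Struthio_camelus:0.9);"

for name in leaf_names(toy_tree):
    print(name)
```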


What’s next?

  1. We can still do better, but for this we need RNAseq data! If you have transcriptome data for any of the DNA Zoo genomes and would like to share it, I’d be happy to update the annotation! This would really help us improve both the completeness and the accuracy of the genome annotations.

  2. Is your favorite gene missing? Let us know and we can see where it went.





Updated: Jul 3, 2020

The brush rabbit, Sylvilagus bachmani, is one of several species of cottontail rabbits. They have short, fluffy tails that may be white or gray in color. Inhabiting the western coastal region of North America, brush rabbits may be found foraging through shrublands, woodlands, and coniferous forests. Though they rarely leave the brush for long, they may be seen basking in the sun in nice weather. If they’re feeling particularly excited or playful, brush rabbits can binky, jumping up in the air while twisting their bodies and kicking their feet [1]!

Brush rabbit by Allan Hack, [CC BY-ND 2.0], via flickr.com

Brush rabbits are prolific breeders, producing around three litters a year with an average of four offspring [2]. The population is kept in check by their many predators, including snakes, foxes, coyotes, and bobcats. When startled, brush rabbits may thump their back feet on the ground in surprise! Brush rabbits avoid predators by running at speeds of 40 km/h in zigzag patterns [3].

This assembly was created from primary fibroblasts, originally sampled in 1976 and obtained from the T.C. Hsu CryoZoo at the University of Texas MD Anderson Cancer Center. 44 years later, we share the chromosome-length assembly of the brush rabbit. This is a $1K genome assembly with contig N50 = 58 Kb and scaffold N50 = 116 Mb. See Dudchenko et al., 2018 for details on the procedure.
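For readers new to the N50 statistic quoted above: it is the length L such that sequences of length L or longer make up at least half of the total assembly. Here is a minimal sketch of that standard definition, using made-up lengths rather than the actual brush rabbit contigs.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a list of contig or scaffold lengths.

    Walk through the lengths from longest to shortest and return the
    first length at which the running sum reaches half of the total.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe test for running >= total / 2
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


# Toy lengths (arbitrary units), not real assembly data.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))  # total is 290; 80 + 70 = 150 crosses 145, so N50 is 70
```

The same function applies to contigs and scaffolds alike. The scaffold N50 comes out much larger here because Hi-C scaffolding joins many short contigs into chromosome-length sequences.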


This is only the second chromosome-length genome assembly for a rabbit in our collection: previously we shared a few tweaks to the European rabbit genome assembly from the Broad Institute (Lindblad-Toh et al., 2011), here. The second genome gives us the first opportunity to compare karyotypes within the rabbit family. Included below is the whole-genome alignment plot between the two rabbit genomes: the genomes appear to be highly collinear, with two fusion events (circled in blue) apparently responsible for the difference in karyotypes: 2n=44 in the European rabbit vs 2n=48 in the brush rabbit!

Whole-genome alignment plot between the only other rabbit with a chromosome-length genome assembly before the brush rabbit, the European rabbit (assembly OryCu2.0_HiC, from the Broad Institute with DNA Zoo tweaks), and the new brush rabbit genome assembly (Sylvilagus_bachmani_HiC).



Mountain zebras (Equus zebra), endemic to South Africa and Namibia, are one of three extant species of zebra and comprise two recognized subspecies, the Cape mountain zebra (E. z. zebra) and Hartmann’s mountain zebra (E. z. hartmannae). Like their sister species, plains and Grevy’s zebras, they are recognizable by their iconic black and white stripes. Mountain zebras fall between the other two species in size and in the thickness of their stripes. They can also easily be distinguished by their dewlap, a fold of skin hanging from the throat.


As their name implies, they prefer mountainous terrain up to about 3000 feet. Once listed as endangered (IUCN Red List - 1996) with a global population of between 2,000 and 3,000, only 80 of which were Cape mountain zebras, the species has rebounded to a global population of 35,000, of which 1,700 are the Cape subspecies. They are still vulnerable, however, due to habitat fragmentation and the potential threat of increased drought due to climate change. It was drought that caused the catastrophic decline that led to the population nadir in the 1980s.

Mountain Zebra stallion by Bernard Dupont, [CC BY-SA 2.0], via flickr.com

Today we share the $1K assembly of the mountain zebra. The sample for the assembly was provided by a mountain zebra named Zakota and obtained by Greg Barsh (HudsonAlpha / Stanford University) and Ren Larison (UCLA) during a visit to the Hearts and Hands Animal Rescue in Ramona, CA, owned by animal lover and zebra whisperer Nancy Nunke. During our visit Nancy had us stroke the fur along the back of a mountain zebra, allowing us to learn an unusual fact about them: the fur between the saddle and rump grows backward, with the nap back to front instead of front to back.

Like the plains zebra, the mountain zebra shows quite a bit of rearrangement in its chromosomes relative to the domestic horse (see whole-genome alignment plots below). This rearrangement is also reflected in the large differences in the number of chromosomes among the three species, with the horse having 32 pairs of chromosomes, the plains zebra 22, and the mountain zebra only 16, half that of the horse. In spite of these rearrangements, equids are notorious for their ability to hybridize, leading to the fascinating pelage patterns seen in hebras and zorses, as well as potential conservation threats due to hybridization between the rarer zebra species (mountain zebras and Grevy’s zebras) and the vastly more common plains zebra.

Whole-genome alignment between the new chromosome-length genome assembly for the mountain zebra (Equus_zebra_HiC) and that of the domestic horse (EquCab2.0, from Wade et al., Science 2009).

