DNA Zoo Blog

This blog aims to shout out the release of new assemblies and sharing of data on this website.

The bearded seal (Erignathus barbatus) gets its name from the long white whiskers on its face. These whiskers are very sensitive and are used to find food on the ocean bottom. Bearded seals inhabit circumpolar Arctic and sub-Arctic waters that are relatively shallow (primarily less than about 1,600 feet deep) and seasonally ice-covered. In U.S. waters, they are found off the coast of Alaska [1], which is where the specimen behind this blog post came from.


Bearded seals are closely associated with sea ice, particularly pack ice. As such, they are sensitive to changes in the environment that affect the annual timing and extent of sea ice formation and breakup. Fun fact: bearded seals can sleep vertically in open water with their heads at the water surface! Like all marine mammals, bearded seals are protected under the Marine Mammal Protection Act [1].


Today, we share the chromosome-length genome assembly for the bearded seal. The draft assembly was generated by the DNA Zoo team from short insert-size PCR-free DNA-Seq data using w2rap-contigger (Clavijo et al., 2017); see Dudchenko et al. (2018) for details. Work was performed under Marine Mammal Health and Stranding Response Program (MMHSRP) Permit No. 18786-03 issued by the National Marine Fisheries Service (NMFS) under the authority of the Marine Mammal Protection Act (MMPA) and Endangered Species Act (ESA). The specimen used in this study was collected by Louis Binderman and provided by the National Marine Mammal Tissue Bank, which is maintained by the National Institute of Standards and Technology (NIST) in the NIST Biorepository, and which is operated under the direction of NMFS with the collaboration of USGS, USFWS, MMS, and NIST through the Marine Mammal Health and Stranding Response Program and the Alaska Marine Mammal Tissue Archival Project. We thank Ben Neely for his help with this sample!


This is the third earless seal (Phocidae) genome in the DNA Zoo collection, after the harbor seal (Phoca vitulina) and the Northern elephant seal (Mirounga angustirostris). We have previously observed that earless seal genomes appear to be very similar to each other, see here. In support of this observation, see below how the new bearded seal genome assembly compares to the harbor seal genome. Once again, the genomes appear to be nearly identical in structure, with one exception: chromosome #2 in the harbor seal appears as two separate chromosomes in the bearded seal (#4 and #1).

Whole genome alignment plot between the genome assemblies of the harbor seal (GSC_HSeal_1.0_HiC) and the bearded seal (Erignathus_barbatus_HiC).
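For readers who want to see how a dot plot like this can be reduced to a "which chromosome matches which" summary, here is a minimal sketch. It is not the DNA Zoo pipeline: the function name, the 20% threshold, and the alignment blocks are all made up for illustration. It assumes only that the alignment has already been condensed into (reference chromosome, query chromosome, aligned bases) records.

```python
from collections import defaultdict

def chromosome_correspondence(blocks, min_fraction=0.2):
    """Map each reference chromosome to the query chromosomes covering it.

    blocks: iterable of (ref_chrom, query_chrom, aligned_bp) tuples.
    A query chromosome is reported only if it accounts for at least
    min_fraction of the aligned bases on that reference chromosome
    (a hypothetical cutoff used to ignore small spurious hits).
    """
    totals = defaultdict(lambda: defaultdict(int))
    for ref_chrom, query_chrom, aligned_bp in blocks:
        totals[ref_chrom][query_chrom] += aligned_bp

    mapping = {}
    for ref_chrom, per_query in totals.items():
        ref_total = sum(per_query.values())
        if ref_total == 0:
            continue  # nothing aligned, nothing to report
        partners = [q for q, bp in per_query.items()
                    if bp / ref_total >= min_fraction]
        mapping[ref_chrom] = sorted(partners)
    return mapping

# Made-up alignment blocks, for illustration only (not DNA Zoo data).
example_blocks = [
    ("harbor_2", "bearded_1", 90_000_000),
    ("harbor_2", "bearded_4", 80_000_000),
    ("harbor_2", "bearded_7", 1_000_000),   # small hit, filtered out
    ("harbor_1", "bearded_2", 200_000_000),
]

mapping = chromosome_correspondence(example_blocks)
# Reference chromosomes that correspond to more than one query chromosome.
split = {ref: qs for ref, qs in mapping.items() if len(qs) > 1}
print(split)
```

With these toy numbers, only the hypothetical "harbor_2" is reported as split across two query chromosomes, which mirrors the kind of pattern visible in the plot.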


Klaus-Peter Koepfli, Marlys Houck, Erez Aiden, and Olga Dudchenko


Pangolins, also known as scaly anteaters, belong to an entirely distinct order of mammals known as the Pholidota (from Greek meaning “horny scale”). DNA evidence has established that the closest living group to the pangolins is the Carnivora, the order containing the cats, civets, mongooses and hyenas as well as the dogs, bears, raccoons, weasels, seals and sea lions. The earliest pangolin fossils date from the Eocene (~34 to 56 million years ago), but molecular dating suggests the ancestors of Pholidota may have originated around 80 million years ago in the Cretaceous.


The 8 living species of pangolins were originally classified in the genus Manis, but analyses based on fossils, morphology, and molecular data indicate that the species are divided into three well-defined genera: Manis (Asian pangolins, 4 species), Phataginus (African tree pangolins, 2 species) and Smutsia (African ground pangolins, 2 species).


Pangolins are among the most unusual of mammals when it comes to their biology. Their bodies are covered in sharp, keratinized scales, which provide armor against predators when they roll up into a ball. Pangolins also lack teeth and instead use their extremely long tongue and sticky saliva to feed on ants and termites. Interestingly, molecular genetic and genome studies have shown that several genes involved in the development of teeth have become pseudogenized in pangolins. They possess long and thick claws at the ends of their powerful limbs, which they use to dig burrows, break open ant and termite nests, and climb trees.


Pangolins are considered the most illegally trafficked mammals in the world. They are heavily poached due to the high demand for their scales, which are used in Asian traditional medicines, and for their meat. The eight species are listed as either Threatened or Critically Endangered on the IUCN Red List of Threatened Species. Vigorous international efforts are trying to curtail the illegal trafficking of pangolins in order to prevent their extinction. You can learn more about pangolins and their conservation on the Save Pangolins website.


Pangolins have also been in the news recently because of their possible link to the novel coronavirus that has been infecting people within and outside of China. Several studies have reported evidence that the Malayan pangolin (Manis javanica) may be an intermediate host and reservoir of coronaviruses that are related to the novel human coronavirus causing COVID-19 [1, 2, 3]. However, these studies have not been formally peer-reviewed, and therefore their conclusions should be interpreted with caution.


Today, we are proud to share the very first chromosome-length assembly of one of the 8 species of the Pholidota, the white-bellied or tree pangolin (Phataginus tricuspis) from Africa. The assembly was generated from a female white-bellied pangolin named “Jaziri” who is housed at the Pittsburgh Zoo & PPG Aquarium in Pittsburgh, Pennsylvania. The chromosome-length assembly is based on a draft assembly generated using 10x Genomics linked-read sequencing and Supernova version 2.0.1.


Jaziri’s assembly revealed an amazing finding: the presence of as many as 114 diploid chromosomes! This was first observed independently in 2009 in standard Giemsa-stained and C- and G-banded karyotypes from several individuals of white-bellied pangolin from San Diego Zoo’s Frozen Zoo®. The result was unexpected because previous karyotype studies of three Asian pangolin species (Chinese, Indian and Malayan) showed chromosome complements between 2n = 36 and 2n = 42. It would also make the white-bellied pangolin the mammal species outside of the Rodentia (mice, rats, squirrels and their allies) with the highest number of chromosomes, among those whose karyotype has been examined. The current record holder for highest chromosome number in mammals is the Bolivian bamboo rat, Dactylomys boliviensis, with 2n = 118, which in 2001 broke the record of 2n = 102 previously held by another rodent, the red vizcacha rat, Tympanoctomys barrerae, reported in 1990 [4]. The karyotypes of the San Diego animals show the presence of many small chromosomes, and Jaziri’s Hi-C contact map shows the presence of many small c-scaffolds, which likely correspond to these very small physical chromosomes. A manuscript describing these results is in preparation.
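To make the chromosome-number comparison concrete, here is a small sketch that ranks the three record holders named above and converts a diploid number (2n) into a haploid number (n). The helper name and the record layout are hypothetical, and the halving assumes a simple even diploid number, which is an assumption of this sketch rather than a statement about the assembly itself.

```python
def haploid_number(diploid_number):
    """Return n for a diploid number 2n.

    Assumes a positive, even 2n (no odd-numbered sex-chromosome systems).
    """
    if diploid_number <= 0 or diploid_number % 2:
        raise ValueError("expected a positive, even diploid number")
    return diploid_number // 2

# (species, 2n, is_rodent), using the values quoted in the text above.
records = [
    ("white-bellied pangolin", 114, False),
    ("Bolivian bamboo rat", 118, True),
    ("red vizcacha rat", 102, True),
]

# Sort from highest to lowest diploid number.
ranked = sorted(records, key=lambda r: r[1], reverse=True)
top_overall = ranked[0][0]
top_non_rodent = next(name for name, _, is_rodent in ranked if not is_rodent)
pangolin_n = haploid_number(114)

print(top_overall, top_non_rodent, pangolin_n)
```

Under these assumptions, 2n = 114 corresponds to n = 57, so a haploid view of the genome would be expected to show roughly that many chromosome-scale units.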


Jaziri’s assembly was made possible through a collaboration of the following individuals: Tom Smith, Department of Ecology and Evolutionary Biology and Director of the Center for Tropical Research at UCLA; Klaus-Peter Koepfli, Center for Species Survival, Smithsonian Conservation Biology Institute; Kenneth Kaemmerer, Curator of Mammals, and Ginger Sturgeon, Director of Animal Health, Pittsburgh Zoo & PPG Aquarium; Jan Janecka, Department of Biological Sciences, Duquesne University; and Olga Dudchenko, Arina Omer and Erez Aiden, The Center for Genome Architecture, Baylor College of Medicine and Rice University. Karyotypes of the San Diego Zoo pangolins were made possible by Marlys Houck, Julie Fronczek and Ann Misuraca, San Diego Zoo Institute for Conservation Research.

We are pleased to announce that we have reached our next milestone, with 125 genome assemblies shared on the website. What better way to celebrate than by uploading raw data to the NCBI Sequence Read Archive?


As usual, the data is shared under BioProject accession PRJNA512907. The new submission covers 32 biosamples, with raw Hi-C data for 26 species and raw WGS data for 17 species. In total, the DNA Zoo BioProject data now spans 274 experiments and 16,039,729,964,358 bases!
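For a sense of scale, the totals above can be turned into more familiar units with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the variable names are ours, and the per-experiment figure is a plain average that says nothing about how the data is actually distributed across experiments.

```python
# Totals quoted in the text above.
total_bases = 16_039_729_964_358
experiments = 274

# Convert to terabases (1 Tb = 1e12 bases).
terabases = total_bases / 1e12

# Mean yield per experiment, in gigabases (1 Gb = 1e9 bases).
mean_gb_per_experiment = total_bases / experiments / 1e9

print(f"{terabases:.2f} Tb in total, "
      f"about {mean_gb_per_experiment:.1f} Gb per experiment on average")
```

That works out to roughly 16 terabases overall, or a little under 60 gigabases per experiment on average.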


We thank Illumina, Macrogen, Novogene, the Broad Institute and the Baylor College of Medicine GARP core for their help with the data production!


As always, we share the data without restrictions: see our data usage policy here.




© 2018-2019 by the Aiden Lab.