To celebrate the DNA Zoo's 200th mammalian species release, we proudly present not one, but three de novo primate genome assemblies for the gelada baboon (Theropithecus gelada), the siamang (Symphalangus syndactylus), and the talapoin monkey (Miopithecus talapoin)!

All three are exceptionally interesting primates! For example, the gelada baboon is the only grass-grazing primate living today, thought to be the last of an ancient line of grass-eating primates [1]. Talapoin monkeys are the smallest of the Old World monkeys. They are also among the strongest swimmers of all primates [2]! Finally, siamangs are capable of exceptionally loud vocalizations thanks to their inflatable throat sacs; their calls can be heard from 2 miles away [3].

And all three are exceptionally interesting assemblies!

The siamang genome assembly was created thanks to a blood sample donation from Jambi, a siamang living at the Houston Zoo. (Check out this slide show of Jambi and her little family by the Houston Zoo!) On finalizing Jambi's assembly we discovered something unexpected. We had put together 25 long sequences, corresponding to 25 pairs of chromosomes, with one copy in each pair coming from Jambi's mom and one from her dad. Just as with humans, we expected the two parental copies in each pair to be nearly identical. Not so for Jambi. We saw major rearrangements between the maternal and paternal copies for not one but two chromosomes (see figure below)!

Hi-C contact maps of the two 'atypical' chromosomes (2 and 4) in the genome assembly of Jambi the siamang. The contact data from both the maternal and paternal copies of each chromosome are aligned to the same arbitrarily chosen reference sequence (we do not know at present whether this version represents the mom or the dad). The resulting maps contain off-diagonal enrichments (circled) that are not characteristic of well-assembled references. These enrichments are a consequence of large rearrangements between the maternal and paternal copies of the chromosomes, which make it impossible to represent both with a single sequence.
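For readers who want some intuition for why a rearrangement in only one parental copy shows up as an off-diagonal enrichment, here is a toy sketch in Python. It is not our assembly pipeline and it uses no real data. It assumes a made-up chromosome of 20 bins, a contact frequency that simply decays as 1/(1 + distance), and a hypothetical inversion of bins 5 through 14 in one haplotype. All function names, bin counts and breakpoints are invented for illustration.

```python
# Toy model (illustrative assumptions only, not real Hi-C data):
# contact frequency between two bins decays as 1 / (1 + distance),
# where distance is measured along the physical chromosome copy.

def contact_map(order):
    """Contact map in reference coordinates.

    order[k] is the reference bin sitting at physical position k
    on this particular chromosome copy.
    """
    n = len(order)
    pos = {ref_bin: k for k, ref_bin in enumerate(order)}
    return [
        [1.0 / (1 + abs(pos[i] - pos[j])) for j in range(n)]
        for i in range(n)
    ]

def invert(order, start, end):
    """Return a copy of `order` with the segment [start, end) inverted."""
    return order[:start] + order[start:end][::-1] + order[end:]

n_bins = 20                      # hypothetical chromosome length in bins
reference = list(range(n_bins))  # the arbitrarily chosen reference order

hap_a = reference                    # one parental copy matches the reference
hap_b = invert(reference, 5, 15)     # the other carries a hypothetical inversion

map_a = contact_map(hap_a)
map_b = contact_map(hap_b)

# Reads from both copies are aligned to the same reference, so the
# observed map is the sum of the two haplotype maps.
combined = [
    [map_a[i][j] + map_b[i][j] for j in range(n_bins)]
    for i in range(n_bins)
]

# Bins 4 and 14 are far apart in the reference but adjacent in hap_b,
# so their combined signal is much higher than distance alone predicts.
print(round(combined[4][14], 3), "vs expected", round(2 * map_a[4][14], 3))
```

In this sketch the bins flanking the inversion breakpoints (4 with 14, and 5 with 15) light up away from the diagonal, while the diagonal itself stays strong because one copy still matches the reference. No single ordering of the bins removes both signals at once, which is the same situation the circled features in the maps above point to.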

Interestingly, two subspecies of siamangs have been nominated, the Sumatran and the Malaysian siamang, but this distinction has not been tracked in the US population. Very likely there have been crosses between the populations, and Jambi may be a result of one (or several) of those crosses. Jambi's data may be solid evidence in support of the subspecies (if not species) nomination. It may also suggest that the differences between the subpopulations extend to large-scale differences in chromosome structure, which could explain some of the breeding difficulties! We look forward to exploring this question with more of our siamang samples and with phased analyses. (Send us siamang samples!)

The two other samples in today's release come from the T.C. Hsu Cryo-Zoo collection, and both were collected back in the late 1970s: 1978 for the gelada, and 1979 for the talapoin! The talapoin sample, an ear biopsy from the Louisiana Purchase Zoo in Monroe, LA, was identified as a southern (Angolan) talapoin, presumably based on morphological characteristics. The de novo mitochondrion assembly for the sample, however, places it more closely with the northern (Gabon) talapoin samples in the NCBI collection (see figure below). The same is true of at least one more sample among those available via NCBI (Tala). It is not yet clear if this suggests another hybridization event in the ex situ population. We look forward to figuring this mystery out as we continue our primate research.

BLAST alignment results for the de novo mitochondrion assembly of the DNA Zoo southern talapoin sample.

Finally, even our gelada sample was not without a surprise, showing a 17 Mb deletion in one of the haplotypes of chromosome 16. We suspect this to be a culturing artifact, but look forward to confirming this suspicion with new data.

Please send us more primate samples (including more individuals for the three species from today's release) to help resolve these mysteries. Don't forget to browse the chromosome-length contact maps for the genomes below to see the contact data evidence for yourself. All three of today's chromosome-length assemblies follow the $1k method outlined in Dudchenko et al. 2018. (For more assembly procedure details, please see our Methods page.) We are also releasing the mitochondrion assemblies for all three samples, available via the respective assembly pages: here (T. gelada), here (S. syndactylus) and here (M. talapoin).

With these three samples we cross the mark of 200 assembled mammalian species! Thank you to our ever-growing list of collaborators and to the organizations that have provided invaluable support in creating this encompassing dataset. Stay tuned for more chromosome-length mammal (and other) genomes by joining our mailing list below and following us on Twitter @thednazoo!


