Improved methods to estimate macroevolutionary rates in fossils

Paper by Silvestro et al. on “Improved estimation of macroevolutionary rates from fossil data using a Bayesian framework” published in Paleobiology.

Abstract: The estimation of origination and extinction rates and their temporal variation is central to understanding diversity patterns and the evolutionary history of clades. The fossil record provides the only direct evidence of extinction and biodiversity changes through time and has long been used to infer the dynamics of diversity changes in deep time. The software PyRate implements a Bayesian framework to analyze fossil occurrence data to estimate the rates of preservation, origination, and extinction while incorporating several sources of uncertainty. Building upon this framework, we present a suite of methodological advances including more complex and realistic models of preservation and the first likelihood-based test to compare the fit across different models. Further, we develop a new reversible jump Markov chain Monte Carlo algorithm to estimate origination and extinction rates and their temporal variation, which provides more reliable results and includes an explicit estimation of the number and temporal placement of statistically significant rate changes. Finally, we implement a new C++ library that speeds up the analyses by orders of magnitude, therefore facilitating the application of the PyRate methods to large data sets. We demonstrate the new functionalities through extensive simulations and with the analysis of a large data set of Cenozoic marine mammals. We compare our analytical framework against two widely used alternative methods to infer origination and extinction rates, revealing that PyRate decisively outperforms them across a range of simulated data sets. Our analyses indicate that explicit statistical model testing, which is often neglected in fossil-based macroevolutionary analyses, is crucial to obtain accurate and robust results.

Phylogenomics of Geonomateae palms and the PhyloPalm baits

Paper by Loiseau (and the Geonoma consortium) et al. on “Targeted capture of hundreds of nuclear genes unravels phylogenetic relationships of the diverse neotropical palm tribe Geonomateae” published in Frontiers in Plant Science.

Abstract: The tribe Geonomateae is a widely distributed group of 103 species of Neotropical palms which contains six ecologically important understory or subcanopy genera. Although it has been the focus of many studies, our understanding of the evolutionary history of this group, and in particular of the taxonomically complex genus Geonoma, is far from complete due to a lack of molecular data. Specifically, the previous Sanger sequencing-based studies used a few informative characters and partial sampling. To overcome these limitations, we used a recently developed Arecaceae-specific target capture bait set to undertake a phylogenomic analysis of the tribe Geonomateae. We sequenced 3,988 genomic regions for 85% of the species of the tribe, including 84% of the species of the largest genus, Geonoma. Phylogenetic relationships were inferred using both concatenation and coalescent methods. Overall, our phylogenetic tree is highly supported and congruent with taxonomic delimitations although several morphological taxa were revealed to be non-monophyletic. It is the first time that such a large genomic dataset is provided for an entire tribe within the Arecaceae. Our study lays the groundwork not only for detailed macro- and micro-evolutionary studies within the group, but also sets a workflow for understanding other species complexes across the tree of life.

Codon evolution while accounting for protein and nucleotide selection

Paper by Davydov et al. on “Large-scale comparative analysis of codon models accounting for protein and nucleotide selection” published in Molecular Biology and Evolution.

Abstract: There are numerous sources of variation in the rate of synonymous substitutions inside genes, such as direct selection on the nucleotide sequence, or mutation rate variation. Yet scans for positive selection rely on codon models which incorporate an assumption of effectively neutral synonymous substitution rate, constant between sites of each gene. Here we perform a large-scale comparison of approaches which incorporate codon substitution rate variation and propose our own simple yet effective modification of existing models. We find strong effects of substitution rate variation on positive selection inference. More than 70% of the genes detected by the classical branch-site model are presumably false positives caused by the incorrect assumption of uniform synonymous substitution rate. We propose a new model which is strongly favored by the data while remaining computationally tractable. With the new model we can capture signatures of nucleotide level selection acting on translation initiation and on splicing sites within the coding region. Finally, we show that rate variation is highest in the highly recombining regions, and we propose that recombination and mutation rate variation, such as high CpG mutation rate, are the two main sources of nucleotide rate variation. Although we detect fewer genes under positive selection in Drosophila than without rate variation, the genes which we detect contain a stronger signal of adaptation of dynein, which could be associated with Wolbachia infection. We provide software to perform positive selection analysis using the new model.

Genomics of clownfish mutualism

Paper by Marcionetti et al. on “Insights into the genomics of clownfish adaptive radiation: genetic basis of the mutualism with sea anemones” just accepted in Genome Biology and Evolution.

Abstract: Clownfishes are an iconic group of coral reef fishes, especially known for their mutualism with sea anemones. This mutualism is particularly interesting as it likely acted as the key innovation that triggered clownfish adaptive radiation. Indeed, after the acquisition of the mutualism, clownfishes diversified into multiple ecological niches linked with host and habitat use. However, despite the importance of this mutualism, the genetic mechanisms allowing clownfishes to interact with sea anemones are still unclear.

Here, we used comparative genomics and molecular evolutionary analyses to investigate the genetic basis of clownfish mutualism with sea anemones. We assembled and annotated the genomes of nine clownfish species and one closely related outgroup. Orthologous genes inferred between these species and additional publicly available teleost genomes resulted in almost 16,000 genes that were tested for positively selected substitutions potentially involved in the adaptation of clownfishes to live in sea anemones. We identified 17 genes with a signal of positive selection at the origin of clownfish radiation. Two of them (Versican core protein and Protein O-GlcNAse) show particularly interesting functions associated with N-acetylated sugars, which are known to be involved in sea anemone discharge of toxins.

This study provides the first insights into the genetic mechanisms of clownfish mutualism with sea anemones. Indeed, we identified the first candidate genes likely to be associated with clownfish protection from sea anemones, and thus the evolution of their mutualism. Additionally, the genomic resources acquired represent a valuable resource for further investigation of the genomic basis of clownfish adaptive radiation.

Phylogeny inference and coevolution

New paper by Meyer et al. on “Simultaneous Bayesian inference of phylogeny and molecular coevolution” in PNAS.

Abstract: Patterns of molecular coevolution can reveal structural and functional constraints within or among organic molecules. These patterns are better understood when considering the underlying evolutionary process, which enables us to disentangle the signal of the dependent evolution of sites (coevolution) from the effects of shared ancestry of genes. Conversely, disregarding the dependent evolution of sites when studying the history of genes negatively impacts the accuracy of the inferred phylogenetic trees. Although molecular coevolution and phylogenetic history are interdependent, analyses of the two processes are conducted separately, a choice dictated by computational convenience, but at the expense of accuracy. We present a Bayesian method and associated software to infer how many and which sites of an alignment evolve according to an independent or a pairwise dependent evolutionary process, and to simultaneously estimate the phylogenetic relationships among sequences. We validate our method on synthetic datasets and challenge our predictions of coevolution on the 16S rRNA molecule by comparing them with its known molecular structure. Finally, we assess the accuracy of phylogenetic trees inferred under the assumption of independence among sites using synthetic datasets, the 16S rRNA molecule and 10 additional alignments of protein-coding genes of eukaryotes. Our results demonstrate that inferring phylogenetic trees while accounting for dependent site evolution significantly impacts the estimates of the phylogeny and the evolutionary process.

Color patterns in clownfishes

New paper by Salis et al. on “Developmental and comparative transcriptomic identification of iridophore contribution to white barring in clownfish” in Pigment Cell & Melanoma Research.

Abstract: Actinopterygian fishes harbor at least eight distinct pigment cell types, leading to a fascinating diversity of colors. Among this diversity, the cellular origin of the white color appears to be linked to several pigment cell types such as iridophores or leucophores. We used the clownfish Amphiprion ocellaris, which has a color pattern consisting of white bars over a darker body, to characterize the pigment cells that underlie the white hue. We observe by electron microscopy that cells in white bars are similar to iridophores. In addition, the transcriptomic signature of clownfish white bars exhibits similarities with that of zebrafish iridophores. We further show by pharmacological treatments that these cells are necessary for the white color. Among the top differentially expressed genes in white skin, we identified several genes (fhl2a, fhl2b, saiyan, gpnmb, and apoD1a) and show that three of them are expressed in iridophores. Finally, we show by CRISPR/Cas9 mutagenesis that these genes are critical for iridophore development in zebrafish. Our analyses provide clues to the genomic underpinning of color diversity and allow identification of new iridophore genes in fish.

Phylogenetic turnover in tetrapods

New paper by Saladin et al. on “Environment and evolutionary history shape phylogenetic turnover in European tetrapods” published in Nature Communications.

Abstract: Phylogenetic turnover quantifies the evolutionary distance among species assemblages and is central to understanding the main drivers shaping biodiversity. It is affected both by geographic and environmental distance between sites. Therefore, analyzing phylogenetic turnover in environmental space requires removing the effect of geographic distance. Here, we apply a novel approach by deciphering phylogenetic turnover of European tetrapods in environmental space after removing geographic land distance effects. We demonstrate that phylogenetic turnover is strongly structured in environmental space, particularly in ectothermic tetrapods, and is well explained by macroecological characteristics such as niche size, species richness and relative phylogenetic diversity. In ectotherms, rather recent evolutionary processes were important in structuring phylogenetic turnover along environmental gradients. In contrast, early evolutionary processes had already shaped the current structure of phylogenetic turnover in endotherms. Our approach enables the disentangling of the idiosyncrasies of evolutionary processes such as the degree of niche conservatism and diversification rates in structuring biodiversity.

Paper on coevolution database

Paper by Meyer, Dib et al. on “CoevDB: a database of intramolecular coevolution among protein-coding genes of the bony vertebrates” published in Nucleic Acids Research.

Abstract: The study of molecular coevolution, due to its potential to identify gene regions under functional or structural constraints, has recently been subject to numerous scientific inquiries. Particular efforts have been made to develop methods predicting the presence of coevolution in molecular sequences. Among these methods, a few aim to model the underlying evolutionary process of coevolution, which makes it possible to distinguish the shared history of genes from coevolution and thus improves their accuracy. However, the usage of such methods remains sparse due to their expensive computational cost and the lack of resources alleviating this issue. Here we present CoevDB, a database containing the results of a large-scale analysis of intramolecular coevolution of 8201 protein-coding genes of bony vertebrates. The web interface of CoevDB gives access to the results of 800 million statistical tests corresponding to all the pairs of sites analyzed. Several types of queries enable users to explore the database by either targeting specific genes or by discovering genes having promising estimations of coevolution.