Tuesday, August 25, 2020

Rhizobium leguminosarum 9

 

How do we define the Rhizobium leguminosarum species complex?

Before continuing our tour of the Rhizobium leguminosarum species complex (Rlc), I want to pause to consider what we are trying to achieve. I was prompted to address this important question by a comment that Stéphane Boivin and Marc Lepetit have made on my recent post Rhizobium leguminosarum 5. Here is their comment:

We are enthusiastic about the idea of clarifying/defining the boundaries of the leguminosarum complex species. This will certainly help the community. There is no doubt that you are the best expert to solve this question. We will be pleased to help, if necessary, although we are specialists in systematic.
We read you posts with great interest. Thank you to give us the historical perspective and for initiating the debate for a new rationale organization of R leguminosarum , based on our current knowledge.
In your first tree you suggest to restrict Rlc to the large green clade. What do you think about extending this complex to the Anhuiense Gs and more generally to all related bacteria that nodulate clover, pea, fababean and bean that share closely related symbiotic clusters on symbiotic plasmid? The general objective might be to define a large leguminosarum complex gathering all “leguminosarum” symbiovars? We know, for example, that bacteria of the symbiovar viciae may belong to R anuihense R pisi or R binae… It is apparently the same story for symbiovars trifoli or phaseoli… To our understanding, Anhuiense Gs were defined as separate species only because their ANI with R leguminosarum Gs are lowers than the arbitrary limit. But, up to now there is little evidence suggesting that their symbiotic characteristics may strongly differ from the other leguminosarum (ie nodulating clover, pea, faba etc and possibly associated as PGP with non legume plants). To our knowledge, there is no clear rule and/or ANI threshold to define a complex species boundaries. Is this aim completely heterodox or unrealistic?

Thank you for your comment, and for pointing out, quite rightly, that the nodulation genes that define symbiovars viciae, trifolii and phaseoli are found in a wider range of bacteria than just the “green clade”. Before discussing this, I want to make a clear distinction between phenotypic classifications and taxonomic classifications. Taxonomy is, or should be, based on phylogenetic clades, each consisting of an ancestral organism and all its descendants. Each species should be a clade within its genus, each genus a clade within its family, and so on. Fortunately, we know that bacteria do have a single true phylogeny: every bacterial cell arises by division of exactly one parent cell. If we could see all the cell divisions since the last common ancestor, we would know the true phylogeny. Unfortunately, we can’t, so we try to reconstruct it from the sequences of genes. We choose those genes that are most likely to have been handed down vertically from parent to offspring, namely core genes with essential functions. Even these may occasionally be replaced by versions that come into the cell from other bacteria, but such transfers usually affect only one or a few genes at a time, in one lineage at a time, so we hope that if we use a large number of core genes, such disturbances will be diluted out and we will have an approximation to the true phylogeny.
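To make the definition of a clade concrete, here is a minimal Python sketch. It assumes a toy rooted tree written as nested tuples, with made-up strain labels A to E that do not correspond to any real isolates; the function names are likewise only illustrative. A set of strains counts as a clade only if some node of the tree has exactly that set as its descendants.

```python
# Toy rooted phylogeny as nested tuples; leaves are hypothetical strain labels.
TREE = ((("A", "B"), "C"), ("D", "E"))


def leaf_set(node):
    """Return the set of leaf labels descended from this node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def all_clades(node):
    """Yield the leaf set of this node and of every node below it."""
    yield leaf_set(node)
    if not isinstance(node, str):
        for child in node:
            yield from all_clades(child)


def is_clade(tree, group):
    """True if 'group' is exactly the set of descendants of some node."""
    return frozenset(group) in set(all_clades(tree))


if __name__ == "__main__":
    print(is_clade(TREE, {"A", "B", "C"}))  # an ancestor plus all its descendants
    print(is_clade(TREE, {"A", "B", "D"}))  # leaves out C and E, so not a clade
```

In this toy tree, a grouping such as A, B and D is the kind of set that a shared phenotype could produce, for instance if those three strains happened to carry the same transferred genes, but it is not a clade and so could not be a taxon under the principle described above.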

In bacteria, many important adaptations to the environment are not gained by mutation of the core genes, but by acquisition of functional modules encoded by ready-made sets of genes that are transferred from other bacteria. The evolutionary history of these genes is clearly not going to match that of the bacterial cells, so they cannot be used for phylogeny-based taxonomy. In rhizobia, the symbiosis genes are the best-known example of an accessory module of this kind, and their history of horizontal transfer has been extensively documented (see Andrews et al. 2018 https://doi.org/10.3390/genes9070321). That is why the taxonomy subcommittee pointed out, in its Minimal Standards document (de Lajudie et al. 2019 https://doi.org/10.1099/ijsem.0.003426), that symbiotic properties, though certainly interesting and important phenotypes, cannot be used as taxonomic characters when defining new rhizobial taxa.

If I understand correctly, you are proposing that we should define the R. leguminosarum species complex as including all species in which one or more of the symbiovars viciae, trifolii or phaseoli have been found. This could potentially be the whole leguminosarum-etli clade, which includes R. leguminosarum, laguerreae, sophorae, indicum, etli, anhuiense, ecuadorense, acidisoli, hidalgonense, vallis, pisi (syn. fabae), bangladeshense, sophoriradices, phaseoli, esperanzae, aethiopicum (syn. aegyptiacum), binae, lentis, and probably others. This clade is one of the major subdivisions of the genus Rhizobium and could perhaps be given a formal name as a subgenus. It is the clade with blue branches in the phylogeny I showed in the Rhizobium leguminosarum 3 post, and is a very distinct group on a long, well-supported branch. There is no guarantee, of course, that all species in this clade will match your phenotypic criterion: there may be species not yet described that do not, or even cannot, form a root nodule symbiosis. From what we know so far, though, it seems that the ability to host and express these particular sets of symbiosis genes may be a shared property of the species within this clade.

This leguminosarum-etli clade is much wider than the grouping that I want to focus on at the moment, which mostly consists of strains that are still called R. leguminosarum. You are right that there is no clear rule to define a species complex, because this is not a formal taxonomic category, although the term has been used in other bacterial groups, most notably for the Burkholderia cepacia complex, known as the Bcc (e.g. Mahenthiralingam et al. 2005 https://www.nature.com/articles/nrmicro1085). The Rlc is a unit of a comparable size to the Bcc (I should probably drop the italics and just call it the Rlc). It consists of a number of genospecies that are closely related but have diverged enough that they meet the customary criteria for defining separate species. Indeed, several of them have already been given their own species names, but so far this has been done one by one in a haphazard way with no overall view of their place within the Rlc.

To summarise, the group that I am currently concerned with and am defining as the Rlc is the clade on a green background in the phylogeny I showed in the Rhizobium leguminosarum 3 post. It is an appropriate size to be called a species complex. The larger group that you are interested in is the whole leguminosarum-etli clade, shown as blue branches in the phylogeny. This is of an appropriate size to be defined as a subgenus, though I am not proposing to do that right now (it would require that the rest of the genus was also split into named subgenera). This larger grouping definitely needs a lot of work to clarify its structure, but that is work for the future. My immediate aim is to tackle the smaller grouping first.
