Some more information
Many thanks to everyone who responded so quickly to my request for information on the country and host of origin, and especially to Marta Maluk who not only dealt with her own JHI strains but with many others as well. We now have a fairly complete list, and the few remaining gaps are not too important. The Google Sheet is still here, but if you have some changes to suggest, please let me know directly, because I have already downloaded the current state of the spreadsheet and may not notice any further changes on the Google Sheet. My main aim was to get a sense of whether some genospecies were confined to certain regions or hosts. For those genospecies with many strains, this is not generally true, apart from gsA, which only includes clover symbionts so far, though from various locations.
I have searched the genomes for matches to NodD, NodA and NodC sequences representing the three symbiovars viciae, trifolii and phaseoli. This is a useful complement to documentation of host of origin. There are a few isolates that appear to have lost their symbiosis genes in cultivation between isolation and genome sequencing. This is something that has been observed before – it seems that not all symbiosis plasmids are fully stable in culture.
Here is the phylogeny, now with the symbiovar data from this nod-gene search and the strain names added.
I have checked the species assigned to all those strains that are included in the GTDB (http://gtdb.ecogenomic.org/). Some of the more recent accessions are not there yet, but there is good agreement for those that are. GTDB divides the Rlc into ten species plus two single-strain ‘species’, lumping together some closely related species, and some borderline unique strains, that I have argued for keeping separate. For example, they place the whole F-clade in s__Rhizobium laguerreae. There are no direct conflicts between the two schemes, though. Here is the equivalence table.
| Genospecies | GTDB_species |
| --- | --- |
| anhuiense | s__Rhizobium anhuiense |
| L | s__Rhizobium leguminosarum_D |
| M | s__Rhizobium leguminosarum_I |
| C | s__Rhizobium leguminosarum_C |
| D + CC278f + Norway | s__Rhizobium leguminosarum_K |
| E | s__Rhizobium leguminosarum |
| H | s__Rhizobium leguminosarum_J |
| A | s__Rhizobium leguminosarum_E |
| WYCCWR10014 | s__Rhizobium sp001657485 |
| Tri-43 | s__Rhizobium leguminosarum_M |
| G | not represented |
| S | not represented |
| I | not represented |
| Q, WSM1689, CCBAU10279, R, P, O, N | s__Rhizobium laguerreae |
| Vaf12 | s__Rhizobium sp005860925 |
| K, J, B | s__Rhizobium leguminosarum_L |
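For anyone who wants to use the equivalence table programmatically, here is a minimal Python sketch. The pairings are copied from the table; the names `EQUIVALENCE`, `gtdb_by_group` and `lumped` are made up for this illustration. It checks what "no direct conflicts" means in practice: GTDB may lump several of my groups into one species, but no group is split across two GTDB species.

```python
from collections import defaultdict

# (genospecies or unique strains, GTDB species); None means "not represented".
EQUIVALENCE = [
    (["anhuiense"], "s__Rhizobium anhuiense"),
    (["L"], "s__Rhizobium leguminosarum_D"),
    (["M"], "s__Rhizobium leguminosarum_I"),
    (["C"], "s__Rhizobium leguminosarum_C"),
    (["D", "CC278f", "Norway"], "s__Rhizobium leguminosarum_K"),
    (["E"], "s__Rhizobium leguminosarum"),
    (["H"], "s__Rhizobium leguminosarum_J"),
    (["A"], "s__Rhizobium leguminosarum_E"),
    (["WYCCWR10014"], "s__Rhizobium sp001657485"),
    (["Tri-43"], "s__Rhizobium leguminosarum_M"),
    (["G"], None),
    (["S"], None),
    (["I"], None),
    (["Q", "WSM1689", "CCBAU10279", "R", "P", "O", "N"], "s__Rhizobium laguerreae"),
    (["Vaf12"], "s__Rhizobium sp005860925"),
    (["K", "J", "B"], "s__Rhizobium leguminosarum_L"),
]


def gtdb_by_group(table):
    """Map each group to its GTDB species; raise if a group is split."""
    mapping = {}
    for groups, gtdb in table:
        for group in groups:
            if group in mapping and mapping[group] != gtdb:
                raise ValueError(f"{group} is assigned to two GTDB species")
            mapping[group] = gtdb
    return mapping


def lumped(table):
    """Return the GTDB species that merge more than one group."""
    merged = defaultdict(list)
    for groups, gtdb in table:
        if gtdb is not None:
            merged[gtdb].extend(groups)
    return {gtdb: groups for gtdb, groups in merged.items() if len(groups) > 1}


if __name__ == "__main__":
    mapping = gtdb_by_group(EQUIVALENCE)  # raises only if the schemes conflict
    for gtdb, groups in sorted(lumped(EQUIVALENCE).items()):
        print(gtdb, "<-", ", ".join(groups))
```

Running it lists the GTDB species that merge several of my groups, which are exactly the multi-entry rows of the table.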
Their taxonomy includes three further species that sound as though they ought to be in the Rlc but are actually more distant. Their s__Rhizobium leguminosarum_G covers WSM2297, which is somewhere close to R. hidalgonense. Their s__Rhizobium leguminosarum_A is for OV483, which is so far away that it is not even in the leguminosarum-etli clade. Their s__Rhizobium sophorae is actually R. sophoriradices, an unfortunate mistake that arose because the first version of the R. sophorae genome was not from the right strain.
I can also bring you, hot off the press, my summary figure of the ANI evidence for the 18 genospecies. I have included some of these individual plots in earlier posts, but now we have all 18 plots, in glorious colour. Each plot shows, in rank order, the ANI values for all 440 strains against the reference strain for that genospecies. Larger symbols indicate strains that belong to the genospecies in question, and the colours match the genospecies throughout. It took a few hours of battling with the intricacies of Seaborn FacetGrid to get to this point, but I think the result is pretty.
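For the curious, here is a rough sketch of how a figure of this kind can be laid out. It is not the script behind the real figure: the strain names and ANI values are invented toy data, the function and column names are my own for this illustration, and colouring by genospecies is left out to keep it short. The ranking step is plain Python; the plotting step assumes pandas and seaborn are installed and imports them only when called.

```python
def rank_against_reference(ani_by_strain):
    """Return (rank, strain, ani) tuples, highest ANI first, ranks from 1."""
    ordered = sorted(ani_by_strain.items(), key=lambda kv: kv[1], reverse=True)
    return [(rank, strain, ani)
            for rank, (strain, ani) in enumerate(ordered, start=1)]


def build_rows(ani_tables, membership):
    """Flatten per-genospecies ANI tables into one row per strain per panel."""
    rows = []
    for genospecies, ani_by_strain in ani_tables.items():
        for rank, strain, ani in rank_against_reference(ani_by_strain):
            rows.append({
                "genospecies": genospecies,
                "rank": rank,
                "strain": strain,
                "ani": ani,
                # Members of the panel's genospecies get the larger symbol.
                "group": "member" if membership.get(strain) == genospecies else "other",
            })
    return rows


def plot_facets(rows, outfile="ani_facets.png"):
    """Draw one rank-ordered ANI panel per genospecies (needs pandas, seaborn)."""
    import pandas as pd
    import seaborn as sns

    grid = sns.FacetGrid(pd.DataFrame(rows), col="genospecies", col_wrap=6, height=2)
    grid.map_dataframe(sns.scatterplot, x="rank", y="ani", size="group",
                       sizes={"member": 40, "other": 8}, legend=False)
    grid.set_axis_labels("Rank", "ANI")
    grid.savefig(outfile)


if __name__ == "__main__":
    # Invented toy values: ANI of each strain against each panel's reference.
    ani_tables = {
        "A": {"s1": 1.00, "s2": 0.97, "s3": 0.93},
        "B": {"s1": 0.93, "s2": 0.94, "s3": 1.00},
    }
    membership = {"s1": "A", "s2": "A", "s3": "B"}
    plot_facets(build_rows(ani_tables, membership))
```

Keeping the data in long form, one row per strain per panel, is what lets FacetGrid split it into panels by a single column.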
By the way, the figures in this blog are PNG files that you can download and save (using the right-click menu) so that you can take a closer look at them.
That’s all for now.