Thursday, October 8, 2020

Rhizobium leguminosarum 20

A 16S flashback

 

In November 1991, Helen Downer and I were sequencing 16S genes of rhizobia. We used a recently invented process called PCR (Saiki et al. 1988 http://dx.doi.org/10.1126/science.239.4839.487) and primers Y1 and Y2, which I had designed to amplify the first part of the gene (Young et al. 1991 http://dx.doi.org/10.1128/jb.173.7.2271-2277.1991). Then we sequenced the products by hand using big gels, X-ray film and 32P radioisotope. The PCR product was normally 308-312 bp, but we were intrigued by one pea-nodulating strain, SP18, that gave a much longer product. When we sequenced it, we found that the extra DNA was in a region that was normally conserved. The first stem-loop in the secondary structure of Rhizobium 16S rRNA usually looks like this (taken from my 1991 lab book):

[Figure: the first stem-loop of Rhizobium 16S rRNA, as drawn in the 1991 lab book]

The CCCC…GGGG stem is found in most Rhizobium and in Sinorhizobium. The GCAA loop is even more widely conserved, being found in most Alphaproteobacteria, but instead of GCAA, strain SP18 had:

TCCTTCAAGCAAGCTTGAAG-ATTTTTATCCTTGGAAAGGAAGATCAAGAAGAGCTTCTAAGAAGCTTTCTTGATGGA

 

A few months later, I left the John Innes Centre for the University of York and got involved in new projects, so I never published this strange sequence. Last week, I started to look at conservation of the 16S sequence in the 429 Rlc genomes, and was motivated to dig out my old lab records because I saw a similar ‘extra’ sequence in a few genomes. In fact, not just similar, but identical, apart from an additional ‘G’ where I have shown ‘-’ in the SP18 sequence (almost certainly, this was an error in our manual sequence, which was based on a single read). There are 11 genomes with the extra sequence; they are all in genospecies O, P and Q, but not all genomes in these genospecies have it.

 

The first 16 bases of this long ‘loop’ sequence are complementary to the last 16 (except for a couple of ‘bulges’), so they would be expected to extend the stem structure, but it is unclear what kind of secondary structure the rest of the sequence would adopt. This is what I got when I sent the sequence to an RNA structure prediction site (http://rna.urmc.rochester.edu/RNAstructureWeb/):

[Figure: predicted secondary structure of the SP18 sequence]

The red part at the bottom is the conserved stem shown in the previous figure; the rest of the structure is speculative.
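
The idea that the two ends of the insert pair up can be checked with a few lines of Python. This is only a rough sketch, not the method behind the predicted structure: it takes the SP18 sequence exactly as printed above (dropping the ‘-’ placeholder), lines up the first 16 bases against the last 16 read backwards, and counts Watson-Crick pairs, also allowing G-T (G-U in the RNA). The names `stem_pairing` and `PAIRS` are made up for illustration. No gaps are allowed, so a real bulge would throw the rest of the comparison out of register.

```python
# SP18 insert as printed in the post; '-' marks the base missing from the 1991 read.
SP18_LOOP = (
    "TCCTTCAAGCAAGCTTGAAG-ATTTTTATCCTTGGAAAGGAAGATCAAGAAGAGCTTCTAAGAAGCTTTCTTGATGGA"
)

# Watson-Crick pairs plus the G-T wobble (G-U in RNA), written on the DNA alphabet.
PAIRS = {
    ("A", "T"), ("T", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "T"), ("T", "G"),
}


def stem_pairing(seq, arm=16):
    """Compare the first `arm` bases with the last `arm` bases read backwards.

    Returns one tuple per position: (1-based position, 5' base, 3' base, pairs?).
    The comparison is ungapped, which is an assumption of this sketch.
    """
    bases = seq.replace("-", "")
    left = bases[:arm]
    right = bases[-arm:][::-1]  # reverse so that left[i] faces right[i]
    return [
        (i + 1, a, b, (a, b) in PAIRS)
        for i, (a, b) in enumerate(zip(left, right))
    ]


report = stem_pairing(SP18_LOOP)
paired = sum(1 for *_, ok in report if ok)
unpaired = [(pos, a, b) for pos, a, b, ok in report if not ok]

print(f"{paired} of {len(report)} positions can pair")
for pos, a, b in unpaired:
    print(f"position {pos}: {a} opposite {b}")
```

For the sequence as printed, 14 of the 16 positions pair, and the two exceptions (positions 4 and 10 from the start of the insert) are simple mismatches, T opposite T and C opposite T.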

 

I am hoping that you, my readers, can help me here. I think I have seen publications fairly recently that have described similar ‘long’ sequences in this location of 16S, but I cannot remember where. Can someone point us to relevant papers? I also have a suspicion that the 16S rRNA may be cleaved within this sequence and exist as two disconnected strands within the ribosome, but I can’t remember whether someone else showed that or it was our own unpublished observation of an unexpected pattern of rRNA bands in nucleic acid preps.

 

All this is something of a digression. I just wanted to record the 16S sequences of all the strains because this is something that taxonomists like to look at, and I thought the result was going to be boring and uninformative. It turns out that there is more 16S sequence variation than I expected. There are also a few genome assemblies with broken 16S sequences or no 16S at all (!), and it is taking me a while to sort those out, so the ‘boring’ consideration of 16S variation will have to wait until the next post.
