Molecular bumper cars (RNA polymerase-ribosomal interactions): their (unexpected) functional effects and how to control them

Cells are extremely complex.1 Much of their “core” complexity appears to have been present in their last (universal) common ancestor, known as LUCA. We find it in the “conserved” molecular mechanisms and machines active in modern cells. LUCA and its offspring are membrane-bounded, non-equilibrium systems that import free energy and export entropy to maintain and repair themselves, to grow, behave, and reproduce (and all the other things living things do). One problem with LUCA, however, is that its complexity obscures the steps that preceded it; we can speculate about them but cannot know them with certainty. Notwithstanding claims of breakthroughs (e.g. “‘Monumental’ experiment suggests how life on Earth may have started”), it is likely that we will never know the actual steps involved; after all, the origin of life occurred billions of years ago and under rather different conditions than exist today.

Living systems “work” based on inherited, pre-existing molecular machines and mechanisms (1). The actions of these machines are fueled through coupling to thermodynamically favorable reactions taking place under non-equilibrium conditions, i.e. the living state. Looking at the details of these interactions reveals interesting and unexpected behaviors. Unfortunately, the “simple” physical-chemical underpinnings of these processes, key to understanding them, are not always presented to students effectively (2). At the same time, the complexity of cellular systems means that, in practice, the link between “simple” molecular mechanisms and the behavior of a biological system can be obscure (see 3). That said, key insights are illuminated when molecular mechanisms are examined, as illustrated by Wee et al. (2023) (4).

Emerging from LUCA, biological populations have diverged into distinct “prokaryotic” lineages: the bacteria and archaea.2 Both are defined by a protein-lipid boundary layer, the plasma membrane. Within this membrane is a single internal compartment, the cytoplasm. Information is stored in cells in two forms: first, in the on-going LUCA-derived living system itself and second, in the sequence of double-stranded DNA molecules. These two types of information are interdependent: the information in DNA makes sense only within a cell, and the on-going cellular processes depend upon and utilize the information in DNA. In bacteria and archaea, the genomic DNA molecules are typically circular. Here we restrict our discussion to the common unicellular bacterium Escherichia coli (E. coli), one of the workhorse systems that led to an understanding of core molecular mechanisms.

E. coli has a single circular genomic DNA molecule of ~5 million nucleotide base pairs in length; it contains about 5000 distinct genes that encode polypeptides and functional “non-coding” RNA molecules (if you are interested in numbers, check out bionumbers). An E. coli cell is rod-shaped and ~1 micrometer (10^-6 meters) long. Its genomic DNA molecule is ~1000 times longer than the cell that contains it, and a rapidly dividing cell can contain multiple copies of the genome. Genes typically contain two distinct functional regions. Regulatory regions interact with various proteins that determine whether a gene is “expressed” or not. Coding regions specify what is expressed. The first step in expression is the synthesis of an RNA molecule; such a molecule can encode a polypeptide or act as a non-coding RNA.3 Non-coding RNAs can have structural, catalytic, or regulatory functions.
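The comparison between genome length and cell length can be checked with a little arithmetic. The sketch below uses the round numbers quoted above and assumes a rise of ~0.34 nm per base pair (the standard value for B-form DNA, not a figure from this post); the function and variable names are purely illustrative.

```python
# Rough size comparison for E. coli, using the round numbers quoted in the text.
GENOME_BP = 5_000_000      # ~5 million base pairs (from the text)
CELL_LENGTH_M = 1e-6       # ~1 micrometer (from the text)
RISE_PER_BP_M = 0.34e-9    # assumption: ~0.34 nm per base pair (B-form DNA)


def contour_length_m(base_pairs: int, rise_m: float = RISE_PER_BP_M) -> float:
    """Length of the DNA molecule if it were stretched out end to end."""
    return base_pairs * rise_m


genome_length_m = contour_length_m(GENOME_BP)
ratio = genome_length_m / CELL_LENGTH_M

print(f"genome length: {genome_length_m * 1e3:.2f} mm")
print(f"genome length / cell length: ~{ratio:.0f}x")
```

With these inputs the genome comes out at about 1.7 mm, on the order of a thousand times the length of the cell, which is consistent with the rough figure given above.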

The first step in gene expression in all cell types is the binding of proteins to a gene’s regulatory sequences. Typically a complex of proteins leads to the binding and activation of a DNA-dependent RNA polymerase. The RNA polymerase complex unwinds a specific region of the DNA and uses the complementary nature of nucleotide base pairing (A pairs with T in DNA and with U in RNA, and C pairs with G in both) to synthesize an RNA molecule based on the DNA sequence. Synthesis of RNAs that encode polypeptides, known as messenger RNAs (mRNAs), starts with the 5′ end of the molecule and moves toward the 3′ end.
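The base-pairing rule described above is simple enough to write down as a lookup table. The following is a minimal sketch, not a model of the polymerase itself: it takes a DNA template strand written 3′ to 5′ (the order in which the polymerase reads it) and returns the complementary RNA written 5′ to 3′. The function name and the example sequence are hypothetical.

```python
# Base-pairing rules for reading a DNA template strand into RNA:
# A in the template pairs with U in RNA, T with A, C with G, G with C.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') complementary to a DNA template
    strand supplied in the 3'->5' direction."""
    rna = []
    for base in template_3_to_5.upper():
        if base not in DNA_TO_RNA:
            raise ValueError(f"unexpected DNA base: {base!r}")
        rna.append(DNA_TO_RNA[base])
    return "".join(rna)


template = "TACGGCATT"       # hypothetical template, written 3'->5'
print(transcribe(template))  # prints AUGCCGUAA
```

Because the template is read 3′ to 5′, the first RNA nucleotide produced is the 5′ end of the transcript, which is why a ribosome can engage that end while the rest of the mRNA is still being made.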

In prokaryotic cells, both DNA and mRNA synthesis reactions occur in the cytoplasm. A ribosome, a molecular machine composed of multiple proteins and RNAs, can engage the 5′ end region of an mRNA as soon as it appears – before the synthesis of the mRNA is complete. The cytoplasm of a cell contains lots of ribosomes; in E. coli there are ~70,000 ribosomes per cell (more or less). This leads to some interesting and functionally significant interactions. One thing to consider, not always stressed, is that these synthetic processes are not error-proof. DNA replication (DNA-directed DNA synthesis), transcription (DNA-directed RNA synthesis), and polypeptide synthesis (RNA-directed polypeptide synthesis) all have an error rate, typically 1 error per ~10^6 addition events for DNA replication and transcription. Errors can lead to mutations in the DNA, RNAs that encode abnormal proteins, or abnormal and potentially toxic polypeptides.

To deal with these physical realities, the synthetic processes employ various “error correction” strategies. In the case of DNA and RNA synthesis, the polymerases involved have what are known as “proof-reading” activities. If an incorrect nucleotide is inserted into a growing DNA or RNA chain, it can be recognized; the polymerase can then “reverse” (move backward along the DNA), remove the mistakenly inserted nucleotide, and then move forward again, adding the correct nucleotide. Key here is that the polymerase is moving back and forth along the DNA strand. The result of proof-reading is to reduce the error rate of DNA-dependent DNA and RNA synthesis substantially, down to 10^-8 to 10^-10 per base pair in the case of DNA synthesis.
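To get a feel for what these error rates mean, one can estimate the expected number of errors per genome copy. The sketch below assumes each nucleotide addition fails independently with a fixed probability (a simplification), takes the uncorrected rate to be 1 error per 10^6 additions, and uses the ~5 million base pair E. coli genome size mentioned earlier; the function names are illustrative only.

```python
GENOME_BP = 5_000_000  # ~5 million base pairs (E. coli, from the text)


def expected_errors(additions: int, error_rate: float) -> float:
    """Expected number of errors, assuming each addition fails
    independently with probability `error_rate` (a simplification)."""
    return additions * error_rate


def prob_error_free(additions: int, error_rate: float) -> float:
    """Probability that a full copy contains no errors at all,
    under the same independence assumption."""
    return (1.0 - error_rate) ** additions


# Error rates per addition: without and with proof-reading.
rates = {
    "uncorrected (1e-6)": 1e-6,
    "proof-read, upper (1e-8)": 1e-8,
    "proof-read, lower (1e-10)": 1e-10,
}

for label, rate in rates.items():
    errors = expected_errors(GENOME_BP, rate)
    clean = prob_error_free(GENOME_BP, rate)
    print(f"{label}: ~{errors:g} errors per genome copy, "
          f"P(error-free copy) = {clean:.4f}")
```

Under these assumptions an uncorrected polymerase would introduce roughly 5 errors every time the genome is copied, whereas proof-reading brings this down to between 0.05 and 0.0005 errors per copy, so that most copies are error-free.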

In the case of the RNA polymerase, the newly synthesized RNA can fold back on itself, forming what is known as a “hairpin”. This hairpin “can stabilize an elemental pause (in RNA synthesis) [via] an allosteric interaction with the β-flap tip helix of RNAP”. What Wee et al. (4) report is that, as the mRNA-associated ribosome moves along the RNA, it unfolds the hairpin and “bumps” into the polymerase. This inhibits the “pause”, which increases the rate of mRNA synthesis and inhibits the polymerase’s error correction function. The resulting mRNA population contains more frequent base changes, errors that can influence the polypeptides synthesized. While cells of all types have various “chaperone” systems that can deal with misfolded proteins that arise in response to various stresses or errors, these can be overwhelmed. The resulting misfolded (damaged) proteins can lead to cellular defects and long-term effects on viability (discussed in 5).

About 1.5 billion years after LUCA (give or take), a new type of cell appeared, the result (apparently) of a symbiotic interaction between an archaeal-like “host” and an O2-utilizing bacterium. This synthetic organism, the progenitor of the eukaryotes, differed from either type of prokaryote in that it sequestered its genome, now composed of linear DNA molecules, within a double-membrane-bounded “nuclear” compartment. In this hybrid cell type, DNA and RNA synthesis were confined to the nucleus, while ribosomes and polypeptide synthesis were confined to the cytoplasm. Eukaryotic cells are typically much larger than prokaryotic cells, reproduce more slowly, and are more complex in terms of the number of genes and the amount of genomic DNA they contain. It is tempting to speculate that while rapidly dividing, relatively simple prokaryotic cells may be able to tolerate more mistakes in the synthesis of their polypeptides, larger, more complex eukaryotic cells would be more vulnerable. A plausible result would be a selection pressure to separate RNA synthesis from polypeptide synthesis.

Literature cited

  1. Alberts, B., 1998. The cell as a collection of protein machines: preparing the next generation of molecular biologists. Cell, 92, pp.291-294.
  2. de Lorenzo, V., 2024. The principle of uncertainty in biology: Will machine learning/artificial intelligence lead to the end of mechanistic studies? PLoS Biology, 22, p.e3002495.
  3. Klymkowsky, M.W., 2021. Making mechanistic sense: are we teaching students what they need to know? Developmental Biology, 476, pp.308-313.
  4. Wee et al., 2023. A trailing ribosome speeds up RNA polymerase at the expense of transcript fidelity via force and allostery. Cell, 186, pp.1244-1262.
  5. Klymkowsky, M.W., 2019. Filaments and phenotypes: cellular roles and orphan effects associated with mutations in cytoplasmic intermediate filament proteins. F1000Research, 8.

Footnotes

  1. If you want to brush up on your molecular biology, check out chapter 7 of biofundamentals.
  2. Image from Govindjee – doi:10.3389/fpls.2011.00028, CC BY 3.0. Given the diversity of biological systems, these are general descriptions – often there are exceptions, but recognizing them all makes generating a coherent narrative difficult (and beyond me). Mea culpa.
  3. bioliteracy link: When is a gene product a protein when is it a polypeptide?