By Mike Klymkowsky
Stochastic processes are often presented in terms of random, that is unpredictable, events. This framing obscures the reality that stochastic processes, while more or less unpredictable at the level of individual events, are well behaved at the population level. It also obscures the role of stochastic processes in a wide range of predictable phenomena. In atomic systems, for example, unknown factors determine the timing of the radioactive decay of a particular unstable atom, yet the rate of radioactive decay is highly predictable in a large enough population. Similarly, in the classical double-slit experiment the passage of a single photon, electron, or C60 molecule is unpredictable, while the behavior of a larger population is perfectly predictable. The macroscopic predictability of Brownian motion (a stochastic process) enabled Einstein to argue for the reality of atoms. Likewise, the dissociation of a molecular complex or the occurrence of a chemical reaction, driven as they are by thermal collisions, are stochastic processes, whereas dissociation constants and reaction rates are predictable. In fact, this combination of unpredictability at the individual level and predictability at the population level is the hallmark of stochastic, as compared to truly random (that is, unpredictable), behaviors.
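The contrast between individual unpredictability and population-level predictability can be made concrete with a small simulation. The sketch below is only an illustration, not a model of any particular isotope: it assumes that each atom decays independently with an exponentially distributed waiting time, and the function names, the half-life of 10 time units, and the population sizes are all hypothetical choices made for this example.

```python
import math
import random


def simulate_decay_times(n_atoms, half_life, seed=0):
    """Draw one decay time per atom, assuming independent exponential decay."""
    rng = random.Random(seed)
    decay_constant = math.log(2) / half_life  # lambda = ln(2) / half-life
    return [rng.expovariate(decay_constant) for _ in range(n_atoms)]


def fraction_remaining(decay_times, t):
    """Fraction of atoms that have not yet decayed at time t."""
    return sum(1 for x in decay_times if x > t) / len(decay_times)


if __name__ == "__main__":
    half_life = 10.0  # hypothetical value, arbitrary time units

    # A handful of atoms: the individual decay times are scattered.
    few = simulate_decay_times(5, half_life, seed=1)
    print("individual decay times:", [round(x, 1) for x in few])

    # A large population: the surviving fraction tracks 0.5 ** (t / half_life).
    many = simulate_decay_times(100_000, half_life, seed=1)
    for t in (10.0, 20.0, 30.0):
        observed = fraction_remaining(many, t)
        expected = 0.5 ** (t / half_life)
        print(f"t={t:5.1f}  observed={observed:.3f}  expected={expected:.3f}")
```

Running the script with different seeds changes which atoms decay early or late, while the surviving fraction of the large population stays close to the expected curve.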
Single cell and single molecule studies increasingly provide mechanistic insights into a range of biological processes, from evolutionary to cognitive and pathogenic mechanisms. The effects of stochastic events are complicated by the developing and adaptive nature of biological systems and appear to be influenced by the genetic background. In some cases, homeostatic (feedback) mechanisms return the system to its original state. In others, the stochastic expression (or mutation) of a particular gene (or set of genes) leads to a cascade of downstream effects that change the system, such that subsequent events become more or less probable, a process nicely illustrated in recent real-time studies of the evolution of antibiotic resistance in bacteria (←FIG & a seriously cool video). The stochastic (molecular clock) nature of an organism’s intrinsic mutation rate has recently been used with the ExAC system to visualize the impact of selective and non-adaptive effects on human genes.
Pedagogical studies: The “Framework for K12 science education” ignores stochastic processes altogether, while the “Vision and Change in Undergraduate Biology Education” document contains a single point that calls for “incorporating stochasticity into biological models” (p. 17), but omits details of what this means in practice. People (even scholars) often have a difficult time developing an accurate understanding of stochastic processes (see “Understanding Randomness and its Impact on Student Learning” and “Fooled by Randomness”). The failure to appreciate the ubiquity of stochastic processes in biological systems has been an obstacle to the acceptance of Darwinian evolution. In this light, it seems well past time to rethink the foundational roles of stochastic processes in biological (as well as chemical and physical) systems and how best to introduce such processes to students through coherent course narratives and supporting materials.
A number of studies indicate that students call upon deterministic models to explain a range of stochastic processes. The fact that all too often students are introduced to the behaviors of cellular and molecular level biological systems through depictions that are overtly deterministic does not help the situation. In the majority of instructional videos, for example, molecules appear to know where they are heading and move there with a purpose. Similarly, the folding of polypeptides is often depicted as a deterministic process, although the proliferation of model-based simulations offers a more realistic depiction (see below). That said, the widespread involvement of chaperones is rarely acknowledged. Macromolecules are commonly depicted as rigid rather than as dynamic. The thermally driven opening and closing of the DNA double-helix (a consequence of the weakness of intermolecular interactions) is rarely illustrated. Molecules recognize one another and (apparently) stay locked in their mutual embrace forever; the role of thermal collisions in driving molecular dissociation (and binding specificity) is rarely considered in most textbooks, and presumably, in the classes that use these books. Moreover, the factors involved in inter-molecular interactions are often poorly understood, even after the completion of conventional university level chemistry courses. The energetic factors that determine enzyme specificity and reaction rates and the binding of transcription factors to their target DNA sequences, as well as the effects of mutations on these and other processes, often go uncommented on. It is not at all clear whether students appreciate that thermal collisions are responsible for the reversal of molecular interactions or that they supply a reaction’s activation energy.
Cells with the same genotype are implicitly expected to behave in identical ways (display the same phenotypes), a situation at odds with direct observation (see “Stochastic Gene Expression in a Single Cell” and “What’s Luck Got to Do with It: Single Cells, Multiple Fates, and Biological Non-determinism”) and with the general processes involved in cellular differentiation and social behaviors. Phenotypic penetrance and expressivity also involve stochastic behaviors, together with genetic background effects. It certainly does not help when instructors introduce a stochastic process, such as genetic drift, in the context of the Hardy-Weinberg model, a situation in which genetic drift does not occur. Such presentations are likely to increase student confusion.
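One way to see why genetic drift sits awkwardly inside the Hardy-Weinberg model is to simulate allele frequencies in finite populations. The sketch below uses a simple Wright-Fisher style resampling scheme, which is an assumption of this example rather than something specified in the text: each generation, every gene copy is drawn at random from the previous generation's allele pool, with no selection or mutation. The function name and the population sizes are hypothetical.

```python
import random


def wright_fisher(pop_size, p0, n_generations, seed=0):
    """Track one allele's frequency in a diploid population of pop_size.

    Each generation, 2 * pop_size gene copies are sampled at random from
    the previous generation (no selection, no mutation). Any change in
    frequency is therefore due to sampling alone, that is, genetic drift.
    """
    rng = random.Random(seed)
    n_copies = 2 * pop_size
    p = p0
    freqs = [p]
    for _ in range(n_generations):
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        freqs.append(p)
    return freqs


if __name__ == "__main__":
    # Small population: the frequency wanders and the allele is typically
    # lost (0.0) or fixed (1.0) well within a few hundred generations.
    small = wright_fisher(pop_size=10, p0=0.5, n_generations=500, seed=3)
    print("small population, final frequency:", small[-1])

    # Large population: the frequency stays near its starting value,
    # approaching the constant frequencies of the Hardy-Weinberg model.
    large = wright_fisher(pop_size=5000, p0=0.5, n_generations=40, seed=3)
    print("large population, final frequency:", round(large[-1], 3))
```

The Hardy-Weinberg model assumes an effectively infinite population, which corresponds to the limit in which this sampling noise vanishes; drift only appears once population size is finite.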
It is our impression that the typical instructional approach is to present molecular level processes in terms of large populations of molecules that behave in a deterministic manner. Consider the bacterium Escherichia coli’s lac operon, a group of genes that has been a workhorse in modern molecular biology and a common context through which to present the regulation of gene expression. Expression of the lac operon results in the synthesis of two proteins (lactose permease and β-galactosidase) that enable lactose to enter the cell and convert lactose into the monosaccharides glucose and galactose (which can be metabolized further) and allolactose, which binds to the lac repressor protein and inhibits its binding to DNA, allowing the expression of the lac operon. When the bulk behavior of a bacterial culture is analyzed, the expression of the lac operon increases as a smooth function over time (in the absence of other energy sources) (FIG. ↑). The result is that the expression of the proteins required for lactose metabolism is restricted to situations in which lactose is present.
The mechanistic quandary, rarely if ever considered explicitly as far as we can tell, is how the lac operon can “turn on” when the entry of lactose into the cell and the inactivation of the lac repressor both depend upon the operon’s expression. The situation becomes clear only when we consider the behavior of individual cells; LacZ expression goes from off to fully on in a stochastic manner (FIG. 2↑). Given that there are only ~5 to 10 lac repressor molecules and one to two copies of the lac operon per cell, there are moments when the operon is free of bound repressor and can be expressed. If such a “noisy” event happens to occur when lactose is present in the medium, expression of the lac operon allows lactose to enter the cell, the conversion of lactose into allolactose, the inactivation of the lac repressor, and stable expression of the lac operon. The stochastic behavior of the system enables individual cells to sample their environment and respond when useful metabolites are present while minimizing unnecessary metabolic expense (the synthesis of irrelevant polypeptides) when they are not. A similar logic is involved in quorum sensing, the emission of light (via the luciferase system), the regulation of the DNA uptake system, the generation of persister phenotypes, and programmed cell death (to benefit genetically related neighbors).
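The relationship between all-or-none switching in single cells and the smooth induction curve of a culture can be sketched with a deliberately crude toy model. Everything in it is an assumption made for illustration: time is divided into discrete steps, each cell has a small fixed probability per step (the hypothetical parameter `p_leak`) that its operon is briefly free of repressor, and such a leak locks the cell into the "on" state only if lactose is present. It ignores real kinetics, cell division, and protein dilution.

```python
import random


def simulate_culture(n_cells, n_steps, p_leak, lactose_present, seed=0):
    """Toy model of stochastic lac operon induction.

    Returns (switch_times, fraction_on): the step at which each cell
    switched on (None if it never did) and the fraction of cells that
    are on after each step.
    """
    rng = random.Random(seed)
    switch_times = [None] * n_cells
    fraction_on = []
    n_on = 0
    for step in range(1, n_steps + 1):
        for i in range(n_cells):
            if switch_times[i] is not None:
                continue  # already stably on (positive feedback)
            if rng.random() < p_leak:
                # Operon transiently free of repressor: a "noisy" burst.
                if lactose_present:
                    # Permease lets lactose in, the inducer inactivates
                    # the repressor, and expression becomes stable.
                    switch_times[i] = step
                    n_on += 1
                # Without lactose the burst has no lasting effect.
        fraction_on.append(n_on / n_cells)
    return switch_times, fraction_on


if __name__ == "__main__":
    times, curve = simulate_culture(2000, 100, p_leak=0.05,
                                    lactose_present=True, seed=1)
    print("switch times of five cells:", times[:5])
    print("fraction on at steps 10, 50, 100:",
          curve[9], curve[49], curve[99])
```

In this sketch each individual cell jumps from off to on at an unpredictable step, while the fraction of induced cells rises smoothly, approximately as 1 - (1 - p_leak) ** step. With `lactose_present=False` the same bursts occur but no cell commits.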
What is a biology educator to do? The question that faces the reflective educational designer and enlightened instructor is how their course should address the multiple roles of stochastic processes within biological systems. I have a short set of recommendations that I think both designers and instructors might want to consider; many have been incorporated into ongoing efforts at course design, which I have only recently (2019) begun to think of as educational engineering. First, it should be explicitly recognized, and conveyed to students, that stochastic processes are difficult to understand, as witnessed by the common belief in the Gambler’s fallacy and the “hot hand”. Students need to be given adequate time to work with, and appropriate feedback on, the behavior of stochastic systems. Secondly, and rather obviously, instructors should illustrate and articulate the role of stochastic processes in a range of biological systems, from phenotypic variation and evolutionary events, including the effects of mutations and various non-adaptive processes (such as genetic drift), to de novo gene formation, gene expression, drug-target interactions, and reaction kinetics. Finally, the stochastic behaviors of molecular (and cellular) level processes should be accurately and explicitly illustrated. Among currently available examples there are those that illustrate the movement of a water molecule through a membrane, either through an aquaporin molecule or on its own, as well as a PhET applet that illustrates the Elowitz et al. study on stochastic GFP-expression in E. coli (and allows for student manipulation of key regulatory parameters). A simulation of the nature of intermolecular interactions and the role of molecular collisions in their formation has been developed for use with the CLUE Chemistry curriculum.
Students’ understanding of stochastic processes can be revealed to instructors (and their students) through the use of various targeted assessment tools (such as the Biology Concepts Instrument or BCI, the Genetic Drift Instrument, and diagnostic assessments of student thinking about stochastic processes). For example, students can be asked to draw a graph that reflects the movement of a macroscopic projectile (FIG. 3A→) versus a molecular (microscopic) object (FIG. 3B→); such a task can reveal whether students can make the transition from the well behaved (deterministic) to the stochastic. Drawing (and explanation) has been used extensively in the analysis of student understanding within the context of the CLUE project (“lost in lewis structures”, “noncovalent interactions”, and “relationships between molecular structure and properties”). In a similar vein, network dynamics, including the cascade effects driving cell level divergence and the feedback and regulatory interactions involved in limiting the effects of noise and generating various outcomes (cellular differentiation), can be presented to students (see “Network motifs: simple building blocks of complex networks”, “Using graph-based assessments within socratic tutorials” and “Noise facilitates transcriptional control under dynamic inputs”). One can consider the role of stochastic events within social systems, including responses to various aberrant behaviors (social cheating, cancer), in terms of social feedback mechanisms (apoptosis, positive and negative feedback, lateral inhibition of cell differentiation), and in the context of the decisions involved in stem cell division, differentiation, and cancer formation. By introducing students to the roles and implications of stochastic processes in biological systems, we can help them develop a coherent understanding of the predictable, but not completely deterministic, nature of such systems.
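The projectile versus molecule drawing task has a simple computational counterpart that students could explore. The sketch below is a hedged illustration, not the assessment itself: the projectile follows the standard constant-gravity formula without air resistance, and the molecular object is approximated by a one-dimensional random walk with equal-sized steps, a common simplification of thermally driven motion. All function names and parameter values are hypothetical.

```python
import random


def projectile_height(v0, t, g=9.81):
    """Height of a projectile launched straight up at speed v0 (no drag)."""
    return v0 * t - 0.5 * g * t * t


def random_walk(n_steps, step_size=1.0, seed=None):
    """Positions of one particle taking random left/right steps in 1D."""
    rng = random.Random(seed)
    x = 0.0
    path = [x]
    for _ in range(n_steps):
        x += rng.choice((-step_size, step_size))
        path.append(x)
    return path


def mean_squared_displacement(n_walkers, n_steps, seed=0):
    """Average of (final position) ** 2 over many unit-step walkers."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walkers):
        x = sum(rng.choice((-1.0, 1.0)) for _ in range(n_steps))
        total += x * x
    return total / n_walkers


if __name__ == "__main__":
    # Deterministic: the same launch always gives the same trajectory.
    print([round(projectile_height(10.0, t / 4), 2) for t in range(5)])

    # Stochastic: two walkers under identical conditions diverge.
    print(random_walk(10, seed=1))
    print(random_walk(10, seed=2))

    # Yet the population average is well behaved: for unit steps the
    # mean squared displacement is expected to be close to n_steps.
    print(mean_squared_displacement(4000, 100, seed=0))
```

Plotting several walkers on one set of axes, next to the single parabola of the projectile, gives a picture of what a student's two graphs might be expected to capture.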
Some minor edits: 5 May 2019 and 2 October 2020 – figures re-inserted 20 June 2020
 Scientific animation: protein production and folding
 See this recording of stem cells
 Aquaporin and a lipid membrane
 PhET gene expression basics – see panel 3