Embryogenesis is based on a framework of social (cell-cell) interactions, initial and early asymmetries, and cascading cell-cell signaling and gene regulatory networks (DEVO posts one, two, & three). The result is the generation of embryonic axes, germ layers (ectoderm, mesoderm, endoderm), various organs and tissues (brains, limbs, kidneys, hearts, and such) and their characteristic cell types, their patterning, and their coordination into a functioning organism. It is well established that all animals share a common ancestor (hundreds of millions of years ago) and that a number of molecular modules were already present in that common ancestor.
At the same time, evolutionary processes are, and need to be, flexible enough to generate the great diversity of organisms, with their various adaptations to particular lifestyles. The extent of both conservation and flexibility (new genes, new mechanisms) in developmental systems is, however, surprising. Perhaps the most striking evidence for the depth of this conservation was supplied by the discovery of the organization of the Hox gene cluster in the fruit fly Drosophila and in the mouse (and other vertebrates). In both, the Hox genes share a common genomic arrangement and a common pattern of expression. But as noted by Denis Duboule (2007), Hox gene organization is often presented in textbooks in a distorted manner (↓).
The Hox gene clusters of vertebrates are compact, but are split, disorganized, and even “atomized” in other types of organisms. Similarly, a process that might appear foundational, the role of the Bicoid gradient in the early fruit fly embryo (a standard topic in developmental biology textbooks), is in fact restricted to a small subset of flies (Stauber et al., 1999). New genes can be generated through well-defined processes, such as gene duplication and divergence, or they can arise de novo out of sequence noise (Carvunis et al., 2012; Zhao et al., 2014 – see Van Oss & Carvunis 2019. De novo gene birth). Comparative genomic analyses can reveal the origins of specific adaptations (see Stauber et al., 1999). The result is that organisms as closely related to each other as the great apes (including humans) have significant species-specific genetic differences (see Florio et al., 2018; McLean et al., 2011; Sassa, 2013 and references therein) as well as common molecular and cellular mechanisms.
A universal (?) feature of developing systems – gradients and non-linear responses: There is a predilection to find (and even more to teach) simple mechanisms that attempt to explain everything (witness the distortion of the Hox cluster, above) – a form of physics “theory of everything” envy. But the historic nature, evolutionary plasticity, and need for regulatory robustness generally lead to complex and idiosyncratic responses in biological systems. Biological systems are not “intelligently designed” but rather cobbled together over time through noise (mutation) and selection (Jacob, 1977)(see blog post).
That said, a common (universal?) developmental process appears to be the transformation of asymmetries into unambiguous cell fate decisions. Such responses are based on threshold events controlled by a range of molecular behaviors, leading to discrete gene expression states. We can approach the question of how such decisions are made from both an abstract and a concrete perspective. Here I outline my initial approach – I plan to introduce organism specific details as needed. I start with the response to a signaling gradient, such as that found in many developmental systems, including the vertebrate spinal cord (top image Briscoe and Small, 2015) and the early Drosophila embryo (Lipshitz, 2009)(↓).
We begin with a gradient in the concentration of a “regulatory molecule” (the regulator). The shape of the gradient depends upon the sites and rates of synthesis, transport away from these sites, and turnover (degradation and/or inactivation). We assume, for simplicity’s sake, that the regulator directly controls the expression of target gene(s). Such a molecule binds in a sequence-specific manner to regulatory sites (there could be a few or hundreds) and leads to the activation (or inhibition) of the DNA-dependent RNA polymerase (polymerase), which generates RNA molecules complementary to one strand of the DNA. The binding of the regulator and the binding of the polymerase are both stochastic processes, driven by diffusion, molecular collisions, and binding interactions.(1)
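To make the dependence on synthesis, transport, and turnover concrete, here is a minimal sketch under the simplest assumptions I can think of: the regulator is made at a single source (position 0), spreads by simple diffusion, and is degraded everywhere at the same first-order rate. Under those assumptions (which are not claimed to hold for any of the systems discussed here) the steady-state gradient is exponential, with a length scale equal to the square root of the diffusion coefficient divided by the degradation rate. The function names and parameter values are hypothetical, chosen only for illustration.

```python
import math


def decay_length(diffusion_coeff, degradation_rate):
    """Distance over which the steady-state gradient falls by a factor of e.

    Assumes simple diffusion plus uniform first-order degradation.
    """
    return math.sqrt(diffusion_coeff / degradation_rate)


def regulator_concentration(x, source_conc, diffusion_coeff, degradation_rate):
    """Steady-state regulator concentration at distance x from the source."""
    lam = decay_length(diffusion_coeff, degradation_rate)
    return source_conc * math.exp(-x / lam)


if __name__ == "__main__":
    D = 1.0    # hypothetical diffusion coefficient (um^2 / s)
    k = 0.01   # hypothetical degradation rate (1 / s)
    print(f"decay length = {decay_length(D, k):.1f} um")
    for x in (0, 10, 20, 50):
        c = regulator_concentration(x, 100.0, D, k)
        print(f"x = {x:3d} um   concentration = {c:7.2f}")
```

Faster turnover or slower transport shortens the gradient, while a higher rate of synthesis at the source raises it everywhere without changing its shape.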
Let us now consider the response of target gene(s) as a function of cell position within the gradient. We might (naively) expect that the rate of target gene expression would be a simple function of regulator concentration. For an activator, where the regulator concentration is high, target gene expression would be high; where it is low, target gene expression would be low; in between, target gene expression would be proportional to regulator concentration. But generally we find something different: the expression of target genes is not graded in this way. Instead there are thresholds in the gradient. On one side of the threshold concentration the target gene is completely off (not expressed), while on the other side of the threshold concentration the target gene is fully on (maximally expressed). The target gene responds as if it is controlled by an on-off switch. How do we understand the molecular basis for this behavior?
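The contrast between the naive expectation and the observed switch can be sketched numerically. The example below compares a proportional response with a Hill function, a common phenomenological way to describe a steep, threshold-like response. The Hill form, the threshold value, and the Hill coefficient are assumptions made for illustration; they are not derived from any particular gene, and real systems reach switch-like behavior by a variety of mechanisms.

```python
def proportional_response(conc, max_conc):
    """Naive expectation: expression rises in proportion to regulator concentration."""
    return min(conc / max_conc, 1.0)


def switch_like_response(conc, threshold, hill_coefficient):
    """Hill function: fraction of maximal expression at a given regulator concentration.

    A larger hill_coefficient gives a sharper off-to-on transition at the threshold.
    """
    if conc <= 0:
        return 0.0
    ratio = (conc / threshold) ** hill_coefficient
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    threshold = 50.0  # hypothetical threshold concentration (arbitrary units)
    for conc in (10, 25, 40, 50, 60, 75, 100):
        graded = proportional_response(conc, 100.0)
        switched = switch_like_response(conc, threshold, 8)
        print(f"regulator = {conc:3d}   proportional = {graded:.2f}   switch-like = {switched:.2f}")
```

With a Hill coefficient of 8, a cell at half the threshold concentration expresses the target at well under 1% of the maximal level, while a cell at twice the threshold is above 99%.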
Distinct mechanisms are used in different systems, but we will consider a system from the gastrointestinal bacterium E. coli that students may already be familiar with: the genes that enable E. coli to digest the mammalian milk sugar lactose. They encode a protein needed to import lactose into a bacterial cell and an enzyme needed to break lactose down so that it can be metabolized. Given the energetic cost of synthesizing these proteins, it is in the bacterium’s adaptive self-interest to synthesize them only when lactose is present at sufficient concentrations in its environment. The response is functionally similar to that associated with quorum sensing, which is also governed by threshold effects. Similarly, cells respond to the concentration of regulator molecules (in a gradient) by turning on specific genes in specific domains, rather than uniformly.
Now let us look in a little more detail at the behavior of the lactose utilization system in E. coli, following an analysis by Vilar et al (2003)(2). At an extracellular lactose concentration below the threshold, the system is off. If we increase the extracellular lactose concentration above the threshold, the system turns on: the lactose permease and β-galactosidase proteins are made, and lactose can enter the cell and be broken down to produce metabolizable sugars. By looking at individual cells, we find that they transition, apparently stochastically, from off to on (→), but whether they stay on depends upon the extracellular lactose concentration. We can define a concentration, the maintenance concentration, below the threshold, at which “on” cells will remain on, while “off” cells will remain off.
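The idea of a maintenance concentration, at which a cell's state depends on its history, can be illustrated with a toy positive-feedback model. This is not the model of Vilar et al (2003); it is a deliberately minimal sketch in which a single variable x stands for the normalized level of lac proteins in a cell, lactose uptake is assumed to scale with both extracellular lactose and x, and expression is assumed to respond to uptake through a squared (cooperative) term. All parameter values are hypothetical.

```python
def expression_rate(x, lactose, basal=0.05, max_rate=1.0):
    """Net rate of change of x, the normalized level of lac proteins in one cell.

    Toy assumptions: a small basal (leaky) synthesis, induced synthesis that
    depends on uptake (lactose * x), and first-order loss by turnover/dilution.
    """
    uptake = (lactose * x) ** 2
    return basal + max_rate * uptake / (1.0 + uptake) - x


def settle(x_start, lactose, dt=0.01, steps=20000):
    """Integrate the model (simple Euler steps) until it has settled."""
    x = x_start
    for _ in range(steps):
        x += dt * expression_rate(x, lactose)
    return x


if __name__ == "__main__":
    x = 0.0  # start with an uninduced cell
    # Raise the lactose level, then bring it back down again.
    for lactose in (2.0, 3.0, 2.0, 1.0):
        x = settle(x, lactose)
        state = "on" if x > 0.5 else "off"
        print(f"lactose = {lactose:.1f}   lac protein level = {x:.3f}   ({state})")
```

With these particular values, a lactose level of 2.0 behaves like a maintenance concentration: the cell is off when it arrives there from below, but remains on when it arrives there after having been switched on at 3.0. It turns off again only when lactose drops to 1.0.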
The circuitry of the lactose system is well defined (Jacob and Monod, 1961; Lewis, 2013; Monod et al., 1963)(↓). The lacI gene encodes the lactose operon repressor protein, which is expressed constitutively at a low level; it binds to sequences in the lac operon and inhibits transcription. The lac operon itself contains three genes whose expression is regulated by a constitutively active promoter. The lacY gene encodes the permease, while lacZ encodes β-galactosidase. β-galactosidase has two functions: it catalyzes the reaction that transforms lactose into allolactose, and it cleaves lactose into the metabolically useful sugars glucose and galactose. Allolactose is an allosteric modulator of the Lac repressor protein; if allolactose is present, it binds to lac repressor proteins and inactivates them, allowing lac operon expression.
The cell normally contains only ~10 lactose repressor proteins. Periodically (stochastically), even in the absence of lactose (and so of its derivative allolactose), the lac operon promoter region is free of repressor proteins and the lac operon is briefly expressed: a few LacY and LacZ polypeptides are synthesized (↓). This noisy leakiness in the regulation of the lac operon allows the cell to respond if lactose happens to be present; some lactose molecules enter the cell through the permease and are converted to allolactose by β-galactosidase. Allolactose is an allosteric effector of the lac repressor; when present it binds to and inactivates the lac repressor protein so that it no longer binds to its target sequences (the operator or “O” sites). In the absence of repressor binding, the lac operon is expressed. If lactose is not present, the lac operon is inhibited and LacY and LacZ disappear from the cell by turnover or growth-associated dilution.
How the threshold concentration for a given signal-regulated decision is set often depends on homeostatic processes that oppose the signaling response. The binding and activation of regulators can involve cooperative interactions between molecular components, as well as both positive and negative feedback effects.
In the case of patterning a tissue, in terms of regional responses to a signaling gradient, there can be multiple regulatory thresholds for different genes, as well as indirect effects, where the expression of one set of target genes alters the sensitivity of subsequent sets of genes to the signal. One widely noted mechanism, known as reaction-diffusion, was suggested by the English mathematician Alan Turing (see Kondo and Miura, 2010); it postulates a two-component system. One component is an activator of gene expression, which in addition to regulating its various other targets, positively regulates its own expression. The second component is a repressor of the first. Both regulator molecules are released by the signaling cell or cells; the repressor diffuses away from the source faster than the activator does. The result can be a domain of target gene expression (where the concentration of activator is sufficient to escape repression), surrounded by a zone in which expression is inhibited (where repressor concentration is sufficient to inhibit the activator). Depending upon the geometry of the system, this can result in discrete regions (dots or stripes) of primary target gene expression (see Sheth et al., 2012). In real systems there are often multiple gradients present; their relative orientations can produce a range of patterns.
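Why the faster diffusion of the repressor matters can be seen in a small calculation. The sketch below takes a linearized two-component system near a uniform steady state and asks whether a small spatial ripple of a given wavenumber grows or decays. It assumes the standard activator-inhibitor formulation, in which the activator promotes its own production and that of the repressor, and the repressor suppresses the activator. The four interaction coefficients and the diffusion coefficients are hypothetical numbers chosen so that the uniform state is stable when nothing diffuses.

```python
import math

# Hypothetical linearized interaction coefficients near the uniform state:
# (activator on activator, repressor on activator,
#  activator on repressor, repressor on repressor)
INTERACTIONS = (1.0, -2.0, 3.0, -4.0)


def growth_rate(q2, interactions, d_activator, d_repressor):
    """Largest real part of the growth rates for a ripple with squared wavenumber q2."""
    a, b, c, d = interactions
    m11 = a - d_activator * q2
    m22 = d - d_repressor * q2
    trace = m11 + m22
    det = m11 * m22 - b * c
    disc = trace * trace / 4.0 - det
    if disc >= 0:
        return trace / 2.0 + math.sqrt(disc)
    return trace / 2.0  # complex pair: the real part is trace / 2


def fastest_growing_mode(interactions, d_activator, d_repressor, q2_max=5.0, steps=5000):
    """Scan squared wavenumbers from 0 to q2_max; return (q2, rate) with the largest rate."""
    best_q2 = 0.0
    best_rate = growth_rate(0.0, interactions, d_activator, d_repressor)
    for i in range(1, steps + 1):
        q2 = q2_max * i / steps
        rate = growth_rate(q2, interactions, d_activator, d_repressor)
        if rate > best_rate:
            best_q2, best_rate = q2, rate
    return best_q2, best_rate


if __name__ == "__main__":
    for d_rep in (1.0, 20.0):
        q2, rate = fastest_growing_mode(INTERACTIONS, 1.0, d_rep)
        verdict = "pattern can form" if rate > 0 else "stays uniform"
        print(f"repressor diffusion = {d_rep:4.1f}   max growth rate = {rate:+.3f}   ({verdict})")
        if rate > 0:
            print(f"  favored wavelength ~ {2 * math.pi / math.sqrt(q2):.1f} (arbitrary length units)")
```

With equal diffusion coefficients every ripple decays and the tissue stays uniform. When the repressor diffuses 20 times faster than the activator, ripples within a band of wavelengths grow, which is the origin of the regularly spaced dots or stripes.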
The point of all of this is that when we approach a particular system, we need to consider the mechanisms involved. Typically they are selected to produce desired phenotypes, but also to be robust, in the sense that they need to produce the same patterns even if the system in which they occur is subject to perturbations, such as differences in embryo/tissue size (due to differences in cell division / growth rates), temperature, and other environmental variables.
note: figures returned – updated 13 November 2020.
(1) While stochastic (random), these processes can still be predictable. A classic example involves the decay of an unstable isotope (atom), which is predictable at the population level, but unpredictable at the level of an individual atom. Similarly, in biological systems, the binding and unbinding of molecules to one another, such as a protein transcription regulator to its target DNA sequence, is stochastic but can be predictable in a large enough population.
(2) and presented in biofundamentals (pages 216-218).
Briscoe & Small (2015). Morphogen rules: design principles of gradient-mediated embryo patterning. Development 142, 3996-4009.
Carvunis et al (2012). Proto-genes and de novo gene birth. Nature 487, 370.
Duboule (2007). The rise and fall of Hox gene clusters. Development 134, 2549-2560.
Florio et al (2018). Evolution and cell-type specificity of human-specific genes preferentially expressed in progenitors of fetal neocortex. eLife 7.
Jacob (1977). Evolution and tinkering. Science 196, 1161-1166.
Jacob & Monod (1961). Genetic regulatory mechanisms in the synthesis of proteins. Journal of Molecular Biology 3, 318-356.
Kondo & Miura (2010). Reaction-diffusion model as a framework for understanding biological pattern formation. Science 329, 1616-1620.
Lewis (2013). Allostery and the lac Operon. Journal of Molecular Biology 425, 2309-2316.
Lipshitz (2009). Follow the mRNA: a new model for Bicoid gradient formation. Nature Reviews Molecular Cell Biology 10, 509.
McLean et al (2011). Human-specific loss of regulatory DNA and the evolution of human-specific traits. Nature 471, 216-219.
Monod, Changeux & Jacob (1963). Allosteric proteins and cellular control systems. Journal of Molecular Biology 6, 306-329.
Sassa (2013). The role of human-specific gene duplications during brain development and evolution. Journal of Neurogenetics 27, 86-96.
Sheth et al (2012). Hox genes regulate digit patterning by controlling the wavelength of a Turing-type mechanism. Science 338, 1476-1480.
Stauber et al (1999). The anterior determinant bicoid of Drosophila is a derived Hox class 3 gene. Proceedings of the National Academy of Sciences 96, 3786-3789.
Vilar et al (2003). Modeling network dynamics: the lac operon, a case study. Journal of Cell Biology 161, 471-476.
Zhao et al (2014). Origin and spread of de novo genes in Drosophila melanogaster populations. Science 343, 769-772.