On teaching genetics, social evolution and understanding the origins of racism

Links between genetics and race crop up periodically in the popular press (link; link), but the real, substantive question, and the topic of a number of recent essays (see Saletan. 2018a. Stop Talking About Race and IQ), is whether the idea of “race,” as commonly understood and used by governments to categorize people (link), makes scientific sense.  More to the point, do biology educators have an unmet responsibility to modify and extend their materials and pedagogical approaches to address the non-scientific, often racist, implications of racial characterizations?  Such questions are complicated by a second factor, independent of whether the term race has any useful scientific purpose: the need to help students understand the biological (evolutionary) origins of racism itself, together with the stressors that lead to its periodic re-emergence as a socio-political factor. In times of social stress, reactions to strangers (others), identified by variations in skin color or overt religious or cultural signs (dress), can provoke hostility against those perceived to be members of a different social group.  As far as I can tell, few in the biology education community, which includes those involved in generating textbooks, organizing courses and curricula, and the design, delivery, and funding of various public science programs (PBS’s NOVA, the science education efforts of HHMI and other private foundations, and programs such as Science Friday on public radio), directly address the roots of racism. These roots are associated with biological processes such as the origins and maintenance of multicellularity and other forms of social organization among organisms, processes involved in coordinating their activities and establishing defenses against social cheaters and against processes such as cancer, in an organismic context (1).  
These established defense mechanisms can, if not recognized and understood, morph into reflexive and unjustified intolerance, hostility toward, and persecution of various “distinguishable others.”  I will consider both questions, albeit briefly, here. 


Two factors have influenced my thinking about these questions.  The first involves the design of the biofundamentals text/course and its extension to include topics in genetics (2).  This involved thinking about what is commonly taught in genetics, what is critical for students to know going forward (and by implication what is not), and where materials on genetic processes best fit into a molecular biology curriculum (3).  While I was engaged in such navel gazing, an email arrived from Malcolm Campbell describing student responses to the introduction of a chapter section on race and racism in his textbook Integrating Concepts in Biology.  The various ideas of race, the origins of racism, and the periodic appearance of anti-immigrant, anti-religious, and racist groups raise an important question: how best to distinguish an undeniable observation, that different, isolated sub-populations of a species can be distinguished from one another (see quote from Ernst Mayr’s 1994 “Typological versus Population thinking”), from the deeper biological reality, that at the level of the individual these differences are meaningless. In what I think is an interesting way, the idea that people can be meaningfully categorized as instances of various platonic ideals (for example, as members of one race or another), based on anatomical or linguistic differences between once-distinct sub-populations of humans, is similar to the dichotomy between common wisdom (which has shaped, for example, people’s working understanding of the motion of objects) and the counter-intuitive nature of empirically established scientific ideas (e.g. Newton’s laws and the implications of Einstein’s theory of general relativity): what appears on the surface to be true is, in fact, not.  In this specific case, there is a pressure toward what Mayr terms “typological” thinking, in which we class people into idealized (platonic) types or races ().   

As pointed out most dramatically, and repeatedly, by Mayr (1985; 1994; 2000), and supported by the underlying commonality of molecular biological mechanisms and the continuity of life, stretching back to the last universal common ancestor, there are only individuals who are members of various populations that have experienced various degrees of separation from one another.  In many cases these populations have diverged; through geographic, behavioral, and structural adaptations driven by natural, social, and sexual selection, together with the effects of various non-adaptive events, such as bottlenecks, founder effects, and genetic drift, they may eventually become reproductively isolated from one another, forming new species.  An understanding of evolutionary principles and molecular mechanisms transforms biology from a study of non-existent types to a study of populations with their origins in common, sharing a single root, the last universal common ancestor (LUCA).   Over the last ~200,000 years the movement of humans, first within Africa and then across the planet, has been impressive ().  These movements have been accompanied by the fragmentation of human populations: Campbell and Tishkoff (2008) identified 13 distinct ancestral African populations, while Busby et al. (2016) recognized 48 sub-Saharan population groups.  This fragmentation is now being reversed (or rather rendered increasingly less informative) by the effects of migration and extensive intermingling ().   

Ideas, such as race (and in a sense species), try to make sense of the diversity of the many different types of organisms we observe. They are based on a form of essentialist or typological thinking – thinking that different species and populations are completely different “kinds” of objects, rather than individuals in populations connected historically to all other living things. Race is a more pernicious version of this illusion: a pseudo-scientific, political, and ideological idea that postulates that humans come in distinct, non-overlapping types (quote again, from Mayr).  Such a weird idea underlies the various illogical and often contradictory legal “rules” by which a person’s “race” is determined.  

Given the reality of the individual and the unreality of race, racial profiling (see Satel, 2002) can lead to serious medical mistakes, as made clear in the essays by Acquaviva & Mintz (2010) “Are We Teaching Racial Profiling?”, Yudell et al. (2016) “Taking Race out of Human Genetics”, and Donovan (2014) “The impact of the hidden curriculum”. 

The idea of race as a type fails to recognize the dynamics of the genome over time.  Were it possible (sadly it is not), a comparative analysis of the genomes of a “living fossil”, such as modern day coelacanths, and its ancestors (living more than 80 million years ago) would likely reveal dramatic changes in genomic DNA sequence.  In this light, the fact that between 100 and 200 new mutations are introduced into the human genome per generation (see Dolgin 2009 Human mutation rate revealed) seems like a useful number for students, not to mention the general public, to appreciate. Similarly, the genomic/genetic differences between humans, our primate relatives, and other mammals, and the mechanisms behind them (Levchenko et al., 2017)(blog link), would seem worth considering and explicitly incorporating into curricula on genetics and human evolution.  
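To make that per-generation number concrete, here is a back-of-the-envelope sketch (in Python); the per-generation rate and the generation time used are round, illustrative assumptions, not measured values:

```python
# Back-of-the-envelope: new mutations accumulating along a single
# line of descent. Illustrative round numbers, not measured data.
MUTATIONS_PER_GENERATION = 150   # midpoint of the ~100-200 estimate
GENERATION_TIME_YEARS = 25       # a common rough figure

def new_mutations(years):
    """Expected new mutations along one lineage over `years`."""
    generations = years / GENERATION_TIME_YEARS
    return generations * MUTATIONS_PER_GENERATION

# Over the ~200,000 years of modern human movements:
print(f"{new_mutations(200_000):,.0f} new mutations per lineage")
# prints "1,200,000 new mutations per lineage"
```

The point of the arithmetic is simply that genomes are moving targets: over the time scale of the human migrations described above, each lineage accumulates changes on the order of a million bases.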

While race may be meaningless, racism is not.  How to understand racism?  Is it some kind of political artifact, or does it arise from biological factors?  Here, I believe, we find an important omission in many biology courses, textbooks, and curricula, namely an introduction to, and meaningful discussion of, social evolutionary mechanisms. Many is the molecular/cell biology curriculum that completely ignores such evolutionary processes. Yet the organisms that are the primary focus of biological research (and who pay for such research, e.g. humans) are social organisms at two levels.  In multicellular organisms, somatic cells, which specialize to form muscular, neural, circulatory and immune systems, bone, and connective tissues, sacrifice their own inter-generational reproductive future to assist their germ line (sperm and/or egg) relatives, the cells that give rise to the next generation of organisms, a form of inclusive fitness (Dugatkin, 2007).  Moreover, humans are social organisms, often sacrificing themselves, sharing their resources, and showing kindness to other members of their group. This social cooperation is threatened by cheaters of various types (POST LINK).  Unless such social cheaters are suppressed, by a range of mechanisms and through processes of kin/group selection, multicellular organisms die and dysfunctional social populations are likely to die out.  Without the willingness to cooperate and, when necessary, self-sacrifice, social organization is impossible – no bee hives, no civilizations.  Imagine a human population composed solely of people who behave in a completely selfish manner, honoring neither their promises nor their social obligations.  

A key to social interactions is recognizing those who are, and who are not, part of your social group.  A range of traits can serve as markers for social inclusion.  A plausible hypothesis is that the explicit importance of group membership and defined social interactions becomes more critical when a society, or a part of society, is under stress.  Within the context of social stratification, those in the less privileged groups may feel that the social contract has been broken or made a mockery of.  The feeling (apparent reality) that members of “elite” or excessively privileged sub-groups are not willing to make sacrifices for others serves as evidence that social bonds are being broken (4). Times of economic and social disruption (migrations and conquests) can lead to increased explicit recognition of both group and non-group identification.  The idea that outsiders (non-group members) threaten the group can feed racism, a justification for why non-group members should be treated differently from group members.  From this position it is a small (conceptual) jump to the conclusion that non-group members are somehow less worthy, less smart, less trustworthy, less human – different in type from members of the group.  Many of these same points are made in an op-ed piece by Judis. 2018. What the Left Misses About Nationalism.

That economic or climatic stresses can foster the growth of racist ideas is not a new idea. The unequal effects of the various disruptions likely to be associated with the spread of automation (quote from George Will) and the impact of climate change on migrations of groups within and between countries (see Saletan 2018b: Why Immigration Opponents Should Worry About Climate Change) are likely to spur various forms of social unrest, whether revolution or racism, or both; such responses could be difficult to avoid or control.   

So back to the question of biology education: in this context, the ingrained responses of social creatures, responses associated with social cohesion and integrity, need to be explicitly presented.  Similarly, variants of such mechanisms occur within multicellular organisms, and how they work is critical to understanding how diseases such as cancer, one of the clearest forms of a cheater phenotype, are suppressed.  Social evolutionary mechanisms provide the basis for understanding a range of phenomena, and the ingrained effects of social selection may be seen as one of the roots of racism, or at the very least a contributing factor worth acknowledging explicitly.  

Thanks to Melanie Cooper and Paul Strode for comments. Minor edits 4 May 2019.

Footnotes:

  1. It is worth considering whether the 1%, or rather the super 0.1%, represent their own unique form of social parasite, leading periodically to various revolutions – although sadly, new social parasites appear to re-emerge quite quickly.
  2. A part of the CoreBIO-biofundamentals project 
  3. At this point it is worth noting that biofundamentals itself includes sections on social evolution, kin/group and sexual selection (see Klymkowsky et al., 2016; LibreText link). 
  4. One might be forgiven for thinking that rich and privileged folk who escape paying what is seen as their fair share of taxes might be cast as social cheaters (parasites) who, rather than encouraging racism, might provoke revolutionary thoughts and actions. 

Literature cited: 

Acquaviva & Mintz. (2010). Perspective: Are we teaching racial profiling? The dangers of subjective determinations of race and ethnicity in case presentations. Academic Medicine 85, 702-705.

Busby et  al. (2016). Admixture into and within sub-Saharan Africa. Elife 5, e15266.

Campbell & Tishkoff. (2008). African genetic diversity: implications for human demographic history, modern human origins, and complex disease mapping. Annu. Rev. Genomics Hum. Genet. 9, 403-433.

Donovan, B.M. (2014). Playing with fire? The impact of the hidden curriculum in school genetics on essentialist conceptions of race. Journal of Research in Science Teaching 51: 462-496.

Dugatkin, L. A. (2007). Inclusive fitness theory from Darwin to Hamilton. Genetics 176, 1375-1380.

Klymkowsky et al. (2016). The design and transformation of Biofundamentals: a non-survey introductory evolutionary and molecular biology course. LSE Cell Biol Edu pii: ar70.

Levchenko et al., (2017). Human accelerated regions and other human-specific sequence variations in the context of evolution and their relevance for brain development. Genome biology and evolution 10, 166-188.

Mayr, E. (1985). The Growth of Biological Thought: Diversity, Evolution, and Inheritance. Cambridge, MA: Belknap Press of Harvard University Press.

Mayr, E. (1994). Typological versus population thinking. Conceptual issues in evolutionary biology, 157-160.

—- (2000). Darwin’s influence on modern thought. Scientific American 283, 78-83.

Satel, S. (2002). I am a racially profiling doctor. New York Times 5, 56-58.

Yudell et al., (2016). Taking race out of human genetics. Science 351, 564-565.

Genes – way weirder than you thought

Pretty much everyone, at least in societies with access to public education or exposure to media in its various forms, has been introduced to the idea of the gene, but “exposure does not equate to understanding” (see Lanie et al., 2004).  Here I will argue that part of the problem is that instruction in genetics (or, in more modern terms, the molecular biology of the gene and its role in biological processes) has not kept up with advances in our understanding of the molecular mechanisms underlying biological processes (Gayon, 2016).

Let us reflect (for a moment) on the development of the concept of a gene. Over the course of human history, those who have been paying attention to such things have noticed that organisms appear to come in “types”, what biologists refer to as species. At the same time, individual organisms of the same type are not identical to one another; they vary in various ways. Moreover, these differences can be passed from generation to generation, and by controlling which organisms were bred together, people could obtain offspring displaying more extreme versions of the “selected” traits.  By strictly controlling which individuals were bred together over a number of generations, people were able to select for the specific traits they desired (as in dog breeds).  As an interesting aside, as people domesticated animals, such as cows and goats, the availability of associated resources (e.g. milk) led to reciprocal effects, resulting in traits such as adult lactose tolerance (see Evolution of (adult) lactose tolerance & Gerbault et al., 2011).  Overall, the process of plant and animal breeding is generally rather harsh (something that the fanciers of strange breeds who object to GMOs might reflect upon), in that individuals that did not display the desired trait(s) were generally destroyed or, at best, not allowed to breed.

Charles Darwin took inspiration from this process, substituting “natural” for artificial (human-determined) selection as the force shaping populations, eventually generating new species (Darwin, 1859).  Underlying such evolutionary processes was the presumption that traits, and their variation, were “encoded” in some type of “factors”, eventually known as genes and their variants, alleles.  Genes influence the organism’s molecular, cellular, and developmental systems, but the nature of these inheritable factors, and of the molecular trait-building machines active in living systems, was more or less completely obscure. 

Through his studies on peas, Gregor Mendel was the first to clearly identify some of the rules governing the behavior of these inheritable factors, using highly stereotyped, essentially discontinuous traits – a pea was either yellow or green, wrinkled or smooth.  Such traits, while they exist in other organisms, are in fact rare – an example of how the scientific exploration of exceptional situations can help reveal general processes.  The downside has been the promulgation of the idea that genes and traits are somehow discontinuous, that a trait is yes/no, displayed by an organism or not, in contrast to the reality that the link between the two is complex, a reality rarely directly addressed (apparently) in most introductory genetics courses.  Understanding such processes is critical to appreciating the fact that genetics is often not destiny, but rather an alteration in probabilities (see Cooper et al., 2013).  Without such a more nuanced and realistic understanding, it can be difficult to make sense of genetic information.

A gene is part of a molecular machine:  A number of observations transformed the abstraction of Darwin’s and Mendel’s hereditary factors into physical entities and molecular mechanisms (1).  In 1928 Fred Griffith demonstrated that a genetic trait could be transferred from dead to living organisms, implying a degree of physical/chemical stability; subsequent observations implied that the genetic information transferred involved DNA molecules. The determination of the structure of double-stranded DNA immediately suggested how information could be stored in DNA (in the variation of bases along the length of the molecule) and how this information could be duplicated (based on the specificity of base pairing).  Mutations could be understood as changes in the sequence of bases along a DNA molecule (introduced by chemicals, radiation, mistakes during replication, or molecular reorganizations associated with DNA repair mechanisms and selfish genetic elements).  
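The logic of template-based duplication is simple enough to capture in a few lines of Python (a sketch; the sequence used is made up for illustration):

```python
# Semi-conservative replication in miniature: each strand serves as a
# template, and base-pairing specificity (A-T, G-C) dictates the copy.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complementary_strand(template):
    """Return the new (antiparallel) strand templated by base pairing."""
    return "".join(PAIRING[base] for base in reversed(template))

strand = "GATTACA"                      # an arbitrary, made-up sequence
copy = complementary_strand(strand)
# Copying the copy regenerates the original information:
assert complementary_strand(copy) == strand
```

Nothing about the chemistry is in the sketch, of course; the point is only that pairing specificity is what makes the sequence information copyable.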

But on their own, DNA molecules are inert; they have functions only within the context of a living organism (or a highly artificial, that is man-made, experimental system).  The next critical step was to understand how a gene works within a biological system, that is, within an organism.  This involved appreciating the molecular mechanisms (primarily proteins) that identify which stretches of a particular DNA molecule are used as templates for the synthesis of RNA molecules, which in turn can be used to direct the synthesis of polypeptides (see previous post on polypeptides and proteins).  In the introductory biology courses I am familiar with (please let me know if I am wrong), these processes are presented in a rather deterministic context: a gene is either on or off in a particular cell type, leading to the presence or absence of a trait. Such a deterministic presentation ignores the stochastic nature of molecular level processes (see past post: Biology education in the light of single cell/molecule studies) and the dynamic interaction networks that underlie cellular behaviors.

But our level of resolution is changing rapidly (2).  For a number of practical reasons, when the human genome was first sequenced, the identification of polypeptide-encoding genes was based on recognizing “open-reading frames” (ORFs) encoding polypeptides of > 100 amino acids in length (that is, coding sequences > 300 bases long).  The increasing sensitivity of mass spectrometry-based proteomic studies reveals that smaller ORFs (smORFs) are present and can lead to the synthesis of short (< 50 amino acid long) polypeptides (Chugunova et al., 2017; Couso, 2015).  Typically an ORF was considered a single entity: basically one gene, one ORF, one polypeptide (3).  A recent, rather surprising discovery is what are known as “alternative ORFs” or altORFs: regions of RNA molecules that are translated in alternative reading frames to encode small polypeptides.  Such altORFs can be located upstream, downstream, or within the previously identified conventional ORF (see figure in Samandi et al., 2017).  The implication, particularly for the analysis of how variations in genes link to traits, is that a change, a mutation, or even the experimental deletion of a gene (a common approach in a range of experimental studies) can do much more than previously presumed: not only is the targeted ORF affected, but various altORFs can also be modified.  
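That a single stretch of sequence can harbor ORFs in more than one reading frame is easy to demonstrate; here is a minimal sketch (Python; the sequence and the minimum ORF length are made up for illustration, and a real altORF search would also scan the antisense strand):

```python
# Scan the three forward reading frames of a DNA-like sequence for
# ORFs (ATG ... stop). A toy version of what genome annotators do.
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i+3]
            if codon == "ATG" and start is None:
                start = i                      # open a candidate ORF
            elif codon in STOPS and start is not None:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None                   # close it at the stop
    return orfs

# A made-up sequence in which a second ORF, in a different reading
# frame, overlaps the first -- the altORF situation in miniature:
seq = "ATGCATGCCAAATAGCTAATGA"
for frame, start, end in find_orfs(seq):
    print(f"frame {frame}: {seq[start:end]}")
# prints:
# frame 0: ATGCATGCCAAATAG
# frame 1: ATGCCAAATAGCTAA
```

A mutation in the shared stretch, or a deletion of the “gene,” necessarily hits both reading frames at once, which is exactly the interpretive problem noted above.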

The situation is further complicated when the established rules for using RNAs to direct polypeptide synthesis, via the process of translation, are violated, as occurs in what is known as “repeat-associated non-ATG (RAN)” polypeptide synthesis (see Cleary and Ranum, 2017).  In this situation the normal signal for the start of RNA-directed polypeptide synthesis, an AUG codon, is subverted; other start sites are used, leading to underlying or embedded gene expression.  This process has been found associated with a class of human genetic diseases, such as amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD), characterized by the expansion of simple (repeated) DNA sequences (see Pattamatta et al., 2018).  Once they exceed a certain length, such “repeat” regions have been found to be associated with the (apparently) inappropriate transcription of RNA in both directions, that is, using both DNA strands as templates (figure: A, the normal situation; B, upon expansion of the repeat domain).  These abnormal repeat-region RNAs are translated via the RAN process to generate six different types of toxic polypeptides.
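Why six? A repeat transcribed from both strands and translated in all three reading frames of each transcript yields six dipeptide-repeat products. The arithmetic can be checked directly (a Python sketch, using the GGGGCC repeat associated with C9orf72 ALS/FTD as the example):

```python
# Both strands x three reading frames = six dipeptide-repeat products
# from one expanded repeat. Example: the GGGGCC repeat (C9orf72).
CODON_TO_AA = {  # just the codons this repeat can produce
    "GGG": "Gly", "GGC": "Gly", "GCC": "Ala",
    "CCG": "Pro", "CCC": "Pro", "CGG": "Arg",
}

def revcomp(seq):
    """Reverse complement (the antisense strand's repeat unit)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def dipeptide(repeat, frame):
    """Dipeptide repeat unit from reading `repeat` in `frame`."""
    window = (repeat * 3)[frame:frame + 6]   # one dipeptide's worth
    return CODON_TO_AA[window[0:3]] + "-" + CODON_TO_AA[window[3:6]]

sense = "GGGGCC"
for strand, label in [(sense, "sense"), (revcomp(sense), "antisense")]:
    for frame in range(3):
        print(f"{label} frame {frame}: poly({dipeptide(strand, frame)})")
# prints poly(Gly-Ala), poly(Gly-Pro), poly(Gly-Arg) from the sense
# strand and poly(Gly-Pro), poly(Ala-Pro), poly(Pro-Arg) from the
# antisense strand
```

These are the poly-GA, poly-GP, poly-GR, poly-PA, and poly-PR dipeptide-repeat proteins described in the C9orf72 literature (poly-GP arises from both strands).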

So what are the molecular factors that control the various types of altORF transcription and translation?  In the case of ALS and FTD, it appears that other genes, and the polypeptides and proteins they encode, are involved in regulating the expression of repeat-associated RNAs (Kramer et al., 2016)(Cheng et al., 2018).  Similar or distinct mechanisms may be involved in other neurodegenerative diseases (Cavallieri et al., 2017).  

So how should all of these molecular details (and it is likely that more remain to be discovered) influence how genes are presented to students?  I would argue that DNA should be presented as a substrate upon which various molecular mechanisms act; these include transcription in its various forms (directed and noisy), as well as DNA synthesis, modification, and repair.   Genes are not static objects, but key parts of dynamic systems.  This may be one reason that classical genetics, that is, genes presented within a simple Mendelian (gene to trait) framework, should be moved deeper into the curriculum, where students have the background in molecular mechanisms needed to appreciate its complexities, complexities that arise from the multiple molecular machines acting to access, modify, and use the information captured in DNA (through evolutionary processes), thereby placing the gene in a more realistic cellular perspective (4). 

Footnotes:

1. Described in greater detail in biofundamentals™

2. For this discussion, I am completely ignoring the roles of genes that encode RNAs that, as far as is currently known, do not encode polypeptides.  That said, as we go on, you will see that it is possible that some such non-coding RNAs may encode small polypeptides.  

3. I am ignoring the complexities associated with alternative promoter elements, introns, and the alternative and often cell-type specific regulated splicing of RNAs, to create multiple ORFs from a single gene.  

4. With apologies to Norm Pace – in case I have gotten the handedness of the DNA molecules wrong or exchanged Z for A or B. 

Literature cited: 

  • Cavallieri et al, 2017. C9ORF72 and parkinsonism: Weak link, innocent bystander, or central player in neurodegeneration? Journal of the neurological sciences 378, 49.
  • Cheng et al, 2018. C9ORF72 GGGGCC repeat-associated non-AUG translation is upregulated by stress through eIF2α phosphorylation. Nature communications 9, 51.
  • Chugunova et al, 2017. Mining for small translated ORFs. Journal of proteome research 17, 1-11.
  • Cleary & Ranum, 2017. New developments in RAN translation: insights from multiple diseases. Current opinion in genetics & development 44, 125-134.
  • Cooper et al, 2013. Where genotype is not predictive of phenotype: towards an understanding of the molecular basis of reduced penetrance in human inherited disease. Human genetics 132, 1077-1130.
  • Couso, 2015. Finding smORFs: getting closer. Genome biology 16, 189.
  • Darwin, 1859. On the origin of species. London: John Murray.
  • Gayon, 2016. From Mendel to epigenetics: History of genetics. Comptes rendus biologies 339, 225-230.
  • Gerbault et al, 2011. Evolution of lactase persistence: an example of human niche construction. Philosophical Transactions of the Royal Society of London B: Biological Sciences 366, 863-877.
  • Kramer et al, 2016. Spt4 selectively regulates the expression of C9orf72 sense and antisense mutant transcripts. Science 353, 708-712.
  • Lanie et al, 2004. Exploring the public understanding of basic genetic concepts. Journal of genetic counseling 13, 305-320.
  • Pattamatta et al, 2018. All in the Family: Repeats and ALS/FTD. Trends in neurosciences 41, 247-250.
  • Samandi et al, 2017. Deep transcriptome annotation enables the discovery and functional characterization of cryptic small proteins. Elife 6.

Ideas are cheap, theories are hard

In the context of public discourse, there are times when one is driven to simple, reflexive, and often disproportionate (exasperated) responses.  That happens to me whenever people talk about the various theories that they apply to a process or event.  I respond by saying (increasingly silently, to myself) that what they mean is really that they have an idea, a model, a guess, a speculation, or a comforting “just-so” story. All too often such competing “theories” are flexible enough to explain (or explain away) anything, depending upon one’s predilections. So why a post on theories?  Certainly the point has been made before (see Ghose. 2013. “Just a Theory”: 7 Misused Science Words). Basically because the misuse of the term theory, whether by non-scientists, scientists, or science popularizers, undermines understanding of, and respect for, the products of the scientific enterprise.  It confuses hard-won knowledge with what are often superficial (or self-serving) opinions. When professors, politicians, pundits, PR flacks, or regular people use the word theory, they are all too often, whether consciously or not, seeking to elevate their ideas through the authority of science.    

So what is the big deal anyway; why be an annoying pain in the ass (see Christopher DiCarlo’s video), challenging people, making them uncomfortable, and making a big deal about something so trivial?  But is it really trivial?  I think not, although objecting may well be futile or quixotic.  The inappropriate use of the word theory, particularly by academics, is an implicit attempt to gain credibility.  It is also an attack on the integrity of science.  Why?  Because, like it or not, science is the most powerful method we have to understand how the world works, as opposed to what the world, or our existence within it, means.  The scientific enterprise, abiding as it does by explicit rules of integrity, objective evidence, and logical and quantifiable implications and their testing, has been a progressive social activity, leading to useful knowledge – knowledge that has eradicated smallpox and (almost) polio and produced iPhones, genetically modified organisms, and nuclear weapons.  That is not to say that the authority of science has not repeatedly been used to justify horrific sociopolitical ideas, but those ideas were not based on critically evaluated and tested scientific theories; they were variously baked ideas that claimed the support of science (both the eugenics and anti-vaccination movements are examples).   

Modern science is based on theories, ideas about the universe that explain and predict what we will find when we look (smell, hear, touch) carefully at the world around us.  And these theories are rigorously and continually tested, quantitatively; in fact one might say that the ability to translate a theory into a quantitative prediction is one critical hallmark of a real, versus an ersatz (non-scientific), theory [here is a really clever approach to teaching students about facts and theories, from David Westmoreland].

So where do (scientific) theories come from?  Initially they are guesses about how the world works (as Richard Feynman put it, distinguishing scientific guesses from vague, non-scientific “theories”).  Guesses that have evolved, based on testing and confirmation and, where wrong, replacement with more accurate, logically well-constructed, and more widely applicable constructs – an example of the evolution of scientific knowledge.  That is why ideas are cheap: they never had, or never developed, the disciplinary rigor necessary to become a theory.  In fact, it often does not even matter, not really, to the people propounding these ideas whether they correspond to reality at all; witness the stream of tweets from various politicians or the ease with which many apocalyptic predictions are replaced when they turn out to be incorrect.  But how is the average person to identify the difference between a (more or less half-baked) idea and a scientific theory?  Probably the easiest way is to ask: is the idea constantly being challenged, validated, and, where necessary, refined by both its proponents and its detractors?  One of the most impressive aspects of Einstein’s theory of general relativity is the accuracy of its predictions (the orbit of Mercury, time-dilation, and gravitational waves (link)), predictions that, if not confirmed, would have forced its abandonment, or at the very least serious revision.  It is this constant application of a theory, and the rigorous testing of its predictions (if this, then that), that proves its worth.  

Another aspect of a scientific theory is whether it is fecund or sterile.  Does its application lead to new observations that it can explain?  In contrast, most ideas are dead ends.  Consider the recent paper on the possibility that life arose outside of the Earth, a proposal known as pan-spermia (1) – “a very plausible conclusion – life may have been seeded here on Earth by life-bearing comets” – which recently tunneled into the web’s consciousness in stories implying the extra-terrestrial origins of cephalopods (see “no, octopuses don’t come from outer space”).  Unfortunately, no actual biological insights emerge from this idea (wild speculation), since it simply displaces the problem: if life did not arise here, how did it arise elsewhere?  If such ideas are embraced, as is the case with many religious ideas, their alteration often leads to violent schism rather than peaceful refinement. Consider, as an example, the idea, entertained by an archaic Greek or two, that the world is made of atoms. These speculations were not theories, since their implications were not rigorously tested.  The modern atomic theory has been evolving since its introduction by Dalton, and displays the diagnostic traits of a scientific theory.  Once introduced to explain the physical properties of matter, it led to new discoveries and explanations for the composition and structure of atoms themselves (electrons, neutrons, and protons), and then to the composition and properties of these objects, quarks and such (link to a great example).   

Scientific theories are, by necessity, tentative (again, as noted by Feynman) – they are constrained and propelled by new and more accurate observations.  A new observation can break a theory, leading it to be fixed or discarded.  When that happens, the new theory explains (predicts) all that the old theory did, and more.  This is where discipline comes in: theories must meet strict standards.  The result is that, generally, there cannot be two equivalent theories that explain the same phenomena – one (or both) must be wrong in some important way.  There is no alternative, non-atomic theory that explains the properties of matter.

The assumption is that two “competing” theories will make distinctly different predictions, if we look (and measure) carefully enough.  There are rare cases where two “theories” make the same predictions; the classic example is the Ptolemaic Earth-centered and the Copernican Sun-centered models of the solar system.  Both explained the appearances of planetary motion more or less equally well, and so on that basis there was really no objective reason to choose between them.  In part, this situation arose from an unnecessary assumption underlying both models, namely that celestial objects move in perfect circular orbits – an assumption that necessitated the presence of multiple “epicycles” in both models.  The real advance came with Kepler’s recognition that celestial objects need not travel in circular orbits, but rather in elliptical ones; this liberated models of the solar system from the need for epicycles.  The result was the replacement of “theories of solar system movement” with a theory of planetary/solar/galactic motions.

Whether, at the end of the day, scientific theories are comforting or upsetting, beautiful or ugly, remains to be seen; what is critical is that we defend the integrity of science and call out the non-scientific use of the word theory, or blame ourselves for the further decay of civilization (perhaps I am being somewhat hyperbolic – sorry).

notes: 

1. Although really, pan-oogenia would be better.  Sperm can do nothing without an egg, but an unfertilized egg can develop into an organism, as occurs with bees.  

When is a gene product a protein and when is it a polypeptide?

On the left is a negatively-stained electron micrograph of a membrane vesicle isolated from the electric ray Torpedo californica, with a single muscle-type nicotinic acetylcholine receptor (AcChR) pointed out.  To the right is the structure of the AcChR determined to NN resolution using cryoelectron microscopy by Rahman, Teng, Worrell, Noviello, Lee, Karlin, Stowell & Hibbs (2020). “Structure of the native muscle-type nicotinic receptor and inhibition by snake venom toxins.”

As a new assistant professor (1), I was called upon to teach my department’s “Cell Biology” course.  I found, and still find, the prospect challenging, in part because I am not exactly sure which aspects of cell biology are important for students to know, both in the context of the major and in their lives and subsequent careers.  While it seems possible (at least to me) to lay out a coherent conceptual foundation for biology as a whole [see 1], cell biology can often appear to students as an un-unified hodge-podge of terms and disconnected cellular systems, topics too often experienced as a vocabulary lesson rather than as a compelling narrative.  As such, I am afraid that the typical cell biology course often reinforces an all too common view of biology as a discipline, a view that, while wrong in most possible ways, was summarized by the 19th/early 20th century physicist Ernest Rutherford as “All science is either physics or stamp collecting.”  A key motivator for the biofundamentals project [2] has been to explore how best to dispel this prejudice, and how to present to students, more effectively, a coherent narrative and the key foundational observations and ideas by which to scientifically consider living systems – by any measure the most complex systems in the Universe, systems shaped, but not determined by, physical-chemical properties and constraints, together with the historical vagaries of evolutionary processes on an ever-changing Earth.

Two types of information:  There is an underlying dichotomy within biological systems: there is the hereditary information encoded in the sequence of nucleotides along double-stranded DNA molecules (genes and chromosomes), and there is the information inherent in the living system itself.  The information in DNA is meaningful only in the context of the living cell, a reaction system that has been running without interruption since the origin of life.  While these two systems are inextricably interconnected, there is a basic difference between them.  Cellular systems are fragile: once dead, there is no coming back.  In contrast, the information in DNA can survive death – it can move from cell to cell in the process of horizontal gene transfer.  The Venter group has replaced the DNA of bacterial cells with synthetic genomes in an effort to define the minimal number of genes needed to support life, at least in a laboratory setting [see 3, 4].  In eukaryotes, cloning is carried out by replacing a cell’s DNA with that of another cell (reference).

Conflating protein synthesis and folding with assembly and function: Much of the information stored in a cell’s DNA is used to encode the sequence of various amino acid polymers (polypeptides).  While over-simplified [see 5], students are generally presented with the view that each gene encodes a particular protein through DNA-directed RNA synthesis (transcription) and RNA-directed polypeptide synthesis (translation).  As the newly synthesized polypeptide emerges from the ribosomal tunnel, it begins to fold, and is released into the cytoplasm or inserted into or through a cellular membrane, where it often interacts with one or more other polypeptides to form a protein [see 6].  The assembled protein is either functional or becomes functional after association with various non-polypeptide co-factors or post-translational modifications.  It is the functional aspect of proteins that is critical, but too often their assembly dynamics are overlooked in presentations of gene expression/protein synthesis, which is really a combination of distinct processes.

Students are generally introduced to protein synthesis through the terms primary, secondary, tertiary, and quaternary structure, an approach that can be confusing since many (most) polypeptides are not proteins, and many proteins are parts of complex molecular machines [here is the original biofundamentals web page on proteins + a short video][see Teaching without a Textbook].  Consider the nuclear pore complex, a molecular machine that mediates the movement of molecules into and out of the nucleus.  A nuclear pore is “composed of ∼500, mainly evolutionarily conserved, individual protein molecules that are collectively known as nucleoporins (Nups)” [7].  But what is the function of a particular Nup, particularly if it does not exist in significant numbers outside of a nuclear pore?  Is a nuclear pore one protein?  In contrast, the membrane-bound ATP synthase found in aerobic bacteria and eukaryotic mitochondria is described as composed “of two functional domains, F1 and Fo. F1 comprises 5 different subunits (three α, three β, and one γ, δ and ε)” while “Fo contains subunits c, a, b, d, F6, OSCP and the accessory subunits e, f, g and A6L” [8].  Are these proteins or subunits?  Is the ATP synthase a protein or a protein complex?

Such confusions arise, at least in part, from the primary-quaternary view of protein structure, since the same terms are applied, generally without clarifying distinction, to both polypeptides and proteins.  These terms emerged historically.  The purification of a protein was based on its activity, which can be measured only for an intact protein.  The primary structure of a polypeptide was based on the recognition that DNA-encoded amino acid polymers are unbranched, with a defined sequence of amino acid residues (see Sanger. The chemistry of insulin).  The idea of a polypeptide’s secondary structure was based on the “important constraint that all six atoms of the amide (or peptide) group, which joins each amino acid residue to the next in the protein chain, lie in a single plane” [9], which led Pauling, Corey and Branson [10] to recognize the α-helix and the β-sheet as common structural motifs.  When a protein is composed of a single polypeptide, the final folding pattern of the polypeptide is referred to as its tertiary structure; it is apparent in the first protein structure solved, that of myoglobin (↓), by John Kendrew and colleagues.

Myoglobin’s role in O2 transport depends upon a non-polypeptide (prosthetic) heme group.  So far so good: a gene encodes a polypeptide, and as it folds the polypeptide becomes a protein – nice and simple (2).  Complications arise from the observations that 1) many proteins are composed of multiple polypeptides, encoded by one or more genes, and 2) some polypeptides are parts of different proteins.  Hemoglobin, the second protein whose structure was determined, illustrates the point (←).  Hemoglobin is composed of four polypeptides – α- and β-globins – encoded by distinct genes.  These polypeptides are related in structure, function, and evolutionary origins to myoglobin, as well as to the cytoglobin and neuroglobin proteins (↓).  In humans, there are a number of distinct α-like and β-like globin genes that are expressed in different hematopoietic tissues during development, so functional hemoglobin proteins can have a number of distinct (albeit similar) subunit compositions and distinct properties, such as their affinities for O2 [see 11].
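The differing O2 affinities mentioned above are often summarized with the Hill equation, which captures the contrast between non-cooperative (myoglobin-like) and cooperative (hemoglobin-like) O2 binding.  Here is a minimal sketch in Python – the P50 values and Hill coefficients below are illustrative round numbers, not measurements:

```python
# Fractional O2 saturation via the Hill equation:
#   Y = pO2^n / (P50^n + pO2^n)
# P50 (partial pressure at half-saturation) and the Hill
# coefficient n used below are illustrative values only.

def fractional_saturation(pO2, P50, n):
    """Fraction of O2-binding sites occupied at partial pressure pO2."""
    return pO2**n / (P50**n + pO2**n)

# Myoglobin-like binding: a single site, no cooperativity (n = 1).
# Hemoglobin-like binding: four sites, cooperative (n ~ 2.8).
for pO2 in (10, 26, 100):  # partial pressures in mmHg
    mb = fractional_saturation(pO2, P50=3, n=1.0)
    hb = fractional_saturation(pO2, P50=26, n=2.8)
    print(f"pO2 = {pO2:3d} mmHg  Mb: {mb:.2f}  Hb: {hb:.2f}")
```

Plotting Y against pO2 with n = 1 gives a hyperbola; with n ≈ 2.8 it gives the familiar sigmoidal saturation curve, which is why hemoglobin loads O2 efficiently in the lungs yet releases it in the tissues.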

But the situation often gets more complicated.  Consider centrin-2, a eukaryotic Ca2+-binding polypeptide that plays roles in organizing microtubules, building cilia, DNA repair, and gene expression [see 12 and references therein].  So, is the centrin-2 polypeptide just a polypeptide, a protein, or a part of a number of other proteins?  As another example, consider the basic helix-loop-helix (bHLH) family of transcription factors; these transcription factor proteins are typically homo- or hetero-dimeric – are the individual polypeptides proteins in their own right?  The activity of these transcription factors is regulated in part by which binding partners they contain.  bHLH polypeptides also interact with the Id polypeptide (or is it a protein?); Id lacks a DNA-binding domain, so when it forms a dimer with a bHLH polypeptide it inhibits DNA binding (↓).  So is a single bHLH polypeptide a protein, or is the protein necessarily a dimer?  More to the point, does the current primary→quaternary view of protein structure help or hinder student understanding of the realities of biological systems?  A potentially interesting bio-education research question.

A recommendation or two:  While I am under no illusion that the complexities of polypeptide synthesis and protein assembly can be easily resolved, it is surely possible to present them in a more coherent, consistent, and accessible manner.  Here are a few suggestions that might provoke discussion.  First, let us recognize that those genes that encode polypeptides encode polypeptides rather than functional proteins (a reality confused by the term “quaternary structure”); we might well distinguish a polypeptide from a protein based on the concentration of free monomeric polypeptide (gene product) within the cell.  Second, we need to convey to students the reality that the assembly of a protein is no simple process, particularly within the crowded cytoplasm [13] – a complexity hidden by the simple secondary-tertiary structure perspective.  While some proteins assemble on their own, many (most?) cannot.


As an example, consider the protein tubulin (↑).  As noted by Nithianantham et al. [14], “Five conserved tubulin cofactors/chaperones and the Arl2 GTPase regulate α- and β-tubulin assembly into heterodimers” and the “tubulin cofactors TBCD, TBCE, and Arl2, which together assemble a GTP-hydrolyzing tubulin chaperone critical for the biogenesis, maintenance, and degradation of soluble αβ-tubulin.”  Without these various chaperones the tubulin protein cannot be formed.  Here the distinction between protein and multiprotein complex is clear, since the tubulin protein exists at readily detectable levels within the cell, in contrast to the α- and β-tubulin polypeptides, which are found complexed to the TBCB and TBCA chaperone polypeptides.  Of course, the balance between tubulin and tubulin polymers (microtubules) is itself regulated by a number of factors.

The situation is even more complex when we come to the ribosome and other structures, such as the nuclear pore.  Woolford [15] estimates that “more than 350 protein and RNA molecules participate in yeast ribosome assembly, and many more in metazoa.”  In addition to the four ribosomal RNAs and the ~80 polypeptides (often referred to as ribosomal proteins) that are synthesized in the cytoplasm and transported into the nucleus in association with various transport factors, these assembly factors include “diverse RNA-binding proteins, endo- and exonucleases, RNA helicases, GTPases and ATPases. These assembly factors promote pre-rRNA folding and processing, remodeling of protein–protein and RNA–protein networks, nuclear export and quality control” [16].  While I suspect that some structural components of the ribosome and the nuclear pore may have functions as monomeric polypeptides, and so could be considered proteins, at this point it is best (most accurate) to assume that they are polypeptides – components of proteins and of larger molecular machines (past post).

We can, of course, continue to consider the roles of common folding motifs, arising from the chemistry of the peptide bond and the environment within and around the assembling protein, in the context of protein structure [17, 18].  The knottier problem is how to help students recognize how functional entities – proteins and molecular machines, together with the coupled reaction systems that drive them and the molecular interactions that regulate them – function, and how mutations, allelic variations, and various environmentally induced perturbations influence the behaviors of cells and organisms, generating normal and pathogenic phenotypes.  Such a view emphasizes the dynamics of the living state, and the complex flow of information out of DNA into networks of molecular machines and reaction systems.


Acknowledgements: Thanks to Michael Stowell for feedback and suggestions and to Jon Van Blerkom for encouragement.  All remaining errors are mine.  Post updated to place images correctly and to include the cryoEM structure of the AcChR (+ minor edits) – 16 December 2020.

Footnotes:

  1. I had recently emerged from the labs of Martin Raff and Lee Rubin – Martin is one of the founding authors of the transformative Molecular Biology of the Cell textbook. 
  2. Or rather quite over-simplistic, as it ignores complexities arising from differential splicing, alternative promoters, and genes encoding RNAs that do not themselves encode polypeptides. 

Literature cited (please excuse the excessive self-citation – trying to avoid self-plagiarism)

1. Klymkowsky, M.W., Thinking about the conceptual foundations of the biological sciences. CBE Life Science Education, 2010. 9: p. 405-7.

2. Klymkowsky, M.W., J.D. Rentsch, E. Begovic, and M.M. Cooper, The design and transformation of Biofundamentals: a non-survey introductory evolutionary and molecular biology course. CBE Life Sci Educ, 2016: ar70.

3. Gibson, D.G., J.I. Glass, C. Lartigue, V.N. Noskov, R.-Y. Chuang, M.A. Algire, G.A. Benders, M.G. Montague, L. Ma, and M.M. Moodie, Creation of a bacterial cell controlled by a chemically synthesized genome. Science, 2010. 329(5987): p. 52-56.

4. Hutchison, C.A., R.-Y. Chuang, V.N. Noskov, N. Assad-Garcia, T.J. Deerinck, M.H. Ellisman, J. Gill, K. Kannan, B.J. Karas, and L. Ma, Design and synthesis of a minimal bacterial genome. Science, 2016. 351(6280): p. aad6253.

5. Samandi, S., A.V. Roy, V. Delcourt, J.-F. Lucier, J. Gagnon, M.C. Beaudoin, B. Vanderperre, M.-A. Breton, J. Motard, and J.-F. Jacques, Deep transcriptome annotation enables the discovery and functional characterization of cryptic small proteins. Elife, 2017. 6.

6. Hartl, F.U., A. Bracher, and M. Hayer-Hartl, Molecular chaperones in protein folding and proteostasis. Nature, 2011. 475(7356): p. 324.

7. Kabachinski, G. and T.U. Schwartz, The nuclear pore complex–structure and function at a glance. J Cell Sci, 2015. 128(3): p. 423-429.

8. Jonckheere, A.I., J.A. Smeitink, and R.J. Rodenburg, Mitochondrial ATP synthase: architecture, function and pathology. Journal of inherited metabolic disease, 2012. 35(2): p. 211-225.

9. Eisenberg, D., The discovery of the α-helix and β-sheet, the principal structural features of proteins. Proceedings of the National Academy of Sciences, 2003. 100(20): p. 11207-11210.

10. Pauling, L., R.B. Corey, and H.R. Branson, The structure of proteins: two hydrogen-bonded helical configurations of the polypeptide chain. Proceedings of the National Academy of Sciences, 1951. 37(4): p. 205-211.

11. Hardison, R.C., Evolution of hemoglobin and its genes. Cold Spring Harbor perspectives in medicine, 2012. 2(12): p. a011627.

12. Shi, J., Y. Zhou, T. Vonderfecht, M. Winey, and M.W. Klymkowsky, Centrin-2 (Cetn2) mediated regulation of FGF/FGFR gene expression in Xenopus. Scientific Reports, 2015. 5:10283.

13. Luby-Phelps, K., The physical chemistry of cytoplasm and its influence on cell function: an update. Molecular biology of the cell, 2013. 24(17): p. 2593-2596.

14. Nithianantham, S., S. Le, E. Seto, W. Jia, J. Leary, K.D. Corbett, J.K. Moore, and J. Al-Bassam, Tubulin cofactors and Arl2 are cage-like chaperones that regulate the soluble αβ-tubulin pool for microtubule dynamics. Elife, 2015. 4.

15. Woolford, J., Assembly of ribosomes in eukaryotes. RNA, 2015. 21(4): p. 766-768.

16. Peña, C., E. Hurt, and V.G. Panse, Eukaryotic ribosome assembly, transport and quality control. Nature Structural and Molecular Biology, 2017. 24(9): p. 689.

17. Dobson, C.M., Protein folding and misfolding. Nature, 2003. 426(6968): p. 884.

18. Schaeffer, R.D. and V. Daggett, Protein folds and protein folding. Protein Engineering, Design & Selection, 2010. 24(1-2): p. 11-19.

Molecular machines and the place of physics in the biology curriculum

The other day, through no fault of my own, I found myself looking at the courses required by our molecular biology undergraduate degree program. I discovered a requirement for a 5 credit hour physics course, and a recommendation that this course be taken in the students’ senior year – a point in their studies when most have already completed their required biology courses.  Befuddlement struck me: what was the point of requiring an introductory physics course in the context of a molecular biology major?  Was this an example of time travel (via wormholes or some other esoteric imagining), in which a physics course in the future impacts a student’s understanding of molecular biology in the past?  I was also struck by the possibility that requiring such a course in the students’ senior year would measurably extend their time to degree.

In a search for clarity and possible enlightenment, I reflected back on my own experiences in an undergraduate biology degree program; as a practicing cell and molecular biologist, I was somewhat confused. I could not put my finger on the purpose of our physics requirement, except perhaps the admirable goal of supporting physics graduate students. But then, after feverish reflections on the responsibilities of faculty in the design of the courses and curricula they prescribe for their students, and on the more general concepts of instructional (best) practice and malpractice, my mind calmed – perhaps because I was distracted by an article on Oxford Nanopore’s MinION (→), a “portable real-time device for DNA and RNA sequencing” that plugs into the USB port on one’s laptop! Distracted from the potentially quixotic problem of how to achieve effective educational reform at the undergraduate level, I found myself driven on by an insatiable curiosity (or a deep-seated insecurity) to ensure that I actually understood how this latest generation of DNA sequencers works. This led me to a paper by Meni Wanunu (2012. Nanopores: A journey towards DNA sequencing) [1].  On reading the paper, I found myself returning to my original belief: yes, understanding physics is critical to developing a molecular-level understanding of how biological systems work, BUT it is just not the physics normally inflicted upon (required of) students [2]. Certainly this was no new idea.  Bruce Alberts had written on this topic a number of times, most dramatically in his 1998 paper “The cell as a collection of protein machines” [3].  Rather sadly, and notwithstanding much handwringing about the importance of expanding student interest in, and understanding of, STEM disciplines, not much of substance in this area has occurred. 
While (some minority of) physics courses may have adopted active-engagement pedagogies, in the sense of Hake [4], most insist on teaching macroscopic physics rather than focusing on, or even considering, the molecular-level physics relevant to biological systems – explicitly, the physics of protein machines in a cellular (biological) context. Why sadly? Because conventional, that is, non-biologically relevant, introductory physics and chemistry courses all too often serve the role of a hazing ritual, driving many students out of the biological sciences [5] – in part, I suspect, because they often seem irrelevant to students’ interests in the workings of biological systems [6].  

Nanopore’s sequencer and Wanunu’s article got me thinking again about biological machines, of which there are a great number, ranging from pumps, propellers, and oars to various types of transporters: molecular truckers that move chromosomes, membrane vesicles, and parts of cells with respect to one another, DNA detanglers, protein unfolders, and molecular recyclers (→).  The Nanopore sequencer works because, as a single strand of DNA (or RNA) moves through a narrow pore, the different bases (A, C, T, G) occlude the pore to different extents, allowing different numbers of ions – different amounts of current – to pass through the pore. These current differences can be detected, allowing the nucleotide sequence to be read as the nucleic acid strand moves through the pore. Understanding the process involves understanding how molecules move – that is, the physics of molecular collisions and energy transfer – how proteins and membranes allow and restrict ion movement, and the impact of chemical gradients and electrical fields across a membrane on molecular movements: all physical concepts of widespread significance in biological systems.  Such ideas can be extended to the more general questions of how molecules move within the cell, and the effects of molecular size and inter-molecular interactions within a concentrated solution of proteins, protein polymers, lipid membranes, and nucleic acids, such as described in Oliveira et al., Increased cytoplasm viscosity hampers aggregate polar segregation in Escherichia coli [7].  At the molecular level these processes, while biased by electric fields (potentials) and concentration gradients, are stochastic (noisy). Understanding stochastic processes is difficult for students [8], but critical to developing an appreciation of how such processes can lead to phenotypic differences between cells with the same genotypes (previous post), and of how such noisy processes are managed by the cell and within a multicellular organism.    
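The logic of the measurement can be caricatured in a few lines of code. This is a toy sketch under invented assumptions – the current levels are made-up numbers, and real nanopore devices read overlapping k-mers from a far noisier signal – but it makes concrete the idea that sequence is recovered from base-dependent pore occlusion:

```python
# Toy illustration of nanopore base calling: each base partially
# occludes the pore, producing a characteristic ionic current, and
# the sequence is recovered by matching each measured level to the
# nearest reference level.  The current values (picoamps) are
# invented for illustration; real devices read k-mers, not single
# bases, and use far more sophisticated decoding.

REFERENCE_PA = {"A": 50.0, "C": 44.0, "G": 38.0, "T": 56.0}

def call_bases(trace):
    """Assign each current measurement to the closest reference level."""
    calls = []
    for level in trace:
        base = min(REFERENCE_PA, key=lambda b: abs(REFERENCE_PA[b] - level))
        calls.append(base)
    return "".join(calls)

# A noisy current trace recorded as the strand ratchets through the pore:
trace = [49.1, 55.2, 37.6, 44.9, 50.8]
print(call_bases(trace))  # prints ATGCA
```

The physics the text points to – ion flux through a constricted channel, driven by the voltage across the membrane – is exactly what sets the reference levels in a real device.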

As path leads on to path, I found myself considering the (← adapted from Joshi et al., 2017) spear-chucking protein machine present in the pathogenic bacterium Vibrio cholerae; this molecular machine is used to inject toxins into neighbors that the bacterium happens to bump into (see Joshi et al., 2017. Rules of Engagement: The Type VI Secretion System in Vibrio cholerae) [9].  The system is complex and acts much like a spring-loaded and rather “inhumane” mousetrap.  It is one of a number of bacterial type VI systems, and “has structural and functional homology to the T4 bacteriophage tail spike and tube” – the molecular machine that injects bacterial cells with the virus’s genetic material, its DNA.

The building of the bacterium’s spear-based injection system is controlled by a social (quorum sensing) system (previous post), one of the ways that such organisms determine whether they are alone or living in an environment crowded with other organisms. During the process of assembly, potential energy, derived from various chemically coupled, thermodynamically favorable reactions, is stored in both type VI “spears” and the contractile (nucleic acid-injecting) tails of bacterial viruses (phage). Understanding the energetics of this process – for example, how coupling to thermodynamically favorable chemical reactions, such as ATP hydrolysis, or to physico-chemical processes, such as the diffusion of ions down an electrochemical gradient, can be used to set these “mouse traps”, and understanding where the energy goes when the traps are sprung – is central to students’ understanding of these and a wide range of other molecular machines. 
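The free energy available from an ion moving down an electrochemical gradient can be made concrete with the standard electrochemical potential expression. A sketch with illustrative round numbers – the concentrations and membrane potential below are not measurements from any particular cell:

```python
# Free energy available from moving an ion down an electrochemical
# gradient, the kind of thermodynamically favorable process that can
# be coupled to "setting" a molecular mousetrap:
#   dG = R*T*ln(c_in/c_out) + z*F*delta_psi   (per mole moved inward)
# The concentrations and membrane potential are illustrative values.
import math

R = 8.314     # gas constant, J/(mol*K)
F = 96485.0   # Faraday constant, C/mol
T = 310.0     # temperature, K (~37 degrees C)

def delta_G(c_in, c_out, z, delta_psi):
    """dG (J/mol) for moving one mole of ions of charge z into the cell."""
    return R * T * math.log(c_in / c_out) + z * F * delta_psi

# A proton moving into a cell: 10-fold concentration gradient plus a
# -0.15 V membrane potential (inside negative).
dG = delta_G(c_in=1e-7, c_out=1e-6, z=+1, delta_psi=-0.15)
print(f"{dG / 1000:.1f} kJ/mol")  # about -20 kJ/mol released
```

A negative ΔG means the inward movement is favorable – energy that a molecular machine can, in principle, capture to cock its spring.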

Energy stored in such molecular machines during their assembly can be used to move the cell.  As an example, another bacterial system generates contractile (type IV pili) filaments; the contraction of such a filament can allow “the bacterium to move 10 000 times its own body weight, which results in rapid movement” (see Berry & Pelicic 2015. Exceptionally widespread nanomachines composed of type IV pilins: the prokaryotic Swiss Army knives) [10].  The contraction of such a filament has also been found to be used to import DNA into the cell, the first step in the process of horizontal gene transfer.  In other situations (other molecular machines), such protein filaments harness thermodynamically favorable processes to rotate, acting like propellers that drive cellular movement. 

 

During my biased random walk through the literature, I came across another, molecularly distinct, machine used to import DNA into Vibrio (see Matthey & Blokesch 2016. The DNA-Uptake Process of Naturally Competent Vibrio cholerae) [11]. This molecular machine enables the bacterium to import DNA from the environment – released, perhaps, from a neighbor killed by its spear.  In this system (adapted from Matthey & Blokesch, 2016 →), the double-stranded DNA molecule is first transported through the bacterium’s outer membrane (“OM”); the DNA’s two strands are then separated, and one strand passes through a channel protein in the inner (plasma) membrane and into the cytoplasm, where it can interact with the bacterium’s genomic DNA.    

The value of introducing students to the idea of molecular machines is that it can be used to demystify how biological systems work – how such machines carry out specific functions, whether moving the cell or recognizing and repairing damaged DNA.  If physics matters in the biology curriculum, it matters for this reason: it establishes the core premise of biology that organisms are not driven by “vital” forces, but by prosaic physicochemical ones.  At the same time, the molecular mechanisms behind evolution, such as mutation, gene duplication, and genomic reorganization, provide the means by which new structures emerge from pre-existing ones; yet many is the molecular biology degree program that does not include an introduction to evolutionary mechanisms in its required course sequence – imagine that, requiring physics but not evolution? [see 12]

One final point regarding requiring students to take a biologically relevant physics course early in their degree program: it can be used to reinforce what I think is a critical and often misunderstood point.  While biological systems rely on molecular machines, we (and by we I mean all organisms) are NOT machines, no matter what physicists might postulate – see We Are All Machines That Think.  We are something different and distinct.  Our behaviors and our feelings, whether ultimately understandable or not, emerge from the interaction of genetically encoded, stochastically driven, non-equilibrium systems, modified through evolutionary, environmental, social, and a range of unpredictable events occurring in an uninterrupted and basically undirected fashion for ~3.5 billion years.  While we are constrained, we are more – in some weird and probably ultimately incomprehensible way.

footnotes and literature cited

1. Wanunu, M., Nanopores: A journey towards DNA sequencing. Physics of life reviews, 2012. 9: p. 125-158.

2. Klymkowsky, M.W. Physics for (molecular) biology students. 2014  

3. Alberts, B., The cell as a collection of protein machines: preparing the next generation of molecular biologists. Cell, 1998. 92: p. 291-294.

4. Hake, R.R., Interactive-engagement versus traditional methods: a six-thousand-student survey of mechanics test data for introductory physics courses. Am. J. Physics, 1998. 66: p. 64-74.

5. Mervis, J., Weed-out courses hamper diversity. Science, 2011. 334: p. 1333-1333.

6. A discussion with Melanie Cooper on what chemistry is relevant to a life science major was a critical driver in our collaboration to develop the chemistry, life, the universe, and everything (CLUE) general chemistry course sequence.  

7. Oliveira, S., R. Neeli‐Venkata, N.S. Goncalves, J.A. Santinha, L. Martins, H. Tran, J. Mäkelä, A. Gupta, M. Barandas, and A. Häkkinen, Increased cytoplasm viscosity hampers aggregate polar segregation in Escherichia coli. Molecular microbiology, 2016. 99: p. 686-699.

8. Garvin-Doxas, K. and M.W. Klymkowsky, Understanding Randomness and its impact on Student Learning: Lessons from the Biology Concept Inventory (BCI). Life Science Education, 2008. 7: p. 227-233.

9. Joshi, A., B. Kostiuk, A. Rogers, J. Teschler, S. Pukatzki, and F.H. Yildiz, Rules of engagement: the type VI secretion system in Vibrio cholerae. Trends in microbiology, 2017. 25: p. 267-279.

10. Berry, J.-L. and V. Pelicic, Exceptionally widespread nanomachines composed of type IV pilins: the prokaryotic Swiss Army knives. FEMS microbiology reviews, 2014. 39: p. 134-154.

11. Matthey, N. and M. Blokesch, The DNA-uptake process of naturally competent Vibrio cholerae. Trends in microbiology, 2016. 24: p. 98-110.

12. Pallen, M.J. and N.J. Matzke, From The Origin of Species to the origin of bacterial flagella. Nat Rev Microbiol, 2006. 4: p. 784-90.