After the March for Science, What Now?

Recently, I contributed to a project that turned healthy human tissues into an early stage of pancreatic cancer, a disease that carries a dismal 5-year survival rate of 5 percent.


When I described our project to a friend, she asked, “Why in the world would you want to grow cancer in a lab?” I explained that by the time a patient learns that he has pancreatic cancer, the tumor has spread throughout the body. At that point, the patient typically has less than a year to live and his tumor cells have racked up a number of mutations, making clinical trials and molecular studies of pancreatic cancer evolution downright difficult. For this reason, we made our laboratory model of pancreatic cancer available to scientists who wanted to use it to find the biological buttons that turn healthy cells into deadly cancer. By sharing our discovery, we wanted to enable others to develop drugs to treat cancer and screening tests to diagnose patients early. The complexity of this process demonstrates that science is a team effort that involves lots of time, money, and the brainpower of highly trained individuals working together toward a single goal.


Many of the challenges we face today—from lifestyle diseases, to the growing strains of antibiotic-resistant superbugs in hospitals, to the looming energy crisis—require scientific facts and solutions. And although there’s never a guarantee of success, scientists persist in hopes that our collective discoveries will reverberate into the future. However, as a corollary, hindering scientific progress means a loss of possibilities.


Unfortunately, the deceleration of scientific progress seems a likely possibility. In March, the White House released a document called “America First: A Budget Blueprint to Make America Great Again,” which describes deep cuts to some of the country’s most important funding agencies for science.


As it stands, the National Institutes of Health is set to lose nearly a fifth of its budget; the Department of Energy’s Office of Science, $900 million; and the Environmental Protection Agency, a 31.5 percent budget cut worth $2.6 billion. Imagine the discoveries that could have saved our lives or created jobs, which will instead languish solely as unsupported hypotheses in the minds of underfunded scientists.


Scientists cannot remain idle on the sidelines; we must be active in making the importance of scientific research known. Last weekend’s March for Science drew tens of thousands of people to more than 600 rallies across the world, but the challenge now lies in harnessing the present momentum and energy to make sustained efforts to maintain government funding for a wide range of scientific projects.


The next step is to get involved in shaping public opinion and policy. As it stands, Americans on both sides of the political spectrum have expressed ambivalence about the validity of science on matters ranging from climate change to childhood vaccinations. Academics can start tempering the public’s unease toward scientific authority and increase public support for the sciences by stepping out of the ivory tower. Many researchers are already engaging with the masses by posting on social media, penning opinion articles, and appearing on platforms aimed at public consumption (YouTube channels, TED, etc.). A researcher is her own best spokesperson in explaining the importance of her work and the scientific process; unfortunately, a scientist’s role as an educator in the classroom and community is often shoved out by the all-encompassing imperative to publish or perish. As a profession, we must become more willing to step out of our laboratories to engage with the public and educate the next generation of science-savvy citizens.


In addition, many scientists have expressed interest in running for office, including UC Berkeley’s Michael Eisen (who is also a co-founder of PLOS). When asked by Science why he was considering a run for Senate, Eisen responded:


“My motivation was simple. I’m worried that the basic and critical role of science in policymaking is under a bigger threat than at any point in my lifetime. We have a new administration and portions of Congress that don’t just reject science in a narrow sense, but they reject the fundamental idea that undergirds science: That we need to make observations about the world and make our decisions based on reality, not on what we want it to be. For years science has been under political threat, but this is the first time that the whole notion that science is important for our politics and our country has been under such an obvious threat.”


If scientists can enter the House and Senate in greater numbers, they will be able to inject scientific sense into the discussions held by members of the legislature whose primary backgrounds are in business and law.


Science is a bipartisan issue that should not be bogged down by the whims of political machinations. We depend on research to address some of the most pressing problems of our time, and America’s greatness lies in part in its leadership using science as an exploration of physical truths and a means of overcoming our present limitations and challenges.



Check out Yoo Jung’s book aimed at helping college students excel in science, What Every Science Student Should Know (University of Chicago Press)

Judging science fairs: 10/10 Privilege, 0/10 Ability

Every year, I make a point of rounding up students in my department and encouraging them to volunteer one evening judging our local science fair. This year, the fair was held at the start of April, and featured over 200 judges and hundreds of projects from young scientists in grades 5 through to 12, with the winners going on to the National Championships.

President Obama welcomes some young scientists to the White House | Photo via USDAGov

Perhaps the most rewarding part of volunteering your time, and the reason why I encourage colleagues to participate, is seeing just how excited the youth are about their projects. It doesn’t matter what the project is; most of the students are thrilled to be there. Add to that the fact that A Real Life Scientist (TM) wants to talk to them about their project? It’s a highlight for many of the students. As a graduate student, the desire to do science for science’s sake is something that gets drilled out of you quickly as you follow the Williams Sonoma/Jamie Oliver Chemistry 101 Cookbook, where you add 50 g of Chemical A to 50 g of Chemical B and record what colour the mixture turns. Being around excitement based purely on the pursuit of science is refreshing.

However, the aspect of judging science fairs that I struggle with most is how to deal with the wide range of projects. How do you judge two projects on the same criteria when one used university resources (labs, mass spectrometers, centrifuges etc.) and the other looked at how high balls bounce when you drop them? It becomes incredibly difficult as a judge to remain objective when one project is closer in scope to an undergraduate research project and the other is more your typical kitchen cabinet/garage equipment project. Even between two students who do the same project, there is variability depending on whether or not they have someone who can help them at home, or access to facilities through their school or their parents’ social network.

As the title suggests, this is an issue of privilege. Having people at home who can help, either directly by providing guidance and helping do the project, or indirectly by providing access to resources, gives these kids a huge leg up over their peers. As Erin pointed out in her piece last year:

A 2009 study of the Canada-Wide Science Fair found that fair participants were elite not just in their understanding of science, but in their finances and social network. The study looked at participants and winners from the 2002-2008 Fairs, and found that the students were more likely to come from advantaged middle to upper class families and had access to equipment in universities or laboratories through their social connections (emphasis mine).

So the youth who are getting to these fairs are definitely qualified to be there – they know the project, and they understand the scientific method. They’re explaining advanced concepts clearly and understand the material. The problem becomes how does one objectively deal with this? You can’t punish the student because they used the resources available to them, especially if they show mastery of the concepts. But can you really evaluate them on the same stage and using the same criteria as their peers without access to those resources, especially when part of the criteria includes the scientific merit of the project?

The fair, to their credit, took a very proactive approach to this concern, which was especially prudent given the makeup of this area where some kids have opportunities and others simply don’t. Their advice was to judge the projects independently, and judge the kids on the strength of their presentation and understanding. But again, there’s an element of privilege behind this. The kids who have parents and mentors who can coach them and prepare them for how to answer questions, or even just give them an opportunity/push them to practice their talk, will obviously do better.

The science fair acts as a microcosm for our entire academic system, from undergrad into graduate and professional school and into later careers. The students who can afford to volunteer in labs over the summer during undergrad are more likely to make it into highly competitive graduate programs as they have “relevant experience,” while their peers who have to work minimum wage positions to pay tuition or student loans are going to be left behind. The system is structured to reward privilege – when was the last time an undergrad or graduate scholarship considered “work history” as opposed to “relevant work experience?” Most ask for a resume or curriculum vitae, where one could theoretically include that experience, but if the ranking criteria look for “relevant” work experience, which working at Starbucks doesn’t include, how do those students compete for the same scholarships? This is despite how working any job does help you develop various transferable skills including time management and conflict resolution. And that doesn’t even begin to consider the negative stigma many professors hold for this type of employment.

The question thus is: Are we okay with this? Are we okay with a system where, based purely on luck, some kids are given opportunities, while others aren’t? And if not, how do we start tackling it?




Disclaimer: I’ve focused on economic privilege here, but privilege comes in many different forms. I’m not going to wade into the other forms, but for some excellent reads, take a read of this, this and this.

Using Math to make Guinness

William Sealy Gosset, statistician and rebel | Picture from Wikimedia Commons

Let me tell you a story about William Sealy Gosset. William was a Chemistry and Math grad from Oxford University in the class of 1899 (they were partying like it was 1899 back then). After graduating, he took a job with the brewery of Arthur Guinness and Son, where he worked as a mathematician, trying to find the best yields of barley.

But this is where he ran into problems.

One of the most important assumptions in (most) statistical tests is that you have a large enough sample size to make inferences from your data. You can’t make many comments if you only have 1 data point. 3? Maybe. 5? Possibly. Ideally, we want at least 20-30 observations, if not more. It’s why when a goalie in hockey, or a batter in baseball, has a great game, you chalk it up to being a fluke, rather than indicative of their skill. Small samples are much more likely to be affected by chance and thus may not be representative of the underlying phenomenon you’re trying to measure. Gosset, on the other hand, couldn’t create 30+ batches of Guinness in order to do the statistics on them. He had a much smaller sample size, and thus “normal” statistical methods wouldn’t work.

Gosset wouldn’t take this for an answer. He started writing up his thoughts, and examining the error associated with his estimates. However, he ran into problems. His mentor, Karl Pearson, of Pearson Product Moment Correlation Coefficient fame, while supportive, didn’t really appreciate how important the findings were. In addition, Guinness had very strict policies on what their employees could publish, as they were worried about their competitors discovering their trade secrets. So Gosset did what any normal mathematician would.

He published under a pseudonym. In a startlingly rebellious gesture, Gosset published his work in Biometrika under the title “The Probable Error of a Mean.” (See, statisticians can be badasses too). The name he used? Student. His paper for the Guinness company became one of the most important statistical discoveries of the day, and Student’s t-distribution is now an essential part of any introductory statistics course.
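To make the idea a little more concrete, here is a minimal Python sketch of the one-sample t statistic that grew out of this line of work. The yield figures and the reference value of 5.0 are invented for illustration; they are not Gosset’s data, and the function name is my own.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1

# Hypothetical barley yields from five small plots (made-up numbers).
yields = [4.8, 5.1, 5.4, 4.9, 5.3]
t_stat, df = one_sample_t(yields, mu0=5.0)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

With only four degrees of freedom, the t-distribution has heavier tails than the normal distribution, so a t value has to be further from zero before it counts as unusual. That extra caution is what makes the method suitable for small samples.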


So why am I telling you this? Well, I’ve talked before about the importance of storytelling as a way to frame scientific discovery, and I’ve also talked about the importance of mathematical literacy in a modern society. This piece forms the next part of that spiritual trilogy. Math is typically taught in a very dry, very didactic format – I recite Latin to you, you remember it, I eventually give you a series of questions to answer, and that dictates your grade in the class. Often, you’re only actually in the class because it’s a mandatory credit you need for high school or your degree program. There’s very little “discovery” occurring in the math classroom.

Capturing interest thus becomes of paramount importance to instructors, especially in math which faces a societal stigma of being “dull,” “boring” and “just for nerds.” A quick search for “I hate math” on Twitter yields a new tweet almost every minute from someone expressing those sentiments, sometimes using more “colourful” language (at least they’re expanding their vocabulary?).

There are lots of examples of these sorts of interesting anecdotes about math. The “Scottish book” was a book named after the Scottish Café in Lviv, Ukraine, where mathematicians would leave a potentially unsolvable problem for their colleagues to tackle. Successfully completing these problems would result in you receiving a prize ranging from a bottle of brandy to, I kid you not, a live goose (thanks Mariana for that story!) The Chudnovsky Brothers built a machine in their apartment that calculated Pi to two billion decimal places. I asked for stories on Twitter and @physicsjackson responded with:

Amalie (Emmy) Noether is probably the most famous mathematician you’ve never heard of | Photo courtesy Wikimedia Commons

There’s also the story of Amalie Noether, the architect behind Noether’s theorem, which basically underpins all modern physics. Dr Noether came to prominence at a time when women were largely excluded from academic positions, yet rose through the ranks to become one of the most influential figures of that time, often considered at the same level of brilliance as Marie Curie. Her mathematical/physics contemporaries included David Hilbert, Felix Klein and Albert Einstein, who took up her cause to help her get a permanent position, and often sought out her opinion and thoughts. Indeed, after Einstein stated his theory of general relativity, it was Noether who then took this to the next level and linked time and energy. But don’t take my word for it – Einstein himself said:

In the judgment of the most competent living mathematicians, Fräulein Noether was the most significant creative mathematical genius thus far produced since the higher education of women began.

While stories highlight the importance of these discoveries, they also highlight the diversity that exists within the scientific community. Knowing that the pantheon of science and math heroes includes people who aren’t all “math geniuses” can make math much more engaging and interesting. Finally, telling stories of the people behind math can demystify the science, and engage youth who may not consider math as a career path.

Heading to #SciWri13!

ScienceWriters 2013! A quick update for all our readers – Cristina and I (Atif) will be in beautiful Gainesville, Florida this week for the National Association of Science Writers/Council for the Advancement of Science Writing annual conference! I will be speaking on a panel on Saturday November 2nd titled “Take a lesson from the universe: Expand” in the Dogwood room at 11am. I’m excited to be speaking on this panel, along with some of my favourite science communicators in Alan Boyle, Joe Hanson, Matt Shipman and Kirsten “Dr Kiki” Sanford. Thanks also to Clinton Colmenares for organizing this wonderful opportunity and what promises to be an excellent discussion. A description of the session from the program:

Scientists know science. And they’re good at getting science news. Know who’s not? Non-scientists. Yet non-scientists outnumber scientists, and their attitudes, beliefs, intellects (or not) and their votes help determine science policies, from funding for stem cells to what’s taught in school. The near-extinction of science reporters at local news outlets has created a gap in a steady stream of legitimate, dependable science news. Yet today there are more ways than ever to reach the general public. This session is about expanding your audience beyond the science in-crowd. We’ll talk with two young scientists who are passionate about finding new ways to reach new audiences and we’ll explore ideas for how PIOs, freelancers, staff reporters and even scientists themselves can take a lesson from the universe and expand.

If you see either of us around, be sure to say hi! We’ll be at most of the events, and would love to meet you!

This was published simultaneously on Mr Epidemiology

Sabermetrics and Math: How sports can teach statistics



Mental arithmetic.

Do those words scare you? If they do, you’re in good company. Mathematical anxiety is a well-studied phenomenon that manifests for a number of different reasons. It’s an issue I’ve talked about before at length, and something that frustrates me no end. In my opinion though, one of the biggest culprits behind this is how math alienates people. Let’s try an example:

If the average of three distinct positive integers is 22, what is the largest possible value of these three integers?
A: 64
B: 63
C: 33
D: 42
E: 48

Too easy? How about this one:

The average of the integers 24, 6, 12, x and y is 11. What is the value of the sum x + y?

A: 11
B: 17
C: 13
D: 15

I do statistics regularly, and I find these tricky. Not because the underlying math is hard, or that they’re fundamentally “difficult,” but because you have to read the question 3 or 4 times just to figure out what they’re asking. This is exacerbated at higher levels, where you need to first understand the problem, and then understand the math.*
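For what it’s worth, once the wording is untangled, both questions come down to turning an average back into a sum. Here is a short Python sketch of that reasoning; the function names are my own and purely illustrative.

```python
def largest_of_distinct(count, average):
    """Largest possible value among `count` distinct positive integers with this average."""
    total = count * average
    # Make the other integers as small as possible: 1, 2, ..., count - 1.
    smallest_others = sum(range(1, count))
    return total - smallest_others

def missing_sum(known, count, average):
    """Sum of the unknown values, given the known values and the overall average."""
    return count * average - sum(known)

print(largest_of_distinct(3, 22))       # 66 - (1 + 2) = 63, answer B
print(missing_sum([24, 6, 12], 5, 11))  # 55 - 42 = 13, answer C
```

The arithmetic is a couple of lines in each case. The hard part is working out from the wording that this is what you are being asked to do.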

Last week, my colleague Cristina Russo discussed how sports can be used to teach biology. Today I’m going to discuss a personal example, and how I use sports to explain statistics.

One of my main objectives as a statistics instructor is to take “fear” out of the equation (math joke!), and make my students comfortable with the underlying mathematical concepts. I’m not looking for everyone to become a statistician, but I do want them to be able to understand statistics in everyday life. Once they have mastered the underlying concepts, we can then apply them to new and novel situations. Given most of my students are athletically minded or have a basic understanding of sports, this is a logical and reasonable place to start.

Hi, I'm Chris Neil and I'll be your instructor today
The mean number of teeth in adults is 32. The mean number of teeth among hockey players is considerably less | Chris Neil picture source: NHLPA

First, a little backstory. The world of sports has undergone a major shift in the past 20 years. While in the 50s and 60s it was a much smaller enterprise, now it is a multi-million dollar business, where player performance is vitally important. When every dollar counts, you use every tool at your disposal to maximise your assets – including recording everything you can (documented in the book and film Moneyball). Shots, goals, assists, batting averages, yards gained, completions, you name it, there are stats available. But it’s not just owners, management and staff who use this information – armchair fans are now using this information to help them draft the best fantasy team possible – as there is a large amount of money to be won by competing in these leagues. As a result, a lot of data is freely available online.

Let me illustrate this with an example. One of the first concepts people learn about is the difference between mean vs median vs mode.

To reiterate: the mean is the average value, the median is the middle value (which is useful if your data are very skewed), and the mode is the most common value. Typically, this is accompanied by an example of birth weight, or something somewhat relatable. However, it’s hard to understand why there is a difference between these numbers when they are typically the same, since much of the “example” data we use is either normally distributed or skewed for some other, usually more convoluted, reason. But not so in the case of sports.

Note: All examples use data on all players from the 2010-2011 NHL season. They were taken from Hockey-Reference, which has a great list of stats on the NHL going all the way back to 1917 (!).

Let’s start with age and look at the mean, median and modal values. The mean is 26.6, the median is 26.0 and the mode is 26. Which basically tells us that the mean age of players in the NHL is 26.6, the “middle value” for age is 26, and the most common age is 26. Graphically, it looks like this:

The ages of players in the 2010-11 NHL Season | Source: Hockey-Reference

Those are all very similar, which makes it difficult to see the difference between the values. However, all students have an intuitive understanding of age – they see most players are 20 to 30 years old, and there are very few who continue to play into their late 30s (except Teemu Selanne, who is actually Benjamin Button).

This changes when we look at another important statistic in hockey – goals. In this case, the mean is 7.5, the median is 4.0 and the mode is 0. This is interesting, as it tells us the “average” number of goals scored in the NHL is 7.5, the median, or “middle value” is 4.0, but the most common value is 0, i.e. a large number of people in the NHL didn’t score any goals. The data are highly skewed, and, more importantly, students can understand why, so they can dedicate their energy to understanding what that skew “means” in statistical terms.

The distribution of goals scored in the 2010-11 NHL season | Source: Hockey-reference

Here, the concept of “skew” is very clear, and you can see that the most common number of goals scored in the NHL is 0, i.e. many players didn’t score any goals at all! This is considerably easier to understand than an example on blood pressure, birth weights, or mileage on cars, and takes the intimidation factor out of statistics.
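If you want to try this yourself, here is a minimal Python sketch of the comparison using only the standard library. The two lists are invented for illustration and are NOT the real Hockey-Reference numbers; with the real data you would load one value per player instead.

```python
import statistics

def summarize(values):
    """Return the mean, median, and mode of a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )

# Invented data, NOT the real Hockey-Reference numbers.
ages = [22, 24, 25, 26, 26, 26, 27, 28, 30]      # roughly symmetric
goals = [0, 0, 0, 0, 1, 2, 4, 5, 8, 12, 20, 38]  # long right tail

for label, values in (("ages", ages), ("goals", goals)):
    mean, median, mode = summarize(values)
    print(f"{label}: mean={mean:.1f}, median={median}, mode={mode}")
```

In the symmetric list the three summaries land on the same value. In the skewed list, a handful of high scorers pull the mean well above the median, while the mode stays at 0.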

This is one example of how sports can be used to highlight a statistical concept that I find students struggle with. However, here’s where the real power of sports stats comes in handy: You can scale this up to cover advanced concepts. You want to compare means between groups (i.e. t-tests)? You can calculate the mean number of goals scored by forwards and defencemen and compare them (forwards score more goals). Need to do a chi-square test? Look at the number of forwards and defencemen on each team and test whether different teams have different numbers (they don’t). Need to talk about regression? Why not model goals scored against time on ice to see if more time results in more goals? The possibilities keep going from there.**
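As a rough illustration of two of these ideas, here is a Python sketch that computes a t statistic for two groups and fits a simple regression line by hand. I have used Welch’s version of the t statistic, which does not assume equal variances; that is my choice for the sketch, not something the examples above require. All the numbers are invented, and the function names are my own.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means."""
    var_a = statistics.variance(group_a) / len(group_a)
    var_b = statistics.variance(group_b) / len(group_b)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(var_a + var_b)

def least_squares(x, y):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Invented season goal totals, for illustration only.
forward_goals = [12, 20, 8, 15, 25]
defence_goals = [3, 5, 2, 6, 4]
print(f"Welch t = {welch_t(forward_goals, defence_goals):.2f}")

# Invented average minutes of ice time per game for the same five forwards.
ice_time = [14, 18, 12, 16, 20]
slope, intercept = least_squares(ice_time, forward_goals)
print(f"goals = {intercept:.1f} + {slope:.2f} * minutes")
```

In a real class you would let a statistics package do this and report p-values as well, but writing the formulas out once shows students that nothing mysterious is happening under the hood.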

The thing I like the most about this is how accessible this makes things. Take away the intimidating part of math, and all of a sudden it’s not nearly as scary. You can change sports to pretty much anything else – baseball, football (association or gridiron), or even other widely available databases – movie revenue by genre, number of albums sold by pop artists, voter turnout in recent elections, whatever connects with your students. Once you’ve made the example relatable and have removed the “fear” part of the statistics equation, math can suddenly become much more interesting and engaging to students. And once they’re engaged, learning will become that much easier.


*I should point out: I’m not against difficult problems, as comprehension is an important skill to develop in order to apply statistics to new and novel situations. But let’s leave that for another day, and not start there. The way we teach statistics and math now is like asking a toddler to do cartwheels on a balance beam above a lake of hungry alligators before they can walk.

**If you would like me to provide webinars/slideshares on statistical concepts in future posts, let me know in the comments.

Guest Post: “Talkin’ ‘Bout a Revolution”…or are we?

PLOS Sci-Ed is pleased to welcome Eve Purdy to the blog today to discuss Massive Open Online Courses (MOOCs), and her experiences with them. For more on Eve, please see her bio at the end of this post.

Revolutions are characterized by radical change. Education has always been about knowledge distribution and the creation of learning communities. To me, these do not seem to be radical ideas. However, some are saying that they will revolutionize education. Some feel that they are just a fad. They are generating conversation and they are changing the way students learn, or are they?

The “they” are Massive Open Online Courses (MOOCs). MOOCs have been on the education scene since 2008, when the course “Connectivism and Connective Knowledge” created by George Siemens at the University of Manitoba registered 2200 students online. They are now available by the hundreds through websites such as Coursera and Udacity that boast more than 4 million participants. Despite attrition rates of more than 90%, MOOCs have the ability to reach more students in one course offering than in 40 years of teaching through an institution, as described in this article.

How do MOOCs work?
Anybody can register for courses on topics ranging from “Artificial Intelligence for Robotics” to “Microeconomic Principles” to “The Anatomy of the Upper Limb”. These courses, most often taught by a professor at a reputable post-secondary institution (Harvard, UCSF, Stanford etc. have joined the ranks), are offered for free and run for 4-12 weeks. Though courses vary, in most MOOCs, participants watch lectures on their own time, complete assignments, join discussions and submit and peer-grade assignments.
I previously outlined my experience with the MOOC “Clinical Problem Solving” here. While MOOCs can supplement my medical school experience they cannot replace it. The same might be said for other practical laboratory and work environments.

xMOOCs vs cMOOCs

When thinking about the role of MOOCs in education, and for the rest of this discussion, it is key to make the distinction between xMOOCs and cMOOCs.


xMOOCs are an eXtension of existing educational pedagogies. These are the most common types of MOOCs featured on Coursera, EdX, Udacity etc. They allow professors to deliver information in the same way that they do in a university lecture-based course but to a much larger audience using technology. The “sage on the stage” is still central to the learning with some secondary discussion on class discussion boards and peer graded assignments. Technology does not change the learning model but it does extend it to reach a larger audience.

xMOOCs provide an opportunity to deliver information in a relatively cheap and efficient way. Universities might consider them as a method to reduce costs and provide the highest quality teaching for courses when the main goal is to deliver information to students. Whether or not this is a valid educational goal is the topic of another debate but for now, let’s look at an example:

Medical students must learn some amount of anatomy. Historically, each institution has had a unique curriculum organized and delivered by professors at each school. This results in excess administrative costs and manpower for information that is essentially the same. From experience, I know that when I learned about the arm at McMaster University then again at Queen’s University the biceps brachii was still the biceps brachii. We could encourage the most engaging and effective anatomy professors across the country to collaborate to create an xMOOC “Anatomy for Medical Students” then share this resource with schools who may or may not choose to use it in their curriculum. Programs could support these MOOCs with other learning opportunities such as labs and tutorials. Such a future is explored in a great article, “Just Imagine: New Paradigms for Medical Education”. There are certainly problems with this approach but if the goal is to streamline the delivery of factual information, xMOOCs might just be the way to go.


cMOOCs (connectivist MOOCs) are different. They are a form of decentralized learning. The content is not central to the learning; instead, the process of learning is the learning. A single professor is no longer transferring knowledge in a top-down (vertical) approach, as participants act as both students and educators by sharing information and engaging with each other, using technology as a means to facilitate such interaction. Sounds a bit abstract, right? To read more see this article. Though new to formalized education, this type of learning model is not new. It reflects the type of informal learning that colleagues engage in on a daily basis, but now the constantly evolving balance of learning with and from each other around a shared topic can be explicit and documented.

cMOOCs offer an opportunity to go beyond the material. Students become educators and educators become students. By creating a network where we learn to aggregate, remix, repurpose and share information we become aware that knowledge itself doesn’t make a doctor or an epidemiologist or a biologist. We become aware that how one interacts within a community is at least as important as the knowledge, and likely more so. Universities might consider cMOOCs as a place to explore the already existing informal or “hidden” curriculum. Again, let’s turn to an example in medical education:

Cognitive biases often result in errors in clinical reasoning. For example, a physician may be more likely to order unnecessary tests in an otherwise healthy young adult with chest pain if they missed a rare but deadly diagnosis related to that presentation early in their career. This is an example of the availability heuristic, in which recent or easily remembered, often emotionally charged events affect current decision making. If not recognized, it may result in increased costs to the system and to patients. There are many types of biases in decision making, each with different implications for physicians and patients. Simply delivering information about these cognitive biases to learners will not result in understanding or improved practice. Instead, a group of participants (medical students, residents, doctors, nurses, patients etc.) from any number of institutions could commit to exploring cognitive biases through a cMOOC. This would involve the delivery of some content that would serve as a jumping-off point for discussion, curation and creation of content from a variety of perspectives. Through such a course the medical student might learn not only what the attending physician knows but also the language she uses and the attitudes she holds. The medical student might challenge the attending and the attending might challenge the student. Because the course is horizontal, every participant would be in a position to contribute. The attending physician would learn from the nurse and the resident from the medical student. Knowledge acquisition is not the endpoint for the cMOOC; the community is. For topics in medicine (and other sciences) that are less well defined, cMOOCs provide a unique technological platform, not defined by boundaries of space and time, for exploration.

Are MOOCs revolutionary?

Will xMOOCs mean that more people have access to information? Yes. Will cMOOCs provide a platform for wider learning communities to create knowledge together? Yes. Will this require historical institutions to adapt? Yes. Will this create new opportunities for learning? Yes.

Will that series of “yes’s” result in radical change? You tell me!

I am interested in your thoughts on and experiences with MOOCs. Please feel free to comment below or contact me on Twitter @purdy_eve. Thanks to Javier Benitez, whose thoughts and perspectives in our discussions about MOOCs in the context of medical education have shaped my own ideas.

And since we may or may not be “Talkin’ ‘Bout a Revolution”

About the Author

Eve Purdy, BHSc, is a third-year medical student at Queen’s University with interests in emergency medicine, medical education and social media in health care. She blogs at and you can always contact her on Twitter @purdy_eve

Twitter for Sci-Ed Part 1: Teaching in 140 characters or less

This week, I’ll be talking about Twitter.

Twitter is a well-known microblogging platform. People can post updates in the form of 140-character “tweets” that can be read by followers, who can “retweet,” i.e. repost that tweet to their own followers, or reply to the original post. I started using it about a year ago, and have found it to be equal parts whimsical and hilarious, along with useful and informative.

Several other authors have discussed reasons why scientists should be using Twitter, including this excellent post on Deep Sea News and this post through the American Geophysical Union. For a more personal opinion, Dr Jeremy Segrott gave his thoughts after he used Twitter for three months. Scientists are realizing that social media, when done well, is an important way to translate knowledge to the public, and Twitter provides another avenue by which this can be accomplished.

Over the next three posts, I’m going to cover five reasons why I think that you, as a scientist, should be using Twitter or, at the very least, be signed up for a Twitter account, and how it can be incredibly useful as a networking tool. Reasons 2 and 3 will be up on Wednesday, and reasons 4 and 5 will go up next Monday. I’m not going to go into details about how to set up a Twitter account, and will instead link to some Twitter 101 guides at the end of this post to help you get started. However, before we start, I’m going to cover some terms I’ll be using so everyone is on the same page.

  • A “tweet” refers to a message up to 140 characters in length. This is the message that you write.
  • A “hashtag” is a Twitter-based “filing system.” Within Twitter, people can use hashtags to categorize tweets. So, for example, tweets about graduate school use the #gradschool hashtag; tweets about Kingston (the town I live in) use the #ygk hashtag. It means that anyone who wants to find tweets about Kingston can search #ygk, and as long as the original person used that tag, they’ll be able to find them. Other popular tags include #scied, for science education, #phdchat for discussions around PhD-related issues, #madwriting for people who are trying to bust through a writing slump, and #TMLtalk for those with terrible taste in hockey teams. Events such as the #GoldenGlobes also have their own hashtags, and you’ve probably seen hashtags at the bottom of your screen while watching the news, or even TV shows.
  • “Retweeting” refers to when you repeat what someone else has written, giving them credit. This can occur either through a direct retweet (using the retweet button), or by adding a comment and the letters RT before the original tweet. Sometimes you have to cut down the number of words used if you want to add a comment in front of their tweet, and so the acronym MT (modified tweet) is used to indicate that you have changed their original tweet.
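If you like to see ideas as code, here is a minimal sketch of the hashtag “filing system” described above. It only works on plain strings and does not talk to Twitter at all. The function names and the example tweets are made up for illustration, and the pattern for what counts as a hashtag is a simplifying assumption, not Twitter’s actual rule.

```python
import re

MAX_TWEET_LENGTH = 140  # character limit described above

# Assumption: a hashtag is "#" followed by letters, digits, or underscores.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_hashtags(tweet):
    """Return the hashtags in a tweet, lowercased and without the '#'."""
    return [tag.lower() for tag in HASHTAG_PATTERN.findall(tweet)]


def find_by_hashtag(tweets, hashtag):
    """Return the tweets that were tagged with the given hashtag."""
    wanted = hashtag.lstrip("#").lower()
    return [t for t in tweets if wanted in extract_hashtags(t)]


# Made-up example tweets, for illustration only.
tweets = [
    "Farmers' market is open downtown today #ygk",
    "Three chapters left to revise #phdchat #madwriting",
    "Great new lesson plan on cell division #scied",
]

# Tweets filed under #ygk
print(find_by_hashtag(tweets, "#ygk"))
# Check that every example fits in a single tweet
print(all(len(t) <= MAX_TWEET_LENGTH for t in tweets))
```

The search only finds tweets whose authors actually used the tag, which is the same caveat as in the hashtag definition above.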

Reason #1: It has very direct and very relevant implications for those in Public Health

Everyone has a cellphone; some people have two. With the advent of social media (Facebook, Twitter, etc.), we are sharing more than we ever have in the past, and anyone can know about that awesome new app that I found, or the delicious Christmas dinner I made. However, while personal tweets can be frivolous, epidemiologists can use them to track when people report symptoms of being sick. Imagine the number of “I hate being sick!” and “My nose is stuffed up!” tweets people write in the winter and you’ll know what I mean. This is a very rich, but very poorly understood, data source. There has been some exploratory work in this area. A researcher at LSU looked at the accuracy of using Twitter as a predictor of influenza outbreaks, finding that it was quite accurate, as the graph below shows.

Figure 2: Fitted and predicted ILI rates. The red line is predicted rates, the black line is the actual rate. The vertical line separates training from predicted data. (Culotta, 2010)

This is a new area, however, and of course it still needs work:

These results show extremely strong correlations for all queries except for fever, which appears frequently in figurative phrases such as “I’ve got Bieber fever” (in reference to pop star Justin Bieber).

Thanks to Jessica S. for that paper 🙂
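To make the “Bieber fever” problem concrete, here is a rough sketch of keyword matching that skips figurative uses of “fever.” To be clear, this is not the method from the paper: the keyword list, the list of figurative phrases, and the example tweets are all my own assumptions, chosen only to show why a naive keyword search over-counts.

```python
import re

# Hypothetical symptom keywords; the paper's actual query terms may differ.
SYMPTOM_KEYWORDS = ("flu", "cough", "sore throat", "fever")

# Figurative uses to ignore, such as the "Bieber fever" case quoted above.
FIGURATIVE_PATTERN = re.compile(r"\b(bieber|spring|cabin)\s+fever\b", re.IGNORECASE)


def mentions_symptom(tweet):
    """True if the tweet contains a symptom keyword outside a figurative phrase."""
    text = FIGURATIVE_PATTERN.sub(" ", tweet).lower()
    return any(
        re.search(r"\b" + re.escape(keyword) + r"\b", text)
        for keyword in SYMPTOM_KEYWORDS
    )


def symptom_fraction(tweets):
    """Fraction of tweets in a batch that mention a symptom."""
    if not tweets:
        return 0.0
    return sum(mentions_symptom(t) for t in tweets) / len(tweets)


# Made-up tweets for one week, for illustration only.
week = [
    "I hate being sick! Fever and a cough all night",
    "I've got Bieber fever",
    "Stuck home with the flu",
    "Great hockey game tonight",
]

print(symptom_fraction(week))  # 0.5
```

A weekly fraction like this is the kind of number one could then compare against reported influenza-like illness rates. A hand-written exclusion list will always miss some figurative phrases, which is part of why this still needs work.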

Another example of researchers using Twitter is the investigation of how misinformation spreads through social media. Misinformation bothers us, and it is something Beth has discussed over at Public Health Perspectives, as has our own Cristina Russo here at Sci-Ed.

Researchers at Columbia sampled 1000 tweets that mentioned antibiotics to investigate how they were reported on Twitter. While the vast majority of tweets were innocuous, there were some that were clearly incorrect or misinformed. What is important, however, is the reach of these tweets: while 302 tweets by 277 individuals incorrectly used the words “cold” and “antibiotics” together, those tweets reached 850,375 followers (although this number is heavily skewed; the median number of followers was 66).
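A quick back-of-the-envelope calculation shows just how skewed that reach is. The three numbers below come from the figures quoted above. The one assumption I am making is that the total reach is the sum of follower counts across the 277 individuals, which the summary above does not spell out.

```python
total_reach = 850_375   # followers reached, as quoted above
individuals = 277       # people who posted the tweets
median_followers = 66   # median follower count, as quoted above

# Assumption: total_reach is the sum of follower counts across the individuals.
mean_followers = total_reach / individuals

print(round(mean_followers))                     # 3070
print(round(mean_followers / median_followers))  # 47
```

Under that assumption, the average account had roughly 47 times as many followers as the typical (median) account, which is what you would expect if a handful of very popular accounts were responsible for most of the reach.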

Rest assured readers, Bieber fever is not a contagious disease | Photo via

Come back on Wednesday for Reasons 2 and 3!

For a guide about how to set up a Twitter account, I’d recommend the following links for a handy “how-to”: Wikihow, CNet, Brent Ozar’s FAQ, as well as Travis Saunders’ post about Twitter etiquette. If you’re wondering who to follow, I’d recommend checking out these lists: Colby Vorland’s list of Nutritional and Health Science people, Health Scientists, Shelley Wallingford’s list of Epidemiologists, Sara Caldwell’s Science-y Folk, RenuShenu’s Public Health Tweeple, Melonie Fullick’s PhDChat and Liz Ditz’s MedSocialMedia.


Scanfeld, D., Scanfeld, V., & Larson, E. (2010). Dissemination of health information through social networks: Twitter and antibiotics. American Journal of Infection Control, 38(3), 182-188. DOI: 10.1016/j.ajic.2009.11.004

Culotta, A. (2010). Detecting influenza outbreaks by analyzing Twitter messages. Unpublished. Available online at:

Ed note: A version of this series originally appeared on Mr Epidemiology (Part 1, Part 2, Part 3)

Open data for science education

Open data is the idea that scientific data should be freely available to all, without restrictions, in searchable online repositories. The open data movement is gaining momentum in the scientific community because of its promise to enable more frequent replication of studies and to accelerate the pace of research. But the advantages for science education are just as compelling.

Science students can benefit greatly from educational materials that expose them to real-world phenomena and data. Unlike learning from broad generalizations and pre-fabricated “cookbook” labs, examining and working with real data can increase interest and better prepare students for careers in science. As states begin to adopt the Next Generation Science Standards, which emphasize practices such as analyzing and interpreting data, and mathematical and computational thinking, developers of K–12 science curriculum materials are increasingly looking for ways to incorporate scientific data into their lessons and assessments.

However, barriers exist that prevent educators from effectively using much of the data that scientists produce. As a reader of PLOS blogs, you are likely familiar with the open access movement in scholarly publishing. But even access to journal articles, though valuable, is often not sufficient for educators’ purposes. Data in journal articles are usually in the form of a few graphs. These graphs are typically frozen in PDFs as part of a paper that conveys the authors’ interpretation of the results in the context of their particular study. And the data presentation choices were made with one audience in mind: experts in the field.

Open Data by Colleen Simon for (CC-BY-SA)

Using data as it is presented in papers is almost never pedagogically sound at the middle or high school level; much must be changed about the presentation. Jargon and acronyms might have to be removed from axis titles, individual data sets might need to be separated if they are layered into a single figure, or perhaps a section of the graph that describes phenomena outside the scope of the lesson would have to be removed. Making these kinds of educationally necessary modifications—while maintaining scientific accuracy—often requires access to full original datasets.
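As a small illustration of why the full dataset matters, here is a sketch of the three modifications described above: relabeling jargon, separating layered data sets, and trimming out-of-scope data. Everything in it is hypothetical. The rows, column names, labels, and the depth cutoff are made up and do not come from any real study.

```python
# Made-up rows standing in for a full original dataset; the column names
# and values are hypothetical and do not come from any real study.
rows = [
    {"site": "A", "depth_m": 5, "chl_a_ug_L": 2.1},
    {"site": "A", "depth_m": 50, "chl_a_ug_L": 0.4},
    {"site": "B", "depth_m": 5, "chl_a_ug_L": 3.0},
    {"site": "B", "depth_m": 50, "chl_a_ug_L": 0.6},
]

# Plain-language replacements for jargon-heavy column names.
FRIENDLY_LABELS = {
    "site": "Sampling site",
    "depth_m": "Depth (meters)",
    "chl_a_ug_L": "Chlorophyll (micrograms per liter)",
}


def prepare_for_classroom(rows, max_depth_m):
    """Drop out-of-scope rows, relabel columns, and split the data by site."""
    by_site = {}
    for row in rows:
        if row["depth_m"] > max_depth_m:
            continue  # outside the scope of the lesson
        relabeled = {FRIENDLY_LABELS[key]: value for key, value in row.items()}
        by_site.setdefault(row["site"], []).append(relabeled)
    return by_site


classroom_data = prepare_for_classroom(rows, max_depth_m=10)
for site, site_rows in classroom_data.items():
    print(site, site_rows)
```

Note that the measured values themselves are never altered; only the labels, the grouping, and the range shown change. None of these steps is possible when all you have is a finished graph in a PDF.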

Unfortunately, most scientific data is not archived and readily available online. Educators have to contact the study authors and see if they are willing and able to pass it along. Just as with journal articles, this “write to the author” stop-gap is wildly inefficient. Study authors often can’t or won’t respond to requests for original data for a variety of reasons. Sometimes they are simply out of town and not checking email. Sometimes they want to publish more papers and are afraid of getting scooped. And sometimes, especially with older studies, they actually can’t find their data.

In a 2002 survey of geneticists, of those who admitted to denying at least one request from a colleague for published data, the most commonly given reason was the “effort required to actually produce the information” (80 percent of respondents). As Todd Vision, a biologist at UNC and contributor to the Data Dryad open data repository, explained in BioScience:

Unarchived data files are often misplaced, corrupted, or the software in which they were produced becomes obsolete. Memories fade.

Science education materials developers need full access to the data in order to determine its pedagogical strengths and weaknesses. This process often involves investigating many different data sets until settling on the ones that will best address the learning goals for their particular project. Following up on hundreds of individual papers—with a dismal rate of return—isn’t feasible for a small education nonprofit or a lone teacher trying to innovate at a struggling school. This leaves vast amounts of potentially more educationally useful data untapped.

I talked to Sandra Porter, whom I met at the last Science Online conference, about her experience with obtaining data for curriculum materials development. Sandra is the president of Digital World Biology, and one of her collaborative projects, Bio-ITEST, involved the development of bioinformatics curriculum materials for secondary students. In genetics and bioinformatics, which are inherently data-focused, data archiving requirements are more common, and Sandra and her colleagues were able to take advantage of open data resources such as the National Center for Biotechnology Information (NCBI) and the Barcode of Life Data (BOLD) Systems. Yet even in these fields, access to raw data—the kind that practicing scientists would encounter in their careers—can be tricky to obtain. Sandra commented:

The raw data was useful for us because we needed to know what raw data looks like so we could work out analysis problems in advance. These types of data files are not likely to be available from many places since these raw data are usually processed and analyzed through many pipeline steps before they get submitted to a database.

There are many worthy reasons to support the open science movement, but the argument for science education holds its own among them. It has never been easier to bring real scientific data into classrooms, and the benefits to young scientists-in-training are clear. It would be a shame for all of that educational potential to languish on old hard drives.