Spurious Correlations, or, Why Nicolas Cage Must Be Stopped

There is a misguided assumption in a lot of media reporting on research that correlation equals causation. Correlation is a statistical relationship between two variables – for example, amounts of social service funding and crime rates – in which the values of one variable tend to move together with the values of the other. In other words, if the two variables are correlated, a change in one tends to be accompanied by a change in the other.

There is a problem with this assumption, however – or at least it’s a problem for reporters who can’t be bothered to learn basic statistical concepts. A variable that is statistically correlated with another variable may change not because of a change in the other variable, but because of factors that have absolutely nothing to do with that other variable. If you think of the large number of variables that could be related to amounts of social service funding (e.g. what activities the funding is being spent on, how or where the funding allocations are made) and to crime rates (e.g. what kinds of crimes, how much criminal activity is actually reported), you can see how a correlation cannot definitively prove that changes in funding for social services will result in changes in crime rates. And that is why statistics instructors always tell their students: correlation does not imply causation.

I’ve just come across a website, Spurious Correlations, that demonstrates this principle with some great examples. It seems that Nicolas Cage should be banned from making any more movies, because the more he appears in films, the more people drown in swimming pools.

[Screenshot: 'Spurious Correlations', www.tylervigen.com]

(The correlation number at the bottom of the table indicates the strength of the relationship between the two variables. A positive number means that an increase in one variable relates to an increase in the other variable; a negative number means that an increase in one variable relates to a decrease in the other variable. The closer the correlation number is to +1 or -1, the stronger the relationship between the variables.)
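For readers who like to see the arithmetic, here is a minimal Python sketch of how a correlation number like this can be calculated. It assumes the figure shown is the standard Pearson correlation coefficient. The function name `pearson_r` and the sample numbers are made up for illustration and are not taken from the website.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two variables vary together around their means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical numbers chosen only to show the two extremes.
rising = [1, 2, 3, 4, 5]
also_rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

print(pearson_r(rising, also_rising))  # strong positive relationship
print(pearson_r(rising, falling))      # strong negative relationship
```

Notice that nothing in the calculation knows or cares what the numbers represent, which is exactly why a high value on its own says nothing about cause and effect.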

It also appears that increased mozzarella consumption leads to more doctorates in civil engineering in the United States. Maybe hungry American PhD students eat more cheese?

[Screenshot: 'Spurious Correlations', www.tylervigen.com]

And the website also allows you to generate your own spurious correlations. I know that it rains a lot in Washington, the US state closest to me. But I didn’t know that a decrease in precipitation in Washington leads to fewer lawyers in the Northern Mariana Islands.

[Screenshot: 'Precipitation in Washington correlates with Number of lawyers in the Northern Mariana Islands', www.tylervigen.com/view_correlation.php?id=3130]

A really great feature of this website is that all its spurious correlations are statistically significant. That is, given the number of data points used in the calculation, the correlations are unlikely to have occurred by chance. The fact that these correlations are meaningful by statistical standards – but utterly meaningless in terms of any real effect of the variables on each other – emphasizes even more strongly why it’s important to understand statistical concepts.
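To make "unlikely to have occurred by chance" a little more concrete, here is a rough Python sketch of one common way to check it, a permutation test. This is an assumption chosen for illustration and not necessarily how the website calculates significance. The function names, the number of shuffles, and the data are all hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate how often randomly re-paired data correlates as strongly as the real data."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_strong = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_strong += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (at_least_as_strong + 1) / (n_shuffles + 1)


# Made-up yearly figures for two unrelated things that both happen to rise.
series_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
series_b = [3, 5, 4, 8, 9, 12, 11, 15, 17, 18]

p = permutation_p_value(series_a, series_b)
print(p)  # a small value means random pairing rarely looks this strong
```

A small result here only says that the pattern is hard to get by shuffling the numbers at random. It says nothing about whether one variable has any effect on the other.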

And it’s especially important to be able to think critically and analytically about statistics if you’re writing about research based on statistical analyses. If you don’t, you may end up misreporting the research and misleading your readers – which is a problem not only for you and for them, but also for society at large, because it misleads us all about the real reasons why things work as they do.

5 comments

  1. What an absolutely *superb* post, Fiona! One of my biggest pet peeves is when journalists take a correlation, extrapolate it to its most ridiculous extreme (in this case, “Nicolas Cage Murders Hundreds by Drowning”), and then cheerfully report it as a fact. Though, come to think of it, I have rather been tempted to drown myself after watching a couple of Nick’s films … 😀

    May I have your permission to reblog your piece, please? With all due — and well-deserved — credit, of course.
