Correlation vs. Causation

News media only make money when your eyes are on their content.  As such, it is in a news outlet's best interest to sensationalize its stories in order to get the most eyes.  This can be tricky though, especially when it blurs the line between causation and correlation.

Correlation and causation are both important terms in statistics, and it is easy to get them confused.  The difference is important though.  Correlation means that there is a statistically significant relationship between two observable phenomena.  For instance, you may find that if you go to bed with your socks on, you have nightmares.  There is certainly a correlation between wearing socks and having nightmares, but are the socks actually causing the nightmares?  Possibly, but not definitely.  Only once it can be shown that wearing socks actually causes nightmares do you have causation.  Correlation doesn't automatically imply causation, but it doesn't rule it out either.

This topic came to mind because I quite often see news articles that take a statistical correlation as an opportunity to imply causation.  For instance, the recent Craigslist killer was said to have a problem with gambling and to be in debt.  Might there be a correlation, with more killers having problems with gambling and debt?  Potentially.  But do gambling and carrying debt cause you to become a killer?  Probably not.

Wikipedia has a more extensive article on the topic.  It's a little dry, but I found it pretty interesting.


Alcuin said...

There's something fascinating related to correlation and causation called a Hidden Markov Model.

The basic idea is that there are hidden "states" that indicate the probability that you'll exhibit a certain behavior or that you'll move to some other "state".

The Hidden Markov Model assumes that there's some other, hidden cause. For example: perhaps socks and nightmares are caused by the "hidden state" - Very Cold Room...
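
To make the Very Cold Room idea concrete, here is a rough Python sketch. It is not a Hidden Markov Model, just a plain simulation of one hidden common cause, and every number in it (how often the room is cold, how likely socks and nightmares are in each case) is made up purely for illustration. In this toy world socks never cause nightmares; the cold room drives both.

```python
import random

def simulate(n=20000, seed=0):
    """Simulate n nights as (cold, socks, nightmare) tuples.

    All probabilities are invented for illustration. Socks and
    nightmares each depend only on the hidden cold room, never
    on each other.
    """
    rng = random.Random(seed)
    nights = []
    for _ in range(n):
        cold = rng.random() < 0.3                           # hidden state
        socks = rng.random() < (0.8 if cold else 0.2)       # cold room -> socks
        nightmare = rng.random() < (0.6 if cold else 0.1)   # cold room -> nightmares
        nights.append((cold, socks, nightmare))
    return nights

def nightmare_rate(nights, socks, cold=None):
    """Share of nights with a nightmare, filtered by socks and,
    optionally, by whether the room was cold."""
    picked = [night for night in nights
              if night[1] == socks and (cold is None or night[0] == cold)]
    return sum(night[2] for night in picked) / len(picked)

nights = simulate()

# Ignoring the room: sock nights look much worse than sockless nights.
print("socks:   ", nightmare_rate(nights, socks=True))
print("no socks:", nightmare_rate(nights, socks=False))

# Looking only at cold nights: the gap should mostly disappear.
print("cold, socks:   ", nightmare_rate(nights, socks=True, cold=True))
print("cold, no socks:", nightmare_rate(nights, socks=False, cold=True))
```

If you only compare sock nights to sockless nights, socks look guilty. Once you compare nights where the room temperature was the same, the difference should shrink to roughly nothing, which is the correlation-without-causation situation from the post.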

Jade Mason