Monday, April 05, 2010

I got up in the middle of the night to write this and it was all muddled. Then I got up in the middle of another night thinking 'This makes no sense!' And it didn't, because I had written P(causation|correlation) = P(causation) - P(correlation|no causation). Stupid me. I'll correct it later. In short: They're not independent, as correlation is a prerequisite for causation. P(correlation) is not 0 or we wouldn't be discussing P(correlation) with such fascination. I'll explain it with more rigor when I'm not really tired and needing to get up in seven hours.

----

I occasionally see misuse of the mantra 'Correlation does not imply causation'. So I don't have to write all this every time I see it, here's a concise explanation:

(Note on notation for non-mathy people: P(x) is the probability that x is true. P(y|x) means the probability that y is true given that x is true.)

The source of the confusion with this statement is that 'imply' can mean either 'entail' or 'suggest'. 'x entails y' means P(y|x)=1. 'x suggests y' means P(y|x)>P(y).

In 'correlation does not imply causation', the word 'imply' is being used to mean 'entail'. Correlation does suggest causation: P(causation|correlation) is greater than P(causation), even though it is less than 1.
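To make the difference between 'entail' and 'suggest' concrete, here is a small sketch in Python. The joint distribution is made up purely for illustration, and the function names (p_y, p_y_given_x, suggests, entails) are my own, not standard terminology. It just checks the two definitions given above against one set of numbers.

```python
from fractions import Fraction

# Hypothetical joint probabilities for (x, y); the numbers are invented
# for illustration and are not drawn from any real data.
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}

def p_y(joint):
    """P(y): total probability of every outcome where y is true."""
    return sum(p for (x, y), p in joint.items() if y)

def p_y_given_x(joint):
    """P(y|x): probability of x and y together, divided by P(x)."""
    p_x = sum(p for (x, y), p in joint.items() if x)
    return joint[(True, True)] / p_x

def suggests(joint):
    """'x suggests y' in the sense above: P(y|x) > P(y)."""
    return p_y_given_x(joint) > p_y(joint)

def entails(joint):
    """'x entails y' in the sense above: P(y|x) = 1."""
    return p_y_given_x(joint) == 1

print("P(y)   =", p_y(joint))
print("P(y|x) =", p_y_given_x(joint))
print("x suggests y:", suggests(joint))
print("x entails y: ", entails(joint))
```

With these particular numbers, P(y) is 1/2 and P(y|x) is 3/4, so x suggests y without entailing it. That is the sense in which correlation can be evidence for causation without being proof of it.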

Also, if you've ever been tempted to claim that 'lack of evidence is not evidence of lack', I encourage you to apply the same [deleted because I muddled it] reasoning, keeping in mind that the terms 'evidence' and 'proof' are not typically considered interchangeable.