what’s the difference between correlation and causation?

9 Answers

Anonymous 0 Comments

Correlation means that two things appear to be linked somehow. Causation means that one thing leads to another.

So for instance a pub selling more beer might lead to a rise in the number of drunk people. That’s causation. One thing led to the other but not the other way round (i.e. a coachload of already drunk people arriving wouldn’t increase the amount of beer that had already been sold).

With correlation, two things can change and *appear* to change in the same way, like [US import of oil from Norway and being hit by trains](https://tylervigen.com/view_correlation?id=136). But these two things may or may not be linked, may or may not cause each other, or may both be driven by an entirely different third thing which is the actual cause.

Hence the saying that correlation does not imply causation – you get a lot of people dishonestly using spurious correlation to try to convince you that there is causation.

Anonymous 0 Comments

Correlation = you analyze data and find that when variable X changes, variable Y does too, and the relationship between their changes follows a consistent mathematical pattern (linear, exponential, whatever) rather than being random. But you have no idea whether variable Z is changing in the background and causing both X and Y to change with it. In that case, X and Y are not causally linked, they just have a common variable that controls them.

Example: the more you smoke, the longer you live. That’s correlation (I made that up). You can’t just say oh smoking is good for you. Because if you dig deeper, you realize the people that smoke are rich enough to buy cigarettes. And the more money you have, the longer you live.
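To make the hidden-variable idea concrete, here is a small simulation sketch. Everything in it is invented for illustration (the variable names, the noise levels, the assumption that wealth adds equally to both quantities), just like the made-up smoking example above. Wealth drives both "smoking" and "lifespan", and neither affects the other, yet the two still come out correlated until wealth is accounted for.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical model: wealth (Z) feeds into both smoking (X) and lifespan (Y).
# Smoking never appears in the lifespan formula, so it has no causal effect here.
wealth = [rng.gauss(0, 1) for _ in range(n)]
smoking = [w + rng.gauss(0, 1) for w in wealth]
lifespan = [w + rng.gauss(0, 1) for w in wealth]

# Looking only at X and Y, they appear clearly linked.
r_raw = pearson(smoking, lifespan)

# "Digging deeper": remove the part of each variable explained by wealth.
# We subtract wealth directly because we built the model with a coefficient
# of 1; with real data you would have to estimate that, e.g. by regression.
smoking_left = [s - w for s, w in zip(smoking, wealth)]
lifespan_left = [l - w for l, w in zip(lifespan, wealth)]
r_controlled = pearson(smoking_left, lifespan_left)

print(f"raw correlation:            {r_raw:.2f}")
print(f"after accounting for wealth: {r_controlled:.2f}")
```

In this toy setup the raw correlation should land somewhere around 0.5 and the controlled one close to 0, which is the "common variable that controls them" situation described above.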

Causation = you do an experiment, you keep everything exactly the same, but in one case you change variable A and in the other variable B. If variable C changes when variable A does, then changing A causes a change in C. It’s not so simple and you need many controls to make sure this is actually causal. But that’s the gist of it.

Example: you get two sets of mice. One you give them folic acid dissolved in sodium bicarbonate. The other, you give them just sodium bicarbonate. You do everything identically. You find those who got folic acid got kidney failure. Folic acid causes kidney failure. Usually causal experiments need to be very tightly controlled, and you manipulate the experiment yourself. With correlation, you usually just observe things that already happened, so you can’t control everything, because it’s in the past.
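A rough sketch of why the controlled experiment works, again with invented numbers: each simulated mouse has a hidden trait (called `frailty` here, a hypothetical name) that also affects the outcome. Because a coin flip decides who gets the treatment, that hidden trait ends up balanced across the two groups, so the difference in average outcome reflects the treatment alone. The effect size and the "damage score" are assumptions for the demo, not real data about folic acid.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
TRUE_EFFECT = 2.0       # assumed effect of the treatment on the damage score

def run_trial(n_mice):
    """Randomly assign treatment and return the difference in group means."""
    treated, control = [], []
    for _ in range(n_mice):
        frailty = rng.gauss(0, 1)          # hidden trait we cannot observe
        gets_treatment = rng.random() < 0.5  # coin flip, independent of frailty
        damage = frailty + rng.gauss(0, 0.5)
        if gets_treatment:
            damage += TRUE_EFFECT
            treated.append(damage)
        else:
            control.append(damage)
    return sum(treated) / len(treated) - sum(control) / len(control)

estimated_effect = run_trial(4000)
print(f"estimated effect: {estimated_effect:.2f} (built-in effect: {TRUE_EFFECT})")
```

The estimate should come out close to the built-in effect even though `frailty` was never measured. That is the advantage of manipulating the variable yourself instead of observing things that already happened.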

Anonymous 0 Comments

The big trick is that correlation can give people incredibly misleading information at times. Such as “People who own horses are more likely to live longer.” [https://www.theplaidhorse.com/2019/08/12/research-finds-women-who-own-horses-live-longer-than-those-who-dont/](https://www.theplaidhorse.com/2019/08/12/research-finds-women-who-own-horses-live-longer-than-those-who-dont/) So people start to think, “Owning a horse makes you healthy!” rather than going “People who own horses are likely to be rich, since owning the land to keep a horse is expensive, and richer people are more likely to be able to afford better medical treatment.”

Correlation can lead to people making bad associations, and thus making bad decisions.

Anonymous 0 Comments

Correlation means there is something similar between two things. Causation means one affects the other.

As my coffee consumption increases (number of cups I’ve drunk today), the hour of the day increases (how late it is). The time does not affect how many cups of coffee I’ve drunk. Nor does the number of cups I’ve drunk affect the time. There is a correlation without causation.

As I consume more coffee in a given day, the number of times I visit the restroom in that day increases. There is a direct causal relationship; there is correlation but also causation. The more coffee I drink, the more I need to visit the restroom.

Anonymous 0 Comments

Causation is when one thing directly causes the next thing. “I drank heavily last night, so I got drunk.” Drinking alcohol directly causes you to become drunk.

Correlation is when two things are related in some way. “I drank heavily last night, and I have a hangover this morning.” Drinking alcohol does not directly cause hangovers, it just depletes you of various necessary nutrients. The lack of nutrients is the direct cause of my hangover. Drinking is heavily correlated with hangovers.

Anonymous 0 Comments

Let’s say that there are two groups of people, 100 in each group.

Group A has 75 happy people (who each gave their happiness a score of more than 5 out of 10). There are also 75 people in group A who are healthy and don’t take any medications.

Group B has only 30 happy people. There are also 30 people in group B who are healthy and don’t take any medications.

You can see that there appears to be a correlation between health and happiness. However, we don’t know from this information whether being happy causes good health, or if being in good health causes happiness. We only know there is a correlation, and we don’t know what the causation is.

Anonymous 0 Comments

Causation is a much narrower, stronger, special case of correlation.

**Correlation** is simply whenever you notice thing B changing in some direction when thing A is changing in some direction. For example, overcast skies, grass being wet, and umbrella sales are all correlated with each other – in this case their odds increase when the other two increase.

Now, if I take a garden hose and start spraying my lawn with water, I wouldn’t expect the skies to suddenly turn grey, nor is it a particularly great marketing campaign for my umbrella business. I also wouldn’t expect my special discount on umbrellas to do that, or to cause my lawn to become wet. On the other hand, if the clouds start gathering above, I **do** expect that I likely will be raking in the profits, but won’t need to turn the sprinklers on.

**Causation** is the reason why I expect that – in the model of the world that I currently find plausible, the way rain clouds work causes the other two things to happen, but none of the other two cause rain clouds to happen.

So, one way to think about causation is as a kind of an *asymmetric*, or *directed* correlation. Any causal relation will show correlation, but not all correlations are proof there’s causation.

Note that two things may cause each other – for example, in a ping-pong game, one player hitting the ball makes it more likely the other player will (because he couldn’t if the first guy butterfingered it), and the other player hitting back makes it more likely the first player will be able to hit it again – by the same logic.

Correlations are weak evidence, because they may pop up for all sorts of random reasons. The umbrella sales and the wet grass are correlated because they have a common cause, for example. Other times, it’s just a plain stroke of luck that two unrelated things happened to play out at roughly the same time.

Anonymous 0 Comments

If I were to dump a bucket of water over your head, you would become soaking wet. My dumping the bucket of water over your head CAUSED you to become soaking wet.

If a teenage boy starts masturbating and developing acne, the masturbation is not causing his acne, and his acne is not causing his masturbation. Instead, the new hormones in his system are responsible for both. The masturbation and the acne are CORRELATED (literally co-related).

Anonymous 0 Comments

A correlation between two things A and B is the observation that they change in a similar pattern. If last Tuesday effect A doubled, and effect B doubled at the same time, and that was part of a consistent pattern, you’d say that they were correlated.

Unfortunately our primitive brains have a built-in cognitive bias that disposes us to assume that two things that change together are linked by a causal relationship, one *making* the other change.

The possibilities that we give too much weight to are that A caused B or that B caused A. The one we forget is that maybe C caused the change in both A and B; and that there’s not just one C effect, there’s the endless series of phenomena in the environment C1, C2, C3, …, Cn that could be doing it.

It helps to look at [weird correlations](http://tylervigen.com/spurious-correlations), where it would be ludicrous to believe that one caused the other. For example: there’s a 95.8% correlation between the per capita consumption of mozzarella and the number of civil engineering doctorates awarded. Nobody actually thinks that eating a particular kind of cheese determines how many people choose to study a given discipline 3 years before they ate it, but the correlation is strong. Correlation does **not** imply causation.
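One common way such weird correlations arise is that both quantities simply drift upward over time. The sketch below uses two made-up series (not the real mozzarella or doctorate figures; the slopes and noise levels are arbitrary assumptions) that are generated completely independently. They still correlate strongly, and the link vanishes once you look at step-to-step changes instead of raw levels.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(2)  # fixed seed so the run is repeatable
steps = 1000

# Two unrelated quantities that each happen to grow over time.
series_a = [0.5 * t + rng.gauss(0, 5) for t in range(steps)]
series_b = [3.0 * t + rng.gauss(0, 30) for t in range(steps)]

# Raw levels: dominated by the shared upward trend (time is the hidden "C").
r_levels = pearson(series_a, series_b)

# Step-to-step changes: the trend becomes a constant and drops out.
diff_a = [series_a[i] - series_a[i - 1] for i in range(1, steps)]
diff_b = [series_b[i] - series_b[i - 1] for i in range(1, steps)]
r_changes = pearson(diff_a, diff_b)

print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

The levels should show a correlation near 1, while the changes should show one near 0, even though nothing connects the two series except the passage of time.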