Notice the numerator will tend to be larger in magnitude when larger values of x happen along
with larger values of y, and smaller values of x with smaller values of y. The denominator is the
product of the two variables' standard deviations; in other words, we are scaling for how much
these variables vary without reference to each other.
The correlation coefficient always has a value between -1 and 1. When it's 1, we have a perfect
positive correlation; when it's -1, we have a perfect negative correlation. A perfect correlation
means that if you know one variable's value, you automatically know the other's as well. For
instance, the ages of any two people are perfectly correlated; if you know one person's age, then
as long as you know the difference in their birthdates, you know the other person's age, too.
When the correlation coefficient is zero, there is no correlation; knowing one variable's value
tells you nothing about the other, at least in the straight-line sense that this coefficient measures.
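To make the description above concrete, here is a minimal Python sketch of the correlation coefficient. It assumes the usual Pearson form (average product of deviations, divided by the product of the two standard deviations); the function name `pearson_r` and the sample ages are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: large and positive when big x's go with big y's
    # (and small x's with small y's).
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    # Denominator: how much each variable varies on its own.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return covariance / (sd_x * sd_y)

# Hypothetical data: person B is always 3 years older than person A.
ages_a = [10, 20, 30, 40, 50]
ages_b = [age + 3 for age in ages_a]
r_ages = pearson_r(ages_a, ages_b)                    # perfect positive correlation
r_negative = pearson_r(ages_a, [-a for a in ages_a])  # perfect negative correlation
```

Up to floating-point rounding, the first result is 1 and the second is -1, matching the two-ages example.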
If you square the correlation coefficient, you get something called the coefficient of
determination, or r². It always lies between 0 and 1, and it has a nice interpretation: it tells you
the fraction of variation in one variable that can be explained, or predicted, by variation in the
other. For example, suppose r² = 0.4 for income and age. That would mean differences in age
explain 40% of differences in income; the remaining 60% would have to be explained by other
factors.
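The "fraction of variation explained" reading can be checked numerically. The sketch below assumes "explained" means explained by the best-fitting straight line (ordinary least squares); the age and income figures are invented for illustration and are not from any real data set.

```python
def r_squared_two_ways(xs, ys):
    """Return (r squared, fraction of variation in y explained by a fitted line)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in y

    # Way 1: square the correlation coefficient.
    r = sxy / (sxx * syy) ** 0.5

    # Way 2: fit a least-squares line and see how much variation is left over.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    unexplained = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    explained_fraction = 1 - unexplained / syy

    return r ** 2, explained_fraction

# Hypothetical ages (years) and incomes (thousands of dollars).
ages = [25, 30, 35, 40, 45, 50]
incomes = [30, 42, 38, 55, 49, 61]
r2, explained = r_squared_two_ways(ages, incomes)
```

The two numbers agree up to rounding, which is why r² can be read as a share of variation.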
However, we're using "explain" and "predict" in a very specific way here. It is only a property
of the variables' numerical values and how they tend to go together. That does not necessarily mean
that changes in one variable cause changes in the other. For example, age and income might be
correlated, but age itself may not cause higher income; it's just that older people tend to have more
experience. Correlation is only a statement of numerical facts; it says nothing about cause and
effect. As we will see, causation is a much more complex matter.
II. Correlation versus Causation
Causation = cause and effect; we're talking about when one thing will tend, other things equal, to
result in another thing. It is often very difficult to identify a true causal relationship.
Example: RadioLab podcast on Secrets of Success. What causes success? Is it innate ability?
Timing/opportunity? Love of the activity (motivation)? Practice?
Things to consider: (1) Maybe there is not just one answer. Multiple factors contribute
to success. Some may have one cause, others a different cause. (2) Some factors may
only indirectly cause the outcome. E.g., motivation might matter only because it affects
practice. (3) Some factors may interact with each other. E.g., the effectiveness of
practice may depend on innate talent. E.g., the use of talent may be dependent on some
degree of luck. (4) Some things may have both a direct and indirect effect. E.g., talent
has both a direct effect by making you just better, and an indirect effect by increasing
your motivation (people like to do what they're good at).
Example: Republicans report having more satisfying sex lives than Democrats. According to an
ABC News poll (http://abcnews.go.com/Primetime/News/story?id=180291), Republicans are
more likely to report being very satisfied with their sex life than Democrats (by a margin of 56%
to 47%). This is true even if you control for being in a committed relationship (87% versus
76%), so it's not just that Republicans are more likely to be in such relationships. What's going
on? Does this mean being a Republican causes better sex lives? [Good for demonstrating
C→A & B. Turns out men are both more likely to be Republican and more likely to be happy
with their sex lives.]
Example: People who have had more sex partners are more likely to get divorced. Does this
mean having more sex partners causes divorce?
(http://agoraphilia.blogspot.com/2007/02/sexual-correlation-and-causation.html)
[Good for demonstrating C→A & B and also B→A. One possibility is that possession of
conservative values results in both fewer divorces and fewer sex partners. Another possibility is
that getting married tends to cause fewer sex partners (because you stop adding more), while
getting divorced tends to cause more sex partners (because you start adding them again).]
Example: Is the President responsible for the economy's performance on his watch? Obviously,
the President has some influence on economic policy. But there are lots of confounding factors.
(a) Effects can be lagged in time, so that some economic effects are the responsibility of the
previous president. (b) Business cycles can be driven by non-political factors, such as changes in
underlying conditions in the economy. (c) A president might get voted out of office because people
think he's responsible for the recession, and as a result the new president comes in just as the
economy is recovering.
Simplest form of causation: A→B. That is, when A happens, that means B will also happen.
We say A causes B to happen. When we observe a correlation between A and B, people will
often reach the conclusion that A→B. But there are many other possibilities:
1. B→A; call this reverse causation.
2. C→A & B; call this external causation.
3. A→B & C→B; A and C each independently cause B; call this multiple causation.
4. (A&C)→B; A and C together cause B; call this joint causation.
5. A→C→B; call this indirect causation.
6. C→A→B; this is also indirect causation, but with a different order of events.
7. A unrelated to B; we call this a coincidence. B happened for unrelated reasons.
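The external causation case (a third factor C causing both A and B) can be illustrated with a small simulation. This is only a sketch under made-up assumptions: C, A, and B are random numbers, A and B are each built from C plus independent noise, and neither A nor B has any effect on the other. The helper `pearson_r` and all variable names are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# C is the outside factor; A and B each depend on C but not on each other.
c = [rng.gauss(0, 1) for _ in range(n)]
a = [ci + rng.gauss(0, 1) for ci in c]
b = [ci + rng.gauss(0, 1) for ci in c]

# A and B are clearly correlated even though neither causes the other.
r_ab = pearson_r(a, b)

# "Controlling for C": remove C's contribution and correlate what is left.
a_without_c = [ai - ci for ai, ci in zip(a, c)]
b_without_c = [bi - ci for bi, ci in zip(b, c)]
r_ab_controlled = pearson_r(a_without_c, b_without_c)
```

Under these assumptions the correlation between A and B comes out around 0.5, while the correlation left after removing C is close to zero. In real data C is rarely known this exactly, so controlling for it is much harder than the subtraction used here.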