Many statistical tests calculate correlation between variables. A few go further and try to assess the likelihood of a true causal relationship; examples are the Granger causality test and convergent cross mapping. The counter assumption, that correlation proves causation, is considered a questionable cause logical fallacy in that two events occurring together are taken to have a cause-and-effect relationship. This fallacy is also known as cum hoc ergo propter hoc, Latin for "with this, therefore because of this", and "false cause". A similar fallacy, that an event that follows another was necessarily a consequence of the first event, is sometimes described as post hoc ergo propter hoc (Latin for "after this, therefore because of this").
In a widely studied example, numerous epidemiological studies showed that women who were taking combined hormone replacement therapy (HRT) also had a lower-than-average incidence of coronary heart disease (CHD), leading doctors to propose that HRT was protective against CHD. But randomized controlled trials showed that HRT caused a small but statistically significant increase in risk of CHD. Re-analysis of the data from the epidemiological studies showed that women undertaking HRT were more likely to be from higher socio-economic groups (ABC1), with better-than-average diet and exercise regimens. The use of HRT and decreased incidence of coronary heart disease were coincident effects of a common cause (i.e. the benefits associated with a higher socioeconomic status), rather than cause and effect, as had been supposed.[3]
As with any logical fallacy, identifying that the reasoning behind an argument is flawed does not imply that the resulting conclusion is false. In the instance above, if the trials had found that hormone replacement therapy caused a decrease in coronary heart disease, but not to the degree suggested by the epidemiological studies, the assumption of causality would have been correct, although the logic behind the assumption would still have been flawed.
Usage
In logic, the technical use of the word "implies" means "to be a sufficient circumstance." This is the meaning intended by statisticians when they say causation is not certain. Indeed, "p implies q" has the technical meaning of logical implication: "if p then q", symbolized as p → q. That is, "if circumstance p is true, then q necessarily follows." In this sense, it is always correct to say "Correlation does not imply causation."
However, in casual use, the word "imply" loosely means suggests rather than requires. The idea that correlation and causation are connected is certainly true; where there is causation, there is likely to be correlation. Indeed, correlation is used when inferring causation; the important point is that such inferences are made after correlations are confirmed to be real and all causal relationships are systematically explored using large enough data sets. Edward Tufte, in a criticism of the brevity of "correlation does not imply causation," deprecates the use of "is" to relate correlation and causation (as in "Correlation is not causation"), calling it inaccurate because it is incomplete.[4][1] While it is not the case that correlation is causation, simply stating their nonequivalence omits information about their relationship. Tufte suggests that the shortest true statement that can be made about causality and correlation is one of the following:
"Empirically observed covariation is a necessary but not sufficient condition for causality."
"Correlation is not causation but it sure is a hint."
General pattern
For any two correlated events A and B, the following relationships are possible:
A causes B;
B causes A;
A and B are consequences of a common cause, but do not cause each other;
There is no connection between A and B; the correlation is coincidental.
Less clear-cut correlations are also possible. For example, causality is not necessarily one-way; in a predator-prey relationship, predator numbers affect prey, but prey numbers, i.e. food supply, also affect predators.
The cum hoc ergo propter hoc logical fallacy can be expressed as follows:
1. A occurs in correlation with B.
2. Therefore, A causes B.
In this type of logical fallacy, one makes a premature conclusion about causality after observing only a correlation between two or more factors. Generally, if one factor (A) is observed only to be correlated with another factor (B), it is sometimes taken for granted that A is causing B, even when no evidence supports it. This is a logical fallacy because there are at least five possibilities:
1. A may be the cause of B.
2. B may be the cause of A.
3. Some unknown third factor C may actually be the cause of both A and B.
4. There may be a combination of the above three relationships. For example, B may be the cause of A at the same time as A is the cause of B (contradicting that the only relationship between A and B is that A causes B). This describes a self-reinforcing system.
5. The "relationship" is a coincidence, or so complex or indirect that it is more effectively called a coincidence (i.e. two events occurring at the same time that have no direct relationship to each other besides the fact that they are occurring at the same time). A larger sample size helps to reduce the chance of a coincidence, unless there is a systematic error in the experiment.
In other words, no conclusion can be made regarding the existence or the direction of a cause-and-effect relationship only from the fact that A and B are correlated. Determining whether there is an actual cause-and-effect relationship requires further investigation, even when the relationship between A and B is statistically significant, a large effect size is observed, or a large part of the variance is explained.
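The third possibility, an unobserved common cause, can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the variables A, B and C are synthetic, the noise is Gaussian, and the helper names `pearson` and `residuals` are hypothetical, not taken from any library. By construction A and B never influence each other, yet they correlate (the theoretical value is 0.5 for this setup), and the correlation largely disappears once the linear effect of C is removed from both.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def residuals(ys, xs):
    """Residuals of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
c = [rng.gauss(0, 1) for _ in range(n)]   # assumed common cause C
a = [ci + rng.gauss(0, 1) for ci in c]    # A depends on C only
b = [ci + rng.gauss(0, 1) for ci in c]    # B depends on C only

r_ab = pearson(a, b)
# Remove the linear effect of C from both variables, then correlate what is left.
r_ab_given_c = pearson(residuals(a, c), residuals(b, c))

print(f"corr(A, B)               = {r_ab:.2f}")
print(f"corr(A, B) with C removed = {r_ab_given_c:.2f}")
```

The same raw correlation would also appear if A caused B or B caused A, which is why the correlation by itself cannot distinguish between the five possibilities.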
In this example, the correlation between the number of firemen at a scene and the size of the fire does not imply that the firemen cause the fire. Firemen are sent according to the severity of the fire, and if there is a large fire, a greater number of firemen are sent; it is rather the fire that causes firemen to arrive at the scene. So the above conclusion is false.
Example 2
The faster windmills are observed to rotate, the more wind is observed. Therefore wind is caused by the rotation of windmills. (Or, simply put: windmills, as their name indicates, are machines used to produce wind.)
In this example, the correlation (simultaneity) between windmill activity and wind velocity does not imply that wind is caused by windmills. It is rather the other way around, as suggested by the fact that wind doesn't need windmills to exist, while windmills need wind to rotate. Wind can be observed in places where there are no windmills or only non-rotating windmills. And there are good reasons to believe that wind existed before the invention of windmills.
with other factors) to show that there is a direct correlation between the two properties. For a fixed volume and mass of gas, an increase in temperature will cause an increase in pressure; likewise, increased pressure will cause an increase in temperature. This demonstrates bidirectional causation. The conclusion that pressure causes temperature is true but is not logically guaranteed by the premise.
State University did not find that infants sleeping with the light on caused the development of myopia. It did find a strong link between parental myopia and the development of child myopia, also noting that myopic parents were more likely to leave a light on in their children's bedroom.[7][8][9][10] In this case, the cause of both conditions is parental myopia, and the above-stated conclusion is false.
Example 3
As ice cream sales increase, the rate of drowning deaths increases sharply. Therefore, ice cream consumption causes drowning.
The aforementioned example fails to recognize the importance of time and temperature in relationship to ice cream sales. Ice cream is sold during the hot summer months at a much greater rate than during colder times, and it is during these hot summer months that people are more likely to engage in activities involving water, such as swimming. The increased drowning deaths are simply caused by more exposure to water-based activities, not ice cream. The stated conclusion is false.
Example 4
A hypothetical study shows a relationship between test anxiety scores and shyness scores, with a statistical r value (strength of correlation) of +.59.[11]
Therefore, it may be simply concluded that shyness, in some part, causally influences test anxiety. However, as encountered in many psychological studies, another variable, a "self-consciousness score," is discovered which has a sharper correlation (+.73) with shyness. This suggests a possible "third variable" problem; however, when three such closely related measures are found, it further suggests that each may have bidirectional tendencies (see "bidirectional variable," above), being a cluster of correlated values each influencing one another to some extent. Therefore, the simple conclusion above may be false.
Example 5
Since the 1950s, both the atmospheric CO2 level and obesity levels have increased sharply. Hence, atmospheric CO2 causes obesity.
Richer populations tend to eat more food and consume more energy.
Example 6
HDL ("good") cholesterol is negatively correlated with incidence of heart attack. Therefore, taking medication to raise HDL will decrease the chance of having a heart attack.
Further research
[12]
Instead, it may be that other underlying factors, like genes, diet and exercise, affect both HDL levels and the likelihood of having a heart attack; it is possible that medicines may affect the directly measurable factor, HDL levels, without affecting the chance of heart attack.
correlations, where the statistical object is a group of persons (e.g. an ethnic group), does not show the same behaviour as individual correlations, where the objects of inquiry are individuals: "The relation between ecological and individual correlations which is discussed in this paper provides a definite answer as to whether ecological correlations can validly be used as substitutes for individual correlations. They cannot." (...) "(a)n ecological correlation is almost certainly not equal to its corresponding individual correlation."
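The gap between ecological and individual correlations can be made concrete with synthetic data. The following is a minimal sketch, not a reconstruction of Robinson's analysis: the 20 groups, their centres, the noise level and the `pearson` helper are all assumptions chosen for illustration. Each individual's two scores share only a group-level component, so the correlation of group averages comes out far stronger than the correlation measured across individuals.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(3)  # fixed seed so the run is repeatable
n_groups, per_group = 20, 200

all_x, all_y = [], []      # individual-level observations, pooled
mean_x, mean_y = [], []    # one pair of averages per group

for g in range(n_groups):
    center = 0.25 * g  # assumed group-level component shared by x and y
    gx = [center + rng.gauss(0, 2) for _ in range(per_group)]
    gy = [center + rng.gauss(0, 2) for _ in range(per_group)]
    all_x += gx
    all_y += gy
    mean_x.append(sum(gx) / per_group)
    mean_y.append(sum(gy) / per_group)

individual_r = pearson(all_x, all_y)   # objects of inquiry: individuals
ecological_r = pearson(mean_x, mean_y)  # objects of inquiry: groups

print(f"individual correlation = {individual_r:.2f}")
print(f"ecological correlation = {ecological_r:.2f}")
```

Averaging within groups cancels most of the individual-level noise, which is why the group-level figure overstates how closely the two scores track each other for any one person.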
Determining causation
In academia
Main articles: Causality and Causality (physics)
The point of view that correlation implies causation may be regarded as a theory of causality which is somewhat inherent to the field of statistics. Within academia as a whole, the nature of causality is systematically investigated from several academic disciplines, including philosophy and physics. In academia, there is a significant number of theories on causality; The Oxford Handbook of Causation (Beebee et al. 2009) runs to 770 pages. Among the more influential theories within philosophy are Aristotle's Four causes and Al-Ghazali's occasionalism.[14] David Hume argued that causality is based on experience, and experience similarly based on the assumption that the future models the past, which in turn can only be based on experience, leading to circular logic. In conclusion, he asserted that causality is not based on actual reasoning: only correlation can actually be perceived.[15]
Immanuel
principle according to which every event has a cause, or follows according to a causal law, cannot be established through induction as a purely empirical claim, since it would then lack strict universality, or necessity".[16]
Outside the field of philosophy, theories of causation can be identified in classical mechanics, statistical mechanics, quantum mechanics, spacetime theories, biology, social sciences, and law.[17]
causal within physics, it is normally understood that the cause and the effect must be connected through a local mechanism (cf. for instance the concept of impact) or a nonlocal mechanism (cf. the concept of field), in accordance with known laws of nature. From the point of view of thermodynamics, universal properties of causes as compared to effects have been identified through the Second law of thermodynamics, confirming the ancient, medieval and Cartesian[18] view that "the cause is greater than the effect" for the particular case of thermodynamic free energy. This, in turn, would appear to be challenged by popular interpretations of the concepts of nonlinear system and butterfly effect, in which small causes are regarded as able to cause large effects due to, respectively, unpredictability and an unlikely triggering of large amounts of potential energy.
A major goal of scientific experiments and statistical methods is to approximate, as closely as possible, the counterfactual state of the world.[20]
run an experiment on identical twins who were known to consistently get the same grades on their tests. One twin is sent to study for six hours while the other is sent to the amusement park. If their test scores suddenly diverged by a large degree, this would be strong evidence that studying (or going to the amusement park) had a causal effect on test scores. In this case, correlation between studying and test scores would almost certainly imply causation.
Well-designed experimental studies replace equality of individuals, as in the previous example, by equality of groups. This is achieved by randomization of the subjects to two or more groups. Although not a perfect system, the likelihood of the groups being equal in all aspects rises with the number of subjects placed randomly in the treatment/placebo groups. The statistical significance of the difference between the effect of the treatment and that of the placebo indicates how strongly the data support a causal effect of the treatment on the disease. This is commonly summarized by the P-value: the probability of observing a difference at least as large if the treatment had no effect.
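Randomization and the resulting P-value can be sketched with a permutation test. This is an illustrative example under stated assumptions rather than a recommended analysis: the 200 subjects, their baseline outcomes, the treatment effect `EFFECT` and the function name `permutation_p_value` are all hypothetical. The test repeatedly reshuffles the group labels to see how often chance alone produces a difference in means as large as the observed one.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(treated, control, n_perm=2000, seed=1):
    """Two-sided permutation P-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    k = len(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel subjects as if treatment had no effect
        if abs(mean(pooled[:k]) - mean(pooled[k:])) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


rng = random.Random(42)  # fixed seed so the run is repeatable
subjects = [rng.gauss(50, 10) for _ in range(200)]  # assumed baseline outcomes
rng.shuffle(subjects)                               # random assignment to groups

EFFECT = 8.0  # assumed true causal effect of the treatment
treated = [s + EFFECT for s in subjects[:100]]
control = subjects[100:]

p = permutation_p_value(treated, control)
print(f"difference in means = {mean(treated) - mean(control):.2f}")
print(f"permutation P-value = {p:.4f}")
```

Because assignment is random, the two groups differ systematically only in the treatment, so a small P-value here points to the treatment itself rather than to some hidden common cause.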
causation due to the presence of bidirectional causation) can be avoided by using explanators (regressors) that are necessarily exogenous, such as physical explanators like rainfall amount (as a determinant of, say, futures prices), lagged variables whose values were determined before the dependent variable's value was determined, instrumental variables for the explanators (chosen based on their known exogeneity), etc. See Causality#Statistics and Economics. Spurious correlation due to mutual influence from a third, common, causative variable is harder to avoid: the model must be specified such that there is a theoretical reason to believe that no such underlying causative variable has been omitted from the model. In particular, underlying time trends of both the dependent variable and the independent (potentially causative) variable must be controlled for by including time as another independent variable.
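The effect of controlling for a time trend can be shown with two synthetic series that share nothing but an upward drift. The sketch below rests on assumptions made for illustration: the trend coefficients, the noise level and the helper names `slope` and `detrend` are hypothetical. Instead of fitting a two-regressor model directly, it removes the linear time trend from both series and regresses the remainders on each other, which by the Frisch-Waugh-Lovell theorem gives the same coefficient on x as a regression of y on x and time together.

```python
import random


def slope(ys, xs):
    """Least-squares slope of ys regressed on xs (with an intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


def detrend(series, t):
    """Residuals of series after removing a fitted linear trend in t."""
    b = slope(series, t)
    n = len(t)
    mt, ms = sum(t) / n, sum(series) / n
    return [(s - ms) - b * (ti - mt) for s, ti in zip(series, t)]


rng = random.Random(7)  # fixed seed so the run is repeatable
t = list(range(500))
# Two causally unrelated series that both happen to trend upward over time.
x = [0.5 * ti + rng.gauss(0, 5) for ti in t]
y = [0.3 * ti + rng.gauss(0, 5) for ti in t]

naive_slope = slope(y, x)                               # time omitted
controlled_slope = slope(detrend(y, t), detrend(x, t))  # time controlled for

print(f"slope of y on x, time omitted    = {naive_slope:.2f}")
print(f"slope of y on x, time controlled = {controlled_slope:.2f}")
```

With time omitted, x appears to explain y; once the shared trend is accounted for, the apparent effect shrinks toward zero.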
Scientists are careful to point out that correlation does not necessarily mean causation. The assumption that A causes B simply because A correlates with B is often not accepted as a legitimate form of argument. However, sometimes people commit the opposite fallacy: dismissing correlation entirely, as if it does not suggest causation. This would dismiss a large swath of important scientific evidence.[21]
In conclusion, correlation is a valuable type of scientific evidence in fields such as medicine, psychology, and sociology. But first correlations must be confirmed as real, and then every possible causative relationship must be systematically explored. In the end correlation can be used as powerful evidence for a cause and effect relationship between a treatment and benefit, a risk factor and a disease, or a social or economic factor and various outcomes. But it is also one of the most abused types of evidence, because it is easy and even tempting to come to premature conclusions based upon the preliminary appearance of a correlation.
See also
Affirming the consequent
Chain reaction
Confirmation bias
Confounding
Design of experiments
Domino effect
Ecological fallacy
Four causes
Mierscheid Law
Normally distributed and uncorrelated does not imply independent
References
1. PowerPoint: Pitching Out Corrupts Within. Cheshire, Connecticut: Graphics Press. p. 5. ISBN 0-9613921-5-0.
2. Aldrich, John (1995). "Correlations Genuine and Spurious in Pearson and Yule". Statistical Science 10 (4): 364-376. doi:10.1214/ss/1177009870. JSTOR 2246135.
3. Lawlor DA, Davey Smith G, Ebrahim S (June 2004). "Commentary: the hormone replacement-coronary heart disease conundrum: is this the death of observational epidemiology?". Int J Epidemiol 33 (3): 464-7. doi:10.1093/ije/dyh124. PMID 15166201.
4. Tufte, Edward R. (2003). The Cognitive Style of PowerPoint. Cheshire, Connecticut: Graphics Press. p. 4. ISBN 0-9613921-5-0.
5. Quinn GE, Shin CH, Maguire MG, Stone RA (May 1999). "Myopia and ambient lighting at night". Nature 399 (6732): 113-4. doi:10.1038/20094. PMID 10335839.
6.
7. Ohio State University Research News, March 9, 2000. "Night lights don't lead to nearsightedness, study suggests".
8. Zadnik K, Jones LA, Irvin BC, et al. (March 2000). "Myopia and ambient night-time lighting". Nature 404 (6774): 143-4. doi:10.1038/35004661. PMID 10724157.
9. Gwiazda J, Ong E, Held R, Thorn F (March 2000). "Myopia and ambient night-time lighting". Nature 404 (6774): 144. doi:10.1038/35004663. PMID 10724158.
10. Stone, J; et al., E; Held, R; Thorn, F (March 2000). "Myopia and ambient night-time lighting". Nature 404 (6774): 144. doi:10.1038/35004665.
11. Carducci, Bernard J. The Psychology of Personality: Viewpoints, Research, and Applications. 2nd Edition. Wiley-Blackwell: UK, 2009.
12. Ornish, Dean. "Cholesterol: The good, the bad, and the truth" (retrieved 3 June 2011).
13. Robinson, W.S. (1950). "Ecological Correlations and the Behavior of Individuals". American Sociological Review 15 (3): 351-357. doi:10.2307/2087176. JSTOR 2087176.
14. Beebee et al. 2009.
15. David Hume (Stanford Encyclopedia of Philosophy).
16. Beebee et al. 2009.
17. Beebee et al. 2009.
18. Lloyd, A.C., "The principle that the cause is greater than its effect", Phronesis 21(2), 1976.
19. Paul W. Holland. 1986. "Statistics and Causal Inference". Journal of the American Statistical Association, Vol. 81, No. 396 (Dec., 1986), pp. 945-960.
20. Judea Pearl. 2000. Causality: Models, Reasoning, and Inference. Cambridge University Press.
21.
External links
"The Art and Science of cause and effect": a slide show and tutorial lecture by Judea Pearl
Causal inference in statistics: An overview, by Judea Pearl (September 2009)