Causal Inference: Overview

Jennifer Hill, Steinhardt School of Culture, Education, and Human Development, New York University, New York, NY, USA
Elizabeth A Stuart, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
© 2015 Elsevier Ltd. All rights reserved.

Abstract

This article discusses causal inference in statistics. It describes the theoretical framework and notation needed to formally
define causal effects and the assumptions required to identify them nonparametrically. This involves definition of potential
outcomes that represent the potential value of the outcome across different treatment exposures. Designs that allow
researchers to satisfy or weaken these assumptions are briefly described. Then common parametric assumptions used to
model effects and more current approaches that require weaker assumptions are discussed.

International Encyclopedia of the Social & Behavioral Sciences, 2nd edition, Volume 3 http://dx.doi.org/10.1016/B978-0-08-097086-8.42095-7

Introduction: Causal Inference as a Comparison of Potential Outcomes

Causal inference refers to an intellectual discipline that considers the assumptions, study designs, and estimation strategies that allow researchers to draw causal conclusions based on data. As detailed below, the term ‘causal conclusion’ used here refers to a conclusion regarding the effect of a causal variable (often referred to as the ‘treatment’ under a broad conception of the word) on some outcome(s) of interest. The dominant perspective on causal inference in statistics has philosophical underpinnings that rely on consideration of counterfactual states. In particular, it considers the outcomes that could manifest given exposure to each of a set of treatment conditions. Causal effects are defined as comparisons between these ‘potential outcomes.’ For instance, the causal effect of a drug on systolic blood pressure 1 month after the drug regime has begun (vs no exposure to the drug) would be defined as a comparison of systolic blood pressure that would be measured at this time given exposure to the drug with the systolic blood pressure that would be measured at the same point in time in the absence of exposure to the drug. The challenge for causal inference is that we are not generally able to observe both of these states: at the point in time when we are measuring the outcomes, each individual either has had drug exposure or has not.

Notation for Potential Outcomes and Causal Effects

There is a long history of philosophers, statisticians, and economists thinking about causality and how to establish causal relationships between variables. This philosophical discussion continues to this day, with many different perspectives still being expressed (see, e.g., Pearl, 2009; Dowd, 2011). Statisticians have generally focused on the investigation of the extent to which a given ‘cause’ leads to a change in some outcome(s) of interest. This is the framework detailed in this article. More general discussions of causal relationships, such as those investigated through structural equation models or path analysis, or those that attempt to determine a (or the) cause of an effect, typically require much stronger assumptions than do approaches that focus on estimating specific causal effects (Holland, 1986; VanderWeele, 2012).

To formalize assumptions and models in this framework, traditional statistical notation has been extended to explicitly define the ‘potential outcomes’ described above (Rubin, 1978). For instance, for an individual, $i$, and a binary treatment, $Z_i$, two outcomes can be defined that have the potential to be observed at a given time point, $Y_i(Z_i = 0) = Y_i(0)$ and $Y_i(Z_i = 1) = Y_i(1)$. In common language, these can be described as the outcome that would be observed if individual $i$ had not received the treatment ($Z_i = 0$) and the outcome that would be observed if individual $i$ had received the treatment ($Z_i = 1$), respectively. In the context of the example of the effect of a drug on blood pressure, $Y_i(0)$ would reflect the systolic blood pressure at a particular point in time for individual $i$ observed in the absence of the drug and $Y_i(1)$ would reflect the systolic blood pressure observed at that same point in time for individual $i$ after treatment with the drug. One benefit of the potential outcome notation is that it allows for a simple and clear definition of the individual-level causal effect for observation $i$ as a comparison between these two potential outcomes, for instance, $Y_i(1) - Y_i(0)$. (Nonlinear comparisons of potential outcomes, such as those used to define an odds ratio, can also be considered; for simplicity this article focuses on the simple difference, in part because of complications that arise with noncollapsible functions of the potential outcomes; see, e.g., Greenland et al., 1999; Austin, 2007.) This formalization makes explicit the difficulty in inferring causality at the individual level given that we can never observe both $Y_i(1)$ and $Y_i(0)$ for the same individual, at any given time point; Holland (1986) refers to this as the ‘fundamental problem of causal inference.’ Using this potential outcomes framework to define causal effects also allows us to define those effects nonparametrically, and creates a useful distinction between structural assumptions and modeling assumptions (discussed below).

While it is nearly impossible to empirically identify individual-level causal effects without very strong assumptions or very specialized settings (for instance, in a laboratory where conditions can be precisely controlled), it is possible to make progress with respect to estimating average causal effects. The most common such estimand is the average of individual-level causal effects over a population or sample of interest, expressed generally as $E[Y(1) - Y(0)] = (1/N) \sum_i (Y_i(1) - Y_i(0))$. When this average is taken over the full population of interest, it is
referred to as the population average treatment effect. When this average is taken over the analysis sample it is referred to as the sample average treatment effect (SATE).

There is also sometimes interest in causal effects for particular subgroups. As a simple example, researchers may be interested in the SATE for males as distinct from the SATE for females. A slightly more complex example is sometimes referred to as the ‘average treatment effect on the treated’ (ATT). This is the average treatment effect for those in the treatment group, and is defined as $E[Y(1) - Y(0) \mid Z = 1] = (1/N_t) \sum_{i: Z_i = 1} (Y_i(1) - Y_i(0))$, where $N_t$ is the number of treated units. (An analogous quantity called the sample average treatment effect on the controls (SATC) can also be defined.) Researchers are sometimes interested in the ATT in settings where there is no thought of giving the treatment to the full population, and rather interest is in the effect of the treatment on the people who actually received it. This might be the relevant quantity, for instance, in studies of the effects of adolescent drug use, where there is no thought of ‘imposing’ drug use on all adolescents. The sample average treatment effect on the treated (SATT) in such a study would compare the average outcomes for drug users with the potential outcomes for the same group had they not used drugs. This could inform the potential effect of programs that help to prevent or terminate drug use in teenagers. The SATE, on the other hand, would represent the difference between the average potential outcomes if all adolescents in the study were heavy drug users versus the potential outcomes that would manifest if none of the adolescents in the study were heavy drug users. This would not be the relevant scientific quantity given that some portion of that sample would likely never choose to use drugs, so the effect for these people is rather meaningless. See Imai et al. (2008) and Harder et al. (2010) for more discussion of possible estimands.

To illustrate some of these concepts, consider the hypothetical data in Table 1. These data reflect 14 participants in a study that aimed to estimate the effect of a job training program on hourly wages. The table presents an omniscient view of the data that the researcher could never see in entirety because the $Y(0)$ cells for those in the treatment group and the $Y(1)$ cells for those in the control group are complete. The column for $Y$ denotes the outcome that is observed for each person: $Y = \text{Treat} \cdot Y(1) + (1 - \text{Treat}) \cdot Y(0)$. Seeing this complete data set with all of the potential outcomes observed is of course not possible in the real world, but, for illustration, it allows us to see the causal effects for each individual and, by extension, to calculate several different sample estimands. The treatment effects vary in this example depending on whether the participant had received a high school education (HS). The subgroups without high school education (HS = 0) are all defined to have treatment effects of 4. On the other hand, those with a high school education would experience no gain from the job training (for each, $Y(0) = Y(1)$). Since the distribution of those with high school education varies across the treatment groups, the average treatment effects vary as well. The SATT is equal to 2.67 and the SATC is equal to 1.

Table 1 Data from a hypothetical job training study. Data on high school graduation status (HS), age, and hourly wage (Y) could be observed by the researcher. The full set of potential outcomes could not be observed. They are displayed here to aid in conceptualization of individual-level causal effects as well as several sample average estimands.

Person  Treat (Z)  HS  Age  Y(0)  Y(1)  Y
1       1          0   25   12    16    16
2       1          0   22   11    15    15
3       1          0   21   10    14    14
4       1          0   23   11    15    15
5       1          1   22   17    17    17
6       1          1   24   19    19    19
7       0          0   20   10    14    10
8       0          0   24   12    16    12
9       0          1   22   17    17    17
10      0          1   22   16    16    16
11      0          1   26   19    19    19
12      0          1   20   18    18    18
13      0          1   28   19    19    19
14      0          1   21   17    17    17

Structural Assumptions Required for Nonparametric Identification

If we could observe all of the potential outcomes for our sample, then causal inference would be easy. However, as discussed above, the fundamental problem of causal inference reflects the fact that at best we can observe only half of the potential outcomes, and thus need to rely on assumptions about those missing potential outcomes. In some sense this is what distinguishes causal inference from standard statistical inference, in that it (generally) requires untestable assumptions regarding the unobserved potential outcomes. This section discusses the structural assumptions needed to estimate causal effects, followed by a brief discussion of possible study designs and additional parametric assumptions.

Ignorability

Unconditional Ignorability

What assumption would allow us to identify average causal effects? To make the issues concrete, consider each piece of the SATE. Estimating $E[Y(1)]$ for the sample is challenging because we only observe $Y(1)$ for those observations that were assigned to receive $Z = 1$. If those observations assigned to $Z = 0$ have a different mean of $Y(1)$ than those who were assigned to $Z = 1$, then an estimate of the mean of the treated outcomes, $E[Y(1) \mid Z = 1]$, will be a biased estimate of $E[Y(1)]$. Similarly, estimating $E[Y(0)]$ is challenging because we only observe $Y(0)$ for those observations that were assigned to receive $Z = 0$. If those observations assigned to $Z = 1$ have a different mean of $Y(0)$ than those who were assigned to $Z = 0$, then an estimate of the mean of the control outcomes, $E[Y(0) \mid Z = 0]$, will be a biased estimate of $E[Y(0)]$. The most straightforward set of assumptions that ensures that a simple difference in means can provide an unbiased estimate of the average treatment effect relies on mean independence:

$$E[Y(0)] = E[Y(0) \mid Z = 0]$$

$$E[Y(1)] = E[Y(1) \mid Z = 1]$$
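As a rough illustration of the estimands and of how mean independence can fail, the following Python sketch recomputes the sample estimands from the hypothetical Table 1 data and compares them with a naive difference in observed means. The names used here (`rows`, `mean`, `naive_diff`, and so on) are illustrative assumptions rather than anything defined in the article, and the calculation is possible only because the hypothetical table lists both potential outcomes for every person.

```python
# Hypothetical Table 1 data, one tuple per person:
# (person, Z, HS, age, Y0, Y1), where Z = 1 means job training was received.
rows = [
    (1, 1, 0, 25, 12, 16),
    (2, 1, 0, 22, 11, 15),
    (3, 1, 0, 21, 10, 14),
    (4, 1, 0, 23, 11, 15),
    (5, 1, 1, 22, 17, 17),
    (6, 1, 1, 24, 19, 19),
    (7, 0, 0, 20, 10, 14),
    (8, 0, 0, 24, 12, 16),
    (9, 0, 1, 22, 17, 17),
    (10, 0, 1, 22, 16, 16),
    (11, 0, 1, 26, 19, 19),
    (12, 0, 1, 20, 18, 18),
    (13, 0, 1, 28, 19, 19),
    (14, 0, 1, 21, 17, 17),
]


def mean(values):
    """Arithmetic mean of a non-empty iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Individual-level effects Y_i(1) - Y_i(0); knowable only in this omniscient view.
effects = [(z, y1 - y0) for _, z, _, _, y0, y1 in rows]

sate = mean(e for _, e in effects)            # average over the whole sample
satt = mean(e for z, e in effects if z == 1)  # average over the treated
satc = mean(e for z, e in effects if z == 0)  # average over the controls

# What a researcher actually sees: Y = Z*Y(1) + (1 - Z)*Y(0).
observed = [(z, z * y1 + (1 - z) * y0) for _, z, _, _, y0, y1 in rows]
naive_diff = (mean(y for z, y in observed if z == 1)
              - mean(y for z, y in observed if z == 0))

# Mean independence for Y(0) would require these two group means to agree.
y0_treated = mean(y0 for _, z, _, _, y0, _ in rows if z == 1)
y0_control = mean(y0 for _, z, _, _, y0, _ in rows if z == 0)

print(f"SATE = {sate:.2f}, SATT = {satt:.2f}, SATC = {satc:.2f}")
print(f"Naive difference in observed means = {naive_diff:.2f}")
print(f"Mean Y(0): treated = {y0_treated:.2f}, control = {y0_control:.2f}")
```

In this hypothetical sample the naive comparison of observed means is 0 even though the SATE is about 1.71, because the treated group has a lower mean of Y(0) than the control group; mean independence does not hold for these data.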

Designs and estimation procedures discussed in a later section highlight the distinction between assumptions and estimation. However, it is worth mentioning at this point that a completely randomized experiment would automatically satisfy this mean independence assumption. In fact, it will satisfy an even stronger assumption, that the distributions of the potential outcomes are independent of the binary treatment $Z$: $Y(0), Y(1) \perp Z$. Because of the connection to this classic design in causal inference, this assumption, sometimes referred to as the ignorability assumption, is also referred to as the randomization assumption.

The other estimands discussed above require slightly weaker assumptions. For instance, the effect of the treatment on the treated can be decomposed into two pieces, $E[Y(1) \mid Z = 1]$ and $E[Y(0) \mid Z = 1]$. Since we observe $Y(1)$ for the treated ($E[Y(1) \mid Z = 1] = E[Y \mid Z = 1]$), it is straightforward to estimate $E[Y(1) \mid Z = 1]$. However, it is difficult to estimate $E[Y(0) \mid Z = 1]$, since we do not observe $Y(0)$ for any treated individuals. It only becomes straightforward if $E[Y(0) \mid Z = 1] = E[Y(0) \mid Z = 0]$. This would be a consequence of randomization of the treatment $Z$ as well since that design ensures that $Y(0) \perp Z$. Similarly, $Y(1) \perp Z$ would allow us to estimate the effect of the treatment on the controls. While these are weaker assumptions than are necessary for identification of the average treatment effect, it is difficult to find situations in which one assumption is clearly more plausible than the other.

Ignorability

Unless a randomized experiment has been performed, the above version of the ignorability assumption is typically not plausible because it implies that there are no systematic differences between the treatment and control groups with respect to the potential outcomes. Therefore, many causal inference methods rely on the more general version of the assumption above that conditions on observed covariates, $X$, formalized as $Y(0), Y(1) \perp Z \mid X$. In words, this assumption requires that the treatment and control group members that have the same values of $X$ are not systematically different with regard to their potential outcomes. Some authors refer to this formulation as ‘conditional ignorability’; however, this nomenclature is redundant since the standard formulation of the assumption allows for conditioning on covariates.

A simple example in which we would know that the ignorability assumption holds would be a randomized block experiment in which the blocks are defined by covariates, $X$. Observational studies differ in a crucial way from randomized block experiments, however, because no explicit randomization occurs; thus, we must take it on faith that the ignorability assumption holds.

How does ignorability help? When the ignorability assumption holds it implies the following properties:

$$E[E[Y \mid Z = 0, X]] = E[E[Y(0) \mid Z = 0, X]] = E[E[Y(0) \mid X]] = E[Y(0)]$$

and

$$E[E[Y \mid Z = 1, X]] = E[E[Y(1) \mid Z = 1, X]] = E[E[Y(1) \mid X]] = E[Y(1)]$$

This means that if ignorability holds (conditional on $X$) and we can appropriately condition on $X$, then all we have to do is average these conditional means over the distribution of $X$ and we can recover the marginal means of the potential outcomes. If $X$ is not low-dimensional, however, this conditioning can prove difficult in practice, which has led to use of some of the parametric, semiparametric, and sophisticated nonparametric approaches discussed below or in other articles.

Common Support

Satisfying ignorability ensures comparability (with regard to potential outcomes) across treatment and control groups for observations with the same values of the covariates. However, this is only helpful if there is common support across treatment groups with respect to the confounders. That is, there must be observations from both treatment groups for each neighborhood of the covariate space where we want to make inferences. If, for example, no control units exist for a given neighborhood of the covariate space where treated units exist, then, in essence, no ‘empirical counterfactuals’ exist for those treatment units and we cannot make causal inferences about them without further assumptions. For this reason, causal researchers often require an assumption of overlap or ‘common support’ when attempting to identify causal effects, typically formalized as $0 < \Pr(Z = 1 \mid X) < 1$. In a randomized experiment, this is operationalized by the fact that all individuals in the experiment have a positive probability of receiving either treatment condition; individuals who could not receive one of the treatments would be deemed ineligible for the study.

In practice, if researchers determine that this assumption is not satisfied, they might decide to perform inference only for observations in portions of the covariate space where both treated and control units exist. This may lead to a different causal estimand than originally intended and results must be interpreted accordingly (see, e.g., Crump et al., 2006). This trade-off in what is being estimated and how well it can be estimated is often of particular relevance in studies that use propensity score methods, discussed further in other articles on observational studies.

The determination of common support depends on the inferential goal. For instance, if the goal is to estimate the effect of the treatment on the treated, this would require that control observations exist in every neighborhood of the covariate space where treatment observations exist, but not vice versa. If there are parts of the covariate space where there are controls but no treated subjects, it does not matter for nonparametric identification of the ATT since the goal is not to make inferences about the controls anyway. This lack of overlap could influence parametric estimation, however.

Common support is the only standard causal inference assumption that is empirically verifiable. However, the usefulness of this property is hampered by two issues. The first is that researchers may condition on a large number of covariates in an attempt to satisfy ignorability and it becomes increasingly difficult to verify overlap as the dimensionality of the covariate space grows. A common approach to this problem is simply to examine overlap with regard to the propensity score (see Observational Studies: Overview). However, this requires correct specification of the propensity score model. The second issue is that even if the included covariates satisfy ignorability, they may represent a superset of the covariates
actually needed to satisfy ignorability. Researchers forcing common support with respect to all included covariates may make overly conservative judgments about which observations need to be discarded from an analysis. Alternatives exist that aim to satisfy common causal support, $0 < \Pr(Z = 1 \mid W) < 1$, which requires overlap only with respect to the subset of covariates needed to satisfy ignorability, $W$ (Hill and Su, 2013).

Stable Unit Treatment Value Assumption

The potential outcome notation introduced above has an assumption embedded in it. The potential outcomes for individual $i$ under treatment and control conditions, respectively, were defined as a function of the treatment exposure for individual $i$ alone, $Y_i(Z_i = 1) = Y_i(1)$ and $Y_i(Z_i = 0) = Y_i(0)$. This ignores the possibility that this outcome could be influenced by the treatments received by others. A more general formulation, $Y_i(\mathbf{Z})$, could define potential outcomes based on the full vector of treatments received by everyone in the study, $\mathbf{Z}$. Even with a modest sample size, such a formulation would increase the complexity of the research questions and subsequent analysis beyond a reasonable limit, not to mention the impossibility of ever achieving common support. For instance, in a study with a binary treatment and 10 people this formulation would lead to $2^{10} = 1024$ different potential outcomes for each person. The assumption that each person’s potential outcomes are only defined based on his or her own treatment assignment can be formalized by the following expression: $Y_i(\mathbf{Z}) = Y_i(\mathbf{Z}')$ whenever $Z_i = Z_i'$. This assumption is often discussed colloquially as the ‘no interference assumption’ to reflect the requirement that one individual’s potential outcomes cannot be changed by another person’s treatment. However, implicit in the stable unit treatment value assumption (SUTVA) is also the assumption that the treatment and control conditions mean the same thing for all units. In other words, there is only one ‘version’ of the treatment condition and one ‘version’ of the control condition (see Hernan and VanderWeele, 2011 for more thorough discussion of this assumption and its implications).

The most common strategy for avoiding concerns about SUTVA violations is through study designs that assign treatments at the group level. For example, many studies of educational interventions randomize whole schools to treatment conditions rather than randomizing individuals within schools, due to concerns that the treatment received by one individual may affect the outcomes of other individuals in the same school. This interference is less of a concern when considering schools as the units.

Temporality

Implicit in our understanding of causality is the notion that our causal variable (or treatment) occurs temporally prior to the outcome. Moreover, when conditioning on covariates it is crucial that they represent events that occur or characteristics measured prior to the causal (treatment) variable; otherwise they are implicitly outcomes and need to be conceived as such. Rosenbaum (1984) presents a discussion of this issue as do other authors (such as Gelman and Hill, 2007). Conditioning on variables that may be affected by the treatment (e.g., mediators) brings in additional complexities because they require consideration of the potential values of those posttreatment variables. A framework for thinking about posttreatment variables, called principal stratification, is discussed further in the article with that name. Other approaches to exploring causal pathways posttreatment are discussed in the article on Mediation.

Design Considerations

Causal researchers have carefully considered the types of research designs that facilitate satisfying the required assumptions. Generally, randomized experiments are seen as the strongest design for estimating causal effects since ignorability is assured through the randomization. Randomization also obviates the need to fit models that condition on pretreatment covariates and makes inferences from such models (when used, for instance, to increase efficiency) more robust to departures from the parametric assumptions. However, randomized experiments are not always possible, and can have their own complications that make inferences challenging, such as noncompliance and missing data (Barnard et al., 2003). A final concern with randomized experiments is that they often enroll subjects that are not representative of the population of interest, potentially making them less useful for estimating population treatment effects (Imai et al., 2008).

A number of strong nonexperimental study designs exist for settings when randomized experiments are infeasible or do not answer the question of interest. A common theme across nonexperimental studies is the need to design them with as much (or more) care as is used for randomized experiments (Rosenbaum, 1999; Rubin, 2008). Some nonexperimental designs are motivated by the desire to avoid or make more plausible the assumption of ignorability of the treatment assignment. These designs include regression discontinuity, interrupted time series, fixed effects, and instrumental variables approaches. They are discussed in more detail in other articles on Observational Studies.

Other nonexperimental study designs are motivated by the desire to avoid or weaken parametric assumptions by creating as much balance and overlap across treatment and control groups as possible. Matching and stratification (with or without propensity scores) are the most prominent designs in this genre. This type of ‘preprocessing’ (see Ho et al., 2007) can allow either for nonparametric inference on the newly constructed treatment and control groups (for instance, by comparing mean outcomes across matched treatment and control groups) or for more robust inference using standard parametric models (for instance, by fitting a regression on matched treatment and control groups). Treatment effect estimation is more robust to deviations from the parametric models when the treatment and control groups have common support and are well balanced (that is, the covariate distributions are similar across groups).

Nonexperimental studies of all types are ideally designed in ways that limit their sensitivity to unobserved confounding (violations of ignorability). Rosenbaum (2004) formalizes this
in a concept called ‘design sensitivity’ and Zubizarreta et al. (2013) provide one example of implementing these ideas in practice, such as comparing extreme values of a continuous treatment variable, matching on a large set of observed characteristics, and tying the statistical test used to the hypothesized pattern of effects. Gelman and Hill (2007) and Shadish et al. (2002) provide more information on nonexperimental study designs, and cite relevant articles on observational studies.

Parametric Assumptions

The focus of this article thus far has been on identification of causal effects through nonparametric approaches (comparisons of means or subgroup-specific means). It can be difficult to maintain a completely nonparametric approach in situations where it is necessary or beneficial to condition on covariates. For instance, if ignorability requires conditioning on covariates, then we must find a way to model the requisite conditional expectations. A straightforward approach to such conditioning is to fit a linear regression model to the observed data. In particular, if one assumes that $Y(0) = X\beta + \varepsilon_0$ and $Y(1) = X\beta + \tau + \varepsilon_1$, then regressing observed $Y$ on $X$ and $Z$ would be a reasonable approach for estimating $\tau$. However, we rarely believe that such simple additive and linear models hold in practice, and it can be difficult to diagnose deviations from these models when $X$ is high-dimensional or when there is lack of common support across treatment and control groups.

As discussed above, some designs (notably matching and stratification) are motivated by the desire to avoid these types of stringent parametric assumptions. However, an alternative (or complementary) approach is to simply use a semiparametric or nonparametric strategy when estimating causal effects. Semiparametric estimation strategies, including regression models that incorporate weights based on the propensity score, can be considered part of this paradigm (Hahn, 1998; Rosenbaum, 1987; Hernan et al., 2001; Kurth et al., 2006). A more liberal interpretation might also frame weighting strategies as observational ‘designs’ because the goal is to create a ‘pseudopopulation’ within which treatment and control groups are balanced. Nonparametric regression methods that can accommodate large numbers of covariates are typically Bayesian or computationally driven, or both (examples can be found in Hill, 2011 and Karabatsos and Walker, 2012).

Some causal inference approaches target the structural assumptions without considering the parametric assumptions. For instance, classic fixed effects models are often used in causal inference settings as a way of capturing group-level confounders that are not directly measured (see Hierarchical Models: Random and Fixed Effects). However, these models are traditionally embedded in a classical linear regression framework with all the parametric assumptions inherent in that framework. Moreover, there is no guarantee that such strategies will reduce rather than amplify bias.

Conclusions

This article has discussed the potential outcomes framework for defining and estimating causal effects, in particular determining the effects of treatments on outcomes. It has highlighted the key structural and parametric assumptions commonly made, and the implications of those assumptions in experimental and nonexperimental settings. For more information on the details of estimating causal effects please see the other associated articles in this volume.

See also: Experimental Design: Bayesian Designs; Experimental Design: Compliance; Experimental Design: Large-Scale Social Experimentation; Experimental Design: Overview; Hierarchical Models: Random and Fixed Effects; Instrumental Variables in Statistics and Econometrics; Latent Structure and Causal Variables; Longitudinal Causal Inference; Mediation, Statistical; Observational Studies: Overview; Semiparametric Models; Time Series: General.

Bibliography

Austin, P.C., 2007. The performance of different propensity score methods for estimating marginal odds ratios. Statistics in Medicine 26, 3078–3094.
Barnard, J., Frangakis, C.E., Hill, J.L., Rubin, D.B., 2003. Principal stratification approach to broken randomized experiments: a case study of school choice vouchers in New York City. Journal of the American Statistical Association 98 (462), 299–323.
Crump, R.K., Hotz, J.V., Imbens, G.W., Mitnik, O.A., 2006. Moving the Goalposts: Addressing Limited Overlap in the Estimation of Average Treatment Effects by Changing the Estimand. NBER Technical Working Paper No. 330.
Dowd, B.E., 2011. Separated at birth: statisticians, social scientists, and causality in health services research. Health Services Research 46 (2), 397–420.
Gelman, A., Hill, J., 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, Cambridge.
Greenland, S., Robins, J.M., Pearl, J., 1999. Confounding and collapsibility in causal inference. Statistical Science 14 (1), 29–46.
Hahn, J., 1998. On the role of the propensity score in efficient semiparametric estimation of average treatment effects. Econometrica 66, 315–322.
Harder, V.S., Stuart, E.A., Anthony, J., 2010. Propensity score techniques and the assessment of measured covariate balance to test causal associations in psychological research. Psychological Methods 15 (3), 234–249.
Hernan, M.A., VanderWeele, T.J., 2011. Compound treatments and transportability of causal inference. Epidemiology 22 (3), 368–377.
Hernan, M., Brumback, B., Robins, J., 2001. Marginal structural models to estimate the joint causal effect of nonrandomized treatments. Journal of the American Statistical Association 96 (454), 440–448.
Hill, J.L., 2011. Bayesian nonparametric modeling for causal inference. Journal of Computational and Graphical Statistics 20, 217–240.
Holland, P.W., 1986. Statistics and causal inference. Journal of the American Statistical Association 81 (396), 945–960.
Imai, K., King, G., Stuart, E.A., 2008. Misunderstandings between experimentalists and observationalists about causal inference. Journal of the Royal Statistical Society, Series A 171, 481–502.
Karabatsos, G., Walker, S.G., 2012. A Bayesian nonparametric causal model. Journal of Statistical Planning and Inference 142, 925–993.
Kurth, T., Walker, A.M., Glynn, R.J., Chan, K.A., Gaziano, J.M., Berger, K., Robins, J.M., 2006. Results of multivariable logistic regression, propensity matching, propensity adjustment, and propensity-based weighting under conditions of non-uniform effect. American Journal of Epidemiology 163 (3), 262–270.
Pearl, J., 2009. Causality: Models, Reasoning, and Inference, second ed. Cambridge University Press, New York.
Rosenbaum, P.R., 1987. Model-based direct adjustment. Journal of the American Statistical Association 82, 387–394.
Rosenbaum, P.R., 1999. Choice as an alternative to control in observational studies. Statistical Science 14 (3), 259–304.
Rosenbaum, P.R., 2004. Design sensitivity in observational studies. Biometrika 91 (1), 153–164.
Rubin, D.B., 1978. Bayesian inference for causal effects: the role of randomization. The Annals of Statistics 6, 34–58.
Rubin, D.B., 2008. For objective causal inference, design trumps analysis. The Annals of Applied Statistics 2 (3), 808–840.
Shadish, W.R., 2002. Revisiting field experimentation: field notes for the future. Psychological Methods 7, 3–18.
VanderWeele, T.J., 2012. Invited commentary: structural equation models and epidemiologic analysis. American Journal of Epidemiology 176 (7), 608–612.
Zubizarreta, J.R., Cerda, M., Rosenbaum, P.R., 2013. Effect of the 2010 Chilean earthquake on posttraumatic stress: reducing sensitivity to unmeasured bias through study design. Epidemiology 24 (1), 79–87.

Causation: Physical, Mental, and Social
Daniel Stoljar, Australian National University, Canberra, ACT, Australia
© 2015 Elsevier Ltd. All rights reserved.

Abstract

Many philosophers hold (what might be called) the causal hierarchy picture of the world. The picture is hierarchical because
it views social entities (institutions, groups, or corporations) as being composed of psychological entities (persons or
individuals), which are themselves composed of biological entities (cells), which are themselves composed of fundamental
physical entities (atoms and subatomic particles). The picture is causal because it assumes that causal statements are
potentially true at every level in the hierarchy. While attractive, the picture raises an important question about how any upper
level causal statements can be true. On the one hand, it seems part of the picture that upper level causal statements are true
only if corresponding lower level statements are, but on the other, if the corresponding lower level statements are true, there
seems no reason for supposing that the upper level statements are true in the first place. This article discusses the formulation
of the problem in contemporary philosophy and its relation to a different but superficially similar problem that was
historically raised against Cartesian dualism. The article also suggests that the problem emerges in the causal hierarchy picture
only from a mistaken interpretation of a common causal principle, usually called the exclusion principle.

International Encyclopedia of the Social & Behavioral Sciences, 2nd edition, Volume 3 http://dx.doi.org/10.1016/B978-0-08-097086-8.63006-4

We invoke the notion of causation to describe many different aspects of our experience and of the world. For example, in describing the physical domain, we might say the earthquake caused the building to collapse; in describing the psychological or mental domain, we might say paranoia was the cause of Oswald shooting Kennedy; and in describing the social domain, we might say excessive military spending caused instability in the Soviet Union. In addition, we often use causal notions to link these various domains together. For example, we might say her arm rose because she wanted to signal, linking the mental domain (wanting to signal) and physical domain (her arm rising); or social policy causes the degradation of the environment, linking the social domain (social policy) and the physical domain (the environment). In short, causation is a very general notion: it applies in and across the physical, mental, and social domains, and perhaps others too. According to many philosophers, however, there is an interesting and difficult class of problems that emerges when we focus on how causal statements in one domain – for example, the physical domain – interact with causal statements in others – for example, the social or mental domains. Problems in this class – which we will call problems of causation in multiple domains – are what this article is about (for causation in general, see Causes and Laws: Philosophical Aspects).

Cartesian Dualism and Gassendi’s Objection

The seventeenth-century philosopher and mathematician René Descartes famously held that, while the nature and behavior of nonhuman animals could be explained completely by the physical science of his day, various features of human beings – including in particular the fact that human thought and action are not under direct environmental control – seemed to place them beyond the reach of physical science. Descartes therefore endorsed dualism, the idea that every human being is a complex of a physical body located in space and subject to physical laws and an immaterial mind not located in space and not subject to physical laws. Descartes’ contemporary Gassendi famously objected that dualism made a mystery of how bodily actions and movements could be caused by immaterial psychological events: ‘How can there be effort directed against anything, or motion set up in it, unless there is mutual contact between what moves and is moved? And how can there be contact without a body?’ (Descartes et al., 1984).

Gassendi’s objection is an example of a problem of causation in multiple domains. It turns on the fact that there is causal contact between the mental and physical domains, something that is difficult to understand if Cartesian dualism is true. However, problems of this sort are by no means unique to the relationship between the mental and the physical, nor to Descartes’ specific account of that relationship, nor do they arise only from older ideas in science and philosophy. On the contrary, they arise from a general picture of the world, which is very common in contemporary intellectual culture: the causal hierarchy picture.

The Causal Hierarchy Picture

What sort of picture of the world does contemporary science present us with? According to one common view, science presents us with a twofold picture. One part of the picture is that the world is organized into a series of levels (cf. Oppenheim and Putnam, 1958; Schaffer, 2003). The most basic level (assuming there is a most basic level) is the physical level, which is described by the most basic science, physics. Arranged in a hierarchy on top of this level are other levels such as the chemical level, the biological level, the psychological level, and the social level, and each of these levels is described by nonbasic sciences (sometimes called special sciences), such as chemistry, biology, psychology, and sociology. To put things the other way around, the picture views social entities (institutions, groups, or corporations) as being composed of psychological entities (persons or individuals), which are themselves composed of biological entities (cells), which are themselves composed of chemical entities (molecules), and which are themselves composed of fundamental physical entities (atoms and subatomic particles).

Another part of the picture adds causation to the hierarchy, in two distinct ways. First, it is part of the picture that, since
causation is a very general notion, causal statements are true at every level. Thus, causal claims about the behavior of corporations or persons are, from the point of view of the picture, perfectly genuine and potentially true. Second, it is also part of the picture that every time a causal statement is true at some nonbasic level, there is a corresponding causal statement that is true at the basic level.

The intuitive idea behind this second point is that causal statements at higher levels require mechanisms at lower levels, and, ultimately, at the physical level. Thus, consider the statement, she raised her arm because she wanted to signal. According to the picture, this statement (if true) is made true by, and requires for its truth, a further statement drawn from a lower level, presumably a statement about neural activity and how that activity causes movement of the limbs. Or again, consider the statement, urbanization in Australia in the twentieth century resulted in a decline of religious practice. This statement (if true) is made true by, and requires for its truth, a further statement drawn from a lower level, presumably a statement about the thoughts and goals of the particular people who made up the relevant populations. In both cases, causal statements at upper levels require mechanisms at lower levels.

Many aspects of this causal hierarchy picture have been discussed and examined in recent philosophy of (natural and social) science (see Social Properties (Facts and Entities): Philosophical Aspects). But for our purposes, the important point is that it raises a problem of causation in multiple domains, and in fact a whole class of such problems. Here is a simple way of seeing the issue. Consider some causal statement S, which is putatively true at some nonbasic level N. Given the causal hierarchy picture, there will be another causal statement S*, which is true at the basic level B. However, if S* is true, and we know it is true given our picture of science, why should we suppose in addition that S is true? Intuitively, after all, all the causal work – all the pushing and shoving – is done at the basic level. But then it is mysterious why S – or any nonbasic causal claim – should be true.

While the issues are sometimes assimilated, the problem generated by the causal hierarchy picture is different from the problem that Gassendi raised for Descartes, in two respects. First, Gassendi’s objection flows from the fact that, for Descartes, the physical and the mental domains are so different in nature: one is in space, the other not. But the problem raised by the causal hierarchy picture does not have its source in the fact that the mental level is so different in nature from the physical level. Its source is rather the simple fact that upper level causal claims are distinct from (i.e., not identical with) lower level claims, together with the idea that every time a causal claim is true at some nonbasic level, a corresponding causal claim will be true at the basic level. Second, the problem generated by the causal hierarchy picture is not limited to the mental and physical domains, as was Gassendi’s objection. As we have seen, it arises for any nonbasic causal claim, whether in biology, psychology, or social theory. So long as one supposes that there are causal claims at these levels, a problem of causation in multiple domains can be raised.

We can now see the significance of the problem of causation in multiple domains. On the one hand, we have a picture of the world, the causal hierarchy picture, which is presented to us by science and therefore has a reasonable claim to be our best and most sophisticated picture of the world. On the other hand, the problem of causation in multiple domains apparently tells us that the picture cannot be right, because the picture itself makes it hard to see how any nonbasic causal statements can be true.

The Exclusion Argument

We have so far been concerned with presenting the problem generated by the causal hierarchy picture in an informal way. However, it is desirable to be a bit more explicit about the reasoning that leads to the problem. The crucial fact about this reasoning is that it exploits a principle about causation often called the exclusion principle (Kim, 1993, 1998; Yablo, 1992; Bennett, 2003). As we shall shortly see, the precise formulation of this principle is a matter of controversy, but a reasonable initial formulation is (E1):

(E1) If e1 causes e2, then there is no event e3 such that: (a) e3 is not identical to e1 and (b) e3 causes e2.

Intuitively, the idea here is that the discovery of a cause for some phenomenon usually means that other causes have been ruled out or excluded. For example, if a doctor tells you that the pain in your foot is caused by uric acid crystals in the joint of your big toe, you do not regard it as an open question whether the pain has some other cause – calcium crystals, for example.

With the exclusion principle in place, we can now be more explicit about the problem generated by the causal hierarchy picture. It is easiest to state the argument if we focus on a particular kind of causal statement, one in which a certain nonbasic event n causes a physical event p. (Later we will consider whether this assumption is misleading.) Thus, suppose that (1) is true:

(1) n causes p.

Given the causal hierarchy picture, it would seem that there must be a physical event p*, which also causes p; hence (2) is true:

(2) p* causes p.

Now, given that n and p* are from different domains, i.e., are drawn from different levels, we may further assume that

(3) n is distinct from (i.e., not identical to) p*.

However, the principle of exclusion – in the form of (E1) – says that, in general, no two distinct events can cause a single event. Hence,

(4) If n is distinct from p*, then it is not the case that both n and p* cause p.

But (1)–(4) are jointly contradictory: while any three of them might be true, all four cannot be true together. It follows (barring some subtle ambiguity) that at least one of (1)–(4) must be false. In other words, what the reasoning tells us is that the causal hierarchy picture together with the exclusion principle presents a contradiction. To solve the problem, one must
give up or modify either the exclusion principle or the causal hierarchy picture or both.

Possible Responses

The most common response to the exclusion argument is to modify the principle of exclusion. Before turning to that response, however, it is important to briefly consider some alternative possibilities.

Rejecting the Hierarchy Part of the Picture

One obvious response is to suggest that the causal hierarchy picture is a distorted picture of the world because it takes the metaphor of levels too literally. This objection might go as follows. While it is true that there are apparently lots of different sciences, this is only a methodological or pragmatic fact, which represents our limited epistemological access to the world. In fact, there is only a single science – physics – which genuinely explains the world. The other sciences are simply convenient descriptions, which, apart from their pragmatic usefulness, will or could be dispensed with in the future.

The main problem with this position is a feature of the causal hierarchy picture that so far has not been discussed: multiple realizability. Earlier we saw that according to the causal hierarchy picture, entities from upper levels are composed of entities from lower levels. But it seems an obvious empirical fact that the same entities from an upper level might be composed of many different entities at lower levels. Thus, the same corporation over time might be composed of different individuals. The same mental processes in different people might be made up of different neural processes – indeed, given science fiction possibilities, they might not even be made of neural processes at all. The lesson usually drawn from multiple realizability (as facts of this sort are called) is that entities from upper levels are genuinely distinct from lower level entities and, as a corollary, that the sciences dealing with upper level entities are genuinely distinct from lower level sciences (Fodor, 1974/1981; see Kim, 1993, for an opposing view). However, if this is the right lesson, one should take seriously the hierarchy part of the causal hierarchy picture.

It is important to note that drawing this lesson from multiple realizability does not represent a return to Cartesian dualism. One can see this if one introduces another feature of the causal hierarchy picture: supervenience. Philosophers often say that, in the causal hierarchy picture, the upper levels supervene on the basic level, where this means that, while the upper levels in the hierarchy are distinct from the basic level, they are nevertheless wholly determined by, or are entailed by, the basic level (cf. Kim, 1993). A simple way to think of supervenience is by switching metaphors from levels to patterns. To say that the psychological or the social supervene on the physical is to say that these are patterns in the physical – hence are wholly physical – it is simply that they are not patterns that physicists are interested in. To put the point a little more formally, supervenience tells us that if you imagine a possible world that completely matches the actual world in all physical respects, then you have ipso facto imagined a possible world that also matches the actual world in all psychological and social respects. But this idea is something that Descartes would have denied. According to Descartes, the psychological (and presumably the social also) is only contingently related to the physical, and thus a possible world that completely matches the actual world in physical respects is not necessarily a world that completely matches it in these other respects.

Rejecting the Causal Part of the Picture

If one cannot reject the hierarchy part of the causal hierarchy picture, can one reject the causal part? One way to develop this objection starts with the observation that it is not at all obvious that particular sciences make explicit use of causal notions. Bertrand Russell (1917/1963) famously claimed that physics had no use for causation. More recently, many have argued convincingly that causation has no place, or at least no central place, in the explanations offered by particular special sciences such as psychology, linguistics (Chomsky, 1959; Cummins, 1983), and anthropology (Sperber, 1996). So perhaps the proper account of the picture of the world that science presents is that it is hierarchical but not causal.

However, while it might very well be true that particular sciences do not provide explicitly causal explanations, this does not mean that various causal claims will not be true at the various levels. We can see this if we examine briefly the dispute between Chomsky and Skinner, surely one of the central moments of twentieth-century science (Chomsky, 1959). Skinner had proposed, among other things, that the basic aim of scientific psychology and linguistics should be to provide causal explanations of human behavior – functional analysis, Skinner called it. Part of Chomsky’s response to Skinner was that it is premature to suppose that scientific psychology could provide such explanations. He suggested instead that a more reasonable aim for scientific psychology would be the decomposition of certain psychological capacities and abilities, and in particular the ability on the part of a speaker to speak a language. I think it is fair to say that, in this regard, Chomsky’s suggestion has been enormously influential and that most cognitive psychologists follow him in not aiming at causal explanations of behavior. However, even if this is true, it also remains true both that there are truths about how behavior is caused, and that these truths are in all probability multiply realized. But then we have the problem of causation in multiple domains back again.

One can imagine a more radical development of this point, where the claim at issue is not that causation plays no role in the explanations offered by special sciences, but rather that no nonbasic causal claims are true. In fact, this idea has been historically popular in philosophy of mind and psychology. In part because of the influence of Wittgenstein, it was at one time very common to hear that causal talk in the psychological domain either should not be taken seriously or should be interpreted very differently from similar talk deployed in the physical domain (Ryle, 1949; for criticism, see Davidson, 1963/1981). But suggestions of this sort consistently underestimate the sense in which causation is an extremely abstract and general notion. W.V.O. Quine famously argued that the difference between the statements ‘cows exist’ and ‘numbers exist’ is not a difference in existence, but rather a difference between cows and numbers. Similarly, one might
say that the difference between causal claims in the physical domain and, e.g., the psychological domain is not a difference in causation; it is simply a difference between physics and psychology.

Rejecting the Metaphysical Presuppositions of the Picture

Finally, a number of philosophers have attempted to avoid the exclusion argument by suggesting that it only arises from mistaken metaphysical views lying behind the causal hierarchy picture. For example, in stating the exclusion argument above, we assumed that an upper level event might cause a lower level event – (1) summarizes just that possibility. According to some philosophers, however, this sort of ‘downward causation’ is deeply mysterious, and causation takes place only within levels, not across levels (cf. Kim, 1993). So, if one rejects downward causation, one might reject the exclusion argument right from the start.

However, the problem with this suggestion is twofold. First, it is not clear that downward causation is so mysterious. As we have noted a number of times, causation is an extremely general notion. There seems no reason in principle why we should not suppose that upper level events might cause lower level ones. Second, it seems possible to formulate the exclusion argument even if one does not start by assuming that downward causation is possible. For consider: given the causal hierarchy picture, every nonbasic event will supervene on some basic event. But then, one could causally explain the presence of any nonbasic event by causally explaining the basic event on which it supervenes. This brings the problem back again: given the causal hierarchy picture, there seems no rationale for supposing that any nonbasic causal statement is true.

Problems with Exclusion

If we cannot answer the exclusion argument by rejecting the causal hierarchy picture or some metaphysical presupposition of the argument, it would seem that the only option left is to give up or modify the exclusion principle. Indeed, objections to this principle are easy enough to find. Consider the spy shot by the firing squad: the first soldier on the firing squad caused his death, but so did the second. Hence, the spy’s death is overdetermined. But a little reflection shows that the exclusion principle as stated so far rules out cases of overdetermination a priori.

This objection to the exclusion principle is suggestive but limited. It is true that there are possible cases of overdetermination. But it still seems unlikely that every case of nonbasic causation should be analogous to the case of the firing squad. To accommodate the possibility of overdetermination, we can reformulate the exclusion principle to read

(E2) If e1 causes e2, then (probably, in general) there is no event e3 such that: (a) e3 is not identical to e1 and (b) e3 causes e2.

(E2) allows, where (E1) does not, the possibility of firing squad cases, but it also says that we can expect such cases to be the exception rather than the rule. On the other hand, it seems clear that we could formulate a probabilistic version of the exclusion argument based on (E2), which would be almost as bad as the nonprobabilistic version based on (E1).

A second objection to the exclusion principle cuts deeper, however. Consider the possibility of composition (or multiple realizability) all the way down: whenever we arrive at a potential basic level, it turns out that this level is itself composed at a lower level (for discussion of this idea, see Schaffer, 2003). Combining this idea with the exclusion principle yields some absurd results. For if there is composition all the way down, it is easy to see that the exclusion argument will tell us that any candidate causal statement is false. But this suggests that there really is something wrong with the exclusion principle. Surely a principle of causation should not have the consequence that no candidate causal claim is true.

In response to this objection, a proponent of the exclusion principle might take one of two options. First, one might deny the possibility that there is composition all the way down. However, this is a fairly desperate maneuver. It is surely an empirical question, not a question that can be decided a priori, whether the levels of the world simply go on forever. Second, and much more plausibly, one might revise the exclusion principle to accommodate the objection. The obvious way to do this is to replace ‘identical to’ in (E2) with ‘supervenient on,’ invoking the relation we discussed in the context of multiple realizability. That would result in the following principle:

(E3) If e1 causes e2, then (probably, in general) there is no event e3 such that: (a) e3 is not supervenient on e1 and (b) e3 causes e2.

This version of the exclusion principle does not have the result that, if the world is composed all the way down, none of our paradigm causal claims are true. For it is consistent with (E3) that causal claims can be true at levels that supervene on the basic level. Thus, even if we assume that there is composition at every level, it is still consistent with (E3) that some causal claims at nonbasic levels are true.

On the other hand, if it is (E3), and not (E1) or (E2), which provides the proper articulation of the exclusion principle, then we have a response to the exclusion argument. The exclusion argument mistakes (E1) for (E3). Combining the causal hierarchy picture and (E1) yields a contradiction, but there is no problem with combining the causal hierarchy picture and (E3), for (E3) does not exclude the possibility of nonbasic causal claims.

Further Questions

Problems of causation in multiple domains threaten to undermine our picture of the world. But, as we have seen, these problems arise only from a mistaken conception of the exclusion principle, i.e., only if we formulate that principle as (E1) or (E2) and not (E3). However, it would be a mistake to conclude that this is the end of the matter. For (E3) generates some further puzzling questions of its own. I will close by briefly mentioning two of these.
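
The contrast drawn above between (E1) and (E3) can be illustrated on a toy model. What follows is a minimal sketch under stated assumptions, not a formalization that the article itself provides. It takes the three events n, p*, and p from the exclusion argument, drops the hedge ‘(probably, in general)’ so that each principle becomes a strict rule, and reads ‘supervenient’ in (E3) symmetrically, so that two events count as related when they are identical or when either one supervenes on the other. The event names, the relation sets, and the helper functions are all hypothetical.

```python
# Toy model of the exclusion argument (illustrative assumptions throughout).
EVENTS = {"n", "p_star", "p"}

# Premises (1) and (2): n causes p, and p* causes p.
CAUSES = {("n", "p"), ("p_star", "p")}

# Assumption from the causal hierarchy picture: the nonbasic event n
# supervenes on the basic event p*.
SUPERVENES_ON = {("n", "p_star")}


def identical(a, b):
    """Relation used by (E1): only an identical event may share an effect."""
    return a == b


def supervenience_related(a, b):
    """Assumed symmetric reading of (E3): identical, or linked by supervenience."""
    return a == b or (a, b) in SUPERVENES_ON or (b, a) in SUPERVENES_ON


def exclusion_holds(related):
    """Check the strict principle: whenever e1 causes e2, every other
    cause e3 of e2 must stand in the `related` relation to e1."""
    for e1, e2 in CAUSES:
        for e3 in EVENTS:
            if (e3, e2) in CAUSES and not related(e3, e1):
                return False
    return True


e1_holds = exclusion_holds(identical)
e3_holds = exclusion_holds(supervenience_related)
print("(E1) compatible with premises (1)-(3):", e1_holds)
print("(E3) compatible with premises (1)-(3):", e3_holds)
```

On this model the strict form of (E1) is violated, because n and p* are distinct causes of p, while the strict form of (E3) is satisfied, because n and p* are linked by supervenience. If ‘supervenient on’ were instead read in one direction only, the pair with e1 = n and e3 = p* would count against (E3), which is why the symmetric reading is flagged here as an assumption.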
The first question arises when we consider the difference between (5) and (6):

(5) For some events e1, e2, and e3, if e1 causes e2, and if e3 supervenes on e1, then e3 causes e2.

(6) For all events e1, e2, and e3, if e1 causes e2, and if e3 supervenes on e1, then e3 causes e2.

(E3) allows the possibility that (5) is true. But it had better not allow the possibility that (6) is. For (6) is subject to counterexamples such as the following (Jackson and Pettit, 1990). Suppose living in a particular neighborhood on the north side of town (e1) makes you happy (e2); living in a particular neighborhood on the north side of town entails living on the north side of town – as we might put it, living on the north side of town (e3) supervenes on living in a particular neighborhood on the north side (e1). Nevertheless, it might not be true that living on the north side (e3) makes you happy (e2) – after all, with the exception of your own particular neighborhood, you may dislike the north side, thinking of yourself as a ‘south side person.’ But now we are faced with a problem: if (6) is not true and (5) is, how are we to draw the line?

In order to answer this problem, Jackson and Pettit appeal to a principle of causation they call invariance of effect under variance of realization. They go on to develop this principle, embedding it in a framework for thinking about the relation between causation, causal explanation, and the causal hierarchy picture (especially as that applies to the social and psychological realms), which they call program explanation (cf. Jackson and Pettit, 1990, 1992; Pettit, 1993; for some similar work, see List and Menzies, 2009). Unfortunately, reviewing these ideas is beyond the scope of the present discussion. But the important point is that the idea that (E3) articulates the exclusion principle incurs an obligation: to explain what distinguishes the event triples ⟨e1, e2, e3⟩ that satisfy (5) from the event triples that do not.

The second question returns us to Gassendi’s objection to Descartes. In response to Gassendi’s objection, many Cartesian dualists are tempted to endorse epiphenomenalism. Epiphenomenalists accept that there is a correlation between mental events and behavioral events, but insist that the correlation in question is not causal. Rather, the correlation is the product of two other relations: a nomological (or lawful) relation between brain events and psychological events, and a causal relation between brain events and behavioral events. Since these latter relations can obtain without a causal relation obtaining between mental events and behavioral events, epiphenomenalism is a possible position, no matter how counterintuitive it seems to suppose that mental events do not cause behavioral events.

The problem that epiphenomenalism presents for (E3) and our discussion of it is the following. If we restrict attention to the relation between the psychological level and the neurological level, we too have articulated a correlation between mental and behavioral events, and for us too the correlation is a product of two further relations: first, there is a supervenience relation between the mental event and the brain event; second, there is a causal relation between the brain event and the behavioral event. However, why should we suppose that this correlation is causal? As the case of epiphenomenalism makes clear, the mere fact that a correlation is a product of a causal relation and some other relation does not make the correlation causal. Perhaps then, the answer we have given to the exclusion argument is no better than the epiphenomenalist answer to Gassendi?

Part of the answer to this question is that there are correlations and there are correlations. True, the fact that a correlation is the product of a causal relation and some other relation does not make it causal. But it had better not make it noncausal either – after all, every causal relation is a product of a causal relation and identity. So the structural similarity between the account we have offered and epiphenomenalism does not entail that the account is no better than epiphenomenalism.

But there is also a somewhat deeper issue here. The world presents us with a myriad of correlations. Our concept of causation is in part a tool to divide these correlations into the causal and the noncausal. Obviously, there are some correlations that are clearly causal and some that are not. But it needs to be admitted that there are hard cases, cases in which it is difficult to say whether we have a causal correlation here or not. The causal hierarchy picture that we have been discussing presents us with such hard cases. Adopting a concept of causation that employs (E1), these cases will turn out not to be causal; adopting a concept of causation that employs (E3), they will. As we have seen, the latter concept does seem a reasonable concept of causation. But defending that choice in detail requires more investigation of the concept of causation than we can enter into here.

See also: Causes and Laws: Philosophical Aspects; Individualism versus Collectivism: Philosophical Aspects; Intentionality and Rationality.

Bibliography

Bennett, K., 2003. Why the exclusion problem seems intractable, and how, just maybe, to tract it. Noûs 37 (3), 471–497.
Chomsky, N., 1959. Review of Verbal Behavior, by B.F. Skinner. Language 35 (1), 26–58.
Cummins, R., 1983. The Nature of Psychological Explanation. MIT Press, Cambridge, MA.
Davidson, D., 1963/1981. Actions, reasons and causes. In: Davidson, D. (Ed.), Essays on Actions and Events. Oxford University Press, Oxford, UK.
Descartes, R., Cottingham, J., et al. (Eds.), 1984. The Philosophical Writings of Descartes, vol. 22. Cambridge University Press, Cambridge, UK.
Fodor, J.A., 1974/1981. Special sciences: or, the disunity of science as a working hypothesis. In: Fodor, J.A. (Ed.), Representations, first ed. MIT Press, Cambridge, MA.
Jackson, F., Pettit, P., 1990. Causation and the philosophy of mind. Philosophy and Phenomenological Research 50, 195–214.
Jackson, F., Pettit, P., 1992. In defense of explanatory ecumenism. Economics and Philosophy 8, 1–21.
Kim, J., 1993. Mind and Supervenience. Cambridge University Press, Cambridge, UK.
Kim, J., 1998. Mind in a Physical World. Cambridge University Press, Cambridge, UK.
List, C., Menzies, P., 2009. Nonreductive physicalism and the limits of the exclusion principle. Journal of Philosophy 106 (9), 475–502.
Oppenheim, P., Putnam, H., 1958. The unity of science as a working hypothesis. Minnesota Studies in Philosophy of Science 2, 3–36.
Pettit, P., 1993. The Common Mind. Oxford University Press, New York.
Russell, B., 1917/1963. On the notion of cause. In: Russell, B. (Ed.), Mysticism and Logic. Penguin, New York.
Ryle, G., 1949. The Concept of Mind. Hutchinson’s University Library, London.
Schaffer, J., 2003. Is there a fundamental level? Noûs 37 (3), 498–517.
Sperber, D., 1996. Explaining Culture. Blackwell, Oxford, UK.
Yablo, S., 1992. Mental causation. Philosophical Review 101, 245–280.

Causes and Laws: Philosophical Aspects
Helen Beebee, University of Manchester, Manchester, UK
© 2015 Elsevier Ltd. All rights reserved.

Abstract

The questions ‘what makes it the case that one event causes another?’ and ‘what makes it the case that something is a law of
nature?’ are highly controversial ones for which, among contemporary philosophers, there is no answer that can claim
orthodoxy. This article sketches some of the most influential theories of causation and lawhood.

Preliminaries

‘Cause’ and ‘law’ are perhaps two of the most important and fundamental concepts that human beings deploy in their attempts to understand and intervene in their environment, both in everyday life and in their scientific endeavors. When we want to explain why an event occurred, we seek out its causes. When we act, we do so because we believe the action will have certain effects. When we want to know why our environment and our fellow human beings behave in regular, predictable ways, we look for the laws that govern that behavior.

It is part of the scientist’s job to discover what causes what, and to investigate what the laws of nature are. The philosopher’s job, however, is a more abstract one; the philosopher wants to know what makes it the case that something – anything, be it the decay of a subatomic particle or the assassination of Archduke Ferdinand – causes something else, or that something is a law of nature – whether it is a physical, biological, psychological, economic, or any other kind of law. We know that when one strikes a match and it lights, the first event – the striking – caused the second. But this causal fact depends for its truth on more than the mere fact that the two events occurred one after the other; there must be some extra feature of the world that binds these two events together as cause to effect. Similarly, it seems that laws of nature must be more than mere regularities; it is doubtless true that all lumps of gold are smaller than a mile in diameter, but it is not a law that this is so. So genuine laws must have some extra feature, apart from regularity, which marks them off from mere ‘accidental’ regularities. The philosopher’s job, or at least part of it, is to say what, if anything, this ‘extra feature’ might be.

Determinism vs Indeterminism

A question that is relevant to any discussion of the nature of causes and laws is that of whether the universe is fundamentally deterministic or indeterministic. One way of putting the thesis of determinism is this: the complete state of the universe at any given time together with the laws of nature determines precisely what the complete state of the universe will be at all future times. Indeterminism is simply the denial of determinism.

The view that the universe is fundamentally indeterministic has been widely (although by no means universally) held within science since the advent of quantum mechanics; however, most philosophers only began to take the possibility of indeterminism seriously in the 1960s and 1970s. One consequence of this reluctance to take indeterminism on board is that some current theories of causes and laws were originally formulated under the assumption of determinism, and were then modified to take into account the possibility that at least some laws and causal processes are fundamentally indeterministic. The theories of laws and causation discussed below are mostly presented in their simpler, deterministic form. While the indeterministic versions of the theories present additional complications and problems – for example, because they employ the notion of probability – the fundamental philosophical issues addressed, and problems faced, by the indeterministic and deterministic versions, are mostly the same or similar.

The Particular and the General: Events, Facts, and Properties

Many of our causal claims are claims about individual occurrences: the assassination of Archduke Ferdinand, the Great Recession, the birth of a particular child, a specific meteor shower, or whatever. Such occurrences are thus the ‘relata’ of particular or ‘token’ causal relations. There is an ongoing debate about whether such individual occurrences should be conceived as events (the assassination of Archduke Ferdinand) or facts (the fact that Archduke Ferdinand was assassinated) – see, for example, Mellor, 1995 and Lewis, 1986 – but for simplicity we shall assume that they are events.

Sometimes, however, we make general causal claims about whole populations rather than their individual members. Thus, it is doubtless true that smoking causes cancer, but it does not follow that every individual who smokes will be caused to get cancer as a result; and it can be true that a particular person dies as the result of a bee sting without its being true that bee stings in general cause death. General or ‘population-level’ causation is generally taken to apply to properties or repeatable features that different individuals can share, such as smoking, getting cancer, being red, or weighing 3 kg.

Laws of nature are likewise concerned with general relationships between properties – whether it is the relationship between force, mass, and acceleration enshrined in f = ma or the economic ‘law’ of diminishing marginal utility. In what follows, general features or properties will be denoted ‘F,’ ‘G,’ and so on, while individual events will be denoted ‘c,’ ‘e,’ etc.

266 International Encyclopedia of the Social & Behavioral Sciences, 2nd edition, Volume 3 http://dx.doi.org/10.1016/B978-0-08-097086-8.63007-6

Laws of Nature

Laws vs Accidents

What is the difference between laws of nature and merely ‘accidental’ regularities? (Suppose that it is a law that all metals expand when heated and that it is merely accidentally true that all (past, present, and future) Queens of England are less than 2 m tall.) A crucial difference is that laws ‘support counterfactuals,’ whereas accidental regularities do not. It is true of any unheated piece of metal that if it were to be heated it would expand. But it need not be true of any non-Queen of England that if she were to be Queen she would be less than 2 m tall. If someone is currently, say, 2.05 m tall, becoming Queen of England would not result in her getting any shorter.

Another difference is that violations of law are said to be physically impossible, whereas violations of accidental regularities are physically possible. It is physically impossible for a heated metal to fail to expand when heated; but it is perfectly possible for a Queen of England to be taller than 2 m. Finally, it is sometimes claimed as a third difference that laws, unlike accidental regularities, ‘govern’ what goes on in the universe. We tend to suppose that statements of laws of nature are rather like pieces of divine legislation: decrees that the universe must obey rather than mere general descriptions of what in fact happens.

Realist vs Regularity Accounts of Laws

Can an analysis of lawhood be given that does justice to these intuitions about the nature of laws? A popular view (Armstrong, 1983) is that laws are relations of necessitation between universals. To say that it is a law that all Fs are Gs is to say that there is a necessary relation, N, that holds between the universals (properties) F and G. (Following Armstrong, we can write this ‘N(F,G)’.) When an object instantiates F, the instantiation of F guarantees, via N, that G will also be instantiated. Thus, N(F,G) – its being a law that all Fs are Gs – guarantees that all Fs will in fact be Gs.

The basic point of this ‘realist’ theory of laws is that it provides an ‘ontological ground’ for the difference between laws and accidents; accidental regularities just happen, whereas lawful regularities are grounded in and explained by a necessary relation holding between the co-occurring properties. And, on this view, sense can be made of the idea that laws govern what happens; if N(F,G) holds, then G must be instantiated whenever F is instantiated.

The main rival to this conception of laws is the ‘Humean’ or ‘regularity’ conception, according to which laws are really no more than regularities. The classic (although not the only) motivation for the regularity view is an ‘empiricist’ epistemology according to which we ought not to believe in entities that we cannot perceive. Since our evidence that there are any laws of nature comes solely from the observation of regularities – when we see a law (as opposed to an accidental regularity) being instantiated we do not see any extra ontological feature – we ought not to believe that there is any extra ontological feature that laws have but accidental regularities lack.

A ‘naïve’ regularity theory, however – according to which statements of law are merely true universal generalizations – is untenable, for it simply collapses the distinction between laws and accidental regularities. If statements of law are merely true universal generalizations, then accidental regularities are themselves laws; it turns out to be a law of nature that no Queen of England is taller than 2 m after all.

A more sophisticated version of the regularity theory is provided by the ‘Ramsey–Lewis view,’ after F.P. Ramsey and David Lewis (see Lewis, 1973b: pp. 73–75; J.S. Mill also held a similar view). The basic idea of the Ramsey–Lewis view is as follows. When we seek, as scientists do, to find out how the universe behaves, we are not satisfied with isolated facts about what happened in the laboratory on Tuesday afternoon, what happened when a particular patient took a particular drug, and so on. Rather, what we seek are wide-ranging generalizations that abstract as far as possible from the particularities of the laboratory or the patient. Generalizations about the heights of Queens of England or the diameters of lumps of gold apply to an extremely restricted number and range of phenomena. On the other hand, generalizations about subatomic particles, chemical reactions, the relationship between force, mass, and acceleration, and so on, apply to an enormous number of diverse phenomena, and are hence much more useful and explanatorily powerful. According to the Ramsey–Lewis view, the difference between laws of nature and merely ‘accidental’ regularities amounts to roughly this difference between generalizations that are wide ranging and powerful and those that are not. It is not that there is any ontological distinction between laws and accidents – laws do not have an extra feature (N, say) that accidental regularities lack; rather, it just so happens that some generalizations (the laws) are interesting and useful ways of codifying what happens in the universe, and others (the accidents) are not.

The Ramsey–Lewis view (and the Humean view of lawhood in general) diverges from a conception of laws of nature according to which laws ‘govern’ what happens in the Universe, since on the Humean view the laws just are regularities, rather than features of the world that explain why the world is regular (Beebee, 2000). Some philosophers argue that Humeanism about laws makes it impossible for us to know any laws of nature (Armstrong, 1983: pp. 52–59; Bird, 2008; Beebee, 2011). On the other hand, some philosophers claim that Humeanism about laws takes the sting out of the alleged conflict between determinism and free will (see Swartz, 1985: Chapters 10 and 11; Beebee and Mele, 2002; Berofsky, 2012) (see Free Will and Action).

Ceteris Paribus Laws

The accounts of laws canvassed above are perhaps best suited to the laws we find in, say, physics and chemistry – the places where we are most likely to find truly exceptionless generalizations that hold across the whole of space and time (although it is often argued that even in the ‘hard’ sciences, truly exceptionless generalizations are extremely rare). There are plenty of generalizations in, say, economics, sociology, and psychology that, while not exceptionless, are clearly lawlike in some sense. The law of diminishing marginal utility, for example, does not hold in all cases, so if lawhood requires an exceptionless regularity, there will turn out to be no laws in economics.
On the other hand, it is surely no accident that, by and large, the utility of something eventually decreases as one obtains a greater quantity of it. Similarly, there are plenty of generalizations in biology and psychology that are not exceptionless but nonetheless provide a good foundation for prediction and control.

Such cases are generally thought of as ‘ceteris paribus’ laws (‘cp-laws’ for short): generalizations that hold not in all cases simpliciter, but only ‘other things being equal,’ that is, in cases where there is no additional, interfering factor. We might try to subsume cp-laws under our existing accounts of laws simply by building in the absence of such interfering factors, so that our candidate law has the form ‘all Fs are Gs, except in circumstances X, Y, and Z.’ This approach is problematic, however, since in many, if not most, cases, we either have no idea what the relevant X, Y, and Z are, and so – even if there is a law in the offing – we will not be in a position to formulate it, or else there will be indefinitely many ways in which the generalization in question could in principle fail, so there is no way in principle of specifying what the relevant law is.

We might try to avoid these problems by claiming that cp-laws are only supposed to hold in ‘ideal’ or ‘normal’ conditions, thus removing the need to specify exactly what those conditions are within the content of the law itself. However, unless we can independently specify which conditions count as normal or ideal, the threat of circularity looms; we cannot allow ‘normal’ or ‘ideal’ conditions to simply be those in which the regularity obtains, or cp-laws will turn out to be trivially true (all Fs are Gs – except when they are not). (See Hausman, 1992: Chapter 8; Pietroski and Rey, 1995; Earman and Roberts, 1999; Schrenk, 2007a.)

Stability and Invariance

Some contemporary accounts of lawhood take a different starting point to the idea that a law is an exceptionless regularity of a certain kind (the Ramsey–Lewis view) or else is the obtaining of some metaphysical relation that entails the existence of an exceptionless regularity (Armstrong’s view). Instead, they start from the idea that the distinctive feature of laws is that they are stable or invariant under a range of different background conditions. Suppose that the generalization ‘all the pens currently on Obama’s desk are blue’ is true. Despite being true, it is highly unstable; there are very many possible situations in which it would be false. Obama might, for example, have left the red pen he was using earlier today on his desk rather than putting it in his pocket, or one of his aides might have left their own black pen on the desk. By contrast, ‘nothing travels faster than the speed of light’ is highly stable; there are (so we currently believe) no physically possible situations at all that would render it false. The notion of stability is thus a ‘modal’ notion; the stability or otherwise of a generalization is a matter of whether it would still be true in various different, nonactual situations. And, since stability is not all-or-nothing but rather comes in degrees – one generalization can be more or less stable than another – we can think of lawhood itself as coming in degrees, so that some generalizations are more lawlike than others, rather than thinking that a generalization either expresses a law or it does not.

The notion of stability (or invariance) may be a good place to start if we want to make sense of cp-laws, because it seems to capture what is ‘lawlike’ about, say, the law of diminishing marginal utility. While the generalization in question only holds ceteris paribus, it arguably still exhibits a good deal more stability than, say, the generalization concerning the pens on Obama’s desk; it would remain true (ceteris paribus) given a wide range of changes in prices, consumer preferences, and so on. Of course, we still need to find a way to spell out the restrictions imposed by the ceteris paribus clause; it has been suggested that we think of cp-laws as generalizations that are stable across circumstances that are of particular interest to a given science, such as psychology or economics (see Lange, 2002; Woodward and Hitchcock, 2003a,b).

Laws and Dispositions

A final kind of account of the nature of laws – again, one that is perhaps well suited to dealing with cp-laws – is a ‘dispositionalist’ account. According to a naïve version of the so-called counterfactual account of dispositions (such as fragility, which, when applied to panes of glass and china cups at least, we can think of as the disposition to break when dropped), an object o (a china cup, say) has a disposition to manifest feature M (being broken) in ‘stimulus conditions’ S (being dropped) just if, if o were to be in situation S, it would manifest M. As it stands, however, this account cannot be right. For example, if I were to drop a china cup that is wrapped in a lot of cotton wool, it would not break – but the cup is nonetheless fragile. The presence of the cotton wool ‘masks’ the disposition – it prevents the disposition from being manifested. Antidotes provide another counterexample to the naïve counterfactual account; the redback spider is deadly (its bite has the disposition to kill people) despite the fact that, thanks to the existence of an antivenom, nobody has died from a redback bite for many years.

Whether the counterfactual account can be amended to deal with problem cases such as these is a controversial question (see e.g., Lewis, 1997; Bird, 1998). What is relevant to the discussion of laws, however, is that we can think of laws of nature as describing not how things actually behave, but how they are disposed to behave. So a candidate law of nature might have the form ‘all Fs are Ds,’ where property D is the disposition or capacity to be G, rather than ‘all Fs are Gs’ (Cartwright, 1989; Ellis, 2002; Bird, 2005).

A dispositionalist account of laws may be well suited to deal with cp-laws because – as we have just seen – something’s having a disposition is compatible with its failing to manifest that disposition even though the stimulus condition is present. If we think of laws as generalizations about dispositions, then we can think of cp-laws as those laws that describe dispositions that are susceptible to masks, antidotes, and the like. Thus, for example, ‘all china cups break when dropped’ is false; I can easily falsify it by dropping a china cup wrapped in cotton wool. But this does not falsify ‘all china cups are disposed to break when dropped,’ since, as we have seen, a china cup is still disposed to break when dropped – it is still fragile – even if, thanks to the cotton wool, it is not in a position to manifest that disposition when I drop it. (See Schrenk, 2007b.)

Causation

Theories of causation have proliferated in the last 50 years or so, with different theories speaking to different underlying metaphysical commitments and to different views about how the ontology of causation (i.e., how the world has to be arranged in order for causal claims to be true) ought to relate to its epistemology (i.e., how we can come to know, or justifiably believe, that a causal claim is true). This section surveys some of the leading theoretical approaches.

The Naïve Regularity Theory

The agenda for much of the contemporary discussion about the nature of causation was set by Hume (1748/1975: Sections IV–VII). Hume was an empiricist, believing that there could be no ‘ideas’ in the mind that do not somehow come from our senses. How, then, do we come to have an idea of causation? What features of the world could furnish us with that idea? Famously, Hume maintained that we do not perceive any intrinsic connection or relation between two events we judge to be causally related; we see the match being struck, and we see it light, but we do not see any causation between those two events. Causation, then, cannot be a mysterious intrinsic relation between events – for if it were, since we cannot perceive any such relation, we could have no idea of causation. (See Menzies, 1998; Beebee, 2009 for discussion of the question of the observability of causation.)

The exact nature of Hume’s positive theory of causation is a matter for interpretative dispute (see Beebee, 2006a). However, the standard interpretation – and therefore the one that has had the most impact on contemporary discussion – characterizes Hume as a ‘naïve regularity theorist.’ On this view, two events are related causally in virtue of contiguity (causes and their immediate effects are right next to each other in space and time), temporal priority (causes precede their effects), and constant conjunction (events similar to the cause are always followed by events similar to the effect). So what makes it true that the striking caused the lighting is that the two events are contiguous (or, if there is some spatiotemporal gap between them, the events are mediated by further events so that there is a chain of contiguous events starting with the striking and ending with the lighting); the striking occurs before the lighting; and similar strikings are always followed by similar lightings. The constant conjunction requirement ensures that for Hume (and for most modern day Humeans) causation is an extrinsic relation: a causal relation obtains between two events not purely in virtue of how those events themselves are, but in virtue of features of other events; the striking only gets to count as a cause of the lighting because other strikings are also followed by lightings.

The other notable feature of the naïve regularity theory is that it is a reductionist view: causal claims, like ‘the striking of the match caused the lighting,’ are made true, at bottom, not by any primitive causal feature of the world, but by noncausal features (since the fact that two events are contiguous, the fact that one occurs before the other, and the fact that they are similar to other events are not themselves causal facts).

The naïve regularity theory is now universally regarded as untenable, although, as we shall see, elements of it are clearly discernable in many contemporary theories. It is untenable for a variety of reasons. For example, the contiguity requirement rules action at a distance (one event’s causing another, spatially or temporally distant event without there being a chain of events ‘hooking up’ the first to the last) to be impossible, while many philosophers think that it is at least conceptually possible. And the constant conjunction requirement is obviously flawed: there can be constant conjunction without causation – for example, joint effects of a common cause are constantly conjoined but neither causes the other – and, if determinism is false, it seems there can be causation without constant conjunction. My betting on number 13 on the roulette wheel is a cause of my winning some money, even if in exactly the same circumstances the following day I bet on number 13 and lose.

Nevertheless, the general issues that Hume raised are still among the main foci of dispute in contemporary philosophy of causation. In particular, the question of whether causal facts reduce, somehow or other, to facts about regularities is the driving force behind many of the theories of causation currently on offer.

Counterfactual Analyses

Arguably, the most influential analysis of causation in recent years has been Lewis’s counterfactual analysis (Lewis, 1973a). According to Lewis’s analysis, an event e causally depends on event c if e counterfactually depends on c – which is to say, if, had c not occurred, e would not have occurred either (see Counterfactual Reasoning, Quantitative: Philosophical Aspects and Counterfactual Reasoning, Qualitative: Philosophical Aspects). A ‘chain of causal dependence’ is a series of events ⟨a, b, c, …, n⟩ such that b causally depends on a, c causally depends on b, and so on. Finally, c causes e if there is a chain of causal dependence (perhaps involving only c and e themselves, but perhaps involving many hundreds of intermediate events) from c to e.

The need for chains of dependence arises when there is no counterfactual dependence between cause and effect because of a backup mechanism or ‘preempted alternative.’ Suppose that the bus stop is next to the taxi rank. Susan takes the bus to work (c); but if the bus had been late she would have taken a taxi. She attends a 9.30 meeting (e). e does not counterfactually depend on c, since if Susan had missed the bus she would have taken a taxi and attended the meeting just the same. But there is some intermediate event d – Susan’s getting off the bus at the stop outside her office, say – which counterfactually depends on c (if she had not got on the bus in the first place, she would not have got off it either) and upon which e counterfactually depends (if she had not got off at that stop – given that she was already on the bus, and it was already nearly 9.30 – she would have had to walk back to her office from the next stop, and she would have missed the meeting).

Lewis’s counterfactual analysis of causation has been found to be subject to a number of counterexamples (e.g., Schaffer, 2000a). Much recent work on causation has been devoted to devising alternative analyses which, while remaining true to Lewis’s basic claim that causation is to be analyzed in terms of counterfactuals, add further complexities in an attempt both to circumvent the problem cases and to deal with
indeterministic causation (see, e.g., Menzies, 1989; Noordhof, 1999; Lewis, 2000).

Both Lewis’s original analysis and its more complex successors fall squarely within the Humean tradition, since the counterfactual analysis is a reductionist project: it seeks to show how causal facts are made true by noncausal features of the world. Indeed, since Lewis’s account of counterfactuals makes the truth conditions of counterfactuals depend primarily on the laws of nature, and the laws of nature themselves, on Lewis’s view, are merely a subset of the regularities, one can think of counterfactual analyses of causation generally as more sophisticated descendants of the naïve regularity theory of causation.

Not everyone, however, is a Humean. Many philosophers deny Hume’s central motivating claim that causation cannot be perceived (see Anscombe, 1971; Armstrong, 1997: Chapter 14; Menzies, 1998; Beebee, 2009). Some argue that the features of the world to which reductionist analyses seek to reduce causal facts are themselves really causal features; hence reductionist analyses are circular and therefore unsuccessful (see Carroll, 1994). Others argue that if a broadly Humean analysis of causation were true it would render the pervasive regularities that the universe exhibits a sort of cosmic fluke (see also Strawson, 1989: Chapter 5; Beebee, 2006b). And many simply think that the persistent failure of reductionist accounts to deal with counterexamples, despite ingenious amendments, is good reason to think that the whole reductionist project is doomed to failure.

Process and Mechanistic Theories

We generally take it for granted that causal influence is propagated through mechanisms or processes of various kinds. When the impact of the white snooker ball causes the black ball to roll into the pocket, we can grant that the latter event counterfactually depends on the former, but may yet be inclined to say that this dependence is symptomatic rather than constitutive of the causal relation between the two. This is because there is an identifiable process linking the two, involving the transfer of energy-momentum from the white to the black ball and subsequent movement of the black ball toward the pocket. Some philosophers have attempted to analyze causation by focusing on the distinction between a genuine causal process and a mere ‘pseudoprocess.’ Salmon (1984), for example, takes the distinguishing feature of causal processes to be their ability to ‘transmit a mark.’ A mouse walking across the floor is a process that can transmit a mark; for example, if I paint a blue patch on the mouse’s fur, the blue patch will ‘transmit’ from earlier to later stages of the process (since it will go wherever the mouse goes). By contrast, if I make a spot of light move across the floor by means of a torch, the movement of the light across the floor is a pseudoprocess; if I paint the first illuminated patch blue, that mark will not transmit to later stages of the ‘process’; it will stay just where I put it.

Salmon’s ‘mark transmission’ theory is problematic for various reasons (see Dowe, 2008: Section 6). Dowe’s suggested replacement, the ‘Conserved Quantity Theory’ (2000), identifies the causal relation with the transfer of whatever physical quantities are conserved according to the conservation laws of physics, for example, energy-momentum. A problem with this account is that it only tells us what causation actually consists in. It seems that we can imagine possible situations where there is causation in the absence of any transfer of a conserved quantity (e.g., by the casting of spells). Other analyses of causation – such as the counterfactual analysis – tell us why both the movement of snooker balls does, and the effective casting of spells would, count as cases of causation, since they both involve counterfactual dependence. Dowe’s account fails to do this.

One solution to the problem is to attempt to spell out the ‘folk theory’ of causation: those features that, according to our ordinary concept of causation, are essential for a relationship to be causal, and hence those features that will apply to snooker balls and spells alike. And we can then claim that the transfer of conserved quantities counts as causation precisely because, as things actually are, it is such transfer that satisfies our folk theory or definition of causation – while at other possible worlds, such as one in which magic spells are effective, other kinds of relation count as causal (see Dowe, 2000: Chapter 1; Menzies, 1996; Lewis, 2004).

Like process theories, ‘mechanistic’ accounts of causation seek to identify causation with the process by which the cause produces the effect (Glennan, 1996; Machamer et al., 2000; see Williamson, 2011 for a survey). However, mechanistic accounts are more pluralist in spirit, and do not try to locate a single feature of the world (such as the transfer of a conserved quantity) that can be identified with the causal relation. Machamer et al. note that “terms like ‘cause’ and ‘interact’ are abstract terms that need to be specified with a type of activity and are often so specified in typical scientific discourse. Anscombe (1971) … noted that the word ‘cause’ itself is highly general and only becomes meaningful when filled out by other, more specific, causal verbs, for example, scrape, push, dry, eat, burn, knock over. An entity acts as a cause when it engages in a productive activity” (Machamer et al., 2000: p. 6; see also Cartwright, 2004).

Dowe’s account is explicitly reductionist in spirit, in two senses. First, it attempts to identify causation with a feature of the world that can itself be understood in noncausal terms; the ‘transfer’ of conserved quantities is not supposed to be conceived as, for example, the presence of an energy-momentum in the white ball causing its subsequent presence in the black ball. It is also reductionist in the sense that all (actual) causation is, at bottom, causation at the level of fundamental physics; the presence of a causal process is determined by features of the world that are visible, as it were, only from the perspective of physics. Machamer, Darden, and Craver’s account is antireductionist in both senses. ‘Productive’ activities are not themselves to be analyzed in noncausal terms (e.g., the instantiation of regularities or the transfer of conserved quantities); nor are the mechanisms that underpin causal relations to be found only at the level of fundamental physics. Indeed, part of the purpose of their view is to make sense of causal explanation in biology.

Probabilistic Theories

Frequently, it is claimed on cigarette packets that smoking causes heart disease; this is a claim about ‘general’ or ‘type-level’
or ‘population-level’ causation. What makes (or would make) such a claim true? We might try to regard it as some kind of generalization over individual instances of particular smokers whose heart disease has been caused by their smoking. But it is clearly not a universal generalization; the fact that Jane – a lifelong smoker – had no heart disease when she died from other causes (falling off a cliff, say) does not serve to refute the claim that smoking causes heart disease. Nor is our general causal claim merely an existential claim to the effect that at least one smoker has been caused by their smoking to develop heart disease. If that were so, then it would take just one unlucky person to die because of choking on a Mars Bar to make it true that eating Mars Bars causes imminent death. This being so, we should perhaps regard general and ‘token’ or ‘individual’ causation as two different relations, the former relating types or properties, and the latter relating individual events or instantiations of properties – something that is suggested in any case by the grammatical form of general causal claims (‘smoking causes heart disease,’ and not ‘instances of smoking cause instances of heart disease’). Indeed, some authors have argued that we have not only two different relations, but two relations that are independent of one another (e.g., Eells, 1991: Chapter 1). Others argue that the two ‘levels’ of causation are in fact closely related (Hitchcock, 1995), so that, once we have the right account of causation on the table, we will see that we do not really have two distinct species of causation at all.

What is generally accepted, however, is that general causation should be understood in terms of probabilistic or statistical relationships, so that, in essence, a claim of the form ‘C causes E’ is made true by C’s raising the probability of E. (Similarly, ‘C prevents E’ is made true by C’s lowering the probability of E.) According to one account (Eells, 1991; see also Suppes, 1970), we should understand C’s raising the probability of E as the obtaining of Pr(E | C & Bi) > Pr(E | ~C & Bi), where B1, B2, ... are all the different combinations of the presence and absence of other relevant factors within a given population. So, for example, suppose we divide our population (for example, UK citizens) into ‘homogeneous reference classes,’ so that within each reference class everyone is exactly alike with respect to all other risk factors for heart disease (alcohol consumption, cholesterol levels, relevant genetic factors, etc.). Then ‘smoking causes heart disease’ is true if and only if in each such reference class, Pr(heart disease | smoker) > Pr(heart disease | nonsmoker).

This condition is extremely hard to satisfy, however. For example, imagine that there is a very rare genetic trait that interacts with smoking in such a way as to decrease the risk of heart disease. Then on the above account it will be false that smoking causes heart disease for any population that includes possessors of this genetic trait as a subpopulation. A rival proposal (Dupré, 1984, 1990) is that C causes E if and only if C raises the probability of E in a fair sample of the population (i.e., in a sample of the population in which the relative frequency of other relevant factors is the same as it is for the population as a whole). This analysis yields the desired result that smoking causes heart disease in the above example, since in any fair sample those with the genetic trait will be vastly outnumbered by those who lack it; so C will still raise the probability of E in a fair sample. An alleged benefit of Dupré’s analysis is that it makes the truth conditions for general causal claims match up in an obvious way with a standard way in which general causal claims are in fact established, namely, randomized control trials. The point of randomization is precisely to make it likely that other relevant factors are present with equal frequency in both the trial group and the control group, and an increase in frequency of the effect under investigation in the trial group relative to the control group is supposed to be good evidence that the drug, or whatever, is a cause of that effect.

The accounts just discussed are reductionist – indeed, Humean – in spirit, in that they seek to analyze (general) causation in terms of regularities (although the regularities in question are statistical rather than exceptionless). A more recent and similar approach – the ‘manipulability’ or ‘interventionist’ approach – shuns reductionism, but nonetheless aims to shed light on how causal structure can be inferred from statistical evidence (Woodward, 2003, 2008).

The need for the appeal to either homogeneous reference classes or to fair samples stems in part from the need to rule out ‘spurious causation’: cases of probability increase in the absence of causation. One factor F can easily increase the probability of another, G, in the population as a whole without causing G, as when, for example, F and G are effects of a common cause H. For example, the barometer needle pointing to ‘rain’ (F) is positively correlated with rain (G), but both are effects of a common cause, namely, low atmospheric pressure (H). (We can think of the population in this case as comprising ‘weather episodes.’) There will be no such spurious correlation in a homogeneous reference class, however, since by hypothesis all members of any such reference class will be alike with respect to whether or not they possess H. ‘Holding fixed’ factor H screens off the statistical relationship between F and G. Similarly, if we compare a fair sample of Fs with a fair sample of ~Fs, by definition H will have the same relative frequency within each of the samples, and so, again, there will be no correlation between F and G.

Interventionist accounts provide a different way of ‘breaking’ spurious probabilistic relationships. The basic idea is very simple. Take the barometer case again. F and G are positively correlated within the population as a whole. But if I intervene on F, say by manually moving the pointer so it either points to ‘rain’ (f1) or to ‘sunny’ (f2), the correlation disappears: there is no positive correlation between F and G if we consider just those cases where I have directly brought about either f1 or f2 (so long as my manually moving the pointer is not itself correlated with the atmospheric pressure). In effect, my intervening breaks the causal (and hence probabilistic) connection between the putative cause we are interested in – here, the position of the barometer needle – and its usual causes (atmospheric pressure).

A great advantage of the interventionist account is that, with the notion of an intervention in place, we can define causation in terms of increase in probability without the need to make any stipulations about other (relevant) background factors – unlike the probabilistic accounts considered previously. Interventionist accounts are not, however, reductionist accounts, since the notion of an intervention is itself defined in causal terms: an intervention is, by definition, the ‘setting’
of a variable X (e.g., needle position) at a particular value x (e.g., ‘rain’) in a way that breaks the connection between (or ‘screens X off’ from) its usual causes.

One Concept of Causation or Two?

We can think of the various accounts of causation surveyed above as falling into two distinct categories. On the one hand, we have process and mechanistic accounts (Section Process and Mechanistic Theories), which take causation to be a matter of what we might call ‘production’; on the other, we have counterfactual and probabilistic accounts, which take causation to be a matter of ‘difference making.’ Other analyses of causation not considered here are generally pretty easy to categorize as either ‘production’ or ‘difference-making’ accounts.

To see how these two notions can in principle come apart, consider a ‘possible world’ where much of what goes on is a result of spells cast by witches and wizards. Let us stipulate that the spells can operate at large spatial and temporal distances, and that they do so directly: there is no process or mechanism that connects the casting of a spell to its implementation. It seems right to say that when Merlin, at midday, casts a spell to turn the prince into a frog at midnight – which he duly does – the former caused the latter. Merlin’s spell made a difference to what happened, but there was no process connecting the two; the former did not ‘produce’ the latter, in just this sense of ‘produce’ (Schaffer, 2000a). Or consider a less far-fetched scenario: Linda the lifeguard sees a swimmer in trouble and rushes to help, intent on preventing him from drowning. Tanya deliberately trips Linda up as she runs toward the sea, thus preventing her from preventing the swimmer from drowning. The swimmer duly drowns. (Such cases are known as cases of ‘double prevention.’) It seems plausible to say that Tanya’s action is a cause of the swimmer’s death; after all, had Tanya not tripped Linda up, the swimmer would have been saved. But again, there is no process connecting the two; we have the situation on the beach involving Linda and Tanya, and the swimmer way out at sea, with nothing linking them together. So again we seem to have difference making without production (Schaffer, 2000b).

It looks as though we can also have production without difference making. There are two routes of the same length to the shop – route A and route B. Which route I take makes no difference to the time and manner in which I arrive at the shop, but I am intent on going to the shop and will definitely take one or other of these routes; in fact, today I take route A. It looks as though my taking route A is not a difference maker: if I had not taken route A, I would have taken route B, and I would have got to the shops just the same. However, there is clearly a causal process linking my setting out on route A and my subsequent arrival; I put one foot in front of the other, repeated a lot of times, and as a result reached the shop. So we have production without difference making.

Ned Hall (2004) argues that we really have two distinct concepts of causation: the production concept and the difference-making concept. He diagnoses the repeated failure of various analyses of causation to cover the full range of standard problem cases by noting that analyses fall into either the production category or the difference-making category; so analyses of the former kind will inevitably fail to deal with cases of difference making without production, and analyses of the latter kind will inevitably fail to deal with cases of production without difference making. And, he claims, it is easy to fail to distinguish the two kinds of causation because cases of one without the other are pretty rare; in our ordinary dealings with the world, causes are both difference makers and producers of their effects. (When I sink the black ball in snooker, my taking the shot causes the black ball to drop into the pocket in both senses of ‘cause.’)

Many philosophers, however, have been reluctant to accept that there is no univocal concept of causation (Lewis, 2004). For example, Dowe (2001), whose Conserved Quantity Theory is clearly a ‘production’ account, argues that cases of difference making without production are not really cases of causation at all. Meanwhile, defenders of difference-making accounts argue that alleged cases of production without difference making can in fact be dealt with by their accounts. For example, we might respond to the shop case just described by appealing to the idea that causation is context-sensitive, in the sense that causal claims are true or false relative to a contrast class (which is normally only implicit). Thus, my taking route A rather than route B really is not a cause of my subsequent arrival, since my taking route A rather than route B made no difference to the time and manner of my arrival. My taking route A to the shops rather than taking no route at all, however, is a cause of my arrival. So we can save the intuitive judgment that my taking route A is a cause of my arrival by finding a context within which the former does make a difference to the latter; hence we do not have a clear counterexample to a broadly difference-making account after all (Menzies, 2007).

Causes and Laws

The above discussion of causation and laws of nature has treated them as separate phenomena, to be given distinct analyses; yet intuitively there is some relationship between, for instance, the law that all metals expand when heated and the fact that a particular piece of metal’s expansion is caused by its being heated. How, then, are causation and laws related?

Different theories of causation connect with laws of nature in different ways. Hume’s constant conjunction requirement on causation, for example, can be seen as a requirement that causes and effects are lawfully correlated (or ‘fall under’ a covering law). Laws of nature enter into Lewis’s counterfactual analysis of causation indirectly, via his analysis of counterfactuals (see Lewis, 1973b). For Lewis, the truth of ‘if A had been the case, B would have been the case’ requires that B is true at the ‘closest possible world(s)’ in which A is true, and closeness of possible worlds depends in part on the extent to which they share the same laws of nature. Dowe’s Conserved Quantity Theory of causation (Section Process and Mechanistic Theories above) identifies the causal relation with the transfer of any quantity that is subject to conservation laws, so that, again, facts about causation depend on facts about the laws of nature – although in a rather different way. Heathcote and Armstrong (1991)
argue that the causal relation is identical with N, the relation of nomic necessity in virtue of which laws of nature obtain (see Section Realist vs Regularity Accounts of Laws above).

On the other hand, it has been argued that lawful correlation is not necessary for causation. Anscombe (1971), for example, holds that c may cause e even though c and e do not fall under any regularity. (Causation of this kind is sometimes referred to as ‘singular causation’; but note that ‘singular causation’ can also be used to refer to particular causation, in contrast to general causation.) Heathcote and Armstrong (1991) claim that even though the causal relation is in fact identical with nomic necessity, this identity does not itself hold as a matter of necessity. Heathcote, Armstrong, and Anscombe all agree, then, that ‘higgledy-piggledy’ worlds are conceivable – that is, possible worlds where there is plenty of causation but where causally related events are not subsumable under any laws (and so under any regularities) at all.

See also: Causal Inference: Overview; Causation: Physical, Mental, and Social; Counterfactual Reasoning, Qualitative: Philosophical Aspects; Counterfactual Reasoning, Quantitative: Philosophical Aspects; Determinism: Social and Economic; Economics, Philosophy of; Empiricism, History of; Free Will and Action; Idealization, Abstraction, and Ideal Types; Joint Action; Logical Positivism and Logical Empiricism; Mental Causation; Natural Law; Probability and Chance: Philosophical Aspects; Scientific Explanation.

Bibliography

Anscombe, G.E.M., 1971. Causality and Determination. Cambridge University Press, London.
Armstrong, D.M., 1983. What Is a Law of Nature? Cambridge University Press, Cambridge.
Armstrong, D.M., 1997. A World of States of Affairs. Cambridge University Press, Cambridge.
Beebee, H., 2000. The non-governing conception of laws of nature. Philosophy and Phenomenological Research 56, 571–594.
Beebee, H., 2006a. Hume on Causation. Routledge, Abingdon.
Beebee, H., 2006b. Does anything hold the universe together? Synthese 149, 509–533.
Beebee, H., 2009. Causation and observation. In: Beebee, H., Hitchcock, C., Menzies, P. (Eds.), The Oxford Handbook of Causation. Oxford University Press, Oxford, pp. 471–497.
Beebee, H., 2011. Necessary connections and the problem of induction. Noûs 45, 504–527.
Beebee, H., Mele, A.R., 2002. Humean compatibilism. Mind 111, 201–223.
Berofsky, B., 2012. Nature’s Challenge to Free Will. Oxford University Press, New York.
Bird, A., 1998. Dispositions and antidotes. The Philosophical Quarterly 48, 227–234.
Bird, A., 2005. Laws and essences. Ratio 18, 437–461.
Bird, A., 2008. The epistemological argument against Lewis’s regularity view of laws. Philosophical Studies 138, 73–89.
Carroll, J.W., 1994. Laws of Nature. Cambridge University Press, Cambridge.
Cartwright, N., 1989. Nature’s Capacities and Their Measurement. Cambridge University Press, Cambridge.
Cartwright, N., 2004. Causation: one word, many things. Philosophy of Science 71, 805–819.
Dowe, P., 2000. Physical Causation. Cambridge University Press, Cambridge.
Dowe, P., 2001. A counterfactual theory of prevention and ‘causation’ by omission. Australasian Journal of Philosophy 79, 216–226.
Dowe, P., 2008. Causal processes. In: Zalta, E.N. (Ed.), The Stanford Encyclopedia of Philosophy (Fall 2008 Edition). http://plato.stanford.edu/archives/fall2008/entries/causation-process (accessed 31.07.13.).
Dupré, J., 1984. Probabilistic causality emancipated. Midwest Studies in Philosophy 9, 169–175.
Dupré, J., 1990. Probabilistic causality: a rejoinder to Ellery Eells. Philosophy of Science 57, 690–698.
Earman, J., Roberts, J., 1999. Ceteris paribus, there is no problem of provisos. Synthese 118, 439–478.
Eells, E., 1991. Probabilistic Causality. Cambridge University Press, Cambridge.
Ellis, B., 2002. The Philosophy of Nature: A Guide to the New Essentialism. Acumen, Chesham.
Glennan, S., 1996. Mechanisms and the nature of causation. Erkenntnis 44, 49–71.
Hall, E.J., 2004. Two concepts of causation. In: Collins, J., Hall, E.J., Paul, L.A. (Eds.), Causation and Counterfactuals. MIT Press, Cambridge, MA.
Hausman, D., 1992. The Separate and Inexact Science of Economics. Cambridge University Press, Cambridge.
Heathcote, A., Armstrong, D.M., 1991. Causes and laws. Noûs 25, 63–73.
Hitchcock, C., 1995. The mishap at Reichenbach fall: singular vs. general causation. Philosophical Studies 78, 257–291.
Hume, D., 1748/1975. Enquiry concerning human understanding. In: Selby-Bigge, L.A., Nidditch, P. (Eds.), Enquiries Concerning Human Understanding and Concerning the Principles of Morals, third ed. Clarendon Press, Oxford.
Lange, M., 2002. Who’s afraid of ceteris paribus laws? Or: how I learned to stop worrying and love them. Erkenntnis 52, 407–423.
Lewis, D.K., 1973a. Causation. Journal of Philosophy 70, 556–567.
Lewis, D.K., 1973b. Counterfactuals. Blackwell, Oxford.
Lewis, D.K., 1986. Events. In his Philosophical Papers, vol. II. Blackwell, Oxford, pp. 241–269.
Lewis, D.K., 1997. Finkish dispositions. The Philosophical Quarterly 47, 143–158.
Lewis, D.K., 2000. Causation as influence. The Journal of Philosophy 97, 182–197.
Lewis, D.K., 2004. Void and object. In: Collins, J., Hall, E.J., Paul, L.A. (Eds.), Causation and Counterfactuals. MIT Press, Cambridge, MA.
Machamer, P., Darden, L., Craver, C., 2000. Thinking about mechanisms. Philosophy of Science 67, 1–25.
Mellor, D.H., 1995. The Facts of Causation. Routledge, London.
Menzies, P., 1989. Probabilistic causation and causal processes: a critique of Lewis. Philosophy of Science 56, 642–663.
Menzies, P., 1996. Probabilistic causation and the preemption problem. Mind 105, 85–117.
Menzies, P., 1998. How justified are Humean doubts about intrinsic causal links? Communication and Cognition 31, 339–364.
Menzies, P., 2007. Causation in context. In: Corry, R., Price, H. (Eds.), Causation, Physics and the Constitution of Reality: Russell’s Republic Revisited. Oxford University Press, Oxford.
Noordhof, P., 1999. Probabilistic causation, preemption, and counterfactuals. Mind 108, 95–125.
Pietroski, P., Rey, R., 1995. When other things aren’t equal: saving ceteris paribus laws from vacuity. British Journal for the Philosophy of Science 46, 81–110.
Salmon, W., 1984. Scientific Explanation and the Causal Structure of the World. Princeton University Press, Princeton, NJ.
Schaffer, J., 2000a. Trumping preemption. The Journal of Philosophy 97, 165–181.
Schaffer, J., 2000b. Causation by disconnection. Philosophy of Science 67, 285–300.
Schrenk, M., 2007a. The Metaphysics of Ceteris Paribus Laws. Ontos, Frankfurt.
Schrenk, M., 2007b. Can capacities rescue us from ceteris paribus laws? In: Kistler, M., Gnassounou, B. (Eds.), Dispositions and Causal Powers. Ashgate, Aldershot, pp. 221–247.
Strawson, G., 1989. The Secret Connexion. Clarendon Press, Oxford.
Suppes, P., 1970. A Probabilistic Theory of Causality. North-Holland, Amsterdam.
Swartz, N., 1985. The Concept of Physical Law. Cambridge University Press, New York.
Williamson, J., 2011. Mechanistic theories of causality. Philosophy Compass 6, 421–447.
Woodward, J., 2003. Making Things Happen. Oxford University Press, Oxford.
Woodward, J., 2008. Causation and manipulability. In: Zalta, E.N. (Ed.), The Stanford Encyclopedia of Philosophy (Winter 2008 Edition). http://plato.stanford.edu/archives/win2008/entries/causation-mani (accessed 31.07.13.).
Woodward, J., Hitchcock, C., 2003a. Explanatory generalizations, part I: a counterfactual account. Noûs 37, 1–24.
Woodward, J., Hitchcock, C., 2003b. Explanatory generalizations, part II: plumbing explanatory depth. Noûs 37, 181–199.
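
The barometer example from the section on probabilistic theories can be made concrete with a small calculation. What follows is only a minimal sketch in Python under stated assumptions: the probabilities, and the names `joint`, `pr_g`, and `weight`, are invented for illustration and are not taken from any of the works cited. It assumes that low pressure (H) is the only influence on both the needle (F) and rain (G), and it models a manual intervention on the needle as a fair coin toss that is independent of H.

```python
from itertools import product

# Illustrative (assumed) probabilities; none of these numbers come from the article.
P_H = 0.3                                # Pr(H): low atmospheric pressure
P_F_GIVEN_H = {True: 0.9, False: 0.1}    # Pr(F | H), Pr(F | ~H): needle points to 'rain'
P_G_GIVEN_H = {True: 0.8, False: 0.2}    # Pr(G | H), Pr(G | ~H): it rains


def weight(p, value):
    """Probability that a binary variable with Pr(True) = p takes `value`."""
    return p if value else 1.0 - p


def joint(intervene=False):
    """Exact joint distribution over (H, F, G), returned as a dict.

    With intervene=True the needle is set by a fair coin, independently of H,
    which is one simple (assumed) way of modelling a manual intervention on F.
    """
    table = {}
    for h, f, g in product([True, False], repeat=3):
        p_f = 0.5 if intervene else P_F_GIVEN_H[h]
        table[(h, f, g)] = weight(P_H, h) * weight(p_f, f) * weight(P_G_GIVEN_H[h], g)
    return table


def pr_g(table, f, h=None):
    """Pr(G | F = f), or Pr(G | F = f & H = h) when h is given."""
    cases = {k: p for k, p in table.items() if k[1] == f and (h is None or k[0] == h)}
    return sum(p for k, p in cases.items() if k[2]) / sum(cases.values())


observed = joint()
intervened = joint(intervene=True)

# Whole population: F raises the probability of G without causing it.
print("population:", pr_g(observed, True), pr_g(observed, False))
# Holding H fixed: the relationship is screened off within each reference class.
for h in (True, False):
    print("H =", h, ":", pr_g(observed, True, h), pr_g(observed, False, h))
# Intervening on F: the correlation between F and G disappears.
print("intervention:", pr_g(intervened, True), pr_g(intervened, False))
```

The distribution is enumerated exactly rather than sampled, so the three comparisons are not subject to sampling noise. The same table could be used to explore the rare-genetic-trait objection by adding a further background factor, although that extension is not attempted here.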
