Forthcoming in A. Gopnik and L. Schulz (eds.) Causal Learning: Psychology, Philosophy and Computation. New York: Oxford University Press.
Interventionist Theories of Causation in Psychological Perspective
1. Introduction. Broadly speaking, recent philosophical accounts of causation may be grouped into two main approaches: difference-making and causal process theories. The former rely on the guiding idea that causes must make a difference to their effects, in comparison with some appropriately chosen alternative. Difference-making can be explicated in a variety of ways. Probabilistic theories attempt to do this in terms of inequalities among conditional probabilities: a cause must raise or at least change the probability of its effect, conditional on some suitable set of background conditions. When probabilistic theories attempt to define causation in terms of conditional probabilities, they have obvious affinities with associative theories of causal learning and with the use of contingency information (conditional ΔP) as a measure of causal strength (Dickinson and Shanks, 1995). Counterfactual theories explicate difference-making in terms of counterfactuals: a simple version might hold that C causes E if and only if it is true both that: (i) if C were to occur, E would occur and (ii) if C were not to occur, E would not occur. Following David Lewis, counterfactuals are often understood in the philosophical literature in terms of relationships among possible worlds: very roughly, a counterfactual like (i) is true if and only if there is a possible world in which C and E hold that is “closer” or “more similar” to the actual world than any possible world in which C holds and E does not hold. A set of criteria is then specified for assessing similarity among possible worlds (cf. Lewis, 1986, p. 47).
The interventionist theory described in Section 2 is a version of a counterfactual theory, with the counterfactuals in question describing what would happen to E under interventions on (idealized manipulations of) C. The interventionist theory does not require (although it permits) thinking of counterfactuals in terms of possible worlds and, as noted below, the specification of what sorts of changes count as interventions plays the same role as the similarity metric in Lewis’ theory. When causal information is represented by directed graphs as in Bayes net representations, these may be given an interventionist interpretation (Woodward, 2003; Gopnik and Schulz, 2004).
It is usual in the philosophical literature to contrast so-called type causal claims which relate one type of event or factor to another (“Aspirin causes headache relief”) with token or singular causal claims which relate particular events (“Jones’ taking aspirin on a particular occasion caused his headache to subside”). There are versions of difference-making accounts for both kinds of claim, although it is arguable that such accounts apply most straightforwardly to type causal claims. In contrast, causal process accounts apply primarily to singular causal claims. The key idea is that some particular event c causes some other event e if and only if there is a connecting causal process from c to e (Salmon, 1994). Processes in which one billiard ball collides with another and causes it to move are paradigmatic. There are a number of different accounts of what constitutes a causal process, but it is perhaps fair to say that the generic idea is that of a spatio-temporally continuous process that transmits a conserved quantity such as energy and momentum, or, as it is sometimes described in the psychological literature, “force”. Theorists in this tradition often deny that there is any intimate connection between causation and difference-making: they claim that whether c causes e depends only on whether there is a causal process connecting c and e, something that (it is claimed) does not depend in any way on a comparison with what happens or would happen in some other, contrasting situation(s) (Salmon, 1994; Bogen, 2004). In contrast, such comparisons are at the heart of difference-making accounts. Although most philosophical versions of causal process accounts are not committed to claims about the possibility of perceiving causal connections, an obvious analogue in the psychological literature is the family of approaches that focus on launching or Michotte-type phenomena.
Psychological theories that attempt to understand causation in terms of mechanisms or generative transmission (where these notions are not understood along difference-making lines) are also in broadly the same tradition.
2. Interventionism. Interventionist accounts take as their point of departure the idea that causes are potentially means for manipulating their effects: if it is possible to manipulate a cause in the right way, there would be an associated change in its effect. Conversely, if under some appropriately characterized manipulation of one factor, there is an associated change in another, the first causes the second.
This idea has a number of attractive features. First, it provides a natural account of the difference between causal and merely correlational claims. The claim that X is correlated with Y does not imply that manipulating X is a way of changing Y, while the claim that X causes Y does have this implication. And given the strong interest that humans and other animals have in finding ways to manipulate the world around them, there is no mystery about why they should care about the difference between causal and correlational relationships. Second, a manipulationist account of causation fits very naturally with the way such claims are understood and tested in many areas of biology and the social and behavioral sciences and with a substantial methodological tradition in statistics, econometrics and experimental design, which connects causal claims to claims about the outcomes of hypothetical experiments.
Although it is possible to provide a treatment of token causation within a manipulability framework1, I will focus on the general notion of one type of factor being causally relevant (either positively or negatively) to another. There are two more specific causal concepts that may be seen as precisifications of this more general notion: total causation and direct causation. X is a total cause of Y if and only if it has a non-null total effect on Y -- that is, if and only if there is some intervention on X alone (and no other variables) such that for some values of other variables besides X, there will be a change in the value of Y under this intervention. Woodward (2003) argues that this notion is captured by the conjunction of two principles (TC):
(SC) If (i) there are possible interventions (ideal manipulations) that change the value of X such that (ii) if such an intervention (and no others) were to occur X and Y would be correlated, then X causes Y.
(NC) If X causes Y then (i) there are possible interventions that change the value of X such that (ii) if such interventions (and no other interventions) were to occur, X and Y would be correlated.
Before turning to the notion of direct causation, several clarificatory comments are in order. First, note that if TC is to be even prima facie plausible, we need to impose restrictions on the sorts of changes in X that count as interventions or ideal manipulations. Consider a system in which A = atmospheric pressure is a common cause of the reading B of a barometer and a variable S corresponding to the occurrence/non-occurrence of a storm, but in which B does not cause S or vice-versa. If we manipulate the value of B by manipulating the value of A, then the value of S will change even though, in contradiction to (SC), B does not cause S. Intuitively, an experiment in which B is manipulated in this way is a badly designed experiment for the purposes of determining whether B causes S. We need to formulate conditions that restrict the allowable ways of changing B so as to rule out possibilities of this sort.
There are a number of slightly different characterizations of the notion of an intervention in the literature – these include Spirtes, Glymour, and Scheines, 2000, Pearl, 2000, and Woodward, 2003. Since the difference between these formulations will not be important for what follows I will focus on the core idea. This is that an intervention I on X with respect to Y causes a change in X which is of such a character that any change in Y (should it occur) can only come about through the change in X and not in some other way. In other words, we want to rule out the possibility that the intervention on X (or anything that causes the intervention) affects Y via a causal route that does not go through X, as happens, for example, when B in the example above is manipulated by changing the common cause, A, of B and S. I will also assume in what follows that the effect of an intervention on X is that X comes entirely under the control of the intervention variable and that other variables that previously were causally relevant to X no longer influence it – that, as it is commonly put, an intervention on X “breaks” the causal arrows previously directed into X. In the case of the A-B-S system an intervention having these features might be operationally realized by, for example, employing a randomizing device which is causally independent of A and B and then, depending on the output of this device, experimentally imposing (or “setting”) B to some particular value. Under any such intervention, the value of S will no longer be correlated with the value of B and (NC) will judge, correctly, that B does not cause S. Note that in this case, merely observing the values of B and S that are generated by the A-B-S structure, without any intervention, is a very different matter from intervening on B in this structure. In the former case, but not in the latter, the values of B and S will be correlated. It is what happens in the latter case that is diagnostic for whether B causes S.
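The contrast between observing and intervening in the A-B-S system can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the linear equations, the noise levels, the treatment of S as a continuous "storm severity" variable, and the function names are all hypothetical choices made for illustration and are not part of the account itself.

```python
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def bs_correlation(n, intervene_on_b, seed=0):
    """Correlation between barometer reading B and storm variable S.

    Assumed (hypothetical) structure: A -> B and A -> S, with no arrow
    between B and S. When `intervene_on_b` is True, B is set by a
    randomizing device that is independent of A, which "breaks" the
    arrow from A into B.
    """
    rng = random.Random(seed)
    b_values, s_values = [], []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)            # atmospheric pressure
        if intervene_on_b:
            b = rng.choice([-1.0, 1.0])    # B imposed by the randomizer
        else:
            b = a + rng.gauss(0.0, 0.3)    # B tracks A
        s = -a + rng.gauss(0.0, 0.3)       # low pressure, more storm
        b_values.append(b)
        s_values.append(s)
    return correlation(b_values, s_values)


print("observed B-S correlation:   ", round(bs_correlation(5000, False), 3))
print("B-S correlation under do(B):", round(bs_correlation(5000, True), 3))
```

Under these assumptions the observed correlation should be strongly negative, while the correlation under the randomized setting of B should be close to zero, which is the pattern that (NC) treats as diagnostic of B not causing S.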
The difference between observation and intervention thus roughly corresponds to the difference between so-called back-tracking and non-backtracking counterfactuals in the philosophical literature. The mark of a back-tracking counterfactual is that it involves reasoning or tracking back from an outcome to causally prior events and then perhaps forward again, as when one reasons that if the barometer reading were low (high) this would mean that the atmospheric pressure would be low (high) which in turn would mean that the storm would (would not) occur. Evaluated in this back-tracking way, the counterfactual “If the barometer reading were low (high), then the storm would (would not) occur” is true. By contrast, when the antecedent of a counterfactual is understood as made true by intervention, back-tracking is excluded, since, as emphasized above, an intervention “breaks” any previously existing relationship between the variable intervened on and its causes. Thus, when the barometer reading is set to some value by means of an intervention, one cannot infer back from this value to the value that the atmospheric pressure must have had. For this reason, the counterfactual “If the barometer reading were low (high), then the storm would (would not) occur” is false when its antecedent is regarded as made true by an intervention. Lewis holds that non-backtracking rather than backtracking counterfactuals are appropriate for understanding causation and the interventionist theory yields a similar conclusion. This illustrates how, as claimed above, interventions play roughly the same role as the similarity metric in Lewis’ theory and how they lead, as in Lewis’ theory, to non-backtracking counterfactuals, with arrow-breaking having some of the features of Lewisian miracles2.
What is the connection between this characterization of interventions and manipulations that are performed by human beings? I will explore this issue below but several comments will be helpful at this point. Note first that the characterization makes no explicit reference to human beings or their activities – instead the characterization is given entirely in non-anthropocentric causal language. A naturally occurring process (a “natural experiment”) that does not involve human action at any point may thus qualify as an intervention if it has the right causal characteristics. Conversely, a manipulation carried out by a human being will fail to qualify as an intervention if it lacks the right causal characteristics, as in the example in which the common cause A of B and S is manipulated. Nonetheless, I think that it is plausible (section 7) that as a matter of contingent, empirical fact, many voluntary human actions as well as many behaviors carried out by animals do satisfy the conditions for an intervention. Moreover, I also think that it is a plausible empirical conjecture that humans and some other animals have a default tendency to treat their voluntary actions as though they satisfy the conditions for an intervention and to behave, learn, and in the case of humans to make causal judgments as if their learning, behavior, and judgments are guided by principles like TC. The connection between interventions and human (and animal) manipulation is thus quite important to the empirical psychology of causal judgment and learning, even though the notion of an intervention is not defined by reference to human action.
Second, note that both SC and NC involve counterfactual claims about what would happen if certain “possible” interventions “were” to be performed. I take it to be uncontroversial that the human concept of causation is one according to which causal relationships may hold in circumstances in which it may never be within the power of human beings to actually carry out the interventions referred to in SC and NC. (In this respect the human concept may be very different from whatever underlies non-human causal cognition – section 8.) Both conditions should be understood in a way that accommodates these points: what matters to whether the relationship between X and Y is causal is not whether an intervention is actually performed on X but rather what would happen to Y if (perhaps contrary to actual fact) such interventions were to be performed.
SC and NC connect the content of causal claims to certain counterfactuals and, as such, are not claims about how causal relationships are learned. However, if SC and NC are correct, it would be natural to expect that human beings often successfully learn causal relationships by performing interventions and in fact this is what we find. But this is not to say (and SC and NC do not claim) that this is the only way in which we can learn about causal relationships. Obviously there are many other ways in which humans may learn about causal relationships – these include passive observation of statistical relationships, instruction, and the combination of these with background knowledge. What SC and NC imply is that if, for example, one concludes on the basis of purely observational evidence that smoking causes lung cancer, this commits one to certain claims about what would happen if certain experimental manipulations of smoking were to be performed.
Finally, a brief remark about an issue that will probably be of much more concern to philosophers than to psychologists: the worry that TC is “circular”. Since the notion of intervention is characterized in causal terms, it follows immediately that TC does not provide a reductive definition of causation in terms of concepts that are non-causal. I have argued elsewhere (Woodward, 2003) that it does not follow from this observation that TC is uninformative or viciously circular. Rather than repeating those arguments here, let me just observe that TC is inconsistent with many other claims made about causation, for example, claims that causal relationships require a spatio-temporally connecting causal process. So regardless of what one makes of the “circularity” of TC it is certainly not vacuous or empty.
Let me now turn to the notion of direct causation. Consider a causal structure in which taking birth control pills (B) causally affects the incidence of thrombosis (T) via two different routes. B directly boosts the probability of thrombosis and indirectly lowers it by lowering the probability of an intermediate variable, pregnancy (P), which is a positive cause of T (cf. Hesslow, 1976).
Suppose further that the direct causal influence of B on T is exactly cancelled by the indirect influence of B on T that is mediated through P so that there is no overall effect of B on T. In this case B is not a total cause of T, since there are no interventions on B alone that will change T. Nonetheless, it seems clear that there is a sense in which B is a cause, indeed a direct cause, of T.
The notion of direct causation can be captured within an interventionist framework as follows:
(DC) A necessary and sufficient condition for X to be a direct cause of Y with respect to some variable set V is that there be a possible intervention on X that will change Y (or the probability distribution of Y) when all other variables in V besides X and Y are held fixed at some value by other independent interventions.
In the example under discussion, B counts as a direct cause of T because if we intervene to fix the value of P and then, independently of this, intervene to change the value of B, the value of T will change. The notion of X’s being a direct cause of Y is thus characterized in terms of the response of Y to a combination of interventions, including both interventions on X and interventions on other variables Z. This contrasts with the notion of a total cause which is characterized just in terms of the response of the effect variable to a single intervention on the cause variable. The notion of direct causation turns out to be normatively important because it is required to capture ideas about distinctness of causal mechanisms and to formulate a plausible relationship between causation and probabilities (for details, see Woodward, 2003, ch. 2). Of course, it is a separate question whether the notion corresponds to anything that is psychologically real in people’s causal judgments and inferences. I will suggest below that it does – that it is involved in or connected to our ability to separate out means and ends in causal reasoning. It is also centrally involved in the whole idea of an intervention, which turns on there being a contrast between doing something that affects Y directly and doing something that affects Y only indirectly, through X. We will see below that even young children seem able to reason causally about the consequences of combinations of interventions.
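The difference between total and direct causation in the birth control example can be made concrete with a toy calculation. This is a minimal sketch, assuming linear structural equations with made-up coefficients chosen so that the two routes cancel exactly; the numbers and function names are hypothetical and carry no empirical weight.

```python
def pregnancy(b):
    """P as a function of B: taking the pill (b=1) lowers P (assumed coefficients)."""
    return 0.75 - 0.5 * b


def thrombosis(b, p):
    """T as a function of its direct causes B and P (assumed coefficients)."""
    return 0.25 * b + 0.5 * p


def total_effect_of_b():
    """Change in T under an intervention on B alone, with P free to respond."""
    return thrombosis(1, pregnancy(1)) - thrombosis(0, pregnancy(0))


def direct_effect_of_b(p_fixed):
    """Change in T under an intervention on B while a second, independent
    intervention holds P fixed at `p_fixed`, as in (DC)."""
    return thrombosis(1, p_fixed) - thrombosis(0, p_fixed)


print("total effect of B on T:          ", total_effect_of_b())
print("effect of B on T with P held fixed:", direct_effect_of_b(0.5))
```

With these assumed coefficients the direct contribution of B (0.25) is offset by its contribution through P (0.5 times -0.5), so a single intervention on B leaves T unchanged, while the combination of interventions described in DC reveals the direct dependence.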
Finally, let me note that both TC and DC address a very specific question: whether the relationship between X and Y is causal at all, rather than merely correlational. However, if we are interested in manipulation and control, we typically want to know much more than this: we want to know which interventions on X will change Y, and how they will change Y, and under what background circumstances – that is, we want to know a whole family of more specific and fine-grained interventionist counterfactuals connecting X to Y. We may view this more detailed information, which may be captured by such devices as specific functional relationships linking X and Y, as the natural way of spelling out the detailed content of causal claims within an interventionist framework. Such information about detailed manipulability or dependency relationships is often required for tasks involving fine-grained control such as tool use.
3. Additional Features of Interventionism. I said above that interventionist accounts are just one kind of approach in the more general family of theories that conceive of causes as difference makers. To further bring out what is distinctive about interventionism, consider the following causal structures.
(3.1) X ← Y → Z
(3.2) X → Y → Z
Let us make the standard Bayes’ net assumption connecting causation and probabilities – the Causal Markov condition CM, according to which conditional on its direct causes, every variable is independent of every other variable, singly or in combination, except for its effects. Given this assumption, both structures 3.1 and 3.2 imply exactly the same conditional and unconditional independence relationships: in both X, Y and Z are dependent and X and Z are independent conditional on Y. The difference between the structures 3.1 and 3.2 shows up when we interpret the directed edges in them as carrying implications about what would happen if various hypothetical interventions were to be performed, in accordance with DC. In particular, if 3.1 is the correct structure, under some possible intervention on Y, X and Z will change, while if 3.2 is the correct structure Z but not X will change under an intervention on Y. Similarly, 3.2 implies that under some intervention on X, both Y and Z will change, while 3.1 implies that neither Y nor Z will change. In general, if two causal structures differ at all, they will make different predictions about what will happen under some hypothetical interventions, although, as 3.1-3.2 illustrate, they may agree fully about the actual patterns of correlations that will be observed, in the absence of these interventions.
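The point that 3.1 and 3.2 agree about observed independencies but disagree about interventions can be illustrated numerically. The sketch below assumes that 3.1 is the common cause structure X ← Y → Z and 3.2 the chain X → Y → Z, which is what the predictions just described imply; the linear Gaussian equations, unit coefficients, and function names are hypothetical choices made only for illustration.

```python
import random


def sample(structure, rng, do=None):
    """Draw one (x, y, z); `do` optionally sets one variable by intervention."""
    do = do or {}
    ex, ey, ez = (rng.gauss(0, 1) for _ in range(3))
    if structure == "3.1":            # common cause: X <- Y -> Z
        y = do.get("y", ey)
        x = do.get("x", y + ex)
        z = do.get("z", y + ez)
    else:                             # chain: X -> Y -> Z
        x = do.get("x", ex)
        y = do.get("y", x + ey)
        z = do.get("z", y + ez)
    return x, y, z


def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


def partial_corr_xz_given_y(structure, n=4000, seed=0):
    """Observational partial correlation of X and Z controlling for Y."""
    rng = random.Random(seed)
    xs, ys, zs = zip(*(sample(structure, rng) for _ in range(n)))
    rxz, rxy, ryz = corr(xs, zs), corr(xs, ys), corr(ys, zs)
    return (rxz - rxy * ryz) / ((1 - rxy ** 2) * (1 - ryz ** 2)) ** 0.5


def mean_shift(structure, target, n=4000, seed=0):
    """Change in each variable's mean when `target` is set to 2 rather than 0."""
    means = []
    for value in (0.0, 2.0):
        rng = random.Random(seed)     # reuse the same noise for both settings
        draws = [sample(structure, rng, {target: value}) for _ in range(n)]
        means.append([sum(col) / n for col in zip(*draws)])
    return {name: hi - lo for name, lo, hi in zip("xyz", *means)}


for s in ("3.1", "3.2"):
    print(s, "partial corr(X, Z | Y):", round(partial_corr_xz_given_y(s), 3))
    print(s, "shift under do(Y):", mean_shift(s, "y"))
    print(s, "shift under do(X):", mean_shift(s, "x"))
```

In both structures the partial correlation of X and Z given Y should be near zero, so passive observation of this independence does not separate them. The simulated interventions do: setting Y shifts X only in 3.1, and setting X shifts Y and Z only in 3.2.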
Although an interventionist account does not attempt to reduce causal claims to information about conditional probabilities, it readily agrees that such information can be highly relevant as evidence for discriminating between competing causal structures. Indeed, as explained in Woodward, 2003, pp. 339ff, we may think of CM as a condition that connects claims about what happens under interventions to claims about conditional probabilities involving observed outcomes, thus allowing us to move back and forth between the two kinds of claims. Arguably (see Section 8) the ability to move smoothly from claims about causal structure that follow from information about the results of interventions to claims about causal structure that are supported by observations and vice-versa is one of the distinctive features of human causal cognition. In this connection, there is considerable evidence that at least in simple cases humans can learn causal Bayes nets from passive observations, interventions and combinations of the two. Indeed, for at least some tasks the assumption that subjects are Bayes’ net learners does a better job of accounting for performance than alternative learning theories.
I suggested above that an interventionist account will lead to different causal judgments about particular cases than causal process accounts. Consider cases of “double prevention” in which A prevents the occurrence of B which, had it occurred, would have prevented the occurrence of a third event C, with the result that C occurs. Cases of this sort occur in ordinary life and are common in biological contexts. For example, the presence (A) of lactose in the environment of E. coli results in the production (C) of a protein that initiates transcription of the enzyme that digests lactose by interfering with the operation (B) of an agent that (in the absence of lactose) prevents transcription. There is dependence of the sort associated with interventionist counterfactuals between whether or not lactose is present and the synthesis (or not) of the enzyme which digests it – manipulating whether lactose is present changes whether the enzyme is synthesized -- but no spatio-temporally continuous process or transfer of energy, momentum, or force between lactose and the enzyme3. Interventionist accounts along the lines of TC will judge such relationships as causal while causal process theories will not. Biological practice seems to follow the interventionist assessment, but it would be useful to have a more systematic experimental investigation of whether ordinary subjects regard double prevention relationships as causal, how they assess causal efficacy or strength in such cases, and the ease with which such relationships can be learned.
Double prevention cases suggest that energy transmission is not necessary for causal relatedness. Is it sufficient? Arguably, energy transmission between two events is sufficient for there being some causal process connecting the two. However, the information that such a process is present is not tantamount to the detailed information about dependency relationships provided by interventionist counterfactuals. This is suggested by the following example (Hitchcock, 1995). A cue stick strikes a cue ball which in turn strikes the eight ball causing it to drop into a pocket. The stick has been coated with blue chalk dust, some of which is transmitted to the cue ball and then to the eight ball as a result of the collision. In this case, energy, momentum, and “force” are all transmitted from the stick to the cue ball. These quantities are also transmitted through the patches of blue chalk that eventually end up on the eight ball. The sequence leading from the impact of the cue stick to the dropping of the eight ball is a causal process, as is the transmission of the blue chalk, and a connecting mechanism is present throughout this sequence. The problem is that there is nothing in all this information that singles out the details of the way in which the cue stick strikes the cue ball (and the linear and angular momentum that are so communicated) rather than, say, the sheer fact that the cue stick has struck the cue ball in some way or other or the fact that there has been transmission of blue chalk dust as causally relevant to whether the eight ball drops. Someone might fully understand the abstract notion of a causal process, and be able to recognize that the process connecting cue stick, cue ball and eight ball is a causal process that transmits energy, and yet not understand how variations in the way the cue strikes the cue ball make a difference to the subsequent motion of the eight ball, or that the transmission of the chalk dust is irrelevant.
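The dependence structure in a double prevention case can be written down as a pair of Boolean equations. This is a deliberately crude sketch: treating lactose presence, repressor operation, and enzyme synthesis as all-or-nothing variables is an idealizing assumption, and the function and parameter names are hypothetical.

```python
def repressor_operating(lactose_present):
    """B: the agent that blocks transcription operates only when lactose
    is absent (A prevents B)."""
    return not lactose_present


def enzyme_synthesized(lactose_present, set_repressor=None):
    """C: synthesis occurs only when the repressor is not operating
    (B prevents C).

    `set_repressor` models a separate, independent intervention that fixes
    B whatever the value of A; None means B is left to depend on A.
    """
    if set_repressor is None:
        b = repressor_operating(lactose_present)
    else:
        b = set_repressor
    return not b


# Intervening on A alone changes C, as TC requires for A to cause C.
print(enzyme_synthesized(True), enzyme_synthesized(False))

# If B is independently held fixed, changing A no longer changes C.
print(enzyme_synthesized(True, set_repressor=True),
      enzyme_synthesized(False, set_repressor=True))
```

Nothing in these equations represents a transfer of energy or momentum from A to C. The sketch only encodes which interventions would change which outcomes, and that is the information an interventionist reading takes to be sufficient for the relationship to count as causal.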
Yet this information, which is captured by interventionist counterfactuals of the sort described in TC, is crucial for manipulating whether the eight ball drops in the pocket4. As we will see, this observation has implications for primate causal understanding.
In general, then, an interventionist account predicts that when information about spatio-temporal connectedness is pitted against information about dependency relations of the sort captured by interventionist counterfactuals, the latter rather than the former will guide causal judgment. For example, if the relationship between C and E satisfies the conditions in TC, people will judge that C causes E even if there appears to be a spatio-temporal gap between C and E. Moreover, even if there is a connecting spatio-temporally continuous process from C to E, they will judge that C does not cause E if the dependence conditions in TC are not satisfied. Similarly, for the information that something has been transmitted from C to E: although chalk dust is transmitted to the eight ball, subjects will not judge that its presence causes the ball to go into the pocket because the conditions in TC are not satisfied.
Despite these observations, adherents of an interventionist account can readily acknowledge that information about causal mechanisms, properly understood, plays an important role in human causal learning and understanding. However, rather than trying to explicate the notion of a causal mechanism in terms of notions like force, energy, or generative transmission, interventionists will instead appeal to interventionist counterfactuals. Simplifying greatly, information about a mechanism connecting C to E will typically be information about a set of dependency relationships, specified by interventionist counterfactuals, connecting C and E to intermediate variables and the intermediate variables to one another, perhaps structured in a characteristic spatio-temporal pattern (cf. Woodward, 2002). Among other things, such counterfactuals will specify how interventions on intermediate variables will modify or interfere with the overall pattern of dependence between C and E. As an illustration, consider Shultz’s classic 1982 monograph in which he argues that children rely heavily on mechanism information in causal attribution. This mechanism information can be readily reinterpreted as information about interventionist counterfactuals. For example, in experiment two, subjects must decide which of two different lamps is responsible for the light projected on a wall. Here the relevant interventionist counterfactuals will describe the relationship between turning on the lamp and the appearance of a spot on the wall, the orientation of the lamp and the position of the spot, the effect of inserting a mirror in the path of transmission, and so on. Similarly, in the cue ball example, the relevant mechanism will be specified in terms of the dependence of the trajectories of the cue and eight ball on variations in the momentum communicated by the stick, the effect of intervening independently on the eight ball (e.g. gluing it to the table) and so on.
On this construal, detailed information about the operation of mechanisms is not, as is often supposed, something different in kind from information about dependency or manipulability relationships, understood in terms of interventionist counterfactuals, but rather simply more of the same: more detailed fine-grained information about dependency relationships involving intermediate variables5. An additional advantage of this way of looking at things is that it provides a natural account of how it is possible, as it clearly is, for people to learn that there is a causal relationship between C and E without knowing anything about a connecting mechanism. This is much harder to understand if, as some mechanism-based approaches claim, the existence of a causal relationship between C and E just consists in the obtaining of a connecting mechanism between C and E and the information that C causes E consists in or implies information to the effect that there is such a mechanism. By contrast, according to TC, people will judge that C causes E if they are presented with evidence (e.g. from repeated experimental manipulations) that the relevant interventionist counterfactuals hold between C and E, even if they have no information about an intervening mechanism.