
Mind changes: A simulation semantics account of counterfactuals

Srini Narayanan
ICSI and UC Berkeley
1947 Center Street
Berkeley, CA 94704
snarayan@icsi.berkeley.edu
November 3, 2010
Abstract
Counterfactuals are mental simulations of variations on a theme. They refer to imagined alternatives to something that has actually occurred. Counterfactual reasoning is basic to human cognition and is ubiquitous in commonsense reasoning as well as in formalized discourse. Counterfactuals play a significant role in other cognitive processes such as conceptual learning, planning, decision making, social cognition, mood adjustment, and performance improvement.
We present a modeling framework and results that represent the first step toward a computationally adequate cognitive model of counterfactuals. Our treatment of counterfactuals comes from independent considerations of cognitively
motivated event structure representation useful for event coordination and for language processing. Our model is able
to capture the variety, scope, and inferential richness of the psychological data and makes detailed predictions about
the neural substrate that underlies counterfactual processing.

Alternatives
1. If Ted Kennedy were alive, universal health care would have an unshakable champion.
2. If only we had left earlier, we would have avoided the traffic.
3. He almost made it to the track on time.
4. He never would have made it without my help.
5. It could've been worse.
6. If only I had ten dollars more, I could have bought that shirt.
7. If this had been an actual emergency, the signal you just heard would have been followed by official information,
news or instructions.

Counterfactuals are mental simulations of what might have been. They refer to imagined alternatives to something that has actually occurred. They play a significant role in other cognitive processes such as conceptual learning,
decision making, social cognition, mood adjustment, and performance improvement. They help us to process causal
relations by highlighting possible causal antecedents of an unpleasant outcome (for example, if only we had left
earlier, we would have avoided the storm; if only I had studied harder, I would not have flunked the test), and to
imagine better ways of proceeding in the future (Henceforth, we will leave earlier ...; ...study harder ..., and so
on). They appear to be a pervasive part of normal cognitive processing and may occur fairly often outside of conscious
awareness. Without counterfactual thinking a person would find it more difficult to avoid repetition of past mistakes,
to adjust their mood after an unpleasant event, to reason effectively about events, to learn from mistakes and failures,
and so forth.

Counterfactual reasoning

Counterfactual reasoning is basic to human cognition and is ubiquitous in commonsense reasoning as well as in formalized scientific and literary discourse. Much of the recent psychological research on counterfactuals (Epstude & Roese 2008; Roese et al. 2005; Roese & Olson 2003; Markman et al. 1993; Mandel et al. 2005; Markman et al. 2006) adopts a functional perspective. The functional perspective emphasizes the use of counterfactuals to regulate behavior. In this view, counterfactuals are seen as being closely related to goals, actions, and homeostatic control.
They are typically triggered (or activated) based on a failed goal or being in an undesirable state, and their content
relates to situational and behavioral alterations to achieve the failed goal or to restore a more desirable state. This
connection between counterfactuals and actions/goals has been demonstrated in a series of experiments (Markman
et al. 1993; Markman & McMullen 2003; Epstude & Roese 2008). As an illustrative example, (Epstude & Roese 2008) discuss an experiment where students generated counterfactual thoughts in an open-ended fashion. Of the generated counterfactuals, 75% were about personal actions and over 33% focused on goals.
Counterfactuals have also been a subject of research within philosophy and Artificial Intelligence (AI). From a logical perspective, using material implication makes the counterfactual conditional always evaluate to true, since A → B is defined as ¬A ∨ B and by definition counterfactuals make the premise false; hence the conditional is vacuously true. This led to a series of proposals to compute the meaning (in terms of truth values) of counterfactuals, of which the possible worlds model of David Lewis (Lewis 1973) spurred a lot of subsequent research. (Ginsberg 1986) related counterfactuals to belief revision, where the counterfactual is evaluated as true if it holds for maximal sets of propositions similar to the antecedent. An extension of the possible worlds model from complete theories to approximate theories was proposed by (Costello & McCarthy 1999). None of these logical accounts adequately address the variety, scope, and inferential richness of the psychological data presented earlier.
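The vacuity of the material conditional under a false antecedent can be checked mechanically. The following sketch (an illustration added here, not part of the cited treatments) enumerates the truth table:

```python
# Material implication: A -> B is defined as (not A) or B.
def implies(a: bool, b: bool) -> bool:
    return (not a) or b

# Enumerate the truth table. Whenever the antecedent A is false,
# the conditional is true regardless of B -- which is why a
# counterfactual premise (false by definition) trivializes the
# material reading of the conditional.
for a in (True, False):
    for b in (True, False):
        print(a, b, implies(a, b))

# Both rows with a false antecedent evaluate to True:
assert implies(False, True) and implies(False, False)
```

This is exactly the observation that motivated Lewis's possible worlds semantics: a non-trivial counterfactual conditional cannot be truth-functional in its antecedent.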
In general, truth conditional semantic accounts of counterfactuals lead to logical contradictions and circularities
(Goodman 1983). (Fauconnier 1997) further argues persuasively that truth conditional accounts are both functionally
inadequate and cognitively invalid. As with other phenomena in language and thought (such as metaphor, metonymy,
modals, aspect), it appears that counterfactuals rely on the mappings between the conceptual structures (or domains) to
communicate new facts and trigger inferential processes like analogy or metaphor that support specific communicative
goals, intentions and evaluations.
Some central questions in the study of counterfactual reasoning include the following.
1. Activating counterfactuals: What processes trigger the generation of counterfactuals?
2. Content of counterfactuals: Which aspects of the mental model of the world are most likely to be the subject of counterfactuals? What principles govern which aspects are changed (in the premise) and which aspects are held constant?
3. Analyzing counterfactuals: Counterfactuals are used extensively in both the physical and social sciences for
explanatory purposes as well as to highlight certain salient causal aspects. What are the principles by which a
hearer/interpreter can analyze the counterfactual implications?
4. Counterfactuals and Embodiment: Where do counterfactuals come from? When do kids generate counterfactuals? Can we begin to piece together a neurally motivated computational model of counterfactual reasoning?
This paper outlines a cognitively motivated computational model of counterfactual reasoning. We start with a
quick tour through the psychological literature on counterfactuals focusing on the accumulating evidence of the functional role of counterfactuals in human thought and behavior. We follow this discussion with a cognitive model that
is motivated by the observation that much of human planning and decision making about actions and events relies
on a fine-grained representation of events where the circuits active in the performance and monitoring of actions are
also used in a simulative mode for planning, reasoning and for making decisions. We call this framework simulation
semantics, in that it ties the performance and recognition of behavior with planning and inference about the behavior
through simulation. Section 6 applies the computational realization of simulation semantics to model the generation and analysis of counterfactuals. Section 7 outlines a possible neural realization of our computational model that

matches specific evidence on deficits in counterfactuals. Section 8 describes the model of counterfactuals in technical
detail, making connections with previous attempts within philosophy and Artificial Intelligence (AI). Section 9 illustrates the application of the computational model and counterfactual evaluation algorithm on several examples. We
close with a discussion of possible future work.

Background

Over the last two decades there has been mounting evidence about the complex relationship between counterfactuals
and human action and goals (Roese et al. 2005). Counterfactuals relate goals and outcomes with behavior, actions,
and affect. First, counterfactuals influence motivation, decisions, and emotions via the contrast effect. Contrasts pair
reality with an alternative that heightens the salience of some aspect or attribute. Second, counterfactual activation
and content depends on the type of goal. Performance goals, or goals of achievement, are distinguished from a)
affect goals, such as coping, mood-maintenance, or hedonic satisfaction, and b) communicative goals of persuasion,
rhetoric, performative, dramatic effect, etc.1 Each of these classes of goals has a rich structure and specific processes.
Third, counterfactuals exploit the connection between goals and affect (thwarting a goal in a specific way results in
a specific affective state). Many complex emotional states (regret, guilt) are often co-activated with counterfactuals.
Developmentally, counterfactuals appear fairly early and are correlated to other abilities children have with respect
to interpreting complex and fictitious events (as in pretend situations). In terms of neural correlates, it appears that
diseases and pathologies that affect the Prefrontal Cortex (PFC) directly impact the ability to generate counterfactuals.
This is consistent with evidence of PFC involvement in cognitive control (Miller & Cohen 2001; Badre et al. 2009),
event coordination (Krueger et al. 2008), and in representing complex events (Wood & Grafman 2003; Dreher et al.
2008).

3.1 Counterfactuals and contrast

Counterfactual thoughts may influence emotions and judgments by way of a contrast effect, which is based on the
juxtaposition of reality versus what might have been. For example, winning $50 feels nice, but if one came close to
winning $100 instead of $50, it does not feel quite as nice. This effect of counterfactuals on emotion and satisfaction
is an example of a widely observed psychological principle, that of the contrast effect. Contrast effects occur when a
judgment is made more extreme via the juxtaposition of some anchor or standard (Sherif & Hovland 1961). Contrast
effects can apply to any sort of judgment, including physical properties, such as heaviness, brightness, loudness, or
temperature. For example, ice cream feels especially cold immediately after sipping hot tea. A suitcase may feel
especially light if one has just been moving furniture. Contrast effects also apply to subjective appraisals of value,
satisfaction, and pleasure. Thus, by the same token, a factual outcome may be judged to be worse if a more desirable
alternative outcome is salient, and that same outcome may be judged to be better if a less desirable alternative outcome
is salient.

3.2 Goals and counterfactuals

Counterfactuals often make salient a causal relationship between actions and desired outcomes that can be used for
learning and prediction. Consider the case where one might ponder actions that could have made a job interview result
in success given that the job went to someone else. The various actions pondered could include: if only I had prepared
more, if only I had been better dressed, if only I had been more persistent, less pushy, etc. Such actions are about a
past event, and so are properly termed counterfactual. But, of course, one might generate the same actions as possible
conditionals for future action (Roese et al. 2005). So one might imagine preparing more, being well dressed for the
interview, being less aggressive than usual etc. which might propel behavior. Indeed, (Nasco & Marsh 1999) show
that manipulating counterfactual thinking (regarding an exam grade) and subsequent academic achievement (getting a
better grade in a subsequent exam) could mediate the self-reported contingent relation between actions (studying) and
success.
1 In addition to their specific relationship to counterfactuals (Roese et al. 2005), these classes have been shown to be useful in social comparison and in coping (Lazarus & Folkman 2004).

Researchers make an important distinction between upward and downward counterfactuals (Markman et al. 1993; Markman & McMullen 2003; Roese & Olson 2003; Epstude & Roese 2008; Roese et al. 2009). Upward counterfactuals compare reality to a more desirable alternative (e.g., if only I had been taller, I would have gotten more dates), while downward counterfactuals compare the current reality to a less desirable alternative (if I hadn't braked in time, I would have rear-ended the other vehicle). In terms of affect, upward counterfactuals often elicit a negative response (regrets like "if only..."), while downward counterfactuals often elicit a positive affect (being grateful, as in "thank god I avoided the worse outcome"). Upward counterfactuals seem more useful in achieving performance goals since such thoughts specify improvements to the current state.
Performance goals may also be differentiated on whether they have an acquisition focus (also called promotion focus (Roese et al. 2009)) or an avoidance focus (also called preservation focus (Roese & Olson 2003)). Acquisition focus corresponds to the promotion, advancement, achievement and regulation of positive end states. Avoidance focus, by contrast, is more concerned with the prevention of unwanted circumstances and dealing with obstacles and goal-blocking conditions.2 There is an obvious and important inferential distinction between acquisition goals and avoidance goals. To properly imagine a causal condition for acquisition, one identifies a causally sufficient condition (to fly to New York I take a plane), whereas for avoidance goals, imagining a necessary condition and removing it (to avoid a plane crash I don't fly) would prevent the unwanted circumstance from obtaining. Thus causal sufficiency is more relevant for acquisition goals while necessity appears to be more relevant for prevention goals.
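The inferential asymmetry between the two goal types can be made concrete with a toy causal model. The predicates and rules below are illustrative assumptions introduced for this sketch, not content from the cited studies:

```python
# Toy causal model: each outcome lists its sufficient causes and/or
# its necessary preconditions. All names here are illustrative.
CAUSES = {
    "arrive_in_new_york": {"sufficient": ["take_plane", "take_train"]},
    "plane_crash":        {"necessary":  ["board_plane"]},
}

def plan_acquisition(goal):
    """Acquisition focus: pick any causally sufficient condition."""
    return CAUSES[goal]["sufficient"][0]

def plan_avoidance(bad_outcome):
    """Avoidance focus: negate a necessary condition so the
    unwanted outcome cannot obtain."""
    return "not " + CAUSES[bad_outcome]["necessary"][0]

print(plan_acquisition("arrive_in_new_york"))  # take_plane
print(plan_avoidance("plane_crash"))           # not board_plane
```

The asymmetry falls out of the model: acquisition needs any one sufficient cause, while avoidance needs to block some necessary precondition.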
(Roese et al. 2005) builds on the social psychology research that characterizes affect goals into three categories.
These categories are
1. Mood repair: Mood repair refers to techniques designed to improve mood after a negative event has depressed
it.
2. Mood maintenance: Mood maintenance refers to the tendency to enjoy and thus attempts to preserve good
moods.
3. Self-protection: Self-protection is a strategy to prepare for future threats by cognitively minimizing their potential impact.
Counterfactuals are important components of rhetorical and narrative strategies aimed at communication, pedagogy, and persuasion. They can dramatize events and illuminate arguments in a vivid and informative manner that
appears to resonate with the natural human instincts for imagery and imaginative simulation. An example of the effectiveness of using counterfactuals can be evidenced by their extensive use in communicating historical events (Tetlock
& Belkin 1996) and in popular novels (http://www.uchronia.net has a bibliography).
Specifically, the use of counterfactuals heightens several aspects of communication:
1. Counterfactuals highlight specific attributes in a contrastive manner. These contrasts could be upward or downward. For instance, upward counterfactuals may make salient health care problems in the US by imagining a different system, while downward counterfactuals may highlight contrastive aspects of civil liberty or freedom of choice enjoyed in the United States.
2. Counterfactuals reinforce beliefs. This is extremely important in a globalized world where any extreme polarization potentially affects everyone. (Tetlock et al. 2000) demonstrate that people respond to the threat in
heretical counterfactuals by exaggerated declarations of moral belief or by metaphorical moral cleansing.
For instance, Christians reacted with disgust and moral outrage to suggestions that some sequence of events in the life of Jesus Christ could have been altered by accidental circumstances (what if Jesus had given in to the devil's temptations during his fast of 40 days and nights in the wilderness). Clearly, similar and possibly more consequential reactions may have resulted from Salman Rushdie's imaginings or from the Danish cartoonists' portrayals of Mohammed.
3. Counterfactuals help in communicating the evaluative stance of a speaker. (Harding 2007) analyzes situations
in which speakers express attitudes not only toward events that have happened, but also toward counterfactual
2 This distinction may have significance in cross-cultural differences in counterfactual reasoning (Roese 2006).

events; speakers communicate these attitudes by expressing an evaluative stance toward counterfactual scenarios. As many researchers have noted including (Dancygier & Sweetser 2005; Harding 2007), often in literature,
different speakers, including characters and narrators, may imagine and describe counterfactuals, which are
scenarios not realized in the story and which are regarded as unrealized by the speaker who introduces them.
Because counterfactual scenarios are often depictions of foreclosed possibilities, lost opportunities, and near
misses linked to strong feelings of relief and regret, they are evocative elements of narrative that reward readers
for their mental work with an enhanced appreciation for characters and textual themes. Counterfactuals also
encourage readers to take a participatory role in the process of judgment.

3.3 Counterfactuals and Emotion

It appears that the thwarting of personal goals and the creation of undesirable situations are central to activating
counterfactuals. Clearly these are often accompanied by negative affect such as regret, blame, or guilt. It has been
pointed out that such affective variables (such as regret) appear to be often co-activated with counterfactuals.
Indeed, certain emotions seem to be counterfactual (Landman 1993; Niedenthal et al. 1994). Eliciting these emotions requires a comparison between the actual occurrence of an event and an (often desired) possible alternative evolution. This extends to assignments of blame and responsibility and other social judgments. Some counterfactual
emotions are listed below.
1. regret (something bad happened and didn't have to).
2. relief (something bad almost happened).
3. blame (something bad happened and it was someone else's fault).
4. guilt (actor's fault which could have been prevented).
5. surprise (something good happened unexpectedly).
6. hope (something good may happen).
7. anxiety (something bad may happen).
One triggering condition for counterfactual generation appears to be the notion of "almostness" (Kahneman &
Tversky 1982). The obvious manifestation of this is in sports. It has been shown that counterfactual generation is
much more likely after a close game than a one-sided one. In a landmark study on Olympic athletes (Medvec et al.
1995), it was discovered that counterfactual emotions like regret were much higher for the silver medalist compared to
the bronze medalist. The identification of a specific small change that could have generated a more desirable alternative
seems to be more likely to trigger counterfactual thoughts than distant, large, or less specific changes.

3.4 Counterfactuals and Development

A fundamental requirement for counterfactual thinking is the ability to keep two possibilities (future or past worlds) in mind (Byrne 2005). Children as young as four years old are able to keep separate (given spatial and temporal cues)
multiple worlds (Weisberg & Bloom 2009) and use them in pretend play situations. Furthermore, four-year-olds can figure out in a pretend situation how certain outcomes could have been avoided (Beck et al. 2006). Four-year-olds can entertain realistic counterfactuals in experimental settings. In these tasks children hear a story, typically acted out
with puppets or illustrated with pictures. They are then asked a question about a counterfactual antecedent and they
must work out the consequences of this change. For example, in a story used by (Riggs et al. 1998), a character Jenny
makes a painting, which she leaves on the table in the garden while she goes into the house. While she is away the
wind blows the painting up into a tree. The counterfactual test question is "What if the wind hadn't blown, where would the picture be?" Riggs et al. found that 3-year-olds tended to give answers about the event described ("In the tree") whereas by 4 years children were able to speculate about the counterfactual alternative ("On the table"). Other
authors have used similar tasks (German & Nichols 2003; Guajardo & Turley-Ames 2004; Guttentag & Ferrell 2004)
and it appears that normal 4-year-olds can entertain counterfactuals with ease.

3.5 The Neural Correlates of Counterfactual Behavior

(Knight & Grabowecky 1995) described a man with dorsolateral prefrontal cortical damage in whom the most marked
behaviour was a complete absence of counterfactual expressions. This lack of counterfactual expression existed in
spite of the fact that the man had recently experienced emotional stressors (for example, a career setback) that are
typically associated with the emergence of conscious counterfactual thinking. The man was perseverative and socially
impaired as well. More recently, Hooker and coworkers (Hooker et al. 2000) reported that patients with schizophrenia are less likely than controls to mention counterfactual thoughts after recalling personally experienced negative
events; in addition, counterfactually derived inferences were reliably different between the schizophrenic subjects and
controls, as assessed by scores on a counterfactual inference test (CIT) developed by Roese and coworkers (Roese
1997). On the other hand the two groups did not differ on other cognitive measures such as the vocabulary subtest of
the WAIS-R, so the counterfactual deficit could not be attributed only to a general cognitive impairment. Like persons with frontal lesions, schizophrenic individuals are perseverative, socially inappropriate, and less efficient learners
when compared with healthy people. As many patients with schizophrenia typically show prefrontal lobe dysfunction,
the authors suggested that the poor counterfactual performance might be related to that feature. Unfortunately, no
data were reported on the strength of relation between performance on tests of frontal function and performance on
counterfactual tasks, so no direct evidence of involvement of the frontal lobes was obtained.
Another piece of evidence linking counterfactuals to circuits associated with Prefrontal Cortex (PFC) comes from
results on Parkinson's patients (McNamara et al. 2003). Parkinson's disease is characterized by rigidity, bradykinesia,
gait disorders, and sometimes tremors. The primary pathology involves loss of dopaminergic cells in the substantia nigra and the ventral tegmental area (VTA). These two subcortical dopaminergic sites give rise to two projection
systems important for motor, affective, and cognitive functioning. The nigrostriatal system, primarily implicated in
motor functions, originates in the pars compacta of the substantia nigra and terminates in the striatum. The mesolimbic-cortical system contributes to cognitive and affective functioning. It originates in the VTA and terminates in
the ventral striatum, amygdala, frontal lobes, and some other basal forebrain areas. Dopamine levels in the ventral
striatum, frontal lobes, and hippocampus are approximately 40% of normal values in Parkinson's disease. The degree
of nigro-striatal impairment correlates with the degree of motor impairment in affected individuals, while VTA mesocortical dopaminergic impairment correlates positively with the degree of affective and intellectual impairment. The
mesocortical dopaminergic dysfunction very probably has a negative impact on prefrontal lobe functions.
Patients with Parkinson's disease spontaneously generated significantly fewer counterfactuals than controls despite
showing no differences from controls on a semantic fluency test; they also performed at chance levels on a counterfactual inference test, while age matched controls performed above chance levels on this test. Performance on both
the counterfactual generation and inference tests correlated significantly with performance on two tests traditionally
linked to frontal lobe functioning (Stroop colour-word interference and Tower of London planning tasks) and one test
of pragmatic social communication skills.
Recent research (Camille et al. 2004; Ursu & Carter 2005) suggests that the orbitofrontal cortex (OFC) is critical
for representations of outcomes of actions and their subsequent impact on the control of behavior. (Ursu & Carter 2005)
present results from two event-related functional MRI experiments consistent with two hypotheses regarding the role
of the human OFC in guiding behavior through outcome representation: (1) counterfactual effects are manifested in
the human OFC during expectation of outcomes, such that the anticipated affective impact of outcomes is modulated
by the nature of the various possible alternative outcomes; (2) a regional specialization exists in the human prefrontal
cortex, such that affective impact of potential negative outcomes of actions is represented mainly by the lateral areas
of the OFC, while areas situated progressively more medial and dorsal on the ventral and medial PFC are specifically
involved in representing the impact of positively valenced outcomes.
Together, these studies suggest that PFC lesions impact counterfactual processing and that counterfactual
deficits are correlated with PFC function and performance on planning, multi-tasking and task coordination, pragmatic communication, and outcome expectations.

3.6 Summary

Counterfactual (CF) thinking is pervasive in everyday situations, in both linguistically and non-linguistically mediated settings. Counterfactual generation and reasoning occurs early (by four years) in development, and language allows the sharing of counterfactual thoughts.


Studies on the neural correlates of counterfactual reasoning, both in normal subjects and in pathological cases of schizophrenia and Parkinson's patients, suggest that the prefrontal cortex (PFC) is implicated in counterfactual processing. Specifically, PFC lesions impact counterfactual processing, and these deficits are correlated with PFC function and performance on other planning and decision making tasks.
In summary, counterfactuals tap into and manipulate the connection between actions, outcomes and goals (desired
outcomes). A proper understanding of counterfactual processes thus depends on a model of (the relationship between)
goals, actions, and their outcomes. We now turn to the ontological and computational underpinnings of a cognitive
model of counterfactuals.

Components of a cognitive model of counterfactuals

Proposition 1. Counterfactuals tap into the rich structure of human event and action representation. Encoding this structure provides the basis for generating and simulating the effect of counterfactual reasoning.
Proposition 2. Local perturbations to the event structure can lead to counterfactual generation. Candidate perturbations and changes are specified by goals, resources, preconditions, outcomes, and the branching dynamics associated with event structure (in terms of event parameters, composition, and alternative evolutions).
There are two components that jointly determine the cognitive generation and content of counterfactuals.
1. Inferences about actions and events involve the imaginative simulation of the event. Such simulation is tied closely to the structure and ontology of events.
2. Counterfactual generation involves simulating imagined variations in the basic model of the event. Such variations can be generated by a variety of conditions including personal goal violations, resource depletion, planning, and hypothesis generation.
In general, functional counterfactuals are generated whenever
1. There is a specific attribute that is under the control of the agent (such as the resource for an action, achieving the preconditions, or controlling the evolution of the event by performing it, preventing it, interrupting it, or terminating it).
2. The variation in an action attribute could change the outcome of the action, such as achieving a goal.
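These two triggering conditions can be sketched as a filter over candidate attribute variations. The attribute names and the outcome rule below are illustrative assumptions for this sketch, not part of the model's formal ontology:

```python
# A candidate variation is "functional" if (1) the varied attribute is
# under the agent's control and (2) the variation flips the outcome.
# Attribute names and the outcome rule are illustrative assumptions.
CONTROLLABLE = {"departure_time", "study_hours"}

def outcome(state):
    """Toy outcome model: the goal is achieved iff we left by 8."""
    return state["departure_time"] <= 8

def functional_variations(state, variations):
    for attr, value in variations:
        if attr not in CONTROLLABLE:
            continue                      # condition 1 fails
        varied = {**state, attr: value}
        if outcome(varied) != outcome(state):
            yield (attr, value)           # condition 2 holds

actual = {"departure_time": 9, "weather": "storm"}
candidates = [("departure_time", 7), ("weather", "clear")]
print(list(functional_variations(actual, candidates)))
# only the controllable, outcome-changing variation survives
```

Under this filter, "if only we had left earlier" is generated while "if only the weather had been clear" is not, matching the functional emphasis on attributes the agent controls.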
Fundamentally, counterfactuals are about the connections between circumstances, actions, outcomes and goals
(desired outcomes). They enable the creative exploration of the emotional and inferential links between actions and
goals as structured variations on a theme (Hofstadter 1979). We hypothesize that the theme is the fine-grained structure
of actions and the variations are modifications to the components of the action ontology.
How effective (impactful, convincing) a counterfactual is seems to depend on the degree to which the real world is
altered in its construction. (Roese et al. 2009) points out a similar principle at play in art appreciation. Art appreciation
seems to depend on the degree to which familiar expectations are violated. Art that does not violate any expectation
is boring; art involving a huge violation strikes many as bizarre and repugnant. Somewhere between the extremes of the boring and the bizarre lies a sweet zone of recognition coupled with mild surprise. This principle appears
to apply to counterfactuals as well (Roese et al. 2009), whether they are used by artists to influence an audience's emotions, or as persuasive arguments to convince someone of a particular point of view. Indeed, a
counterfactual that convinces the audience that some alternative might well have happened must follow a minimal rewrite rule (Tetlock & Belkin 1996). Small, minor changes to reality are acceptable, whereas bigger changes may leave the audience baffled. As psychological research on counterfactual thinking has shown, the regrets with which people chastise themselves also follow this minimal rewrite rule (Roese & Sommerville 2005). People typically
focus on just one action to alter within the counterfactual. All other aspects of reality remain within the counterfactual
exactly as they truly were. In the best stories of the alternate history genre (in which the entire story takes place in a counterfactual world), there are a few key differences between the story's setting and reality, framed by innumerable
similarities, such as the laws of physics and basic characteristics of human nature.

4.1 Mutability and the minimal rewrite rule

The minimal rewrite rule asserts that counterfactuals are often predicated on alterations of a local or specific nature
rather than alterations of more general laws or universals. For example, people will more likely generate statements
like "If he had run a little faster, he would not have missed the bus" rather than statements like "If he had been able to fly, he would not have missed the bus". Much of the psychological research on evaluations of counterfactuals has
focused on just what kinds of particular or local factors can be manipulated in acceptable counterfactuals. Kahneman
and Miller (1986) proposed these general rules of mutability:
1. Exceptions are more mutable than routines.
2. Ideals are less mutable than non-ideals. When asked to change the outcome of a card game or tennis match, subjects do so by imagining an improvement of the losing game rather than a deterioration of the winning game.
3. Reliable knowledge is less mutable than unreliable knowledge.
4. Causes are less mutable than effects.
5. The actions of the focal or attended actor in a situation are more mutable than those of a background actor.
To make these rules of mutability operational in specific situations, we need a characterization and formalization
of the structure of actions and events. Such an ontology enables these general rules of mutability to be tied to the
specific structure (choice points, attribute variations) of the entailed actions. This allows us to formally specify and
run computer simulations of counterfactual reasoning. This is the subject of the next section. Consistent with the idea
that counterfactuals are about human actions and their connections to goals and affect, we propose that the minimal
rewrite is best described as being local in the space of actions. This also allows us to make specific predictions about
the cognitive and neural architecture that supports counterfactual reasoning (a topic of Section 7).

4.2 Minimal rewrites occur in the space of event structures

The minimal rewrite rule requires a notion of a minimal change. A minimal change is usually defined as one where the
world resulting from the change is most similar to the world before the change in every respect (other than the change)
(Lewis 1973; Ginsberg 1986; Costello & McCarthy 1999). There are several obvious problems with such a notion.
First, similarity judgments are inherently contextual. Any two (distinct) worlds will share many different properties
and fail to share many others. Therefore, rather than a general notion of similarity, what is required is a notion of "similar in what way", i.e., what is a relevant similarity and what is a relevant difference.
We hypothesize that the relevant change or variation is specified by the structure of actions and events. The
structural and dynamic properties of events provide the context in which minimality is determined. Specifically, we
hypothesize the following two principles related to the imagination of counterfactuals.
Principle 1. People imagine two possibilities when they generate counterfactuals. One possibility corresponds
to the actual world and the second corresponds to a variant of the actual world. This principle is adapted from
(Byrne 2005).
Principle 2. The ontology of events and actions includes multiple possibilities or branching points in the evolution of the event. These branching points are likely candidates for generating variants or changes to reality.
Notice that this turns the minimal rewrite rule (also related to the minimally different world in possible world semantics (Lewis 1973)) from minimality in the change in the world to minimality in the space of actions and events. As
an example, the real world (in which one may have no money) is considerably different from the counterfactual world
where one is Bill Gates. However, counterfactuals such as "If only I were Bill Gates, I would have bought Google in 2002" are fairly felicitous. The reason these counterfactuals are possible is that, in the space of actions, the change
is local: it varies a single resource (money) from one value to another such that the outcome (buying a company)
becomes possible. This change is local since it is a variation on a single ontological attribute, the resource (money)
required for an action (buying). In general, our hypothesis is thus that the minimal change (rewrite) corresponds to
local alternatives and variations in the ontological attributes of an action. This ontology of actions that supports local
changes as counterfactual imaginings is the topic of the next section.
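The locality hypothesis can be made concrete with a toy sketch: if an action is represented as a bundle of ontological attributes, a minimal rewrite copies the action wholesale and varies exactly one attribute. The attribute names and values below (e.g. `resource_money`) are illustrative, not part of the formal model.

```python
# Sketch: a minimal rewrite varies one ontological attribute of an action,
# leaving every other attribute of the represented world untouched.
def minimal_rewrites(action):
    """Yield counterfactual variants of `action`, each differing in one attribute."""
    for attr, alternatives in action["mutable"].items():
        for value in alternatives:
            variant = dict(action)        # copy all other attributes as-is
            variant[attr] = value
            variant["changed"] = attr     # record the single locus of change
            yield variant

# Hypothetical "buying" action: the only mutable attribute is the money resource.
buying = {"actor": "I", "resource_money": 0, "outcome": "buy Google",
          "mutable": {"resource_money": [10**11]}}

variants = list(minimal_rewrites(buying))
# Each variant differs from reality in exactly one ontological attribute.
assert all(v["changed"] == "resource_money" for v in variants)
```

Even though the factual and counterfactual worlds differ enormously in their global state, the rewrite itself touches a single attribute, which is the sense of locality intended here.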

5 An ontology of actions and events

Actions and events are the frequent subject of human planning, language, and hypothetical reasoning. "What will happen if X does Y?", "What does X need before it can do Y?", and "If X now has Z, what action Y may have been taken?" are a few examples.
A general ontology capable of describing complex events must fulfill some essential requirements. The action
ontology and corresponding model has to be a) fine-grained to capture the wide range of possible events and their interactions; b) context-sensitive and evidential in order to adapt to a dynamic and uncertain environment; c) cognitively
motivated to allow humans to easily query and make sense of the answers returned; and d) elaboration-tolerant so that
new domain models can specialize existing representations without changing the basic primitives.

Figure 1: A rich ontology of actions and events.


Figure 1 shows a basic ontology of events. The various components are described below. In each of these cases,
we have a precise semantics in terms of the overall structure of the interacting events.
1. The Basic Structure of an Event: A basic event is comprised of a set of inputs, outputs, preconditions, effects
(direct and indirect), and a set of resource requirements (consuming, producing, sharing and locking). The
hasParameter link in Figure 1 depicts the set of parameters in the domain of the basic event type.
2. Events have a Frame Semantic Structure: Events are described in language using Frame-like relations (Fillmore
& Baker 2010; Scheffczyk et al. 2010). Frames are labeled entities comprised of a collection of roles that
include major syntactic and semantic sub-categorization information. The relation hasFrame in Figure 1 is a
many-to-many link since an individual event may involve multiple frames and a single frame could participate in
multiple events. Frames license many inferences about participants, roles, and presuppositions (Chang
et al. 2002b; Chang et al. 2002c).
3. Composite Events have a rich temporal structure and evolution trajectories: The fine-structure of events comprises a set of key states (such as enabled, ready, ongoing, done, suspended, canceled, stopped, aborted) and a
partially ordered directed graph that represents possible evolution trajectories as transitions between key states
(such as prepare, start, interrupt, finish, cancel, abort, iterate, resume, restart). Each of these transitions may
be atomic, timed, stochastic or hierarchical (with a recursively embedded event-structure). Each branch in the
evolution of an event provides a potential alternative (variant) to the real scenario.

4. Process primitives and event construals: Events may be punctual, durative, (a)telic, (a)periodic, (un)controllable,
(ir)reversible, ballistic, or continuous. Composite events are composed from a set of process primitives or control
constructs (sequence, concurrent, choice, conditionals, etc.) which specify a partial execution ordering over
events. The composeBy relation in Figure 1 shows the various process decompositions.
5. Composite processes support defeasible construal operations of shifting granularity, elaboration (zoom-in), and collapse (zoom-out), and enable focus, profiling, and framing of specific parts and participants.
6. Inter-event relations: A rich theory of inter event relations allows sequential and concurrent enabling, disabling,
or modifying relations. Examples include interrupting, starting, resuming, canceling, aborting or terminating
relations.
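As a rough summary of the components above, the basic event type can be rendered as a schema. The field names below loosely follow Figure 1, but the Python rendering is our own sketch rather than the paper's formalism.

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    """Sketch of the basic event type of Figure 1 (field names are illustrative)."""
    name: str
    inputs: list = field(default_factory=list)        # hasParameter
    outputs: list = field(default_factory=list)
    preconditions: list = field(default_factory=list)
    effects: list = field(default_factory=list)       # direct and indirect
    resources: dict = field(default_factory=dict)     # consume/produce/share/lock
    frames: list = field(default_factory=list)        # hasFrame (many-to-many)
    subevents: list = field(default_factory=list)     # composeBy decomposition
    state: str = "enabled"  # key states: enabled, ready, ongoing, done, ...

# A buying event in this schema: consumes money, requires a credit card.
buy = Event("Buy", inputs=["credit card number"],
            preconditions=["has credit card"], effects=["owns item"],
            resources={"money": "consume"}, frames=["Commerce_buy"])
assert buy.state == "enabled"
```

A static schema of this kind records what an event involves; the dynamic semantics needed to simulate it is the subject of the following sections.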
A static event ontology and schema, though, does not have the dynamic semantics to simulate the execution of
an event scenario unfolding over time, as is necessary for simulation and inference. We require a dynamic model of
events. In the next section we discuss the basic ideas of simulation semantics and also describe a computational model
of simulation semantics.

6 Simulating counterfactuals

The computational model relies on previous work on simulation-based inference (Narayanan 1997; Narayanan 1999b; Narayanan 1999a; Narayanan & McIlraith 2003; Feldman & Narayanan 2004; Feldman 2006). I will outline the basic
model and then the application of the model to counterfactuals.

6.1 Simulation Semantics: Basics
Simulation semantics hypothesizes that the mind simulates the external world while functioning in it. The simulation takes sensory input about the state of the world, together with general knowledge, and makes new inferences
that predict what might happen. Monitoring the state of the external world, drawing inferences, and acting jointly
constitute a dynamic ongoing interactive process.
The same is true in a discourse, where the sensory input is linguistic. In this case the simulation functions to
figure out what the input means and how to react on the basis of that information. The information gathered from
linguistic input is of at least the following kinds:
1. Literal information about some literal situation.
2. Metaphorical information about some abstract conceptual domain.
3. Speech act information about the purpose of the speech act and inferences to be drawn on the basis of the
semantics of speaking.
4. Interpersonal information about the other person's social position, emotional state, need to maintain face, and
so on.
5. Narrative information about the structure and purpose of the narrative being processed.
Each type of information can be seen as characterizing a different dimension of the ongoing world state being
monitored, reasoned about, and contributed to.
1. Simulation is used both in perceiving and in acting, both in listening and in speaking.
2. It is sometimes conscious, but mostly unconscious.
3. It is imaginative in nature. It involves mental imagery, including visual, auditory, and bodily imagery.
4. It is always based on bodily grounding.

5. It is always done in the context of a conceptual system and a belief system.


6. Many additional inferences are made as a result of the simulation. The additional inferences come from the
direct model changes from simulation, and also the propagation of these changes throughout the mental state.
7. In listening, linguistic input provides the parameters necessary for an adequate simulation. The next state of the
simulation depends on the current world state as you take it to be and on the analyzed input.
8. In comprehension, each simulation is based on linguistic input and epistemic state, that is, contextual knowledge
of all sorts, conceptual systems, linguistic knowledge, general world knowledge, emotional state, and beliefs.

6.2 Cognitive structures for simulation

Simulations are guided by cognitive structures and schemas. Cognitive Linguists and Semanticists have long observed
that recurring structures in perception, action, and social cognition play a large role in language and thought. We
hypothesize that many of these structures are grounded in directly embodied neural circuits tuned during our interaction
with the environment. Examples of grounded structures pertain to:
1. goals (their achievement or thwarting)
2. results, outcomes, and rewards.
3. spatial motion (schematic motor structures (such as grasp, walk) with parameters such as speed, direction, phase).
4. spatial relations (schematic image structures such as containers, orientation, and topological relations (inside, outside)).
5. paths (reified trajectories (such as linear, circular)).
6. rhythm (beats, pitch, harmonics)
7. event structures (phases of events (such as inception, ongoing, completion, suspension), viewpoints (zoom-in,
zoom-out)).
8. forceful interactions between entities (such as prototypical force-dynamic (FD) schemas of let, prevent, help, hinder).
9. emotional schemas (basic ones of fear, pain, pleasure and more complex ones such as guilt, or regret).
10. social cognition (authority, affection, abandonment, respect).
It appears that such structures, called image schemas (Lakoff 1993; Lakoff 1994; Fauconnier & Turner 2002; Lakoff & Johnson 1980; Lakoff 1987; Johnson 1987; Langacker 1991; Talmy 2000) and more recently cogs (Lakoff 2009), apply invariantly across domains and may be used to learn and structure new domains, as well as be used
to perform inference. Fundamentally, these structures appear to be non-propositional and involve dynamics (forces,
movement, harmonics), and uncertainty (in action selection, world evolutions, and in inference). Over the past several
years, the NTL group (http://www.icsi.berkeley.edu/NTL) has been constructing neurally plausible computational
models of such structures. The next section introduces such a model, its application to modeling counterfactuals,
followed by more detailed technical description about the formal properties of the simulation and inference framework.

6.3 A Computational Model of Simulation Semantics

Complex reasoning about event interactions requires not only an event description, but also a dynamic model that can
simulate the execution of the event unfolding over time. We can instantiate such a model with facts about a particular
event, enabling us to project which situations are likely or possible based on the consumption and production of
resources and the creation and elimination of states.
Our computational model of simulation semantics is based on X-nets, a structured connectionist model of actions whose formal properties extend the theory of Generalized Stochastic Petri Nets (Bause & Kritzinger 1996).

Structured connectionist models (Feldman 1990; Feldman et al. 1988; Feldman 2006) are computational models that
combine principles of neural computation with known structural constraints in the brain (such as functional circuits
and connectivity constraints).3 X-nets model unfolding actions and events and state changes (both discrete and continuous). A method for dynamic modeling of events requires two key pieces: a way of storing state and a way of changing
state. X-nets have these two main components. Places hold resources and condition state. Interrelations between state
variables are encoded in a probabilistic network which is also used for inference (for technical details, please see Section 8). Transitions are active elements that create, destroy, and test the resources and conditions encoded in Places.
Together, they provide a solution for representing the dynamics of events and the means of simulating them.
X-nets coordinate complex behaviors and are used for action as well as reasoning about actions and events. X-nets represent sequentiality as well as concurrency and synchronization, alternatives, stochasticity, and asynchronous
control (see Figure 2 and Figure 3). In addition X-nets support a variety of analysis routines including prediction,
reachability, and hypothesis disambiguation and diagnosis. Section 8 describes in technical detail the formal properties of X-nets. In this section, we will describe the functional aspects and their use in counterfactual generation and
inference.
Section 7 outlines a possible mapping of X-nets to regions in the human Prefrontal Cortex (PFC), including evidence that the coordination of complex events is encoded in a distributed fashion throughout the PFC. The specific features of events (such as details of the person, location, destination, and action) are encoded in sensory and memory circuits and regions, and a primary role of X-nets (and the PFC circuits) is to integrate this information into a complex event. In addition, our hypothesis that the neural realization of X-nets is distributed over the PFC entails that X-nets have rich reciprocal connections to subcortical and cortical reward and affect systems (including dopaminergic systems and the amygdala) and integrate social and affective knowledge into action and event coordination. Event unfoldings, such as effects of action performance that achieve specific desired states, will thus trigger the appropriate social and emotional responses.
Figure 2 represents an event as a state change (before state and after state). Places P-2 and P-3 have no tokens.
An Arc exists from P-1 to Transition T-1 representing a resource dependence. P-1 must have adequate resources
(one token by default) for T-1 to execute (also known as firing). T-1 will fire in this case, resulting in the second
frame (After). Here T-1 has consumed the resource from P-1. The Arcs from T-1 to P-2 and P-3 represent resource
production; T-1 creates resources in P-2 and P-3 when firing. Each firing thus changes the overall state of the X-net.
We call this distribution of tokens over Places a marking, in this case going from [1, 0, 0] to [0, 1, 1] (for [P-1, P-2,
P-3]). This simple, abstract model can represent a specific event. For example, Joe buys a can of soda. P-1 represents
Joe's money; in this case, one token is 1 dollar. T-1 represents the buying operation. P-2 represents Joe having a soda, and P-3 is the receipt. Joe has $1, he executes the buying operation, the $1 is consumed, and he gains a soda and
a receipt. Dynamic modeling allows us to make different inferences based on the evidence available. Had Joe had
no money, he would not have been able to buy the can of soda. In full, the X-net representation directly captures a
number of additional features of events. (See Figure 2 bottom row for some additional features).
1. Events unfold in uncertain ways. X-net transitions can be stochastic.
2. Some events take appreciable time and some do not. X-nets provide instantaneous Transitions and timed Transitions.
3. For conceptualization and display, it can be useful to abstract away subevents, collapsing an X-net as a special
Transition in another X-net. This is depicted in the graphical X-net model as a hexagon.
4. The existence of certain resources and conditions can have both positive and negative impact on an events
execution. X-nets provide inhibitor arcs for those cases where a satisfied condition should prevent a Transition
from executing.
5. Not all resources and conditions required for the execution of an event need be consumed during the execution
of the event. Enable arcs can be used to test-but-not-consume resources and conditions.
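The firing semantics of the simple event in Figure 2 can be reproduced in a few lines. The class below is a minimal deterministic Petri net covering only consume/produce arcs, not the full stochastic X-net formalism of Section 8; the place and transition names follow the Joe-buys-a-soda example in the text.

```python
class TinyNet:
    """Minimal Petri net: places hold tokens; a transition consumes from its
    input places and produces into its output places when every input place
    has at least one token."""
    def __init__(self, marking):
        self.marking = dict(marking)

    def enabled(self, transition):
        return all(self.marking.get(p, 0) >= 1 for p in transition["in"])

    def fire(self, transition):
        if not self.enabled(transition):
            return False
        for p in transition["in"]:
            self.marking[p] -= 1                          # consume resources
        for p in transition["out"]:
            self.marking[p] = self.marking.get(p, 0) + 1  # produce resources
        return True

# Joe buys a soda: P-1 = Joe's dollar, T-1 = buying, P-2 = soda, P-3 = receipt.
t1 = {"in": ["P-1"], "out": ["P-2", "P-3"]}
net = TinyNet({"P-1": 1, "P-2": 0, "P-3": 0})
assert net.fire(t1)  # the marking goes from [1, 0, 0] to [0, 1, 1]
assert net.marking == {"P-1": 0, "P-2": 1, "P-3": 1}

# Had Joe had no money, the buying transition could not fire.
broke = TinyNet({"P-1": 0, "P-2": 0, "P-3": 0})
assert not broke.fire(t1)
```

The last two lines already hint at the counterfactual use of the model: re-running the same net from a different marking directly answers "what if Joe had had no money?".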
3 (Feldman 2006) has a detailed description of the modeling paradigm and its uses.


Figure 2: A simple event. The top row shows the basic model of an event as a state consumption and production
system. The active element (transition) changes the state of the system (distribution of activation over the network)
from a pre (see the left column top row) to a post (right column top row) state. The bottom row shows additional
features including stochastic transitions, hierarchies of actions (hexagon), consumption, production, inhibition, and
enabling conditions (different types of arcs). See text for further details.
All of these features combined provide the building blocks to represent complex event structures. At their
core, X-net Transitions represent simple events. We are able to chain several events together through their common
conditions and resources to represent a larger scenario. Figure 3 shows some possible compositions of simple events
into the larger coordination structures.
Figure 3 shows the compositional primitives described in the ontology as X-nets. X-nets perform actions and
inferences via bindings that activate other sensory, motor, and memory circuits. Every action node is preceded and
followed by a state node, with activation spreading from states to actions to states. Timing nodes coordinate the
durations of states and actions (which may be instantaneous or elongated). Iterated actions are formed by loops from
the state following an action to the state preceding the action. Complex actions can be decomposed to a coordinated
arrangement of subactions (for more information on the action coordination structure, see Figure 4). Conditional
actions are formed by gatings. Conditional Choice nodes have outputs going to two or more other nodes, with gatings
that determine the choices, perhaps probabilistically. Actions typically have initial and final states, initiating and
concluding actions, central actions, and may have purposes. A purposive action is one with a desired state. The
purpose is met if the desired state is active after the central action, and if so, the action is concluded. Each action
can be neurally bound to specific nodes of another X-net, to produce quite complex actions. (Narayanan & McIlraith 2003) describes the expressive power of the compositional primitives.
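The compositional primitives (here, sequence and probabilistic choice at a branch point) can be illustrated with a toy interpreter. This is a sketch of the control constructs only, not the structured connectionist implementation, and the event names are invented.

```python
import random

def run(event, rng):
    """Recursively simulate a composite event; returns the sequence of
    primitive actions performed along one evolution trajectory."""
    kind = event[0]
    if kind == "act":            # primitive action
        return [event[1]]
    if kind == "seq":            # sequential composition
        trace = []
        for sub in event[1:]:
            trace += run(sub, rng)
        return trace
    if kind == "choice":         # probabilistic gating at a branch point
        probs, branches = event[1], event[2:]
        return run(rng.choices(branches, weights=probs)[0], rng)
    raise ValueError(kind)

# A composite event: leave home, then either walk or take the bus, then arrive.
commute = ("seq", ("act", "leave home"),
           ("choice", [0.5, 0.5], ("act", "walk"), ("act", "take bus")),
           ("act", "arrive"))
trace = run(commute, random.Random(0))
assert trace[0] == "leave home" and trace[-1] == "arrive"
assert trace[1] in {"walk", "take bus"}
```

Each `choice` node is a branch point in the evolution of the event, and in our account it is exactly these branch points that supply candidate variants for counterfactuals.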
Figure 4 models the cognitive control structure of a complex event (Narayanan 1997). The controller graph captures the key transitions and states that comprise a complex event, action, or process. The controller is also useful to
connect events related by an unfolding trajectory. Figure 4 shows a general event controller, where the event in focus
is represented with three components, the starting action, the ongoing state, and the finishing action (shown contained
in the dashed-line hierarchical transition). Other events related to the focal event are linked according to the design


Figure 3: Compositional primitives of the event ontology and their x-net model. The basic compositional primitives
cover sequential and concurrent actions, branching and coordination, and hierarchical decomposition.
shown. For example, an event that suspends the focal event will be linked so as to remove the ongoing control token
when fired, preventing the focal event from finishing. (Note: each hierarchical transition represents a sub-net that can
be expanded; thus our suspend is bound to the event that is suspending the focal event.) The control states (ready,
suspended, stopped, etc.) are added to connect the related events and to mark event evolution progress; they assist in
control, monitoring, and inference about the current phase an event may be in. (Not all relations shown may be used
in each description; the unused portions of the design are collapsed.)
Figure 5 shows the X-net model of some of the essential features of the event ontology. Transitions may have a delay or be instantaneous (Figure 5a) and model the consumption, production, and temporary use of resources (Figure 5b, c, and d). They also model the use of energy and effort (Figure 5e), enabling based on preconditions and goals (Figure 5f), and different control and execution patterns, such as periodicity (Figure 5g and h) and the ability to reset the process after its termination so that it can restart on a new execution event (Figure 5i).
In general, the X-net model formally encodes and simulates a set of event structure features. The following is a
partial description showing how essential features of actions and events are captured in X-nets.
1. Precondition: translates into an incoming Place p (with a directed arc from p to t), with Capacity cp = 1. By
default, the arc connecting p to t (I-[p,t]) is an Enable arc (s.t. I+[p,t] = I-[p,t]). If flagged as consuming, the arc
will be a standard Resource arc. If set as negative, the arc is an Inhibitor arc. The arc weight is 1 (I-[p,t] = 1).
2. Resource-In: translates into an incoming Place p. By default, Capacity cp is not set (i.e. infinity), and the arc
weight is 1 (I-[p,t] = 1), unless otherwise specified by max and amount, respectively. If flagged as test, the arc
is set to be an Enable arc. If flagged as negative, the arc is set to be an Inhibitor arc.
3. Effect: is similar to a Precondition, but instead translates into an outgoing Place p (with a directed arc from t to
p), with Capacity cp = 1 and arc weight 1 (I+[p,t] = 1).


Figure 4: An event graph (controller) encodes the structure of events. The graph models the temporal evolution and
possible trajectories of events. The controller is used both for the coordination and top down cognitive control of
complex and abstract actions as well as in reasoning about these actions.
4. Resource-Out: translates into an outgoing Place p. Again, by default, Capacity cp is not set, and the arc weight
is 1 (I+[p,t] = 1), unless otherwise specified by max and amount.
5. An event Duration can be directly translated into the rate parameter of t. The existence of a duration implies
Transition t is timed; the lack of a duration parameter implies t is immediate.
6. Input, Output, and Grounding parameters (and the frameSet attribute of the aforementioned parameters) are not
taken into account by the simulation firing rule, and are thus not mapped over to the simulatable model. (Inputs
are, for example, credit card numbers needed for a Buying event. The Precondition of having a credit card is
sufficient for simulation; the exact number is not necessary.)
7. Actions may be repeatable (like rub or walk), in that the enabling conditions for the initiation of the action may obtain at its completion, or non-repeatable (like fall), where the terminating condition (being supine) may inhibit the subsequent re-initiation of the action by disabling its inception.
8. Actions are ready to be initiated (assuming other conditions and resources are available) as long as certain
goals (desired states) do not obtain. The inactivation of the goal state is thus among the preconditions for the
entailed action that achieves that state. There is an inhibition arc from the goal of an action to the transition that
initiates the action. At the successful completion of the entailed action, the goal state obtains, and the action is
inhibited as long as the goal obtains. This process is called goal based enabling of the action, since the action is
disinhibited when the goal is not achieved, and inhibited when the goal is achieved.
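The arc semantics described above can be sketched directly, including goal-based enabling (item 8): an inhibitor arc from the goal place blocks the action once the goal obtains, while enable arcs test but do not consume their places. This simplifies the full X-net firing rule of Section 8, and the place names are illustrative.

```python
def enabled(marking, t):
    """A transition is enabled only if every consume/enable input place holds
    a token and every inhibitor input place is empty (e.g., the goal state
    does not yet obtain)."""
    ok_in = all(marking.get(p, 0) >= 1
                for p in t.get("consume", []) + t.get("enable", []))
    ok_inh = all(marking.get(p, 0) == 0 for p in t.get("inhibit", []))
    return ok_in and ok_inh

def fire(marking, t):
    assert enabled(marking, t)
    m = dict(marking)
    for p in t.get("consume", []):
        m[p] -= 1                      # resource arcs consume tokens
    # enable arcs test but do not consume; inhibitor arcs only test for absence
    for p in t.get("produce", []):
        m[p] = m.get(p, 0) + 1
    return m

# Goal-based enabling: the action that produces "light" is inhibited once
# "light" obtains, and its precondition ("lamp fixed") is tested, not consumed.
act = {"enable": ["lamp fixed"], "inhibit": ["light"], "produce": ["light"]}
m0 = {"lamp fixed": 1, "light": 0}
assert enabled(m0, act)
m1 = fire(m0, act)
assert m1["light"] == 1 and m1["lamp fixed"] == 1
assert not enabled(m1, act)            # goal achieved, so the action is inhibited
```

The final assertion is the disinhibition pattern of item 8: the action is available exactly as long as its goal state is inactive.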


Figure 5: Basic features of events. Panels: a) durative, b) resource consuming, c) resource producing, d) resource lock-release, e) effort, f) goal-based enabling, g) inherently periodic, h) inherently aperiodic, i) resettable.


X-nets have been used in language acquisition and use (Narayanan 1997; Chang et al. 1998; Narayanan 1999b; Chang et al. 2002a; Sinha 2008), in modeling complex biochemical processes (Makin 2008), and in modeling the distributed operational semantics of web services (Narayanan & McIlraith 2003). We now turn to the use of the X-net model in generating and analyzing counterfactuals.

6.4 Using the X-net action model for counterfactuals

Recall that in our theory, counterfactuals are generated whenever


1. There is a specific attribute that is under the control of the agent, such as the resource for an action, achieving the preconditions, or controlling the evolution of the event (performing the event, preventing it, interrupting it, terminating it, etc.).
2. The variation in an action attribute could change the outcome of the action such as achieving a goal.
Let us look at some possibilities given the X-net model of actions.
Varying Resources:
The presence or absence of resources determines whether an action is performed or not. Resources (as shown
in Figure 5) can be consumed during an action (like energy), produced during an action (receiving money), or
locked and released during an action (such as a room during a meeting). Each of these types of resources can be
varied or changed to generate counterfactuals.
1. If I had more money, I could have gone to the game. (consumption)
2. If I had more energy, I could have completed the marathon. (consumption)
3. If we had more reservoirs, the city wouldn't have flooded. (lock-release) (Figure 5d).

4. If we had more space, we could have held the wedding at home. (lock-release) (Figure 5d).
5. If we could produce more electricity, we could meet demands. (produce)
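Resource-based counterfactuals fall directly out of the marking: checking the same action against a marking that differs in a single resource place yields the counterfactual outcome. The encoding below (a cost of 20 money units for attending the game) is an invented illustration of example 1.

```python
def can_fire(marking, needs):
    """An action can fire when every required resource place holds enough tokens."""
    return all(marking.get(p, 0) >= n for p, n in needs.items())

# Going to the game consumes 20 units of money; the factual marking has only 5.
go_to_game = {"money": 20}
factual = {"money": 5}

# Counterfactual variant: vary the single resource attribute, keep the rest.
counterfactual = dict(factual, money=25)

assert not can_fire(factual, go_to_game)       # in reality, the action fails
assert can_fire(counterfactual, go_to_game)    # "If I had more money, ..."
```

The same pattern covers the consumption, production, and lock-release examples: each varies one resource place and re-simulates.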
Varying Preconditions
Achieving the preconditions of an action makes the performance of the action possible. Conversely, removing a precondition may make the action impossible. Both types of variations are readily generated.
1. If only I had not opened the gate, the dog would not have run out.
2. If only you had not dropped the banana peel, the old man would not have fallen.
3. If only I had fixed the lamp, there would have been more light.
4. If only I had removed the vase, it would not have been toppled.

Figure 6: Resource- and precondition-based counterfactual examples. The top row shows the addition and removal of resources that change the outcome; the bottom row shows the addition and removal of preconditions that change the outcome.
Imagining Preventions
Preventing the occurrence of an event or action clearly alters the outcome and is frequently the subject of
counterfactuals. Prevention is also an important linguistic construct in modals, causatives, and force-dynamics
(Talmy 1988).
1. If only I had stopped her from leaving, we would be married now.
2. If only I had prevented the leak, the tank would not be flooded.
3. If only I had let her go, she would have made the bus in time (removing)
4. If only the professor let the student take the course, she would have graduated. (removing)
Imagining Alternative Choice Points
(Byrne 2005) identifies action vs. inaction as a specific choice point that governs counterfactual generation.
The model of event structure presented here offers a more elaborate branching structure and more choice points.
For instance, actions can be scheduled but canceled, prepared (all preconditions satisfied) but not performed, performed up to a point and then aborted, or interrupted while being performed and then suspended, resumed, or aborted and restarted. All these choice points readily offer the possibility of generating counterfactuals.
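The choice points above can be sketched as a transition table over the controller states. The state and transition names follow the event controller, but the table itself is our simplification and omits several arcs of the full graph.

```python
# Simplified event controller (Figures 4 and 7) as a transition table:
# (current state, transition) -> next state.
CONTROLLER = {
    ("ready",     "prepare"): "enabled",
    ("enabled",   "start"):   "ongoing",
    ("ongoing",   "suspend"): "suspended",
    ("suspended", "resume"):  "ongoing",
    ("ongoing",   "finish"):  "done",
    ("ongoing",   "stop"):    "stopped",
    ("enabled",   "cancel"):  "canceled",
}

def step(state, transition):
    """Apply a controller transition; returns None if its presupposition fails."""
    return CONTROLLER.get((state, transition))

# "Resume" presupposes the event was ongoing and then suspended:
assert step("suspended", "resume") == "ongoing"
assert step("ongoing", "resume") is None   # cannot resume an unsuspended event
```

Every entry in the table is a potential counterfactual branch point: an event that was in fact suspended and never resumed supports "if the talks had continued", because the resume transition was available from that state.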

Figure 7: Counterfactual alternatives based on the event control graph (control states: Ready, Enabled, Ongoing, Suspended, Stopped, Done, Canceled, Undone; transitions: Prepare, Start, Suspend, Resume, Iterate, Stop, Restart, Cancel, Finish, Undo, Enable, Disable). Consider the counterfactual example "If only the diplomatic efforts had resumed, we would be closer to peace." The transition to resume presupposes that the talks were ongoing but suspended at the time of reference. Furthermore, the absence of resumption implies that the peace talks are still suspended or aborted, the other possible trajectories from the suspended state of the controller graph.
1. If the talks had continued, we would have reached an agreement. (suspended and not resumed)
2. If we had stopped talking, we would have been able to listen. (action not suspended)
3. If we had canceled the game, we could have avoided getting wet. (action not canceled)
4. If the intifada had not restarted, peace talks would have continued. (one action interrupts another).
Figure 8 shows the X-net model for the situation described in the last sentence (Sentence 4). The main idea here is that the two events in focus (the Intifada and the peace talks) are related through connections between their controller graphs. Shown in Figure 8 is the example where two processes, (a) the Intifada restarting and (b) the peace talks continuing, interact with each other. In the example, both in reality and in the counterfactual, the background condition is that the Intifada is suspended and the peace talks are ongoing. Note that this is an inference from the event structure, where restarting requires a context in which the process is suspended and continuing requires a process that is ongoing. In the actual evolution, the Intifada restarts and the peace talks suspend. In the counterfactual evolution, the Intifada remains suspended and the peace talks continue unaffected. The facts that a) the background condition (the Intifada is suspended) is part of the fine-grained action structure and b) the background forms a shared context for both the actual and counterfactual worlds (thus enabling interactions between the two worlds) (Pearl 2000) are crucial to the semantics (generation and analysis) of counterfactuals. Section 8 formalizes this in a computational model of counterfactuals.
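The shared-background structure can be sketched as two runs of the same simulation from one initial state, differing only in whether the restart transition fires. This is a toy rendering in the spirit of Pearl's treatment, not the Section 8 formalization; the state encoding is invented for illustration.

```python
def simulate(background, intifada_restarts):
    """Evolve the two linked controller graphs from the shared background state.
    The restart of event 1 (the Intifada) suspends event 2 (the peace talks)."""
    state = dict(background)
    if intifada_restarts:
        state["intifada"] = "ongoing"    # Restart: suspended -> ongoing
        state["talks"] = "suspended"     # inter-event link: restarting suspends the talks
    return state

# Shared background: the Intifada is suspended and the peace talks are ongoing.
background = {"intifada": "suspended", "talks": "ongoing"}

actual = simulate(background, intifada_restarts=True)
counterfactual = simulate(background, intifada_restarts=False)

assert actual["talks"] == "suspended"
assert counterfactual["talks"] == "ongoing"   # "peace talks would have continued"
```

Because both runs share the background marking, the counterfactual inherits everything about reality except the single varied transition, which is exactly the minimal rewrite in the space of actions.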
Imagining Obligations
Obligations are expected courses of behavior that impose requirements for agents to act in specific ways. The requirement can come from moral, social, or cultural sources, and the force of the obligation can vary along all these dimensions. Whenever there is an obligation (such as respecting elders or not cheating in exams), there appears to be the possibility of imagining both the obligated and the forbidden (not respecting elders, cheating) possibilities (Byrne 2005). This allows for the variation of the forbidden possibility in the generation of counterfactuals. The forbidden possibility is often changed to the obligated one in generating counterfactuals
that change the outcome of an event.
[Figure 8 graphic: two identical X-net controller graphs (states Ready, Enabled, Ongoing, Suspended, Stopped, Done, Canceled, Undone; transitions Prepare, Start, Suspend, Resume, Restart, Iterate, Stop, Finish, Cancel, Undo, Enable, Disable), one labeled Event 1: Intifada restarts and one labeled Event 2: Peace talks suspended.]

Figure 8: One event can impact another through the controller graph. Shown is the example where two processes,
(a) the Intifada restarting, and (b) the peace talks continuing, interact with each other. In the example, both in reality
and in the counterfactual, the background condition is that the intifada is suspended and the peace talks are ongoing.
Note that this is an inference from the event structure: restarting requires a context in which the process is suspended,
and continuing requires a process that is ongoing. In the actual evolution, the Intifada restarts and the peace talks are
suspended. In the counterfactual evolution, the intifada remains suspended and the peace talks continue unaffected.
1. If he had taken care of his parents, they would not have been so lonely.
2. If he had not cheated in his exams, he would still be in school.
3. If she had showed up to the interview on time, she would have got the job.
4. If Marion Jones had not taken steroids, she would not have won so many Olympic gold medals.
Imagining Enabling/Disabling Conditions
1. If only I had stopped her from leaving, we would be married now.
2. If only I had prevented the leak, the tank would not be flooded.
3. If only I had let her go, she would have made the bus in time (removing)
4. If only the professor let the student take the course, she would have graduated. (removing)

6.5 Semifactual Alternatives

Sometimes, the variation of a specific attribute is made to highlight the irrelevance (or limited relevance) of that
attribute to changing the outcome. Such expressions use the even-if construction in language. Any of the action and
event attributes can be the specific variant in the even-if construction.
1. Even if we had stayed together then, we would have broken up by now.
2. Even if I had taken the higher paying job, I would not have been able to afford the house.
3. Even if it had been sunny, the game would have been canceled.
4. Even if it had stopped raining, the levee would have collapsed.
Figure 9: Consider the sentence Even if I had given you the $10, you wouldn't have been able to buy the train ticket.
The buying-a-ticket action requires a certain amount of resource. Both the real and counterfactual alternatives are
simulated, but the outcome doesn't change. This is the state of affairs shown in the left column of the figure.
Now consider the alternative Even if you had loaned me the $10, you would have been able to buy the train
ticket. This says that the loan would not have changed the outcome of the action, and thus points out the irrelevance
of making the loan to goal achievement (taking the train). This is the situation in the second column of the figure.
Figure 9 shows two situations where the even-if construction is used without changing the outcome. In the first
case, the irrelevance of an action (my loaning you $10) to changing a negative outcome (you still not having enough
money to buy a train ticket) is highlighted and used to argue against my lending you money. In the other case, I am
trying to get money from you and use the even-if construction to inform you that your lending action is irrelevant to
a positive outcome: you would achieve your goal (taking the train) anyway.
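A semifactual of this kind can be checked mechanically: simulate the outcome function on both the actual and the varied input and test that the outcome is unchanged. This is our own toy sketch; the ticket price and dollar amounts are hypothetical stand-ins.

```python
def can_buy_ticket(money, price=25):
    """Buying requires the resource level to meet the price (hypothetical: $25)."""
    return money >= price

def is_semifactual(outcome_fn, actual_input, varied_input):
    """'Even if': the counterfactual variation does not change the outcome."""
    return outcome_fn(actual_input) == outcome_fn(varied_input)

# "Even if I had given you the $10, you wouldn't have been able to buy the
# ticket": with $10 in hand, an extra $10 still falls short of the $25 price.
print(is_semifactual(can_buy_ticket, 10, 20))  # True (both outcomes are False)
```

A variation that does flip the outcome (say, a $20 gift bringing the total to $30) would fail this test, and the sentence would be a plain counterfactual rather than a semifactual.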

6.6 Concessive conditionals

Consider the following sentence:


Even if he had committed a crime, they would have voted for him (from (Dancygier & Sweetser 2005)).
Clearly, this sentence entails that the commission of a crime by this specific individual is irrelevant to vote getting,
and that the candidate would have obtained the vote whether or not he committed a crime. But there seems to be more to
the story. (Dancygier & Sweetser 2005) argue that concessive conditionals (sentences of the form Even if P, Q) like the
one above communicate the atypical nature of the implication (P => Q) holding. Indeed, they point out that it is
usually the case that P => not-Q. Thus, in the example above (P = commit crime, Q = vote for criminal), the normal
case would be If someone commits a crime, then people will not vote for him. The concessive conditional above thus
sets up an atypical situation in which the normal expectation is violated and an unusual situation is asserted in which
people still vote for the criminal candidate.
Our interpretation of this kind of concessive counterfactual is that it points to an unusual situation in which a
non-canonical alternative cause becomes highly salient for the outcome. The normal outcome (criminals don't
get votes) is reversed in the new situation: the counterfactual highlights the determinative impact of a usually
less salient cause (for example, someone's ethnicity) that is able to reverse the outcome (this criminal gets votes) in this
instance.
Figure 10 shows two situations where the even-if construction is used without changing the outcome (getting votes).
The right side corresponds to the expected network, in which the commission of a crime inhibits voting for the candidate.
Specifically, the following aspects come out of our model of concessive conditionals.
[Figure 10 graphic: the even-if network (left), in which criminal(x) and other causes combine through a NOISY-OR to yield vote and elected(x), and the default network (right), in which vote and elected(x) carry the precondition Voting.enable & not(crime).]

Figure 10: Consider the sentence Even if he had committed a crime, they would have voted for him (from (Dancygier
& Sweetser 2005)). The usual situation (shown on the right), where a criminal does not get votes (shown as an
inhibitory precondition), is violated in this case, which asserts that there is an alternative cause (shown on the left of
the figure) that is salient enough here to override the default expectation. A simple example of the override is the
generalized NOISY-OR function (see text for details).
1. Concessive conditionals often highlight the relative importance of normally (in the default situation) less salient
factors for an outcome. In the example above, this could be the background of the candidate, his past deeds, his
ethnicity, or any number of other factors that could override the fact that he has committed a crime.
2. Concessive conditionals specify the extreme case of the specific value of the changed parameter that still maintains the outcome. This aspect is shared with all semifactual interpretations. So in a clearly scalar case, one
could assert something like Even if the walls had been 10 feet higher, the water would have flowed into the
city. This often implies that the walls would have needed to be more than ten feet higher for the water to have
been stopped. Thus the conditional holds not just for the situation described but for a whole range of situations
that are less likely to change the outcome than the one described. (Dancygier & Sweetser 2005) suggest this is
the scalar interpretation set up by the even construction.
3. In Figure 10, the combination of factors that impact the decision to vote is captured by a combination function,
here a NOISY-OR (Pearl 1988). The noisy-OR function models exception independence of causes (whatever
inhibits one cause from producing the effect is independent of what inhibits other causes from producing the
effect). The notion of exception independence is simple and natural in many domains, computationally attractive
(it has a closed-form solution), and has proven useful in psychological accounts of causation (Cheng 1997). It
is an empirical question how useful this particular combination function is in counterfactual interpretations.
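The generalized noisy-OR has a simple closed form: the effect fails only if every active cause is independently inhibited. A minimal sketch of the concessive-conditional reading follows; the link probabilities are invented for illustration.

```python
def noisy_or(link_probs, active):
    """P(effect) under noisy-OR: each active cause i independently produces
    the effect with probability link_probs[i]; inhibitions are independent."""
    p_all_fail = 1.0
    for p, is_active in zip(link_probs, active):
        if is_active:
            p_all_fail *= (1.0 - p)
    return 1.0 - p_all_fail

# Hypothetical numbers: the default route to votes (no crime, p=0.8) is
# switched off because a crime was committed, but a normally less salient
# cause such as shared ethnicity (p=0.9) is on and keeps P(vote) high.
print(noisy_or([0.8, 0.9], [False, True]))  # approximately 0.9
```

With the default cause off, the alternative cause alone sustains the outcome, which is exactly the override of the default expectation that the even-if sentence asserts.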

6.7 Simulation and Temporal Order

Inherent in simulation semantics is the notion that people imagine events in a temporal order congruent with their
occurrence in the world. This predicts that, all other things being equal (salience of events, impact, etc.), people
are more likely to change the more recent of two events. Consider a sequence of connected events: John opened
the door, went outside, played in the dirt, and got hurt. The simulation theory predicts that people are more likely
to think If only he hadn't played in the dirt (the more recent event) than If only John had not gone outside (the less
recent event). Locality in the space of event unfoldings makes the recency effect a default entailment of the simulation
semantics model of counterfactuals.


(Byrne 2005) and colleagues have performed experiments suggesting that there is indeed a recency effect in counterfactual generation. Their test scenario involved imagining two individuals on a game show. Each is asked
to pick a square containing a blue or red sports car. If they both pick the same color (red or blue), they
each get to keep the car they picked. If they choose different colors, they get nothing. Now suppose the first
person, John, chooses red. Then Jack, the second, chooses blue. When asked to complete the sentence The players would
have won if only ..., most people tended to say The players would have won if only Jack had picked a red car, even
though the choices for John were equally likely.
However, our model predicts that the recency effect is defeasible and can be overridden, at least in instances where
there is a highly salient precondition or resource. For instance, if you go camping and stay an extra
day you didn't plan for, the initial resource of not having extra food or water (which may be the usual practice) may
be a more likely source of counterfactuals (If only I had the usual extra food) than the most recent action (If only we
hadn't decided to stay longer). More generally, our framework of tying the fine-grained ontology of actions and events
to the dynamic simulation model suggests many new experiments on the trade-offs between recency and other factors
relevant to counterfactuals.
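This trade-off can be phrased as a scoring sketch (our own illustration, not a fitted model): by default the most recent event scores highest, but an extra salience term, for instance for a missing precondition or resource, can promote an earlier event. The salience values below are arbitrary.

```python
def mutation_candidate(events, salience=None):
    """Pick the event a counterfactual is most likely to undo:
    recency rank plus an optional salience bonus (arbitrary units)."""
    salience = salience or {}
    best = max(range(len(events)),
               key=lambda i: i + salience.get(events[i], 0.0))
    return events[best]

story = ["opened the door", "went outside", "played in the dirt"]
print(mutation_candidate(story))  # recency default: the last event

camping = ["took no extra food", "decided to stay an extra day"]
# A highly salient missing resource overrides recency:
print(mutation_candidate(camping, {"took no extra food": 5.0}))
```

The relative size of the salience bonus against the recency rank is precisely the quantity the experiments suggested above would need to estimate.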

6.8 Acquisition focus versus avoidance focus

Recall from Section 2 that performance goals can be about acquiring a desired object or reaching a desired state. This
is termed acquisition focus and is differentiated from the case where the goal is the avoidance of a bad outcome or
of unwanted circumstances. Acquisition focus corresponds to the promotion, advancement, achievement, and
regulation of positive end states. Avoidance focus, by contrast, is concerned with the prevention of unwanted
circumstances and with dealing with obstacles and goal-blocking conditions.

[Figure 11 graphic: two event schemas (Enabled, Prepare, Start, Ongoing, Finish). The top row is driven by a Desired state; the bottom row binds to the specific schema for avoidance and is driven by an Avoidable state.]
Figure 11: Acquisition focus versus avoidance focus. Notice that with a desirable state in mind, not having the
desirable state is sufficient to execute the action (top row), which, if successful, establishes the desired state. With an
undesirable (avoidable) state (bottom row), the presence of the state triggers the avoidance-focus action, and the
absence of the state is necessary for the successful completion of the action.
Inferentially, if a desired state is my being in Berkeley, I will perform actions that will take me there (such as taking
a flight). Thus any action or causal condition that achieves the goal state is sufficient for the acquisition focus. By
contrast, in order to avoid an undesirable state it is necessary to inhibit every action that might lead to the undesirable
state. So if I wish to avoid flying through a specific airport (like Chicago in winter), it is necessary to avoid all flights
with a stopover in Chicago.
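The asymmetry is a one-line logical difference, sketched below in our own schematic form: for acquisition focus, some achieving action is sufficient (an OR over options), while for avoidance focus it is necessary that no risky action occurs (an AND over negations). The flight names are invented for illustration.

```python
def acquisition_satisfied(actions_taken, achieving):
    """Acquisition focus: ANY single achieving action suffices for the goal."""
    return any(a in achieving for a in actions_taken)

def avoidance_satisfied(actions_taken, risky):
    """Avoidance focus: EVERY risky action must be absent."""
    return all(a not in risky for a in actions_taken)

# Getting to Berkeley: any one flight to the Bay Area is enough.
flights_to_bay = {"direct to SFO", "via Denver", "via Chicago"}
taken = {"via Denver"}
print(acquisition_satisfied(taken, flights_to_bay))  # True

# Avoiding Chicago in winter: every Chicago stopover must be ruled out.
print(avoidance_satisfied(taken, {"via Chicago"}))  # True
```

The sufficiency/necessity difference also shapes the counterfactuals each focus invites: acquisition failures suggest adding an action (If only I had taken a flight), avoidance failures suggest deleting one (If only I hadn't flown via Chicago).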
In summary, X-nets appear to have the basic structure and dynamic semantics to capture many aspects of counterfactuals. We now turn to a possible neural realization of X-nets, based on the converging evidence pointing to the
Prefrontal Cortex as having a central role in coordinating complex events.

7 Mapping to the Neural Architecture

Figure 12: Brodmann Area Labeling of the Lateral Prefrontal Cortex (taken from (Badre 2008)).
How does our X-net model fit with the findings on deficits in counterfactual generation described in Section 3.5
which implicates the Prefrontal Cortex (PFC) as having a central role in counterfactual processing? Given the neural
evidence on PFC involvement in counterfactual deficits, we outline a detailed model to support our hypothesis that
counterfactuals depend on the coordination of complex events that relate goals and outcomes with behavior, actions,
and affect. Specifically, we propose that a) PFC circuits are involved in the monitoring and cognitive control of
complex event structures and b) counterfactuals depend on PFC circuits coordinating complex events. We start with a
description of the architecture of the PFC. We then outline an X-net mapping to the PFC along with evidence (primarily
from imaging studies) supporting the hypothesized mapping.

7.1 The Architecture of the Prefrontal Cortex (PFC)

Figure 12 shows a lateral view of the Prefrontal Cortex along with the main divisions and Brodmann area labels. The
prefrontal cortex (PFC) can be divided into anterior (APFC, Brodmann area (BA) 10), dorsolateral (DLPFC, BA 46
and 9), ventrolateral (VLPFC, BA 44, 45 and 47) and medial (MPFC, BA 25 and 32) regions. BAs 11, 12 and 14 are
commonly referred to as orbitofrontal cortex.
The prefrontal cortex (PFC) has extensive connections with other cortical and subcortical regions that are organized
in a topographical manner, such that regions that regulate emotion are situated ventrally and medially and regions that
regulate thought and action are situated more dorsally and laterally. The dorsolateral PFC (DLPFC) has dense connections with sensory and motor cortices and is key for regulating attention, thought and action. In humans, the right
inferior PFC (rIPFC) seems to be specialized for inhibiting inappropriate motor responses. By contrast, the ventromedial PFC (VMPFC) has connections with subcortical structures (such as the amygdala, the nucleus accumbens and the
hypothalamus) that generate emotional responses and habits and is thus able to regulate emotional responses. Finally,
the dorsomedial PFC (DMPFC) has been associated with error monitoring and, in human functional MRI studies,
reality testing. These PFC regions interconnect to regulate higher-order decision making and to plan and organize for
the future.

There are also large cortico-cortical direct reciprocal connections between the PFC and the medial temporal lobe,
passing through the uncinate fascicle, anterior temporal stem and anterior corpus callosum. The orbitofrontal and
dorsolateral cortices have strong reciprocal connections with the perirhinal and entorhinal cortices. There are more
connections from the PFC to the perirhinal cortex than vice versa. The PFC has reciprocal connections with sensory
association cortices including temporal and parietal regions and many subcortical structures.

7.2 Mapping X-nets to the PFC

Recall that our hypothesis and cognitive model of counterfactuals consists of the following specific propositions.
1. Reasoning about actions uses the same frontal neural circuits that are used in monitoring, selection, performance,
and control of actions.
2. Reasoning about actions and events at multiple levels provides the underlying basis for counterfactuals.
Taken together, this suggests that X-nets and the coordination of complex events with their outcomes and goals
at multiple levels are essential structures that support counterfactual reasoning through simulation. X-net structures
enable decisions about complex and abstract actions. Decisions about complex actions are referred to as cognitive
control in psychology and cognitive neuroscience.
Cognitive control permits us to make decisions about abstract actions, such as whether to e-mail versus call a friend,
and to select the concrete motor programs required to produce those actions, based on our goals and knowledge. The
frontal lobes are necessary for cognitive control at all levels of abstraction. Recent neuroimaging data have motivated
the hypothesis that the frontal lobes are organized hierarchically (Badre et al. 2009), such that control is supported
in progressively caudal (posterior) regions as decisions are made at more concrete levels of action. They also found
that frontal damage impaired action decisions at a level of abstraction that was dependent on lesion location (rostral
(anterior) lesions affected more abstract tasks, whereas caudal (posterior) lesions affected more concrete tasks), in
addition to impairing tasks requiring more, but not less, abstract action control.
Miller and Cohen (Miller & Cohen 2001) propose an integrative theory of prefrontal cortex function. They theorize
that cognitive control stems from the active maintenance of patterns of activity in the prefrontal cortex that represent
goals and the means to achieve them. These patterns provide bias signals to other brain structures whose net effect is
to guide the flow of activity along the neural pathways that establish the proper mappings between inputs, internal
states, and outputs needed to perform a given task (Miller & Cohen 2001). Essentially, they suggest that the prefrontal
cortex coordinates the inputs and connections that allow for cognitive control of our actions.
In another line of work providing further evidence for the PFC-based neural embodiment of X-nets, (Koechlin
et al. 2002; Wood & Grafman 2003; Dreher et al. 2008; Krueger et al. 2008) propose representational accounts of PFC
function in which the main role of the PFC is event coordination. Their hypothesis is called the Structured Event Complex
(SEC). The SEC arises from the observation that the coordination of complex events is encoded in a distributed fashion
throughout the PFC. The specific features of events (such as details of the person, location, destination, and action) are
stored in sensory and memory regions, and a primary role of the PFC circuits is to integrate this information into an event.
An SEC is a goal-oriented set of events that is structured in sequence and incorporates thematic knowledge, morals,
abstractions, concepts, social rules, event features, event boundaries, and grammars. The stored characteristics of
these representations form the basis for the strength of representation in memory and for the relationships between SEC
representations. Aspects of SECs are represented independently but are encoded and retrieved as an episode. The SEC
framework is a representational viewpoint that makes specific predictions regarding the properties and localization of
SECs in the PFC (see Figure 13). Maintenance of SEC activation depends on the completion of the behavioral goal;
this is consistent with sustained firing of PFC neurons, but can be interfered with by supervening goals. The SEC
framework is consistent with the structure, connectivity, neurophysiology, and evolution of the PFC.
The SEC proposal is compatible with the X-net model of PFC function and the simulation semantics account of
counterfactuals. In previous work, (Narayanan 2003) showed how specific control aspects of the X-net model capture
the structure and function of fronto-striatal networks (both nigrostriatal and mesolimbic), including specific motor
and cognitive implications of pathologies connected with this network (such as Parkinson's disease). More specifically,
X-nets provide an elaboration and operational model of the SEC in terms of ontological structure, functional circuits,
and processing predictions. X-nets directly model several aspects of complex event structures.

Figure 13: Event coordination distributed in the PFC (taken from (Wood & Grafman 2003)).
1. They model the general cognitive attributes of events (such as resources, preconditions). These attributes are
closely tied to the events (are local to the event). Specific bindings to these features (such as the money being
the resource for the action of buying something) are distributed in other circuits and structures.
2. They model specific events through bindings to the sensory-motor and memory circuits that capture the specific details of events. For instance, (Narayanan 2003) described X-net bindings in dopamine circuits relating
frontal regions to the basal ganglia (cortex-basal ganglia-thalamic loops). Similar circuits model the PFC-medial
temporal lobe bindings, which include the hippocampus and amygdala, as well as the entorhinal, perirhinal, and
parahippocampal neocortical regions.
3. They encode temporal and aspectual (Narayanan 1997) properties of event structures. These properties include
durations, the relationship of actions to goals and purposeful actions, and the coordination of multiple events and
actions through activation, inhibition, interruption, suspension, resumption, and termination of events.
4. They encode inter-event relations and the conditional evocation of events based on contextual and situational factors. The inter-event relations can be rigid, such as subevents (taking steps is a subevent of walking), or flexible, as
in events interacting by modifying the execution trajectory (such as initiation, resumption, termination). Stereotypical relations of an event enabling, disabling, or being mutually exclusive with another are a subset of the rich
inter-event relational structure proposed by the X-net model.
5. They encode category specific information such as motor events, force dynamics, social situations both directly
(as in encoding specific sequence information) and in terms of their binding to other structures (both cortical
(DLPFC connections to PMC and SMA or MPFC connections to the Medial Temporal Lobe) and subcortical
(to the limbic system, the basal ganglia and other circuits)).
6. X-nets are always activated in conjunction with the structures they are bound to, which include structures
encoding the sensory, motor, cognitive, and contextual memory of the events. Access and storage of specific
X-nets depend on the salience, reliability, and frequency of usage of the underlying experience of the events.
Growing evidence suggests that detailed neural models of X-nets may become possible within a few years.


Recent evidence supports the X-net theory of PFC coordination of events, especially in the case of interruption and
suspension. In a study of left frontopolar lesions in BA 10, (Dreher et al. 2008) found that the lesion size correlated
with the ability to suspend and resume activity. Since the frontopolar cortex in the human brain appears to have evolved
in size and organization, this suggests that complex functions requiring the temporary interruption of a current plan to
achieve subgoals (such as planning of future actions and reasoning) associated with this part of the cortex have become
particularly important during hominid evolution.
(Krueger et al. 2008) propose that the medial PFC encodes social event knowledge. Specifically, they suggest the
MPFC integrates information from two pathways, a goal pathway and an outcome pathway. The goal pathway connects PFC regions with motor regions, including the SMA, PMC, and ACC, and with multimodal association areas in
the temporal and parietal cortices. The outcome pathway recruits the vMPFC, mOFC, amygdala, hippocampal
formation, and OFC reward processing. The integration of outcomes with goals uses the MPFC for social schemata
and the DLPFC for non-social schemata.
Another line of evidence comes from the medial PFC involvement in self projection (Buckner & Carroll 2007),
the integration of memories of past events with future imaginings (Addis et al. 2007), and as part of the default
network (Buckner et al. 2008). (Buckner et al. 2008) propose that the default network is best understood as two
interacting subsystems. The medial temporal lobe subsystem provides information from prior experiences in the form
of memories and associations that are the building blocks of mental simulation. The medial prefrontal subsystem
facilitates the flexible use of this information during the construction of self-relevant mental simulations. These two
subsystems converge on important nodes of integration including the posterior cingulate cortex. The implications of
these functional and anatomical observations are discussed in relation to possible adaptive roles of the default network
for using past experiences to plan for the future, navigate social interactions, and maximize the utility of moments
when we are not otherwise engaged by the external world. We propose that the event structure encoding and social
event coordination provided in the medial structures of the PFC structure both past memories and future imaginings
and are central to the generation and interpretation of counterfactuals.
The recent evidence that PFC coordinates complex structured events in a distributed fashion combined with the
observation that counterfactuals exploit the detailed structure of complex events raises the possibility that we may
be able to arrive at a neural architecture of counterfactual processing that is both anatomically valid and functionally
adequate. As far as we are aware, our model of event structure is the most concrete proposal for a cognitive model of
complex events. Furthermore, the existence of a computational model and simulation enables us to test the model and
generate specific predictions that can lead to the design of new, highly specific and informative experimental studies.
So far, we have argued for the cognitive validity of the X-net model of counterfactuals by accounting for a wide
range of psychological and neurological data. We now turn to the technical issue of capturing these insights in an
operational semantics and computational model of counterfactuals. We start with the technical details of the X-net
model and then describe a counterfactual evaluation algorithm and its application to a recent newspaper discourse on
international politics.

8 Technical details of the computational model

As illustrated earlier, complex reasoning about event interactions requires not only an event description, but also a
dynamic model that can simulate the execution of the event unfolding over time. People routinely instantiate such a
model with facts about a particular event, enabling us to project which situations are likely or possible based on the
consumption and production of resources and the creation and elimination of states.
Reasoning about structured stochastic dynamic systems requires modeling coordinated temporal processes and
complex, structured states. A significant amount of work has gone into different aspects of the overall problem.4 Figure 14
maps out the space of relevant probabilistic modeling and inference techniques along three basic dimensions (extended
from the description in (Anderson et al. 2002)). The dimension along the x-axis (left-right) depicts the increasing
expressiveness of the action model. The y-axis (vertical, going up) corresponds to increasing complexity of the
state representation. The z-axis (into the plane) corresponds to increasing richness of the overall representation.
4 In all this work, we can have continuous variables as well as discrete ones. For the purposes of this exposition, all the comments here apply to
both types of states and actions.


The origin of the space is an unstructured probabilistic state vector representation with no explicit temporal or relational
information.
Moving to the right along the x-axis, we get to linear temporal models of sequences. Markov Models (MM) are the
most widely used technique for modeling such simple sequential processes. They have achieved considerable success in
a variety of domains (speech, computational biology). However, Markov models (including Hidden Markov Models
(HMM), which are properly subsumed under DBN (or TBN) (Murphy 2002)) are fairly inflexible and representationally
inadequate as models of actions.5 Specifically, these representations are unable to model and reason about central
aspects of actions such as concurrency, synchronization, and resources. Moving further right, we arrive at a set of well-developed
graphical modeling approaches designed to model distributed dynamic systems with complex coordination,
concurrency, and resource constraints. These representations are based on Stochastic Petri Nets (SPN) (Ciardo et al.
1994) and are used widely in many domains (such as networks, distributed systems, and computational biology).

Figure 14: Structured Probabilistic Models and Inference Space


One of the main drawbacks of MM and SPN based representations is the inability to represent states with complex
internal structure. Moving from the origin up along the y-axis, we have Factor Models, Markov Random Fields
(MRF), and Bayes Nets (BN) (directed), all of which make conditional independence assumptions to factorize the joint
probability distribution of the state variables into a compact product form. The next rightward node, Temporally
extended Bayes Nets (TBNs, also called DBNs), models each time step in a sequence as a BN and uses links between
state variables at different time steps to capture the temporal dependencies between variables. TBNs are thus able to
model simple sequences and structured state variables. Moving rightward, CBNs (Narayanan 1999b) combine the
expressive action modeling framework provided by the SPN (or CSPN (SPN with typed tokens)) based representation
with the ability to model complex states provided by the BN framework. This model of action is called X-nets, and its
use in language understanding was illustrated in (Narayanan 1997; Narayanan 1999a; Sinha 2008).
5 The more recent versions of Hierarchical HMMs (HHMM) (as in (Murphy 2002)) are also subsumed under the TBN framework, and such finer
gradations are omitted in the figure to ease exposition.


All the representations so far model states as propositions (or simple fluents) and are unable to handle relational
information. A critical aspect of scaling the current models to complex domain topologies and coordinated actions
is the ability to model predicates and relations. Moving along the z-axis (into the plane) are relatively recent probabilistic models that handle relational information. The RMM model (Anderson et al. 2002) is a relational extension
of sequential processes that allows variables and relations in Markov models. However, unlike BNs (or TBNs), RMMs
are unable to model complex states with dependencies. PRMs (Getoor et al. 2001; Pfeffer 2000) extend the Bayes Net
formalism to allow specification of a probability distribution over a set of relational interpretations. As in Bayes Nets,
a PRM consists of a qualitative dependency structure and a set of parameters quantifying the conditional dependencies.
One basic difference is that in the case of PRMs we specify the dependency model quantifying the various domain
relations at the class level. This dependency is assumed to be duplicated for each instantiation. Recent developments
in combining structure and probabilities have resulted in similar proposals linking first-order logic with probabilistic
representations, such as Markov Logic (ML) (Richardson & Domingos 2006) and Bayesian Logic (BLOG) (Milch et al.
2007).
While PRMs, ML, and BLOG are able to model the relational information essential for inference over
complex structured domains, they are unable to model coordinated action or dynamics. Moving rightward from PRMs,
we come to T(D)PRMs, which model simple sequences of PRMs in a manner analogous to T(D)BNs.
Our existing formalism integrates extended CSPN nets (eXecuting Nets or X-nets) for action and TBNs for inference, and appears to be an attractive methodology for the construction and analysis of intelligent agents. For large-scale
systems, we use probabilistic relational networks, or PRMs. Our current efforts at the computational level are directed
towards a complete integration of DPRM and our extended Petri nets (with continuous and discrete state), which
we call Coordinated Probabilistic Relational Models or CPRM (Barrett 2010; Makin 2008). CPRMs combine
coordinated action modeling with relational probabilistic inference, allowing the description, analysis, and
modeling of related, factored, coordinated temporal processes. A CPRM can be seen in two ways: either it uses a Petri
net to coordinate a Probabilistic Relational Model, or it uses a PRM to add temporal, factored relationships and inferences to a Petri net, as described below. Because it is a coordinated form of a PRM, we call it a CPRM.
CPRM inference procedures combine abduction, projection, intervention (Pearl 2000; Barrett 2010), and simulation.
These can be seen as extending an ongoing community effort to add more sophisticated temporal and control capabilities to probabilistic inference models (Anderson et al. 2002; Boyen & Koller 1998; Jordan 1999; Murphy 2002;
Richardson & Domingos 2006).
Inference in CPRM consists of combining fine-grained causal and temporal simulation using X-nets with relational
probabilistic state inference using PRMs. The basic algorithm involves computing the reachable states (both direct
effects and ramifications) s_{n+1} resulting from executing/simulating the X-nets corresponding to the action set S = [a_1, ..., a_n]
in a given initial state s_0. The solution to this problem in our action model involves both simulating the direct temporal
and causal structure of the action in the current context using an executable model of actions (X-nets), and using
the PRM to compute inferential updates. These updates can be mixture distributions over the state variables as well
as fixed-point revisions using the Maximum A Posteriori (MAP) estimates of related states. The MAP estimates and
related belief revision (Pearl 1988; Barrett 2010) procedures allow us to compute the indirect effects (ramifications)
that flow from the action (Narayanan 1999b). The coordination between the state inference using PRMs and the action
simulation using X-nets allows for a variety of real-time synchronization and event-triggered control schemes. A full
description of the control is outside the scope of this paper; the details can be found in Leon Barrett's Berkeley
EECS PhD thesis (Barrett 2010).

8.1 X-nets: An executable semantics of event structure

The most relevant property of X-nets for action modeling is their well-specified execution semantics: a transition
is enabled when all its input places are marked, and an enabled transition fires by moving tokens from input places
to output places. This active execution semantics serves as the engine of context-sensitive inference in the
simulation-based model of language understanding (Narayanan 1999b).
Definition 1 The basic x-net: An x-net consists of places (P) and transitions (T) connected by weighted directed
arcs A (A ⊆ (P × T) ∪ (T × P)). Each arc a_ij ∈ A has weight w_ij (w_ij ∈ ℕ). Input arcs (⊆ P × T)
connect input places to transitions. Output arcs (⊆ T × P) connect transitions to output places. Arcs
are typed as enable arcs E, inhibitory arcs I, or resource arcs R.
X-nets have a well specified real-time execution semantics where the next state function is specified by the firing
rule. In order to simulate the dynamic behavior of a system, a marking (distribution of tokens in places (depicted as
dark circles or numbers)) of the x-net is changed according to the following firing rule.
Definition 2 Execution Semantics of the basic x-net: A transition t ∈ T is said to be enabled if no inhibitory arc i ∈ I
of t has a marked source place, all sources of enable arcs e ∈ E of t are marked, and all resource input arcs r ∈ R
of t have at least w_pt tokens at their source place, where w_pt is the weight of the arc from place p to t. The firing of an
enabled transition t removes w_pt tokens from the source place of each resource input arc and places w_tp tokens
in each output place of t.
X-nets cleanly capture sequentiality, concurrency, and event-based asynchronous control; with our extensions they
also model hierarchy, stochasticity, and parameterization (run-time bindings). Besides typed arcs (Definition 1), the
following two extensions to the basic Petri net are designed to allow us to model hierarchical action sets with variables
and parameters. First, tokens carry information (i.e., they are individuated and typed), and transitions are augmented
with predicates which select tokens from input places based on the token type, as well as relate the type of the tokens
produced by the firing to the types of tokens removed from the input. Second, transitions are typed into four kinds,
namely stochastic, durative, instantaneous, and hierarchical transitions. An instantaneous transition fires as soon as
it is enabled. A durative (timed) transition fires after a fixed delay or at an exponentially distributed rate. A hierarchical
transition activates a subnet and waits for its return or a timeout.
The state of an x-net is defined by its marking. The firing rule produces a change in the state of an x-net by taking
it from one marking to the next. Given an x-net and a marking, we can execute the x-net by successive transition
firings. This can continue as long as there is at least one enabled transition that can fire. The x-net firing rule semantics
allows enabled transitions to fire in a completely distributed manner without any global clocks or central controllers.6
Execution halts in a state where the x-net has no enabled transition. This naturally allows us to extend the earlier definition
to define an extended next-state function for x-nets.
Definition 3 Extended next-state function
The extended next-state function δ* is defined for a state s_i and a sequence of transitions σ ∈ T* by
δ*(s_i, t_j · σ) = δ*(δ(s_i, t_j), σ)
δ*(s_i, ε) = s_i
where δ is the next-state function given by the firing rule and ε is the null (empty) transition sequence.
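The firing rule of Definitions 1–3 can be sketched directly. The following is a minimal illustrative implementation (ours, not the system described in the paper), with each transition represented as a record of its inhibitor, enable, and resource arcs:

```python
# A minimal sketch (not the authors' implementation) of the x-net firing
# rule of Definitions 1-3: inhibitor/enable/resource arcs, the firing
# rule, and the extended next-state function over a transition sequence.

def enabled(marking, transition):
    """A transition is enabled if no inhibitor source is marked, every
    enable-arc source is marked, and every resource arc has enough tokens."""
    ok_inhibit = all(marking.get(p, 0) == 0 for p in transition["inhibit"])
    ok_enable = all(marking.get(p, 0) > 0 for p in transition["enable"])
    ok_resource = all(marking.get(p, 0) >= w
                      for p, w in transition["consume"].items())
    return ok_inhibit and ok_enable and ok_resource

def fire(marking, transition):
    """Firing removes w_pt tokens per resource input arc and adds w_tp
    tokens to each output place (Definition 2)."""
    if not enabled(marking, transition):
        raise ValueError("transition not enabled")
    new = dict(marking)
    for p, w in transition["consume"].items():
        new[p] = new.get(p, 0) - w
    for p, w in transition["produce"].items():
        new[p] = new.get(p, 0) + w
    return new

def delta_star(marking, transitions):
    """Extended next-state function (Definition 3): fold the firing rule
    over a sequence of transitions; the empty sequence is the identity."""
    for t in transitions:
        marking = fire(marking, t)
    return marking

# Toy net: t1 moves a token from p1 to p2 unless p3 inhibits it.
t1 = {"inhibit": ["p3"], "enable": [], "consume": {"p1": 1}, "produce": {"p2": 1}}
m0 = {"p1": 1, "p2": 0, "p3": 0}
m1 = delta_star(m0, [t1])   # -> {"p1": 0, "p2": 1, "p3": 0}
```

Marking p3 in m0 would disable t1, illustrating the inhibitory-arc clause of the firing rule.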

8.2 A PRM model of states

Our representation of states must be capable of modeling causal knowledge and must support both belief updates
and revisions in computing the global impact of new observations and evidence, both from direct observations and from
action effects. A traditional implementation of the agent's state uses directed graphical models, or Bayes nets (Jensen
1996; Pearl 1988; Pearl 2000). A Bayes net is a convenient data structure for encoding causal domain knowledge.
In previous work (Narayanan 1999b), we have used temporally extended propositional graphical models or Dynamic (Temporally extended) Bayes Nets (TBN) (Pearl 1988; Jensen 1996; Jordan 1999; Murphy 2002) to model
complex states. The basic algorithms that operate on the probabilistic network data structure deal both with new observations and database updates due to external interventions such as random disturbances, explicit actions, and new
textual (or other) inputs.
While the Bayes Net structure supports structured inferences (Narayanan 1999a), it does not exploit the relational
structure inherent in any domain. The PRM model of a state enables us to scale to relational domains.
6 However, our sequential simulation adjusts step size to be able to fire multiple enabled transitions in a single step.


Definition 4 A PRM Model of a State

The state S is defined as a PRM comprising: a disjoint set of classes X_i ∈ X, related by inheritance relationships;
a set I of named instances, each denoting an instance of a class; a set A(X_i) of attributes, denoting
simple or functional relations, where the domain Dom[A] ∈ C and the range Range[A] ⊆ Val[A]; a set B
of complex attributes, denoting functional relations, where the domain Dom[B] ∈ C and the range Range[B] ∈ C;
and a set of Conditional Probability Distributions (CPDs), where for each class X_i and for each attribute A ∈ A(X_i) we
define a CPD of the form P(a | pa(x.A)), where pa(x.A) are the parents of the attribute (simple or complex). CPDs
are attached to classes and inherited by instances. Cyclic dependencies in the parent links are disallowed. For each
relational schema structure with objects O and parameter set θ, we can define the complete joint probability of
a specific instantiation of the PRM state using the chain rule as follows (Getoor et al. 2001; Pfeffer 2000).
Theorem 1. Instantiating CPRM Domain Models: The Chain Rule. Let S be the PRM structure of a CPRM_S, A
be the set of attributes, O the set of objects, θ the set of parameters, and I a ground instance. Then the joint
probability distribution P(I) is the product of all conditional probabilities specified in CPRM_S:

P(I | S, A, O, θ) = ∏_{X_i ∈ X} ∏_{A ∈ A(X_i)} ∏_{x ∈ O(X_i)} P(I_{x.A} | I_{pa(x.A)})    (1)

We can use this recursive factorization for both Belief Updates and MAP Estimation to find the best explanation
for an input. Results so far (Narayanan 1999b; Barrett 2010) suggest that this technique is promising enough to be
useful for a variety of inference and decision making applications.
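The chain rule of Theorem 1 can be illustrated on a toy domain. The following sketch (a hypothetical two-attribute class, not the paper's model) computes the joint probability of a ground instantiation as the product of class-level CPDs inherited by each instance:

```python
# Illustrative sketch of the chain rule in Theorem 1 on a hypothetical toy
# domain: the joint probability of a ground instantiation is the product,
# over every attribute of every object, of that attribute's CPD given the
# values of its parents.

# Class-level CPDs, inherited by instances: P(smart), P(grade | smart).
cpds = {
    "smart": lambda val, parents: {True: 0.3, False: 0.7}[val],
    "grade": lambda val, parents: ({"A": 0.8, "B": 0.2} if parents["smart"]
                                   else {"A": 0.2, "B": 0.8})[val],
}
parents_of = {"smart": [], "grade": ["smart"]}

def joint_probability(instances):
    """P(I) = product over objects x and attributes A of P(I_{x.A} | I_{pa(x.A)})."""
    p = 1.0
    for obj, attrs in instances.items():
        for attr, val in attrs.items():
            parent_vals = {pa: attrs[pa] for pa in parents_of[attr]}
            p *= cpds[attr](val, parent_vals)
    return p

# Two student objects sharing the same class-level CPDs.
I = {"ann": {"smart": True, "grade": "A"},
     "bob": {"smart": False, "grade": "B"}}
p_joint = joint_probability(I)   # 0.3*0.8 * 0.7*0.8, approximately 0.1344
```

The factorization is exactly Equation (1): one factor per attribute per object, with the CPD shared at the class level.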

8.3 Inference With CPRMs

Traditional inference methods in probabilistic models of sequential-time (non-branching) dynamic systems (such as
TBNs) (see (Murphy 2002) for a fuller description) consist of the following kinds of computations. Here X_t is a state
variable at time t (lowercase x_t is a value assignment), and y_t is an observation value at time t.
Filtering Compute P(X_t | y_{1...t}). Update the state based on the observation sequence.
Prediction Compute P(X_{t+h} | y_{1...t}). Predict the state at some future time t + h based on the observation sequence
up to time t.
Smoothing Compute P(X_{t−m} | y_{1...t}). Recompute previously estimated states in the presence of current evidence.
Viterbi Decoding Compute argmax_{x_{1...t}} P(x_{1...t} | y_{1...t}). Compute the best assignment of values to the states given
the observation sequence.
MAP Estimate Compute argmax_{h_i} P(x_{1...t} | y_{1...t}, h_i). Compute the best hypothesis from a set h_i ∈ H based on
the assignment of values to the state given the observation sequence.
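For concreteness, the filtering computation above can be sketched on a two-state HMM. The transition and observation numbers below are invented for illustration; the recursion (predict with the transition model, weight by the observation likelihood, renormalize) is the standard forward step:

```python
# A toy forward-filtering sketch for the computation P(X_t | y_{1..t})
# listed above, on a two-state HMM (hypothetical numbers, for illustration).

T = {"rain": {"rain": 0.7, "dry": 0.3},       # transition model P(X_t | X_{t-1})
     "dry":  {"rain": 0.3, "dry": 0.7}}
O = {"rain": {"umbrella": 0.9, "none": 0.1},  # observation model P(y_t | X_t)
     "dry":  {"umbrella": 0.2, "none": 0.8}}

def filter_step(belief, obs):
    """One filtering step: predict with T, weight by O, renormalize."""
    predicted = {x: sum(belief[xp] * T[xp][x] for xp in belief) for x in T}
    unnorm = {x: predicted[x] * O[x][obs] for x in predicted}
    z = sum(unnorm.values())
    return {x: unnorm[x] / z for x in unnorm}

belief = {"rain": 0.5, "dry": 0.5}
for y in ["umbrella", "umbrella"]:
    belief = filter_step(belief, y)
# Repeated umbrella observations shift the filtered estimate well above
# 0.5 toward the "rain" state.
```

Prediction and smoothing reuse the same predict and weight operations; Viterbi decoding replaces the sum in the predict step with a max.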
With the extension to branching or coordinated models of dynamic systems, we need to enhance these traditional inference procedures to include the computation of the reachability set of a marking in an X-net-based action
framework.
Reachability is a fundamental problem in dynamic system theory and is central to computing a variety of event-related
inferences (Chang et al. 1998). In terms of x-nets, given that the state space evolves through execution of
enabled transitions, we can define the reachable states of an x-net, given an initial marking.
Definition 5 Immediately reachable states
For an x-net S with a state s_i, a state s_j is immediately reachable if there exists a transition t_k ∈ T such that
δ(s_i, t_k) = s_j.
Extending this concept, we can define the set of reachable markings for a given x-net in some initial state. Basically,
if s_j is immediately reachable from s_i, and s_k is immediately reachable from s_j, then s_k is in the reachability set
of s_i. Thus the reachability relationship is the reflexive transitive closure of the immediately-reachable relationship.

Definition 6 Reachability set

The reachability set R(S, s_0) for an x-net S with state s_0 is the smallest set of markings such that (a) s_0 ∈
R(S, s_0), and (b) if s_j ∈ R(S, s_0) and s_k = δ(s_j, t_l) for some t_l ∈ T, then s_k ∈ R(S, s_0).
Definition 7 Reachability: Given an x-net S with an initial state s_0 and a final state s_f, is s_f ∈ R(S, s_0)?
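The closure of Definitions 5–7 can be computed by breadth-first search over markings. A minimal sketch (our illustration, assuming hashable states and caller-supplied enabled/fire functions):

```python
# Sketch: computing the reachability set R(S, s0) of Definition 6 by
# breadth-first closure over the immediately-reachable relation.
from collections import deque

def reachability_set(s0, transitions, enabled, fire):
    """Smallest set containing s0 and closed under firing enabled transitions."""
    seen = {s0}
    frontier = deque([s0])
    while frontier:
        s = frontier.popleft()
        for t in transitions:
            if enabled(s, t):
                s2 = fire(s, t)
                if s2 not in seen:
                    seen.add(s2)
                    frontier.append(s2)
    return seen

# Toy example: markings are tuples (tokens in p1, tokens in p2); the single
# transition moves one token from p1 to p2.
ts = ["move"]
en = lambda s, t: s[0] > 0
fr = lambda s, t: (s[0] - 1, s[1] + 1)
R = reachability_set((2, 0), ts, en, fr)   # {(2, 0), (1, 1), (0, 2)}
```

Definition 7 (the reachability question) is then just a membership test `sf in R`, though for large nets one would test membership incrementally rather than build the whole set.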
Given this definition of reachability, the following theorem allows us to directly use the vast number of techniques
developed in the distributed systems literature for x-net based inference.
Theorem 2. (Proof in Narayanan 1997) An x-net is formally equivalent to a bounded High-Level Generalized Stochastic Petri Net (HLGSPN). The reachability graph of a marked x-net is isomorphic to a semi-Markov process.
Given this theorem, we can compute the various parameters of interest in our x-net based model of coordinated
events. As in other models (Ciardo et al. 1994), we assume that the x-net transition firing time is governed by an
exponentially distributed random variable x_i. The firing time distribution of transition t_i is given by

F_{x_i}(x) = 1 − e^{−λ_i x}    (2)

The negative exponential distribution renders the reachability graph of the SPN isomorphic to a continuous-time
Markov chain. The Markov chain MC can be obtained from the reachability graph as follows: the MC state
space is the reachability set R(PN) of the marked SPN. In MC, the transition rate from M_i to M_j is given
by q_ij = λ_k, the firing rate of the transition t_k leading from M_i to M_j. If several transitions lead from
M_i to M_j, then q_ij is the sum of the rates of these transitions. If there is no link from M_i to M_j in R(PN), then
q_ij = 0 in MC.
The steady-state distribution π of the MC is obtained by solving the linear equations:

π Q = 0    (3)

∑_{j=1}^{s} π_j = 1    (4)
From the vector π = (π_1, ..., π_s) we can compute the following measures.
Probability of being in a set of states: Let B ⊆ R(PN) constitute the states of interest in a given SPN. Then
the probability of being in a state of B is

P(B) = ∑_{M_i ∈ B} π_i    (5)

Probability of taking a transition t_j: Let EN_j be the subset of R(PN) in which the transition t_j is enabled.
Then the probability p_j that an observer looking randomly at the net sees t_j firing next is given by

p_j = ∑_{M_i ∈ EN_j} π_i (λ_j / q_ii)    (6)

where q_ii is the sum of the transition rates out of M_i.
The throughput of a transition is its mean number of firings at steady state:

d_j = ∑_{M_i ∈ EN_j} π_i λ_j    (7)
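Equations (3) and (4) can be solved directly for small chains. The sketch below (toy rates, plain Gaussian elimination rather than a production solver) computes the steady-state vector π for a two-state continuous-time Markov chain:

```python
# Sketch: solving the steady-state equations pi*Q = 0, sum(pi) = 1
# (Eqs. 3-4) for a small continuous-time Markov chain, using plain
# Gaussian elimination (toy rates, for illustration only).

def steady_state(Q):
    """Q[i][j] = rate i->j for i != j; Q[i][i] = -(sum of row rates)."""
    n = len(Q)
    # pi*Q = 0 is Q^T * pi^T = 0; replace the last equation with the
    # normalization constraint sum(pi) = 1.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    pi = [0.0] * n
    for r in range(n - 1, -1, -1):
        pi[r] = (b[r] - sum(A[r][c] * pi[c] for c in range(r + 1, n))) / A[r][r]
    return pi

# Two-state chain: rate 2.0 from state 0 to 1, rate 1.0 back.
Q = [[-2.0, 2.0], [1.0, -1.0]]
pi = steady_state(Q)   # approximately [1/3, 2/3]
```

Given π, the measures of Equations (5)–(7) are simple sums over the relevant marking subsets.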

With this semantics in hand, we can now turn to one of the standard inference procedures in an action model,
temporal projection.

The temporal projection problem consists of computing the states s_{n+1} resulting from executing the action set S =
[a_1, ..., a_n] in a given initial state s_0. The solution to this problem in our action model involves both simulating
the direct temporal and causal structure of the action in the current context, and using the PRM to compute
the Maximum A Posteriori (MAP) estimates of related states. The MAP estimates and related belief revision (Pearl
1988) procedures allow us to compute the indirect effects (ramifications) that flow from the action (Narayanan 1999b).
The basic algorithm is outlined below.
Algorithm 1 Temporal Projection
1. Set the initial marking M_0: ∀p ∈ s_0, M_0(p) = 1; ∀p ∉ s_0, M_0(p) = 0.
2. Fire the enabled transitions T_e0 ⊆ T with initial marking M_0. The next-state function described earlier
takes the system to a new marking M_int. The state corresponding to this marking is S_int = {p : M_int(p) = 1}.
3. Run the belief revision procedure to return the most consistent a posteriori assignment (MAP) of values to
the state variables. The new state S_1 corresponds to the marking M_1 where ∀p ∈ S_1, M_1(p) = 1; ∀p ∉ S_1,
M_1(p) = 0.
4. For 1 ≤ i ≤ n: fire the transitions enabled by action a_i under marking M_i to obtain M_int, and run Step 3
to get S_{i+1}. Return S_{n+1} as the answer.
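Algorithm 1 can be rendered schematically as follows. In the sketch below the X-net step and the MAP belief-revision step are hypothetical stand-ins (simple add/delete effects, and closure under implication constraints), intended only to show how simulation and revision interleave:

```python
# A schematic sketch of Algorithm 1 (Temporal Projection): alternate
# X-net simulation steps with a belief-revision pass over the state.
# The x-net step and the MAP step here are simplified stand-ins.

def xnet_step(state, action):
    """Direct effects: simulate the action's X-net in the current state."""
    new = set(state)
    new -= action["deletes"]
    new |= action["adds"]
    return new

def map_revision(state, constraints):
    """Stand-in for MAP belief revision: close the state under simple
    implication constraints (antecedent -> consequent) to get ramifications."""
    changed = True
    while changed:
        changed = False
        for ante, cons in constraints:
            if ante <= state and not cons <= state:
                state = state | cons
                changed = True
    return state

def temporal_projection(s0, actions, constraints):
    state = map_revision(set(s0), constraints)            # Steps 1-3
    for a in actions:                                     # Step 4
        state = map_revision(xnet_step(state, a), constraints)
    return state

# Toy domain: walking to the track puts you there; being there while the
# race is on entails racing (a ramification recovered by revision).
acts = [{"adds": {"at_track"}, "deletes": {"at_home"}}]
cons = [(frozenset({"at_track", "race_on"}), frozenset({"racing"}))]
final = temporal_projection({"at_home", "race_on"}, acts, cons)
# final == {"at_track", "race_on", "racing"}
```

The direct effect (at_track) comes from the simulation step; the indirect effect (racing) comes from the revision step, mirroring the division of labor between X-nets and the PRM.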
Steps 1 and 2 are essentially constant time, since our notion of state as a graph marking is inherently distributed
over the network; the working memory of an x-net-based inference system is distributed over the entire set of x-nets
and state features. The result is a massively parallel solution to the projection problem. In addition, a central
feature of our action representation, namely that actions are executing models, provides an elegant solution to the Frame Problem.
Specifically, the executing action semantics allows frame axioms to be implicitly encoded in the structure
of the net and the local transition firing rules.
Step 3 requires MAP estimation of the state variables. MAP estimation over such a network is well known to
be intractable for complex domain topologies. While the worst-case analysis does not change with the more expressive CPRM design, we propose to exploit the additional structure provided by the PRM framework to develop state
and domain models that minimize the inferential complexity. The complexity of exact inference on PRMs (using the
Structured Variable Elimination (SVE) algorithm) is given below.
Theorem 3. (Pfeffer 2000) The space and time complexity of solving a query on q basic variables for a PRM P is
at most O(N k b^{k(m+2)} b^q), where N is the total number of attributes in P, k is the maximum number of interface
variables for any object in P, and m is the maximum tree-width of any object in P.
The two critical variables to control are k (the maximum number of interface variables) and m, the maximum
tree-width, i.e., the maximum width of the dependency graph of a class relative to an elimination ordering. For instance,
if the dependency graph of a class is a polytree, then the (space + time) complexity of solving a query on q variables for
a PRM P reduces to O(N k b^{(p+2)k} b^q), where p is the maximum number of parents of any attribute in P. Clearly,
the cost is dominated by the size of the interface (k). In a general PRM, there may be no local method to
guarantee that the interface size stays small. However, by carefully introducing relations and doing query optimization,
one can often get a handle on this variable. For instance, if the only relation allowed is the part-of relation, then it
is possible to specify at design time the maximum size of the interface variable k. This results in a specialization
of PRMs called Object Oriented Bayes Nets (OOBN) (Pfeffer 2000). Hence the tractability of the inference can be
controlled by local design choices.
To summarize, in our model, executing actions is fast, parallel, and reflexive, while inference with complex state
dependencies to achieve global consistency is hard. To deal with the scaling problem, we propose to exploit the
relational structure of knowledge (linguistic and domain) to allow explicit domain modeling and query optimization
choices to influence the tractability of inference. Together, this provides a formalism expressive enough to support
the deep semantic inference necessary for autonomous agents, with guidelines and techniques to bound the time and space
complexity. We expect that accomplishing this could result in the widespread use of structured probabilistic models
in autonomous systems research, something long overdue.

8.4 The Counterfactual Evaluation Algorithm

We now present an algorithm for constructing and evaluating counterfactuals. The basic idea is to exploit the shared
structure between the two networks (the real and the counterfactual). As mentioned earlier, our algorithm depends on the
action space in which the counterfactual is evaluated. We assume that actions have the internal structure of the
controller (Figure 4) and are modeled by X-nets. States are modeled as relational probabilistic networks in a factorized
representation. Our evaluation algorithm is a modification of the action semantics in structural models (Pearl 2000);
specifically, our treatment of actions and simulations is considerably richer than Pearl's structural model semantics.
We start with the basic definition from (Pearl 2000) and then introduce the extended model, the structural action semantics,
and the counterfactual evaluation algorithm with the structural action semantics.
Definition 1. A Causal Model
A causal model is the tuple M = ⟨U, V, F, N⟩
where:
1. U is the set of background variables that are determined by factors outside the model.7
2. V is a set {V_1, ..., V_n} of variables that are determined by variables in the causal model, i.e., in U ∪ V.
3. F is a set of functions {f_1, ..., f_n} such that each f_i is a mapping (from the respective domains of) U ∪ (V \ V_i)
to V_i, and such that the entire set F forms a mapping from U to V. Each f_i thus specifies the value of V_i
given the values of all the other variables in U ∪ V. Symbolically, the set of equations F can be written as
v_i = f_i(pa_i, u_i), i = 1, ..., n, where pa_i ⊆ V and U_i ⊆ U are the minimal sets of variables (parents in the model
graph) sufficient for representing f_i.
4. N is the X-net of actions. N consists of places (P) and transitions (T) connected by weighted directed arcs
A (A ⊆ (P × T) ∪ (T × P)). Each arc a_ij ∈ A has weight w_ij (w_ij ∈ ℕ). Input arcs (⊆ P × T)
connect input places to transitions. Output arcs (⊆ T × P) connect transitions to output places.
Arcs are typed as enable arcs E, inhibitory arcs I, or resource arcs R.
Every causal model M can be associated with a directed graph G(M), in which each node corresponds to a
variable and the directed edges point from the members of PA_i and U_i toward V_i.
Definition 2. A Submodel
1. Let M be a causal model of the state, X a set of variables in V, and x a particular realization of X. A
submodel M_x of M is the causal model

M_x = ⟨U, V, F_x⟩    (8)

where F_x = {f_i : V_i ∉ X} ∪ {X = x}.
2. Let N be the X-net model of the action, P the set of places in N, and T the set of transitions in N.
The submodel formed by asserting a precondition or resource value p results in the appropriate marking
M_p, p ∈ P, where M_p ∈ {t, f} for preconditions and binary resources, while M_p ∈ ℕ for integer resources
and fluents. The submodel formed by asserting the firing of a transition (such as suspending an ongoing process)
involves marking the preset of the transition t ∈ T (all its precondition and resource places, a subset of P)
with the appropriate marking and then firing the transition t.
In other words, for a state, F_x is formed by deleting from F all the functions f_i for variables in X and
replacing them with the constant functions X = x. For the X-net places and transitions, a submodel is just a marking
that includes the specific place p ∈ P or the preset of a transition t ∈ T. In the X-net, there is no need to replace any
7 The background variables and their connections correspond to the Generic Space in the Mental Space models.


of the functions, since the simulation-based dynamics propagates information forward in time. Thus, in the case of a
submodel formed by marking places in the X-net, there is no need for local surgery to model actions.
As (Pearl 2000) points out, submodels are useful for representing actions and the premises of counterfactuals. If
each f_i is interpreted as an independent physical mechanism, then the action do(X = x) corresponds to the minimal
change in M required to make X = x hold true under any u. This change results in the submodel M_x, since M_x
differs from M only in those mechanisms that directly determine the variables in X. The transformation from M
to M_x is a structural modification (of F), hence the name modifiable structural equations (Galles and Pearl 1998).
For the extended case that includes the X-net model, the minimal change in the space of the event is the minimal
change in the marking, which results in a submodel M_x that is structurally identical to the original model
M and differs only in the marking and thus in the evolution trajectory of the event.
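The state part of Definition 2 (the submodel M_x) is essentially Pearl's do-operation: replace the mechanism of each intervened variable with a constant. A toy sketch (hypothetical sprinkler example, not the paper's domain):

```python
# A sketch of the submodel M_x of Definition 2: the intervention do(X=x)
# replaces the structural equation of each variable in X with a constant
# function, leaving all other mechanisms intact (toy equations only).

def evaluate(equations, u, order):
    """Evaluate structural equations in topological order, given context u."""
    vals = dict(u)
    for v in order:
        vals[v] = equations[v](vals)
    return vals

def do(equations, intervention):
    """Return the submodel's equations F_x: mechanisms of intervened
    variables are replaced by constant functions."""
    sub = dict(equations)
    for var, val in intervention.items():
        sub[var] = (lambda _vals, _v=val: _v)
    return sub

# Toy model: either the sprinkler or the rain wets the grass.
F = {"sprinkler": lambda v: v["season"] == "dry",
     "wet":       lambda v: v["sprinkler"] or v["rain"]}
u = {"season": "wet", "rain": False}
order = ["sprinkler", "wet"]

actual = evaluate(F, u, order)                               # grass stays dry
counterfactual = evaluate(do(F, {"sprinkler": True}), u, order)
# Under do(sprinkler=True) the grass is wet even though rain is False.
```

For the X-net part of Definition 2, by contrast, no such equation surgery is needed: asserting a marking and letting the simulation run forward plays the role of the intervention.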
Definition 3. A Probabilistic Causal Model (Pearl 2000)
A probabilistic causal model is the pair ⟨M, P(u)⟩,
where M is a causal model and P(u) is a probability function defined over the domain of U.
Algorithm 2 Counterfactual Evaluation
Given a model ⟨M, P(u)⟩, the conditional probability P(B_A | e) of a counterfactual sentence "If it were A then B",
given evidence e, can be calculated using the following steps.
1. Step 1: Abduction: Update the PRM P(u) with the evidence e to obtain P(u | e).
2. Step 2: Temporal Projection:
(a) Assert the counterfactual: Set the initial marking M_0 based on the counterfactual assertion: ∀p ∈ Pre_A,
M_0(p) = 1; ∀p ∉ Pre_A, M_0(p) = 0. Fire the enabled transitions T_e0 ⊆ T with initial marking
M_0. The next-state function described earlier takes the system to a new marking M_int. The state
corresponding to this marking is S_int = {p : M_int(p) = 1}.
(b) Predictive update: Run the belief revision procedure to return the probability of B, the consequence of
the counterfactual. If B is an action state, then fire the enabled transitions based on the new marking and repeat
the previous two steps.
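Algorithm 2 can be sketched by enumeration over the background variables U, following the abduction/action/prediction pattern. The model below is a deliberately simplified, hypothetical rendering of the peace-talks example (one Boolean background variable and invented probabilities, not the paper's X-net machinery):

```python
# A sketch of Algorithm 2 by enumeration over background variables U:
# abduction (condition U on the evidence), action (replace intervened
# mechanisms with constants), prediction (evaluate the query).

def run(mechanisms, u, order):
    """Evaluate mechanisms in topological order given background value u."""
    world = {"u": u}
    for v in order:
        world[v] = mechanisms[v](world)
    return world

def counterfactual_prob(p_u, mechanisms, order, evidence, intervention, query):
    # Step 1 (abduction): P(u | e) by enumeration and renormalization.
    posterior = {}
    for u, pu in p_u.items():
        world = run(mechanisms, u, order)
        if all(world[k] == v for k, v in evidence.items()):
            posterior[u] = pu
    z = sum(posterior.values())
    posterior = {u: p / z for u, p in posterior.items()}
    # Steps 2-3 (action + prediction): evaluate the query in the submodel.
    sub = dict(mechanisms)
    for var, val in intervention.items():
        sub[var] = (lambda _w, _v=val: _v)
    return sum(p for u, p in posterior.items() if run(sub, u, order)[query])

# Toy model: background willingness u drives Israel's freeze; talks
# restart iff both sides agree, and Israel's agreement follows the freeze.
mech = {"freeze":  lambda w: w["u"],
        "agree_I": lambda w: w["freeze"],
        "agree_P": lambda w: True,
        "restart": lambda w: w["agree_I"] and w["agree_P"]}
order = ["freeze", "agree_I", "agree_P", "restart"]
p_u = {True: 0.4, False: 0.6}

# Evidence: no freeze. Counterfactual: had Israel frozen settlements,
# would the talks restart?
p = counterfactual_prob(p_u, mech, order, {"freeze": False},
                        {"freeze": True}, "restart")
# p == 1.0: in every world consistent with the evidence, do(freeze=True)
# makes the restart precondition true.
```

In the full algorithm, Step 2 replaces this equation surgery with marking and firing the X-net, but the abduction and prediction phases play the same roles.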

This approach generalizes the approach in (Pearl 2000): from the structural equation modification approach
(where actions are represented by the do(X = x) semantics in creating the submodel) to a structural action simulation
approach, where actions are represented as X-net markings with relational probabilistic inference in a simulation semantics of actions. This explicit modeling of action structure appears to match the
psychological data on counterfactual processing, and it also enables several interesting counterfactual inferences that are
often based on events, actions, and processes, their framing, explicit structure, and dynamic evolution. Examples of
the application of the counterfactual evaluation algorithm follow.

9 Applying the Algorithm

Consider the following story from the New York Times (Sept 18, 2009) on the eve of the United Nations General
Assembly in New York.8
1. Context: American diplomats were unable on Friday to bridge gaps between Israel and the Palestinians on
restarting peace talks, meaning that while their leaders will likely meet with President Obama next week at the
United Nations General Assembly, they will not announce a renewal of negotiations, officials on all sides said.
2. Sentence 1: The goal of the meetings this week was to produce conditions for a summit meeting in New York,
led by Mr. Obama, at which Prime Minister Benjamin Netanyahu of Israel and President Abbas would say they
were starting peace talks again.
8 The story is from the online version of the New York Times at http://www.nytimes.com/2009/09/19/world/middleeast/19mideast.html?hp


3. Sentence 2: Mr. Erekat and others said there were two sets of problems, the first having to do with the length
and extent of an Israeli settlement freeze in the West Bank and Jerusalem, and the second having to do with
the basis for the negotiations themselves. Mr. Erekat said that without a freeze in advance, negotiations were
pointless.
4. Sentence 3: Mr. Mitchell also met twice on Friday with Mr. Netanyahu. An aide to Mr. Netanyahu said only
that the prime minister would leave for New York as planned on Wednesday and that Israel was willing to restart
negotiations immediately, so the difficulty lay not with Israel but with the Palestinians.
5. Sentence 4: The Americans and Palestinians have been pushing Israel to agree to freeze settlement building
entirely as evidence of its seriousness about peace talks. The settlements are on land that the Palestinians want
for their future state. But Mr. Netanyahu has declined to do so, saying that he would be willing to reduce or
slow building, but not freeze it, because he would not turn his back on Israelis living there.
Context:
  Event Schema: Peace Talks
    Frame: Negotiation
    Current State: Suspended
    Participants: Israel, Palestine
    Mediator: US

Event 1:
  Event Schema: Meeting
    Frame: Meeting
    Participants: US, Israel, Palestine
    Current State: Done
    Goal State = Context.Schema.enable
    Time: This week

Domain Knowledge:
  Talks.enable = Part1.agree and Part2.agree
  Part1: Palestine
  Part2: Israel

Evidence:
  ¬Restart(Talks)
  ¬Succeed(Event1)
  ¬Freeze(Israel, Settlements)
With this knowledge input, we construct the network of interactions shown in Figure 15, a part of which (the state
and domain PRM) is further elaborated in Figure 16.9
Figure 15 captures the two events mentioned above and their interactions. The two events correspond to the
just concluded meeting this week and the possible restarting of talks next week. Event 1 is a meeting that has just
concluded, modeled by a marking on the Done node of the controller graph. The goal of the meeting was to produce
conditions for restarting the currently suspended peace talks (modeled as a marking in the Suspended place of Event
2). As noted earlier, the structure of the graph sanctions the pre-suppositional inference that the talks were previously
ongoing (a backward inference from the current marking). This is also shown highlighted in Figure 15. The context
is that the talks are suspended (shown on the right as Event 2). The meeting this week is done and was not successful
(shown as the activation of the done node and the enabling of the Fail transition in the controller for Event 1). The
bottom left corresponds to the PRM that captures the complex state constraints. The PRM has two components, one
for each potential participant. Figure 16 shows the extended network in the context of evaluating the counterfactual
Peace talks could restart if Israel freezes settlements.
9 For simplicity, we neglect the modeling of the mediating role of the US in both events. We can add this explicitly to the X-net and/or the PRM.
We could also treat this as an exogenous variable that impacts the state PRM. An extended analysis will have to include this mediation, but none
of this additional complexity makes a qualitative difference in the illustration of the counterfactual evaluation algorithm that is the subject of this
section.


[Figure 15 appears here. It shows two linked event controller X-nets. The left controller (Event 1: the meeting this week) has places Ready, Enabled, Ongoing, Suspended, Stopped, Done, Canceled, and Failed, connected by transitions Prepare, Start, Enable, Disable, Restart, Resume, Suspend, Iterate, Stop, Finish, Cancel, Succeed, and Fail; its successful completion produces the resource "conditions for restarting talks". The right controller (Event 2: the suspended peace talks, with Part 1 = Palestine and Part 2 = Israel) has the same controller structure plus an Undo transition leading to an Undone place, with preconditions Agree(P, T) and Agree(I, T).]

Figure 15: The goal of the meetings this week was to produce conditions for a summit meeting in New York, led
by Mr. Obama, at which Prime Minister Benjamin Netanyahu of Israel and President Abbas would say they were
starting peace talks again. The two events correspond to the meeting and the possible restarting of talks the next week.
The context is that the talks are suspended (shown on the right as Event 2). The meeting this week is done and was not
successful (shown as the activation of the Done node in the controller for Event 1). The bottom left corresponds to
the PRM (see text) that captures the complex state constraints. The PRM has two components, one for each potential
participant. See text for details.
We now illustrate Algorithm 2 at work to evaluate the following two counterfactuals. Notice that in both cases, the
background context is that the peace talks are currently suspended (see Figure 15).
1. If the meeting last week had succeeded, talks could restart in New York this week.
2. If Israel had agreed to freeze settlements, the peace talks could restart in New York this week.

9.1 Evaluation of "If the meeting had succeeded, talks could restart in New York this week"

Consider the success criteria for the meeting (Event 1). Figure 15 shows the current state. The peace talks are
suspended and the meeting has failed (from the analysis of Sentence 1). So the counterfactual statement is "Talks
could restart if the meeting had succeeded", given that the meeting did not succeed. The basic algorithm, Algorithm
2, then applies to the counterfactual query in the following way.
1. Step 1: Step 1 of Algorithm 2 involves asserting the evidence (the meeting did not succeed) and then propagating this
evidence to the background context (Context.CurrentState).
(a) Assert evidence: The meeting was not a success. This is shown in the current state in Figure 15, where the
transition Fail is fired.


[Figure 16 appears here. It shows a twin-network PRM over the shared background SUSPENDED(PT). The actual space (left) and the counterfactual space (right) each contain the nodes F(I,S), A(P,T), A(I,T), and Precond(T) for the participants Part 1 = Palestine and Part 2 = Israel.]

Figure 16: The state is a Probabilistic Relational Model (PRM) that captures the two partitions (actual and counterfactual) as connected networks in the narrative. The precondition for talks is a result of the agreement of both participants (Israel (I) and Palestine (P)) to talk. This is modeled as the propositions A(I,T) and A(P,T). In the dual network model of (Balke and Pearl 1994), the actual network has the evidence asserted that Israel does not freeze settlements (F(I,S) = False in the actual network). This is shown as the red circle in the actual network (left) of the figure above. In the counterfactual network this assertion is changed (Israel freezes settlements, so F(I,S) = True in the counterfactual network). This is shown as the green circle in the counterfactual network of the figure above. In both the actual and the counterfactual networks, we assert that Israel agrees to the talks (shown as the green circle for A(I,T) in both networks). Standard belief propagation algorithms compute the belief that the talks could restart if Israel agreed to freeze settlements, given that Israel is not freezing settlements (the evidence). Notice that information flows from the actual to the counterfactual network only through the background conditions. Applying Algorithm 2 to the networks makes the counterfactual precondition for talks true, while in the actual network the precondition evaluates to false. Thus in the counterfactual case where Israel freezes settlements, we conclude that the precondition for restarting talks is true. For details, please see Algorithm 2.
(b) Compute the background context P(Context.CurrentState | ¬Success). Clearly there is no change in the current state, since the only way to influence the current state is through the Success transition in the graph. Hence P(Context.CurrentState | ¬Success) = P(Context.CurrentState) = Suspended.
2. Step 2: Step 2 involves simulating the event corresponding to the counterfactual and running the temporal
projection algorithm (Algorithm 1).
(a) First the Success transition is fired. In the model (see Figure 15), the preconditions of the transition are that both participants agree to the talks (A(I,T) ∧ A(P,T)). Setting these preconditions (the third precondition, done, is already true) and firing the Success transition leads to the precondition of Restart being produced (see Figure 15).
(b) Propagating evidence through the network implies that the following condition holds:

A(I,T) ∧ A(P,T) ⇒ Precond(T)    (9)

This enables the Restart transition of Event 2, which when fired leads to Talks.Ready holding in


the event corresponding to the peace talks. Thus P(Context.Restart | Event1.Success) = 1 in the deterministic case.
Together, the two steps show that the counterfactual query Context.Restart | Event1.Success is true. Hence, if the meeting had been a success, the talks could restart.
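The two-step evaluation above can be sketched in code. The following is a minimal, illustrative sketch (not the paper's X-net implementation) of Algorithm 2 on the deterministic example: names such as `Event`, `done`, and the place/transition labels are stand-ins for the model in Figure 15.

```python
# Toy event controller: places hold boolean tokens, and a transition
# fires only when all of its input places are marked.
class Event:
    def __init__(self, places):
        self.marking = {p: False for p in places}

    def fire(self, pre, post):
        if all(self.marking[p] for p in pre):   # transition enabled?
            for p in post:
                self.marking[p] = True
            return True
        return False

# Background context: the peace talks are suspended; the meeting is done.
meeting = Event(["done", "A(I,T)", "A(P,T)", "success"])
talks = Event(["suspended", "precond(T)", "ready"])
meeting.marking["done"] = True
talks.marking["suspended"] = True

# Step 1: asserting the actual evidence (the meeting failed) leaves the
# background context unchanged: the talks stay suspended.
assert talks.marking["suspended"]

# Step 2: simulate the counterfactual. Both parties agree, so the
# Success transition fires, producing the precondition for Restart.
meeting.marking["A(I,T)"] = True
meeting.marking["A(P,T)"] = True
if meeting.fire(["done", "A(I,T)", "A(P,T)"], ["success"]):
    talks.marking["precond(T)"] = True
talks.fire(["suspended", "precond(T)"], ["ready"])

print(talks.marking["ready"])   # True: the talks could restart
```

In the deterministic case this reproduces P(Context.Restart | Event1.Success) = 1; the full model would run belief propagation over the stochastic version of the same structure.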

9.2  Evaluation of If Israel had agreed to freeze settlements, the peace talks could restart in New York this week

Figure 16 shows the expanded network that captures the state variables shown in Figure 15. The state is a Probabilistic Relational Model (PRM) that captures the multiple partitions (spaces) in the narrative. To keep things simple for exposition, we focus only on one of the preconditions for talks: the agreement of both participants (Israel and Palestine) to talk. The same techniques outlined here could be extended to other resources (location, time, personnel).
In the dual network model of (Balke and Pearl 1994), the actual network has the evidence asserted that Israel does not
freeze settlements. In the counterfactual network this assertion is changed (Israel freezes settlements). Standard belief
propagation algorithms compute the belief that the talks could restart if Israel agreed to freeze settlements, given that
Israel is not freezing settlements (the evidence). Notice that the information flow from the actual to the counterfactual
networks is only through the background conditions.
Consider the model represented in Figure 16. We explicitly outline the model (the actual situation) and the counterfactual submodel (the counterfactual situation). Again, the background here is Context.CurrentState = Suspended and the evidence is ¬Freeze(Israel, Settlements). The counterfactual query is P(Context.CurrentState = Ready | Freeze(Israel, Settlements)) (see the graph in Figure 15).
1. Model M :
(a) F(I,S) is the Freezing (F) of settlements (S) by Israel (I). A(P,T) represents the Agreement (A) for talks (T) by Palestine (P). A(I,T) represents the Agreement (A) for talks (T) by Israel (I). Precond(T) represents the satisfaction of one of the preconditions for talks (that both participants agree).
(b) The model constraints are

F(I,S) ⇒ A(P,T).    (10)
Precond(T) = A(P,T) ∧ A(I,T).

(c) The evidence is ¬F(I,S).
2. Consider the submodel represented by the change in the value of F(I,S) from ¬F(I,S) to F(I,S). We denote the new network of the submodel M_F(I,S) by starred (*) suffixes.
(a) F*(I,S) is the Freezing (F) of settlements (S) by Israel (I). A*(P,T) represents the Agreement (A) for talks (T) by Palestine (P). A*(I,T) represents the Agreement (A) for talks (T) by Israel (I). Precond*(T) represents the satisfaction of one of the preconditions for talks (that both participants agree).
(b) The model constraints are

F*(I,S) ⇒ A*(P,T).    (11)
Precond*(T) = A*(P,T) ∧ A*(I,T).

(c) The evidence is F*(I,S).


Given this situation, the counterfactual to be evaluated is P(Precond*(T) | F*(I,S), ¬F(I,S)). As in (Pearl 2000), this can be determined by the network shown in Figure 16. The basic construction follows the dual network model (Pearl 2000), where there is one network for the actual (the left partition) and one for the counterfactual (starred, right partition).
The steps of the algorithm are as follows.

1. Step 1: Assert and propagate the known evidence (¬F(I,S)) on the actual network.
(a) Assert ¬F(I,S) on the actual network.
(b) Calculate P(Context.CurrentState | ¬F(I,S)). Clearly, from the model constraints, ¬F(I,S) ⇒ ¬Precond(T), and as in the earlier example P(Context.CurrentState | ¬Precond(T)) = P(Context.CurrentState) = Suspended.10
2. Step 2: This involves running the temporal projection algorithm on the network from Step 1, with the counterfactual evidence asserted.
(a) Assert the counterfactual evidence (F*(I,S)) on the counterfactual network. Notice this is shown on the right side of Figure 16. In this case, the evidence corresponds to a direct intervention in the network, since the Freeze Settlements action is modeled as a simple proposition. If the Freeze Settlements action were a full-fledged action (like the peace talk and meeting events in the earlier example), the resulting simulation would be run as before. In this special case where the action is abstracted into a proposition, the do(x) operator (Pearl 2000) can be directly applied to set the value of F*(I,S) to True.
(b) Compute the query P(Precond*(T) | F*(I,S), ¬F(I,S)) using standard belief propagation algorithms. In the simple case shown here, from the domain constraints Precond(T) = A(I,T) ∧ A(P,T), we can see that Precond*(T) evaluates to True. This sets the precondition variable (see Figure 15), which enables the Restart transition of the peace talks (currently suspended). Firing this transition leads to a token in the Ready state of the X-net network. Hence Context.CurrentState = Ready, which is the evaluation result of the counterfactual.
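The dual-network evaluation can be sketched concretely. The following is an illustrative sketch using deterministic boolean propagation in place of full belief propagation; variable names follow Figure 16, with the evidence ¬F(I,S) modeled as `F_IS=False` in the actual network.

```python
def propagate(F_IS, A_IT):
    """Apply the model constraints: F(I,S) => A(P,T);
    Precond(T) = A(P,T) and A(I,T)."""
    A_PT = F_IS                  # freezing settlements brings Palestine to the table
    precond_T = A_PT and A_IT
    return precond_T

# Actual network: the evidence is that Israel does not freeze settlements.
actual = propagate(F_IS=False, A_IT=True)

# Counterfactual (starred) network: the antecedent sets F*(I,S) = True;
# only the background conditions are shared with the actual network.
counterfactual = propagate(F_IS=True, A_IT=True)

print(actual, counterfactual)   # False True
```

As in the text, Precond*(T) holds only in the counterfactual partition; in the probabilistic setting the same structure would be queried with belief propagation rather than boolean evaluation.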

9.3  Metaphoric Counterfactuals

Consider the following sentence.


(12) If Israel had turned around on the settlement issue, Abbas would have moved forward on the peace talks.
In this case, both the antecedent and the consequent clause are metaphorical. (Narayanan 1997) describes a model
of metaphor interpretation that is completely compatible with the counterfactual model proposed here. The basic idea
behind the model is that the metaphoric inferences provide assertions on the network model in the target domain.
In this case, both metaphors use the Event Structure Metaphor (ESM) mapping (Lakoff 1993; Lakoff 1994)
where spatial motion is mapped onto abstract action. The ESM is an extremely general mapping that appears to be
a cross-linguistic and cross-cultural structuring of our understanding of abstract action in terms of spatial motion,
manipulation, and force interactions. Some examples of the mapping include:
1. States are locations (bounded regions in space).
2. Changes are movements (into or out of bounded regions).
3. Causes are forces.
4. Actions are self-propelled movements.
5. Purposes are destinations.
6. Direction of motion is sequence of actions.
7. Means are paths (to destinations).
8. Difficulties are impediments to motion.
10 We could use a probabilistic estimate by introducing an exogenous variable U (another set of reasons, perhaps US persuasion) to force A(P,T) and thus Precond(T), in which case the probability P(Context.CurrentState | ¬F(I,S), u) would have to be calculated using standard belief propagation. This is omitted to ease exposition.


9. Expected progress is a travel schedule; a schedule is a virtual traveler, who reaches pre-arranged destinations at
pre-arranged times.
10. External events are large, moving objects.
11. Long term, purposeful activities are journeys.
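To make the role of such mappings concrete, here is a hedged sketch of how an ESM lookup might project source-domain (spatial motion) predicates onto target-domain (abstract action) assertions, in the spirit of KARMA (Narayanan 1997). The mapping entries and predicate names are illustrative, not the system's actual encoding.

```python
# Illustrative ESM entries: spatial-motion predicate -> target-domain template.
ESM = {
    "turn-around":  "reverse(action)",     # reversal of motion -> reversal of action
    "move-forward": "progress(action)",    # forward motion -> progress
    "fall-into":    "enter(state)",        # motion into a region -> change of state
    "climb-out":    "exit(state)",         # motion out of a region -> change of state
    "impediment":   "difficulty(action)",  # obstacle -> difficulty
}

def project(source_predicate, target_entity):
    """Project a spatial-motion predicate onto a target-domain assertion."""
    template = ESM[source_predicate]
    return template.replace("action", target_entity).replace("state", target_entity)

# "Israel turned around on the settlement issue" ->
print(project("turn-around", "settlements"))   # reverse(settlements)
```

In the full system the projected assertion is then evaluated against the network model of the target domain, as described in the next paragraph.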
(Narayanan 1997) describes the use of the model presented here to analyze expressions like the example above. In effect, the system described is able to conclude that the antecedent expression of a reversal in direction (turn around) corresponds to a reversal in action on the settlements. Given the current state ¬F(I,S) and the two choices (F(I,S), ¬F(I,S)), the system concludes that the antecedent is F(I,S). Similarly, the consequent uses the ESM mappings Actions are self-propelled movements and Progress is forward motion to conclude that the consequent corresponds to the assertion that the Palestinians are agreeable to the peace talks restarting, or A(P,T). This makes the counterfactual query P(A*(P,T) | F*(I,S), ¬F(I,S)), which is a subpart of the example in the previous section, and thus evaluates to true.
In general, the counterfactual evaluation algorithm is modified as shown below to deal with the combination of metaphors and counterfactuals. This is the three-step process shown in Algorithm 3. The first step is to run the metaphor interpretation algorithm to identify the counterfactual assertion and the counterfactual query. The rest of the algorithm is identical to Algorithm 2.
Algorithm 3 Metaphoric Counterfactual Evaluation
1. Step 1: Metaphor interpretation: Run the metaphor interpretation algorithm KARMA (Narayanan 1997) on the antecedent and consequent expressions. Assert the counterfactual evidence e_c to create the submodel M_x. Compute the metaphoric consequent B and compose the counterfactual query P(B | e, e_c).
2. Step 2: Abduction: Update the PRM P(u) by the evidence e to obtain P(u|e).
3. Step 3: Temporal Projection:
(a) Assert counterfactual: Set the initial marking M0 based on the counterfactual assertion: ∀p ∈ Pre_A : M0(p) = 1; ∀p ∉ Pre_A : M0(p) = 0. Fire the enabled transitions T_e0 ⊆ T of M with initial marking M0. The next-state function described earlier takes the system to a new marking M_int0. The state corresponding to this marking is S_int = {p : M_int(p) = 1}; for all p ∉ S_int, M1(p) = 0.
(b) Predictive Update: Run the belief revision procedure to return the probability of B, the consequence of the counterfactual. If B is an action state, then fire the enabled transitions based on the new marking and repeat the previous two steps.
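The control flow of Algorithm 3 on example (12) can be sketched as follows. This is a minimal illustration only: the metaphor-interpretation step is stubbed out with a fixed lookup standing in for KARMA, and the counterfactual step reuses the boolean constraint F(I,S) ⇒ A(P,T) from Section 9.2. All function and assertion names are hypothetical.

```python
def interpret_metaphor(antecedent, consequent):
    """Step 1 (stub for KARMA): map metaphoric clauses to target-domain
    assertions via the Event Structure Metaphor."""
    assertions = {
        "Israel turned around on settlements": ("F(I,S)", True),  # reversal -> F*(I,S)
        "Abbas moved forward on talks":        ("A(P,T)", True),  # progress -> A*(P,T)
    }
    return assertions[antecedent], assertions[consequent]

def evaluate_counterfactual(evidence, query_var):
    """Steps 2-3: assert the counterfactual evidence and project forward
    under the constraint F(I,S) => A(P,T)."""
    state = dict([evidence])
    if state.get("F(I,S)"):
        state["A(P,T)"] = True
    return state.get(query_var, False)

ec, (q, _) = interpret_metaphor("Israel turned around on settlements",
                                "Abbas moved forward on talks")
print(evaluate_counterfactual(ec, q))   # True
```

The point of the sketch is the pipeline shape: metaphor interpretation produces the counterfactual assertion and query, after which evaluation proceeds exactly as in Algorithm 2.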

To our knowledge, ours is the only computational account that is able to deal with the combination of metaphor and counterfactual reasoning in the kinds of complex scenarios presented here.

10  Discussion and future work

Counterfactuals are mental simulations of variations on a theme. They refer to imagined alternatives to something
that has actually occurred. Counterfactual reasoning is basic to human cognition and is ubiquitous in commonsense
reasoning as well as in formalized discourse. They play a significant role in other cognitive processes such as conceptual learning, decision making, social cognition, mood adjustment, and performance improvement. They help us
to process causal relations by highlighting possible causal antecedents of an unpleasant outcome. They appear to be
a pervasive part of normal cognitive processing and may occur fairly often outside of conscious awareness. Without
counterfactual thinking a person would find it more difficult to avoid repetition of past mistakes, to adjust their mood
after an unpleasant event, and to reason effectively about unpleasant events.


To our knowledge, our model is the first step toward a cognitive model of counterfactuals. Our treatment of
counterfactuals comes from independent considerations of cognitively motivated event structure representation useful
for event coordination and for language processing. We believe the same structures capable of event coordination are
used to generate structured alternatives or counterfactuals. Our ontology and model of event structure is thus able to
make detailed predictions of the type, content and neural architecture of counterfactuals.
From a variety of experimental evidence, it appears that the Prefrontal Cortex (PFC) is centrally implicated in
deficits of counterfactual reasoning. It also appears that the deficits in counterfactuals are correlated with other aspects of PFC function. We believe that the commonality in PFC involvement between counterfactual deficits and the other tasks suggests that the PFC is involved in encoding complex events by coordinating multiple distributed aspects of event
knowledge. This view is consistent with recent representational theories of PFC function (in sequencing events, multitasking, and in coordinating multiple events). Our mechanistic account and computational model offers the advantage
of being able to make detailed predictions related to both representation and to processing and can potentially shed
light on the functional architecture of the PFC and its role in event coordination and integration.
From the purely computational side, we believe that probabilistic, graphical models are inherently desirable for
action representation, inference and learning. The graph-based representation allows us to formally state and reason
about inter-schema relations declaratively while using their real-time execution capability for inference. This allows our representation to be used declaratively for specification and design, or procedurally for projection and automatic inference. The factorized topology of the graph supports recursive parameter estimation techniques (Jordan 1999)
and provides powerful constraints and inductive bias for relational structure learning (Getoor et al. 2001; Pfeffer &
Koller 2000). We believe these properties to be essential for representations that are to be used both for acting and for
reasoning about action descriptions.
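The dual declarative/procedural use of a factorized model can be illustrated with a two-node network. This is a toy sketch under assumed numbers (P(Precond)=0.7, P(Restart|Precond)=0.9): the dictionary is the declarative reading of the inter-variable relations, and forward sampling is one procedural reading of the same structure.

```python
import random

# Declarative reading: the graph structure and CPTs as data.
model = {
    "Precond": {"parents": [], "cpt": {(): 0.7}},
    "Restart": {"parents": ["Precond"], "cpt": {(True,): 0.9, (False,): 0.0}},
}

def sample(model, rng):
    """Procedural reading: forward-sample the network in topological order."""
    values = {}
    for var in ["Precond", "Restart"]:
        parents = tuple(values[p] for p in model[var]["parents"])
        values[var] = rng.random() < model[var]["cpt"][parents]
    return values

rng = random.Random(0)   # seeded for reproducibility
estimate = sum(sample(model, rng)["Restart"] for _ in range(10000)) / 10000
print(estimate)          # close to 0.7 * 0.9 = 0.63
```

The same `model` object could equally be handed to an exact inference or parameter-estimation routine, which is the dual use the paragraph above claims for the X-net representation.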
The work described here is a first step toward a comprehensive treatment of counterfactuals. There are several
possible research threads opened up by the current work.
1. A complete computational model of the different types of counterfactuals. We would like to use the event and
action modeling framework to investigate the generation and interpretation of different types of counterfactuals.
(a) As far as we are aware, our model is the most detailed proposal and computational realization of event structure. Early application of the counterfactual evaluation algorithm, Algorithm 2, appears to be able to tackle some previously unsolved issues in counterfactuals, including metaphoric counterfactuals and the interaction of counterfactuals with other event construals. Systematic experimentation with the framework and algorithms would allow us to investigate the scope and richness of counterfactual inference in a cognitively valid framework with realistic complexity.
(b) A second aspect of the model would be to simulate the pathological effects of defective counterfactual
reasoning. It is well known that in specific diseases (such as schizophrenia or depression), counterfactual
generation is severely impacted in disease specific ways (for instance, trouble keeping the alternative and
real worlds distinct, or excessively negative counterfactuals which do not change intention or behavior).
Also, the sudden onset of traumatic events often leads to the hyperactive generation of counterfactuals
(such as those accompanying regret). Having a computational model enables us for the first time to explore
in detail the connection of specific states and pathologies with counterfactual reasoning.
2. The language of counterfactuals. Languages have grammatical constructions (such as the subjunctive in English)
as well as lexical and pragmatic devices to indicate the counterfactuality of an utterance. The use of counterfactuals in language interacts strongly with other indicators of event structure, including linguistic aspect, tense marking, and epistemic distance. In their treatment of counterfactuals, (Dancygier & Sweetser 2005) use
the theory of mental spaces (Fauconnier 1985; Fauconnier 1997). The specific construction and the pragmatic
context determine the type and structure of the mental space built during the analysis process.
In our model, the different simulations correspond to the different mental spaces that are produced. The simulations make use of multiple cognitive structures including grammatical constructions, metaphor, metonymy, frames, and schemas. Mappings between simulations produce blends that are emergent structures. Blends thus combine information from multiple sources (including metaphor, modality, aspect, counterfactuals) to produce an overall interpretation. (Narayanan 1997) produced the first example of a computational model of blending in

the case of metaphoric language, where the input spaces were spatial motion and abstract economic policies. The system was able to process information about abstract policies in sentences such as "The economy has fallen into recession and is on the verge of climbing out," where the metaphoric projection allows the spatial motion terms "falling" and "climbing out" to be interpreted in the context of recessions and abstract economic states.
In combination with the X-net framework, three recent efforts within the NTL group at Berkeley make a computational model covering linguistic counterfactuals possible. Specifically, the NTL group has formalized cognitive structures in grammar (including frames, schemas, mental spaces, and metaphor) in Embodied Construction
Grammar (ECG) (Bergen & Chang 2002; Feldman et al. 2009). In his recent dissertation, John Bryant has built
an analysis program that is able to produce a simulation structure from an utterance using ECG (Bryant 2008).
Additionally, Steven Sinha, in his dissertation, has extended the simulation framework to linguistic frames (via FrameNet (http://framenet.icsi.berkeley.edu)) and to external web-based ontologies (encoded as OWL and RDF triples), and has shown the framework to be capable of answering questions about complex events, including questions about prediction, ability, causes, hypotheticals, and hypothesis disambiguation (Sinha 2008). Together, this new work provides all the components needed to build a cognitively based computational model of linguistic
counterfactuals.
3. Counterfactuals and neural architecture of complex event representation. Previous work on PFC function
strongly suggests that the PFC is involved in coordinating complex events. However, the lack of a mechanism
or detailed computational model has rendered identification and functional descriptions of specific systems and
circuits impossible. We believe that our ontology and structured dynamic model of complex events provides the
necessary mechanism for a detailed investigation into the neural encoding of complex event structure. This could potentially lead to a new understanding not only of the role of the PFC but also of the neural architecture of complex event representation.
4. Differences in culture and individuals. Our model predicts that differences in the content and structure of action
encodings would relate to differences in the generation and comprehension of counterfactuals. For instance,
different obligations in different cultures, or different notions of goals or enablement and causation would lead to
differences in counterfactual generation and processing. Within the same culture there are likely to be differences between novices and experts, between creative and less creative individuals, and also differences in subgroups
based on age or gender. It would be interesting to use known differences in action and goal representation to
predict counterfactual generation and processing difficulty and also use differences in counterfactuals to infer
differences in action and goal representations.


References
Addis, D.R., A.T. Wong, & D.L. Schacter. 2007. Remembering the past and imagining the future: Common and distinct neural substrates during event construction and elaboration. Neuropsychologia 45.1363–1377.
Anderson, Corin R., Pedro Domingos, & Daniel Weld. 2002. Relational Markov models and their application to adaptive web navigation. In Proc. KDD-2002.
Badre, David. 2008. Cognitive control, hierarchy, and the rostro-caudal organization of the frontal lobes. Trends in Cognitive Sciences 12.193–200.
Badre, David, Joshua Hoffman, Jeffrey Cooney, & Mark D'Esposito. 2009. Hierarchical cognitive control deficits following damage to the human frontal lobe. Nature Neuroscience 12.515–522.
Barrett, Leon. 2010. An Architecture for Structured, Concurrent, Real-time Action. Computer Science Division, University of California at Berkeley dissertation.
Bause, F., & P. Kritzinger. 1996. Stochastic Petri Nets: An Introduction to the Theory. Vieweg Verlag.
Beck, S., E. Robinson, D. Carroll, & I. Apperley. 2006. Children's thinking about counterfactuals and future hypotheticals. Child Development 77.413–426.
Bergen, Benjamin K., & Nancy C. Chang. 2002. Simulation-based language understanding in Embodied Construction Grammar. In Construction Grammar(s): Cognitive and Cross-language Dimensions, ed. by Jan-Ola Östman. John Benjamins.
Boyen, Xavier, & Daphne Koller. 1998. Tractable inference for complex stochastic processes. In Proceedings of the Conference on Uncertainty in AI, UAI-1998, 33–42.
Bryant, John Edward. 2008. Best-Fit Constructional Analysis. Computer Science Division, University of California at Berkeley dissertation.
Byrne, Ruth. 2005. The Rational Imagination. Cambridge, England: Cambridge University Press.
Buckner, R.L., & D.C. Carroll. 2007. Self-projection and the brain. Trends in Cognitive Sciences 11.49–57.
Buckner, R.L., J.R. Andrews-Hanna, & D.L. Schacter. 2008. The brain's default network: Anatomy, function, and relevance to disease. The Year in Cognitive Neuroscience, Annals of the New York Academy of Sciences 1124.1–38.
Camille, Nathalie, Giorgio Coricelli, Jerome Sallet, Pascale Pradat-Diehl, & Jean-René Duhamel. 2004. The involvement of the orbitofrontal cortex in the experience of regret. Science 304.1167–1169.
Chang, Nancy, Daniel Gildea, & Srini Narayanan. 1998. A dynamic model of aspectual composition. In Proc. 20th Cognitive Science Society Conference, Madison, Wisconsin.
Chang, Nancy, Srini Narayanan, & Miriam R.L. Petruck. 2002a. Putting frames in perspective. In Proc. Nineteenth International Conference on Computational Linguistics (COLING 2002).
Chang, Nancy, Srinivas Narayanan, & Miriam R.L. Petruck. 2002b. From frames to inference. In Proceedings of the First International Workshop on Scalable Natural Language Understanding, Heidelberg. SCANALU.
Chang, Nancy, Srinivas Narayanan, & Miriam R.L. Petruck. 2002c. Putting frames in perspective. In Proceedings of the 19th International Conference on Computational Linguistics, Taipei. COLING.
Cheng, P. 1997. From covariation to causation: a causal power theory. Psychological Review 104.367–405.
Ciardo, Gianfranco, Reinhard German, & Christoph Lindemann. 1994. A characterization of the stochastic process underlying a stochastic Petri net. Software Engineering 20.506–515.

Costello, Tom, & John McCarthy. 1999. Useful counterfactuals. Electronic Transactions on Artificial Intelligence (ETAI) 3.
Dancygier, Barbara, & Eve Sweetser. 2005. Mental Spaces in Grammar: Conditional Constructions. Cambridge: Cambridge University Press.
Dreher, Jean-Claude, E. Koechlin, M. Tierney, & J. Grafman. 2008. Damage to the fronto-polar cortex is associated with impaired multi-tasking. PLoS One 3.1–9.
Epstude, K., & N.J. Roese. 2008. The functional theory of counterfactual thinking. Personality and Social Psychology Review 12.168–192.
Fauconnier, Gilles. 1985. Mental Spaces: Aspects of Meaning Construction in Natural Language. Cambridge, Mass. and London: MIT Press/Bradford.
Fauconnier, Gilles. 1997. Mappings in Thought and Language. Cambridge, England: Cambridge University Press.
Fauconnier, Gilles, & Mark Turner. 2002. The Way We Think: Conceptual Blending and the Mind's Hidden Complexities. New York: Basic Books.
Feldman, J. 1990. Computational constraints on higher neural representations. In Computational Neuroscience. Cambridge, MA: MIT Press.
Feldman, J. 2006. From Molecule to Metaphor. Cambridge, MA: MIT Press.
Feldman, J., E. Dodge, & J. Bryant. 2009. A neural theory of language and embodied construction grammar. In The Oxford Handbook of Linguistic Analysis, ed. by B. Heine & H. Narrog.
Feldman, J., Mark A. Fanty, Nigel H. Goddard, & Kenton J. Lynne. 1988. Computing with structured connectionist networks. Communications of the ACM 170–187.
Feldman, J., & S. Narayanan. 2004. Embodied meaning in a neural theory of language. Brain and Language 89.385–392.
Fillmore, Charles J., & Collin F. Baker. 2010. A frame approach to semantic analysis. In Oxford Handbook of Linguistic Analysis, ed. by Bernd Heine & Heiko Narrog. OUP.
German, T.P., & S. Nichols. 2003. Children's counterfactual inferences about long and short causal chains. Developmental Science 6.514–523.
Getoor, Lise, Nir Friedman, Daphne Koller, & Avi Pfeffer. 2001. Learning probabilistic relational models. In Relational Data Mining, ed. by Dzeroski & Lavrac, 307–335. SV.
Ginsberg, M.L. 1986. Counterfactuals. Artificial Intelligence 30.35–79.
Goodman, Nelson. 1983. Fact, Fiction, and Forecast. Cambridge, Mass.: Harvard University Press, 4th edition.
Guajardo, N.R., & K.J. Turley-Ames. 2004. Preschoolers' generation of different types of counterfactual statements and theory of mind understanding. Cognitive Development 19.53–80.
Guttentag, R., & J. Ferrell. 2004. Reality compared with its alternatives: Age differences in judgments of regret and relief. Developmental Psychology 40.764–775.
Harding, Jennifer R. 2007. Evaluative stance and counterfactuals. Language and Literature 16.263–280.
Hofstadter, Douglas. 1979. Gödel, Escher, Bach: An Eternal Golden Braid. Basic Books.
Hooker, C., N.J. Roese, & S. Park. 2000. Impoverished counterfactual thinking is associated with schizophrenia. Psychiatry 63.326–335.

Jensen, Finn V. 1996. Introduction to Bayesian Networks. Springer-Verlag.
Johnson, Mark. 1987. The Body in the Mind. University of Chicago Press.
Jordan, Michael I. 1999. Learning in Graphical Models. MIT Press.
Kahneman, D., & A. Tversky. 1982. The psychology of preferences. Scientific American 246.160–173.
Knight, R.T., & M. Grabowecky. 1995. Escape from linear time: Prefrontal cortex and conscious experience. In M.S. Gazzaniga (Ed.), The Cognitive Neurosciences, 1357–1371.
Koechlin, Etienne, Adrian Danek, Yves Burnod, & Jordan Grafman. 2002. Medial prefrontal and subcortical mechanisms underlying the acquisition of motor and cognitive action sequences in humans. Neuron 35.371–381.
Krueger, F., A. Barbey, & J. Grafman. 2008. The medial prefrontal cortex mediates social event knowledge. Trends in Cognitive Sciences 749.1–7.
Lakoff, George. 1987. Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. University of Chicago Press.
Lakoff, George. 1993. Cognitive phonology. In The Last Phonological Rule, ed. by John Goldsmith. Chicago: University of Chicago Press.
Lakoff, George. 1994. What is metaphor. In Advances in Connectionist Theory, V3: Analogical Connections, ed. by J. Barnden & K. Holyoak. Addison-Wesley.
Lakoff, George. 2009. The Neural Theory of Metaphor. SSRN eLibrary.
Lakoff, George, & Mark Johnson. 1980. Metaphors We Live By. University of Chicago Press.
Landman, J. 1993. Regret: The Persistence of the Possible. New York: Oxford University Press.
Langacker, Ronald W. 1991. Concept, Image, and Symbol: The Cognitive Basis of Grammar. Cognitive Linguistics Research. Berlin and New York: Mouton de Gruyter.
Lewis, David. 1973. Counterfactuals. Blackwell Publishers.
Makin, Joseph. 2008. A Computational Model of Human Blood Clotting: Simulation, Analysis, Control, and Validation. Computer Science Division, University of California at Berkeley dissertation.
Mandel, D.R., D.J. Hilton, & P. Catellani. 2005. The Psychology of Counterfactual Thinking. London: Routledge.
Markman, K.D., I. Gavanski, S.J. Sherman, & M.N. McMullen. 1993. The mental simulation of better and worse possible worlds. Journal of Experimental Social Psychology 29.87–109.
Markman, K.D., Matthew N. McMullen, Ronald A. Elizaga, & Nobuko Mizoguchi. 2006. Counterfactual thinking and regulatory fit. Judgment and Decision Making 1.98–107.
Markman, K.D., & M.N. McMullen. 2003. A reflection and evaluation model of comparative thinking. Personality and Social Psychology Review 7.244–267.
McNamara, P., R. Durso, A. Brown, & A. Lynch. 2003. Counterfactual cognitive deficit in persons with Parkinson's disease. Journal of Neurology, Neurosurgery, and Psychiatry 74.1065–1070.
Medvec, V.H., S.F. Madey, & T. Gilovich. 1995. When less is more: Counterfactual thinking and satisfaction among Olympic athletes. Journal of Personality and Social Psychology 69.603–610.


M ILCH , B RIAN, B HASKARA M ARTHI, S TUART RUSSELL, DAVID S ONTAG, DANIEL L. O NG, & A NDREY
KOLOBOV. 2007. Blog: Probabilistic models with unknown objects. In Introduction to Statistical Relational
Relational Learning, ed. by List Getoor & Ben Tasker. MIT Press.
M ILLER , E.K., & J.D. C OHEN. 2001. An integrative theory of prefrontal cortex function. Annual Review Neuroscience 24.167202.
M URPHY, K EVIN, 2002. Dynamic Bayesian Networks: Representation, Inference, and Learning. Computer Science
Division, University of California at Berkeley dissertation.
NARAYANAN , S RINI, 1997. Knowledge-based Action Representations for Metaphor and Aspect (KARMA). Computer
Science Division, University of California at Berkeley dissertation.
. 1999a. Moving right along: A computational model of metaphoric reasoning about events. In Proc. Sixteenth
National Conference of Artificial Intelligence (AAAI-99). AAAI Press, Menlo Park.
. 1999b. Reasoning about actions in narrative understanding. In Proc. Sixteenth International Joint Conference
on Artificial Intelligence (IJCAI-99). Morgan Kaufmann Press.
. 2003. Cortico-subcortical loops and cognition: A computational model and preliminary results. Neurocomputing
52.605614.
, & S. M C I LLRAITH. 2003. Analysis and simulation of web services. Computer Networks 42.675693.
NASCO , S.A., & K.L. M ARSH. 1999. Gaining control through counterfactual thinking. Personality and Social
Psychology Bulletin 25.556568.
N IEDENTHAL , P.M., TANGNEY J. P., & G AVANSKI I. 1994. If only i werent versus if only i hadnt: Distinguishing
shame and guilt in counterfactual thinking. Journal of Personality and Social Psychology 67.585595.
Pearl, Judea. 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo,
CA: Morgan Kaufmann.
——. 2000. Causality. Cambridge, England: Cambridge University Press.
Pfeffer, Avi, 2000. Probabilistic Reasoning for Complex Systems. Stanford University dissertation.
——, & Daphne Koller. 2000. Semantics and inference for recursive probability models. In AAAI/IAAI, 538–544.
Richardson, Matthew, & Pedro Domingos. 2006. Markov logic networks. Machine Learning 62.107–136.
Riggs, K.J., D.M. Peterson, E.J. Robinson, & P.R. Mitchell. 1998. Are errors in false belief tasks symptomatic
of a broader difficulty with counterfactuality? Cognitive Development 13.73–90.
Roese, N.J. 1997. Counterfactual thinking. Psychological Bulletin 121.133–148.
——, & M. Morrison. 2009. The psychology of counterfactual thinking. Historical Social Research 34.16–26.
——, & J.M. Olson. 2003. Counterfactual thinking. Encyclopedia of Cognitive Science 34.858–861.
——, L.T. Sanna, & A.D. Galinsky. 2005. The mechanics of imagination: Automaticity and control in counterfactual
thinking. New York: Oxford University Press.
Scheffczyk, Jan, Collin F. Baker, & Srini Narayanan. 2010. Reasoning over natural language text by means of
FrameNet and ontologies. In Ontology and the Lexicon: A Natural Language Processing Perspective, ed. by
Chu-Ren Huang, Nicoletta Calzolari, Aldo Gangemi, Alessandro Lenci, Alessandro Oltramari, & Laurent Prévot,
Studies in Natural Language Processing, chapter 4, 53–71. Cambridge, UK: Cambridge University Press.
Expanded version of paper at OntoLex, 2006. (ISBN-13: 9780521886598).
Sinha, Steven, 2008. Answering Questions about Complex Events. Computer Science Division, University of
California at Berkeley dissertation.
Talmy, Leonard. 1988. Force dynamics in language and cognition. Cognitive Science 12.49–100.
——. 2000. Toward a Cognitive Semantics. MIT Press.
Tetlock, P.E., & A. Belkin. 1996. Counterfactual thought experiments in world politics: Logical, methodological,
and psychological perspectives. Princeton, NJ: Princeton University Press.
——, O.V. Kristel, S.B. Elson, M.C. Green, & J.S. Lerner. 2000. The psychology of the unthinkable: Taboo
trade-offs, forbidden base rates, and heretical counterfactuals. Journal of Personality and Social Psychology
79.173–196.
Ursu, S., & C.S. Carter. 2005. Outcome representations, counterfactual comparisons and the human orbitofrontal
cortex: Implications for neuroimaging studies of decision-making. Cognitive Brain Research 23.51–60.
Weisberg, D.D., & P. Bloom. 2009. Young children separate multiple pretend worlds. Developmental Science
12.699–705.
Wood, J.N., & J. Grafman. 2003. Human prefrontal cortex: Processing and representational perspectives. Nature
Reviews Neuroscience 4.139–147.
