


Realism Regained
An Exact Theory of Causation, Teleology, and the Mind



OXFORD UNIVERSITY PRESS
Oxford New York Athens Auckland Bangkok Bogota Buenos Aires Calcutta Cape Town Chennai Dar es Salaam Delhi Florence Hong Kong Istanbul Karachi Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi Paris Sao Paulo Singapore Taipei Tokyo Toronto Warsaw
and associated companies in Berlin Ibadan

Copyright © 2000 by Robert C. Koons

Published by Oxford University Press, Inc.
198 Madison Avenue, New York, New York 10016

Oxford is a registered trademark of Oxford University Press

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of Oxford University Press.

Library of Congress Cataloging-in-Publication Data
Koons, Robert C.
Realism regained: an exact theory of causation, teleology, and the mind / Robert C. Koons
p. cm.
Includes bibliographical references and index.
ISBN 0-19-513567-9
1. Causation. I. Title.
BD541.K66 2000
122-dc21 99-054666

9 8 7 6 5 4 3 2 1 Printed in the United States of America on acid-free paper

In memory of Jon Barwise, friend and mentor


Causation, the relation of cause to effect, has long been recognized as one of the most central subjects in philosophy. After a period of relative neglect during the era of logical positivism, the late twentieth century has seen a renaissance of interest in causation, as one philosopher after another provides a "causal theory" of this or that phenomenon: reference and meaning, identity and duration, perception and knowledge, information and representation. At the same time, the development of the formal disciplines, including modal logic (the logic of possibility and necessity), probability theory, mereology (the theory of parts and wholes), defeasible or "nonmonotonic" logics (developed in the field of artificial intelligence to represent commonsense inference), and partial semantics (most prominently, the situation theory of Barwise, Perry, and Etchemendy), has provided the tools needed for an exact and comprehensive theory of causation. Up to this point, formal accounts of causation have followed the empiricist strictures laid down by David Hume. These accounts of causation force the concept into the periphery (making the concept of causation dependent on our prior understanding of such theoretical machinery as spatiotemporal location, subjunctive conditionals, experience, and empirical knowledge) and consequently do not mesh with the causal theories that have become so popular in epistemology and the philosophy of mind, which, by contrast, require causation to play a central and non-derivative role. In this book, I construct a non-Humean or realist theory of causation (employing the technical tools mentioned in the preceding paragraph), and I show how this account sheds light on existing causal theories and their outstanding problems. 
In the process, I sketch a metaphysical theory that employs relatively few primitive elements and comprises a well-understood mathematical theory of these elements and a precise account, in terms of these elements, of a wide variety of phenomena, drawn both from our common experience and scientific knowledge. These phenomena include information, teleology and biological function, mental representation, qualia and mental causation, our knowledge of logic, mathematics, and theoretical science, the structure of space and time, the identity and duration of physical objects, and the nature and objectivity of ethical values. I offer what could be called a "naturalistic" account of the normative dimension: the standards of correctness and propriety that are essential to our understanding both of intentionality and of ethics. It builds upon and refines



recent work on the teleological theory of norms on the part of Dretske, Stampe, Millikan, and others. At the same time, the argument of the book is in part directed against a narrowly materialistic ontology. I provide seven independent lines of argument for thinking that we need to recognize the existence of states other than merely physical states; in particular, we must acknowledge the existence of modal facts, including facts of logical, mathematical, and natural necessity. By bringing these modal facts within the scope of causation, I explain how it is possible for us to gain information about them. Consequently, I am able to defend a position that is realist in the sense both of including a version of the traditional correspondence theory of truth and of including an ontology in which mental states, qualia, numbers and sets, objective norms, and modal facts are first-class citizens.

Acknowledgment is made to the following publishers for their kind permission to reprint excerpts from:

"Teleology as Higher-Order Causation: A Situation-Theoretic Account," Minds and Machines 8 (1998): 559-585. Published by Kluwer Academic Publishers; reprinted on pages 82-90, 95-96, 115-116, 135-143, and 203-215.

"Situation-Mereology and the Logic of Causation," Topoi 18 (1999). Published by Kluwer Academic Publishers; reprinted in chapter 3, pages 35-49.

"A New Look at the Cosmological Argument," by Robert C. Koons, American Philosophical Quarterly 34 (April 1997), pages 194-199 and 202-207; reprinted in chapter 9, pages 146-159.

"Information, Representation and the Problem of Error," by Robert C. Koons, published by the Center for the Study of Language and Information, Stanford, California, 1996, pages 333-345; reprinted in chapter 11, pages 181-184.
Work on this book was made possible by a Faculty Research Assignment from the University Research Institute at the University of Texas at Austin, during the spring semester of 1997. I would also like to thank Michael Dunn, the Philosophy Department and the Institute for Advanced Study at Indiana University for their support during my visit in Bloomington during much of that semester. I would also like to thank Anil Gupta, and Gregg Rosenberg, who provided very helpful feedback on early drafts of the book. Jon Barwise provided the inspiration for the formal framework, situation theory, used in this book, and Jon was extraordinarily generous in giving me both his time and his encouragement at the inception of the project. Professor Barwise was one of the most creative and original philosophers of our time. He will be sorely missed. My debt to my teachers, including David Charles at Oriel; Robert M. Adams, David Kaplan, and Tony Martin at UCLA; and, especially, my doctoral supervisor, Tyler Burge, is incalculable. Drafts of several chapters were much improved through discussion with the Naturalism Reading Group at the University of Texas: Daniel Bonevac, Brian Leiter, Cory Juhl, and David Sosa. My colleague Nicholas Asher has played an



indispensable role in the development of my ideas concerning nonmonotonic inference. Professor T. K. Seung has mentored me throughout my years in Austin and opened my eyes to the contemporary relevance of Plato's later philosophy. Ms. Yi Mao provided very helpful feedback, as did two anonymous referees for Oxford University Press. I would also like to thank my editor, Peter Ohlin, for his perseverance in support of this project. Finally, I thank my wife, Debbie, for her patience and support.

Austin, Texas
August 1999
R. C. K.


Contents

1 Introduction
1.1 A Comprehensive Realism
1.2 Metaphysical Method
1.3 An Alternative to Both Physicalism and Mysterianism
1.4 Causal Internalism
1.5 The Ontology of Causation
1.6 The Need for an Indeterministic Model
1.7 A Causal-Probabilistic Theory of Information
1.8 Why an Exact Theory?
1.9 The Big Picture: Preview of Part II
1.10 A Glossary of Symbols

I A Theory of Causation and Information

2 Toward a Unified Theory of Causation
2.1 The Nomological/Deductive Tradition
2.2 Theories of Probabilistic Causation
2.3 Davidson and Event-Tokens
2.4 Lewis's Counterfactual Account
2.5 Mackie's INUS Conditions
2.6 Yablo's Theory
2.7 Branching-Time Models
2.8 Artificial Intelligence and Models of Causal Inference
2.9 Tooley and Cartwright
2.10 Process and Linkage Theories
2.11 Mellor's Theory
2.12 Accounts of Causal Asymmetry
2.13 Distinctive Features of My Theory

3 Situation Theory and Causation
3.1 The Need for Situation Theory
3.2 Situation Mereology and Causation
3.3 A Situation-Theoretic Logic of Causation
3.4 The Transitivity of INUS Causation

4 A Deterministic Model
4.1 Desiderata
4.2 Causation and Determinism
4.3 Basic Ontology
4.4 Constraints and Causation
4.5 Defining Causal Explanation
4.6 Singular Causation
4.7 Empiricism and Modality
4.8 Causal Relevance
4.9 Piecemeal Causation
4.10 Desirable Features of the Theory
4.11 Applying the Theory to Some Examples
4.12 Verifying the Axioms of Chapter 3

5 An Indeterministic Model
5.1 Beyond Determinism
5.2 Why an Indeterministic Account Is Difficult
5.3 If Not Determinism, Then What?
5.4 Causation and Causal Explanation
5.5 Desirable Features of the Theory
5.6 Example Applications

6 A Probabilistic Model of Causation
6.1 Models
6.2 Token Causation
6.3 Weighted Causal Constraints on Types
6.4 Probabilistic Explanation
6.5 Examples
6.6 Humphreys's Explanation

7 Higher-Order Causation: Modal Facts as Causes
7.1 A Problem with Higher-Order Causation
7.2 Modal Facts as Causes
7.3 The Causal Relevance of the Excluded Middle
7.4 First-Order Teleological Causation
7.5 Higher-Order Teleological Causation

8 The Universality of Causation
8.1 A Modal Mereology of Situations
8.2 Principles of Causation
8.3 The Universality of Causation
8.4 The Existence of an Uncaused First Cause
8.5 The Well-Foundedness of Causation
8.6 Objections

9 A Theory of Information and Misinformation
9.1 Introduction
9.2 The Historical (Retrospective) Strategy
9.3 Two New Strategies
9.4 Information as the Basis of Knowledge

10 A Look Back, and Ahead
10.1 The Causal Relation
10.2 Against Determinism
10.3 Spacetime as Constrained by Causation, Not Vice Versa

II Applications to Metaphysics, Epistemology, and Ethics

11 An Overview
11.1 Teleology as Higher-Order Causation
11.2 Teleosemantics
11.3 The Link between Teleosemantics and Epistemology
11.4 Causal/Teleological Accounts of Knowledge
11.5 Mental Causation and Qualia
11.6 Teleological Accounts of Ethics
11.7 Enduring Substances as Logical Constructions

12 Teleology as Higher-Order Causation
12.1 Three Definitions of Teleology
12.2 Darwin: Real or Only Apparent Functionality?
12.3 Retrospective and Non-Retrospective Accounts
12.4 Extrinsic Functions and the Extended Phenotype
12.5 Our Knowledge of Teleology
12.6 Teleological Natural Kinds

13 Causal Theories of Mental Content
13.1 Millikan
13.2 Dretske
13.3 Fodor's Critique of Teleological Semantics

14 Teleosemantics of Mental Representations
14.1 An Overview of Representational States
14.2 Pre-cognitive Representations
14.3 Cognitive States: Opinions and Intentions
14.4 Mental Representation and Language
14.5 The Narrowness of Mental Content
14.6 Teleosemantics and the Liar Paradox

15 A Causal Theory of Logical and Mathematical Cognition
15.1 The Need for a Causal Theory
15.2 Logico-Modal Facts as Causes
15.3 Knowing How to Infer Correctly
15.4 Is Logic Factual?
15.5 Logical and Physical Necessity
15.6 From Logic to Arithmetic
15.7 Set Theory and Other Branches of Mathematics
15.8 Alternatives to Mathematical Realism
15.9 Why the Human Mind Is Not a Turing Machine

16 A Teleological Theory of the Mind
16.1 The Irony of Non-Reductive Materialism
16.2 Supervenience and Type and Token Identity
16.3 Downward Causation versus Epiphenomenalism
16.4 Two Further Problems of Mental Causation
16.5 Qualia
16.6 Problem Cases
16.7 The Correlation of Qualia and Physiology
16.8 Free Will

17 Teleological Reliabilism
17.1 Reliabilism: The Reference Class Problem
17.2 Grue, Bleen, and the New Riddle of Induction
17.3 Curve-Fitting: The Problem of Mathematical Simplicity
17.4 The Reliability of Simplicity as a Criterion of Truth
17.5 The Incompatibility of Materialism and Scientific Realism
17.6 When Does Bayesian Learning Constitute Knowledge?
17.7 Objective Chance and Empiricism

18 Enduring Substances and Their Identities
18.1 Substances as Logical Constructions
18.2 Change and the Johnston Paradox
18.3 Zeno's Paradox and the Instant of Change
18.4 Hard Cases for Substance Identity
18.5 Quantum Reality and the Foundations of Materialism

19 Eudaemonism and the Objectivity of Value
19.1 Objectified Subjectivity: A Dead End
19.2 Eudaemonia
19.3 The Connection between Eudaemonia and Motivation
19.4 Nature and Nurture
19.5 The Unity and Universality of Good
19.6 Indeterminacy and Objectivity
19.7 The Semantics and Epistemology of Ethics
19.8 Eudaemonism versus Evolutionary Ethics
19.9 Moore and the Indefinability of Good

20 Moral Theory as the Teleology of Character
20.1 Virtue as Both Means and End
20.2 Eudaemonism versus Egoism
20.3 Is and Ought
20.4 Sociobiology, Game Theory, and Species Relativity
20.5 Elements of a Teleo-Ethological Morality
20.6 Politics and the Natural Law
20.7 Justice toward Future Generations
20.8 Kierkegaard and the Teleological Suspension of the Ethical

21 A Coherent Realism Is a Comprehensive Realism
21.1 The Four Waves of Anti-Realism
21.2 A Prolegomenon to Any Future Critique of Metaphysics
21.3 Causalism, Yes! Materialism, No!
21.4 Anti-Realist Obscurantism
21.5 Is the Theory Naturalistic?

A Partiality, Modality, and Conditionals
A.1 Partial Propositional Logics
A.2 Partial Modal Logics
A.3 Partial Conditional Logics
A.4 Partiality and Quantificational Logic
A.5 First-Order Quantification over Situation-Types

B A Causal Calculus
B.1 Causation and Projectible Statistics
B.2 Some Other Well-Known Puzzles
B.3 Screening Off
B.4 Conditions on Hyperfinite Probability Functions
B.5 Examples
B.6 Abduction and Induction
B.7 Proofs of Theorems B.2 through B.4

Bibliography
Index



Physicists are currently searching for what they call a "theory of everything." However, it turns out that the "everything" they have in mind falls far short of every thing. The physicists' theory of everything has nothing to say about mental phenomena, agency, values, norms, teleology, or intentionality, to mention but a few. In fact, physicists rarely have much to say about the natures of the fundamental elements of their theory: particles, fields, space, and time. None of this is surprising, and none of it is a criticism of current physics as such. When physicists refer to a coming "theory of everything," they do so (or, at least, the sophisticated ones do so) with tongue in cheek. It is metaphysics, and not physics, whose province it is to fashion a theory of everything. This book is a work in real, honest-to-God, no-apologies-given metaphysics, but metaphysics conducted in a thoroughly scientific spirit. My hope is that it will help to stimulate a return to the perennial concerns of philosophy.


A Comprehensive Realism

A class of propositions can be interpreted realistically when two conditions are met:

1. Some of the propositions are evaluated as true or false.
2. The truth or falsity of the propositions in the class is determined by some set of facts, and this set of facts plays an indispensable role in explaining our knowledge of the truth or falsity of the propositions in the class.

The first condition is not sufficient, since the truth values of the propositions could be determined by facts about our collective acts of affirmation or projection, in which case the propositions could not be interpreted realistically. The causal element introduced by the second condition is critical, because it specifies a direction of asymmetric dependence: our knowledge depends causally on the fact establishing the truth or falsity of the corresponding propositions. This entails that the facts determining these truth conditions do not include facts about


our attitude toward those very propositions, since causal dependency cannot be circular. I will argue that propositions involving reference to the following things can and should be interpreted realistically:

Natural properties and relations
Situation and event tokens
Modality and objective probability
Causal connections
Numbers
Proper functions (teleofunctions)
Mental states
Secondary qualities
Enduring substances
Values and norms

Although my position is one of a comprehensive realism, I give a relatively simple and unified picture of the world. The first three items on the list above are treated as primitives, but all of the others are explicated in terms of these more fundamental entities, properties, and relations. Everything that is posited to exist is posited to exist because of some role it plays in the causal network of the world. My approach is resolutely non-dualistic: I reject any sort of Cartesian or neo-Cartesian postulation of a scientifically inaccessible realm of subjectivity. At the same time, I do not start with any a priori or dogmatic requirement. My aim has not been to build a theory of the mind that is materialistic or physicalistic or naturalistic. To begin one's metaphysical inquiry with such dogmatic commitments is methodologically irresponsible. We must simply follow the evidence where it leads. If it leads to materialism, well and good, but if it leads away from it (as my own account does in several respects), we must be willing to be accountable to the facts, not to philosophical fashion.

Theories of content, meaning, and representation in terms of causal connection have become very prevalent. A number of philosophers have taken causal theories of content as reason to be anti-realist about values (Mackie, Harman), numbers (Field), and minds (the Churchlands). In my view, the burden of such anti-realism is too great for a theory of content to bear.
However, if a causal theory of content could be devised that vindicated realism about values, numbers, and minds, such a theory would give us the best of both: a plausible, informative, and simple account of content, and the accommodation of much of our commonsense view of the world. In this book, I will try to develop such a theory.



Metaphysical Method

This book is unapologetically a work of substantive metaphysical theory. Fortunately, blind anti-metaphysical prejudice is not as common as it once was. Nonetheless, many may legitimately ask for the ground rules of the enterprise. In a recent book on causation, Daniel Hausman (1998) proposed five criteria for evaluating metaphysical theories:

1. Intuitive fit
2. Empirical adequacy: consistency with what we know about the world, including our best scientific knowledge
3. Epistemic access: the theory should include some account of how we could come to know its truth
4. Superseding competitors: the theory should incorporate the successes of its predecessors
5. Metaphysical fecundity: the theory should shed light on a variety of metaphysical issues

The only criterion that I would add to the list is that of simplicity or elegance. A good metaphysical theory should not be in need of ad hoc rescues or endless epicyclic tinkering. The principal motivation of my work is that of unification. I aim to provide a unified account of intentionality and knowledge, one in which we give exactly the same kind of account both for our thought about and knowledge of objects and events in space and time, and for our thought about and knowledge of the facts of logic, mathematics, laws of nature, and objective chance. We should not accept a bifurcated, disjunctive account of thought and of knowledge so long as a unified account is possible. The theoretical cost of postulating genuine modal facts (as I do) is small in comparison to the benefits of unification.


An Alternative to Both Physicalism and Mysterianism

Since causal relations play the fundamental role in my metaphysics, the term "causalism" might be an appropriate term for my approach. In recent years, others have taken what could be described as an essentially causalist approach to the metaphysics of mind, namely Armstrong, Millikan, Dretske, Papineau, and Lycan. A causalist theory of mind identifies intentionality with a certain kind of causal property (perhaps involving higher-order causal connections), and the peculiar qualities of conscious experience are taken to be explicable in terms of their intentionality. In all of these cases, causalism is seen as a strategy for defending materialism against various objections concerning intentionality and consciousness. The opponents of these approaches, including Searle and


McGinn, have been labeled the "mysterians," since they hold that we can expect to find no informative account of the nature of intrinsic intentionality or consciousness. Unfortunately, those participating in these controversies have overlooked the fact that causalism is separable from a commitment to physicalism or materialism. A non-physicalist causalism would include an informative account of the nature of mental states without insisting that everything can ultimately be explained in terms of atoms and the void. I will argue that all of the extant objections to causalist theories of mind are in reality objections to the conjunction of causalism with physicalism. A non-physicalist causalism provides the resources for an adequate answer to these objections. In addition, I will argue that there are independent grounds, having nothing to do with the philosophy of mind, for rejecting physicalism.


Causal Internalism

The notion of causality is absolutely central to recent philosophical work in semantics, the philosophy of mind and intentionality, epistemology, and philosophy of science. Work by Donnellan, Kripke (1980), and Putnam (1975) helped to make causal connections an indispensable part of our accounts of reference and signification. This in turn has generated causal theories of information and content by Dretske (1981), Fodor (1990), and others. The Gettier problem led to the renaissance of causal theories of knowledge by Goldman (1979), Armstrong (1968), Pollock (1986), and Plantinga (1993). Causality is put to much work in recent theories of personal identity and of the nature of mental states (as in the functionalism of Lewis (1986b) and Putnam (1975)). Causation continues to figure prominently in philosophy of science, e.g., in Wesley Salmon's causal theory of evidence (Salmon (1984)), and in theoretical science, both within physics and outside. Additionally, causal reasoning plays a central role in both understanding and predicting events. Recent work in artificial intelligence has brought causal reasoning into renewed prominence. For example, the much-discussed Yale Shooting Problem reveals (according to most diagnoses; see especially Pearl (1988)) the absolute necessity of recording and using information about the causal links between the bits of information we have about the world. Attempts to explain away causation or to replace it with some purely statistical regularity (whether or not supplemented by some kind of psychologistic decoration) have proved to be catastrophic failures. Every attempt to explain causal direction (surely one of the most fundamental features of causality) in terms of the nomological-deductive model has failed. Such models of causality have generated paradoxes far more rapidly than ad hoc solutions can be invented for them.
If a robust sense of reality leads us to recognize causal connections as first-class citizens of our ontological inventory, we must also make room for those special kinds of objects that can serve as relata for causal relations, whether
we call these objects possible 'facts', 'situations', or 'states of affairs'. These objects must be distinguished from propositions and from quasi-linguistic representations if we are to capture accurately the logical relations governing causal idioms. The restoration of such fact-like entities to respectability has also been a common theme of recent work in philosophy, including philosophical linguistics and the Stanford situation theory of Barwise and Perry (1983). The project of building a unified theory of intentionality and knowledge in causal (or teleo-causal) terms faces a major obstacle: accounting for our knowledge of modal facts, i.e., facts about necessity and possibility (including logical and mathematical modality), about counterfactual conditionals, about objective chance or propensity (as a generalization of objective modality), and about physical or natural necessity as embodied in natural laws. This obstacle is a generalization of the problem Paul Benacerraf (Benacerraf (1983a), Benacerraf (1983b)) has raised in the case of mathematics: how is definite reference to and substantive knowledge about mathematical objects possible, given that our best theories of reference and knowledge involve causal connections between our thoughts and their targeted aspects of reality? Benacerraf's problem generalizes to our thought about the laws of nature, about the objective chances of certain kinds of events in certain situations, and about various kinds of possibility and necessity. In each case, we seem to have intentional reference to and knowledge of things that the philosophical tradition has long considered to be causally inert. Overcoming this obstacle calls for a revolutionary rethinking of our standard picture of causation. This standard picture I call the horizontal or externalist model of causation. The alternative I am proposing is the thesis of causal internalism, which countenances the reality of vertical causation. 
On the standard, horizontal model, causes and effects are, exclusively, physical, spatiotemporally local states and occurrences. The causal nexus, whether it consists in a kind of necessary, stochastic, or nomic connection, stands outside of both the cause and the effect. This is why I call it causal externalism: the causal nexus is wholly external to both the cause and the effect. The horizontal/externalist model can account for our knowledge of occurrent properties realized in spatiotemporal locations, but it leaves the entire realm of modality causally, and, therefore, cognitively and epistemically, inaccessible. My alternative proposal is that we consider the modal (or nomic or stochastic) facts that tie the cause to the effect to be internal to the cause or to the effect. Depending on the details of one's account of causation, causes necessitate or probabilify or possibilify their effects. On an internalist model, the fact that a given cause necessitates its effect is itself an integral part of the total cause, not something that stands outside or above the cause-effect pair. Consequently, modal facts are every bit as causally efficacious as are occurrent physical facts, and so there is no barrier to providing a unified, causal theory of all of human thought and knowledge. For instance, we can think about and gain knowledge of natural laws by virtue of the fact that each of these laws enters into some, but not all, causal connections. When we observe a regularity (like the elliptical orbits of the planets) that is really caused by a particular nomic fact (like


the law of gravitation), then our observations provide us with intentional and epistemic contact with that nomic fact.

Here are some of the more significant claims that I make in part I concerning the nature of causation:

1. The causal nexus is not something above and outside the cause and effect but consists of facts wholly internal to the cause and the effect. This thesis of causal internalism commits me to the existence of vertical causation from modal and nomic facts to ordinary spatiotemporal ones, crucial to giving a unified, causal account of intentionality and knowledge.
2. Modal facts exist, including facts of logical and mathematical necessity, and these facts are not reducible to or supervenient on the occurrent facts of the world (including its merely actual regularities). The existence of logical types (negations, conjunctions, disjunctions, etc.) of arbitrary complexity is a substantive fact about the world.
3. There are compelling reasons for rejecting a strong version of determinism, reasons that are independent of the problem of free will (chapters 4 and 5).
4. Only actual situations exist, but in constructing models for modal logic, it is convenient to introduce the fiction of merely possible and even impossible situations.
5. I propose a new solution to the problem of the scope or extent of causation, namely, that every wholly contingent state has a cause. On the basis of this principle, I demonstrate the existence of a necessary first cause (chapter 8).
6. It is possible to give a principled basis for a defeasible or nonmonotonic logic that incorporates causal information. This logical calculus (developed in appendix B) generates rich and plausible conclusions about probable consequences of known or hypothesized states.

My theory of causation is designed to provide an exact, mathematical model that satisfies the following aims:

1. Causal connections and order should be defined without reference to space and time, permitting the construction of a non-circular, causal theory of spacetime.
2. It should permit the possibility of higher-order or vertical causal connections, in order to explain logical and mathematical knowledge, mind/body interaction, and the nature of teleofunctions.
3. It should provide natural explanations of the formal properties of causation and causal explanation, including transitivity, asymmetry, and veridicality.


4. It should match the data provided by intuitions about the validity and invalidity of various forms of causal reasoning. In particular, it should explain the failure of substitution of classical equivalents in causal contexts (see chapter 3), and our default assumption of the universality of causal explanation (chapter 8).
5. It should be able to navigate successfully through the complexities of the relationship between causality on the one hand, and modal and statistical relations on the other. It should not treat causation as a primitive, with no intrinsic relationship to correlation or necessity, but it must avoid the paradoxes that have resulted from attempts to reduce causality to statistical relations.
6. It should be compatible with indeterminism, and with merely probabilistic connections between cause and effect (chapters 5 and 6).
7. It should provide an account of the modularity (or locality) of causal reasoning: the role (recently much investigated by researchers in the field of artificial intelligence) of causation in enabling us to draw correct default conclusions in the presence of irrelevant information (appendix B).

The last desideratum is especially important, since any theory of causation that does not account for the special virtues of causal reasoning is seriously incomplete. Researchers in logic and artificial intelligence, such as Judea Pearl (1988), have discovered that reference to causal relations plays an indispensable role in our commonsense reasoning about the world. The Yale Shooting Problem of Hanks and McDermott (which I discuss in appendix B) is an excellent example of the sort of problem of reasoning about prospective change that requires a causally informed description of the situation.
I argue that the fundamental characteristic of causality that explains its importance in commonsense reasoning is the Markov property: when one fact is causally screened off from a second by one of its causes, then the conditional probability of the second on the cause is independent of the first fact. This justifies our exclusion of causally irrelevant information (information that is causally screened off from our prospective conclusions by our premises) in reasoning defeasibly.
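As a rough illustration of the screening-off condition just described, the following Python sketch builds a toy causal chain A -> C -> B and checks that, once the intermediate cause C is given, the probability of B no longer depends on A. The variable names, the chain structure, and every number are hypothetical choices made for the example; nothing here is drawn from the formal models of part I.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy chain A -> C -> B over binary variables.
p_a = {True: Fraction(3, 10), False: Fraction(7, 10)}          # Pr(A = a)
p_c_given_a = {True: Fraction(9, 10), False: Fraction(2, 10)}  # Pr(C = True | A = a)
p_b_given_c = {True: Fraction(8, 10), False: Fraction(1, 10)}  # Pr(B = True | C = c)

def bern(p_true, value):
    """Probability that a binary variable takes `value`, given Pr(True)."""
    return p_true if value else 1 - p_true

# Joint distribution obtained from the chain factorisation.
joint = {
    (a, c, b): p_a[a] * bern(p_c_given_a[a], c) * bern(p_b_given_c[c], b)
    for a, c, b in product([True, False], repeat=3)
}

def prob(pred):
    """Total probability of the worlds (a, c, b) satisfying `pred`."""
    return sum(p for world, p in joint.items() if pred(*world))

def cond(pred, given):
    """Conditional probability of `pred` on `given`."""
    return prob(lambda a, c, b: pred(a, c, b) and given(a, c, b)) / prob(given)

# Conditional on the cause C, the further fact A makes no difference to B.
b_given_c = cond(lambda a, c, b: b, lambda a, c, b: c)
b_given_c_and_a = cond(lambda a, c, b: b, lambda a, c, b: c and a)
b_given_c_not_a = cond(lambda a, c, b: b, lambda a, c, b: c and not a)

# Without conditioning on C, A is statistically relevant to B.
b_given_a = cond(lambda a, c, b: b, lambda a, c, b: a)
b_given_not_a = cond(lambda a, c, b: b, lambda a, c, b: not a)
```

In this toy model the three quantities conditional on C coincide, while `b_given_a` and `b_given_not_a` differ, which is the pattern that licenses ignoring A once C is known.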


The Ontology of Causation

In order to make sense of causal relations, we must be able to apply the part-of relation (and the associated machinery of mereology) to the causal relata. This means that we must acknowledge the reality of concrete existences, tokens, that can play the role of concrete events and states (or "situations"). In addition to these situation-tokens, we will need abstract, repeatable situation-types. The situation-types represent intrinsic qualities or characters of situation-tokens. This choice of primitives is drawn from the work of Barwise, Perry, and Etchemendy (Stanford situation theory).


The situation-tokens can serve as the truth-makers for propositions, playing the role that "facts" play in the philosophies of Austin, Bergmann, and Hochberg. When it is true that the cat is on the mat, there is a concrete cat-on-the-mat situation-token s that makes it true. This token s is of the cat-on-the-mat type. Complex situation-types can be constructed from simpler ones by means of logical operators, such as negation and disjunction. These operators should be interpreted by means of the strong Kleene three-valued truth tables or the four-valued Dunn tables (as explained in appendix A).

In addition to tokens and types, there is a causal priority relation ≺, a strict partial ordering (transitive, irreflexive, and asymmetric) of situation-tokens. If s ≺ s', then s is qualified to act as part of a cause of s'. Intuitively, we can think of s ≺ s' as meaning that s is wholly in the backward time cone of s'. In chapters 5 and 8, I advocate the thesis that all of the causal antecedents of a token are essential to its identity: if any of them had failed to exist, the token itself could not have existed. If we accept this thesis, then we can define the causal priority relation in this way: s ≺ s' if and only if s and s' do not overlap mereologically (that is, they have no part in common), and no part of s' could exist unless s existed.

There are two notions of causation that I define: (1) total causation (s is a total cause of s') and (2) INUS causation. INUS causation refers to J. L. Mackie's account of a cause as an insufficient but necessary part of an unnecessary but sufficient condition for the effect (Mackie (1965)). Both total causation and INUS causation introduce a modal or statistical element: a total cause must make its effect conditionally necessary, or at least, conditionally much more probable than it would otherwise be.
An INUS cause is an indispensable part of some total cause: s is an INUS cause of s' just in case there is a total cause s" of s', s is a part of s", and any part of s" that does not contain s as a part is no longer a total cause of s'.
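The definition of an INUS cause given above can be sketched in Python under some strong simplifying assumptions: tokens are modelled as finite sets of atomic situations, parthood is modelled as the subset relation, and the total-cause relation is supplied from outside as a hypothetical predicate. The atoms ("spark", "oxygen", "fuel", "rain") and the predicate `toy_total_cause` are invented for illustration only.

```python
from itertools import combinations

def parts(token):
    """All non-empty parts of a token, with parthood modelled as subset."""
    atoms = sorted(token)
    return [frozenset(c)
            for r in range(1, len(atoms) + 1)
            for c in combinations(atoms, r)]

def is_inus_cause(s, effect, candidates, is_total_cause):
    """s is an INUS cause of effect iff some total cause contains s as a part,
    and no part of that total cause lacking s is itself a total cause."""
    for total in candidates:
        if not is_total_cause(total, effect) or not s <= total:
            continue
        if all(not is_total_cause(p, effect)
               for p in parts(total) if not s <= p):
            return True
    return False

# Hypothetical toy world and total-cause predicate (an assumption of the sketch).
world = frozenset({"spark", "oxygen", "fuel", "rain"})

def toy_total_cause(token, effect):
    return effect == "fire" and frozenset({"spark", "oxygen", "fuel"}) <= token

candidates = parts(world)
spark_is_inus = is_inus_cause(frozenset({"spark"}), "fire", candidates, toy_total_cause)
rain_is_inus = is_inus_cause(frozenset({"rain"}), "fire", candidates, toy_total_cause)
```

In this toy setting the spark counts as an INUS cause of the fire, whereas the rain does not: every total cause that contains the rain has a rain-free part that is still a total cause.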


The Need for an Indeterministic Model

In chapter 4, I develop a deterministic model of causation, one in which a total cause necessitates its effect. However, I discover a number of independent reasons for being dissatisfied with such a model:
1. We have clear intuitions that causation should be possible in an indeterministic world.
2. If causes necessitate their effects, and effects necessitate their causes (since the identities of their causes are essential to their own identities), then causes and effects would be modally inseparable.
3. When applied to specific examples, the necessitation model over-generates causal connections and inflates the minimal content of causal explanations.


There are several difficulties that pose serious problems for building an indeterministic model of causation, however. First of all, verifying the transitivity of causation is no longer trivial, once we abandon strict necessitation as the standard. Verifying the veridicality of causation is also non-trivial. In addition, mere probabilistic relevance is neither necessary nor sufficient, as is demonstrated by two kinds of cases: (1) causes with no or even with negative statistical relevance to their effects, and (2) pre-empted causes, preconditions with positive statistical relevance that are nonetheless not causes, because some independent factor preempts their operation. Finally, there is the Markovian independence principle that I mentioned above, which is critical to explaining the modularity of causal reasoning, but which is also difficult to secure in an indeterministic setting. In chapter 6, I use Lewis/Stalnaker conditionals in a novel way to overcome these difficulties.


A Causal-Probabilistic Theory of Information

My teleological account of mental representation depends crucially on being able to define information without reference to mentality or teleofunctionality. In order to do this, I borrow heavily from the work of Fred Dretske (1981), in which information is defined by means of objective probabilities. According to Dretske, a fact p carries the information that q just in case the conditional probability of q on p is equal to 1, which Dretske interprets as meaning that p necessitates q.

The principal difficulty with such an account is that of accounting for the possibility of error or misinformation. If p carries the information that q, then it is impossible for p to be true and q false. There are two popular solutions to this difficulty, neither of which is really satisfactory. We could require only that the conditional probability of q on p be within some small, finite interval of 1, or we could require only that the conditional probability of q on p be higher than that of q on ¬p. However, if we do either of these, we lose the validity of the Xerox principle, the principle that information is transitive: if p carries the information that q, and q carries the information that r, then p carries the information that r.

A second popular strategy (adopted by Dretske himself) is to add some condition N, representing normal or canonical training conditions, and require that the conditional probability of q on the conjunction p&N be equal to 1. These normal conditions are usually specified retrospectively, by reference to some salient, historical facts. In chapter 9, I argue that these retrospective strategies are inadequate, and I propose two alternative solutions, one using infinitesimal probabilities and the other conditional functions.

A token s carries the information that p robustly in world w just in case every part s' of w that contains s as a part carries the information that p. This means that s carries the information that p, and every extension of s in w also carries this information. Robust information is the pre-cognitive analogue of knowledge. When one knows something on the basis of robust information, one is immune to Gettier-like counterexamples.
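The loss of the Xerox principle under a threshold reading of information can be made concrete with a small numerical sketch. The distribution below over truth-values for p, q, and r is entirely hypothetical, and the threshold of 9/10 is an arbitrary stand-in for "within some small interval of 1".

```python
from fractions import Fraction as F

# Hypothetical weights on worlds, each world a triple of truth-values (p, q, r).
weights = {
    (True, True, False): F(9, 100),
    (True, False, False): F(1, 100),
    (False, True, True): F(90, 100),
}

def pr(event, given):
    """Conditional probability of `event` on `given` under the toy weights."""
    num = sum(w for world, w in weights.items() if given(world) and event(world))
    den = sum(w for world, w in weights.items() if given(world))
    return num / den

def p(world):
    return world[0]

def q(world):
    return world[1]

def r(world):
    return world[2]

THRESHOLD = F(9, 10)  # assumed cut-off for the weakened definition

def carries(signal, content):
    """Weakened 'carries the information that': Pr(content / signal) >= threshold."""
    return pr(content, signal) >= THRESHOLD

p_carries_q = carries(p, q)
q_carries_r = carries(q, r)
p_carries_r = carries(p, r)
```

Here p carries q and q carries r by the weakened standard, yet the conditional probability of r on p is zero, so transitivity fails. With the threshold set to exactly 1, no counterexample of this kind can be constructed.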


Why an Exact Theory?

A formal or exact theory is an attempt to use logic and mathematics to represent a conception (or family of conceptions) of a particular subject matter. For example, Newtonian mathematics involved the use of the calculus to represent a conception of the physics of motion. Formal or exact metaphysics should not be thought of as the analysis of concepts, or as a branch of pure logic. Nor should it be identified with the articulation of our commonsense worldview (the conception of the world ensconced in ordinary language and everyday practice), although metaphysics typically begins with this task. An exact theory of a metaphysical subject, such as causation, is an attempt to express our best, most-educated guesses about the truth of the matter in a form that is as falsifiable and corrigible as possible. The alternative to developing an exact theory is operating with an undisciplined miscellany of hunches and intuitions, poorly defined and changing unsystematically as one moves from one sphere of application to another. Without an exact theory, inconsistency is very difficult to detect. Unanticipated consequences are rarely discovered, and one's reasoning is often afflicted with non sequiturs and unintended equivocation. The task of defining and investigating an adequate formal language for representing causal reasoning remains unfinished. Recent work by Pearl (1988), Pearl and Verma (1991), and Spirtes et al. (1993) is suggestive but limited, in that all this work takes the relation of causation to hold among a fixed enumeration of dynamic variables. However, in ordinary causal reasoning, we often take complex facts and events to be causal factors. In part I, I define a formal language for causal reasoning that is capable of treating facts of arbitrary complexity as causes and effects, and of resolving many of the outstanding logical puzzles. I am confident that the theory of causation that I develop in part I is clear and precise enough to be falsifiable. 
Where it goes wrong (as I'm sure in many places it does), it should be possible to construct clear counterexamples, either from real life or from imagination, accompanied by strong intuitions of real possibility. The subject of causation has experienced a renaissance in analytic philosophy over the last generation. Theories and arguments involving causation proliferate, in epistemology, philosophy of mind, philosophy of science, and philosophy of language. However, few working in these areas have attempted systematic and exact accounts of causation, and no such account, to my knowledge, is directly relevant to as broad a range of outstanding philosophical problems as is the account presented here.




The Big Picture: Preview of Part II

In this book, I develop a theory of causation, and I apply this theory to a large number of outstanding problems in philosophy, including such topics as:
• The definition of proper function (teleology)
• The semantics of mental representations
• The mind/body problem (including free will)
• The causal basis for logical and mathematical knowledge and cognition
• The problem of induction (including Goodman's puzzle)
• Enduring substances and their identity-conditions
• The construction of space and time
• The objectivity of values and moral norms

Obviously, I cannot do justice to the vast literature on any one of these topics. However, in each of these topics, the concept of causation plays a central role, and I cannot claim to have developed an adequate theory of causation without at least beginning the task of testing my theory against the data provided by each of these problem areas. For this reason, I have been forced to cast my net very broadly. I do not pretend to have said anything dispositive on any of these subjects in this book, but I do believe that the novel account of causation that I develop here enables me to make a genuinely original contribution in each case, one that I hope will stimulate further discussion. In each case, confusion about the nature and conditions of causation has produced an impasse. The introduction of an exact account of causation, together with the development of some novel proposals, may help to move the discussion to more fruitful ground.

The overall structure of the project goes something like this. The theory of causation and information (developed in part I) is used to construct a theory of teleofunctionality as a form of higher-order causation (chapter 12), and an account of the causal efficacy of logical and mathematical facts (chapter 15).
After a survey of recent accounts of mental representation, I combine my theories of information and teleology, resulting in an account of the semantics of mental representations (chapter 14): a mental representation carries the content p just in case it has the teleofunction of carrying the information that p. The theory of mental representation is then used in developing theories of mind/body interaction, qualia, and free will (chapter 16), and knowledge and induction (chapter 17). I develop a causal/teleological theory of enduring substances and their identities through time in chapter 18. Both the theory of teleology and that of mental representations are used in the development of a eudaemonistic theory of ethics (chapter 19), which in turn is used in sketching an account of moral realism (chapter 20). Here are some of the more significant claims that I make in part II:



Figure 1.1: Overview

1. There is a tight connection between the semantics of belief and epistemology: once we have the semantics right, the theory of knowledge is merely a corollary (chapters 14 and 17).
2. There are powerful reasons for rejecting materialism (which I take to include, at a bare minimum, the limitation of causal relations to spatiotemporal items), reasons that are independent of the well-known problems in the philosophy of mind (see chapter 21 for a summary of these reasons).
3. A simple, causal theory of mathematical thought and knowledge is possible, one that unifies the theory of mathematical knowledge with that of empirical and scientific knowledge (chapter 15).
4. Taking functions seriously leads to a very robust form of ethical realism, one that does not identify objectivity with some sort of idealized subjectivity but instead revives the eudaemonism of Plato and Aristotle (chapters 19 and 20).
5. The use of the mereology of events and of non-classical (three- and four-valued) interpretations leads to more sophisticated conceptions of supervenience, type identity, and token identity than were available heretofore. These more sophisticated conceptions enable us to solve the problem of mental causation (chapter 16).

My aim in this book is to bring an end to the dualism that has dogged philosophy since the downfall of Aristotle's metaphysics (including his "metaphysical biology") at the beginning of the modern era. Commentators such as Leo Strauss, Alasdair MacIntyre, and John McDowell have all located the roots of the dualisms of mind and body, of fact and value, and of objectivity and subjectivity, in that early modern separation of scientific fact and normativity. In my view, the early modern turn away from Aristotle has been both unnecessary and disastrous. Aristotle's "metaphysical biology" is more viable in light of modern knowledge than it has ever been, and the recognition of this fact can bring about a great reunification of our view of the world.

At the same time, I will argue staunchly against a false reunification built upon a narrow physicalism. Physicalists have been right to insist that our knowledge of the real cannot extend beyond the network of causation. They were right, therefore, to challenge the viability of positing a subjective and normative realm beyond the reach of science. However, they were wrong to think that science teaches us that only physical states, states located within the framework of space and time, can be causally efficacious. In fact, science provides abundant evidence, albeit implicitly, of the causal efficacy of physical, mathematical, and logical modality.

There is no need to read the chapters of this book in strictly sequential order. In fact, I expect few readers to be interested in all of the topics covered. For example, if you have little interest in logic or in formal theories of causal reasoning, you can skip appendixes A and B altogether, without doing damage to your comprehension of the rest of the book.
If you don't care about learning the ins and outs of the metaphysics of causation, then I would recommend giving part I only a cursory reading and getting into the applications in part II as quickly as possible. You could go directly to part II, referring back to part I only as needed (I hope the cross-references, the index, and the table of contents will give you all the guidance you need). If you would like to read just enough of part I to grasp the outlines of my account of causation, I would suggest reading chapters 3, 4 (especially 4.1 through 4.8), and 9, while skipping the technical material, such as the proofs and detailed examples. Alternatively, if your interests lie exclusively in the field of philosophical logic or theories of causation, there is no reason for you to read part II at all. In addition, you should feel free to jump around within part II all you wish: the order of the chapters is not essential. My only recommendation would be for you to read chapter 12 before reading chapters 14, 16, 17, 19, or 20, and to read chapter 14 before 16 or 17.




A Glossary of Symbols

Although this book contains a considerable number of formulas of symbolic logic, the meanings of the formulas are nearly always spelled out in plain English. There are a few logical and mathematical symbols that the reader must be familiar with:

Logical Symbols
¬ represents negation, "it is not the case that . . ."
∨ represents inclusive disjunction, "either . . . or . . . (or both)."
& represents conjunction, "both . . . and . . ."
→ represents a conditional, "if . . ., then . . ."
↔ represents the biconditional, ". . . if and only if . . ."
∀x represents universal quantification, "every object x is of such a kind that . . ."
∃x represents existential quantification, "there is at least one object x of such a kind that . . ."
□ represents necessity, and ◇ represents possibility.
□→ represents a non-truth-functional conditional: (φ □→ ψ) means that ψ is extremely probable (objectively speaking), conditional on φ. These conditionals warrant defeasible inferences.
φ[t/x] represents the substitution of x by t throughout formula φ.
Pr(A/B) represents the conditional probability of A on B.

Metalinguistic Symbols
⊨ represents the relation between a token (or a token in a model) and a type relative to a model, where M, s ⊨ φ is true just in case s supports type φ (according to model M). In accordance with standard mathematical practice, I also sometimes use the ⊨ symbol to represent the relation of logical consequence or implication between formulas or propositions (especially in appendix A).
|≈ represents the relation of nonmonotonic or defeasible consequence, defined in appendix B.
⊢ is used in representing the inference rules of a logical system. The symbol ⊣⊢ represents a two-way, or reversible, inference rule.
‖t‖, ‖φ‖ represent the interpretations of symbols t and φ in the model under consideration.

Set Theoretic Symbols
∈, ⊆ represent membership and subset, respectively.
∪, ∩ represent union and intersection.
∅ is the empty set.
R[{A}] is the image of A under relation R, that is, the set of all of the objects that are related by R to something in A.

In addition to these familiar symbols, I will make use of a significant number of special symbols. These are all introduced at appropriate places in the text, but I have assembled them all here as well, for the sake of later reference.

Symbols of Mereology
⊑ represents the non-strict part-to-whole relation (everything bears this relation to itself).
⊏ is the symbol for proper parthood (asymmetric).
⊔ and ⊓ represent mereological union and intersection, respectively.
O represents mereological overlap (having a part in common).
xφ represents the mereological sum of all the things that satisfy the open formula φ.

Special Primitive Symbols
As represents the actuality of situation s (its being part of the actual world).
⊨ is used to form a higher-order type by conjoining a situation-token and a type, i.e., the expression (s ⊨ φ) represents the type that is realized by any token s' whenever s supports the type φ. This ⊨ is an object-language counterpart to the metalinguistic ⊨.
≺ represents the relation of causal priority. (This is primitive in chapter 5, but definable according to the model built in chapter 6.)

Defined Symbols

≺₀ represents immediate causal priority: s ≺₀ s' just in case s is prior to s', and there is nothing intermediate between any part of s and any part of s'.
> represents the total cause relation: s > s' if s is a total cause of s'.
~~> stands for causation in the sense of Mackie's INUS condition: an insufficient but necessary part of an unnecessary but sufficient condition. I also use this symbol to represent the closely related idea of causal relevance of one fact to another.
|~ represents the relation of causal constraint between types.
N stands for the immediate causal succession relation: sNs' means that s' is the mereological sum of all the situations immediately posterior to s.
R> and i- represent the simple and robust carrying of information.
The expression (s : φ) is used to represent an ordered pair consisting of a situation s and a type φ. These ordered pairs are typically used to represent actual or possible facts.

Part I

A Theory of Causation and Information


Toward a Unified Theory of Causation

The literature in the twentieth century on causation is vast and complex. I will give here only a cursory survey of it, with the aim of locating the elements that I have appropriated into the formal theory developed in the rest of part I. My main objective has been to unify the theory of causation in such a way as to provide something useful to philosophers of science, researchers in artificial intelligence, and philosophers of mind and intentionality.

The main division within recent work in causation comes between those who have focused on causal relations between event-types and those who focus on relations between event-tokens. An integrated account of both of these sets of relations is much needed. The focus on event-types typifies the broadly Humean tradition, including the deductive-nomological model, statistical theories of probabilistic causation, and Mackie's INUS account. In contrast, Davidson, counterfactual accounts like those of David Lewis, branching-time theorists like Kutschera, singularists like Nancy Cartwright, Michael Tooley, and David Armstrong, and ontological-linkage theorists like Wesley Salmon, James Fair, Phil Dowe, and Douglas Ehring all place primary emphasis instead on the occurrence of concrete event-tokens.

In my account, I try to do equal justice to both the token and the type levels. My account is essentially a modal account of causation, and modal relations, like those of conditional necessity or objective chance, can hold as well between token events as between event-types. My framework enables me to be neutral on the question of the existence of singular causation: I can represent the possibility of a singular connection between token-events, but nothing in my theory commits me to treating this as a real possibility.





The Nomological/Deductive Tradition

Hume argued that the concept of causation cannot be a primitive, undefinable concept, since we have no sensory acquaintance with the causal relation as such. He suggested that we can define causation (or, perhaps, replace it with one defined) in terms of regular associations of event-types. The Humean tradition takes the task of science to be the discovery of natural laws, certain kinds of regularities in the occurrence of event-types. One event causes another if the type of the second can be deduced from the type of the first by means of true natural laws. Consequently, this model became known as the "nomological/deductive" model.

The nomological/deductive model has run into a number of problems:
• It has proved impossible to give a satisfactory account of the direction of causation, the asymmetry of the cause/effect relation.
• The relation between causation and time remains an unilluminated mystery. Typically, there is the bare, unmotivated stipulation that causes must precede their effects.
• There are some difficulties in extending the model to cover probabilistic causes and other kinds of indeterminism.
• Humeans have not been able to produce a plausible account of the distinction between natural laws and merely accidental generalizations.
• There are a number of resistant counterexamples to the model, including preempted would-be causes, and the apparent possibility of worlds with correlations but no causation whatsoever.

At the same time, it is vitally important to acknowledge the many virtues of the N/D approach. In replacing the model, we must find an alternative that subsumes its successes.
• It provides an explanation for the connection between causation and correlation.
• It deals explicitly with causal relations between event-types.
• It provides a plausible model of causal explanation, drawing on the analogy between explanation and deduction.


Theories of Probabilistic Causation

Humean empiricists like Reichenbach, Suppes, Eells, Humphreys, and Skyrms have created a very impressive body of work extending the nomological/deductive account to the domain of probabilistic causal theories and statistical data. As the work has progressed, we can see a clear movement away from the strict reductionism of Hume and toward an account in which the relation of causal relevance or priority is taken as an unanalyzed primitive. The account of causation that we find in Skyrms and Eells falls roughly into this pattern:

C is a positive causal factor for E iff P(E/CH) > P(E/¬CH), where H includes all of the causal factors relevant to E, except for C itself, and those factors causally influenced by C.

Notice that this definition does not attempt to give a reduction of all causal concepts to merely statistical or probabilistic ones: the relation of being a relevant causal factor is left unanalyzed. A second feature of the standard probabilistic approach is its exclusive attention to causal relations at the level of types. Very little is said about what it takes for one token to be a cause of another.
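The role of the background factors H in the pattern P(E/CH) > P(E/¬CH) can be illustrated with a small computation. The sketch below assumes, purely for the example, that H takes two values ("h1" and "h2"), that the inequality is to be checked separately within each value, and that the frequencies in the table can stand in for probabilities. The counts themselves are illustrative.

```python
from fractions import Fraction

# counts[(c, h)] = (cases in which E occurred, total cases); illustrative only.
counts = {
    (True, "h1"): (81, 87),
    (False, "h1"): (234, 270),
    (True, "h2"): (192, 263),
    (False, "h2"): (55, 80),
}
contexts = ("h1", "h2")

def p_e(c, h):
    """Relative frequency of E given C = c within background context h."""
    e, n = counts[(c, h)]
    return Fraction(e, n)

def p_e_ignoring_h(c):
    """Relative frequency of E given C = c with the background factor ignored."""
    e = sum(counts[(c, h)][0] for h in contexts)
    n = sum(counts[(c, h)][1] for h in contexts)
    return Fraction(e, n)

# C raises the frequency of E within every background context...
positive_in_every_context = all(p_e(True, h) > p_e(False, h) for h in contexts)
# ...but appears to lower it when the background factor is left out.
raises_overall = p_e_ignoring_h(True) > p_e_ignoring_h(False)
```

With these counts, C is a positive factor in each context taken separately, yet the pooled comparison points the other way. This is why the definition holds the other relevant causal factors fixed, and why it cannot dispense with a prior notion of causal relevance in selecting H.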


Davidson and Event-Tokens

Donald Davidson's work on causation, like earlier work by Anscombe and Ducasse, is concerned with causation as a relation between concrete event-tokens. Davidson's approach is resolutely non-reductive, thereby avoiding the counterexamples to the deductive/nomological account. This attention to tokens and their relations was an important corrective to the Humean tradition, but Davidson's original treatment of event-types was seriously defective. Davidson did not distinguish between the intrinsic character of an event and arbitrary true descriptions of the event. For instance, the intrinsic character of the murder of Caesar includes facts about the number, angle, and timing of the knife thrusts. It does not include features mentioned in such extrinsic descriptions as: foreseen by Caesar's wife, the cause of a civil war, or the result of Caesar's high-handedness. However, it is by virtue of their intrinsic types that tokens support causal relations. Davidson individuates events by including all of the causes and the effects in the essence of each individual event. This means that the occurrence of any particular event necessitates both its own past and the entire subsequent course of history. Davidson was only half right here: the actuality of a particular situation-token necessitates the actuality of all causally prior tokens, but not that of the causally posterior ones. It is this asymmetry that constitutes the fixity of the past and the openness of the future (see section 5.3).


Lewis's Counterfactual Account

Like Davidson, David Lewis sees causation as primarily a relation between event-tokens. Lewis's theory has the virtue of connecting causation with modal properties (like necessity and sufficiency) via his work on the logic and semantics of counterfactual (subjunctive) conditionals. In brief, Lewis (Lewis, 1986b, pp. 164-167) defines causal dependence between tokens x and y in this way:

1. If x had not occurred, y would not have occurred.
2. If x had occurred, y would have occurred.
3. x and y both occurred.


Condition (1) states that the occurrence of x is necessary, not absolutely but in the actual circumstances, for the occurrence of y. Condition (2) states that occurrence of x was sufficient (again, in the actual circumstances) for the occurrence of y. Lewis defines causation as the transitive closure of causal dependence. In my view, Lewis's reliance on counterfactuals to define causation has the order of analysis backward. An adequate account of the semantics of counterfactuals must incorporate causal notions. The work of Stalnaker (1986) and Lewis (1973) on the logic of counterfactuals, and on the formal semantics of these conditionals, is quite impressive and entirely successful. However, more is required than a logic and a formal theory of semantics to qualify a concept for foundational use in metaphysics. A foundational concept must have a unity and fixity of reference that I believe counterfactual conditionals, with their sensitivity to context and practical interest, lack. In addition, Lewis gives no reason for the transitivity of causation but instead builds this condition into his definition by fiat (by taking the ancestral of the causal dependence relation). Furthermore, Lewis cannot guarantee that causation is asymmetric, and his account of the directionality of causation seems circular. Counterfactual accounts rule out a priori the possibility of necessary facts acting as causes. It is unclear how to evaluate counterfactuals with impossible antecedents, other than treating them all as vacuously true. Hence, any necessary fact would be, vacuously, a cause of everything, including itself. It is an essential feature of my account that necessary facts are well qualified to act as causes. This plays a crucial role, for example, in my account of the causal connections underlying logical and mathematical knowledge. 
Finally, there are a number of examples of preemption and overdetermination that Lewis's account gets wrong, unless weakly motivated epicycles are added (Ramachandran (1997), Noordhof (1997)). For example, if e is caused by c, and c itself pre-empts d, and d would have caused e, had it not been preempted by c, then e does not depend counterfactually on c, and so Lewis's account does not treat c as a cause of e. In addition, it is difficult to see how Lewis's account can be extended to probabilistic causation, or, in general, to causation in an indeterministic world (Menzies (1996)).

Ramachandran has recently proposed a counterfactual analysis that avoids these counterexamples and that resembles the account I give in part I (Ramachandran (1997)). Ramachandran first defines an M-set for a: S is an M-set for a iff S is a minimal set such that if none of the members of S had occurred, a would not have occurred. Ramachandran then defines cause in terms of M-sets:

c is a cause of e iff c belongs to an M-set for e, and there are no M-sets R and S for e such that R contains c and S differs from R only in containing one or more non-actual events in place of c.


Ramachandran's definition (like Mackie's INUS condition, to be discussed in the next section) is an attempt to formalize the fact that a cause is necessary in the actual circumstances for its effect. A fatal flaw in Ramachandran's definition lies in the definition of M-sets. The minimality of an M-set is defined solely in terms of set membership: there is no proper subset meeting the counterfactual condition. Nothing prevents c from belonging to an M-set even though c itself contains (as parts) totally irrelevant, and even causally posterior, sub-events. The mereological theory of event-tokens developed in chapter 3 is needed in order to define the appropriate form of minimality.

There are two further shortcomings to the counterfactual account of causation. First, the account does not provide any guidance to the use of causal facts in prediction and explanation. We must already know what would and would not happen under various hypothetical situations before we can apply causal descriptions to the situation. Causal concepts are of no use in deriving these counterfactual relations. On my account, as delineated in appendix B, in contrast, causal information is critical to the task of prediction and counterfactual projection. Hence, it is valuable to have a characterization of the causal relation that does not presuppose complete knowledge of counterfactual connections. Second, neither Lewis nor Ramachandran offers anything like a complete account of the principles of event identity. They rely on our somewhat woolly intuitions on a case-by-case basis. My account of token causation includes an explicit and precise account of the identity conditions of event-tokens (section 4.3.5).
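The preemption case and the M-set definition discussed in this section can be made concrete with a toy model. The sketch below makes several assumptions that go beyond the text: counterfactuals are evaluated by stipulating values in a small set of structural equations (a simplification, not Lewis's similarity semantics), an intermediate event b is invented to stand for the backup process that d would have set in motion, and only simple counterfactual dependence and the first clause of Ramachandran's definition are checked.

```python
from itertools import combinations

# Hypothetical preemption scenario: c and d both occur; c causes e directly
# and also cuts off d's backup process (the intermediate event b).
ACTUAL = {"c": True, "d": True}

def run(forced):
    """Evaluate the toy model; `forced` maps event names to stipulated values."""
    v = {}
    v["c"] = forced.get("c", ACTUAL["c"])
    v["d"] = forced.get("d", ACTUAL["d"])
    v["b"] = forced.get("b", v["d"] and not v["c"])  # backup step, blocked by c
    v["e"] = forced.get("e", v["c"] or v["b"])
    return v

def would_occur_without(effect, removed):
    """Would `effect` occur if none of the events in `removed` had occurred?"""
    return run({name: False for name in removed})[effect]

def depends_counterfactually(effect, cause):
    """Simple dependence: both occur, and without the cause the effect fails."""
    actual = run({})
    return (actual[cause] and actual[effect]
            and not would_occur_without(effect, [cause]))

def m_sets(effect, events):
    """Minimal sets S such that, had no member of S occurred, `effect` would not have."""
    blocking = [frozenset(s)
                for r in range(1, len(events) + 1)
                for s in combinations(events, r)
                if not would_occur_without(effect, s)]
    return [s for s in blocking if not any(t < s for t in blocking)]

e_depends_on_c = depends_counterfactually("e", "c")
e_m_sets = m_sets("e", ["c", "d", "b"])
```

In this model e does not depend counterfactually on c, since the backup would have produced e anyway, while c does belong to both M-sets for e, namely {c, d} and {c, b}. Because minimality here is purely a matter of set membership, the sketch also inherits the limitation noted above: it takes no account of the internal parts of the events involved.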


Mackie's INUS Conditions

In his essay "Causes and Conditions" (Mackie (1965)), J. L. Mackie introduced the idea of an INUS condition. An INUS condition is a condition that is an insufficient but necessary part of an unnecessary but sufficient condition for some event-type. Mackie was working in the broadly Humean, empiricist tradition, and, consequently, his primary concern was with relations between situation-types. However, he was beginning to see the importance of relations between tokens, and of distinguishing clearly between event-types and event-tokens. In fact, the INUS idea works better than Mackie himself realized when it is transferred to the setting of tokens. We can say that one token a is an INUS condition for another token b when a is an indispensable part of a token c whose occurrence is sufficient for the occurrence of b. By "indispensable part of c," I mean that no part of c that does not contain a is sufficient for b. This notion of INUS condition illuminates much of our natural-language discourse about causation (as I argue in chapter 3).
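The token-level INUS test just stated can be sketched in code. The sketch makes simplifying assumptions that are not in the text: tokens are modeled as finite sets of atomic parts, the part-of relation as set inclusion, and sufficiency for b as a user-supplied predicate. The fire example and all names are hypothetical.

```python
from itertools import chain, combinations

def parts(token):
    """All non-empty parts of a token, modeled as subsets of its atoms."""
    atoms = sorted(token)
    subsets = chain.from_iterable(
        combinations(atoms, n) for n in range(1, len(atoms) + 1)
    )
    return [frozenset(p) for p in subsets]

def is_inus(a, c, sufficient_for_b):
    """True when a is an indispensable part of c and c is sufficient for b.

    Indispensable: no part of c that fails to contain a is sufficient for b.
    """
    if not (a <= c and sufficient_for_b(c)):
        return False
    return not any(sufficient_for_b(p) for p in parts(c) if not a <= p)

# Hypothetical example: what it takes for a house fire (token b) to occur.
NEEDED = frozenset({"short_circuit", "oxygen", "flammable_material"})

def sufficient_for_fire(token):
    return NEEDED <= token

c = NEEDED | {"radio_playing"}  # a sufficient token with an irrelevant part
short_circuit_is_inus = is_inus(frozenset({"short_circuit"}), c, sufficient_for_fire)
radio_is_inus = is_inus(frozenset({"radio_playing"}), c, sufficient_for_fire)
print(short_circuit_is_inus, radio_is_inus)
```

On this model the short circuit counts as an INUS condition, while the irrelevant part does not, because the remainder of c is sufficient without it.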




Yablo's Theory

Stephen Yablo (1992), while working on a theory of mental causation, develops a theory of causation that bears some resemblance to my own. He also works with an ontology of event and state tokens, with a relation, subsumption, that is related to the part-of relation that I employ. Essentially, a state s subsumes s', s > s', just in case s' is a part of s (s' ⊑ s) and s and s' are coincident. Two tokens are coincident when they occupy the same spatiotemporal location. Thus, Yablo's theory only applies to tokens with spatiotemporal location. In addition, Yablo's theory cannot be used to give a causal definition of spacetime, since it presupposes spatiotemporal relations in its formulation. The essence of a token corresponds to the set of intrinsic types supported by the token. Yablo assumes that all such types are persistent, in the sense that if s ⊑ s' and s is of type φ, then s' is also of type φ. This means that if s ⊑ s', then the essence of s is a subset of the essence of s'. In Yablo's terminology, if s subsumes s', then the essence of s' is a subset of the essence of s. Yablo uses counterfactuals to define two preconditions of causation: contingency and adequacy.

Contingency
Adequacy

These conditions are very similar to those used by Lewis in his counterfactual definition of causation. Where Yablo differs from Lewis is in his use of the subsumption relation to capture a version of Mackie's INUS condition. A token c is required for e just in case for every proper part d of c, if d had occurred without c, then e would not have occurred. Yablo's condition of requirement can be thought of as a refinement or clarification of Contingency, since it tells us that in testing whether Oe would occur on the assumption of ¬Oc, we must consider every possibility in which some proper part of c occurs but c itself does not. Requirement guarantees that every part of c is necessary for the occurrence of e, under the circumstances.
Any state token that is a part of such a required token will be an INUS condition of the effect, since it will be an indispensable part of a mereologically minimal adequate (quasi-sufficient) condition. Yablo's use of unanalyzed counterfactuals burdens his account with the same deficiencies that characterized Lewis's and Ramachandran's accounts. In applying his theory to mental causation, Yablo reveals that his conception of event-tokens is more abstract than that of Davidson or Lewis. Apparently, there are logically "impoverished" tokens corresponding to each concrete occurrence. For example, if there is a token of John walking, there is also a distinct token of John walking or Jane whistling, with the first token subsuming the second. In addition, there are also distinct tokens of someone walking and of John doing something. This leads to a very extreme multiplication of entities. On my alternative model, any token that realizes some genuinely disjunctive type must realize one or the other of the disjuncts. This corresponds to thinking of each token as a concrete part of the world.




Branching-Time Models

Work by Belnap (1987), McCall (1976), and von Kutschera (1993) builds on the branching-time models of temporal logic. For example, in von Kutschera's theory, an event a is a cause of b just in case it is the first event whose occurrence guarantees the occurrence of b. In all these models, of course, temporal relations are taken as primitive, and causal order is parasitic upon temporal order. This makes time travel and backward causation absolutely impossible. It also blocks the option of giving a causal theory of spacetime.


Artificial Intelligence and Models of Causal Inference

Judea Pearl (1988) and his colleagues at UCLA have made considerable progress in recent years in two areas: the theory of causal inference (inferring causal structures from statistics, without prior information about causal or temporal priority), and the role of causal notions in defeasible, commonsense reasoning. Even more recently, Spirtes et al. (1993), building on Pearl's work (as well as the Reichenbachian tradition), have developed workable algorithms for a well-defined program of causal inference. Throughout this work, Reichenbach's notion of 'screening off' plays a central role. Roughly, if one factor screens off a second from one of its effects, then the conditional probability of the effect on the cause is independent of the screened-off factor. This principle is also known as Markov's rule, after the famous Russian mathematician. Another principle that plays a crucial role in the theory of causal inference as developed by Pearl and by Spirtes, Glymour, and Scheines is Occam's razor. This work includes a rigorous definition of the relative simplicity of a causal hypothesis. There are two main deficiencies in this body of work: first, it does not make a clear distinction between tokens and types, and, second, it deals only with logically simple factors. No work has been done to date on extending these ideas to types of arbitrary logical complexity. In appendix B, I develop a causal calculus that builds on this tradition and rectifies these two shortcomings.
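Screening off can be illustrated with a small numerical sketch. The model below is an assumption made purely for illustration: a common cause C with two effects A and B, with invented probabilities. It checks that A and B are correlated overall, but that conditioning on C makes the probability of B independent of A.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical numbers: C is a common cause of A and B.
p_c = F(1, 2)
p_a_given_c = {1: F(9, 10), 0: F(1, 10)}
p_b_given_c = {1: F(4, 5), 0: F(1, 5)}

def bern(p, x):
    """Probability that a binary variable with P(1) = p takes the value x."""
    return p if x == 1 else 1 - p

# Joint distribution over (c, a, b), built so that A and B depend only on C.
joint = {
    (c, a, b): bern(p_c, c) * bern(p_a_given_c[c], a) * bern(p_b_given_c[c], b)
    for c, a, b in product((0, 1), repeat=3)
}

def prob(event):
    return sum(p for outcome, p in joint.items() if event(*outcome))

def cond(event, given):
    return prob(lambda c, a, b: event(c, a, b) and given(c, a, b)) / prob(given)

p_b = prob(lambda c, a, b: b == 1)
p_b_given_a = cond(lambda c, a, b: b == 1, lambda c, a, b: a == 1)
p_b_given_c1 = cond(lambda c, a, b: b == 1, lambda c, a, b: c == 1)
p_b_given_a_c1 = cond(lambda c, a, b: b == 1, lambda c, a, b: a == 1 and c == 1)

print(p_b, p_b_given_a)              # A raises the probability of B overall
print(p_b_given_c1, p_b_given_a_c1)  # given C, A makes no further difference
```

With these numbers, learning A raises the probability of B from 1/2 to 37/50, but once C is held fixed the probability of B is 4/5 whether or not A is also given.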


Tooley and Cartwright

In Causation: A Realist Approach, Michael Tooley (1987) subjects the Humean tradition to a barrage of cogent objections. Tooley demonstrates that singular causation (causation between tokens) does not supervene upon the causal laws and non-causal facts of the world, an insight that I incorporate into my own account. Tooley's positive account treats causation as a relation between properties (universals). I agree with Tooley on the need for treating universals or types as first-class members of our ontology.



In Nature's Capacities and Their Measurement, Nancy Cartwright (1989) endorses and defends two theses that I have incorporated into my account. First, she argues, with Tooley, that singular causation is irreducible to type-level relations. Cartwright uses the example of the complexity of the causal relationship between the use of the birth-control pill and thrombosis to demonstrate the priority of token-level causation (Cartwright, 1989, p. 99). In general, the use of the pill lowers the probability of thrombosis, by lowering the probability of pregnancy, which is a positive causal factor for thrombosis. However, in many cases, the use of the pill causes thrombosis directly. The truth of causal generalizations must be sensitive not only to statistical relationships among classes of events, but also to the presence or absence of token-level causal relations. Second, Cartwright insists that all causal generalizations are defeasible or exception permitting. This latter insight plays a crucial role in my indeterministic model of causation in chapter 5.
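The structure of the birth-control example can be sketched numerically. All of the probabilities below are invented for illustration and are not Cartwright's figures; the point is only that a factor can lower the overall probability of an effect along one path while raising it within every subgroup.

```python
from fractions import Fraction as F

# Hypothetical probabilities, chosen only to exhibit the structure.
p_pregnant = {"pill": F(1, 20), "no_pill": F(1, 2)}
p_thrombosis = {
    ("pill", True): F(12, 100),    # pill, pregnant
    ("pill", False): F(3, 100),    # pill, not pregnant
    ("no_pill", True): F(10, 100),
    ("no_pill", False): F(1, 100),
}

def overall_risk(group):
    """Probability of thrombosis in a group, averaging over pregnancy."""
    p = p_pregnant[group]
    return p * p_thrombosis[(group, True)] + (1 - p) * p_thrombosis[(group, False)]

risk_pill = overall_risk("pill")
risk_no_pill = overall_risk("no_pill")
print(risk_pill, risk_no_pill)
```

Here the pill lowers the overall risk (69/2000 against 11/200) because it makes pregnancy much less likely, yet among the pregnant and among the non-pregnant alike it raises the risk. Statistics over the whole class therefore do not settle whether the pill ever causes thrombosis in a particular case.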


Process and Linkage Theories

In recent years, a number of theories of causation have been proposed that forthrightly insist that there is a real connection between causes and effects at the token level. This linkage is to be understood as an irreducible element of reality. On Wesley Salmon's account (Salmon (1998)), causes and effects are connected by something called a process. David Fair (1979) proposes that the linkage consists in the transfer of energy; for Phil Dowe (Dowe (1992), Dowe (1995)), it consists in the transfer of some conserved quantity, and for Douglas Ehring (1997), in the transference of a property trope. I agree with all these accounts in thinking that a real, non-Humean linkage between token cause and token effect is needed. However, I locate this linkage in a modal connection: the asymmetric necessitation of the token-cause by the token-effect (see section 5.3). The main difficulty with all of the other ontological-linkage theories is that they are too narrow. They each cover some but not all cases of genuine causation. Salmon's process account, for example, cannot handle cases of causation by absences, and it dogmatically rules out the possibility of action at a distance, despite the fact that quantum mechanics seems to require it (as Salmon himself concedes (Salmon, 1998, p. 231 n. 19)). A second drawback to the ontological-linkage theories is that they tend to be unilluminating. It would seem that to be able to distinguish between genuine processes and pseudo-processes, we must make use of an unilluminated concept of causation. The same is true for distinguishing cases of the genuine transfer of energy or charge or some trope from cases of mere coincidence of identical quantities or tropes. Thirdly, ontological-linkage theorists believe that causation is always a purely local, intrinsic fact involving only the causally connected particulars. However, there are a number of clear counterexamples to this claim of intrinsicality.
For example, there are cases of double prevention: cases in which A causes B by preventing the occurrence of a potential preventer of B. An escort fighter could participate in causing a successful bombing raid by shooting down interceptors that otherwise would have shot down the bombers. In such cases, there is no single, compact process to which the causal connection is an intrinsic feature. That the sending of the bomber was a cause of the ultimate damage depended on the extrinsic presence of the fighters, and, similarly, that the action of the fighters was a cause of the bomb damage depended on the extrinsic presence of the bombers. Finally, ontological-linkage theories require that we make use of a primitive relation of identity over time for the conserved quantities or tropes. In contrast, I want to give a causal account of all identities through time, and so these conserved-quantity and conserved-trope theories of causation are of no use to me.


Mellor's Theory

In a recent book, D. H. Mellor analyzes causation as a relation between facts that involves the modal relation of objective chance (Mellor (1995)). One fact causes another just in case it increases the objective chance of the second. My own account of causal explanation or fact/fact causation in sections 4.5 and 5.4.2 is quite close to Mellor's. The main differences are these: Mellor takes fact/fact causation to be more fundamental than token/token causation, while I take the two to be equally fundamental. Mellor bases his position on the fact that absences or negative facts can act as causes and as effects. I concur with this assumption, but I would insist that these negative causes and effects are never pure absences: they always involve the supporting of some negative property by some situation-token at some determinate position in the causal network of the world. Consequently, wherever we have causation by or of absences, we also have instances of token-level causation. Although Mellor insists on the possibility of higher-order or iterated causation (Mellor, 1995, p. 108), he never explains how this causation is possible on his account, since it would involve higher-order objective chance, a problematic notion (as I demonstrate in section 7.1). Mellor accepts the substitution of classically equivalent sentences within causal contexts, while I argue in chapter 3 that only strong-Kleene equivalents may be so substituted. Mellor defines causation in terms of the raising of the objective chance of the effect (as compared with some background level), which faces a number of counterexamples, as I discuss in section 6.5. My own account makes use of Mackie's INUS conditions and the mereological relations among tokens, avoiding these counterexamples.


Mellor denies the existence of negative or complex properties, while such properties play a crucial role in my account of teleology, our knowledge of logic and mathematics, and mental causation. Causal laws play a central role in Mellor's account of causation, but he never provides an account of the semantics or logical form of such laws. In contrast, I provide an explicit account of the logic and semantics of the modal constraints that I use in elucidating the nature of causation.


Accounts of Causal Asymmetry

A basic feature of causality is its directionality and asymmetry; the relation between a cause and its effect is different from the relation between an effect and its cause. There have been four predominant accounts of this asymmetry: the appeal to fork asymmetry, the appeal to entropy, the appeal to human agency and manipulability, and the appeal to time. The first of these was pioneered by Hans Reichenbach (1956) and has been recently defended by David Papineau (1992). Fork asymmetry refers to a global feature of the network of probabilistic connections between the world's events. In a detailed study of this account, Daniel Hausman recently concluded that the assumptions underlying this account are a "useful approximation," but that the presence of fork asymmetries is neither a necessary nor a sufficient condition for the existence of causal direction (Hausman, 1998, pp. 239-242). Huw Price (1992) has argued that "fork asymmetry is not a sufficiently basic and widespread feature of the world to constitute the difference between cause and effect." As I discuss in appendix B, fork asymmetries and the screening off of probabilistic dependencies by common causes play an important role in the epistemology of causation and in our use of causation in drawing inferences, but I agree with Hausman and Price that they seem misplaced when pressed into metaphysical service as an analysis of the essence of causal direction. The account of the direction of causation in terms of the increase in entropy faces similar difficulties. Although unlikely, a decrease in entropy from cause to effect does not seem to be essentially impossible. There are two problems with accounting for the asymmetry of cause and effect by reference to human agency, i.e., to the fact that we manipulate causes in order to bring about effects, and not vice versa.
First, this account seems too narrow, since it excludes causal relations from things that, because they are too large or too small, too fast or too slow, cannot be controlled by human beings. Second, it denies the objectivity of causal asymmetry, reducing it to a merely anthropocentric phenomenon.

Since Hume, it has been popular to explain the difference between cause and effect by reference to time: causes always precede their effects. This has two major drawbacks. First, it rules out backward causation without sufficient warrant. For example, some recent interpretations of quantum mechanics have taken the possibility of temporally reversed causation seriously, and discussions of tachyons and Feynman electrons also presuppose the possibility of backward causation (Price (1996), Cramer (1986)). Second, it would rule out any causal theory of time, rendering such an account circular. Causal direction seems promising as an account of the direction of time (with time being the axis through local spacetime that agrees with the predominant direction of causation in that neighborhood), as I suggest in sections 4.10.2 and 4.10.3.

I account for causal asymmetry in chapter 5 in terms of the asymmetric necessitation of token-causes by their token-effects. The tokens causally prior to a given token are essential to its identity: it wouldn't be the very situation-token it is were those causal antecedents either added to or subtracted from. This asymmetry corresponds to the fixity of the past and the openness of the future. This fragility of the identity of events will prove quite useful in making sense of cases of preemption. In section 10.2, I summarize the advantages of this account of asymmetry.


Distinctive Features of My Theory

To summarize, I will mention five distinctive features of my theory of causation:

1. Causal priority is treated as a modal relation among tokens, not supervenient on general and non-causal facts (as per Davidson, Cartwright, and Tooley).
2. The theory includes clear and precise conditions for event-token identity, namely, sameness of parts, intrinsic types, and causal antecedents.
3. Modality (possibility and necessity) and objective probability play a crucial role in the definition of causal relations. There is a seamless general theory covering both deterministic and indeterministic causation, and a link is established to the theories of probabilistic causality and causal inference in the work of Skyrms, Eells, and Pearl, and in the joint work of Spirtes, Glymour, and Scheines.
4. The theory of mereology (of parts and wholes) is used to construct an improved version of Mackie's INUS conditions.
5. The theory of causation is comprehensive, including both token-level and type-level relations, and including causal relations among events, states, dispositions, and modal and causal facts.

Although I will make use of merely possible, and even of impossible, situations in my models of partial modal logic, these are to be thought of as mere artifacts of the models. The only situations that really exist are actual situations (this is the thesis of actualism).1 Modality is a property not of non-actual tokens, but of types.2 Certain situation-types have the property of being possibly, but not actually, instantiated. It is convenient, as Leibniz, Kanger, and Kripke have discovered, to represent these possibilities in formal models by means of merely possible worlds or situations, but the use of such models should not be taken as committing the theorist to the real existence of such merely possible worlds. It is because I take such an instrumentalist view of merely possible situations that I am able to take on board impossible situations with equanimity, and with very useful results.

My account is, without apologies, anti-Humean and non-empiricist. I give no special priority to non-modal or occurrent properties, or to properties with which we are immediately acquainted. I do not take the structure of space and time as given prior to my account of causation: instead, I seek to lay the groundwork for a causal theory of spatiotemporality. Natural laws, in the sense of merely extensional generalizations that we happen to find attractive or economical, or that just happen to form a simple and powerful theory of the world, play no role in my account. Causation is not taken to be a projection of our minds, or a function of our practices or preferences. If causality is not taken realistically, nothing else can be.

2 Strictly speaking, I should say that modal types are properties of actual situations. For each situation-type φ, there exists a corresponding modal type ◇φ, which is true in an actual situation s just in case s supports the possible instantiation of φ.

Situation Theory and Causation

3.1 The Need for Situation Theory

3.1.1 Naked Infinitives and Situation Theory

In their seminal book Situations and Attitudes (Barwise and Perry (1983)), Jon Barwise and John Perry introduced the basic elements of situation theory. These elements included: the existence of concrete parts of the world, called situations or situation-tokens, a realistic attitude toward abstract situation-types, and the use of a partial, non-classical semantics for the relationship between situation-tokens and situation-types. Barwise and Perry made use of the strong Kleene tables of three-valued logic, with the three values true, false, and undefined. Here are the strong Kleene truth tables for negation, disjunction, and conjunction.
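The tables can be generated from the standard definition of the strong Kleene connectives. The following is a minimal sketch in which Python's `None` stands for the value undefined; the function names and the printed layout are choices made here for illustration.

```python
VALUES = (True, False, None)  # true, false, undefined

def k_not(p):
    """Strong Kleene negation: undefined stays undefined."""
    return None if p is None else (not p)

def k_or(p, q):
    """True if either disjunct is true; false only if both are false."""
    if p is True or q is True:
        return True
    if p is False and q is False:
        return False
    return None

def k_and(p, q):
    """False if either conjunct is false; true only if both are true."""
    if p is False or q is False:
        return False
    if p is True and q is True:
        return True
    return None

def show(v):
    return {True: "T", False: "F", None: "U"}[v]

print("p  not-p")
for p in VALUES:
    print(show(p), "", show(k_not(p)))

print("p  q  p-or-q  p-and-q")
for p in VALUES:
    for q in VALUES:
        print(show(p), "", show(q), "", show(k_or(p, q)), "     ", show(k_and(p, q)))
```

The characteristic feature is that a disjunction with one true disjunct is true even when the other disjunct is undefined, and a conjunction with one false conjunct is false even when the other is undefined.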




The principal motivating data for early situation theory came from work done by Barwise on the semantics of naked-infinitive clauses within perceptual contexts. For example, consider the contrast between these two sentences:

Mary saw that John was smiling.
Mary saw John smile.

The first sentence, with a "that"-clause in the complement position, entails that Mary knew that she was looking at John and was aware that he was smiling. The second, with a naked-infinitive clause as complement, entails neither of these: it could be true even if Mary believed that she was seeing Paul wince. Barwise and Perry argued that the most natural way to understand naked-infinitive perceptual reports was to take the object of the perceiving to be a part of the world: in the case of seeing, a scene. A scene makes some sentences true and others false, and still others are made neither true nor false by the scene. The semantical relations between a scene and a sentence are embodied by the strong Kleene tables: if a scene s makes a disjunctive sentence (p ∨ q) true, then it must make p true or make q true. If s makes (p & q) true, then it must make both p and q true. Barwise and Perry argue that the following principles of naked-infinitive reports are intuitively plausible (Barwise and Perry, 1983, pages 181-182).

1. Principle of Veridicality: if b sees φ, then φ.
2. Principle of Substitutivity: if b sees φ(t₁), and t₁ = t₂, then b sees φ(t₂).
3. Existential Generalization from Definite Descriptions: if b sees φ(the π), then there is something₁ such that b sees φ(it₁).
4. Negation: if b sees ¬φ, then b doesn't see φ.
5. Conjunction Distribution: if b sees (φ & ψ), then b sees φ and b sees ψ.
6. Disjunction Distribution: if b sees (φ ∨ ψ), then b sees φ or b sees ψ.
7. Distribution of Indefinite Descriptions: if b sees φ(a π), then there is a π₁ such that b sees φ(it₁).

For example, Barwise and Perry suggest that if Ralph sees Ortcutt or Hortcutt hide the letter, then either Ralph sees Ortcutt hide it, or Ralph sees Hortcutt hide it.


From Perception to Causation

Naked-infinitive perceptual reports are merely a special case of a much wider phenomenon. It is the causal element that is crucial to the features of naked-infinitive perceptual reports that Barwise observed. When b sees a scene s, there is some sort of causal connection between s and the perceptual state of b. The distinctive logical properties of causation are reflected in the case of perception.



We can see a parallel by considering the agentive use of the verb 'to make', for example:

Mary made John smile.

This use of 'to make', like the use of verbs of perception studied by Barwise, takes naked-infinitive clauses in the complement position. The same principles, such as veridicality, disjunction distribution, and conjunction distribution, apply in this case as well. If Ralph makes Ortcutt or Hortcutt hide the letter, then either Ralph makes Ortcutt hide the letter, or Ralph makes Hortcutt hide the letter. The same logical phenomena can be seen in cases without the presence of naked-infinitive clauses. For example, consider the relation of causal relevance. We can express a relation of causal relevance between two states, events, or conditions by the use of gerundive phrases. For example:

Mary's dancing was relevant to John's smiling.

We can also nominalize the events, as in the expression 'the fire':

The fire was relevant to the water's boiling.

J. L. Mackie's INUS account (insufficient but necessary part of an unnecessary but sufficient condition) can be thought of as an attempt to formalize this relation of causal relevance (Mackie (1965)). If we think of causes and effects as parts of the world (i.e., as situation-tokens), then token s is an INUS cause of token s' just in case s is an indispensable part of a token s" that is sufficient to account for s', that is, no part of s" that does not include s is sufficient to produce s'. The relation of causal relevance, whether taken intuitively or as refined in Mackie's analysis, satisfies the semantic principles that Barwise and Perry discovered in the case of perception.

(DD) (A ∨ B) is relevant to C ↔ (A is relevant to C) ∨ (B is relevant to C)
(CD) (A & B) is relevant to C ↔ (A is relevant to C) & (B is relevant to C)

According to the principle DD, causal relevance distributes over disjunction, and according to CD, it also distributes over conjunction.
Principle CD is not plausible if we replace the relation of being relevant to with the relation of being a total cause of, but in fact we rarely make reference to such total causes in everyday life. If we mean by causal relevance something like being an essential part of a total cause, something that is necessary in the circumstances, then principle CD is clearly correct. Principle DD is a special case of the referential transparency of causal contexts. If the condition that Fa is relevant to p, and a = b, then the condition that Fb is relevant to p. Similarly, if ∃x Fx is relevant to p, then there must exist an a such that Fa is relevant to p. If p is true and q is false, then p ∨ q is merely another way of referring to the condition that p, and so, if p ∨ q causes r, so does p itself. However, if we combine these principles with the assumption that classically equivalent sentences are inter-substitutable in causal contexts, absurdity quickly results.

1. The fire's burning was relevant to the water's boiling. (Premise)
2. (The fire's burning and the moon's eclipsing the sun) or (the fire's burning and the moon's not eclipsing the sun) was relevant to the water's boiling. (1, substitution of classical equivalents)
3. ((The fire's burning and the moon's eclipsing the sun) was relevant to the water's boiling) or ((the fire's burning and the moon's not eclipsing the sun) was relevant to the water's boiling). (2, DD)
4. The moon's eclipsing the sun was relevant to the water's boiling, or the moon's not eclipsing the sun was relevant to the water's boiling. (3, CD, positive dilemma)

Since there are so many different variant concepts of causation, there is some legitimate worry that the plausibility of CD and that of DD are due to the use of disparate versions of causation, a kind of fallacy of equivocation. However, I can construct a reductio that uses only CD and the principle of the substitution of classical equivalents.

1. The fire's burning was relevant to the water's boiling. (Premise)
2. (The fire's burning and (the moon's eclipsing the sun or the moon's not eclipsing the sun)) was relevant to the water's boiling. (1, substitution of classical equivalents)
3. (The moon's eclipsing the sun or the moon's not eclipsing the sun) was relevant to the water's boiling. (2, CD)

Thus, any tautology would be causally relevant to any actual fact, surely an inappropriate result. The obvious semantic solution is to replace worlds with partial situations, employing strong Kleene (three-valued) evaluations. Strong Kleene equivalents are substitutable in causal contexts.
This is the generalization of Barwise and Perry's work (Barwise and Perry (1983)) on the semantics of perception reports. Perception includes a causal component, which explains the behavior of naked-infinitive perception reports. Consider the naked-infinitive report "Smith sees the fire burn and the moon eclipse the sun." This report entails that both the fire's burning and the moon's eclipsing of the sun are causes (in Mackie's INUS sense) of Smith's visual experience (principle CD). Similarly, if Smith sees the fire burn or the moon eclipse the sun, then either Smith sees the fire burn, or he sees the moon eclipse the sun (principle DD). The explanation for why classically equivalent expressions are not inter-substitutable in naked-infinitive reports (as observed by Barwise and Perry) is that naked-infinitive reports entail a causal connection, and classically equivalent expressions are not substitutable in causal contexts generally.
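The substitution step used in the second reductio above can be checked mechanically. The sketch below assumes that two formulas count as strong Kleene equivalent when they receive the same value under every three-valued assignment (an assumption about the intended definition), with Python's `None` standing for undefined. It compares p with p & (q ∨ ¬q).

```python
from itertools import product

def k_not(p):
    return None if p is None else (not p)

def k_or(p, q):
    if p is True or q is True:
        return True
    if p is False and q is False:
        return False
    return None

def k_and(p, q):
    if p is False or q is False:
        return False
    if p is True and q is True:
        return True
    return None

def lhs(p, q):
    """The fire's burning."""
    return p

def rhs(p, q):
    """The fire's burning and (the eclipse occurring or not occurring)."""
    return k_and(p, k_or(q, k_not(q)))

def equivalent(f, g, values):
    """Same value under every assignment drawn from `values`."""
    return all(f(p, q) is g(p, q) for p, q in product(values, repeat=2))

classically_equivalent = equivalent(lhs, rhs, (True, False))
kleene_equivalent = equivalent(lhs, rhs, (True, False, None))
print(classically_equivalent, kleene_equivalent)
```

On two-valued assignments the formulas agree, but a partial situation that settles p and leaves q undefined makes the first true and the second undefined. So the substitution at step 2 is licensed classically and blocked on the strong Kleene scheme.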


The Frege-Church Slingshot

Barwise and Perry made use of a second argument, one that had originally been used by Frege and by Church (1943), as well as by Quine (Quine, 1976, pp. 163-164) and Davidson (Davidson, 1984, page 19), to argue against the existence of things as fine-grained as facts or situations. Barwise and Perry call this argument the slingshot. The argument depends on the following transparency principle, which Barwise and Perry apply to the contexts of naked-infinitive perception reports:

(SCDD) If The F = The G, then φ[The F] ↔ φ[The G].

So, for example, if Jane is both the youngest spy and the secretary of the French club, then we can infer either of the following from the other:

(1) John sees the youngest spy yawn.
(2) John sees the secretary of the French club yawn.

Of course, (1) does not imply that John sees or even knows that Jane is the youngest spy, and (2) does not imply that John sees that the person yawning was the secretary of the French club, but, since Jane is that secretary, by seeing Jane yawn, he did see the secretary yawn. Similarly, causal contexts are referentially transparent. If John made the youngest spy angry, and the youngest spy is the secretary of the French club, then John made the secretary of the French club angry. However, if we combine the principle (SCDD) (the substitution of co-referring definite descriptions) with the substitution of classical equivalents, we can use the slingshot argument to derive the absurd result that every fact causes every other fact. Suppose that the fire's burning is causally relevant to the water's boiling. The proposition the fire is burning is logically equivalent to the identity:

By the substitution of classical equivalents, this identity's holding is also causally relevant to the water's boiling. Suppose that the moon is eclipsing the sun. Then the following identity is true:

By (SCDD), since these specifications of the set {∅} are definite descriptions of a sort, we have that the following identity's holding is also causally relevant to the water's boiling:
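On the standard formulation of the slingshot, the three identities invoked in this argument can be written as follows. This is a reconstruction offered as a sketch: the abbreviations p (the fire is burning) and q (the moon is eclipsing the sun), the numbering (i) to (iii), and the exact set-abstract notation are assumptions introduced here.

```latex
\begin{align*}
\text{(i)}\quad   & \{x : x = \emptyset \wedge p\} = \{\emptyset\}
  && \text{logically equivalent to } p \\
\text{(ii)}\quad  & \{x : x = \emptyset \wedge p\} = \{x : x = \emptyset \wedge q\}
  && \text{true when } p \text{ and } q \text{ are both true} \\
\text{(iii)}\quad & \{x : x = \emptyset \wedge q\} = \{\emptyset\}
  && \text{from (i) by (SCDD), given (ii); logically equivalent to } q
\end{align*}
```

Read this way, (i) is the identity equivalent to the fire's burning, (ii) is the identity that holds once the moon is eclipsing the sun, and (iii) results from replacing one description of the set {∅} in (i) with the co-referring description supplied by (ii).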



Finally, by the substitution of classical equivalents, we reach the absurd conclusion that the moon's eclipsing of the sun is causally relevant to the water's boiling. Since the result is absurd, and the transparency principle (SCDD) seems to hold of causal contexts, we have an independent argument for rejecting the substitution of classical equivalents.

Mellor has recently argued that causal contexts are not referentially transparent. For example, suppose Don falls during a rock-climbing expedition because his rope broke. Suppose that Don's fall was the first of the expedition, and suppose that his rope was the weakest. Mellor argues (Mellor, 1995, p. 115) that we can accept (3) without accepting (4) or (5):

(3) Don's fall is the first because his rope was the weakest.
(4) Don's fall is Don's fall because his rope was the weakest.
(5) Don's fall is the first because the weakest rope was the weakest rope.

In these cases, the definite descriptions "the first fall" and "the weakest rope" are doing more than simply picking out a particular fall or a particular rope. They are also making reference to causally relevant features of the situations. We have moved from the assertion of causal relevance between two tokens to causal explanation at the level of types. If we consider instead (6) and (7):

(6) The weakness of Don's rope was causally relevant to his fall.
(7) The weakness of the weakest rope was causally relevant to the first fall.

we can see that substitution of co-referring definite descriptions in these contexts is wholly unproblematic. More generally, suppose that c is the one and only K, and e the one and only L. In the statement "Kc is causally relevant to Le" we can freely substitute "the K" for c and "the L" for e: "The K's being K is causally relevant to the L's being L." In this case, as Mellor notes (Mellor, 1995, p. 152), the definite descriptions are being used as rigid designators of c and e, and so cause and effect are both merely contingent, despite their apriority.


The Transitivity of Causation

It is a commonplace that causation is transitive: if A is a cause of B, and B is a cause of C, then A is a cause of C. This transitivity, however, is a difficult thing to account for naturally. Positive statistical correlation is not transitive: it is quite possible for A to be positively correlated with B, B to be positively correlated with C, yet A to be independent of, or even negatively correlated with, C. Similarly, counterfactual dependence is not transitive: from the fact that B wouldn't have happened in the absence of A, and C wouldn't have happened in the absence of B, it does not follow that C wouldn't have happened in the absence of A. Of course, it is always possible to take the transitive closure of a non-transitive relation (as David Lewis does in the case of counterfactual dependence), but by itself this is not enough to give an illuminating account of causal transitivity. What is needed in addition is a demonstration that the characteristic modal or statistical properties of the immediate causal connections are inherited by the connections established by transitive closure. This transference of characteristic properties from immediate to mediate connections fails in both the probabilistic and counterfactual analyses.

Some have argued recently that causation is not transitive. For example, Michael McDermott (1995) proposes the following counterexample. A and B each control a switch: if both switches end up in the same position (left or right), person C receives a shock. Person A must set his switch first, in full view of person B, who wishes to deliver a shock to C. If A puts his switch right, this causes B to switch right also, which causes a shock to C. Similarly, if A sets his switch to the left, the result is that C is shocked. McDermott argues that A's setting of his switch is not a cause of the shock, even though it is a cause of a cause. Simpler examples also exist: one's birth is a cause of the various episodes of one's life, at least one of which is a cause of one's death. Hence, it seems that, if transitivity holds, one's birth is a cause of one's death.

Although I admit that it sounds odd in these cases to insist that A's setting of the switch is a cause of the shock, and that one's birth is a cause of one's death, these do not seem to me to be convincing counterexamples to the transitivity of causation. If a doctor causes a child to be born, and that birth causes a particular death from a genetic disorder, the doctor's action can correctly be counted as one of the causes of the death, that is, of the particular death that resulted.
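The claim in this section that positive statistical correlation is not transitive can be checked on a very small distribution. The following is a minimal sketch that is not drawn from the text: it assumes two independent fair binary variables A and C and defines B as their sum. The names `OUTCOMES`, `mean`, and `cov` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Assumed toy distribution: A and C are independent fair coins (0 or 1),
# and B is defined as A + C. Each of the four (A, C) outcomes has probability 1/4.
OUTCOMES = [(a, a + c, c) for a, c in product((0, 1), repeat=2)]
P = Fraction(1, len(OUTCOMES))

def mean(i):
    """Exact expectation of coordinate i."""
    return sum(P * o[i] for o in OUTCOMES)

def cov(i, j):
    """Exact covariance between coordinates i and j."""
    return sum(P * o[i] * o[j] for o in OUTCOMES) - mean(i) * mean(j)

A, B, C = 0, 1, 2
cov_ab = cov(A, B)  # A is positively correlated with B
cov_bc = cov(B, C)  # B is positively correlated with C
cov_ac = cov(A, C)  # yet A and C are uncorrelated
```

In this toy case A is positively correlated with B and B with C, while the covariance of A and C is exactly zero, so positive correlation does not chain from one link to the next.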


Situation Mereology and Causation

I will take facts or situations to be the relata of causation. A situation is a real, concrete part of the world, one that makes certain propositions true and other propositions false. There is one maximally large fact, which we can call the "world." For each proposition p, p is made either true or false by the world. Smaller situations, proper parts of the world, make certain propositions true, others false, and leave still others undefined in their truth values. Hence, in reasoning about facts, we must make use of a three-valued logic. The appropriate logic to use in this case seems to be that of the strong Kleene truth tables, in which, for instance, a disjunction is true if either of its disjuncts is true, false if they are both false, and undefined otherwise.

Since facts are concrete parts of the world, it makes sense to apply mereology, the calculus of parts and wholes. Some standard symbols of mereology are O (for overlap) and ⊑ for the weak part-of relation (i.e., a ⊑ b iff either a is a proper part of b or a is identical to b). There are three standard axioms of mereology:

Axiom 3.1
Axiom 3.2
Axiom 3.3

Axiom 3.1 defines the part-of relation in terms of overlap, and axiom 3.2 is an aggregation or fusion principle: if there are any facts of type φ, then there is an aggregate or sum of all the φ facts. The mereological sum of facts of type φ is symbolized as "⊔x φ(x)." Axiom 3.3 guarantees that the part-of relation is reflexive and anti-symmetric.

I will represent the causal relation by means of the symbol ▷.

Axiom 3.4 (Irreflexivity of Causation)

I will also assume that causation is closed under part-inclusion with respect to the effect, that is:

Axiom 3.5 (Right Closure under Part)

The axioms of Irreflexivity and Right Closure immediately imply the separation of a cause from its effects: if a causes b, then a and b cannot overlap.

A notion that will prove to be useful is that of a coincident token. I will use the symbol ≺ to represent a primitive relation of causal priority. Causal priority is a necessary, but perhaps not a sufficient, condition of the causal relation proper. A coincident token is one none of whose parts is causally prior to any other part. This means that all of the parts of a coincident token are in some sense simultaneous (in relativity theory, at a spacelike, rather than timelike, separation from one another).

Definition 3.1

Since it is impossible for a to cause b without a's being causally prior to b, we have as an immediate corollary that if a situation is coincident, then no part of it causes another of its parts.

Corollary 3.1

We will at least consider the following hypothesis of causality: that if a causes b, and b can be extended to a coincident fact c, then a can be extended to a cause of c:

Hypothesis 3.1 (Causality)

I will also assume that causation is transitive:

Axiom 3.6 (Transitivity)
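The displayed formulas for the principles of this section do not survive in this text. As a reading aid, here is one natural formalization of those principles whose content is fully stated in the surrounding prose. It is a reconstruction under assumptions, not the author's own notation: overlap is written as a circle, the weak part-of relation as a square subset sign, total causation as a triangle, causal priority as the precedence sign, and Co for coincidence.

```latex
\begin{align*}
&\text{Axiom 3.1 (part-of via overlap):}
  && x \sqsubseteq y \;\leftrightarrow\; \forall z\,(z \circ x \rightarrow z \circ y) \\
&\text{Axiom 3.2 (fusion):}
  && \exists x\,\phi(x) \;\rightarrow\; \exists y\,\forall z\,
     \bigl(z \circ y \leftrightarrow \exists x\,(\phi(x) \wedge z \circ x)\bigr) \\
&\text{Axiom 3.5 (Right Closure under Part):}
  && (a \triangleright b \wedge c \sqsubseteq b) \;\rightarrow\; a \triangleright c \\
&\text{Definition 3.1 (coincidence):}
  && \mathrm{Co}(a) \;\leftrightarrow\; \neg\exists b\,\exists c\,
     (b \sqsubseteq a \wedge c \sqsubseteq a \wedge b \prec c) \\
&\text{Hypothesis 3.1 (Causality):}
  && (a \triangleright b \wedge b \sqsubseteq c \wedge \mathrm{Co}(c))
     \;\rightarrow\; \exists d\,(a \sqsubseteq d \wedge d \triangleright c) \\
&\text{Axiom 3.6 (Transitivity):}
  && (a \triangleright b \wedge b \triangleright c) \;\rightarrow\; a \triangleright c
\end{align*}
```

Axioms 3.3 and 3.4 and Corollary 3.1 are left out of the sketch because the prose does not fix their exact form.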


In subsequent chapters, I will define the causal relation in terms of a notion of causal priority, and I will demonstrate there that the causal relation is transitive. However, for the moment, we will take the transitivity of total causation as a given.

In the philosophical literature, two kinds of causes are distinguished: total causes, and essential parts of total causes. The latter were called INUS conditions by the British philosopher J. L. Mackie, where INUS stands for "insufficient but necessary part of an unnecessary but sufficient condition." Unfortunately, Mackie's account was subject to insuperable difficulties, because he did not make use of a mereological theory of facts as the relata of causation. We will understand the condition "necessary part" to mean that a fact is part of a minimal cause of the effect. In other words, I will take a ▷ b to mean that a is a total, sufficient cause of fact b. Fact a is an INUS cause of b (symbolically, a ~> b) iff there is some total cause c of b such that: (i) a is a part of c, and (ii) no proper part of c is a total cause of b.

Definition 3.2 (INUS Cause)

In working with INUS conditions, it is useful to define the relation of being a minimal total cause of another event-token.

Definition 3.3 (Minimal Total Cause)

One token is an INUS cause of another just in case it is part of some minimal total cause of the second.
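The relationship between total causes, minimal total causes, and INUS causes can be illustrated in a finite setting. The sketch below rests on simplifying assumptions that are not in the text: facts are modelled as sets of atomic facts, "part of" is the subset relation, and the total causes of an effect are simply listed in a table. The fire example and all names (`fact`, `TOTAL_CAUSES`, and so on) are hypothetical.

```python
# Facts are modelled as frozensets of atomic facts; "part of" is the subset relation.
# The causal data below are invented purely for illustration.

def fact(*atoms):
    return frozenset(atoms)

# Assumed table of total (sufficient) causes for each effect.
TOTAL_CAUSES = {
    "fire": {
        fact("short_circuit", "oxygen", "flammable_material"),
        fact("short_circuit", "oxygen", "flammable_material", "bystander_watching"),
        fact("lightning", "oxygen", "flammable_material"),
    }
}

def minimal_total_causes(effect):
    """Total causes of the effect none of whose proper parts is also a total cause."""
    causes = TOTAL_CAUSES.get(effect, set())
    return {c for c in causes if not any(d < c for d in causes)}

def is_inus_cause(a, effect):
    """a is an INUS cause iff it is part of some minimal total cause of the effect."""
    return any(a <= c for c in minimal_total_causes(effect))

short_circuit_is_inus = is_inus_cause(fact("short_circuit"), "fire")
bystander_is_inus = is_inus_cause(fact("bystander_watching"), "fire")
```

The short circuit counts as an INUS cause because it belongs to a minimal total cause, whereas the bystander's watching belongs only to a total cause that has a smaller total cause as a proper part.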


A Situation-Theoretic Logic of Causation

In the previous section, I sketched out a theory of causation, treating causation as a relation between facts or situations. In this section, I want to build on that theory to create a logic of causation, treating causation now as a connective that combines two propositions or formulas. In order to do so, we must take each true formula of a formal language as standing for or representing some definite fact. I will take an expression 'φ*' to stand for the sum of all the minimal verifiers of the formula φ. A minimal verifier of a formula is a fact that makes the formula true but has no proper parts that make the formula true. For example, a minimal verifier of a disjunction will typically make one or the other, but not both, disjuncts true. If both disjuncts are true, the disjunction stands for the sum of the verifiers of the two disjuncts. If only one disjunct is true, the disjunction stands for the verifier of that disjunct. Formally, I define the factual correlate of a formula to be the mereological sum (represented by the ⊔x notation) of all the facts that minimally verify that formula.



Definition 3.4

Given this definition, we can stipulate the truth-conditions of causal formulas in terms of the underlying relations between the factual correlates of their component formulas:

We are now in a position to resolve a puzzle: the failure of the substitution of classical equivalents within causal contexts. Two formulas can be counted on to stand for the same fact only if they are strong-Kleene equivalent. Strong-Kleene equivalence is a much stronger condition than classical equivalence. For example, (φ ∨ ψ) is strong-Kleene equivalent to (ψ ∨ φ), since they are verified by exactly the same facts. However, φ is not strong-Kleene equivalent to ((φ & ψ) ∨ (φ & ¬ψ)), despite the fact that these are classically equivalent. Here again are the strong Kleene truth tables for negation, disjunction, and conjunction.
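The strong Kleene tables can be generated mechanically, and the two equivalence claims above can be checked against them. The following is a minimal sketch under stated assumptions: Python's `None` stands in for "undefined", and agreement on every three-valued assignment is used as a stand-in for strong-Kleene equivalence. The function names are illustrative and not part of the text.

```python
# Strong Kleene connectives over True, False and None (None = undefined).
VALUES = (True, False, None)

def k_not(a):
    return None if a is None else (not a)

def k_or(a, b):
    if a is True or b is True:
        return True
    if a is False and b is False:
        return False
    return None

def k_and(a, b):
    if a is False or b is False:
        return False
    if a is True and b is True:
        return True
    return None

def sk_equivalent(f, g):
    """Same value on every assignment, including the undefined ones."""
    return all(f(p, q) == g(p, q) for p in VALUES for q in VALUES)

def classically_equivalent(f, g):
    """Same value on every two-valued assignment."""
    return all(f(p, q) == g(p, q) for p in (True, False) for q in (True, False))

phi = lambda p, q: p
split = lambda p, q: k_or(k_and(p, q), k_and(p, k_not(q)))
disj = lambda p, q: k_or(p, q)
disj_swapped = lambda p, q: k_or(q, p)

# Print the three tables.
for a in VALUES:
    print("not", a, "=", k_not(a))
for a in VALUES:
    for b in VALUES:
        print(a, b, "| or:", k_or(a, b), "| and:", k_and(a, b))
```

The formula φ and its classical equivalent come apart exactly where ψ is undefined: with φ true and ψ undefined, φ is true but ((φ & ψ) ∨ (φ & ¬ψ)) is undefined.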

The following logical principles are confirmed by the present understanding of the meanings of the causal connectives:




Failures of Substitution

Substitution of classical equivalents clearly fails in this framework. To return to the example, in which it appeared that an eclipse caused the water to boil, we can see that the reductio fails at step 2: from the fact that fire caused the water to boil, we cannot conclude that the complex condition consisting of fire and eclipse or fire and no eclipse did so. Situations exist that verify the occurrence of fire without verifying this disjunction: namely, situations containing information about the fire but no information about the occurrence or non-occurrence of the eclipse.
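The point about situations that carry information about the fire but none about the eclipse can be made concrete with minimal verifiers. The sketch below is an illustration under assumptions, not the author's formalism: a situation is modelled as a partial assignment of truth values to two atoms, parts are sub-assignments, and the factual correlate of a formula is the union of its minimal verifiers. The world, the atom names, and the tuple encoding of formulas are all hypothetical.

```python
from itertools import combinations

# Assumed toy world in which both the fire and the eclipse occur.
WORLD = {"fire": True, "eclipse": True}

def situations(world):
    """All parts of the world, modelled as sub-assignments."""
    items = sorted(world.items())
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def value(formula, sit):
    """Strong Kleene value of a formula in a situation (None = undefined)."""
    op = formula[0]
    if op == "atom":
        return dict(sit).get(formula[1])
    if op == "not":
        v = value(formula[1], sit)
        return None if v is None else (not v)
    a, b = value(formula[1], sit), value(formula[2], sit)
    if op == "or":
        if a is True or b is True:
            return True
        return False if (a is False and b is False) else None
    if op == "and":
        if a is False or b is False:
            return False
        return True if (a is True and b is True) else None
    raise ValueError(op)

def minimal_verifiers(formula, world):
    verifiers = [s for s in situations(world) if value(formula, s) is True]
    return [s for s in verifiers if not any(t < s for t in verifiers)]

def correlate(formula, world):
    """Sum (union) of the minimal verifiers; None if the formula is not verified."""
    mv = minimal_verifiers(formula, world)
    return frozenset().union(*mv) if mv else None

fire = ("atom", "fire")
eclipse = ("atom", "eclipse")
split = ("or", ("and", fire, eclipse), ("and", fire, ("not", eclipse)))
```

In this model the correlate of `fire` contains only the fire fact, while the correlate of `split` also contains the eclipse fact, which is why substituting one for the other inside a causal context is not innocent.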


The Transitivity of INUS Causation

Is INUS causation a transitive relation? Suppose a ~> b and b ~> c. By Causality and the Transitivity of ▷, it follows that a is part of a total cause of c. However, it does not follow that a is part of a minimal total cause of c. Suppose, for example, that c was overdetermined, that there were two independent causes of c. It could be that a is then partly redundant as a cause of c, despite its being non-redundant as part of a cause of b. In order to make INUS causation transitive, we would have to add something like the following two theses:

Hypothesis 3.2 (No Overdetermination)
Hypothesis 3.3 (No Action at a Distance)

No Overdetermination stipulates that if a and c are both total causes of b, and a and c are coherent, then the mereological intersection of a and c exists and is also a total cause of b. No Action at a Distance requires that if a causes b and b causes c, d is a part of a, e is a part of c, and d is a cause of e, then there must exist a causal chain leading from d to e that co-exists with b.

It seems reasonable to assume that if one situation is a minimal cause of another, then the first must be coincident. If it were not coincident, then it would have some redundant parts, namely, those that are caused by others of its parts.

Hypothesis 3.4 (Coincidence of Minimal Causes)

These hypotheses entail that INUS causation is transitive.



Theorem 3.1 (Transitivity of INUS, I) Causality, Transitivity, No Overdetermination, No Action at a Distance, and Coincidence of Minimal Causes entail that INUS causation is transitive.

Proof: Suppose a ~> b and b ~> c. We must show that a ~> c. By definition of ~>, we have that a ⊑ d, d ▷min b, b ⊑ e, and e ▷min c. By the Coincidence of Minimal Causes, we know that d and e are each coincident. By Causality, it follows that there exists an f such that d ⊑ f, Co(f), and f ▷ e. Since f ▷ e and e ▷ c, it follows by the transitivity of ▷ that f ▷ c. It suffices to show that, for every g ⊑ f, if g ▷ c, then a ⊑ g. Suppose that g ⊑ f and g ▷ c. It suffices to prove that a ⊑ g. Since we have f ▷ e, e ▷ c, g ⊑ f, c ⊑ c, and g ▷ c, it follows from No Action at a Distance that there is an h such that Co(h ⊔ e), g ▷ h, and h ▷ c. Since h ▷ c and e ▷ c, by No Overdetermination, we have that (h ⊓ e) ▷ c. Since e is a minimal cause of c, it must be that e ⊑ (h ⊓ e), which means that e ⊑ h. By Right Closure, it follows that g ▷ e. By Right Closure again, since b ⊑ e, it follows that g ▷ b. Since f is coincident, and d and g are both parts of f, it must be that d ⊔ g is also coincident. Since g ▷ b, d ▷ b, and Co(d ⊔ g), it follows by No Overdetermination that (d ⊓ g) ▷ b. Since d is a minimal cause of b, it must be that d ⊑ (d ⊓ g), which means that d ⊑ g. Since a ⊑ d, we can conclude that a ⊑ g. QED

The No Action at a Distance axiom seems reasonable, since we do have a strong predilection toward believing in the existence of an intervening chain of events linking any cause and effect widely separated in time. However, the No Overdetermination axiom seems too strong: although unlikely, overdetermining causes are not altogether impossible. It seems reasonable to weaken the No Overdetermination axiom into a defeasible or default rule, with the consequence that the transitivity of INUS causation is to be expected as a rule, but not without exceptions.
There is another way of securing the transitivity of INUS causation. Instead of the No Overdetermination condition, we could impose a condition of strict downward monotonicity of causation: if a is a total (not INUS) cause of b, and c is a proper part of b, then there is always a proper part of a that is a total cause of c. This implies the absence of granularity in the causal structure of the world. It also implies that if a is a minimal cause of b, then b is a maximal effect of a.

Hypothesis 3.5 (Strict Downward Monotonicity)

In order to secure transitivity of INUS, we need a generalized version of the hypothesis of Causality:

Hypothesis 3.6 (Generalized Causality)

In addition, we need to assume that any coincident extension of a cause is still a cause, and that the sum of two effects is also an effect.

Axiom 3.7 (Extensibility of Causes)
Axiom 3.8 (Right Closure under Sum)


Theorem 3.2 (Transitivity of INUS, II) Strict Downward Monotonicity, Generalized Causality, Transitivity, and No Action at a Distance entail that INUS causation is transitive.

Proof: Assume that a ~> b and b ~> c. We must show that a ~> c. By definition of ~>, we have that a ⊑ d, d ▷min b, b ⊑ e, and e ▷min c. By Generalized Causality, it follows that there exists an f such that d ⊑ f, Co(f), and f ▷min e. Since f ▷ e and e ▷ c, it follows by the transitivity of ▷ that f ▷ c. It suffices to show that for every g ⊑ f, if g ▷ c, then a ⊑ g. Suppose that g ⊑ f and g ▷ c. By No Action at a Distance, it follows that there is an h such that g ▷ h, h ▷ c, and Co(h ⊔ e). By Extensibility of Causes, since g ▷ h, g ⊑ f, and Co(f), it follows that f ▷ h. Since f ▷ h and f ▷ e, by Right Closure under Sum, we have f ▷ (h ⊔ e). Suppose for contradiction that h ⋢ e. Then e ⊏ (h ⊔ e). By Strict Downward Monotonicity, it follows that there is a j ⊏ f such that j ▷ e. But this contradicts the fact that f is a minimal cause of e. Thus, h ⊑ e. Since h ▷ c, e ▷min c, and h ⊑ e, it follows that h = e. We know that g ▷ h, so g ▷ e. Since g ⊑ f, and f is a minimal cause of e, it follows that g = f. Since a ⊑ d, d ⊑ f, and f = g, we have that a ⊑ g. QED


A Deterministic Model
In this chapter, I will employ the logic of situations (including the partial modal logic laid out in appendix A) in developing definitions for a family of causal notions. The atomic formulas of the language (or, to use the material rather than the formal mode, the basic situation types) consist of types of the following forms:

(basic atomic types)
(modal types)
(mereological types)
(classificatory types)
As (actuality types)
s ≺ s' (causal priority types)

All of these types, with the exception of the last one, are given precise definitions in appendix A. In this chapter, I will treat the causal priority relation ≺ as a primitive, corresponding to a classical (bivalent) and mereologically persistent binary relation on the class of situation-tokens. In the next chapter, once I have moved to an indeterministic model of causation, I will be able to offer a definition of causal priority in terms of modality and mereology. As explained in appendix A, the class of situation types is closed under the logical operations corresponding to the usual connectives of classical predicate logic (negation, conjunction, disjunction, and existential and universal quantification).



An adequate theory of causation would provide a framework for evaluating philosophical arguments involving causal notions, and for checking the consistency and unanticipated consequences of claims made about causation in the course of theorizing about other subjects. An exact theory of causation would also be of value to linguists and researchers in artificial intelligence, in providing a formal language adequate to the task of representing causal information carried in natural language or implicit in the design specifications for an intelligent agent. However, before looking at such specific applications of causal theory, it is possible to identify a number of at least prima facie desirable features for the prospective theory, based on existing philosophical insight into the nature of causation. There are five such desiderata of special significance:

1. Veridicality, Asymmetry, and Transitivity. Causation is veridical (both causes and effects are actual), asymmetric, and transitive, and, in an optimal theory, these properties would be natural consequences of some more fundamental properties, and not generated in an ad hoc fashion, by, for example, taking the transitive closure of some non-transitive relation.

2. Constructibility of Spacetime. This is a more controversial desideratum, but a number of philosophers have attempted to give causal theories of time. This would effect considerable simplicity in our account of the world, so an ideal theory of causation would make no use of spatial or temporal concepts, leaving open the possibility of defining such concepts in causal terms.

3. Modal Facts as Causes. This is also a more controversial desideratum than the first one. I want to give a causal account of our knowledge of modal truths (truths about possibility and necessity), and I hope to extend this to a causal account of logical and mathematical knowledge. Consequently, I want a theory of causation that leaves open the possibility that "eternal" facts, like modal facts, could enter into the causal nexus.

4. Formalizability of Teleological Explanations. The final cause or explanation seems to be causally posterior to what it explains. The mystery is: how can the order of explanation reverse the order of efficient causation? There would seem to be some relationship between teleological and causal explanation, and it is to be hoped that a formalism for causal reasoning will clarify both the nature of teleological explanation and its relation to causation.

5. Compatibility with Indeterminism. In all likelihood, the actual world is not deterministic, but that does not rule out causality. An adequate theory of causation should be robust enough to encompass the possibility of indeterminism.

In this chapter, I will lay out a formal theory of causation that clearly satisfies the first four desiderata. It will, however, presuppose a deterministic conception of causation (according to which causes strictly necessitate their effects). In the following two chapters, I will show how to modify this deterministic account in order to make it compatible with indeterministic, probabilistic forms of causation.



4.2 Causation and Determinism

4.2.1 Token Determinism and Type Determinism

The thesis of determinism has two quite distinct versions, one applying to situation-tokens and the other to situation-types. Token determinism would be the view that for every token outside of a special class of uncaused tokens, there is a causally prior, strictly sufficient condition for the existence of that token. In other words, for every caused token s, there exists a causally prior token s' such that the existence of s' strictly necessitates the existence of s. Type determinism is the view that for every caused token s and every type φ such that s belongs to φ, there is a situation s' that is immediately prior to s and a type ψ such that s' is of type ψ, and the existence of a situation of type ψ strictly necessitates the existence of an immediately posterior situation of type φ.

It is natural for a determinist to identify token causation with necessitation by a causally prior token, and to identify causal explanation with the necessitation of a type by the type of a causally prior token. And that is exactly what I will do in this chapter. However, although these definitions of causation and causal explanation are quite simple and have a number of desirable features, they possess one very serious defect. If we define causation as a species of strict necessitation, we are forced to the conclusion that a world without strict necessitation is a world without causation. We make causal idioms inapplicable to an indeterministic situation. Consequently, in the following two chapters, I will develop definitions of causation that are compatible with indeterminism. Of course, a determinist need not define causation as a species of strict necessitation. Indeed, we will see that there are reasons for not doing so, even if determinism were true.

The theses of token determinism and type determinism are independent. Much turns on the identity criteria for tokens.
If we accept the principle that two tokens belonging to the same types and having the same causes are identical, and we assume that every token belongs to its types essentially, then type determinism entails token determinism. However, token determinism does not entail type determinism, even if we assume that every token has all of its types essentially. A token s might cause token s' with necessity, and s' might belong to all of its types essentially, but it does not follow that the types of s necessitate the types of s'.
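The claim that token determinism does not entail type determinism can be illustrated with a two-world toy model. The sketch below is built on assumptions that go beyond the text: possible worlds are modelled as sets of tokens, strict necessitation as truth in every world of the model, and "immediately prior" is simplified to a listed priority relation. All names (`WORLDS`, `TYPES`, `PRIOR`, and the functions) are hypothetical.

```python
# Assumed toy model: two possible worlds, each containing one cause and one effect.
TYPES = {"a1": {"P"}, "b1": {"Q"}, "a2": {"P"}, "c2": {"R"}}
PRIOR = {("a1", "b1"), ("a2", "c2")}      # (earlier, later) pairs
WORLDS = [{"a1", "b1"}, {"a2", "c2"}]

def token_necessitates(s_prior, s):
    """Every world containing s_prior also contains s."""
    return all(s in w for w in WORLDS if s_prior in w)

def type_necessitates(psi, phi):
    """In every world, each token of type psi has a posterior token of type phi."""
    for w in WORLDS:
        for t in w:
            if psi in TYPES[t]:
                if not any((t, u) in PRIOR and phi in TYPES[u] for u in w):
                    return False
    return True

def caused_tokens():
    return {later for (_, later) in PRIOR}

def token_determinism():
    return all(
        any(token_necessitates(p, s) for (p, later) in PRIOR if later == s)
        for s in caused_tokens()
    )

def type_determinism():
    return all(
        any(
            type_necessitates(psi, phi)
            for (p, later) in PRIOR if later == s
            for psi in TYPES[p]
        )
        for s in caused_tokens()
        for phi in TYPES[s]
    )
```

Each cause-token exists in only one world and so necessitates its effect-token, yet the shared type P is followed by a Q-token in one world and an R-token in the other, so no type of the cause necessitates the type of the effect.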


Expressibility of Strictly Sufficient Conditions

A total cause is something like a sufficient causal condition. However, there are two problems with simply defining a cause as a strictly sufficient condition. First of all, such a definition would make the non-existence of all sorts of recherché events part of the cause of actual events. For example, the cause of the starting of my car's engine would include the non-existence of an immobilizing ray emitted from a passing UFO. I would like a definition of cause that would not require the inclusion of such purely negative and highly probable conditions. Second, we might be dealing with a model in which the only genuinely sufficient conditions are inexpressible as types. For example, all sufficient conditions might involve non-denumerably many atomic types, and our set of situation-types might be denumerable. Nonetheless, in this chapter I will set aside these worries, and take up the task of defining causation without strictly sufficient conditions in subsequent chapters.


Empiricist Conceptions of Determinism

Some philosophers, including van Fraassen (1987), Earman (1986), and Lewis (1983), have argued that empiricism implies that all modal properties (including which causal laws obtain) supervene on the distribution of occurrent properties in the world. In other words, if two worlds agree on the distribution of occurrent properties (like the primary qualities) throughout space and time, then they must also agree on all modal properties. One consequence of this version of empiricism is that we cannot have two worlds w₁ and w₂ which agree in all of their occurrent properties, and so agree on all causal laws, if these are interpreted as the simplest and most powerful extensional generalizations, but which are such that in w₁ these laws actually constrain the course of events, making alternative paths impossible, while in w₂ the course of events conforms to the laws by mere happenstance. The inability to make such a distinction is a fatal flaw in the empiricist approach, as Tooley (1987) has argued. When we reach such an unacceptable result, it is time to go back and examine the credentials of the "empiricism" that led us there.

As I will argue in part II, there is no reason to think of irreducibly modal properties as epistemologically problematic in a way that occurrent properties are not. Lewis and others have been led astray by a version of the Myth of the Given, according to which occurrent properties (like the primary and secondary qualities) are somehow directly present before the mind in a way that modal properties cannot be. In fact, there is no reason to think that our occurrent-property belief-forming apparatus is more basic or more reliable than our modal-property belief-forming apparatus. Indeed, from the perspective of causal or naturalized epistemology, occurrent properties could not be known were it not for irreducibly modal properties that accompany them, underlying the relation of reliability that constitutes the core of knowledge. Natural selection favors mental constitutions that are attuned to various modal and stochastic constraints. Natural selection rewards thinkers who successfully track what might/must/probably would happen under various situations, objectively speaking. Naturalized epistemology thus favors the modal realist. The empiricist approach to modality is based on a failed Humean epistemology. Empiricists also owe us an account of how our experiences are experiences of the instantiation of occurrent properties. I develop a causal-teleological account of intentionality in part II. There is no competing empiricist account on the table.

Empiricists such as Lewis (1983) define determinism by first defining a deterministic set of laws. A set of laws is deterministic if it is impossible for two worlds that conform to the set of laws to agree in all their properties at one point in time and yet diverge thereafter. A world is deterministic if its laws are deterministic, where the laws of a world are identified with the set of extensional generalizations that combines the greatest simplicity with the greatest content. Thus, the empiricist account of determinism, unlike mine, makes no reference whatsoever to modal properties, such as necessity.

If we try to translate Lewis's empiricist account of determinism into the terms of modal realism, we might come up with something like this. According to Lewis, the laws of nature are contingent. However, in order to explain the fact that laws support counterfactual conditionals, Lewis supposes that the laws of the actual world hold throughout some neighborhood of the actual world in logical space. Corresponding to this neighborhood is a proposition, Nₐ, which for Lewis is a set of possible worlds. Let vₐ be a situation-type that corresponds to the proposition Nₐ: a token s is of type vₐ just in case every possible world that extends s is in the neighborhood Nₐ. Suppose the laws of nature include the generalization ∀x((Ax & (x ⊨ φ)) → ∃y(Ay & x ≺₀ y & (y ⊨ ψ))), that is, every actual situation of type φ is followed by an actual situation of type ψ. Suppose situation-token s is of type φ and of type vₐ. If s is actual, then since it is of type vₐ, every possible world it is part of must support all the actual laws of nature. These laws include the one connecting φ and ψ. Hence, the existence of such an s strictly necessitates the subsequent existence of a token of type ψ. Since all of the laws are deterministic, it follows that every temporally bounded situation is strictly necessitated by some earlier situation-token.
Thus, Lewis's account can be seen as a version of the strict-necessity account developed in this chapter.
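The Lewis-style criterion for a deterministic set of laws (no two law-conforming worlds agree at one time and then come apart) can be tested on finite histories. The sketch below is a simplification under stated assumptions: a world is modelled as a finite sequence of total states, all worlds have the same length, and the set of worlds conforming to the laws is given directly as a list. The function name and the example histories are hypothetical.

```python
from itertools import combinations

def laws_are_deterministic(worlds):
    """worlds: equal-length state sequences, assumed to be exactly the
    histories that conform to some fixed set of laws."""
    for w1, w2 in combinations(worlds, 2):
        for t in range(len(w1)):
            # Agreement at time t must carry over to every later time.
            if w1[t] == w2[t] and w1[t:] != w2[t:]:
                return False
    return True

# Histories generated by the rule a -> b -> c -> a from three starting states.
cyclic = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]

# Two histories that agree at the first moment and then come apart.
branching = [("a", "b", "c"), ("a", "c", "c")]

cyclic_ok = laws_are_deterministic(cyclic)
branching_ok = laws_are_deterministic(branching)
```

Note that the check is purely extensional, in keeping with the empiricist definition: it inspects only which histories conform to the laws and never asks whether the laws constrain the course of events.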


Basic Ontology

In this section, I will introduce the model structures to be taken as formal representations of real possibilities concerning causation. These structures incorporate two kinds of individuals: situation-tokens and situation-types. Actual situation-tokens are to be thought of as real, concrete parts of the world, analogous to Davidsonian events. Merely possible situation-tokens are abstract objects, constructible from actual tokens and types, and representing possible but unrealized actualities. Each token carries a certain amount of information or fact about the world; these units of fact are represented as situation-types.




Classification Systems

A classification system consists of a set of tokens, a set of types, and a binary relation on the two sets (the classification relation).¹ For my purposes, the set of tokens will be a set of situation-tokens, the set of types situation-types, and the classification relation the verification relation ⊨ defined in the last chapter.



Each model shall contain a classification system, together with two partial orderings on situation-tokens, ⊑ and ≺. The first represents the part-whole relationship of standard mereology. The second, a strict partial well-ordering, represents the relation of causal precedence. In addition, there shall be two modal accessibility relations, R^ and R^-. Consequently, a standard, deterministic model M consists of an n-tuple (Sit, Typ, R^, R^-, ⊨, ⊑, ≺), where:

Sit is a nonempty set, the set of situation-tokens.
Typ is a nonempty set of situation-types, closed under the various logical and modal operations.
R^ and R^- are binary relations on Sit, the outer and inner accessibility relations introduced in the last chapter.
⊨ is a binary relation on Sit × Typ.
⊑ is a partial ordering of Sit (antisymmetric and transitive).
≺ is a strict partial ordering on Sit (irreflexive and transitive).
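The structural conditions on a model can be checked mechanically for finite cases. The following sketch is a partial illustration under assumptions: it represents only the token set, the type set, the verification relation, the part-of ordering, and the causal priority ordering, and it omits the two accessibility relations and the closure of types under logical operations. The class and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Model:
    sit: frozenset        # situation-tokens
    typ: frozenset        # situation-types
    verifies: frozenset   # pairs (token, type)
    part: frozenset       # pairs (s, s2): s is a part of s2
    prior: frozenset      # pairs (s, s2): s is causally prior to s2

def is_transitive(rel):
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)

def is_antisymmetric(rel):
    return all(a == b for (a, b) in rel if (b, a) in rel)

def is_irreflexive(rel):
    return all(a != b for (a, b) in rel)

def well_formed(m):
    """Check the listed conditions: nonempty Sit and Typ, a verification relation
    on Sit x Typ, an antisymmetric transitive part-of relation, and an
    irreflexive transitive priority relation."""
    return (
        bool(m.sit) and bool(m.typ)
        and all(s in m.sit and t in m.typ for (s, t) in m.verifies)
        and is_antisymmetric(m.part) and is_transitive(m.part)
        and is_irreflexive(m.prior) and is_transitive(m.prior)
    )

tokens = frozenset({"s1", "s2", "w"})
good = Model(
    sit=tokens,
    typ=frozenset({"P"}),
    verifies=frozenset({("s1", "P"), ("w", "P")}),
    part=frozenset({(x, x) for x in tokens} | {("s1", "w"), ("s2", "w")}),
    prior=frozenset({("s1", "s2")}),
)
# Same structure, except that a token is made causally prior to itself.
bad = Model(good.sit, good.typ, good.verifies, good.part,
            good.prior | {("s1", "s1")})
```

The second structure fails the check only because its priority relation is not irreflexive.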


Situation Types

There is a set of primitive, atomic situation-types, corresponding to the simple, atomic formulas of the last chapter. There are complex types of the following kinds:

(mereological types), where s and s' are situation-tokens
(classificatory types), where s is a situation-token and φ is a situation-type
As (actuality types), where s is a situation-token
(modal types), where φ is a situation-type

¹ These structures have been independently discovered many times over. Birkhoff (1940) called them "polarities," and Hardegree (1982) called them "contexts." They were also invented by the German mathematician Wille, whose work is discussed in Davey and Priestley (1990). More recently, Vaughan Pratt and his colleagues, working in the field of theoretical computer science, have called them "classification systems."



These categories of types were introduced in the situation logic of the last chapter. In this chapter, I add a new kind of atomic type: s ≺ s', expressing the causal priority of s over s'. Corresponding to this type in the class of models is a binary relation ≺ between situation-tokens. In this chapter, I will treat causal priority as a primitive relation, and, for simplicity's sake, I will assume that all situations have concordant and complete information about the priority relation, so:


Persistence of Situation-Types

One situation excludes another whenever there exists no situation containing both of them as parts. If we assume that all actually possible situations are coherent, in the sense that there is no type φ such that the situation belongs to both φ and ¬φ, then facts about what possible situations exclude other situations will be constrained by facts about the persistence of types, that is, about when a whole inherits the types of its parts. There are four forms of persistence that seem plausible:

1. Global mereological persistence. If a part belongs to the type, so does the whole.
2. Synchronic persistence. If s belongs to the type, s ⊑ s', and no part of either is causally prior to any part of the other, then s' also belongs to the type.
3. Punctual persistence. If s belongs to the type, s ⊑ s', and exactly the same situations are prior to both, then s' also belongs to the type.
4. Non-persistence. There is no condition that guarantees that when a part belongs to the type, so does the whole.

A globally persistent type represents an eternal fact (such as a modal or mathematical fact), or a fact that includes reference to particular individuals (or places) at a particular time (such as 'Clinton was speaking at noon, July 3, 1994'). A synchronically persistent type represents a fact in which individuals or places involved are specified, but not the time, such as 'Clinton speaking at the White House'. A punctually persistent type represents a purely qualitative type, in which no particular individual, place, or time is specified, such as 'a man speaking on a platform'. In the case of a punctually persistent type, the spatiotemporal location of the fact is fixed by the causal antecedents of the situation-token belonging to the type. In appendix A, I assume that all situation-types are globally persistent. In this chapter and its successors, I will relax this assumption somewhat, but I will continue to assume that all types are at least punctually persistent.
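The three positive forms of persistence can be compared on a small finite model. The sketch below is a toy under assumptions that are not in the text: six named tokens, a hand-written part-of relation (including reflexive pairs), and a hand-written priority relation. A type is represented simply by the set of tokens belonging to it, and all names are hypothetical.

```python
# Assumed toy model: s and t are simultaneous, st is their sum, e is prior to s,
# e2 is prior to t, and es is a whole containing both e and s.
TOKENS = {"e", "e2", "s", "t", "st", "es"}
PART = {(x, x) for x in TOKENS} | {("s", "st"), ("t", "st"), ("e", "es"), ("s", "es")}
PRIOR = {("e", "s"), ("e", "st"), ("e2", "t"), ("e2", "st")}

def parts(x):
    return {p for (p, q) in PART if q == x}

def priors(x):
    return {p for (p, q) in PRIOR if q == x}

def globally_persistent(ext):
    """If a part belongs to the type, so does every whole containing it."""
    return all(w in ext for (s, w) in PART if s in ext)

def synchronically_persistent(ext):
    """As above, but only for wholes with no priority between parts of s and of w."""
    def no_cross_priority(s, w):
        return not any((p, q) in PRIOR or (q, p) in PRIOR
                       for p in parts(s) for q in parts(w))
    return all(w in ext for (s, w) in PART
               if s in ext and no_cross_priority(s, w))

def punctually_persistent(ext):
    """As above, but only for wholes with exactly the same prior situations."""
    return all(w in ext for (s, w) in PART
               if s in ext and priors(s) == priors(w))

def persistence_profile(ext):
    return (globally_persistent(ext),
            synchronically_persistent(ext),
            punctually_persistent(ext))
```

In this model the extension {s, st, es} is persistent in all three senses, {s, st} is synchronically and punctually but not globally persistent, and {s} alone is only punctually persistent.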
A causal theory of spacetime (to be discussed in the next section) depends on the model's containing an adequate repertoire of types that are punctually but not synchronically or globally persistent.

4.3.5 Identity Conditions for Tokens

I will assume that each token has three kinds of properties essentially: its types (representing its intrinsic character or quality), its parts, and the network of its causal antecedents (representing its backward time-cone). The third assumption is a generalization of the Kripkean intuition that the origin of a thing is always essential to it. It seems plausible to suppose that a particular event could not have been the very event it is if either the intrinsic character of the event were different, or if the causal chain leading up to the event were different. In contrast, the subsequent course of events, causally posterior to an event, is not essential to its identity. The very same event could exist in different worlds, with different subsequent histories. Some such view as this seems to be implicit in our conviction that the past is fixed and the future is open. It is surely not the case that the type of event realized by the present state of the universe necessitates a particular type of prior history. It is metaphysically possible for an event just like the one realized by the present state of the universe simply to pop into existence without any real past. However, we remain convinced that the past is somehow necessary, given the present. The best way to make sense of this conviction is to say that the existence of the event-token of the present state of the universe necessitates the existence of the particular tokens in its actual history. Since past tokens are causally prior to present tokens, we can generalize this to the thesis that the token causal antecedents of any token are necessitated by it. If we make these assumptions, then any possible token could be represented as an ordered triple, consisting of a set of coherent types, a set of possible tokens (representing the token's proper parts), and a causal tree of possible tokens, rooted in the token itself (if we use non-well-founded set theory) or in the immediate causal antecedents of the token. 
I would not want to identify a real situation-token with such an ordered triple. There is, however, a homomorphism from actual tokens to the set of such triples. Triples that do not represent actual situation-tokens can be taken as representing merely possible tokens. If we do not adopt the thesis of the relative necessity of causal antecedents, it is hard to see how we could provide clear criteria of identity for merely possible tokens. Suppose a and b are two non-actual tokens that share all the same types, but have quite different causal antecedents. What could possibly determine whether or not a and b are identical? They do not actually exist, so it seems implausible to say that their identity or distinctness could be a matter of brute fact. If we are to embrace some sort of actualism, we seem to be forced to say that merely possible tokens are constructed from actual tokens and types in the way sketched above.

The substance of my account of the identity of tokens can be broken down into two claims: (1) the claim that identity of types, parts, and antecedents is necessary for identity (even trans-world identity), and (2) the claim that these identities are sufficient for token identity. On the question of necessity, the claim that the sameness of both parts and antecedents is essential to a situation-token is an extension of the idea that the constitution and origin of a concrete thing are both essential to it. It is true that in natural language we sometimes treat event-tokens with slightly different parts and antecedents as identical. For example, we might say that the death of Caesar would have been less painful had Brutus not participated. However, such looseness in natural language should not be taken as settling the metaphysical issue. On the question of the sufficiency of the criteria, it is important to compare the identity criteria with the means we actually use in settling event identity questions. Typically, we identify two events, such as the beginning of the Civil War and the attack on Fort Sumter, by finding that one and the same event is responsible for two distinct effects. This typically involves tracing the causal antecedents of the effects back until in each chain we find an event-token of the same type at the same location in space and time. I will argue in section 5.10.2 that spatiotemporal location is determined by the parts and the causal antecedents of an event-token. Thus, our practice of identifying event-tokens seems to take the sameness of types, parts, and antecedents as sufficient for token identity.
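The representation of possible tokens by their types, parts, and causal antecedents can be made concrete with a small data structure. The following is only a sketch under stated assumptions: the class name `Token`, its field names, and the helper `same_token` are hypothetical, types are encoded as strings, and the well-founded option is taken, so that each token records its immediate causal antecedents rather than containing itself. The structure stands in for a token; it is not identified with one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for a possible token (not the token itself)."""
    types: frozenset                      # coherent set of type names
    parts: frozenset = frozenset()        # proper parts, as Token values
    antecedents: frozenset = frozenset()  # immediate causal antecedents


def same_token(a, b):
    """Identity criterion: same types, same parts, same antecedents."""
    return (
        a.types == b.types
        and a.parts == b.parts
        and a.antecedents == b.antecedents
    )
```

On this encoding, two stand-ins that agree in type but differ in causal history come out distinct, which matches the claim that sameness of antecedents is necessary for identity. Nothing causally posterior to a token enters into the comparison.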


The Causal Priority Relation

The causal priority relation ≺ cannot be identified simply with causation. Instead, it represents a necessary precondition for causation. In fact, under the assumption of determinism, these three causal notions are interdefinable:

1. the relation of causal priority;
2. the relation of being a total cause of;
3. the relation of being an essential part of a total cause of (Mackie's INUS condition: an insufficient but necessary part of an unnecessary but sufficient condition).

I will assume that the causal priority relation is transitive and irreflexive. A token s is immediately prior to s' just in case s is prior to s', and there are no intermediate tokens between any part of s and any part of s'.

The three causal relations ≺, ⇝, and ▷ (causal priority, INUS causation, and total causation, respectively) are such that any two of these can be defined in terms of the third, together with the mereological part-whole relation:

s ≺ s' iff there is a world (a coherent and total situation) w such that s is a part of the mereological sum of INUS causes of s' in w (or equivalently, iff s is a part of the mereological sum of minimal total causes of s' in w).

s ⇝ s' holds in w iff s is part of some minimal total cause of s' in w.

s ▷ s' holds in w iff … and s necessitates s'.

I will take the causal priority relation to be primitive and define the other two in terms of it, since in the following chapter I will be able to offer a definition of the causal priority relation in terms of mereological and modal relations. In this case, the first condition above imposes a minimality requirement on the extension of ≺ in a model. It is also possible to define the mereological part-whole relation using only a primitive sum operation on situations and the causal relation ▷. I will discuss this fact further in the section on spacetime topology.

These causal relations interact with mereology in a number of interesting ways. First of all, the total cause relation is closed under the operation of taking a part of the right term: if s is a total cause of s', and s'' ⊑ s', then s is a total cause of s''. This does not hold for ⇝ or ≺. For all three relations, we can take the sum of terms on the right: if s ▷ s' and s ▷ s'', then s ▷ (s' ⊔ s''), and similarly for ⇝ and ≺. Both ⇝ and ≺ are closed under the operation of taking a part of the term on the left: if s ⇝ s', and s'' ⊑ s, then s'' ⇝ s', and similarly for ≺. This does not hold for the total cause relation. The mereological sum of two total causes of s is always a total cause of s, and likewise for causal priority. However, the sum of two INUS causes need not be an INUS cause, since one might make the other redundant. Finally, if s1 ▷ s', s2 ▷ s'', s1 ≺ s'', and s2 ≺ s', then (s1 ⊔ s2) ▷ (s' ⊔ s''). The following axioms characterize the interaction between mereology and causal priority:

Axiom 4.1 Axiom 4.2 Axiom 4.3 Axiom 4.4 Axiom 4.5
We can think of x ≺ y as meaning that x lies entirely in the backward time cone (causally speaking) of y. If we model mereology by means of sets of atoms, defining a ⊑ b as true if and only if every atom in |a| (the interpretation of a as a set of atoms) is also in |b| (the interpretation of b), then we can define the causal priority relation between tokens in terms of an underlying causal priority relation ≺atom between atoms, specifically:

a ≺ b iff (i) for every x ∈ |a| there is some y ∈ |b| such that x ≺atom y, (ii) there are no y ∈ |b| and x ∈ |a| such that y ≺atom x, and (iii) |a| ∩ |b| = ∅.

In other words, a token a is prior to a token b just in case every atom in a is prior to some atom in b, no atom in b is prior to any atom in a, and a and b have no atoms in common. Although the causal priority relation is an undefined primitive, its interpretation is highly constrained, so as to give a ≺ b the meaning: a is prior and relevant to b. I stipulate that a model is proper only if, whenever a ≺ b is true, a is part of some minimal total cause of b in some world in the model. Causal priority is stipulated to be transitive and irreflexive. Consequently, both total causation and INUS causation are irreflexive. It is easy to verify that total causation is transitive. INUS causality is transitive only under special conditions (see chapter 3).
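The atom-based reading of causal priority lends itself to a direct check in a finite model. The sketch below is illustrative only and assumes an encoding that is not in the text: tokens are frozensets of atoms, the atomic priority relation is a set of ordered pairs, and the function name `token_prior` is invented. The exclusion of empty tokens is an added assumption, made so that the first clause is not satisfied vacuously.

```python
from itertools import product


def token_prior(a, b, atom_prior):
    """Return True if token a is causally prior to token b.

    a, b:       frozensets of atoms (hypothetical encoding of tokens).
    atom_prior: set of pairs (x, y) meaning atom x is prior to atom y.
    """
    # Assumption not in the text: empty tokens are excluded.
    if not a or not b:
        return False
    # Every atom in a is prior to some atom in b.
    forward = all(any((x, y) in atom_prior for y in b) for x in a)
    # No atom in b is prior to any atom in a.
    no_backward = not any((y, x) in atom_prior for y, x in product(b, a))
    # a and b have no atoms in common.
    disjoint = a.isdisjoint(b)
    return forward and no_backward and disjoint
```

Because the third clause requires disjointness, no non-empty token is prior to itself on this definition, whatever the atomic relation looks like, which agrees with the stipulated irreflexivity. Transitivity at the token level is not checked here and would depend on the atomic relation supplied.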


Constraints and Causation

Token-Level Causation

To say that s caused s' is to say that the actuality of s was something like a causally sufficient condition for the actuality of s'. I do not in fact think that the actuality of a cause strictly necessitates the actuality of its effect. In fact, I think that the reverse is probably true: the identity of the causes of a situation is essential to its identity. However, in this chapter I will pretend that causes do necessitate their effects, in order to capture a deterministic conception of causality.

Definition 4.1 (Token-to-Token Constraint)

This definition can also be generalized to a relation between tokens and sets of tokens (assuming that our language has been enriched by some means of referring to sets of tokens). Token s constrains the actuality of set B just in case it necessitates the actuality of some member of B:

Definition 4.2 (Token Causation under Determinism)

This definition of token causation can also be generalized to a relation between tokens and sets of tokens:


The token-to-token constraint relation is one of strict necessitation: every world containing the first situation must also contain the second. Token determinism consists of two theses: causes must be actual, and causes constrain their effects to be actual as well. Given the definition of constraint, it follows that if s is a world, and s ⊨ (s1 ▷ s2), then both s1 and s2 must be actual in (parts of) s.
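Strict necessitation between tokens, and its generalization to sets of tokens, can be illustrated over a finite stock of worlds. This is a simplified sketch under assumptions not found in the text: each world is encoded as a frozenset of token names, "containing" a token is read as set membership rather than parthood, and the function names `constrains` and `constrains_some` are hypothetical.

```python
def constrains(s, s_prime, worlds):
    """s constrains s_prime: every world containing s also contains s_prime."""
    return all(s_prime in w for w in worlds if s in w)


def constrains_some(s, candidates, worlds):
    """Generalized form: every world containing s contains some member
    of the set of candidate tokens."""
    return all(any(t in w for t in candidates) for w in worlds if s in w)
```

One consequence of this encoding is that a token occurring in no world constrains everything vacuously. The requirement that causes be actual, which is the other half of token determinism, rules such cases out as causes and is not modeled here.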


Type/Type Constraints

We can also define causal constraints between situation-types. In order to do so, I must first define a causal succession relation between tokens, abbreviated as s N s'.

Definition 4.3 (Causal Succession)

Definition 4.4 (Causal Constraint on Types)

A causally informed constraint from φ to ψ entails that every φ-situation must be immediately followed by a ψ-situation. Type constraints give rise to a distinctive form of modal logic. Since we are working with partial, three- or four-valued worlds, substitution into modal contexts is permissible only if the relevant types are strong-Kleene or Dunn equivalent, not just classically equivalent. (See appendix A for the details.) For example, φ and ((φ & ψ) ∨ (φ & ¬ψ)) are classically, but not strong-Kleene or Dunn, equivalent. This hyperintensionality of causal contexts is vital to their use in explicating teleological and representational properties.


Defining Causal Explanation

A causal explanation is a relation between one token-type pair and another token-type pair. A pair (s, φ) causally explains a pair (s', ψ) just in case s caused s', and s's being of type φ explains why its effect had to be of type ψ. This corresponds closely to Terence Horgan's (see Horgan (1989) and LePore and Loewer (1989)) notion of quausation: s qua φ causes s' qua ψ. I will use this notion to define the causal powers of a property, and it can also be used to define the causal relevance of a particular instantiation of a property to other facts. My definition of causal explanation is intended to capture only the metaphysical core of our ordinary notion of 'explanation'. As many have observed, there are a host of pragmatic factors that enter into making something a good or apt explanation. Explanation in this full, pragmatic sense is typically contrastive: we explain why something is φ as opposed to ψ. Moreover, explanation depends on the knowledge and interests of the audience: we do not typically cite something that everyone knows was present, such as including the presence of oxygen in the atmosphere as part of the explanation of a house fire. However, I am aiming at a characterization of an interest-independent, non-pragmatic explanatory relation, one that constitutes a necessary condition of something's being a correct explanation. This relation could also be thought of as a relation of objective causation between facts, where facts are identified with pairs consisting of an actual situation-token and a type that it supports.

Definition 4.5 (Causal Explanation (Fact/Fact Causation))

Causal explanation is veridical in both terms: both s1 and s2 must be parts of s, s1 must be of type φ, and s2 must be of type ψ. It is also irreflexive: no token-type pair explains itself. The transitive closure of the explanation relation would also be irreflexive, so explanatory loops are excluded. If the transitive closure of the ≺ relation is a partial well-ordering, there cannot be any explanatory infinite regresses. As I just said, causal explanation under the deterministic conception is provably sound: if there exists an explanation of s's being ψ, then s really is ψ. In contrast, there is no necessity that explanation be complete, that is, that there be an explanation for every type characterizing every causally consequent situation. Thus, the deterministic conception of causation does not by itself guarantee that type determinism is true. We can consider the completeness of causal explanation as an optional hypothesis.

Theorem 4.1 (Soundness of Causal Explanation)

Proof. A trivial consequence of the definitions.

Hypothesis 4.1 (Completeness of Causal Explanation)


Negative Causation

Negative facts (that is, actual tokens paired with negative types that they support) can serve both as causes and effects. Preventions involve negative effects; causation by omission or absence involves negative causes. There are also cases (collected by Jonathan Schaffer (2001)) of positive cause/effect pairs in which the causal connection between the two passes through a wholly negative intermediary. We can have that A causes B by preventing a state that, otherwise, would have prevented B. For example, a terrorist causes a midair collision of two airliners by kidnapping the air traffic controller. The kidnapping causes an absence (the absence of the controller from his post), and this absence in turn causes the collision, by failing to produce the radio signals needed to prevent the collision.

Negative facts exist in abundance. For every positive situation-type, there exists a corresponding negative type, its negation. Situation-tokens are, typically, partial in nature. Hence, in many cases, a token s supports neither type φ nor its negation ¬φ. For this reason, we cannot simply identify supporting ¬φ with not supporting φ. In many cases, perhaps in all, when a token s supports a negative type ¬φ, there is some positive type ψ that competes with or excludes φ (ψ might be a different determination of some common determinable) and is supported by s.2 However, we must clearly distinguish between the two questions:

1. Are there genuine negative types and negative facts (consisting of the pairing of an actual token with a negative type it supports)?
2. Are there purely negative tokens, that is, tokens that support only negative types?

If our answer to the first question were "No," we would have to find a "No" answer to the second question nearly compelling, unless we were willing to countenance the existence of the empty token, a token supporting no types whatsoever (surely an implausible supposition). However, I am endorsing a "Yes" answer to the first question, and, consequently, I consider the second to be very much an open question, to be decided on scientific, and not a priori, grounds.

D. H. Mellor has argued that the reality of negative causation argues strongly against taking concrete events as the relata of causation, since the mere absence of something does not constitute a concrete event. Does Mellor's argument apply to my own account of token causation, of causation as a relation between situation-tokens? Situations are clearly a broader category than that of events: every event is a situation, but not vice versa. If the absences that figure in negative causation were mere nothings, if the corresponding relation-instances of causation involved the relation of causation with one or the other of its relata simply missing, then negative causation would refute my thesis that every instance of causation involves two situation-tokens. However, the absences involved in these cases are not mere nothings: they are absences of a particular sort, at a particular time and place. For example, in a Democritean universe, situations involving the void are every bit as much a part of the world as are the situations of the atoms. It is the absence of the controller from his post at the time immediately preceding the collision that caused the disaster. I see no reason to doubt that this absence is supported by part of the world, by a particular, concrete situation-token. Is there a token that supports only the absence of the controller at that place and time? This is a more difficult question to answer, but however we answer it, the compatibility of tokens as relata and negative causation is secure.

2 It would be a serious error to try to reduce negation to something like exclusion or competition between types, since these notions themselves seem to involve an element of negation: type φ excludes ψ just in case it is not possible for the two to be instantiated together. It is best, I think, to treat negation as a primitive, indefinable relation between types.


Singular Causation

Following Tooley, I will use "Humean supervenience" to represent the thesis that the facts about token-causation supervene upon occurrent facts plus the facts about the actual causal laws. To deny Humean supervenience is to affirm the possibility of singular causation, causal connections whose existence is inexplicable in terms of causal laws and non-causal facts. My account so far is neutral on the question of Humean supervenience. However, it does treat singular causal connection as a more basic notion than that of causal law. This does not preclude Humean supervenience, but it certainly makes this thesis an unnatural assumption to make without corroborating evidence. In fact, the notion of causal law does not play a central role in my account, in contrast to the Armstrong/Tooley tradition. I prefer to make use of modal and stochastic notions, rather than talking directly about "lawlike" generalizations. If the hypothesis of the completeness of causal explanation is true, then every instance of token-level causation falls under some necessary generalization at the level of types. This implication of explanatory completeness is important enough to warrant separate attention. I shall refer to it as Hume's hypothesis, since its truth is entailed by the extensional adequacy of Hume's definition of causation in terms of relations between types.

Hypothesis 4.2 (Hume's Hypothesis) If … and …, then there exists a type φ such that …

Hume's hypothesis can be generalized by use of the generalized (token-set) causation.

Hypothesis 4.3 (Generalized Hume's Hypothesis) If … B (s' ⊨ ψ), then there exists a type φ such that … and … and …

These hypotheses do not entail Humean supervenience, however, since even if they held, it could still be the case that which token is causally connected to which is not determined by the combination of non-causal facts about the tokens plus the type-level necessities. It may be that law-like generalizations always presuppose some irreducible facts about token-level causal connections. This is especially plausible if, as I will argue in section 4.10.2, space and time are themselves constructible from such token-level causal connections. Typically, causal generalizations will make reference to the spatiotemporal relations between the cause and effect.

Both Armstrong and Tooley are overly concerned about whether causal laws are contingent or necessary (they both insist that these laws are contingent). Are necessities of causal connection themselves necessary or contingent? This is a familiar question in modal logic. It amounts to asking whether metaphysical necessity is at least S4; that is, is the relevant accessibility relation transitive? Armstrong and Tooley are, in effect, asserting that necessity is not S4, that some necessities are themselves non-necessary. I am inclined to believe that most causal necessities at least are contingent, but, unlike Armstrong and Tooley, I do not see any interesting metaphysical issues turning on this question. Armstrong and Tooley seem to have a tendency to confuse the necessary/contingent contrast with the analytic/synthetic distinction. They seem to suppose that, if some causal laws were necessary, they would have to be analytic as well. Since no causal law is analytic, they infer that all causal laws are contingent. However, I cannot see how we can exclude the possibility that at least some causal laws are necessary but synthetic.


Heterogeneous Causal Explanations

Heterogeneous explanations would include transitions from folk science to advanced science, and vice versa. For example, one might explain the occurrence of nuclear fission in terms of slowly pulling a control rod from a reactor core (an advanced explanandum and a folk explanans). Or, one could explain the fragility of glass in terms of its molecular structure (folk explanandum, advanced explanans). Some of the most interesting cases of heterogeneous explanations are psychophysical and physicopsychic explanations. The conceptual incongruity of the two systems of classification, including the fuzziness of one and the precision of the other, is no bar to the existence of genuine explanations. Where explanation breaks down is in the borderline, indeterminate cases. If John is a borderline case of baldness, or the clump of salt is a borderline heap, then a physical explanation of either fact will be problematic.

This conclusion contrasts sharply with Jaegwon Kim's well-known views on the matter. Kim embraces a principle of causal inheritance (Kim, 1993, p. 351):

If M is instantiated on a given occasion by P, then the causal powers of this instance of M are identical with (perhaps a subset of) the causal powers of P.

I take the 'causal powers' of a token to be a function of the causal explanatory force of the types of that token. A token that realizes a mental type has, by virtue of realizing that type, certain irreducible causal powers. These powers include powers to influence both the mental and the physical properties of other tokens. These mental-level powers can be supervenient on the physical-level powers of the token without being identical to some subset of the latter, contra Kim.


Empiricism and Modality

Van Fraassen has argued that the sort of naive reliance on modality that characterizes my approach violates certain empiricist strictures. In particular, van Fraassen argues that a modal realist like me, who denies that modal facts supervene on the non-modal facts, cannot solve the "inference problem" (van Fraassen (1987)). This inference problem concerns the rationality of accepting axiom T of modal logic: if necessarily φ, then φ. Since I decline any attempt to define necessity, I cannot argue that T is an analytic truth, derivable by deductive logic from a set of stipulative definitions. How then can I claim that acceptance of T is rationally obligatory? If I deny that it is rationally obligatory, I have no basis for claiming that causal explanation is sound, or that causal necessities constrain the actual sequence of events in the world. My response to van Fraassen is simply to insist that the acceptance of T is required by the proper functioning of the human mind, which I do not take to be exhausted by conformity to the demands of deductive logic. Axiom T is in fact always true, and necessarily so. Hence, reliance on T is highly reliable, as reliable as reliance on any axiom of standard first-order logic. The "inference problem" is a problem only for one who, like van Fraassen, is wedded to the Humean doctrine that the only standard of rational belief is closure under standard deductive logic.3


Causal Relevance

A key notion in my definitions of teleofunction and of modal knowledge is that of causal relevance. There are two ways to define the causal relevance of the type of a token to a type of a second token. The first way makes use of the INUS connective, ⇝.

Definition 4.6 (Causal Relevance, I) (s : φ) ⇝ (s' : ψ) if and only if (i) … and φ is a natural type (relative to s), and (iv) for all s'', if … and …, then …

In other words, (s : φ) is causally relevant to (s' : ψ) just in case: …, φ is a natural (not gerrymandered) type, and s' is a minimal token verifying the relation s ⇝ s'. Thus, mereological minimality comes into the definition of causal relevance twice: first in the definition of the INUS condition (s is an INUS cause of s' just in case s is part of a minimal total cause of s'), and second, in the definition of causal relevance itself.
3 For a further discussion of this topic, as part of a general account of inductive knowledge, see section 19.7.


A second approach to the definition of causal relevance would be to define a relation of subtype. Type φ is a subtype of type ψ just in case every possible token that verifies φ also verifies ψ. The intension of the subtype is a subset of the type's own intension. Two types are identical if each is a subtype of the other, i.e., if their intensions coincide. Using subtypes, we can define a minimal explanation:

Definition 4.7 (Minimal Explanation) … if and only if φ is natural (relative to s), and for every natural type χ such that (s1 : χ) … and χ is a subtype of …

Finally, causal relevance can be defined in terms of minimal explanation, exactly as INUS causation has been defined in terms of minimal token causation.

Definition 4.8 (Causal Relevance, II) (s : φ) ⇝ (s' : ψ) if and only if φ is natural (relative to s), and there exists a type χ such that χ is a subtype of φ and …

It would be worthwhile to investigate under what conditions these two definitions of causal relevance coincide.
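The subtype relation, read through the remark that the intension of a subtype is a subset of the intension of the type, can be sketched over finite intensions. The encoding is an assumption made for illustration: an intension is a Python set of hypothetical token names, `intension` is a dictionary from type names to such sets, and the function names are invented. Naturalness and minimal explanation are not modeled.

```python
def is_subtype(sub, sup, intension):
    """sub is a subtype of sup when its intension is included in sup's."""
    return intension[sub] <= intension[sup]


def same_type(t1, t2, intension):
    """Two types are identical when each is a subtype of the other,
    that is, when their intensions coincide."""
    return is_subtype(t1, t2, intension) and is_subtype(t2, t1, intension)
```

On this reading every type counts as a subtype of itself, and identity of types is simply coincidence of intension, as the passage states.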


Merely Disjunctive (Gerrymandered) Properties

It is a commonplace of the philosophy of causation that merely disjunctive properties, that is, properties formed by the disjunction of unrelated and heterogeneous properties, cannot be causally efficacious. For instance, one can explain a fever by attributing the property of having the mumps, but not by attributing the property of having the mumps or suffering from sunstroke. The latter is not a natural property. Some disjunctions are not merely disjunctive: for example, the property of being a marsupial or being a placental mammal corresponds to the natural property of being a mammal. The real difficulty lies in finding a principled way of distinguishing disjunctive predicates that represent merely disjunctive properties from those that represent natural ones. We don't want to rely simply on linguistic form: a type φ ∨ ψ might correspond to a perfectly natural, non-gerrymandered type, of which φ and ψ are exhaustive subtypes. The most promising strategy is to make use of the causal laws (or type/type constraints) in which the property figures. A property φ ∨ ψ is merely disjunctive just in case, for every property χ, the constraint (φ ∨ ψ) |~ χ holds if and only if both φ |~ χ and ψ |~ χ hold. However, this method of distinguishing merely disjunctive from natural properties fails in the present context, in which the constraints are held to be deterministic. Every disjunction would turn out, according to the deterministic model, to be merely disjunctive. An alternative characterization of merely disjunctive properties makes use of the mereological structure of the situation-tokens that support these deterministic constraints.

Definition 4.9 (Merely Disjunctive (Gerrymandered) Types) (φ ∨ ψ) is merely disjunctive (or gerrymandered) relative to situation s iff, for every type χ, if …, then there exist proper parts of s, s1 and s2, such that … and …

Goodman's quality of grue, for example, is a clear case of a gerrymandered, merely disjunctive property. Any causal constraint involving grue can be factored into separate causal constraints, one involving blueness (and, perhaps, being first observed after 2000) and the other involving greenness (and being first observed before 2000). These two separate constraints could each be supported by tokens that were proper parts of the token supporting the gerrymandered grue-constraint. Gerrymandered types are never causally relevant. This conclusion is immediate if we use the definitions of causal relevance given above and make the identification of natural types with non-gerrymandered types. This distinction between the causal irrelevance of grue and the (possible) causal relevance of green provides a simple and satisfactory solution to Goodman's "new riddle of induction."


Efficacy of Dispositional Properties

It is clear that there are dispositional states. For example, a token has the dispositional state of being dormative just in case it supports some type φ, and also supports a causal constraint linking φ to the state of being asleep. If dispositional states are situation-types, then they certainly satisfy the definition of causal relevance. If the dispositional type of dormativity exists, then there undoubtedly exists a causal constraint linking dormativity itself (and not just each of its various realizations) with the state of being asleep. However, it is not a trivial matter to claim that dispositional situation-types exist. We cannot appeal to a general principle of abstraction, which would assert that every open formula corresponds to a situation-type. Such a general principle would almost certainly be logically inconsistent, since nothing would prevent us from forming the open formula corresponding to the property of heterologicality, the property of being a type that does not apply to itself: λx¬(x ⊨ x). This property of heterologicality would support itself if and only if it does not support itself, a logical impossibility. Whether dispositional types in general exist, or whether dispositional types of certain special kinds exist, is not an a priori question, but one that can only be settled scientifically. If the best account of the world requires that we posit the existence of dispositional types, then we should do so, but only if this is so.

Ever since Molière parodied the Aristotelian chemist, who explained the sleep-inducing power of a narcotic by appealing to its property of dormativity, many have held that the causal irrelevance of dispositional properties is an a priori certainty. These enemies of dispositions often appeal to Hume's dictum that the relationships between cause and effect must be logically contingent. However, if dispositional types exist, it seems plainly wrong to deny them causal efficacy, Hume's dictum notwithstanding. For instance, Feynman successfully explained the explosion of the shuttle Challenger by reference to the fragility of the O-rings at low temperatures. This property of fragility (if it exists) was certainly causally relevant to the disaster, by virtue of being causally relevant to the shattering of the O-rings under the actual conditions of the launch. It is true of course that, once we have discovered that morphine is sleep-inducing, or that O-rings do shatter in cold weather, appeals to dormativity or fragility do not provide an interesting explanation of the phenomena. In fact, they merely restate the explananda. However, that observation falls far short of establishing that dispositional properties have no causal efficacy.

Frank Jackson (Jackson, 1996, p. 202) has argued that to allow dispositions to be causes is to "admit a curious and ontologically extravagant kind of overdetermination." I agree with Jackson up to a point: to posit dispositions as causes is to be committed to the existence of dispositional types, and this commitment should be undertaken only on the basis of some positive evidence. I accept Occam's razor: we should not multiply entities unnecessarily. However, Occam's razor is two-edged: we shouldn't refuse to multiply entities in the face of contrary evidence. I disagree with Jackson (and with many others who take a similar stance, such as Jaegwon Kim (1997b)), who think that the avoidance of overdetermination should be a factor in assessing the case for the existence of dispositional types. After all, what is wrong with overdetermination? We can all agree that where overdetermination involves some kind of unexplained coincidence we have good grounds for skepticism. It is unlikely that two or more sufficient causes would converge (for no explicable reason) on exactly the same effect.
However, the overdetermination involved in recognizing both the underlying physical properties and the disposition itself as causes does not involve any such unexplained coincidence. There is a perfectly intelligible relationship between the physical basis of the disposition and the disposition itself: the logical relationship between an instance of an existential generalization and the generalization itself. For example, let φ be the chemical features of morphine that give it its dormative power, and let ψ represent the state of being asleep. The relevant dispositional state of a sample of morphine would be the conjunction φ & (φ |~ ψ). I take it for granted that such conjunctions of particular chemical and modal properties exist. The corresponding dispositional type of dormativity would be:

The fact that a sample of morphine supports both the particular dispositional state and the general dispositional type is no coincidence. Given that Bonzo is a chimpanzee who is in the cage, there is no mere coincidence in the fact that it is the case both that Bonzo is in the cage and that a chimpanzee is in the cage. Overdetermination is troublesome only when one of two conditions is met: the two causes are non-overlapping and causally unrelated tokens, or the causes are two unrelated types of the same token. For example, if the victim is killed



by a volley of six simultaneously impacting bullets, there is a coincidence to be explained. Alternatively, if the charge and the spin of the electron are unrelated, and both are causally sufficient to produce some particular quantum effect, the overdetermination of the effect would be a puzzling coincidence. However, in the case at hand, neither of these cases applies. The chemical composition of the Challenger's O-rings, and the fragility of these same O-rings, are neither disjoint and unrelated situation-tokens, nor are they unrelated types of the same token. Hence, any "overdetermination" of the Challenger disaster by these two facts is entirely innocuous. In chapters 12 and 16, I will argue that we have good grounds for positing the existence of biological and mental dispositions as real situation-types.


Piecemeal Causation

In ordinary language, we sometimes describe one event as causing another, even though the two events overlap in time, so that parts of the cause are causally posterior to parts of the effect. David Lewis (Lewis, 1986b, pp. 172-173) has described this sense of causation as "piecemeal causation." Using the resources of mereology, it is simple to define piecemeal causation: a token s piecemeal-causes s' just in case every part of s' is caused by some part of s. In this sense, for example, we can say that the Vietnam War caused the campus unrest of the 1960s, despite the fact that the war outlasted the unrest: every part of the unrest was caused by some part of the war.
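The definition of piecemeal causation can be sketched computationally. The following Python fragment is only an illustration under stated assumptions: a token is modelled as a set of atoms, its parts as its non-empty subsets, and `causes` is a hypothetical, hand-written set of (cause, effect) pairs rather than anything derived from the theory.

```python
from itertools import combinations


def parts(token):
    """Return every non-empty part of a token (modelled here as a set of atoms)."""
    atoms = sorted(token)
    return [
        frozenset(combo)
        for size in range(1, len(atoms) + 1)
        for combo in combinations(atoms, size)
    ]


def piecemeal_causes(s, s_prime, causes):
    """True iff every part of s_prime is caused by some part of s."""
    return all(
        any((p, q) in causes for p in parts(s))
        for q in parts(s_prime)
    )


# Hypothetical data: three phases of the war, two phases of the unrest.
war = frozenset({"w1", "w2", "w3"})
unrest = frozenset({"u1", "u2"})
causes = {
    (frozenset({"w1"}), frozenset({"u1"})),
    (frozenset({"w2"}), frozenset({"u2"})),
    (frozenset({"w1", "w2"}), frozenset({"u1", "u2"})),
}

whole_war_result = piecemeal_causes(war, unrest, causes)
first_phase_result = piecemeal_causes(frozenset({"w1"}), unrest, causes)
```

In this toy data the phase `w3` causes nothing, which mirrors the war outlasting the unrest: the whole war still piecemeal-causes the unrest, while its first phase alone does not.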


Desirable Features of the Theory

At this point, I would like to review the desiderata mentioned in section 2 and check that the theory developed so far meets these goals.


Transitivity, Asymmetry, and Veridicality

Token causation has been defined in terms of the transitive closure of immediate causation, so at one level the question of transitivity is trivial, and the account we have given is unilluminating. However, on the deterministic conception, there is a modal property that is characteristic of causal connection: necessitation. The interesting question, therefore, is this: given that immediate causal connection involves necessitation, does it follow that the same is true of mediate causal connection? Of course, the answer to this question is clearly "Yes," since the relation of necessitation is itself transitive. In the case of asymmetry, it must be admitted that, although the causal relation is indeed asymmetric, this is achieved in an ad hoc manner, by including as an essential component the presence of the causal priority relation, which is simply stipulated to be a partial ordering. The indeterministic definition of



causation proposed in the next chapter will include a definition of causal priority that will go some way toward dispelling this ad hocness. The causal relation is veridical, in both terms. In the case of the cause, this is achieved by stipulation, but in the case of the effect, it is a natural by-product of the properties of necessity. What is necessitated by an actual situation must itself be actual.
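The sense in which transitivity is trivial on this account can be made concrete. The sketch below, in Python, treats immediate causation as a finite set of ordered pairs (the token names are hypothetical) and computes its transitive closure; it also checks asymmetry of the result, which holds here only because the sample data contain no cycles.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given set of ordered pairs."""
    closure = set(pairs)
    while True:
        new_pairs = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c
        } - closure
        if not new_pairs:
            return closure
        closure |= new_pairs


def is_asymmetric(relation):
    """True iff the relation never holds in both directions (hence is irreflexive)."""
    return all((b, a) not in relation for (a, b) in relation)


# Hypothetical chain of immediate causation among four tokens.
immediate = {("s1", "s2"), ("s2", "s3"), ("s3", "s4")}
causation = transitive_closure(immediate)
```

Because `causation` is a closure, asking whether it is transitive is uninformative, which is the point made above; the substantive question is whether necessitation carries over from the immediate links to the mediate ones.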


From Causal Mereology to Topology

I hope to be able to define some basic spatial and temporal relations in causal terms. I take causation itself to give the dimension and direction of time: s is before s' if s is causally prior to s'. To make this condition necessary as well as sufficient, we must introduce counterfactuals (as I do in the next section). A token s is before token s' just in case for some substance a located at s, and some condition φ on a, if a had had condition φ, then s' would have been prevented (the resulting worlds do not contain s' as a part). I can define timelike and spacelike separation between situations: two situations are separated in a timelike way just in case one is before the other. If they do not overlap, and no part of one is separated in a timelike way from any part of the other, then the separation is spacelike. I can use causal relations to define certain basic topological properties in terms of causal and mereological ones. Let's say that two situations s₁ and s₂ are cooperative just in case there is a third situation s₃ that is caused by the mereological sum of s₁ and s₂, yet no part of s₃ is caused by either s₁ or s₂. If we assume that two situations can cooperate only if they either overlap or are contiguous, then we can define contiguity in terms of cooperation and non-overlap. Once we have a definition of contiguity, we can define continuity and compactness:

Definition 4.10 (Cooperation)

Definition 4.11 (Contact)

Topological operations such as interiority and closure, and the properties closed and open, can be defined in the usual way, using contact and the mereological relations; see, for example, Clarke (1981), Clarke (1985), Gotts et al. (1996), and Asher and Vieu (1995).
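The definitions of timelike and spacelike separation can be illustrated with a small Python sketch. Several things here are assumptions made for the sake of a runnable example: tokens are modelled as non-empty sets of atoms, `prior` is a hypothetical causal priority relation on atoms, and priority is lifted to complex tokens by the rule used later in the finite automaton example (a complex token is prior to another just in case all of its atomic parts are). Under that lifting, checking atoms suffices for the "no part of one" clause.

```python
def atom_prior_to_token(atom, token, prior):
    """An atom is prior to a token iff it is outside it and prior to one of its atoms."""
    return atom not in token and any((atom, b) in prior for b in token)


def before(s, t, prior):
    """A (non-empty) token s is before t iff every atomic part of s is prior to t."""
    return all(atom_prior_to_token(a, t, prior) for a in s)


def timelike(s, t, prior):
    """Timelike separation: one of the two tokens is before the other."""
    return before(s, t, prior) or before(t, s, prior)


def spacelike(s, t, prior):
    """Spacelike separation: no overlap, and no timelike-separated parts."""
    if s & t:
        return False
    return not any(
        timelike(frozenset({a}), frozenset({b}), prior)
        for a in s
        for b in t
    )


# Hypothetical priority relation on four atoms: a precedes c, b precedes d.
prior = {("a", "c"), ("b", "d")}
A, B, C = frozenset({"a"}), frozenset({"b"}), frozenset({"c"})
AB = frozenset({"a", "b"})
```

With this data `A` and `C` are timelike separated, `A` and `B` are spacelike separated, and `AB` and `C` are neither, since only one atomic part of `AB` is prior to `C`.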
The qualitative or naive version of space and time developed in this way has two necessities as consequences: there can be no temporally backwards causation, and there can be no simultaneous action at a spatial distance, since the direction of naive time is just the direction of causation, and naive distance is determined by the number of intermediate causal steps. However, when we move from qualitative to quantitative spacetime, from naive to theoretical chronometry and geometry, these necessities need no longer hold without exception. As Gregg Rosenberg has recently argued (in his dissertation at Indiana University),



the task of constructing a metric for spacetime faces two constraints: matching the structure of naive, qualitative spacetime (with its direct correspondence with causation), and achieving mathematical simplicity. The need for greater mathematical simplicity can, in some cases, force us to accept a certain degree of mismatch between spacetime and causation. Consequently, the most we can say a priori is that temporally backward causation and action at a distance must be the exception rather than the rule. The case of quantum mechanics points out another source of discrepancy between causation and spacetime. It may be that the metrical spacetime that fits perfectly (or very nearly perfectly) the causal relations holding at a macroscopic level may do a very poor job of matching the structure of causal relations holding at the microscopic scale. Consequently, backward causation (Cramer (1986)) and action at a distance may be much more common at the microscopic scale, so long as we insist on locating the microscopic events in the spacetime constructed with macroscopic events in mind.


Counterfactuals and Spacetime

A causal theory of relativistic spacetime, along the lines of Robb (1914), depends on the notion of a world-line, the path that a light signal would take, if there existed such a signal starting in a given neighborhood. Thus, in order to begin such a project, we would need to have a theory of counterfactuals. In this section, I will sketch the semantics for a standard-issue counterfactual, one in which the antecedent specifies both a situation-token and a situation-type, and the consequent attributes some situation-type. Like the semantic theory of Frank Jackson (1987), my account of counterfactual conditionals makes explicit use of the concept of causal priority. Intuitively, the antecedent (s : φ) asks us to consider worlds in which token s has been replaced (if necessary) by a minimal token of type φ, with no change to the tokens that are not causally posterior to s. As preliminaries, I need to define the property of being a minimal token of type φ, and the relation of non-posteriority.

Definition 4.12 (Minimal Token)

Definition 4.13 (Non-Posteriority)

Definition 4.14 (Counterfactuals)

Given such counterfactuals, we can use hypothetical light signals, or hypothetical acts of measurement with a standard, rigid rod, to specify metrical distances.




Modal Partiality

Thanks to the partial modal logic developed in appendix A, the account of causation that I have proposed in this chapter does allow for the causal efficacy of modal facts (including nomological and mathematical facts). Situations can be partial with respect to the modal types that they support, and the modal types that a token supports directly contribute to its causal relations. A token s causes token s' only if s itself supports the modal fact s ⊢ s', that is, only if s supports □(As → As'). This means that modal facts can enter into causal explanations according to my definition. I will demonstrate this feature of the theory in more detail in chapter 7.


Teleofunctional Generalizations

Although teleological and functional explanations have been much discussed since the time of Plato, there is relatively little available on the question of the logical form of teleological laws. I conjecture that a teleological law consists of a certain kind of claim linking three generic or parametric facts or types: φ has the function of making it the case that ψ in species or natural kind υ. As has often been noticed, teleological explanation seems to reverse the normal causal order: ψ is the final cause of φ in υ, even though φ is causally prior (in the ordinary sense) to ψ. This has very often seemed paradoxical, but the air of paradox disappears once the logical form of the teleological claim is seen to involve the assertion of a higher-order causal law. To say that ψ is the final cause of φ in υ is to claim that there is a higher-order causal law whose antecedent contains the fact that something is an instance of υ and the fact that there is a causal law linking φ to ψ, and whose consequent is φ itself. Looking at the matter more formally, we can distinguish two kinds of teleological generalizations: those in which the functional attribute is causally necessary (in the presence of type υ) for its function, and those in which the functional attribute is causally sufficient:

These two kinds of teleological generalizations represent two extreme cases. In the following two chapters, I will develop models of causation that incorporate relations of objective probability, rather than those of strict necessity. Teleological explanation seems paradoxical because ψ occurs in the antecedent of the causal law and φ occurs in the consequent even though φ is causally prior to ψ. This is not in fact a semantic irregularity, because ψ does not occur in its own right in the antecedent; instead it occurs as part of a causal conditional. Although φ is causally prior to ψ, it is not causally prior to (υ & ¬φ) |~ ¬ψ. Consider a concrete example of a teleological connection. Suppose we claim that the function of a robin's tail is aerial stability. Let φ be the property of having a tail, ψ be the property of aerial stability, and υ be the property of being



a robin. The teleological claim consists of a claim that there is a higher-order causal law according to which the joint fact of something's being a robin and its being the case that having a tail is a causally necessary condition of aerial stability is a cause of that thing's having a tail. But there is nothing mysterious about such a higher-order law. Such a law is a corollary of a Darwinian theory of natural selection. Darwinism is best understood not as the thesis that there are no final causes in nature, but as the hypothesis that all final causes in nature are ultimately explicable in terms of reproductive advantage. Assuming that aerial stability is an adaptive feature of robins and that having a tail is indeed causally necessary (in the case of robins) for aerial stability, then this causal connection between tails and aerial stability is part of the explanation for actual robins' having tails: had their ancestors not acquired tails, robins would not have successfully reproduced. I will return to this point in chapter 7, and I will develop a theory of teleofunctionality in some detail in chapter 12.


4.10.6 Compatibility with Indeterminism

This is the unfinished business that I will take up in the next two chapters.


Applying the Theory to Some Examples

In order to demonstrate what this account both can and cannot do, I will sketch out applications of it to four simple examples of causal setups: a finite automaton, the determination of supply and demand in a marketplace (according to a rarefied, microeconomic model), a Turing machine, and one-dimensional billiards.


4.11.1 A Finite Automaton

The basic types for a theory of a finite automaton will consist of a finite set of internal state types, a finite set of input types, and a finite set of output types. These three sets can be assumed to be disjoint. Let R be the set of all possible run-types of the automaton, i.e., each member of R is a finite or ω-sequence of input-output-internal state type triples. We can define a set of representations of possible tokens recursively. First, I will define a set Ant of possible token-antecedents by means of the following recursive definition:

(∅, i, t) ∈ Ant, if (i, t) represents a possible initial input and internal state of a run in R.

(a, i, t, o) ∈ Ant if a ∈ Ant, and there is a run r ∈ R such that the inputs and internal states of r agree with a for the first n stages (where n is the recursive depth of a, and i and t are the input and internal state of the (n + 1)st stage of r).



An atomic situation-token is defined to be a tuple of the form (a, i), (a, i, t), or (a, i, t, o), where i, t and o are, respectively, input, internal state, and output types, and where (a, i, t, o) ∈ Ant. The first sort of atom represents a token whose only type is of the input variety, the second sort represents a token whose only type is of the internal-state variety, and the third represents an atomic token whose only type is of the output variety. The set of tokens Tok can be defined as containing the mereological sums of all coherent sets of atoms, where a set of atoms is coherent if all of its members agree in their sequence of types with one run r ∈ R. This construction fully determines the interpretation of ⊑. It only remains to define the causal priority relation ≺. I will first define this relation only for atomic tokens. If s is a sub-sequence of a, and s' = (a, i, t), then s ≺ s'. (The internal-state tokens are causally posterior to all the previous input and internal state tokens.) Furthermore,

Finally, we take the transitive closure of the relation so far defined. For complex tokens, if s is the mereological sum of a set A of atomic tokens, then an atomic token s' is causally prior to s iff s' is prior to some atomic member of A and s' ∉ A. A complex token s is prior to s' just in case all of its atomic parts are prior to s'. The atomic input tokens are uncaused and are not posterior to anything. The output tokens are causally inert (prior to nothing), and they are posterior to the simultaneous input and state tokens, as well as to all previous input and state tokens. The state tokens are prior to the contemporaneous output token, posterior to the contemporaneous input token, and posterior to all previous input and state tokens. The initial internal state token of a run is uncaused. Every subsequent internal state token is caused by the sum of the immediately preceding state and input tokens. Each output token is caused by the sum of its contemporaneous input and state tokens. In one sense, this setup is not deterministic, because the input tokens at each stage are wholly uncaused, exogenous to the system. However, I do not think that this is a very interesting sense of 'indeterministic'. The input tokens have a location in time only insofar as they causally impinge upon one rather than another internal state token in the run. We could imagine that the input tokens are all located in 'eternity', having only an external relation to the time-sequence of the machine. Or, we could imagine that in each run, the input tokens are all fixed in one fell swoop at the very beginning and held in a kind of suspended animation until the appropriate stage. What makes the finite automaton deterministic is that for every caused token (and every causally explicable token-type pair), there exists a strictly sufficient, causally prior condition.
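The causal structure just described can be generated mechanically for a single run. The Python sketch below is a simplified illustration, not the construction in terms of token-antecedents given above: tokens are labelled by a hypothetical (kind, stage) pair, and `step` and `emit` are assumed transition and output functions of a toy parity automaton.

```python
def causal_structure(inputs, initial_state, step, emit):
    """Build the tokens of one run and the immediate causes of each caused token.

    tokens maps (kind, stage) to the type of that token; causes maps a caused
    token to the set of tokens whose mereological sum causes it.
    """
    tokens, causes = {}, {}
    state = initial_state
    for n, symbol in enumerate(inputs):
        tokens[("input", n)] = symbol  # input tokens are exogenous
        tokens[("state", n)] = state
        if n > 0:
            # Caused by the immediately preceding state and input tokens.
            causes[("state", n)] = frozenset({("state", n - 1), ("input", n - 1)})
        tokens[("output", n)] = emit(state, symbol)
        # Caused by the contemporaneous input and state tokens.
        causes[("output", n)] = frozenset({("input", n), ("state", n)})
        state = step(state, symbol)
    return tokens, causes


def step(state, symbol):
    """Toy transition function: flip parity on input 1."""
    if symbol == 1:
        return "odd" if state == "even" else "even"
    return state


def emit(state, symbol):
    """Toy output function: report the new parity, using distinct output types."""
    return step(state, symbol).upper()


tokens, causes = causal_structure([1, 0, 1], "even", step, emit)
uncaused = sorted(t for t in tokens if t not in causes)
```

In this run the uncaused tokens are exactly the three input tokens and the initial state token, and every other token has a sufficient, causally prior condition, which is the sense of determinism appealed to above.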


Supply and Demand

The next example is the determination of market price and quantity by supply and demand in the model of classical microeconomics. In this case there is



an infinite collection of situation types comprising a set of demand-situation types, supply-situation types, and market-clearing types. Each demand, supply, or market situation type is of one of the following forms: (1) at price x the quantity of goods demanded/supplied/exchanged is at least y, and (2) at price x the quantity of goods demanded/supplied/exchanged is at most y. A set of demand types is coherent if all of its constraints on demand can be satisfied by a monotonically decreasing demand curve, with the quantity ranging from 0 to ∞, exclusive. Similarly, a set of supply types is coherent if it can be satisfied by a monotonically increasing supply curve. Supply and demand tokens are all exogenous: none is posterior to any other token. Hence, we can simply identify demand and supply tokens with the corresponding types. Market tokens are prior to nothing, and each market token is posterior to a set of supply and demand tokens. We can define the set of atomic market tokens as follows: If m is a market type of the form 'at price x the quantity of goods exchanged is at least y', A is the set of demand types of the form 'at price x, the quantity of goods demanded is at least z', and B is the set of supply tokens of the form 'at price x, the quantity of goods supplied is at least z', for some z > y, then (m, A, B) is an atomic market token. If m is a market type of the form 'at price x the quantity of goods exchanged is at most y', A is the set of demand types of the form 'at price x, the quantity of goods demanded is at most z', and B is the set of supply tokens of the form 'at price x, the quantity of goods supplied is at most z', for some z < y, then (m, A, B) is an atomic market token. Once again, we can let the set of tokens be the sums of all coherent sets of atomic tokens. An atomic market token (m, A, B) is causally posterior to an atomic demand token d iff d ∈ A, and it is causally posterior to an atomic supply token s iff s ∈ B. 
The causal priority relation can be extended to the set of all tokens in the same way as in the previous example. It is easy to check that every market token is caused by a token containing both supply and demand tokens as parts. In addition, every type of every market token can be causally explained by the types of its causes. The microeconomic model illustrates the fact that there can be constraints that are not causal constraints. For example, if the world contains a demand token of the type 'at price x, at least y is demanded', then for every δ there must be an ε such that the world contains a demand token of the type 'at price x + δ, at least y + ε is demanded'. However, the first token does not cause the second, nor is there a causal explanation for the type of the second token.
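The coherence condition on sets of demand types is a constraint of the non-causal kind, and for finite sets it can be checked directly. The Python sketch below rests on simplifying assumptions that are not in the text: the set of constraints is finite, "monotonically decreasing" is read as weakly decreasing, and each constraint is encoded as a hypothetical triple (price, kind, quantity). Under those assumptions a positive decreasing curve exists just in case every "at most" bound is positive and no "at least" bound exceeds an "at most" bound stated at the same or a lower price.

```python
def coherent_demand(constraints):
    """Check whether finitely many demand constraints fit some weakly
    decreasing, strictly positive demand curve.

    Each constraint is (price, kind, quantity) with kind "at_least" or "at_most".
    """
    lower = [(x, y) for x, kind, y in constraints if kind == "at_least"]
    upper = [(x, y) for x, kind, y in constraints if kind == "at_most"]
    # Quantities are strictly positive, so an upper bound of 0 is unsatisfiable.
    if any(y <= 0 for _, y in upper):
        return False
    # Demand at a lower price is at least demand at a higher price, so a floor
    # at price x_lo must not exceed a ceiling at any price x_hi <= x_lo.
    return all(
        y_lo <= y_hi
        for x_lo, y_lo in lower
        for x_hi, y_hi in upper
        if x_hi <= x_lo
    )


consistent = [(10, "at_least", 5), (12, "at_most", 7)]
inconsistent = [(10, "at_least", 5), (8, "at_most", 3)]
```

The second set fails because it demands at least 5 units at price 10 while allowing at most 3 units at the lower price 8, which no decreasing curve can satisfy.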


A Turing Machine

In the case of a standard Turing machine, there are just two tape-square types: 0 and 1. The head of the machine has types of two kinds: internal state and



location. There must be a finite set of internal state types, and the location types have the structure of the integers: ..., −2, −1, 0, 1, 2, .... Once again, the atomic tokens can be identified with an atomic (simple) type, together with a chain of possible type-transitions leading to an instance of that type. I will define the set H of atomic head-state tokens recursively. (∅, i, t, l, o) ∈ H, where (i, t, l, o) represents the input, internal state, location and output type of a possible initial state of the machine. If a ∈ H, then (a, i, t, l, o) ∈ H, if (i, t, l, o) represents a possible successor state to the final state of a. Initial tape-square tokens can be identified with triples (∅, j, n), where j ∈ {0, 1}, and n is an integer (representing the location of the square). Noninitial tape-square tokens can be identified with triples (a, j, n), where the final segment of a is (β, i, t, n, j) for some β, i, t. A set of atomic tokens is coherent if it is compatible with a possible run of the machine. All of the initial tokens, both of the head and of each of the tape-squares, are exogenous (posterior to nothing). The causal priority relation on tokens can be defined in the usual way, following the pattern of the previous examples. The head of the machine experiences a sequence of state-tokens that can be identified with clock time. There is no reason to suppose that the tape-squares experience a synchronized succession. In the simplest model, the tape squares can be imagined to tunnel through time, so that when the head reaches a square for the first time, it is affected by the initial state-token for that square, and when the head returns to a square after m units of its time, it interacts with the very state-token that it produced on its last visit. We can model the causation relation in a very intuitive way. However, when it comes to causal explanation, the deterministic conception we have adopted in this chapter produces what I call "explanation inflation." 
For example, suppose square n was written with value j at stage a of the head's activity. Suppose that the head returns to n for the first time at stage a + m. Does the fact that the square had value j at stage a explain the input value of the head at stage a + m? It does only if it is physically impossible for the head to return to square n in fewer than m units of time, regardless of the state of the head or of the other tape squares. Otherwise, any explanation of the input value of the head at stage a + m must include enough information about the head and the other tape squares to guarantee that the head will not have returned to square n in the intervening period. What is lacking is the notion of being an adequate but defeasible explanation. I think we should say that square n's being written with value j at stage a is an adequate but defeasible explanation of the head's reading j when it returns to n at stage a + m. Information about other tape squares is relevant only if the head actually returned to square n in the intervening period.
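The picture of tape squares tunnelling through time can be made concrete by recording, for each reading, the stage at which the square read was last written. The Python sketch below is a minimal illustration; the transition table `delta` and its state names are hypothetical, and a missing entry in the table is treated as a halt.

```python
def run(delta, state, tape, steps, pos=0):
    """Simulate a Turing machine and trace the source of each reading.

    Returns a list of (stage, position, symbol_read, source), where source is
    the stage at which that square was last written, or None if the head is
    reading the square's initial token.
    """
    last_written = {}
    reads = []
    for stage in range(steps):
        symbol = tape.get(pos, 0)
        reads.append((stage, pos, symbol, last_written.get(pos)))
        key = (state, symbol)
        if key not in delta:  # no applicable transition: the machine halts
            break
        state, write, move = delta[key]
        tape[pos] = write
        last_written[pos] = stage
        pos += move
    return reads


# Hypothetical machine: write 1, step right, step back left, then halt.
delta = {
    ("A", 0): ("B", 1, +1),
    ("B", 0): ("C", 0, -1),
}
reads = run(delta, "A", {}, steps=10)
```

In this trace the reading at stage 2 has stage 0 as its source: the head interacts with the very token it produced on its last visit to square 0, and what happened at square 1 in between is irrelevant to the value read, as the defeasible conception of explanation suggests.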




One-Dimensional Billiards

In this example, we have two elastic, circular disks on a one-dimensional runway in Flatland (a two-dimensional universe). We will assume Newtonian mechanics with no friction and no gravity or other forces. The disks are each infinitesimal in diameter and have one unit of mass. There are situation-types of two kinds: velocity and position. Each type can take any real number as its value (positive or negative). The only event (besides the uninterrupted rolling of disks) that is possible is the event of collision, and this can happen at most once, since after a collision the distance between the two disks will increase forever. Disk-state tokens fall into four kinds: initial state tokens, pre-collision state tokens, collision tokens, and post-collision state tokens. An initial token can be identified with an ordered pair (p, v), where p and v are real numbers representing position and velocity, respectively. There are two kinds of collisions: head-on and rear-end. If the initial states of the disks are (p₁, v₁) and (p₂, v₂), then a collision will occur if (v₁ − v₂)(p₂ − p₁) > 0. If the value of v₁v₂ is negative, then the collision will be head-on. If v₁v₂ is positive, then the collision will be rear-end. (If either v₁ or v₂ is zero, then the collision is with a stationary disk, which I will count as both head-on and rear-end.) A collision can be represented as a tuple (p₁, v₁, p₂, v₂), where (p₁, v₁) and (p₂, v₂) are both initial-state tokens, and (v₁ − v₂)(p₂ − p₁) > 0. Whether the collision is head-on or rear-end, it occurs at the position where the two uniformly moving disks meet, (v₁p₂ − v₂p₁)/(v₁ − v₂). After the collision, the first disk assumes the velocity v₂, and the second disk assumes the velocity v₁.

A pre-collision state token can be represented as a quintuple (p₁, v₁, p₂, v₂, p₃), where (p₁, v₁) and (p₂, v₂) are initial state tokens, and p₃ is a position that the disk occupies after its initial state and before the eventual collision. The bounds on p₃ depend on the direction of the disk's motion and on whether the eventual collision is head-on or rear-end.

A post head-on collision token can be represented as a sextuple (p₁, v₁, p₂, v₂, p₃, v₃), where (p₁, v₁, p₂, v₂) is a head-on collision token, either v₃ = v₁ or v₃ = v₂, and p₃ is greater than the position of the collision if the difference between v₃ and the other velocity is positive, and less than the position of the collision if the difference between v₃ and the other velocity is negative.



A post rear-end collision token can be represented as a sextuple (p₁, v₁, p₂, v₂, p₃, v₃), where (p₁, v₁, p₂, v₂) is a rear-end collision token, either v₃ = v₁ or v₃ = v₂, and p₃ is greater than the position of the collision if the difference between v₃ and the other velocity is positive, and less than the position of the collision if the difference between v₃ and the other velocity is negative. Each state token has two atomic tokens as proper parts: one corresponding to the position of the disk, and the other to the velocity. If s is a state token, then (s, V) and (s, P) can represent these two atomic parts. A set of atomic tokens is coherent just in case every state token constituent of every atomic token can be fit into a physically possible pair of disk trajectories. The set of tokens consists of the sums of all coherent sets of atomic tokens. The causal priority relation can be defined as follows: Initial state tokens are not posterior to anything. Pre-collision state tokens are posterior to both their constituent initial state tokens, and to nothing else. Collision state tokens are posterior to both of their constituent initial state tokens, and to nothing else. Post-collision state tokens are posterior to their constituent collision state tokens, and, by transitive closure, to the constituent initial state tokens of these. I do not think that we should say that a pre-collision token is posterior to any of the earlier pre-collision tokens, nor that a post-collision token is posterior to any of the earlier post-collision tokens. The only tokens that need to have causal efficacy are the initial state tokens and the collision tokens. If we said that a pre-collision token was posterior to all earlier pre-collision tokens, then the causal priority relation would not be well founded in this model, since < on the real numbers is not well founded. 
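The kinematics of the two-disk setup can be sketched in a few lines of Python. The collision condition, the head-on and rear-end classification, and the exchange of velocities follow the description of the example; the time and position of the collision are computed here from the uniform motion of the two disks, which is an assumption of this sketch rather than a formula quoted from the text, and the function and key names are hypothetical.

```python
from fractions import Fraction


def collision(p1, v1, p2, v2):
    """Describe the collision of two equal-mass elastic disks, or return None.

    (p1, v1) and (p2, v2) are the initial positions and velocities.
    """
    if (v1 - v2) * (p2 - p1) <= 0:
        return None  # the disks never meet
    # Solve p1 + v1*t == p2 + v2*t for the time of contact.
    t = Fraction(p2 - p1) / Fraction(v1 - v2)
    position = p1 + v1 * t
    product = v1 * v2
    if product < 0:
        kind = "head-on"
    elif product > 0:
        kind = "rear-end"
    else:
        kind = "both"  # one disk is stationary
    return {
        "kind": kind,
        "time": t,
        "position": position,
        "velocities_after": (v2, v1),  # equal masses exchange velocities
    }


head_on = collision(0, 2, 10, -3)
rear_end = collision(0, 3, 6, 1)
no_contact = collision(0, 1, 10, 2)
```

Because the velocities are simply exchanged, the disks separate after contact and can never meet again, which is why a run of this setup contains at most one collision token.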
In this simple, two-disk setup, we would not need collision tokens at all, since pre-collision and post-collision tokens can be distinguished by comparing their positions and velocities with those of the constituent initial state tokens. However, in more complicated setups, in which multiple collisions and multiple reversals of direction are possible, collision tokens are essential to a correct representation of the causal structure. In this example, we can see the inflation of both causation and explanation as a result of the deterministic conception of causation, despite the fact that, as in the last example, the setup is entirely deterministic. The cause of any pre-collision token, for example, must include the initial states of both disks, since the initial state of the disk whose state is being specified is not a sufficient condition, by itself, of reaching the pre-collision state in question. We must add the fact that the other disk was far enough away and either slow enough to avoid an intervening collision or headed in the wrong direction. This clashes



with what is intuitively correct, namely that the initial state of the other disk is not causally connected to the pre-collision states of the first disk. Similarly, according to the deterministic conception, any causal explanation of a pre-collision state of one disk must include enough information about the initial state of the other disk to ensure that a collision has not in fact taken place. This inflation becomes far worse if we consider a setup in which the number of disks is indefinite. In order to support a network of causal connections under the deterministic conception, we must add negative initial state tokens, one for each position on the real line at which no disk is located in the original situation. This non-denumerable totality of tokens will be part of the cause of every noninitial token, and will be involved in every causal explanation, since without guaranteeing that no disk has been omitted, no condition involving any set of disks can provide a strictly sufficient condition for any subsequent state.


Verifying the Axioms of Chapter 3

Now that we have a well-defined language, complete with a semantic theory and logic, and a series of definitions of the various causal relations in terms of that language, we can go back to the intuitively plausible axioms of causation that I proposed in chapter 3 and see if they come out as valid, given our logic and our definitions. Here again is a list of the axioms from chapter 3:

Axiom 3.1
Axiom 3.2
Axiom 3.3
Axiom 3.4 [Irreflexivity of causation]
Axiom 3.5 [Right closure under part]
Axiom 3.6 [Transitivity]

The first three of these axioms are simply the familiar axioms of mereology. Since our canonical models interpret ⊑ by means of the subset relation between supersaturated sets of formulas, it is easy to verify that these three axioms are indeed logically valid.⁴ Since we are working in a four-valued logic, and since the causal relation > is not bivalent in every situation, we must replace the final three axioms with corresponding inference rules:
4. There is a complication in the case of Axiom 3.2: strictly speaking, it is logically valid only if the open formula φ is so constructed as to be bivalent in every situation. This restriction won't affect our uses of Axiom 3.2.

Irreflexivity of causation
Right closure under part
Transitivity


The irreflexivity of causation is an immediate consequence of the fact that a > b implies a ≺ b, and the causal priority relation ≺ has been stipulated to be irreflexive. In the following chapter, I will define ≺ as asymmetric necessitation, which will also be evidently irreflexive. Right closure under part is clearly valid, since whenever one token constrains the actuality of another, it always constrains the actuality of all of its parts. Finally, transitivity of causation follows from the transitivity of the causal priority relation ≺, together with the obvious transitivity of the necessitation relation.

An Indeterministic Model
5.1 Beyond Determinism

In the last chapter, we encountered a number of reasons for being dissatisfied with a deterministic conception of causation, both for token-causation and for causal explanation. The most important reason for such dissatisfaction is that an ideal definition of causation should not make causation impossible in an entirely indeterministic world. For all we know, the actual world is indeterministic through and through, and yet we can be reasonably confident that causation is a reality in our world. There are two more specific problems with a deterministic conception of token-causation.

1. The Modal Inseparability Problem. If causes necessitate their effects, and effects necessitate their causes, then a cause and its effect inhabit exactly the same worlds. It seems quite reasonable to suppose that the actual causes of a situation are essential to its identity: it wouldn't be the very token it is if its causes were different. At the same time, it seems reasonable to suppose that two situations that are modally inseparable are identical. When these two hypotheses are combined with a deterministic conception of token causation, we are forced into the absurdity that causes and effects are identical.

2. The Over-Generation of Causal Connections. As we saw in the last chapter, a deterministic conception of token-causation makes entirely quiescent situations causally efficacious, so long as they fill space and time that might be filled by interfering situations. In short, the deterministic conception cannot distinguish between quiescence and action.

It would be difficult to combine an indeterministic account of token-causation with a deterministic conception of causal explanation. We would have cases of causation without causal explanation, which would be odd, to say the least.



There are two additional reasons for being dissatisfied with the deterministic conception of causal explanation:

1. Inexpressibility and Inaccessibility of Sufficient Conditions. Even if type determinism were true, the types that are actually sufficient for some nontrivial explanandum might be inexpressible in human language or thought. They might involve infinite, or even non-denumerable, complexity. In addition, these conditions might be strongly inaccessible to observation or measurement, requiring, for example, absolute precision. Neither of these problems constitutes an insuperable objection to making reference to sufficient conditions in defining causal explanation, but they surely make it preferable to define explanation without reference to such conditions, if possible.

2. The Inflation of Causal Explanations. As in the case of token causation, a deterministic conception of causal explanation makes negative and highly probable conditions, such as the absence of UFO agency of a certain kind, an essential part of causal explanations of mundane occurrences. The deterministic conception cannot distinguish between the presence of favorable tendencies and the mere absence of contrary ones.


Why an Indeterministic Account Is Difficult

There are several desiderata for a theory of causation that are easily met under a deterministic conception, but far more difficult to satisfy when the assumption of the strict necessitation of effects is dropped. First of all, there are three purely formal properties that ought to fall naturally out of a theory of causation:

• The veridicality of token causation and causal explanation
• The transitivity of token causation
• The irreflexivity and asymmetry of the transitive closure of causal explanation (no explanatory loops)

In the deterministic account, the first two can be derived directly from the veridicality and transitivity of strict necessitation. In developing an indeterministic account, we must find a modal or statistical relation that supports veridicality and transitivity without necessitation. If, in addition, irreflexivity falls out of the account for free, all the better.

There are two cases that have posed serious problems for probabilistic theories of causation in the past. These cases are:

• Causes with negative statistical relevance to their effects
• Preemption of causation



The first case demonstrates that positive statistical significance is not necessary for causal connection. For instance, a particular form of surgery may reliably increase the chances of survival, yet in particular cases the surgery may kill. The second case demonstrates that positive statistical significance is not sufficient for causal connection. One event may raise the probability of some subsequent event without actually causing it, if some competing cause intervenes to break the causal connection between the would-be cause and its effect. Finally, there is an important material property of causation that is far from trivial to verify: the Markovian statistical independence property. A causal structure has this property if every effect is statistically independent, conditional on one of its causes, from anything screened off from it by that cause. When thinking about propositions or 'events' (in the statistical sense), this property is difficult to motivate and to secure. However, the ontological resources developed in the preceding chapters are adequate to the task.


If Not Determinism, Then What?

The most natural thing to do, once we have abandoned determinism, is to go probabilistic. We could require that a cause raise the probability of its effect above some fixed threshold (say 90%). We could require that the cause raise the probability of its effect from an infinitesimal to some finite probability. Or, we could simply require that the cause raise the probability of its effect, period.

All such probabilistic relations suffer from the affliction of non-monotonicity. That is, situation s might raise the probability of situation s', but the larger situation s ∪ s" might lower the probability of s'. Similarly, the type φ might raise the probability that the next event will be of type ψ, but the stronger type φ & χ might lower that probability. This non-monotonicity plays havoc both with the transitivity of the relation and with the Markov screening-off property.

The solution, I think, is to talk about robustly raising the probability of the effect. A situation s robustly raises the probability of s', relative to world w, just in case both s and any extension of s in w raise the probability of s'. Similarly, the pair (s : φ) robustly raises the probability that a ψ situation will follow (relative to w) when both φ and the conjunction of φ with any other type true of s raise the probability that a ψ situation will follow.

To implement this idea, we would have to introduce a probability measure into our model structures. I will do this in the following chapter. In this chapter, however, I want to introduce a qualitative analogue of probability instead, partly because it is somewhat simpler, and partly to establish connections between this theory and existing work on conditional logic (in both the Stalnaker and Ernest Adams traditions). Consequently, in this chapter I will make use of the partial conditional logic developed in appendix A, where φ □→ ψ is taken as representing a probability of ψ conditional on φ that is infinitely close to 1. As I discuss there, this conditional is a version of Morreau's fainthearted conditional (Morreau (1997)).
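The contrast between merely raising and robustly raising a probability can be illustrated with a small finite model. The following Python sketch is only an illustration under stated assumptions: the worlds, their weights, and the fact names ("drug", "allergy", "recover") are hypothetical, situations are modeled as sets of atomic facts, and an "extension of s in w" is taken to be any set of facts between s and w that leaves out the facts of the target situation itself.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy model: a world is a set of atomic facts; the weights sum to 1.
WORLDS = {
    frozenset({"drug", "recover"}): Fraction(6, 20),
    frozenset({"drug"}): Fraction(1, 20),
    frozenset({"drug", "allergy"}): Fraction(2, 20),
    frozenset({"recover"}): Fraction(4, 20),
    frozenset(): Fraction(5, 20),
    frozenset({"allergy"}): Fraction(1, 20),
    frozenset({"allergy", "recover"}): Fraction(1, 20),
}

def prob(facts):
    """Total weight of the worlds in which all the given facts hold."""
    return sum(p for w, p in WORLDS.items() if facts <= w)

def cond(target, given):
    """Conditional probability of the target facts, given some other facts."""
    return prob(target | given) / prob(given)

def raises(s, target):
    """s raises the probability of the target above its unconditional value."""
    return cond(target, s) > prob(target)

def extensions(s, w, target):
    """Every situation between s and w, leaving the target's own facts out."""
    extra = sorted(w - s - target)
    for k in range(len(extra) + 1):
        for combo in combinations(extra, k):
            yield s | frozenset(combo)

def robustly_raises(s, target, w):
    """s and every extension of s within world w raise the target's probability."""
    s, target, w = frozenset(s), frozenset(target), frozenset(w)
    return all(raises(t, target) for t in extensions(s, w, target))

# The drug raises the chance of recovery overall (2/3 against 11/20) ...
print(cond(frozenset({"recover"}), frozenset({"drug"})), prob(frozenset({"recover"})))
# ... robustly so in a world without the allergy, but not in a world with it.
print(robustly_raises({"drug"}, {"recover"}, {"drug", "recover"}))
print(robustly_raises({"drug"}, {"recover"}, {"drug", "allergy"}))
```

In this toy model the larger situation containing both the drug and the allergy lowers the probability of recovery, which is exactly the non-monotonic behavior that the robustness requirement is meant to exclude.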



A model M of situation logic with fainthearted conditionals consists of an n-tuple whose components are as follows:

• Sit is a nonempty set, the set of situation-tokens.
• Typ is a nonempty set of situation-types, closed under the various logical and modal connectives.
• ⊨ is a binary relation on Sit × Typ.
• ⊑ is a partial, antisymmetric ordering of Sit.
• The selection functions are functions from Sit × P(Sit) into P(Sit).

The class of atomic situation-types consists of types of the following forms:

• basic atomic types
• modal types
• mereological types
• classificatory types
• actuality types (As)
• conditional types

The first five of these forms are identical to forms used in the deterministic model developed in the last chapter. There are two changes: first, the conditional types are added to the list, and, second, the causal priority types are deleted.

One additional advantage to an indeterministic model is this: it becomes possible to give a definition of causal priority in terms of modal and mereological properties, making it no longer necessary to treat it as an unanalyzed primitive. As I have already argued in the last chapter, it seems natural to treat the causal antecedents of a token as essential to it. Consequently, I will assume that the actuality of a token strictly necessitates the actuality of all of its causal antecedents. In an indeterministic model, we will no longer assume that causes necessitate their effects. Consequently, it is possible to define causal priority in terms of asymmetric necessitation: effects necessitate their causes, and not vice versa. More precisely, every part of an effect necessitates the cause, but the cause necessitates no part of the effect. However, it remains necessary to distinguish the relation of causal priority from the relation of whole to part. I assume that the actuality of a token strictly necessitates the actuality of all of its parts. However, causes and effects must be separate existences, as Hume observed. Thus, we can define causal priority as follows:

Definition 5.1 (Causal Priority)
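The prose gloss on causal priority (every part of the effect necessitates the cause, the cause necessitates no part of the effect, and cause and effect share no part) can be checked mechanically in a finite model. The Python sketch below is one possible reading of that gloss, not a transcription of the formal definition: worlds are modeled as sets of actual tokens, and the token names, the part table, and the list of worlds are all hypothetical.

```python
# Hypothetical toy model. Each world is the set of tokens that are actual in it.
WORLDS = [
    frozenset(),
    frozenset({"c"}),                    # the cause occurs, the effect does not
    frozenset({"c", "e1"}),
    frozenset({"c", "e2"}),
    frozenset({"c", "e", "e1", "e2"}),   # the whole effect occurs
]

# Hypothetical part table: every token counts as a part of itself.
PARTS = {
    "c": {"c"},
    "e": {"e", "e1", "e2"},
    "e1": {"e1"},
    "e2": {"e2"},
}

def necessitates(a, b):
    """Token a strictly necessitates token b: b is actual wherever a is."""
    return all(b in w for w in WORLDS if a in w)

def causally_prior(cause, effect):
    """One reading of asymmetric necessitation between separate existences."""
    separate = not (PARTS[cause] & PARTS[effect])
    effect_needs_cause = all(necessitates(p, cause) for p in PARTS[effect])
    cause_is_free = not any(necessitates(cause, p) for p in PARTS[effect])
    return separate and effect_needs_cause and cause_is_free

print(causally_prior("c", "e"))   # the cause is prior to the effect
print(causally_prior("e", "c"))   # but not the other way around
print(causally_prior("e1", "e"))  # a part is not causally prior to its whole
```

The third call shows why the separateness clause matters: a part is necessitated by its whole, yet it overlaps the whole and so is not counted as causally prior to it.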




Causation and Causal Explanation

Token-Level Causation, without Determinism

I will insist only that a cause is quasi-sufficient for its effects. I will use the variably strict conditional, defined in the last section, to capture this condition. Token causation can be taken to imply either what must (with probability infinitely close to 1) follow, or what might (with finite probability) follow. I will define causation, >, in terms of what follows immediately. Causation in the ordinary sense will then be the transitive closure of immediate causation. An essential virtue of my account is that the same sort of probabilistic relationship holds in both the cases of mediate and immediate causation. The relation of immediate causal priority, ≺₀, was defined in the last chapter as follows:

We can now define token causation under conditions of indeterminism. I will give two definitions: one using strong probabilification, and the other weak. By strong probabilification, I mean that the probability of the effect conditional on the cause is robustly within an infinitesimal of 1. By weak probabilification, I mean that the probability of the effect conditional on the cause is robustly finite (not infinitely small). The first definition shall be the one that I make use of in the remainder of this project, but I wanted to note the existence of a weaker alternative that might well be appropriate in interpreting some of our talk of causation in natural language.
Definition 5.2 (Token Causation: Strong Probabilification)

Definition 5.3 (Token Causation: Weak Probabilification)


Redefining Causal Explanation

As in the case of token causation, causal explanation in the indeterministic model involves the property of robustness-in-the-circumstances. Roughly, a fact (s : φ) explains a fact (s' : ψ) just in case s is a token cause of s', and there is a causal constraint between φ and ψ that is not overridden by any other feature of s. As in the deterministic case, I begin by defining a minimal token of a given type.

Definition 5.4 (Minimal Tokens of a Type)

Causal constraint between types can then be defined in terms of the probabilistic conditional □→. A causal constraint holds between φ and ψ just in case the conditional probability that a given token is succeeded by a token of type ψ is infinitely close to 1, conditional on the given token's being of type φ.

Definition 5.5 (Causal Constraint between Types)

Finally, the causal explanation relation holds between (s : φ) and (s' : ψ) just in case s is a cause of s', and there is a causal constraint between φ and ψ that is not canceled by adding any actual property of any token prior to s'.

Definition 5.6 (Causal Explanation)


Desirable Features of the Theory

At this point, I would like to review the desiderata mentioned in the last chapter and check that the theory developed so far meets these goals.


Transitivity and Irreflexivity

There are several formal properties of causation that must be verified. First, I need to show that on the hypothesis of strong probabilification for immediate causal connection, strong probabilification also holds for mediate causal connection. Second, I must show that the same thing is true in the case of weak probabilification. Finally, I need to verify that mediate causal connection is irreflexive (and thus asymmetric). We can define mediate causation, >*, as the transitive closure of >.

Theorem 5.1 (Strong Probabilification by Mediate Causation) The Strong Probabilification Definition entails the following, on the condition that s is coherent and modally complete:

Proof: By induction. The base case is immediate. Assume M, s ⊨ (s₁ >* s₂) & (s₂ > s₃). It suffices to show that M, s ⊨ (As₁ □→ As₃). By inductive hypothesis, M, s ⊨ (As₁ □→ As₂). We also know that M, s ⊨ (As₂ □→ As₃). Given the logic of the □→ conditional as developed in appendix A, it is sufficient to show that M, s ⊨ (As₂ □→ As₁), since that conditional logic supports the rule (CSO): from (φ □→ ψ), (ψ □→ φ), and (φ □→ χ), infer (ψ □→ χ).

That M, s ⊨ (As₂ □→ As₁) follows immediately from our assumptions about the identity conditions of situations: since s₁ > s₂, s₁ is causally prior to s₂, and so M, s ⊨ □(As₂ → As₁). □(As₂ → As₁) logically entails (As₂ □→ As₁). The coherency and completeness of s are needed to ensure that we can employ classical conditional logic in deriving (As₁ □→ As₃) from (As₁ □→ As₂) and □(As₃ → As₂). Q.E.D.

Theorem 5.2 (Weak Probabilification by Mediate Causation) The Weak Probabilification Definition entails the following, on the condition that s is coherent and modally complete:

Proof. The proof is similar to that of the last theorem. Once again, the coherency and completeness of s are needed to ensure that we can employ classical conditional logic in the derivation.

Theorem 5.3 (Irreflexivity of Causation)

This theorem is an immediate consequence of the fact that the relation of causal priority is a strict partial ordering.
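The relationship between immediate and mediate causation can be made concrete with a short computation. The Python sketch below forms the transitive closure of a finite relation and tests it for irreflexivity; the token names are hypothetical, and the finite relation merely stands in for the immediate causation relation of a model.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given set of pairs."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

def is_irreflexive(relation):
    """No token stands in the relation to itself."""
    return all(a != b for a, b in relation)

# Hypothetical immediate causal links among three tokens.
immediate = {("s1", "s2"), ("s2", "s3")}
mediate = transitive_closure(immediate)
print(sorted(mediate))
print(is_irreflexive(mediate))

# Adding a link back from s3 to s1 would create an explanatory loop.
looped = transitive_closure(immediate | {("s3", "s1")})
print(is_irreflexive(looped))
```

A transitive relation that is irreflexive is automatically asymmetric, which is why checking the closure for irreflexivity is enough to rule out causal loops.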


Paradoxes of Causation and Statistics

Theories of causation in probabilistic and other indeterministic settings often run aground on two test cases: causes with negative statistical relevance to their effects, and events that would have caused some effect, but were preempted by some other cause of the same kind of effect.

Causes with negative statistical relevance

A simple example of the first case would be an instance of surgery that killed the patient, even though the probability of short-term survival is increased by the occurrence of the surgery. Thus, the surgery event lowered the probability of death and yet caused the particular death in question. Let s₁ be the token-event of the surgery, s₂ the token-event of the patient's subsequent death, φ the event-type of the form of surgery performed, and ψ the event-type of death. By hypothesis, we have that Pr(ψ/φ) < Pr(ψ/¬φ). Put qualitatively, we may assume that we have both (φ □→ ¬ψ) and ¬(¬φ □→ ¬ψ). Expressed in terms of tokens, we could say that Pr(As₂/As₁) is quite low, or, in qualitative terms, that we have (s₁ □→ s₄), where s₄ precludes s₂ (s₄ might be a possible situation in which the patient survives). In such cases, some unusual event occurred, such as an unusual condition of the patient, which made the surgery especially difficult or dangerous, or some unusual error or oversight on the part of the surgeon. Let us call this unusual event s₃, and its type, χ. We have then:

In other words, given the existence of s₃, the occurrence of s₁ did raise the probability of death. In this case, we cannot say that s₁ was a total cause of s₂, that is, we cannot assert (s₁ > s₂). Instead, we have that s₁ and s₃ are both essential parts of some total cause of s₂. This means that we can assert that, under the circumstances, s₁ was an INUS cause of s₂, i.e., (s₁ ~> s₂), despite the fact that s₁ has, by itself, negative statistical relevance to s₂.

Preemption

Suppose that a medication is given to a patient that significantly raises the probability that the patient will recover within seven days, and raises that probability robustly. However, the patient's own immune system overwhelms the infection before the medication begins to take effect. In this case, the medication would have become a cause of the recovery but was prevented from doing so by the preemptive action of the patient's own system. Let s₁ be the event of the administration of the medication and φ be its type, and let s₂ be the subsequent recovery, with its characteristic type ψ. We are to assume that the probability of ψ given φ is high, and robustly so (i.e., there is no actual situation s' of type χ such that Pr(ψ/φ & χ) is low). These facts, however, are not enough to give us the conclusion that s₁ > s₂. The missing element here is a series of tokens linking s₁ with s₂. These connections are, by hypothesis, lacking: s₂ is connected to a series of prior situations involving the patient's immune system. The occurrence of s₁ is simply, in these circumstances, irrelevant to the occurrence of s₂. In cases in which situations of type φ (medication) actually cause a situation of type ψ (recovery), there are various intermediate situations (for example, situations of the killing of germs by the medication's active ingredients) that are lacking in the case in question.

In The Cement of the Universe (Mackie (1974)), J. L. Mackie discusses a number of other examples of preemption. One of the most interesting concerns a man who sets off across the desert. The man has two mortal enemies, one of whom poisoned his reserve can of water, and the second of whom punctured that same can. The water in the can runs out before the man has a chance to drink it, and he dies of thirst. Which, if either, of the two enemies killed him?



This example illustrates the inadequacy of the counterfactual theory, or of any theory of necessity-in-the-circumstances in which "necessity" is understood in modal rather than mereological terms. Clearly, the puncture of the can caused the man's death, by causing his dehydration (the immediate cause of the death). Nonetheless, had the can not been punctured, the man would have died anyway, and perhaps even sooner. What is crucial about this example is that if we excise the event of the poisoning, we have a process that is robustly sufficient to ensure the man's death, and the puncturing of the can is an indispensable part of this process. The poisoning turns out to be a preempted, merely counterfactual cause of the death. Consider the following variation. Suppose that the first enemy, instead of poisoning the can, emptied it. Or, suppose he emptied the man's source of water before the can was filled, replacing the water with hydrogen peroxide. In these cases, I don't think we can say that the puncture caused the death. Instead, it was the earlier elimination of water from the series of events that killed the man. In the second case, the puncture might obscure from the victim the fact that he had set out without sufficient water, but it did not cause this state of affairs.


Genuine Overdetermination

Suppose that a man dies at the hands of a firing squad, with bullets simultaneously hitting several vital organs. In this case, the bullets are severally and jointly causes of the death. Each bullet wound is a sufficient condition, causally prior to the death. In each case, the firing of the bullet is an indispensable part of the sufficient condition. Hence each firing is a total cause of the death. This is an example that the counterfactual and necessity-in-the-circumstances accounts get wrong. None of the bullets is necessary in the circumstances, since the man would have died anyway. Only the entire volley counts as a cause, according to the counterfactual theory. This erroneous conclusion arises from confusing mereological indispensability with modal necessity.


Preemption by Trumping

In recent work on the counterfactual theory of causation, Jonathan Schaffer (2000) has created an interesting variant on the idea of preemption: preemption by trumping. Schaffer gives an example of two wizards who simultaneously cast a spell, turning, say, a prince into a frog. Wizard A uses a more powerful form of magic than Wizard B: whenever the two cast spells with conflicting implications, Wizard A's spell always wins out. Schaffer argues, convincingly to me, that in the case in which both cast the same spell, it is Wizard A's spell, and not Wizard B's, that causes the transformation. Wizard B's spell is preempted, not by an earlier cause, but by a trumping cause. We can imagine non-magical examples as well. A major and a sergeant simultaneously shout "Charge!" to a private under their command. The private is disposed to obey orders from both officers, but whenever there is a conflict, he obeys the major rather than the sergeant. When both officers give the same order simultaneously, the major's order trumps the sergeant's.

This sort of example is easily handled by the indeterministic model of causation developed in this chapter. We can suppose that there are defeasible conditionals linking both orders to the state in which they are carried out. The presence of the major giving orders is a defeater to the conditional linking the sergeant's order with its fulfillment. Hence, the sergeant's order by itself is not a robustly sufficient condition under the circumstances. To restore the defeasible conclusion, we would have to add the content of the major's orders. However, the situation containing both orders is not a minimal cause of the private's response: the major's order by itself is a robustly sufficient condition. Hence, the major's order is an INUS cause of the private's response, while the sergeant's order is not.
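The reasoning about the major and the sergeant can be traced step by step in a small model of defeasible expectation. The Python sketch below is a deliberately crude illustration and rests on several assumptions that are not part of the text: the three fact labels are hypothetical, the function `expected_response` stands in for the defeasible conditionals, and a condition counts as robustly sufficient when every extension of it by further actual facts still yields the expected response.

```python
from itertools import combinations

# Hypothetical fact labels for the actual situation.
SGT = "sergeant orders charge"
MAJ_SPEAKS = "major is giving an order"
MAJ = "major orders charge"
ACTUAL = frozenset({SGT, MAJ_SPEAKS, MAJ})

def expected_response(facts):
    """What defeasibly follows about the private, given only these facts."""
    if MAJ in facts:
        return "charge"   # the content of the major's order is known
    if MAJ_SPEAKS in facts:
        return None       # defeater: the sergeant's order may be trumped
    if SGT in facts:
        return "charge"
    return None

def subsets(items):
    """All subsets of a collection of facts, smallest first."""
    items = sorted(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def robustly_sufficient(condition, outcome):
    """Every extension of the condition by actual facts still yields the outcome."""
    return all(
        expected_response(condition | extra) == outcome
        for extra in subsets(ACTUAL - condition)
    )

def minimal_robust_conditions(outcome):
    """Robustly sufficient conditions with no robustly sufficient proper part."""
    robust = [c for c in subsets(ACTUAL) if c and robustly_sufficient(c, outcome)]
    return [c for c in robust if not any(other < c for other in robust)]

minimal = minimal_robust_conditions("charge")
inus_causes = set().union(*minimal)
print(minimal)
print(SGT in inus_causes, MAJ in inus_causes)
```

On these assumptions the only minimal robustly sufficient condition is the major's order, so the sergeant's order belongs to no minimal total cause, matching the verdict reached in the text.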


Negative Causation Revisited

As we saw in the last chapter, the deterministic model leads to an inflation of causes, in particular, to an inflation of negative facts as causes. The absence of every potential preventer of an event gets counted as an actual cause of that event. Moving to an indeterministic model enables us to avoid this inflation. However, there will still remain negative causes, even under the indeterministic model. The difference is this: not every absence of a potential preventer gets counted. Rather, there must exist some actual condition that would, were it not for some unusual absence, prevent the effect. Consider again the example of the terrorists who cause a midair collision by causing the absence of the air traffic controllers from their posts. On the deterministic model, we would have to count the absence from the control tower of any person who could have prevented the collision. On the indeterministic model, we must first find an actual defeater to the connection between the flight paths of the airliners and their subsequent collision. There is in fact such a defeater: the existence of the air traffic control system at the airport, a kind of institutional fact about that airport. Thus, the flight paths alone are not a robustly sufficient condition: taking into account the existence of the air traffic control system would lead to the expectation that a collision will not occur. To obtain a true total cause of the collision, we must find a defeater of this defeater. The actions of the terrorists, interfering with the normal operation of the air traffic control system by locking the controllers in a closet, would qualify. Thus, the absence of the controllers from their posts is part of a minimal total cause of the collision. The absence from the tower of some other person who might have prevented the collision does not so qualify, and so is not an INUS cause of the collision. 
The definition of causation that I gave in section 5.4 requires some modification if it is to apply correctly to preventions. Consider, for example, the following sort of case. A baseball is hit toward the right field fence. The "fence" is actually a high, thick wall of concrete. Just beyond the fence, in a position that would cross the flight of the ball in the absence of the fence, there is a fragile window. The right fielder catches the ball before it hits the fence. Did the fielder prevent the ball from breaking the window? If we apply the definition in 5.4, we get the result that the fielder's catch is an essential part of a quasi-sufficient condition of the ball's not breaking the window. Hence, we seem compelled to say, contrary to our clear intuitions, that the fielder's catch caused the window not to be broken.

In order to handle this sort of case, we must modify the definition of causation in the case of preventions, that is, when the potential effect is an absence (like the non-breaking of a window). In such cases, we must add a new condition: in the absence of the "cause," there must exist a condition that would have been robustly sufficient for the prevented situation. Because of the presence of the fence, there was no such robustly sufficient condition of the breaking of the window. Hence, the fielder's catch should not be counted as a cause of the absence of the breaking. More precisely, we can say that s is a total cause of s', where s' is a negative situation (either a pure absence or the absence of a change), if and only if:

It is the second condition that is new. It is not enough for s to be a robustly sufficient condition of s': it must be the case that there is another situation, s", which would, but for s, be a robustly sufficient condition for the non-actuality of s', that is, of the positive condition or change of which s' is the absence.


Example Applications

I started the chapter arguing that the deterministic conception of causation over-generated causal connections and causal explanations, and I used several examples from the previous chapter to make this point, in particular, the cases of the Turing machine and of one-dimensional billiards. At this point, I would like to re-analyze these examples within the indeterministic model.


Turing Machines

In the immediately preceding chapter, we saw that the deterministic model of explanation resulted in explanation inflation: in order to explain the persistence of a value from one stage to a later stage, we had to introduce into the explanans enough information about the position of the head and the values of other squares to guarantee that the head not return to the square in question during the intervening period. Using an indeterministic conception of causation, we can avoid this result.



For example, let us impose a hyperreal-valued probability function on the total runs of the machine in such a way that:

• If the head never returns to the same square twice, the run is given a finite probability.
• If the head returns to some square once, but to no square more than once, the run is given a probability of ε, where ε is an infinitesimal.
• If the head returns to some square n times, but to no square more than n times, then the run is given a probability of the order of εⁿ.

Suppose square n has been visited by the head i times by stage a, and that its value at stage a is j. The probability that it will return for an (i + 1)th visit, conditional on the past history of square n, is infinitesimal. Thus, the probability, for any m, that square n will have the value j at stage a + m, given that it has that value at stage a, is infinitely close to 1. Hence, in cases in which in fact the head does not return to square n during the period between a and a + m, we can explain the value of the square at stage a + m by means of its value at stage a alone.
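The ordering of runs by powers of ε can be computed without any hyperreal arithmetic, since only the exponent matters. The Python sketch below is an illustration under stated assumptions: a run is represented by a hypothetical list of head positions, one per stage, the head is assumed to move at every stage, and each repeat occurrence of a square in the list counts as one return.

```python
from collections import Counter

def epsilon_order(head_positions):
    """Exponent n such that the run's probability is of the order of eps**n.

    The exponent is the largest number of returns to any single square,
    where returns = visits - 1. Order 0 means a finite probability.
    """
    visits = Counter(head_positions)
    return max((count - 1 for count in visits.values()), default=0)

def relative_order(run, alternative):
    """Exponent of eps in the ratio of the two runs' probability orders."""
    return epsilon_order(run) - epsilon_order(alternative)

no_return = [0, 1, 2, 3, 4]    # the head never revisits a square
one_return = [0, 1, 2, 1, 0]   # squares 0 and 1 are each revisited once
two_returns = [0, 1, 0, 1, 0]  # square 0 is revisited twice

print(epsilon_order(no_return), epsilon_order(one_return), epsilon_order(two_returns))
# A run with a return is smaller by at least one order of eps than one without.
print(relative_order(one_return, no_return))
```

A positive relative order means that the first run is infinitely less probable than the second, which is the sense in which a further visit to a square is infinitely improbable given the history so far.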


One-Dimensional Billiards

In the example of one-dimensional billiards, the deterministic conception of causation forced us to include the initial state of one disk as a part of the cause of any of the pre-collision states of the other disk. This involved an overgeneration of causal connections, since, intuitively, the initial state of the first disk is causally connected to the states of the second disk only after a collision event. In order to apply the indeterministic conception of causation to this example, we must stipulatively define an appropriate probability assignment to possible total histories. Let us assign a finite probability to a history just in case no collisions occur in that history. If we are dealing with a setup in which there are more than two disks, and thus the potential for more than one collision, we can stipulate that the probability of n + 1 collisions is infinitely smaller than the probability of n collisions, for every n. Given these stipulations, it is possible to describe the setup in such a way that the only cause of the pre-collision states of a disk is the initial state of the disk, since the probability of a collision is infinitesimal. More generally, the causes of the state of a disk will include the initial state of that disk and of any disk that has collided with that disk, and the initial state of any disk that has collided with any of those disks, etc. Initial states of disks that have not yet interacted with the disk in question, either directly or indirectly, need not be counted as part of the cause of that disk's present state.
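The recursive description of which initial states count as causes of a disk's present state amounts to a closure over the history of collisions. The Python sketch below is a minimal illustration: the collision records are hypothetical, disks are numbered from 0, and the sketch additionally assumes that influence passes from one disk to another only through collisions that occur before the time in question, processed in temporal order.

```python
def initial_state_causes(num_disks, collisions, time):
    """Map each disk to the disks whose initial states bear on its state at `time`.

    `collisions` is a list of (time, disk, disk) records.
    """
    sources = {d: {d} for d in range(num_disks)}
    for t, i, j in sorted(collisions):
        if t >= time:
            break
        # After a collision, each disk inherits the other's causal ancestry.
        merged = sources[i] | sources[j]
        sources[i], sources[j] = set(merged), set(merged)
    return sources

# Hypothetical history: disk 0 hits disk 1, then disk 1 hits disk 2.
collisions = [(1.0, 0, 1), (2.0, 1, 2)]
late = initial_state_causes(4, collisions, time=3.0)
early = initial_state_causes(4, collisions, time=1.5)

print(late[2])   # disk 2 now depends on disks 0, 1, and 2
print(late[3])   # disk 3 has interacted with nothing
print(early[2])  # before its first collision, disk 2 depends on itself alone
```

Disks that have not yet interacted, directly or indirectly, with the disk in question never enter its set, which is the result the indeterministic conception is meant to deliver.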




Mackie's Slot Machines

I would like to conclude this chapter by examining the indeterministic slot machines L and M, introduced by J. L. Mackie in his book The Cement of the Universe (Mackie (1974)). In slot machine L, the insertion of a shilling coin is necessary, but not sufficient, for the production of a chocolate bar. In machine M, the insertion is sufficient, but not necessary. In the case of machine L, it is clear that every coin-insertion token is causally prior to the subsequent chocolate-bar-output token, if there is one. If we assume that the insertion of the coin raises the probability of the chocolate bar's being produced to a finite level, then this example conforms to the weak probabilification definition of token causation. Since machine L seems a clear case of causation, as Mackie asserts, this example provides some reason for thinking that only weak probabilification is necessary for causation. In the case of machine M, the insertion of the coin is sufficient for the production of a chocolate bar, but sometimes a bar is produced spontaneously. Suppose s is a particular token of coin insertion, and suppose that it is immediately followed by an event s' of chocolate-bar production. If s is causally prior to s', then clearly s is a cause of s', since the existence of s' is necessitated by the existence of s in this case. However, it is not clear that this relation of causal priority actually holds. Suppose there was a probability of p that a chocolate bar would be produced spontaneously at the very time of the occurrence of s'. It would appear that the probability that s' is actually posterior to s is only 1 − p, with a preemption of the coin-bar connection occurring with a probability p. Depending on how the details of the example are filled in, it may or may not be possible to decide empirically whether the causal connection to s is actual or preempted.


A Probabilistic Model of Causation

In this chapter, I will develop a quantitative, probabilistic model of causation, building on the work done in the preceding two chapters. As before, the desiderata for the model include securing the transitivity of causation and the statistical independence properties associated with the Markov properties. At the same time, I want to faithfully represent the full complexity of the relationship between causation and probability, including the possibility of causes with negative statistical correlation to their effects, the possibility of independent overdetermination, and the possibility of cases of the preemption of causality, in which positive correlation is not sufficient for causal connection. To keep things as simple as possible, I will assume that all situation-tokens in all models of probabilistic causation are probabilistically complete. I will include a set W of coherent and complete situation-tokens that will constitute the possible worlds of the model, and I will assume that to each token is assigned a standard probability function over the set of worlds. I will not try to accommodate modal or stochastic partiality in this chapter.



A probabilistic model $\mathcal{M}$ shall consist of a tuple $\langle Sit, W, Typ, \models, \sqsubseteq, \mu \rangle$, where:

Sit is a nonempty set, the set of situation-tokens.
W is a nonempty subset of Sit, representing the possible worlds.
Typ is a nonempty set of situation-types, closed under the Boolean operators $\vee$ and $\neg$.
$\models$ is a binary relation on $Sit \times Typ$.
$\sqsubseteq$ is a partial, antisymmetric ordering of Sit.
$\mu$ is a function from X into the interval [0, 1], where X is a set of subsets of W. The set X, the set of measurable events, must be closed under finite union, intersection, and complement.

For simplicity's sake, I will assume that W is finite, and that $X = \mathcal{P}(W)$. The measure function $\mu$ can then be extended to any set of worlds A by stipulating that:

The measure function $\mu$ can be used to define a conditional probability function Pr defined on all pairs of non-empty subsets of Sit, where:

I will use Pr(s'/s) to abbreviate Pr({s'}/{s}), and Pr(s''/s,s') to abbreviate Pr({s''}/{s,s'}).
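The displayed formulas for the measure and for Pr are not legible in this copy, so the following is only a minimal sketch of one standard reading, not the author's own definition. It assumes that the measure of a set of worlds is the sum of the weights of its members, and that conditional probability is the usual ratio. It also simplifies by conditioning on sets of worlds rather than on sets of arbitrary situation-tokens. The world names and weights are hypothetical.

```python
from fractions import Fraction

# Hypothetical finite set of worlds with illustrative weights (not from the text).
WEIGHTS = {"w1": Fraction(1, 2), "w2": Fraction(1, 3), "w3": Fraction(1, 6)}


def mu(event):
    """Measure of a set of worlds, assuming additivity over single worlds."""
    return sum((WEIGHTS[w] for w in event), Fraction(0))


def pr(a, b):
    """Conditional probability of a given b, assuming the standard ratio definition."""
    a, b = set(a), set(b)
    if mu(b) == 0:
        raise ValueError("conditioning event has measure zero")
    return mu(a & b) / mu(b)


if __name__ == "__main__":
    print(mu(WEIGHTS))                # total measure of W
    print(pr({"w1"}, {"w1", "w2"}))   # chance of w1 given that w1 or w2 obtains
```

Exact fractions are used so that the total measure of W comes out as exactly 1, which makes it easy to check that a hand-built model is a genuine probability measure.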


Token Causation

I will take the definitions of token constraint and token causation from the last chapter and simply introduce a new parameter r, representing the probabilistic strength of the connection. Definition 6.1 (Weighted Causal Constraint)

Definition 6.2 (Weighted Token Causation)


Weighted Causal Constraints on Types

I will now extend the definitions of causal constraints on types to the probabilistic setting. Generalized weighted constraints between types can be defined in terms of objective probability. Definition 6.3 (Weighted Causal Constraints)




Probabilistic Explanation

I will define explanation as a form of robust objective chance. Definition 6.4 (Immediate Probabilistic Explanation) $(s_1 : \phi)$ immediately explains $(s_2 : \psi)$ to degree r if and only if:



In this section, I will illustrate how this model can be applied to several well-known puzzles.


Risky Surgery

A risky form of heart surgery, S, is known to be fatal in some circumstances and is performed only when the chances of imminent death are very high. Overall, the surgery increases the chances of the patient's imminent survival. The difficulty here is to give a probabilistic explanation of a case in which the surgery was fatal, despite the fact that the surgery decreases the probability of imminent death. Let us suppose that the surgery is uniformly performed in circumstances in which the probability of death would otherwise be 75%, and that the probability of imminent death, given the performance of the surgery, is 50%. There are a number of ways in which this could be done. First, it could be that there are certain conditions of the patient that cannot be determined until the surgery is begun, but that, in combination with the surgery, greatly increase the probability of death. Call such a condition C. Suppose that the probability of death given C and no surgery is 80%, and that the probability of death given the conjunction of C and surgery is 90%. Let us suppose that this connection is robust in this case: there are no other actual factors that would, in conjunction with C and S, lower the probability of imminent death below 90%. In such a case, the conjunction C & S explains, to degree 0.9, the subsequent death of the patient. When S is an indispensable component of this explanation, it does cause the death, despite the fact that S alone would support a probability of death of only 50%.
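The robustness and indispensability checks in this example can be sketched in code. The sketch below is a simplified reading, not the formal definition: it assumes that a chance of death is attached to each set of present factors, that the robust degree of a set of factors is the lowest chance found over all of its extensions by actual factors, and that a factor is indispensable when dropping it lowers that degree. The four figures come from the example; the function names are hypothetical.

```python
from itertools import combinations

# Chance of imminent death given which factors are present (figures from the example).
# "S" is the surgery, "C" is the hidden condition of the patient.
CHANCE_OF_DEATH = {
    frozenset(): 0.75,
    frozenset({"S"}): 0.50,
    frozenset({"C"}): 0.80,
    frozenset({"C", "S"}): 0.90,
}

ACTUAL = frozenset({"C", "S"})  # the factors actually present in the fatal case


def extensions(base, actual):
    """Yield every factor set that contains `base` and lies within `actual`."""
    extra = sorted(actual - base)
    for k in range(len(extra) + 1):
        for combo in combinations(extra, k):
            yield base | frozenset(combo)


def robust_degree(base, actual):
    """Lowest chance of death over every actual extension of `base`."""
    return min(CHANCE_OF_DEATH[s] for s in extensions(base, actual))


def is_indispensable(factor, base, actual):
    """True if dropping `factor` from `base` lowers the robustly supported chance."""
    return robust_degree(base - {factor}, actual) < robust_degree(base, actual)


if __name__ == "__main__":
    base = frozenset({"C", "S"})
    print(robust_degree(base, ACTUAL))          # degree of the explanation by C & S
    print(is_indispensable("S", base, ACTUAL))  # whether the surgery is needed for it
```

On these figures, C & S supports the death to degree 0.9, while C without S supports only 0.8, which is why the surgery counts as an indispensable part of the explanation even though S alone supports only 0.5.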


The Pill and Thrombosis

Another example of a paradoxical situation is provided by the relationship between the use of the birth control pill and the occurrence of thrombosis. There is some direct effect of the pill on the clotting mechanism, increasing somewhat
the chances of thrombosis. However, the pill prevents pregnancy, and pregnancy has an even greater effect on clotting, leading to an even greater chance of thrombosis. The difficulty lies in providing a probabilistic explanation of the onset of thrombosis in terms of the use of the pill in a particular case, despite the fact that the use of the pill lowers the probability of thrombosis by lowering the probability of pregnancy. In this case, there is some intermediate factor between the use of the pill and the onset of thrombosis; however, this is not essential to the example. So, let's suppose that the pill's causation of thrombosis is direct, and its suppression of thrombosis is indirect (via the prevention of pregnancy). Let us say that the probability of thrombosis, in the absence of some causal factor, is negligible. Let's suppose that the probability of thrombosis given the pill is 1%, that the probability of thrombosis given pregnancy is 2%, and that the pill is 100% effective in preventing pregnancy, which otherwise, in the circumstances, has a 10% probability of occurring. In a case in which the pill causes thrombosis, we can give a probabilistic explanation of this fact in terms of the use of the pill: the use of the pill explains the onset of thrombosis to degree 0.01. This connection is robust: there are no other factors we can add that would bring the probability below 1%. The use of the pill is an indispensable part of this explanation, since the probability of the occurrence of thrombosis, in the absence of any causal factor, is negligible. It is also true that in cases in which the pill prevents thrombosis through preventing pregnancy, we can also give a robust probabilistic explanation of this fact. 
The use of the pill explains the non-occurrence of pregnancy to degree 1 (since we have assumed, unrealistically, that the pill never fails), and the non-occurrence of pregnancy, together with relevant facts about the patient's sexual activity, explains the non-occurrence of thrombosis to degree 99%. What makes the use of the pill indispensable in this case is the requirement of robustness. We could, without reference to the pill, find actual conditions that made the non-occurrence of thrombosis much more likely than 99%. These conditions would not constitute an explanation of this particular case of the non-occurrence of thrombosis, because of a failure of robustness. The addition of actual tokens, supporting the occurrence of sexual activity, would bring the probability of non-occurrence of thrombosis below 99%. Thus, the use of the pill is an indispensable part of a robust explanation of this particular case of the prevention of thrombosis.


A Hole in One the Hard Way

Deborah Rosen (as cited by Patrick Suppes (Suppes, 1984, p. 41)) created another example of causation with negative statistical relevance: making a hole in one "the hard way." A golfer slices a shot badly; the ball strikes a tree limb and is deflected directly into the hole. Hitting the ball as badly as the golfer did significantly lowers the probability of a hole in one, as compared to making a competent stroke in the first place. Hitting the tree limb, let's say, also lowers the probability of the hole in one. Nonetheless, in this particular case, the golfer's swing and the ball's striking the limb are clearly both causes of the hole in one. We have three situations: $s_1$, the golfer's bad stroke (supporting type $\phi$); $s_3$, the ball's hitting the limb; and $s_2$, the ball's falling into the hole (type $\psi$). The intermediate situation-token $s_3$ supports a number of types, of varying degrees of specificity. There is the type $\chi$, which gives us the bare information that the ball struck the limb, and, at the opposite extreme, there is the type $\rho$, which includes complete information about the velocity and mass of the ball and the shape and elasticity of the tree limb. At the level of tokens, we can be confident that there is some situation $s_4$, containing $s_1$ as a part, together with information about the wind, the ball, and so on, such that we have, for some finite r, the weighted causal constraint $s_4 \sim_r s_3$. In turn, there is some larger situation $s_5$, of which $s_3$ is an essential part, and some finite q such that we have $s_5 \sim_q s_2$. The information in $s_1$ (the swing) and in $s_3$ (the collision with the tree branch) is surely essential to robustly supporting the objective chances of r and q respectively. Hence, both $s_1$ and $s_3$ are INUS causes of the hole in one. Even if r and q are quite small, it seems clear that, in the absence of information about the swing, the objective chance of hitting the limb would be much lower (perhaps zero, since golf balls don't spontaneously leap across the course), and similarly, in the absence of information about the collision with the limb, the objective chance of the hole in one (given the slice) would be much smaller than q. At the level of types, we know that the type $\phi$ (the bad slice) is negatively correlated with the type of a hole in one ($\psi$). We are also supposing that the relatively underspecified type of a collision with the tree limb ($\chi$) is also negatively correlated with the hole in one.
Nonetheless, there is a weighted causal constraint linking the stroke-type $\phi$ with the very specific collision-type $\rho$, and another weighted constraint linking $\rho$ with the hole-in-one type $\psi$. Thus, we do have immediate causal explanations linking $\phi$ to $\rho$ and $\rho$ to $\psi$. The bad swing-type, $\phi$, can therefore figure in a mediate causal explanation of the hole in one, despite the lack of a positive statistical relationship between $\phi$ and $\psi$. It is important to bear in mind that causation at the token level is a precondition of fact/fact causation, or causal explanation at the type level. If there were no intermediate situation $s_3$ actually instantiating type $\rho$, the merely generic links between $\phi$ and $\rho$ and between $\rho$ and $\psi$ would have nothing to do with providing an explanation of the fact that $s_2$ instantiates $\psi$.


Mishap at Reichenbach Falls

I. J. Good's example (as cited by Hitchcock (1995)) of a mishap at Reichenbach Falls involving Sherlock Holmes, Dr. Watson, and Professor Moriarty is of a very similar structure to that of the Rosen hole in one. In this case, Watson sees Moriarty about to push a boulder in a direction that will almost certainly result in its crushing Holmes. Watson preempts Moriarty by pushing the boulder in another direction, thereby lowering the probability of Holmes's death. Unfortunately, the boulder takes an unlikely path down the falls, crushing Holmes despite Watson's good intentions. Once again, we have a cause (Watson's pushing of the boulder) which actually lowered the probability of its effect (Holmes's death). For simplicity's sake, I will ignore in this case the intermediate events and look simply at $s_1$, Watson's pushing of the boulder, and $s_2$, Holmes's unfortunate death. The situation $s_1$ is an essential part of some larger situation $s_3$, supporting information about the shape and mass of the boulder and the contour of the cliff side. There is some objective chance q linking $s_3$ to $s_2$. We are assuming that q is quite small, certainly smaller than the probability of Holmes's death had Moriarty pushed the boulder. Nonetheless, q is finite, and the inclusion of Watson's push ($s_1$) in the total cause $s_3$ is surely essential. Given only the character of the boulder and the contour of the mountain, the objective chance of Holmes's death is much lower even than q. On my account, we do not look at what would have happened in the absence of the cause. The presence and intentions of Moriarty are irrelevant to the question of whether Watson's push was an INUS cause of Holmes's death. We evaluate the contribution of Watson's push mereologically, not counterfactually. We see if we can find a situation that does not include Watson's push that supports an objective chance as high as that supported by situations that do include Watson's push.


Cartwright's Poison Oak Defoliant

Nancy Cartwright constructed the following example. A gardener had to choose whether to buy a defoliant that is 99% effective or a cheaper one that is only 90% effective. She chose the cheaper defoliant and sprayed a patch of poison oak with it. The poison oak survived. Was the spraying of the poison oak with the cheaper defoliant a cause of its survival? In my view, the answer is clearly "No," and this is the result we obtain by applying my model. There is some situation involving the hardiness of the poison oak that supported an objective chance of the plant's survival of at least 10%. This situation need not include any information about the defoliant. Conversely, there is no situation robustly supporting a higher chance of survival that contains the spraying event. It is true that the gardener's buying the cheaper defoliant was a cause of the plant's survival, since it was presumably her buying of the cheaper defoliant that caused her not to buy and use the more expensive one. If we assume that there was at some point a chance of the gardener's using the more effective spray, and that the choice to buy the cheaper spray precluded that use, then the decision to buy the cheaper one was clearly an INUS cause of the plant's survival. However, the gardener's use of the cheaper spray was not a cause. The plant survived despite, and not because of, that spraying.
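The mereological test used in this example (and in the Reichenbach Falls case) compares situations that contain a given fact with situations that leave it out. The following is a rough sketch of that comparison under simplifying assumptions: each candidate situation is represented only by the set of facts it contains and by the chance of survival it supports, and a fact counts as essential when some situation containing it supports a strictly higher chance than every situation without it. The 0.10 figure follows the example; the fact labels and function names are hypothetical.

```python
# Chance of the plant's survival supported by each candidate situation,
# identified by the facts it contains. Labels are illustrative only.
SURVIVAL_CHANCE = {
    frozenset({"hardiness"}): 0.10,
    frozenset({"hardiness", "cheap_spray"}): 0.10,
}


def best_chance(chances, fact, included):
    """Highest chance supported by situations that do (or do not) contain `fact`."""
    values = [c for facts, c in chances.items() if (fact in facts) == included]
    return max(values, default=0.0)


def is_essential(chances, fact):
    """True if situations containing `fact` support a strictly higher chance
    than any situation that leaves it out."""
    return best_chance(chances, fact, True) > best_chance(chances, fact, False)


if __name__ == "__main__":
    print(is_essential(SURVIVAL_CHANCE, "cheap_spray"))  # spraying adds nothing
    print(is_essential(SURVIVAL_CHANCE, "hardiness"))    # hardiness is needed
```

Nothing in the comparison asks what would have happened had the gardener used the stronger defoliant, which mirrors the point that the evaluation is mereological rather than counterfactual.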




Humphreys's Explanation

In The Chances of Explanation (Humphreys (1989)), Paul Humphreys offers the following definition of a direct contributing cause: B is a direct contributing cause of A just in case:
1. A occurs;
2. B occurs;
3. B increases the chance of A in all circumstances Z that are physically compatible with A and B, and with A and $B_0$ (where $B_0$ is the neutral state of system B), i.e., $\Pr(A/BZ) > \Pr(A/B_0Z)$, for all such Z; and
4. BZ and A are logically independent.
The first thing that jumps out from the definition is the fact that the capital letters A and B are being used ambiguously, sometimes to refer to situation-tokens (as in conditions (1) and (2)) and sometimes to refer to types, as in conditions (3) and (4). Fortunately, it is relatively easy to disentangle the ambiguity in a way that clearly conforms to Humphreys's intentions. Second, Humphreys's definition depends on the problematic notion of the neutral state of a system. In some cases, this is relatively easy to determine: absolute zero for temperature, rest for relative velocity, etc. In other cases, it is apparently impossible. What is the neutral state of human intelligence or personality type? What is the neutral state of sex? The application of situation theory, with its partial, three-valued interpretations, offers an attractive alternative. We can insist that type A contribute positively to the chances of type B both alone and in combination with any actually realized type in a given token. Definition 6.5 (Humphreys's Explanation) $(s : \phi) \leadsto_H (s' : \psi)$ iff $s > s'$, and for all types $\chi$ such that $s \models \chi$ and $\phi$ is not a subtype of $\chi$, the objective probabilities are such that $\Pr(\psi/\chi) < \Pr(\psi/(\chi \,\&\, \phi))$. This definition preserves the desirable features of Humphreys's explanation. For instance, if $(s : \phi)$ and $(s : \chi)$ are both Humphreys's explanations of $(s' : \psi)$, then so is (s :


Higher-Order Causation; Modal Facts as Causes

I have two reasons for being interested in a theory of higher-order causation, that is, a theory of how modal and causal facts can themselves cause concrete situations. First, my account of teleological or functional causation and explanation will be explicitly higher order. Second, I want to build a causal theory of modal and mathematical knowledge, which obviously depends on the possibility that modal facts can be part of the cause of mental states and processes. I have been careful in the preceding chapters to allow for the construction of three-valued and four-valued semantics for all of the modal elements used in defining causal and explanatory relations. Consequently, we can look for varying degrees of modal partiality in order to determine just how many, and which particular, modal facts are needed in deriving a given causal consequence.


A Problem with Higher-Order Causation

In a recent paper, Hitchcock (1996) uses Ellery Eells's (Eells (1991)) definition of causal relevance to defend the intelligibility of higher-order causation. According to Eells, a property $\phi$ is (positively) causally relevant to property $\psi$ in population p just in case the objective probability $\Pr(\psi/\phi \,\&\, \eta)$ is strictly greater than $\Pr(\psi/\neg\phi \,\&\, \eta)$, for all homogeneous background contexts $\eta$. In applying Eells's definition to higher-order causation of the kind employed in the definition of teleology, we must suppose that $\phi$ is itself a property involving causal relevance. For example, Wright's definition of $\phi$'s having $\psi$ as a function would, when translated into Eells's definition of causation, come out as something like this (ignoring the background contexts for the sake of simplicity):




This account depends on making sense of higher-order objective chance, in particular, of making sense of the present objective chance of $\psi$ given $\phi$, and of $\phi$ given $\psi$, being other than they actually are. As Hitchcock notes, it is very hard to see how to make sense of the present objective chance of any present objective chance being anything other than 1 or 0. In the present state of the world, whatever factors determine objective chance are either definitely present or definitely absent, so the actual objective chance of any proposition is fully determined. Hitchcock attempts to circumvent these problems without resorting to situation theory by introducing the parameter of populations. He suggests that we treat the objective chance of $\psi$ given $\phi$, and of $\psi$ given $\neg\phi$, as properties of various actual and hypothetical populations. The claim about higher-order causation is then taken to be a claim about a super-population, whose individual members are actual or hypothetical populations. However, Hitchcock has merely sidestepped the problem. To make sense of this solution, we must know two things: (i) which hypothetical populations to include as members of the super-population, and (ii) what probability measure over these hypothetical populations to use in computing the higher-order probability. To have a principled solution to these two problems, we would have to know the objective chance of the various objective chances represented in the hypothetical population. However, it was exactly the unavailability of such higher-order objective chances within the conventional possible-worlds approach that led to the impasse described above. In fact, any Humean account of causation will be unable to sustain the possibility of higher-order or vertical causation. At bottom, Humeans are antirealists about modality. In the place of irreducible modal facts, they accept only regularities in the appearance of occurrent qualities.
Since these regularities are not themselves instantiated in particular events, there can be no regularities of regularities and, hence, no higher-order modalities. The modal realist can avoid this collapse of higher-order objective chance to triviality by considering partial worlds or situations. A situation is partial, so many of the factors that determine objective chance are undetermined in a given situation. We can, then, sensibly talk about a hierarchy or cascade of objective chances. Meaningful higher-order objective chance could exist whenever there are well-defined objective chances of certain factors, whose presence or absence would, in turn, determine the objective chance of other factors. However, there is, as we shall see, another approach within situation theory to defining the causal relevance of facts about objective chance, an approach that does not depend on making sense of higher-order objective chance. Rather than asking how the objective chance of $\phi$ depends on the objective chance of $\psi$ given $\phi$, or of $\psi$ given $\neg\phi$, we can instead ask whether "deleting" facts about the causal connections between $\phi$ and $\psi$ from particular situations leaves enough facts behind to enable those situations to cause the relevant instances of $\phi$. This talk about "deleting" facts from situation-tokens is metaphorical. We start with a token that supports the causal connection between $\phi$ and $\psi$, and then we consider proper parts of this token that do not support this connection and ask of these parts whether they support enough facts to enable them to count as causes of the instances of $\phi$ in question. In this case, it is the indispensability of facts about causal connections as incorporated in parts of actual situations, rather than the probabilistic relevance of those facts to abstract properties, that determines the existence of a causal connection. In this chapter, I will demonstrate that the claims I will make about the possibility of higher-order causation in part II, chapters 12 (teleology), 15 (logical and mathematical cognition), and 16 (mind), can be supported by the model of causation developed in this part.


Modal Facts as Causes

Modal facts can themselves act as causes. Suppose that s is a minimal cause of s', that is, no proper part of s is a cause of s'. According to the definition of causation, s itself must support the modal fact $\Box(\mathrm{A}s \rightarrow \mathrm{A}s')$. Any part of s that does not support this modal fact must be a proper part, and so must not be a cause of s'. If we assume a principle of strict downward monotonicity, it follows that any type supported by a minimal cause of a token is causally relevant to any type supported by that token.
Hypothesis 7.1 (Strict Downward Monotonicity) If ( then there exists an $s_2$ such that $s_2 \sqsubseteq s$ and $s_2 > s_1$. If $(s > B)$ and $C \subseteq B$, then there exists an s' such that
Strict downward monotonicity entails that if s is a minimal cause of s', then s is not a minimal cause of any proper part of s'. If s is a minimal cause of s', then it is certainly part of a minimal cause, and so s is an INUS cause of s', $s \leadsto s'$. If strict downward monotonicity holds, then s is not an INUS cause of any proper part of s'. This means that $(s : \phi)$ is causally relevant to $(s' : \psi)$, for any $\phi$ and $\psi$ such that $s \models \phi$ and $s' \models \psi$. In particular, in the case above, s's being of the modal type $\Box(\mathrm{A}s \rightarrow \mathrm{A}s')$ is causally relevant to every type of s'. Nomic facts can also be causally efficacious. In the case above, by the definition of >, s must support the causal constraint $s \sim s'$. If Hume's Hypothesis applies to this case, then there must be a type $\phi$ such that s supports both $\phi$ and the causal constraint $\phi \mathrel{|\!\sim} \psi$. By the definition of causal relevance, we have that the causal-constraint type $\phi \mathrel{|\!\sim} \psi$ supported by s is indeed causally relevant to the explanation of s' and its type $\psi$. The truth of the causal constraint at s is an indispensable part of the explanation of the actuality of an immediately posterior situation of type $\psi$. To make this concrete, suppose that s is an event of the collision of a pair of billiard balls with specific velocities.
The relevant physical type of s (representing the masses and velocities of the two balls, as well as their impenetrability and elasticity) is $\phi$. The causal constraint $\phi \mathrel{|\!\sim} \psi$ is a special case of the laws of conservation of energy and momentum. This nomic fact is causally relevant to the subsequent velocities of the balls (represented by $\psi$). Since the type $\psi$ is observable, our perceptual faculties belong to a causal chain including particular nomic facts. Such causal connections make possible reference to and knowledge of such laws of nature.


The Causal Relevance of the Excluded Middle

The type $(\phi \vee \neg\phi)$ is a paradigm case of a merely disjunctive or gerrymandered property (see section 5.8.1). Merely disjunctive types are never causally relevant, since if $((\phi \vee \neg\phi) \mathrel{|\!\sim} \chi)$ is a causal-constraint fact in a situation s, then $(\phi \mathrel{|\!\sim} \chi)$ and $(\neg\phi \mathrel{|\!\sim} \chi)$ will each be supported by separate, proper parts of s. Instances of the law of excluded middle are always heterogeneous disjunctions and, hence, never represent natural types. However, although $(\phi \vee \neg\phi)$ may never be causally relevant, the same cannot be said of the type $\Box(\phi \vee \neg\phi)$. Suppose that token s supports the following types:

Let us assume that s does not support any other relevant types; in particular, let us assume that it does not support $((\neg\phi \,\&\, \chi) \mathrel{|\!\sim} \psi)$ or $((\phi \,\&\, \rho) \mathrel{|\!\sim} \psi)$. Token s does support the type $(\chi \,\&\, \rho) \mathrel{|\!\sim} \psi$, since this follows from the first three types. However, let us assume that s supports $(\chi \,\&\, \rho) \mathrel{|\!\sim} \psi$ only because it supports the first three types. That is, let us assume that any proper part $s_1$ of s that does not support all of the first three types above does not support $(\chi \,\&\, \rho) \mathrel{|\!\sim} \psi$. Given these types, it follows that s constrains the actuality of a succeeding token of type $\psi$. To be more precise, s must constrain the existence of a set B of tokens, each of which is immediately posterior to part of s and each of which supports the type $\psi$. If we assume, as seems reasonable, that s as a whole is causally prior to each member of B, it follows that s is a cause of B, $s > B$. Under these assumptions, we can show that s is a minimal cause of B, if we also assume Hume's hypothesis. Suppose that s' is a proper part of s, one that does not support one or more of the types listed above. Since s does not support any other relevant types, neither can s', one of its proper parts. Suppose, for example, that s' supports only the following four relevant types. It is easy to check, in a four-valued model, that these types are not sufficient to guarantee the actuality of a succeeding token of type $\psi$. By our earlier assumption, s' does not support the type $(\chi \,\&\, \rho) \mathrel{|\!\sim} \psi$. Let s'' be a situation accessible to s' that supports both $\chi$ and $\rho$, but is not succeeded by any token supporting $\psi$. Given the support by s' of the first three types above, this entails that s'' falsifies both $\phi$ and $\neg\phi$ (i.e., both $\neg\phi$ and $\phi$ are supported by s''). This is possible, since s' does not support the modalized law of excluded middle. The existence of s'' demonstrates that s' cannot be a cause of B, since every member of B supports $\psi$. Consequently, s is a minimal cause of B. As before, strict downward monotonicity entails that every type supported by s is causally relevant to every type supported by B, in particular, to type $\psi$. Although I have made use, in this argument, of Hume's hypothesis and the hypothesis of strict downward monotonicity, it is not essential to assume that these hypotheses hold universally. All that I need is that they hold in some cases of the appropriate kind. For a concrete illustration, suppose that s represents a situation in which a rabbit is pursued by a pair of predatory animals, $\chi$ representing the presence of predator $P_1$ and $\rho$ representing the presence of predator $P_2$. Let us suppose that predator $P_1$ does not yet perceive the rabbit, but will immediately perceive and devour the rabbit if the rabbit makes any sudden movement ($\phi$). In contrast, predator $P_2$ has the rabbit within its perceptual field and will devour it unless the rabbit makes a sudden movement, in which case $P_2$ will lose track of the rabbit's location. The rabbit notices predator $P_2$ and, consequently, makes a sudden movement ($\phi$), resulting in its demise ($\psi$), in this case due to the actions of predator $P_1$.
My argument is that in this case, the situation s, which records the necessity of the disjunction $\phi \vee \neg\phi$, plus the two causal constraints, plus the facts $\chi$ and $\rho$, is in every sense a cause of the rabbit's demise, and the inclusion in s of the modalized logical truth is causally relevant to the result. This result can be generalized to any validity of classical first-order logic, by simply substituting the validity for the law of excluded middle, and adding causal constraints that interact appropriately with the logical validity.


First-Order Teleological Causation

Suppose that the fact that wings are causally relevant to flight is part of certain tokens that cause the successful survival and reproduction of a species $\nu$ of flying bird. The successful survival and reproduction of $\nu$ is, in turn, causally relevant to the existence of a present-day winged thing, namely, an instance of $\nu$. Thus, the existence of an instance of wingedness is explained, in part, by the causal relevance of wingedness to flight. This gives us the initial, Wrightian condition for saying that flight is the function of wingedness as instantiated in this case. We can draw a distinction between intrinsic and extrinsic functions. For example, the wing of a bird exists for the sake of flying, and this is a case of intrinsic purpose. In contrast, seeds serve the purpose of feeding the bird: a case of extrinsic purpose. Suppose that we let $\phi$ represent the state of having wings, and $\psi$ the state of flying. Finally, let $\nu$ represent the entire bird/bird-niche ecological system, including those aspects of the bird's environment that make possible its successful reproduction. The fact that the wings serve the intrinsic purpose of flying can be expressed as:

The symbol $\leadsto$ represents the relation of direct causal relevance (as defined in chapter 5). The state $\phi$ has the intrinsic purpose of $\psi$-ing in the token s, relative to background condition $\nu$, just in case the fact that some state-token s' supports a connection between $\nu$ and $\phi$ on the one hand, and $\psi$ on the other, is causally relevant to s's being $\phi$. In the case of a species $\nu$ of flying birds, the fact that there is a causal connection between being winged and flying is part of the causal explanation of wingedness in the winged members of $\nu$. In the case of extrinsic purpose, we have instead:

In this case, take $\phi$ to be the presence of suitable seeds in the environment, and take $\psi$ to be the fulfillment of the bird's nutritional needs. In this case, the connection between $\nu \,\&\, \phi$ and $\psi$ causes instances of $\nu$, not of $\phi$. In other words, the fact that the seeds fulfill the bird's needs explains why there are birds, not why there are seeds. Nonetheless, we can say objectively that, qua parts of the bird's ecological niche, the seeds do have the extrinsic purpose of fulfilling the bird's nutritional needs. Another mode of teleofunctionality is that of representational states, states whose function is to carry information of a certain kind. In the following chapter, I will define a notion of carrying information (with reference to relation R), which we can represent by the symbol R>H. We can say that a particular pattern of retinal stimulation $\phi$ has the intrinsic function in s (relative to $\nu$) of carrying the information that $\psi$ is realized in relation R to s just in case:

The pattern $\phi$ exists because it carries (in organisms of type $\nu$) the information $\psi$. We might say that when a state occurs that has the function for an organism to carry potential information of a certain kind, then that information has become actual for that organism. It may seem odd to say that particular patterns of retinal stimulation have proper functions, as opposed to saying merely that the visual system as a whole has a function. However, we can see the visual system as consisting of a set of capacities for patterns of stimulation. Where the capacity for a certain pattern has been selected for because that pattern carries particular information, we can say that the pattern itself has the function of carrying that information, since whatever causes the capacity of the system to undergo that pattern also causes individual occurrences of the pattern.


Higher-Order Teleological Causation

In part II (chapter 16), I will argue that the efficacy of mental properties depends on the possibility of higher-order functions. For example, consider the human faculty of inference (whether inductive or deductive). This faculty has the function of interacting with mental states on the basis of their content, a paradigmatically mental or psychological property. Suppose, for example, that mental type $\phi$ has the function of first recognizing the simultaneous presence of a belief in a conditional and a belief in the antecedent of the conditional, and then producing a new belief (by modus ponens) in the consequent of the conditional. Suppose we have three state-tokens, $s_1$, $s_2$, and $s_3$, where $s_1$ is an instance of the type $\phi$, the state whose function is the performance of modus ponens. Suppose that $s_2$ is a state whose type is that of believing both a particular conditional $(p \rightarrow q)$ and its antecedent, p. Let us call this type of mental state $\psi$. Finally, let $s_3$ be a state of believing q (call this type $\chi$), immediately posterior to the sum of $s_1$ and $s_2$. We may suppose that the functionality of type $\phi$ corresponds to a causal constraint of the form:

Suppose that situation s supports this conditional and contains the sum of s_1 and s_2. We may finally suppose that in the actual world w, s is actually a total cause of s_3. Tokens s_1 and s_2 are both indispensable parts of this cause, and so their mental properties are causally relevant to the outcome. In addition, the fact that mental properties φ and ψ are instantiated can be used in giving a causal explanation of the succeeding state. It may well be true that tokens s_1, s_2, and s_3 also realize physical states μ_1, μ_2, and μ_3. It may also be the case that the instantiation of μ_1 necessitates the instantiation of φ by some super-token, and similarly for μ_2 and ψ, and μ_3 and χ. Finally, there may be a covering physical constraint of the form:

Suppose token s' is a situation containing s_1 and s_2 and supporting this conditional. Then we can suppose that s' is a total cause of s_3. This seems to make the mental properties supported by s_1 and s_2 redundant or otiose. Such a conclusion, however, would be a mistake. It is true that s' is a total cause of s_3 and that the mental types supported by s_1 and s_2 are irrelevant to the s'-s_3 connection. However, it also remains true that s is a total cause of s_3. Token s supports the psychological covering law but not the physical one. Hence, in the context of s, the physical properties of s_1 and s_2 are irrelevant, but their psychological properties are not.

The Universality of Causation

Does every situation have a cause? On the one hand, there is a strong temptation to say "Yes." On the other hand, embracing the universality of causation in its strongest form leads to inconsistency, since we are forced to say that Reality (the sum of all actual situations) must itself have a cause, which must be an actual situation and thus part of Reality. However, a situation cannot be the effect of one of its parts. A natural response to this problem, common to Aristotle, Leibniz, and many others, is to limit the universality of causation to contingent situations. This cannot be quite right, as well-known objections by James Ross and William Rowe have shown. However, I think it is approximately correct. What is needed is to use the resources of mereology to define a category of "wholly contingent" situations. We can coherently suppose that all wholly contingent situations have causes.

The universality of causation, if it is in fact true and knowable, has a number of very significant implications for the epistemology of causal facts. As we shall see, abduction to unknown causes seems to depend on some assumption about the universal scope of causation. The induction of causal laws may have a similar dependence on this assumption.

As I have discussed earlier, it is important to distinguish the thesis of the universality of causation from that of determinism. Determinism can be taken as the conjunction of some sort of principle of the universality of causation with the thesis that causes necessitate their effects. We have already seen a number of reasons for rejecting the necessitation model of causation. Reflection on the universality of causation gives us a further reason: if causes necessitate their effects, then it cannot be the case that all wholly contingent situations have causes. The determinist must come up with some alternative restriction of the universality of causation, perhaps to temporally bounded situations. In addition, the determinist needs to produce some independent motivation for this restriction.


A Modal Mereology of Situations

My formal framework will be a modal logic supplemented by the Leśniewski-Goodman-Leonard calculus of individuals ("mereology") (Leonard and Goodman (1940)). By way of modal logic, I need only the axioms and rules of T. I will assume a fixed domain of possible situations; hence, the logic will include the Barcan and converse Barcan axioms. I will use the two usual predicate symbols of mereology, ⊑ and ○, representing part-of and overlap, respectively. I need three mereological axioms:

Axiom 8.1
Axiom 8.2
Axiom 8.3

Axiom 8.1 defines the part-of relation in terms of overlap, and axiom 8.2 is an aggregation or fusion principle: if there are any facts of type φ, then there is an aggregate or sum of all the φ facts. Axiom 8.3 guarantees that the part-of relation is reflexive and anti-symmetric. There are two principles linking the modal and mereological languages. Here I need to introduce a new predicate, A. Where b is a possible situation, Ab can be used to state that b actually obtains.

Axiom 8.4
Axiom 8.5

Axiom 8.4 ensures that aggregation of situations is a form of conjunction: a whole necessitates all of its parts. Conversely, axiom 8.5 implies that the existence of all the members of a sum necessitates the existence of the sum itself. There is one special notion to be defined: that of being "wholly contingent," represented by '∇'.

Definition 8.1 A wholly contingent situation is an actual situation none of whose parts are necessary.

I am not assuming that there are any necessary situations: the existence of necessary truths does not entail the existence of necessary situations (since our logic lacks a comprehension principle). As we shall see, if there are any necessary situations, they are situations of a very special kind.
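The displayed formulas for axioms 8.1 through 8.5 and definition 8.1 can be sketched in LaTeX as follows. This is a reconstruction from the prose glosses in this section, not a quotation of the author's typeset statements. The symbols for part-of, overlap, and "wholly contingent," the hat notation for the sum of all the φ situations, and in particular the exact shape of axiom 8.5 are assumptions.

```latex
% Requires amsmath and amssymb.
% Reconstruction from the prose glosses; all notation here is assumed.
% \sqsubseteq = part-of, \circ = overlap, A = "actually obtains",
% \hat{x}\,\varphi(x) = the sum of all situations satisfying \varphi.
\begin{align*}
\text{Axiom 8.1:}\quad & x \sqsubseteq y \leftrightarrow \forall z\,(z \circ x \rightarrow z \circ y) \\
\text{Axiom 8.2:}\quad & \exists x\,\varphi(x) \rightarrow
    \exists y\,\forall z\,\bigl(z \circ y \leftrightarrow \exists u\,(\varphi(u) \wedge u \circ z)\bigr) \\
\text{Axiom 8.3:}\quad & x = y \leftrightarrow (x \sqsubseteq y \wedge y \sqsubseteq x) \\
\text{Axiom 8.4:}\quad & x \sqsubseteq y \rightarrow \Box\,(Ay \rightarrow Ax) \\
\text{Axiom 8.5:}\quad & \Box\,\bigl(\forall y\,(\varphi(y) \rightarrow Ay) \rightarrow A\,\hat{x}\,\varphi(x)\bigr) \\
\text{Definition 8.1:}\quad & \nabla x \leftrightarrow
    \bigl(Ax \wedge \forall y\,(y \sqsubseteq x \rightarrow \neg\Box Ay)\bigr)
\end{align*}
```

On this reading, axiom 8.4 says that a whole necessitates each of its parts, and axiom 8.5 supplies the converse direction for sums, which is how the two are glossed in the text.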




Principles of Causation

The causal relation will be represented by a primitive binary operator, '▷'. There are a number of logical properties of causation that can be expressed, for instance, the transitivity and asymmetry of causation. I will, however, need only three facts about causation for the present purposes:

Axiom 8.6 Veridicality:
Axiom 8.7 Separate Existence:
Axiom 8.8 Universality:

Axiom 8.6 stipulates that only actual situations can serve as causes or effects. Axiom 8.7 is intended to capture Hume's insight that a cause and its effect must be "separate existences." The language of mereology, when applied to facts, enables us to state Hume's principle precisely: a cause must not overlap its effect. It is very important to bear in mind that axiom 8.7 does not require that a cause not overlap its effect in space or time: it is only mereological overlap (the having of a common part) that is ruled out. Axiom 8.8 expresses the universality of the causal relation: every wholly contingent fact has a cause.

Axiom 8.8 does not entail determinism in any of its usual senses, since I have not stated that causes are sufficient conditions for their effects. I am not assuming that every event is necessitated by its causes; in fact, I believe that this is not typically the case. Causal laws are always exception-permitting or defeasible generalizations. It is quite possible for C to be in every sense the cause of E, even though it was possible for C to occur without being accompanied by E. (For this reason, this account of causation is compatible with, although it does not entail, indeterministic theories of human freedom.)

The evidence for axiom 8.8 is essentially empirical. Every success of common sense and science in reconstructing the causal antecedents of particular events and classes of events provides confirmation of axiom 8.8.
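The three causal axioms can be sketched in LaTeX directly from their glosses (only actual situations are causes or effects; a cause does not overlap its effect; every wholly contingent situation has a cause). The formulas are a reconstruction, not a quotation, and the symbols used for the causal operator, overlap, and "wholly contingent" are assumptions.

```latex
% Requires amsmath and amssymb.
% Reconstruction from the prose glosses; notation is assumed.
% \triangleright = causes, \circ = overlaps, A = "actually obtains",
% \nabla = "is wholly contingent".
\begin{align*}
\text{Axiom 8.6 (Veridicality):}\quad & (x \triangleright y) \rightarrow (Ax \wedge Ay) \\
\text{Axiom 8.7 (Separate Existence):}\quad & (x \triangleright y) \rightarrow \neg(x \circ y) \\
\text{Axiom 8.8 (Universality):}\quad & \forall x\,\bigl(\nabla x \rightarrow \exists y\,(y \triangleright x)\bigr)
\end{align*}
```

Nothing in these three formulas says that the cause necessitates the effect, which fits the remark that axiom 8.8 does not entail determinism.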


The Universality of Causation

The Role of Defeasible Reasoning

Even though we have excellent empirical evidence for the generalization that wholly contingent situations have causes, it is hard to see how any amount of data could settle conclusively the question of whether or not this generalization (axiom 8.8) admits of exceptions. This is a legitimate worry, but I would respond by insisting that, at the very least, our experience warrants adopting the causal principle as a default or defeasible rule. This means that, in the absence of evidence to the contrary, we may infer, about any particular wholly contingent situation, that it has a cause. Using the modal conditional □> to represent a kind of defeasible connection, we can express the weakened form of axiom 8.8 thus:



Axiom 8.8'

This version of axiom 8.8 can be read as: normally, a wholly contingent situation has a cause. This defeasible axiom 8.8' will allow us to infer that any given wholly contingent fact has a cause unless some positive reason can be given for thinking that the fact in question is an exception to the rule, for example, by showing that the fact belongs to a category of things that typically does not have a cause.
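A possible rendering of the defeasible axiom in LaTeX is given below. It is a sketch based on the reading "normally, a wholly contingent situation has a cause." The macro name \defcond and the box-arrow shape chosen for the defeasible conditional are assumptions made for illustration, as are the symbols for causation and "wholly contingent."

```latex
% Requires amsmath and amssymb.
% \defcond is a hypothetical macro for the defeasible ("normally") conditional.
\newcommand{\defcond}{\mathrel{\Box\!\!\rightarrow}}
\[
\text{Axiom 8.8}'\text{:}\quad
\forall x\,\bigl(\nabla x \defcond \exists y\,(y \triangleright x)\bigr)
\]
```

The only change from the strict axiom is the replacement of the material conditional by the defeasible one, so an inference to a cause can be withdrawn when there is positive reason to treat a situation as exceptional.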


Is Universality Merely Heuristic?

In his debate with Copleston, Russell insisted that there is a difference between claiming that scientists should always look for a cause and claiming that there is always a cause there to be found. Russell followed Kant's suggestion that the universality of causation be seen as a canon or prescriptive rule for reason, and not as a description of mind-independent reality. The cosmological argument depends on using the principle of universality as a descriptive generalization. I have two principal responses. First, it is hard to see why the abundant success of empirical science in finding causes for contingent facts does not provide overwhelming empirical support for the generalization to all contingent facts. The category of wholly contingent facts is not an unnatural, gerrymandered kind like 'grue' or 'bleen'. Are we to believe that it is merely a coincidence that time and time again we find causes for contingent facts? Second, the denial of the universality of causation as a descriptive generalization constitutes a very radical form of skepticism. All of our knowledge about the past, in history, law, and natural science, depends on our inferring causes of present facts (traces, memories, records). Without the conviction that all (or nearly all) of these have causes, all of our reconstructions of the past (and therefore nearly all of our knowledge of the present) would be groundless. Moreover, our knowledge of the future and of the probable consequences of our actions depends on the assumption that the relevant future states will not occur uncaused. The price of denying this axiom is very steep: embracing a comprehensive Pyrrhonian skepticism.


The Existence of an Uncaused First Cause

Besides the logical principles presented above, the proof of the existence of a first cause depends on only one factual premise: that there exists a contingent situation. For example, suppose there are an odd number of molecules in my pencil at the present moment: surely there could have been an even number. A single contingent situation of this kind is all that I need, although I believe that nearly every fact with which we are acquainted is contingent. I would go so far as to say that every physical situation is contingent.




The Nature of Modality

In saying that a situation is contingent, I am saying much more than merely that the proposition asserting its existence is neither logically true nor logically false. A contingent situation is one that is actual but could have been nonactual, where the relevant notion of possibility is that of broadly metaphysical possibility. Broadly metaphysical possibility is the fundamental form of possibility, of which all other kinds (physical, historical, legal, etc.) are qualifications or restrictions.

Attempts since the days of logical positivism to reduce metaphysical possibility to logical consistency (or logical consistency with all definitional or "analytic" truths) have failed. First, it has proved impossible to specify the "analytic" truths without making reference to possibility and necessity. Second, nothing is gained in clarity unless we insist on using first-order logic, which, as John Etchemendy (1990) has argued, is an implausible construal of logical consistency. Finally, the attempt to avoid the supposed "mysteries" of metaphysical possibility in this way leads to the much more serious difficulties of set-theoretic platonism, with the attendant mysteries of how these transcendent mathematical entities connect to the rest of reality and, most crucially, of how we can obtain reliable knowledge of them. Recent efforts at making sense of mathematical reality make use of the notion of metaphysical modality (as in the "possible structures" of Hellman (1989)), indicating that the proper order of explanation starts with modality, not with mathematical entities.

If we deny that there are any contingent situations, then we must conclude that we live in a world in which all three modalities (possibility, actuality, and necessity) collapse together. This is tantamount to denying that these modalities can do any interesting work. Such a denial runs athwart the growing body of philosophical work in which modality plays a central role.


A Sketch of the Proof

Lemma 8.1 All the parts of a necessary situation are themselves necessary.
Proof: By axiom 8.4 and the K axiom of modal logic.

Lemma 8.2 Every contingent situation has a wholly contingent part.
Proof: Let a be a contingent situation. If a is wholly contingent, we are through, since a is a part of itself. Otherwise, a has a necessary part. By axiom 8.2, there exists a situation x̂(x ⊑ a & □Ax) that consists of the aggregate of all the necessary parts of a. Since a is contingent, a itself is not a part of x̂(x ⊑ a & □Ax), since if it were, then, by axiom 8.3, a would be identical to x̂(x ⊑ a & □Ax), and, by axiom 8.5, a would exist necessarily, being a sum of necessary parts. By axiom 8.1, there is a b that overlaps a but not x̂(x ⊑ a & □Ax); hence there is a part of a, say c (a common part of a and b), that does not overlap x̂(x ⊑ a & □Ax). We can show that c is wholly contingent. Suppose that d is a part of c. Then d is part of a but d does not overlap x̂(x ⊑ a & □Ax). Hence, d is not necessary. Since d was an arbitrary part of c, c is wholly contingent.

Definition 8.2 Let C be the aggregate of all wholly contingent situations.

By axiom 8.2, it follows that if there are any wholly contingent facts, then any fact overlaps C if and only if that fact overlaps some wholly contingent situation.

Lemma 8.3 If there are any contingent situations, C is a wholly contingent situation.
Proof: Suppose that there is at least one contingent situation. Then there is also a wholly contingent part, by the preceding lemma. To show that C is wholly contingent, we must show that every part of C is contingent. Let a be a part of C. Since a is a part of C, a overlaps C, by axioms 8.1 and 8.3. Hence, a overlaps some wholly contingent b (by the definition of C). It is a theorem of mereology that two facts that overlap have a common part. Hence, some d is part of both a and of b. Since b is wholly contingent, d is contingent. By lemma 8.1, if a were necessary, d would be necessary. Consequently, a is contingent. Therefore, since a was an arbitrary part of C, C is wholly contingent.

Lemma 8.4 If there are any contingent situations, C has a cause.
Proof: An immediate consequence of lemma 8.3 and axiom 8.8, the Universality of Causation.

Lemma 8.5 Every contingent situation overlaps C.
Proof: Let a be a contingent situation. By lemma 8.2, a has a wholly contingent part, say b. By axiom 8.2 and the definition of C, C and b overlap. Since b is a part of a, a overlaps C as well.

Theorem 8.1 If there are any contingent situations, then C has a cause that is a necessary fact.
Proof: By lemma 8.4, C has a cause. By axiom 8.7 (Separate Existence), this cause does not overlap C. By lemma 8.5, every contingent situation overlaps C. By axiom 8.6 (Veridicality), the cause of C is actual. Hence, the cause of C must be a necessary situation.

Since we know that there is at least one contingent situation, we can identify C with the cosmos, and use theorem 8.1 to conclude that the cosmos has a cause that is a necessary fact, a first cause. It is legitimate to call this cause a "first cause" if we assume (as seems plausible) that all effects are contingent.¹
¹ For a discussion of some of the theological implications of this result, see my article "A New Look at the Cosmological Argument" (Koons (1997)).
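The final step of the proof of theorem 8.1 can be laid out as a short numbered derivation. The LaTeX below is a sketch under stated assumptions: g is a hypothetical name for the cause of C obtained by existential instantiation, "contingent" is read as "actual but not necessarily actual," and the symbols for causation, overlap, and "wholly contingent" are assumed notation rather than the author's.

```latex
% Requires amsmath and amssymb.
% g is a hypothetical name for the cause of C; notation is assumed.
\begin{align*}
1.\quad & \exists x\,(Ax \wedge \neg\Box Ax) && \text{factual premise: some situation is contingent} \\
2.\quad & \nabla C && \text{1, lemma 8.3} \\
3.\quad & g \triangleright C && \text{2, axiom 8.8, naming the cause } g \\
4.\quad & \neg(g \circ C) && \text{3, axiom 8.7 (Separate Existence)} \\
5.\quad & Ag && \text{3, axiom 8.6 (Veridicality)} \\
6.\quad & (Ag \wedge \neg\Box Ag) \rightarrow g \circ C && \text{lemma 8.5 applied to } g \\
7.\quad & \Box Ag && \text{4, 5, 6}
\end{align*}
```

Line 7 follows because lines 4 and 6 give that g is not contingent, and line 5 rules out its being non-actual, so the only remaining option is that g is necessary.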




The Well-Foundedness of Causation

Is the causation relation well founded? Are infinite causal regresses impossible? Beginning with Plato (in Book X of The Laws), many philosophers have thought so. However, many others, especially in the twentieth century, have expressed doubts on this point. It is a corollary of my version of the cosmological argument that the causal relation is well founded.

Suppose for contradiction that we have an infinite causal regress: ... ▷ s_n ▷ ... ▷ s_1 ▷ s_0. Let us call the sum of the regress s_∞. Only wholly contingent tokens can be caused, so each of the members of the series is wholly contingent. Consequently, s_∞ is wholly contingent. By axiom 8.8, s_∞ has a cause, s_{∞+1}. However, s_{∞+1} cannot be an immediate cause of any of the members s_n of the series, since s_{∞+1} is screened off from s_n by s_{n+1}. Suppose, for contradiction, that s_{∞+1} were a cause of s_n. Then s_{n+1} would be preempted from causing s_n, since s_{∞+1} is causally prior to s_{n+1}. This contradicts our assumption that s_{n+1} is a genuine cause of s_n. Therefore, s_{∞+1} cannot be the immediate cause of any member of the series. Since s_{∞+1} is not an immediate cause of any of the members of the series, it cannot be a mediate cause of any of them either, since mediate causation is simply the transitive closure of immediate causation. So, s_{∞+1} does not cause any of the members of the series, and therefore, it does not cause the sum of the series, s_∞, contrary to our original assumption.

In many cases, the impossibility of an infinite regress has been used as a premise in the cosmological argument. I think, however, that it is more illuminating to think of it as a corollary.


Isn't Causation Valid Only for the Phenomenal World?

In the first Critique, Kant argues that causation pertains only to the apparent or "phenomenal" world, not to the real or "noumenal" world. His argument depends on assuming that the fundamental causal principles are known prior to experience, and that nothing substantial or material about the real world can be known by us prior to experience. Kant's objection is relevant only to a priori arguments for God's existence, like those of Scotus or Leibniz. It is not relevant to an argument like mine that appeals only to empirical, a posteriori evidence. I am not claiming that the axioms of causality I am appealing to are known by us prior to their application to the world of experience. Instead, I appeal to our success in finding causal explanations as empirical evidence for these generalizations.




What about Quantum Mechanics?

Quantum mechanics is sometimes taken to provide abundant counter-evidence to the universality of causation. Quantum mechanics raises two problems for our understanding of causality: the indeterminism of wave collapse (under the Copenhagen interpretation), and the Bell inequality theorems.

The indeterminism of quantum transitions during observation does not contradict axiom 8.8. I have not assumed that causes necessitate their effects: in fact, I strongly suspect that such an assumption is incoherent (if "necessitate" is understood in a strong sense). According to the Copenhagen version of quantum mechanics, every transition of a system has causal antecedents: the preceding quantum wave state, in the case of Schrödinger evolution, or the preceding quantum wave state plus the observation, in the case of wave packet collapse.

The Bell inequalities demonstrate that the data described by quantum mechanics force us to reject one of the following three principles:

1. Causal influences never travel backward in time.
2. Causal influences never travel faster than the velocity of light.
3. Every reliable (projectible) correlation has a causal explanation.

In discussions of the Bell inequalities, the third principle is sometimes labeled a law of "causality." It is, however, much stronger than my axiom 8.8. In this chapter, I have not assumed that (as the third principle implies) a cause always "screens off" (in Reichenbach's sense) its effects from non-posterior states, although I will make use of this assumption in appendix B. The Bell inequalities are merely another demonstration of the impossibility of reducing causation to some sort of statistical relationship. They raise no difficulties for a causal realist such as myself.

In my opinion, the most reasonable response to the Bell inequalities would be to restrict one or more of the three principles above to macroscopic (large-scale or classical) phenomena and to restate them as defeasible (exception-permitting) rules. I would favor restricting the second principle, applying it only to direct, macro-to-macro interactions, interactions between classical systems. Where causal influences between classical systems are mediated by quantum phenomena (which, on my view, have no intrinsic position or velocity), then exceptions to the second principle can occur. These exceptions do not, however, permit the exchange of information at superluminal velocities.


Doesn't the Argument to a First Cause Assume the Impossibility of an Infinite Regress?

Leibniz was the first to realize that the cosmological argument does not depend on any assumption about the impossibility of infinite regresses. Even if there are infinite regresses of causes within the totality of contingent facts, the totality itself must have a cause that is outside it and, hence, a cause that is necessary. The crucial assumption is axiom 8.2, the assumption that any non-empty set of situations can be aggregated into a single situation. This corresponds to the pre-modern denial of infinite regress, since it in effect denies that any such totality is what Cantor termed an "absolute" or improper totality (like the set of all sets, or the set of ordinal numbers). There is little if any reason to think that there is anything improper about the totality of all wholly contingent situations. We are talking only about ontologically basic situations, not about mathematical or semantical truths that supervene upon them. I am simply aggregating concrete particulars, and I am not running afoul of Russell's vicious circle principle in the process. There is no reason to postulate any facts that somehow involve or presuppose the totality of all situations, or of all contingent situations.


Doesn't the Argument Commit the Fallacy of Composition?

Russell accused Copleston of committing the fallacy of composition, arguing that because each of the parts of the world is caused, the whole must be caused. The cosmological argument includes no such error: it is demonstrated that the cosmos is itself a wholly contingent situation, and for that reason must have a cause.


Isn't Necessary Existence an Impossibility?

A number of twentieth-century philosophers follow Hume in holding that only logical truths can be necessary, that the very notion of a necessary existence is incoherent. Two replies. First, I have not assumed the existence of a necessary situation: this was the conclusion, not a premise, of the argument. Thus, this so-called objection simply fails to engage the argument. The objector is content merely to deny the conclusion without bothering with the premises or the reasoning. Second, the Humean principle being relied upon is self-defeating. Is it supposed to be true by definition that only logical or definitory truths are necessary? Surely in saying this, Hume, Russell, and others intended to be saying something informative. How could such a principle be contingent? What sort of contingent facts about the actual world make it the case that there are no non-logical necessities? What empirical justification have the anti-essentialists provided for their claim? In response, the objector must simply deny that he can make any sense of this notion of modality, except insofar as it is replaced by the clear and well-behaved notion of logical consistency. This sweeping denial of modality is simply obscurantist, undermining fruitful philosophical research into the nature of natural law, epistemology, decision, action and responsibility, and a host of other applications.




Don't Contingent Facts Typically Have Contingent Causes?

This is probably the most promising line of rebuttal to the cosmological argument. It is an instance of a wider strategy: focus on some unique feature of the first cause, and point out that the cause of the world's having that feature is an exception to some well-established generalization. Indeed, for the most part, contingent situations do have contingent causes. They also have causes with finite attributes and causes that can be located in space and time, unlike the hypothesized first cause. Once we have established that the cosmos is relevantly unusual, we seem to be faced with two equally unattractive options: supposing that the cosmos has only a very unusual kind of cause, or supposing that it has no cause at all. Thus, we end in a stalemate.

The defender of the cosmological argument must respond with substantial reasons for thinking that, although the first cause is unique in a number of respects, each of these unique features can be adequately explained by extrapolating from tendencies already observable in ordinary cases of causation. For instance, I would conjecture that, in some precise sense, a cause is always more nearly necessary or less profoundly contingent than its effect. One very simple definition of relative necessity would be the following: a situation a is more nearly necessary than a situation b just in case a holds in every world in which any part of b holds, but a could exist in the absence of any part of b.

That the causal antecedents of a situation-token are more nearly necessary than the token itself follows from the identity conditions of situation-tokens. The causes of a token are essential to its identity: had the very same truth been verified by a situation caused in a different way, we would not have had the same situation as verifier. The corresponding thesis involving effects is not plausible: a situation's identity does not include the eventuality of all its effects.
The contingency of the evolution of the world depends on this asymmetry: a situation's holding necessitates the holding of its causes, but not of its effects. This principle (an effect necessitates the existence of its causes) does not imply that the content of an effect necessitates the content of its causes. For example, the situation of Caesar's death could not have existed had not all of its causes, including Brutus's knife thrust, existed. This of course does not mean that Caesar wouldn't have died unless Brutus and the other senators had killed him. The truth 'Caesar died' would have been verified by a different situation in all of those worlds in which Brutus does not help in inflicting the fatal set of wounds. The situation that actually verifies the truth 'Caesar died' would not have existed had any of its causes failed to exist. There are several additional reasons (besides the one involving the identity conditions of situations) for thinking that causes are more necessary than their effects. First, there is the authority of Aristotle and the Aristotelian tradition.



Second, it is clear that we need some account of causal priority that explains the transitivity and asymmetry of this relation. An account of causal priority in terms of relative necessity nicely satisfies this desideratum. Third, this account enables us to specify exhaustively the "potential causes" of a given situation: a is a potential cause of b if and only if a is more necessary (less contingent) than b. Such a specification is necessary if we are to account for the statistical properties of causal connections, the so-called Markovian principles developed by Salmon (1984) and Suppes (1984) and studied recently by Pearl and Verma (1991) and by Spirtes et al. (1993). I use these Markovian principles in developing a causal calculus in appendix B. Markov locality entails that the causal antecedents of an event "screen off" the probability of that event from the probability of any non-consequent event-token. If we assume that the probability of every actual event-token is screened off in this way by its actual causes, then we are implicitly assuming that the causal antecedents of any actual token are necessary to its identity, that there are no non-actual or counterfactual causes of actual tokens. Finally, the relative necessity of causally antecedent tokens gives us an explanation of the asymmetry of past and future. In some sense, given the present, the past is fixed in a way that the future is not. This "fixity" of the past can best be understood as the relative necessity of past event-tokens, given the token event corresponding to the present. It is not that the type of the present moment necessitates the types of past moments, since there could certainly have been many different histories leading up to an event qualitatively identical to the-world-at-the-present-moment. Instead, the event-token that is present necessitates the event-tokens making up the past, but it leaves open a number of different sequences of future event-tokens. 
Since past tokens are causally antecedent to the present, we have another (and I think conclusive) reason for accepting the thesis of the relative necessity of causally antecedent tokens. This thesis is implicit in all "branching-future" models of temporal logic. (See section 10.2 for further evidence on this point.)

However relative contingency is defined, it is clear that the cosmos is a situation of absolutely minimal contingency. If situation a contains situation b as a part, then b is no less contingent (no more necessary) than a, since a could not exist if b did not exist. Since the cosmos contains every wholly contingent situation as a part, no wholly contingent situation can be less contingent than the cosmos. Since the cosmos is a situation of minimal contingency, it is not surprising that it should have no contingent cause, but it would still be very surprising if it had no cause at all. These considerations lead to a new version of the critical axiom 8.8, the axiom of causality.

Axiom 8.8" ∀x(Ax □> ∃y(y is more nearly necessary than x & y ▷ x))

On the basis of induction, we can confirm that, at every degree of necessity (short of absolute necessity), every token is caused by some token more necessary than it. As we successfully build scientific models that stretch across astronomical and geological time, we confirm that situation-tokens across a wide swath of degrees of necessity have causes that are strictly more nearly necessary than themselves. Axiom 8.8" is the defeasible generalization of this pattern. Axiom 8.8" states that we may reasonably infer, about any token at any degree of necessity, that it has a causal antecedent which is more nearly necessary than it. When we try to apply axiom 8.8" to a necessary fact (or any fact that is not wholly contingent), we find that the defeasible conclusion is blocked, since there is no fact more nearly necessary than an absolutely necessary fact. When we apply axiom 8.8" to the cosmos, or to any other minimally contingent fact, we succeed in drawing the defeasible conclusion, and in addition, we have an explanation as to why the cause of the cosmos is necessary.

In fact, axiom 8.8" does not depend on the strong assumption that every token necessitates every one of its causal antecedents. It is sufficient to make the much weaker assumption, that every token necessitates at least one of its causal antecedents. The cosmos must have a causal antecedent that it necessitates, and this necessitated cause must be absolutely necessary.
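The definition of relative necessity and the revised causal axiom discussed in this section can be sketched in LaTeX as follows. Both formulas are reconstructions from the prose glosses, not quotations. The abbreviation MN(a, b) for "a is more nearly necessary than b" and the macro \defcond for the defeasible conditional are hypothetical names introduced here, and "in the absence of any part of b" is read as "with no part of b actual."

```latex
% Requires amsmath and amssymb.
% \mathrm{MN} and \defcond are hypothetical names; all notation is assumed.
\newcommand{\defcond}{\mathrel{\Box\!\!\rightarrow}}
\begin{align*}
\mathrm{MN}(a,b) \;\leftrightarrow\;&
    \forall x\,\bigl(x \sqsubseteq b \rightarrow \Box\,(Ax \rightarrow Aa)\bigr) \\
  &\wedge\; \Diamond\,\bigl(Aa \wedge \forall x\,(x \sqsubseteq b \rightarrow \neg Ax)\bigr) \\[4pt]
\text{Axiom 8.8}''\text{:}\quad&
    \forall x\,\bigl(Ax \defcond \exists y\,(\mathrm{MN}(y,x) \wedge y \triangleright x)\bigr)
\end{align*}
```

On this reading, nothing can stand in the MN relation to an absolutely necessary situation, since the second conjunct would require a world in which that situation is not actual. That matches the remark that the defeasible conclusion is blocked for necessary facts.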


Where Did the First Cause Come From?

If we're right in thinking that causes must be strictly more nearly necessary than their effects, it follows that necessary situations cannot be caused (at least, not in the ordinary sense). Another reason for thinking that necessary situations cannot be effects is this: we know that the totality of all situations cannot be caused (since there is no situation that does not overlap it), and the best explanation of this situation is that this totality contains necessary situations, and necessary situations cannot be caused.


The Ross Objection: Did the First Cause Cause That It Caused the World?

James Ross (Ross, 1969, pp. 295-304) has argued that the principle of sufficient reason can be demonstrated to be false. His objection can be adapted into an objection to my axiom 8.8 (the Universality of Causation) as follows. Consider the situation that the first cause caused the cosmos. Call this situation C*. C* is clearly a contingent situation, since if it were necessary, the cosmos itself would be necessary (by Axiom 8.6, veridicality). If C* is also wholly contingent, then it must be a part of the cosmos, and the first cause must cause C*, i.e., the first cause must cause the situation that it causes the cosmos. The same argument can be repeated, showing that the first cause must cause that it causes that it causes the cosmos, ad infinitum. This appears to be a vicious infinite regress. The best answer to this objection is to point out that there is no reason to think that C* is wholly contingent. The situation that the first cause causes the cosmos would appear to be composed of two situations: namely, the first cause on the one hand and the cosmos on the other. The truth that the first caused the second does not represent a third situation in addition to the first



two. Instead, such statements about single-case causal connections supervene upon the cause, the effect, and certain non-situational truths about the modal relationship between the cause and the effect. Therefore, the wholly contingent part of C* is simply the cosmos itself, and we are forced only to reaffirm that the first cause does cause the cosmos. This response entails that there are no situations, over and above situations about modality and other non-causal matters, corresponding to single-case causal nexus. That is, we are assuming that causal truths are supervenient on modal and other non-causal truths (including truths about objective chance or propensity, and about powers and liabilities). Causal connections between situations in a world are to be explained entirely in terms of what has happened in that world, and what might or probably would happen in it and alternative worlds. This sort of modest ontological reduction is quite attractive, since the alternative is to posit causal nexus as brute situations, without any logical relationship to predictability or to statistical regularities. At the same time, this sort of modest reduction does not entail the eliminability of causal discourse, nor does it obviate in any way the necessity of positing situations as an ontological category. Causation is a relation between situations, not any kind of propositional operator, but any particular causal nexus between situations consists of some aggregation of other modal, stochastic, and historical situations.


William Rowe's Objection

William Rowe (Rowe, 1975, pp. 108-110) has proposed a variant of Ross's objection to the cosmological argument. Rowe asks us to consider the situation a that corresponds to the true proposition: there are contingent (positive) situations. Most defenders of the cosmological argument will accept that a is itself contingent. Therefore, the first cause must cause a. However, the situation that the first cause has caused a is itself a contingent situation, so the first cause would have to cause the situation that it caused a, and so on, ad infinitum. The proper response to this objection is only slightly different from the response to the last objection. The proposition that there are contingent situations does not correspond to a single situation. Situations are not closed under existential generalization, as propositions are. From the existence of a situation that n has F, it does not follow that there is a distinct situation that something has F. Consequently, the situation that makes Rowe's a true is simply the cosmos itself, and no infinite regress can be generated. This is not simply an ad hoc response, since there are independent grounds for denying the existence of a special category of existential situations. Causation is transparent: that is, if the situation that there is an F caused a, then there is some n such that the situation that n is F caused a. Similarly, if the proposition that there is an F has been made true by some situation a, then there is some instance of this generalization that has been made true by a. Thus, in neither case is there any reason to posit a special category of situation corresponding to the existential quantifier.


A Theory of Information and Misinformation

9.1 Introduction

Attempts to explicate the phenomenon of representation naturalistically, say, in terms of causal connection, often founder on the problem of explaining the possibility of error. Suppose, for example, that we attempt to explain representation along these lines: fact (s, σ) represents fact (s′, τ) iff (s′, τ) is a causally necessary condition of (s, σ). Such an account, of course, leaves no room for the possibility of misrepresentation, if 'necessary condition' is interpreted strictly. Whenever an actual (s, σ) represents a fact (s′, τ), the state (s′, τ) will be actual and (s, σ) will not be in any way a misrepresentation of the world. A common strategy for solving this problem is to distinguish between two types of situation: type 1 and type 2.¹ We can then identify the content of a representation with that fact which is causally necessitated by the form of the representation in type 1 situations. A representation in a type 2 situation can then misrepresent the world, since its content is determined with reference to a counterfactual situation: what would be causally necessitated by the form of the representation were it to be located in a situation of type 1 instead of type 2? The distinction between type 1 and type 2 situations is typically made in terms of the historical antecedents of the representational form. For example, in Dretske's account of representation (Dretske (1981)), type 1 situations are those situations that occur during the training period in which the meanings of the representational forms are impressed upon the individual subject. Similarly, in Millikan's account (Millikan (1984)), type 1 situations are those situations that actually occurred in the evolutionary history of the representational system in question.
¹ See (Fodor, 1990, p. 60).




The strategy of appealing to historical antecedents leads to a number of serious difficulties, as I will argue in section 9.2. Firstly, the historical strategy tends to attribute contents that are far too weak, since misrepresentations do in fact occur in type 1 situations, and these are misdescribed as veridical whenever the historical strategy is followed. Secondly, the historical strategy makes content far too sensitive to irrelevant accidents of history. Finally, this strategy would force us to make facts about the remote past relevant to the best theoretical account of the present. In section 9.3, I develop two accounts of the nature of information and of the possibility of error that avoid the historical strategy, the type 1/type 2 distinction, and the concomitant difficulties. The first account relies exclusively on probabilistic connections between states, interprets information in terms of probabilistic necessitation, and leaves room for the possibility of error by declining to make the erroneous inference (made, surprisingly enough, by both Dretske and Fodor) from an event's having probability zero (or infinitely close to zero) to that event's being absolutely impossible. The second account uses the idea of conditional functions to explain error: a representation is erroneous when it has the function (conditional on q) of carrying the information that p, and condition q fails to hold.


The Historical (Retrospective) Strategy

The simplest information-based theory of representation would go something like this: a representation type σ represents the actuality of some state of affairs τ just in case it is causally impossible that σ be actual without τ's being actual. This means that either τ is a causally necessary condition for σ or σ is a causally sufficient condition for τ. Unfortunately, this simple theory attributes content to representation types that is far too weak, so weak that error is impossible. On this account, if σ is actual, and σ represents τ, then τ must also be actual. One way to narrow the content of a representation-type and thereby to explain the possibility of error is to appeal not only to the present causal properties of the representation-type but also to facts about the actual history of the type (even its remote history). These facts may be facts about the previous history of the individual symbol user as in Dretske's theory (Dretske (1981)) or about the history of the representational practice to which the type belongs as in Millikan's account (Millikan (1984)). A simple Dretske-like theory might take the following form: representation-type σ represents τ for subject A iff for every situation s belonging to the training period (during which A learned the meaning of σ), τ was causally necessary in the circumstances (in s) for σ. We could say that a state of affairs τ is causally necessary in the circumstances of s for σ iff there is a state of affairs υ such that both σ and υ are actual in s and τ is causally necessary for the joint occurrence of σ and υ. Misrepresentation is possible for any representation occurring outside of the training period, because an event-token of σ occurring in a situation s outside the training period might represent τ even though τ is not necessary for σ in the circumstances of s.



This sort of account has the paradoxical result that the longer and more varied the training period, the weaker the content of the representation-type. If, for example, we extended the scope of the relevant history to include the whole history of the representational practice (in the case of natural representations, this would mean the entire evolutionary history of the species), the resulting content would be so weak as to render error virtually impossible. An alternative but very simple informational account would stipulate that σ represents τ just in case the occurrence of σ increases the objective probability of τ. Let's say that σ probabilifies τ just in case the objective conditional probability of τ on σ is greater than that of τ on the negation of σ. We could say that σ represents τ just in case σ probabilifies τ. This account has a defect that is exactly opposite to the defect we encountered in the simple causal-necessitation model. Instead of making error impossible, this account makes error absolutely ubiquitous. Every representation represents innumerably many possible states of affairs, all but a vanishingly small proportion of which are nonexistent. Millikan starts with this probabilizing model and solves the ubiquity of error problem by adopting a version of the historical strategy. On a simple Millikan-like account, we could stipulate that σ represents τ iff σ probabilifies τ, and the fact that σ probabilifies τ has in reality contributed causally to the perpetuation of some reproductive family to which σ belongs. The longer and more varied is the relevant evolutionary history, the narrower are the contents ascribed to the representation and the more frequent are the errors and misrepresentations. Indeed, as Millikan recognizes, there will be many representational forms that will be erroneous on nearly every occasion (Millikan, 1984, p. 34).
For example, suppose that some pattern of auditory stimulation increases the probability of the presence of a predator and that this pattern has triggered a flight response in the past, contributing thereby to the perpetuation of the species. Then, on Millikan's account, the auditory pattern represents "Predator near!" even though on nearly all occasions, the pattern is caused by the wind's rustling of leaves. In fact, Millikan's account cannot provide a basis for ascribing probabilistic content. For example, we could not, on her account, distinguish between signals that mean "There's a slight chance of a predator near" from those that mean "More likely than not there's a predator near" or "Without a doubt a predator is near." All of these signals would simply represent "Predator near" without qualification. There are a number of other difficulties that could be raised concerning the details of Dretske's theory or of Millikan's, but here I would like to concentrate on some problems that are endemic to the historical strategy itself. Firstly, reliance on the historical strategy causes deviant cases in the past to influence the content of representations. For instance, it is quite common for training periods to include some cases in which the representation is wrongly but plausibly applied. I could teach a child the true meaning of 'bird' by means of cleverly constructed mechanical models, even though every attribution of the term in the training period was false. Similarly, in the evolutionary history of any representational system, there will be events in which misrepresentations



accidentally contributed to the survival of the system. Secondly, the historical strategy makes content too sensitive to accidental features of history. For the sake of illustration, consider the following version of the Twin Earth thought-experiment. Suppose that on Twin Earth, both H₂O and XYZ occur in equal abundance, and in close proximity to one another: here an H₂O lake, there an XYZ river, and so on. Suppose further that, simply as a matter of pure coincidence, the inhabitants of Twin Earth have encountered only H₂O and have applied to it the term 'water'. Applying the historical strategy means interpreting this symbol as designating only H₂O, despite the fact that Twin Earthers are, in the future, just as likely to encounter XYZ as H₂O and are completely unable to discern any difference between the two. Thirdly, the historical strategy makes facts about the remote past directly relevant to the ascription of content to present-day representations. Content ascription should enable us to understand and explain the behavior of rational agents; information about the remote past of such agents cannot be of any immediate significance for this task, unless we are to believe in something like action at a temporal distance.
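
The probability-raising ("probabilifying") relation discussed in this section can be made concrete with a small numerical sketch. The counts below are invented purely for illustration (they are not data from the text), and the function names are hypothetical. They model the rustling-leaves case: the auditory pattern raises the probability of a predator tenfold, yet the predator is absent on nearly every occasion on which the pattern occurs.

```python
from fractions import Fraction

# Hypothetical counts of occasions, keyed by (sigma_occurs, tau_occurs).
# sigma: the auditory pattern occurs; tau: a predator is near.
# These numbers are illustrative assumptions only.
RUSTLE_COUNTS = {
    (True, True): 10,      # pattern and predator
    (True, False): 990,    # pattern, no predator (wind in the leaves)
    (False, True): 9,      # no pattern, predator
    (False, False): 8991,  # no pattern, no predator
}

def p_tau_given(counts, sigma_occurs):
    """Relative frequency of tau among cases where sigma does (or does not) occur."""
    hits = counts[(sigma_occurs, True)]
    total = hits + counts[(sigma_occurs, False)]
    return Fraction(hits, total)

def probabilifies(counts):
    """sigma probabilifies tau iff P(tau | sigma) > P(tau | not-sigma)."""
    return p_tau_given(counts, True) > p_tau_given(counts, False)

p_with = p_tau_given(RUSTLE_COUNTS, True)      # 1/100
p_without = p_tau_given(RUSTLE_COUNTS, False)  # 1/1000
error_rate = 1 - p_with                        # 99/100

print(probabilifies(RUSTLE_COUNTS), p_with, p_without, error_rate)
```

On these assumed figures the pattern counts as representing "Predator near" on the simple probabilifying account, even though that content is false in 99 of every 100 occurrences, which is the ubiquity-of-error problem in miniature.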


Two New Strategies

Fallible Information

Information is somehow tied to objective probabilistic relevance. In explicating this tie, we seem to be faced with a dilemma. If we insist that whenever a fact σ carries the information τ, the objective conditional probability of τ on σ be one, then we make σ sufficient for τ, thereby eliminating any possibility of error. Alternatively, if we require only that the conditional probability of τ on σ be very high (though not necessarily equal to one), or that the probability of τ on σ be greater than that of τ on ¬σ, then we run afoul of a very important principle of information, what Dretske calls the Xerox principle (Dretske, 1981, pp. 57-58). Dretske's Xerox principle is simply the requirement that the carriage of information is transitive: if σ carries τ, and τ carries υ, then σ carries υ. Obviously, if we set some finite distance ε from one as the threshold on conditional probability for the carriage of information, then this carriage will not be transitive. The dilemma stands only if one assumes, as Dretske (Dretske, 1981, p. 245) explicitly does, that it is impossible that a state have probability one and fail to be actual. This assumption is false for standard interpretations of the probability calculus, in which events of measure zero are quite possible. I agree with Dretske that this assumption is a useful one. However, one can accept this assumption and still avoid the dilemma, by using a non-standard probability theory, one permitting hyperreal, i.e., infinitesimal, quantities. An abstract or generic informational link involves three entities: a situation-type that characterizes the carrier of the information, a binary relation on tokens that constitutes the "direction" of information flow, and a situation-type that



characterizes the target of the information link (that with which the link is concerned).

Definition 9.1 (Generic Information Link)

In this definition, I am assuming that R is a relation (like the natural connection relation or its converse) that is necessarily uniquely valued (functional on its domain): if Rss′ and Rss″, then necessarily, s′ = s″. If this is not the case, then we must add a clause stating that y is the only token R-related to x to the consequent of the two conditionals. The definition guarantees that the probability of a situation of type ψ that is R-related to a given situation of type φ is infinitely close to 1, and, in addition, that the probability of the existence of a situation of type φ, given the existence of one of type ψ, is finite. This second clause is needed in order to support the validity of the Xerox principle, in the following form:

Xerox Principle

We can also describe the flow of information from one token to another. The definition of information flow involves five parameters: two tokens, two types, and the linking relation.

Definition 9.2 (Token-to-Token Information Link)

Misinformation is quite possible, since we know only that the conditional probability of s₂'s being ψ is infinitely close to 1. The principle of probabilistic locality entails that any information link between mereologically disjoint tokens is causally mediated: either the carrier is part of a cause of the target, or vice versa, or there is a common cause of both. This account has the counterintuitive result that misinformation or natural error can be expected to occur with only an infinitesimal frequency. Two things can be said in response. First, since information is ubiquitous, the fact that the limiting relative frequency of misinformation is infinitesimal does not entail that the absolute frequency of error is low. Moreover, when misinformation is detected, this fact is especially vivid and salient, while the background of accurate information is taken for granted and largely unnoticed. Second, the usefulness of my account does not depend on taking the requirement of an infinitesimal relative frequency of error literally. Presumably, misinformation is exceptional, occurring with a very low relative frequency. At some point, very



low finite probabilities are treated, for all practical purposes, as though they were infinitesimal. There are fairly obvious computational advantages to working with qualitative differences, represented formally as infinite ratios, instead of working exclusively with quantitative differences. What I am offering is a formal model of how we reason commonsensically about information. If the account faithfully reproduces the crucial features of our commonsense practice, then the question of its literal truth is of little or no importance. In actual practice, we apply descriptions like 'misinformation' or 'error' to cases of which the descriptions are not literally true, as, for example, we apply descriptions like 'flat' to surfaces that are not literally flat but are close enough to flatness for practical purposes. However, if this objection is taken to be decisive against the proposal I have made, or at least against its adequacy as an account of all misrepresentation, I have an alternative account, one in terms of conditional functions. I take that account up in the next subsection.


Conditional Functions

Let us return for the moment to a simple necessitation model of information, like that of Dretske.

Many animals have what are known as "flight mechanisms." These flight mechanisms are perceptual sensitivities to the environment that trigger the reaction of fleeing. They are adaptive because they often enable the animal to escape predatory animals. However, the perceptual sensitivity does not carry the information that a predator is present, either in the strong, Dretskian sense, or in the weaker sense developed by means of hyperfinite probabilities in the last section. The probability that a predator is actually present may be quite low, even less than 1%. Nonetheless, we would like to say that the perceptual state in some sense represents the possible presence of a predator. One solution to this problem is to build probability into the content of the representation. The content of the perception is something like: there is a 1% probability that a predator is near, and located roughly to the right. One difficulty with this solution is that it seems to attribute sophisticated probabilistic concepts to quite primitive animals. However, this is not quite right, since there is no reason to suppose that the content of the representation is articulated in such a way that there is a component corresponding to the 1% probability concept. Nonetheless, it would be preferable to find a model of representation that did not require such enriching of the content. A better solution is to make use of a concept of conditional function. On this model, a representation of the content p is not simply a state with the function of carrying the information that p, but rather a state with the conditional function of carrying the information that p in circumstances C. In the case of the flight mechanism, we could



say that the associated perceptual states have the function of carrying information about the approximate location of a predator in circumstances in which a predator is actually present and has actually made some significant noise of the appropriate kind. It is the animal's perceptual state plus the teleologically relevant circumstances that carry the information that a predator has a particular location. When the perceptual state occurs but the relevant circumstances are not actualized, we can say that the state constitutes a misrepresentation. This notion of conditional function is closely related to the idea of conditional constraints developed by Barwise and Perry.² We can say that a perceptual state φ of a token s has the conditional function (conditional on circumstances χ) of carrying the information (relative to R) that a token of type ψ is actual if and only if the fact that the conjunction φ & χ carries the information (relative to R) that ψ is realized is causally relevant to s's being of type φ.

Definition 9.3 (Conditional Representation-Function)

Even if the model of conditional functions is needed to explain the possibility of many forms of error, it is still useful to employ the hyperfinite model of information developed in the preceding section. For one thing, if the world is radically indeterministic, there may be no information that fits the strict-necessitation model of Dretske. Moreover, if we employ the hyperfinite model of information and the conditional-function model of representation, we have two independent accounts of the possibility of error. Where error occurs because the information is fallible, we can label the case one of malrepresentation. Where error occurs because the background condition of the representation is not actualized, we can label the case one of misrepresentation.


Information as the Basis of Knowledge

It is possible to define a notion of robust or knowledge-bearing information. A fact (s₁ : φ) is robustly linked to a fact (s₂ : ψ), relative to relation R, just in case there is an informational link between the two facts, and this link survives the addition of further actual information to its first term. In other words, if we extend s₁ to some larger situation s, and we take into account not only that s is of type φ, but also that it is of the more specific type φ & χ, then there is still an information link (relative to some relation R′) between the facts (s : φ & χ) and (s₂ : ψ).

Definition 9.4 (Robust Information Link)

² See (Barwise and Perry, 1983, pp. 112-114, 270-272) and (Barwise, 1989, pp. 149-151).



Robust or knowledge-bearing information carriage is transitive, that is, it satisfies the Xerox principle. In fact, this is so whether we define simple information carriage in terms of conditional probabilities infinitely close to 1 or in terms of finite conditional probabilities, despite the fact that, in the second of these definitions, simple information carriage is not itself transitive. An organism can be said to be designed or adapted for the purpose of acquiring robust information and not just information simpliciter if that organism's functions include routines for the detection of error and anomaly, in other words, if the organism is habitually seeking corroboration or correction of its current information state.
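
The failure of transitivity for a finite probability threshold can be checked with a line of arithmetic. The sketch below is only an illustration under stated assumptions: the threshold of 0.95 and the individual probabilities are invented, the function name `carries` is hypothetical, and the three states are assumed to form a simple chain in which υ depends on σ only through τ.

```python
from fractions import Fraction

# Assumed finite threshold for "carrying information" (illustrative only).
THRESHOLD = Fraction(95, 100)

def carries(conditional_probability, threshold=THRESHOLD):
    """Threshold account: a state carries information iff P(target | carrier) >= threshold."""
    return conditional_probability >= threshold

# Illustrative probabilities; assumption: upsilon is independent of sigma given tau.
p_tau_given_sigma = Fraction(95, 100)
p_ups_given_tau = Fraction(95, 100)
p_ups_given_not_tau = Fraction(0)

# Law of total probability along the chain sigma -> tau -> upsilon.
p_ups_given_sigma = (p_tau_given_sigma * p_ups_given_tau
                     + (1 - p_tau_given_sigma) * p_ups_given_not_tau)

print(carries(p_tau_given_sigma))   # True
print(carries(p_ups_given_tau))     # True
print(p_ups_given_sigma)            # 361/400
print(carries(p_ups_given_sigma))   # False
```

Each link clears the threshold, but the composed link has probability 361/400 (0.9025) and does not. If each shortfall from one were instead infinitesimal, the shortfall of the composed link would also be infinitesimal, which is why the hyperreal account preserves the Xerox principle.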


A Look Back, and Ahead

10.1 The Causal Relation

In chapter 3, I argued that we have evidence from natural language that the relata of causation are situations, parts of the world. These situations call for a non-classical, three- or four-valued semantics, which I develop in some detail in appendix A. In chapters 4, 5 and 6, I developed, respectively, a deterministic, indeterministic, and probabilistic model of causation, using the formal language defined in appendix A. I demonstrated that these models satisfied a number of important desiderata for theories of causation, and I tested them against a range of examples. I also demonstrated (in chapter 7) that these models make higher-order causation possible. Chapters 8 and 9 represented applications of the theory to outstanding problems. I argued in chapter 8 that all wholly contingent situations have causes, and that this points to the existence of a necessary first cause, reviving the ancient cosmological argument. In chapter 9, I used my causal language in developing an account of natural information and misinformation that can be used in explaining the existence of representational states. In appendix B, I will turn to the problem of giving an adequate account of defeasible or non-monotonic inference. I will show that a system of defeasible inference that incorporates my account of causation is able to give correct and principled solutions to familiar problem cases, such as the Yale Shooting Problem.


Against Determinism

One theme that has recurred in this volume is that of the unacceptability of determinism. Determinism is the conjunction of two theses: (1) the necessitarian conception of causation and (2) the universality of the causation of temporally bound situation-tokens. I have offered a number of reasons for rejecting the



necessitarian conception. In fact, I have even argued that causes can never necessitate their effects. I hold this for several reasons:

1. The causal priority relation is one of asymmetric necessitation: causally posterior tokens necessitate the existence of causally prior tokens. If causes necessitated their effects, the asymmetry would be violated. Moreover, the mutual necessitation of causes and effects would make their separate existence problematic.
2. The necessitarian model of causation leads to an inflation of causal and explanatory connections, as I argued in chapters 4 and 5.
3. There seem to be coherent thought-experiments involving indeterministic causation, in which the cause does not necessitate its effect, for example, Mackie's indeterministic vending machine M (section 5.6.3). The necessitation model does not fit well with our commonsense view of causation.
4. Determinism undermines the veridicality of all deliberation, since it contradicts the existence of genuinely possible alternative futures (section 16.8). It sets up a false opposition between causation and agency.
5. Coherent indeterministic and probabilistic models of causation are available (chapters 5 and 6).

The first reason is based on the principle of asymmetric necessitation of causal antecedents. This principle in turn receives independent support from a number of sources.

1. The thesis corresponds with our commonsense notion that the past is fixed and the future is open.
2. The thesis enables us to avoid introducing causal priority as an undefined primitive, leading to a more economical ontology.
3. The thesis corresponds to natural conditions for the transworld identity of situation-tokens.
4. The thesis simplifies the definition of screening off and seems to accord with our intuitions about what information is needed in justifying causal inferences (appendix B).


Spacetime as Constrained by Causation, Not Vice Versa

My definitions of causation and of causal priority have not included any spatial or temporal relations. This was a conscious decision, since I wanted to be able to use causation in the analysis of space and time. I sketched the beginning of such an account in section 4.10.2.



I will argue in chapter 18 of part II that a causal theory of spacetime sheds new light on the paradoxes of quantum reality. In particular, I will argue that the non-locality of quantum influences should come as no surprise, since spatiotemporal locality is a construction designed to fit (as closely as possible and as simply as possible) the network of macrophysical interactions. In addition, in chapter 18 of part II I will use the causal theory of this volume in an explication of our concept of enduring substances, such as people, organisms and artifacts. This explication depends crucially on the priority of causation over space and time, since it would be problematic to take space and time as given independently of the existence of enduring objects. The main argument of chapter 8 reinforces the conclusion that causation is independent of spatiotemporal relations. In that chapter, I argued that we have good reason to postulate a necessary first cause of all contingent situations. This first cause is presumably non-spatial and timeless (since spatiotemporal location would seem to introduce an element of contingency), yet it has genuine causal efficacy. Another bonus of giving a non-spatiotemporal account of causation is that it enables me to build causal theories of our knowledge of extra-spatial objects, such as the world of logic, mathematics, and modality. This enterprise will be a major part of my project in part II.


Part II

Applications to Metaphysics, Epistemology, and Ethics


An Overview
11.1 Teleology as Higher-Order Causation

The notions of natural teleology and biological function play an increasingly significant role in contemporary philosophy, especially in recent theories of content and of knowledge. The twentieth century has been characterized by an intensifying of efforts at clarifying the logic, semantics, and metaphysics of teleology, rectifying the unfortunate neglect of the topic in modern philosophy since Leibniz. One of the most influential and attractive accounts was that of Charles Taylor, in his 1964 The Explanation of Behavior (Taylor (1964)). Taylor's influence can be seen in most contemporary accounts, including those of Larry Wright, Andrew Woodfield, and Ruth Millikan. According to both Taylor and Wright (1976), a state B occurs for the sake of state G just in case (1) B tends to bring it about that G, and (2) B occurs because it tends to bring it about that G. This is clearly an instance of higher-order causation: the causal connection between B and G figures in the causation of instances of B. The formal theory of causation that I have developed in this volume was designed specifically to explicate this sort of possibility. Ruth Millikan has argued that reliance on this sort of higher-order causation makes sense only if we make explicit reference to the past. She argues that clause (2) must be replaced by one that reads: (2') the present token of B occurs because past instances of B tended to bring it about that G. Wright explicitly rejects this amendment, to which Millikan responds:

Wright says that the formulation "because X does Z" does not reduce to "because things like X have done Z in the past." Rather, we are asked to accept that X might be there now because it is true that now X does or X's do result in Z. How the truth of a proposition about the present case can "cause" something else to be the case at present is not explained. (Millikan, 1989b, page 299, note 7)



Millikan overlooks two facts. First, the fact that X's tend to bring about Z is not a fact about the present case: it is a timeless, eternal fact about the modal and stochastic structure of the world. Second, Millikan fails to take into account the fact that such eternal facts can enter into causal explanations of present conditions, as I argued in detail in chapter 7.

Teleosemantics
One of the central problems of philosophy has been that of accounting for the possibility of the existence of states with content, i.e., the possibility of representational states. I take a representation to be a state with the teleological function of carrying a piece of information. The piece of information is the content of the representation. In chapter 9, I gave two accounts of the possibility of error: misrepresentation and malrepresentation. In the case of misrepresentation, we are dealing with a state that has a conditional function: it has the function of carrying a piece of information in a specific set of circumstances. In cases in which these circumstances are not present, the state still has the same representational content, even though it does not actually carry the information it is supposed to. In the case of malrepresentation, the representational state does actually carry the appropriate information, but the information itself fails to be veridical. This is possible if we use a model of information that employs hyperfinite conditional probabilities: a state φ carries the information ψ just in case the conditional probability of ψ on φ is infinitely close to 1. This model satisfies Dretske's Xerox principle, that is, information carriage is transitive. At the same time, it opens up the possibility of misinformation. In this part, I will develop this model of representation into a novel account of mental states. In chapter 14, I sketch an account of a variety of mental states, such as belief, desire, intention, and so on. In chapter 16, I use my account of mental representation to explain some of the puzzling features of sensory qualia. In particular, I will attempt to explain why qualia are irreducible to physical properties. I am deeply committed to the view that thought (and mental representation) is not dependent on public language.
Language is impossible apart from the existence of speakers capable of mental representation, but mental representation as such is not dependent on the presence of language. It is certainly true, however, that the existence of language greatly enhances our capacity for complex and subtle representations. Moreover, I do not favor an account of linguistic meaning that reduces public meaning to speakers' meaning, as proposed by Grice and Searle. Linguistic meaning consists in certain proper teleofunctions of phonemes and syntactic structure, most basically in the function of sentences to carry information about described situations when used in assertoric reports. In a sense, the words use us to reproduce themselves successfully by fulfilling certain adaptive functions. Complex Gricean communicative intentions are not essential to the use of language. Beyond this bare sketch, I will say next to nothing about language in this book.


The Link between Teleosemantics and Epistemology

By combining my definition of teleofunction with my account of information, I can define the semantic content of beliefs and perceptions. A perceptual or doxastic state of type φ represents that p just in case type φ has the teleofunction of robustly carrying the information that p. In other words, a belief has content p just in case it is of a type whose proper function is that of robustly carrying the information that p. Since to be a belief whose content information is carried robustly is to be a case of knowledge, we can say that a belief that p is a state whose function is fulfilled by being a state of knowing that p. Knowledge and belief are thus interdefinable: knowing that p is being in a state of believing that p whose function is fulfilled, and believing that p is being in a state whose function is fulfilled by knowing that p. This circularity is not vicious, since each can be non-circularly defined in terms of the more basic notions of function and robust information. One consequence of this account of content is the inseparability of semantics and epistemology. If knowledge of p is impossible, so is belief that p. Conversely, if belief that p is possible, then the normal case of believing that p will be a case of knowing that p. A certain kind of global skepticism is therefore incoherent. One cannot suppose that we can grasp some domain of propositions without supposing that we have the natural capacity for knowledge of that domain.
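The definitions in this section can be laid out schematically. What follows is only a notational sketch: the predicate names Carries, RobCarries, Fun, Rep, Bel, and Knows are labels assumed here for illustration and are not the notation of the formal chapters. The first line restates the hyperfinite model of information given earlier in this overview.

```latex
% Carries, RobCarries, Fun, Rep, Bel, Knows: labels assumed for illustration only.
% P is the hyperfinite conditional probability; \approx reads "infinitely close to".
\begin{align*}
\mathrm{Carries}(\varphi, \psi) &\iff P(\psi \mid \varphi) \approx 1 \\
\mathrm{Rep}(\varphi, p) &\iff \mathrm{Fun}\bigl(\varphi,\ \mathrm{RobCarries}(\varphi, p)\bigr) \\
\mathrm{Bel}(a, p) &\iff a \text{ is in a doxastic state of some type } \varphi \text{ with } \mathrm{Rep}(\varphi, p) \\
\mathrm{Knows}(a, p) &\iff \mathrm{Bel}(a, p) \text{ and the function of that state is fulfilled}
\end{align*}
```

On this layout the interdefinability of knowledge and belief is visible at a glance: both are fixed by Fun and RobCarries, so neither definition depends on the other.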


Causal/Teleological Accounts of Knowledge

Since on my account, timeless and non-spatial realities, like the structure of modality, can enter into causal relations with spatiotemporal processes, I can give a causal account of our knowledge of logic, mathematics, and causal necessity that closely parallels causal theories of our knowledge (via perception) of spatiotemporal objects. In part II, I develop a causal theory of logical and mathematical knowledge in chapter 15, and of scientific knowledge in chapter 17. I also employ a teleological element in my account of knowledge. When a representational state fulfills its function, it constitutes knowledge, not merely truth. A case of true opinion is a case of partial teleofunctional failure. Thus, we should not think of knowledge as truth plus belief plus some third factor. Instead, we should think of true opinion as knowledge minus something (namely, the appropriate kind of reliability). I call the resulting theory one of teleological reliabilism, since it incorporates the advantages of reliabilism, while avoiding the standard objections through the inclusion of a teleofunctional element.




Mental Causation and Qualia

The most fashionable view of the philosophy of mind today is that of nonreductive materialism. I will defend a view that is doubly unfashionable: a non-materialist reductionism. I agree with reductionists that we must, in the philosophy of mind, seek an illuminating account of the nature of mental action, intentionality, and qualia, but I also agree with most anti-reductionists in thinking that the resources available to the materialist are inadequate to this task. The solution is to step outside the materialist box. By incorporating a theory of concrete causation that involves eternal facts, such as facts about modality and objective chance, in the causal order, I propose a novel solution to the problem of accounting for mental causation and the gap between physical and phenomenological properties.


Teleological Accounts of Ethics

Teleological realism makes possible a very robust form of ethical and moral realism. It is not necessary to think of the good as some sort of projection of idealized desires or preferences. Instead, the good life for a human being can be defined as one in which all of the primary functions of human life are fulfilled (that is, the functions that are not corrective or ameliorative in nature, like functions for healing or resisting infection). Moral goodness consists, as Aristotle recognized, in fulfilling certain teleofunctions associated with character, that is, with our ability to make appropriate choices and carry them out successfully. Moral virtue both contributes to happiness and is itself an integral component of happiness, since many moral functions are primary functions we possess as human beings. There are two reasons for believing that values and moral norms are objectively real. First, we have a natural tendency to believe ethical propositions, and non-cognitive accounts of the meanings of these propositions cannot account for the fact that we engage in controversy and argumentation over their truth values. Second, objectivity provides a simple explanation of the widespread cross-cultural agreement we observe on questions of what is good and praiseworthy. A standard anti-realist rebuttal to the argument from agreement is to propose that the agreement we observe can be explained by means of natural selection: that cultures following radically different norms are unable to survive. However, this response undercuts ethical realism only if such an appeal to natural selection is itself compatible with the denial of ethical objectivity. I argue that, to the contrary, appeals to natural selection of this kind entail the existence of moral teleofunctions that adequately ground the objectivity of morality. The teleofunctions of any organism, such as a human, are to a very high degree mutually supportive and inter-dependent. 
The operation of certain functions, for example, those involved in repair and healing, presupposes the failure of other functions. I call these the 'secondary functions'. Primary functions have no such presupposition. The simultaneous fulfillment of all of an organism's primary functions is the state of eudaemonia. In the case of rational animals, such as humans, human eudaemonia is the ultimate end of all action. Subjective states, such as pleasure, pain, satisfaction, dissatisfaction, and sense of malaise or of well-being, are all representational in character. Pleasure and the sense of well-being have as their natural function the carrying of the robust information that eudaemonia has been at least partially achieved. Pain, dissatisfaction, and malaise all have the function of carrying the information that some function has failed. Our dispositions to feel pleasure and pain are fallible but reliable indicators of the underlying, objective condition. Moral virtue is the disposition to make decisions that promote eudaemonia in the normal way and under normal circumstances. The exercise of virtue is valuable both as a means (as a reliable way of achieving eudaemonia) and as an end in itself (as a natural constituent of eudaemonia). We cannot fulfill all of our proper functions without fulfilling the natural functions of the will, which include the development and exercise of virtue. Moral truths have the power to provide both reasons and motivation, since the human capacities for reasoning and desiring have the natural disposition to respond to moral truth. It is partly constitutive of being a good reasoner that one accept moral claims as providing good reasons to act. Our ultimate aim is not up to us, nor merely the product of accidental contingencies. It is our being objectively ordered to the ultimate end of human eudaemonia that makes us human and thereby constitutes us as capable of desiring and wanting.


Enduring Substances as Logical Constructions

Enduring substances are logical constructions whose being is constituted by a causal chain of situation-tokens. A chain of situation-tokens constitutes a substance history just in case there is some type φ realized by each member of the chain, and each succeeding member's being φ is causally explained by its predecessor's being φ. The properties of a substance are always indexed to some member of its history. In normal circumstances, this can be adequately represented by indexing the corresponding proposition to some point in time. However, when time travel is involved, it is necessary to index each substance-property attribution to a time and a place. Consequently, it is possible for a substance to have incompatible properties at the same time, so long as it has the properties at different places. Since space and time are themselves constructions, based on the underlying causal relations, it is quite possible for substances to be only intermittently spatiotemporal. For example, it is coherent to suppose (with the orthodox Copenhagen interpretation of quantum mechanics) that quantum systems, such as electrons and other microphysical objects, take on definite position or momentum only under special circumstances. When unobservable, these microparticles have no spatiotemporal properties, only potentialities for such properties. Consequently, the principle of the spatio-temporal locality of causation simply does not apply to them.


Teleology as Higher-Order Causation

12.1 Three Definitions of Teleology

In the last forty years, the theory of teleology and biological function has experienced a surprising renaissance in analytic philosophy. Charles Taylor was a pioneer in this field through his 1964 The Explanation of Behavior (Taylor (1964)). Taylor's lead has been followed by Larry Wright, Andrew Woodfield, and Ruth Garrett Millikan. I will use Wright, Woodfield and Millikan as paradigms of three competing accounts of the nature of teleological function. These three accounts are the causal, the normative, and the Darwinian, respectively. The Darwinian account has two versions, one retrospective (Millikan) and the other prospective (Bigelow and Pargetter).


The Taylor/Wright Account

In the theory developed by both Taylor and Wright (1976), a state B occurs for the sake of state G just in case (1) B tends to bring it about that G, and (2) B occurs because it tends to bring it about that G. This is clearly an instance of higher-order causation: the causal connection between B and G figures in the causation of instances of B. As I mentioned in the last chapter, Ruth Millikan has argued that this account makes sense only if we replace reference to higher-order causation by reference to the past, i.e., to the actual history of the state B. She would replace clause (2) by this: (2') the present token of B occurs because past instances of B tended to bring it about that G. Millikan insists on this substitution because she assumes that a cause must precede its effect in time. However, as I have argued in chapter 8, it is quite possible for an event in time to be the result, in part, of facts about the modal and stochastic structure of the world, and these latter facts cannot be located in time.

To some ears, the notion of causation by timeless facts may sound like an oxymoron. If one takes it as an essential part of our concept of causation that the relation always holds between items with spatiotemporal location, then I will have to make use of some more general notion, such as power or influence. One could think of my account as a power or influence theory of teleology, rather than as a "causal" one. A state has a teleofunctional character when that state is under the power or influence of the appropriate eternal facts, ones involving certain causal necessities.

We can distinguish a number of interesting varieties of teleological connection. First of all, we can distinguish between intrinsic and extrinsic purpose. For example, the wing of a bird exists for the sake of flying, and this is a case of intrinsic purpose. In contrast, seeds serve the purpose of feeding the bird, a case of extrinsic purpose. Another distinction we can make is that between productive and informational functions. The Taylor/Wright definition specifies one important class of functions: the productive functions. However, there are also receptive or informational functions. For example, the eye has the function of registering the existence of certain kinds of objects in the environment. This function is a matter not of the eye's effect on the environment, but of the reverse: of the environment's effect on the eye. In chapter 9, I defined a relation of information (or potential information). We can say that a particular pattern of retinal stimulation φ has the intrinsic function in s (relative to v) of carrying the information that ψ just in case the pattern φ exists because it carries (in organisms of type v) the information ψ.
We might say that when a state occurs that has the function for an organism to carry potential information of a certain kind, then that information has become actual for that organism.
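The clauses of the Taylor/Wright definition, Millikan's proposed replacement for the second clause, and the informational variant can be set side by side. This is a notational sketch only: Tends, Occurs, Because, Exists, and Carries are labels introduced here for illustration, and the token subscript is likewise an assumption rather than notation from the formal theory.

```latex
% Tends(B, G): state-type B tends to bring it about that G (assumed label).
% Because(X, Y): X obtains, at least in part, because Y obtains (assumed label).
% b_now: the present token of B (assumed notation).
\begin{align*}
\text{(1)}\quad & \mathrm{Tends}(B, G) \\
\text{(2)}\quad & \mathrm{Because}\bigl(\mathrm{Occurs}(B),\ \mathrm{Tends}(B, G)\bigr) \\
\text{(2$'$)}\quad & \mathrm{Because}\bigl(\mathrm{Occurs}(b_{\mathrm{now}}),\ \text{past instances of } B \text{ tended to bring about } G\bigr) \\
\text{Informational:}\quad & \mathrm{Because}\bigl(\mathrm{Exists}(\varphi),\ \mathrm{Carries}_{v}(\varphi, \psi)\bigr)
\end{align*}
```

In (2) the explaining fact is the tendency itself, which on the present account is a timeless fact; in (2') it is a fact about the actual history of B. The informational clause has the same higher-order shape as (2), with information carriage in place of production.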


From Woodfield and Bedau to Aristotle

Woodfield (1976) argues that the Taylor/Wright account gives a necessary, but not a sufficient, condition for teleofunctionality. He urges that we must add a normative element requiring that the functional state contribute to the wellbeing of the organism. An example created by Alvin Plantinga (1993) gives some support to Woodfield's contention. We are to imagine a world in which a Nazi-like regime institutes a dysgenics program aimed at a hated minority race. A harmful mutation is introduced into the minority population that renders the bearer nearly blind, and makes attempted seeing painful. The Nazi breeders gradually eliminate all of the members of the minority race without the gene, by testing for signs of faulty and painful vision. In such a case, the defective gene appears to satisfy Wright's criterion, since part of the causal explanation of the presence of the gene in the population is the deleterious effect of the gene on the bearer's vision. Yet, it would seem odd, at the very least, to say that the gene had the function (and not just the effect) of impairing vision.



There are a number of other examples that also suggest that the Wright definition is too broad. Any stable feature of the inanimate world characterized by feedback loops, that is, any genuine case of dynamic equilibrium, will be describable as instantiating teleofunctionality, according to Wright's definition. Suppose, for example, that the presence of ice in a rock crevice causes the crevice to remain open (this example was suggested by Anil Gupta in conversation). In this case, the existence of ice in the crevice is caused by the power of the ice to keep the crevice open. The ice has the Wrightian function of keeping the crevice open. Similarly, if the rapid flow of water in a channel keeps the channel from silting up, we would have to say that the water flow had the function of preventing the deposition of silt, since in the absence of that causal connection, the silt would prevent the water from flowing so rapidly. In these cases, Woodfield would argue, there is no genuine teleofunction, since ice deposits and water flows have no welfare. If we merely add the condition of welfare-enhancement to Wright's definition, however, we would seem to have only a verbal difference, one definition for Wright-functions, and another for Woodfield-functions, with the dispute concerning only the appropriate meaning for the English word 'function'. It is possible, however, to reconstrue Woodfield's position as an alternative metaphysical account. We could take Woodfield as claiming that there is a metaphysically distinguished class of Wright-functions: those that exist because they contribute to the welfare of their bearers. Such an account gives a real causal role to the property of goodness (goodness for some kind of organism), resulting in something very close to Plato's theory of the good. Mark Bedau (1992) has also argued that an evaluative element is essential to teleology. Bedau distinguishes "three grades of evaluative involvement." 
In the first grade of involvement, we define the proper function of φ to be ψ by requiring that φ brings about ψ, and ψ is good. This adds goodness to a pre-Wrightian, dispositional account of function. In the second grade, we incorporate Wright's definition and add that ψ is good as an additional and separate condition. That is, we require that the thing has φ because φ brings about ψ, and, in addition, that ψ is good. Finally, in the third grade, we include the goodness of ψ within the causal explanation of φ: the thing has φ because both φ brings about ψ and ψ is good.1

1 John Searle is another defender of the thesis that teleological judgments presuppose prior normative judgments (Searle (1995)). He argues, for example, that our judgment that the function of the heart is to pump blood presupposes a prior commitment to the goodness of life itself.

Let γ(v) represent the situation-type in which the welfare of the type of organism whose time-slices are of type v is optimized. We could then define a third-grade or Platonic function (relative to kind v) as one in which the end promoted also promotes γ(v), and the fact that it does so is also causally relevant to the existence of the functional state. This additional condition, which we can call the 'Platonic condition', requires that there be a causal connection between ψ (the Wright-functional end of φ) and the welfare of the organism (qua member of the background kind v). This third-grade, Platonic account of teleofunctions could be combined with a eudaemonistic conception of the good: a theory that the welfare of any organism simply consists in the fulfillment of all of its potential Platonic functions. This is not a trivial condition, despite the fact that the definition of Platonic function makes reference to welfare. The definition of Platonic functions leaves open the question of what the good of an organism consists in. Eudaemonism would add to this definition the thesis that the good consists in the fulfillment of some particular subset of the organism's Platonic functions. A Platonic eudaemonism would put Wright functions and the good on a par ontologically: neither could be reduced to the other. Although the Platonist could not give an ontological reduction of the good to the functional, it is still a substantive claim about the good to require that it be identified with the fulfillment of some subset of the Wright-functions of the organism. In addition, the Platonic account is compatible with the claim that, epistemologically speaking, it is possible to learn about the good of an organism by discovering its Wright-functions. It might well be that nearly all Wright-functions are also Platonic functions: that identifying a state as a Wright-function gives us good prima facie grounds for identifying it as a Platonic function as well. Conversely, it may be that in many cases, identifying a state as conducive of the good of the organism gives us good but defeasible grounds for supposing the state to be one of the organism's Wright-functions.
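Bedau's three grades of evaluative involvement, as described above, differ only in where the goodness condition sits relative to the causal explanation. The following schema is a sketch under assumed labels (Brings, Has, Good, Because); it is not Bedau's own notation.

```latex
% phi: the functional state; psi: its end.
% Brings, Has, Good, Because are labels assumed here for illustration.
\begin{align*}
\text{Grade 1:}\quad & \mathrm{Brings}(\varphi, \psi) \;\wedge\; \mathrm{Good}(\psi) \\
\text{Grade 2:}\quad & \mathrm{Because}\bigl(\mathrm{Has}(\varphi),\ \mathrm{Brings}(\varphi, \psi)\bigr) \;\wedge\; \mathrm{Good}(\psi) \\
\text{Grade 3:}\quad & \mathrm{Because}\bigl(\mathrm{Has}(\varphi),\ \mathrm{Brings}(\varphi, \psi) \wedge \mathrm{Good}(\psi)\bigr)
\end{align*}
```

The step from grade 2 to grade 3 is a change of scope: in grade 3 the goodness of the end falls inside the explanation of why the thing has the functional state.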
If the Platonic condition is satisfied, then transcendent goodness (that is, goodness by a transcendent standard, one that is not reducible to other facts) would be connected to the causal network of the world, not in the sense that something's being good (by this transcendent standard) gives the thing some new causal power, but in the sense that the existence of certain properties of things is to be causally explained in terms of their contribution to the well-being of the things possessing them. Thus, goodness or well-being would have an indirect, second-order causal relevance to concrete events. There is an alternative, somewhat more deflationary account of the role of goodness in a Bedavian third-grade definition of teleology. A thing is capable of well-being just in case the sum of its Wright-functions forms a highly coherent, mutually supportive totality. A Wright-function counts as a genuine teleofunction just in case it coheres in this sense with the well-being of its possessor. This sort of an account also has echoes of Platonic themes, in this case the close connection for Plato between well-being and harmony. A thing, like an organism, with a largely harmonious set of Wright-functions is capable of well-being; inanimate objects, with largely unrelated, discordant Wright-functions, are not.2 Plantinga's example of the dysgenic gene can be excluded, since, although the gene does have a Wright-function, this function does not cohere well with the rest of the Wright-functions of its human hosts.

2 The harmony, or homeostatic clustering, of human goods plays a central role in Richard Boyd's version of moral realism (Boyd (1997)).

There is one more refinement that needs to be made, bringing this deflationary account closer to the Platonic one. We need to distinguish between those cases in which the Wright-functions of a thing are harmonious, but the harmony of the functions is merely coincidental, and those cases in which the harmony of the Wright-functions is itself functional, contributing, perhaps, to the adaptive fitness of the organism. According to the deflationary account, harmony is constitutive of the good. Hence, both cases are cases of organisms with a standard of well-being. Alternatively, we might insist that the harmony of Wright-functions must itself be explained by reference to the good. This moderate position we might call an "Aristotelian" theory of the good. According to this account, we can define the good of a thing in the following way:

Aristotelian Definition of the Good: A thing has a good if and only if it has proper functions. The good of a thing consists in the successful exercise of its primary proper functions.

Aristotelian Definition of Proper Function: A state φ has the proper function ψ in kind v if and only if:

1. The fact that things in kind v have state φ is causally explained (at least in part) by the existence of a causal law linking (φ & v) to ψ as cause to effect (Wright's condition).
2. The system of functions (φ_i, ψ_i) meeting condition (1) for v forms a mostly harmonious, mutually supportive whole, and the (φ, ψ) function contributes to this harmony.
3. The existence of things of kind v is causally explained (at least in part) by the harmony mentioned in condition (2).

This Aristotelian definition is stronger than the deflationary account, since it requires more than the bare fact of the existence of a harmony among Wright-functions. At the same time, it takes on much less ontological burden than the full-blown Platonic account, since it does not have to postulate goodness as a primitive causal factor that explains the existence of Wright-functions.
Its combination of sober realism with ontological moderation seems to justify calling it "Aristotelian," at least in inspiration. Bedau argues that biology makes use only of first- and second-grade functions. He denies that third-grade functions have a legitimate place in the modern, scientific picture of the world. However, he reaches this conclusion because he overlooks the possibility of an Aristotelian version of third-grade evaluative involvement. In fact, it is the third grade, understood in this deflationary way, that is needed to distinguish the functionality of organisms and artifacts from self-perpetuating equilibria in the inanimate world. For an organism to have a harmonious set of functions, it is not necessary that it have no dysfunctional features, nor do we need to exclude the existence of a moderate degree of competition and interference between the organism's various functions. Let us say that function x harmonizes with system S just in case, for many, but not necessarily all, members y of S, the fulfillment of x increases the probability of the fulfillment of y, and, for most but not necessarily all members y of S, the fulfillment of x does not significantly decrease the probability of the fulfillment of y. A system of functions S is harmonious if nearly every member x of S harmonizes with S − {x}. This definition of harmony is not entirely successful, however, because it does not take into account the existence of secondary and tertiary functions. For example, the body may respond functionally to a condition in which it has suffered massive injuries by radically lowering the metabolic rate. This functional response is fulfilled only when many other functions have failed; hence, the fulfillment of this secondary function significantly lowers the probability of the fulfillment of most of the body's functions, since it entails that these functions have in fact failed. It is possible that an organism could exist most of whose functions were secondary ones. In response, let us say that a function x compensates for a set of functions T just in case the successful fulfillment of x entails that none of the members of T are fulfilled and is causally posterior to the failures of the members of T. A function x meta-assists y relative to T just in case x compensates for T and the fulfillment of x increases the probability of the fulfillment of y, conditional on the failure of the members of T. We can then weaken the definition of harmonizing with a system by requiring only that the function meta-assist some of the members of the system, relative to some proper subset of the system. A system is harmonious if most of its members harmonize (in the new, weaker sense) with the remainder of the system, and many of its members harmonize (in the first, stronger sense) with it.
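The definitions of harmony, compensation, and meta-assistance just given can be restated compactly. This is a sketch under stated assumptions: F_x abbreviates the event that function x is fulfilled, P is objective probability, and "increases the probability" is read here as a comparison of conditional with unconditional probability, which is one natural reading but an assumption. The vague quantifiers ("many", "most", "nearly every") are left in words, as in the text.

```latex
% F_x: the event that function x is fulfilled (assumed abbreviation).
% \not\ll is used for "not significantly less than" (assumed notation).
\begin{align*}
x \text{ harmonizes with } S \iff{}& \text{for many } y \in S:\ P(F_y \mid F_x) > P(F_y), \text{ and} \\
& \text{for most } y \in S:\ P(F_y \mid F_x) \not\ll P(F_y) \\
S \text{ is harmonious} \iff{}& \text{nearly every } x \in S \text{ harmonizes with } S \setminus \{x\} \\
x \text{ compensates for } T \iff{}& F_x \text{ entails } \textstyle\bigwedge_{t \in T} \neg F_t, \text{ and } F_x \text{ is causally posterior to those failures} \\
x \text{ meta-assists } y \text{ relative to } T \iff{}& x \text{ compensates for } T, \text{ and} \\
& P\bigl(F_y \mid F_x \wedge \textstyle\bigwedge_{t \in T} \neg F_t\bigr) > P\bigl(F_y \mid \textstyle\bigwedge_{t \in T} \neg F_t\bigr)
\end{align*}
```

The last clause shows why secondary functions need separate treatment: the comparison is made conditional on the failures in T, so a compensating function is not penalized for entailing failures that have already occurred.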
An organism fighting off an infection, or infested with a parasite, is the locus of two disjoint systems (its own and the parasite's), each internally harmonious, and each in conflict with the other. In cases of symbiosis, we can identify two disjoint systems, even though they are mutually supportive, since the ancillary connections between the two systems are much fewer and weaker than those within each one. Cases such as that of the mitochondria lie on the vague boundary between organic unity and close, long-established symbiosis. Any organism will suffer from a certain degree of dysfunctionality. The standard is one of substantial harmony among functions, not ideal or optimal harmony. The function of x is determined not by working out what x is optimally designed for, but by working out whether the most likely explanation for the origin of x involves a causal connection between x and some effect. For example, there are cases of selfish DNA, genes that take control of the gene replication process, producing multiple copies of themselves on the chromosome, despite the fact that they interfere with the organism's fitness. These selfishly antisocial genes constitute a kind of self-perpetuating genetic illness, a chromosomal parasite. The existence of such imperfections in the chromosomal system does not pose any challenge to the obvious fact that the function of the system includes cell reproduction and protein synthesis. For the purposes undertaken in this book, I will use the Aristotelian definitions of good and of proper function as my working hypotheses. I believe that the Aristotelian definition is weak enough to include as proper functions everything we would want to attribute as such to organisms and to artifacts, while excluding any property in the inanimate, natural world as functional.

12.1.3 Natural Selection Accounts

Very roughly, Millikan (1984) defines the relation of functionality in terms of actual contribution to the survival and reproduction of the organism's ancestors. The eye has the function of registering information of a certain kind because of the fact that similar organs in the ancestors of the organism in question contributed to the successful reproduction of those ancestors by registering such information. Millikan's account is explicitly retrospective, which invites certain kinds of objections. The first appearance of a new adaptation is always nonfunctional, since it cannot acquire a function until it has actually contributed causally to successful reproduction. This applies even to artifacts: if I design a widget to perform a task, and it does so and in the very way that I envisaged, it still does not have that function until its success at meeting the need for such functionality results (say, through the marketplace) in the reproduction of duplicate widgets. In addition, on Millikan's account, once a function has been acquired, it can never be lost. The sightless eyes of cave fish still have the function of seeing, and words of contemporary English still carry the meanings of their Indo-European roots. These results seem counterintuitive. One solution would be to make Millikan's account prospective instead, as Bigelow and Pargetter (1987) have done. On their account, a state has a particular function if the fact that it tends to produce this result enhances the reproductive fitness, here and now, of the organism in question. It is not clear that this strategy will work, however, since it is unclear what "reproduction" can mean in a purely prospective sense. Millikan has the advantage of being able to make reference to an already existing family of similar, self-perpetuating structures. 
Since everything is similar to everything else in some way, it is unclear what "the reproduction of x" can mean, in the absence of some already existing class of organisms to which x belongs. In addition, it is unclear how to define the 'present environment' of the organism, and unclear what should provide the baseline for comparison. Bigelow and Pargetter tell us that the adaptation should improve the reproductive fitness of the organism, but what sort of situation provides the benchmark against which we are to measure improvement or deterioration? There is, however, a more fundamental problem with all of these accounts: the fact that they make the truth of Darwinism a matter of ontological necessity. Surely it is possible, in some suitably broad sense, that functional organisms come into existence in the way described in the book of Genesis, even if this is not the way things happened in the actual world. Moreover, it would seem to be possible for there to exist what Richard Sorabji (1964) calls "luxury functions": functions that do not in fact enhance the reproductive fitness of their bearer, and that did not enhance the reproduction of its ancestors. For example, the capacity to appreciate beauty for its own sake, or the ability to track the truth



in metaphysical domains, may be genuine functions of the human mind that have nothing to do with reproductive fitness. It is at least possible that such functions exist; our fundamental account of the nature of function should not exclude these possibilities. Moreover, all accounts involving natural selection, whether retrospective or prospective, have the drawback that they cannot count artifacts that are the product of an original act of intelligent design as having a function. If an inventor designs a new kind of mousetrap, the mousetrap does not have the function of catching mice (according to these natural-selection accounts) until it has been reproduced in response to demand driven by success in actually catching mice (on the retrospective account), or unless it has the propensity of being reproduced for this reason (on the prospective account). The Wright-based definitions of function have the advantage of covering both the products of natural selection and those of one-off intelligent design (whether or not they have been or are likely to be reproduced), without the need for any gerrymandered disjunctivity. On Wright's account, the mousetrap has the function of catching mice so long as its propensity to do so is a cause of the inventor's constructing it as he did. An actual history of catching mice, or a likelihood of being reproduced in the future in response to future success in doing so, is not required. It is far more plausible to take natural selection as a mode of explaining how it is that functions exist in the world, not as an account of what it is for something to be a function. Neander (1991) has defended a natural-selection account of teleology as an analysis of the concept of function, as it figures in the thinking of contemporary biologists. According to Neander, in the specialist language of contemporary biologists, the word 'function' just means 'selected for by nature'. 
If contemporary biologists have made the truth of Darwinism a matter of stipulative definition, so that to deny the neo-Darwinian synthesis, one would have to deny that biological functions exist, then this would constitute an unjustifiable form of dogmatism, setting up a conceptual barrier to any future theory that might prove superior to the contemporary synthesis. This stipulation would make rational dialogue between Darwinists and contemporary or future critics impossible, since supplanting the present theory would require a conceptual and linguistic revolution. Moreover, the notions of 'function' and 'natural purpose' have roles to play far beyond the narrow world of biological specialists. Functionality is an important concept in our commonsense view of the world, and it is needed (I will argue) in an adequate theory of epistemology and ethics. The content of such a widely used concept cannot be settled by the linguistic conventions of a specialized community.




Darwin: Real or Only Apparent Functionality?

Darwin's theory of natural selection has been taken in two quite opposing ways on the question of its bearing on teleology. The American biologist Asa Gray took Darwin's theory as vindicating the reality of biological teleology, and, in a letter to Gray (Gilson, 1984, pp. 80-87), Darwin himself seems to endorse this inference. In contrast, many philosophers and scientists, including, most recently, Richard Dawkins and Daniel Dennett, have taken the upshot of Darwin's theory to be that all biological functionality is merely apparent, with natural selection explaining the existence, not of real teleology, but only of its appearance in nature. These two conclusions are most probably based on two different understandings of the nature of teleology. It would seem that those taking the Dawkins/Dennett line assume that the existence of a function entails the existence of a designer or creator, whose prior intentions, or whose intentions plus their effective realization, constitute the functional character of the product. Alvin Plantinga, in his recent book Warrant and Proper Function (Plantinga (1993)), explicitly affirms the existence of this implication. I have two reasons for demurring. First, it seems that something like the accounts of Wright or Woodfield are adequate characterizations of functionality, with the products of intentional design clearly falling under the definiens, without necessarily exhausting its extension. Second, I hope to give an account of intentionality in terms of teleofunctionality (roughly, a state represents a fact just in case it has the function of carrying the corresponding potential information), so accepting Plantinga's analysis would doom such an analysis to vicious circularity. Consider again the case of the bird's wing's having the function of enabling flight. The causal connection between the presence of wings and flight was itself a higher-order cause of the successful survival of winged ancestors of existing birds. 
A given stage of a winged-bird organism is caused to be bird-stage, and hence is caused to be winged, by these earlier successes in survival. Thus, there is an indirect causal connection between the causal connection between wings and flight and the presence of wings in the given specimen. Wright's definition is satisfied. Moreover, the bird's Wright-functions form a harmonious system, and this harmony itself contributes to the bird's fitness. The connection via natural selection is indirect and retrospective. If all actual teleology is explainable by natural selection alone, we should deny the existence of real (as opposed to merely apparent) "luxury functions." However, this would be a consequence, not of an ontological theory (as in Millikan's case), but of biological theory. Functions that are explained by natural selection are indirect and retrospective, but so are functions that are explained by the intentions of a designer. The intentions of the designer mediate between, on the one hand, the causal connection between the trait and its effect, and, on the other hand, the existence of the functional trait in the product, just as the evolutionary history of an organism

mediates between these two in the case of natural selection.


Retrospective and Non-Retrospective Accounts

So far, all of the definitions of teleology we have considered have been retrospective in nature, in the sense that the function of a thing depends upon what was involved in causing certain features of that very thing. This would mean that teleofunctions do not supervene on the internal organization of a thing: two internally indistinguishable systems could have different functions due to differences in the causal histories involved. For example, a swamp-bird that forms spontaneously, without evolutionary history, has swamp-wings that, unlike birds' wings, do not have the function of enabling flight, even if the swamp-bird does soar about with apparent facility. Many philosophers, including Dretske and Millikan, are content to bite the bullet of this consequence. My inclinations are to try to dodge it. Moreover, there is another problem with retrospective natural-selection accounts of teleology. Ironically, they propose an essentially neo-Lamarckian conception of function. According to Lamarckian theory, use must always precede function. It is only after a particular structure or behavior has proved its usefulness in practice that it can be incorporated into the set of adaptations of the individual or population. In contrast, neo-Darwinian theory opens the door to the possibility that a function can emerge spontaneously, by fortuitous mutation. Natural selection explains not the origin or nature of the function, but its successful perpetuation. This issue is particularly acute when one attempts to understand systems of interdependent functions. Consider, for example, the mutually presupposing functions of sexual reproduction. The function of the sperm is to fertilize the ovum; and the function of the ovum is to receive the sperm. Neither can operate before the other is functional. Hence, it is incoherent to insist that the gametes cannot be functional until past instances of each have successfully been used in reproduction. 
Indeed, in this case, the difficulty for the natural-selection account of functionality is especially acute, since there can be no such thing as gametes before the functional system of sexual reproduction has been established. Hence, the functionality of gametes cannot be explained in terms of the previous history of gametes, since there could not, by the very nature of the case, be such a thing. It is possible to define the function of a thing without building in any conditions about the actual causal history of that very thing. Let us say that the Aristotelian definition of function given above is the definition of 'etiological function'. Then, we can say that some feature A of some thing x of kind K has function F just in case the objective probability is greater than one-half that something with the internal organization specified by K would have been caused in such a way as to make F the etiological function of A. For example, the swamp-bird belongs, by virtue of its internal organization, to a class of



things B of such a kind that the objective probability is greater than one-half that an arbitrary member of B came into existence through the kind of natural selection responsible for the existence of ordinary birds. Therefore, even though the swamp-bird came about in a very unusual way, a way in which the causal powers of its wing-like appendages had no role, we can still say that the function of these appendages is to enable the swamp-bird to fly. In contrast, if natural processes accidentally produce something internally indistinguishable from a very crude arrowhead, we do not have to say that its function is to act as the point of an arrow, since the objective probability of the accidental production of such a system is non-negligible. The difference between the swamp-bird and the arrowhead-like stone lies in the astronomical difference in the objective probabilities of the spontaneous generation of each. In the case of systems of interdependent functions, such as those of the sperm and the ovum, each individual gamete, even the very first, is such that it is very likely that something so organized resulted from a process that included successful reproduction (i.e., favorable natural selection). Even though the original gametes had no such selective history, it is far more likely (in terms of objective chance) that something so organized is one of the many successful descendants of the original mutants than that it is a product of favorable mutation. Thus, the original gametes were fully functional, despite the fact that their actual history included nothing that satisfies Wright's higher-order condition.
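The non-retrospective definition just given can be stated compactly. What follows is only a sketch: the predicate names Func, Etio, and Org are introduced here for illustration and are not part of the text's own vocabulary, and "objective probability" is left unanalyzed, as it is above.

```latex
% Notation assumed for illustration only; none of these symbols appear in the text.
% Etio(A, y, F): F is the etiological (Aristotelian) function of feature A in y.
% Org_K(y):      y has the internal organization specified by kind K.
\[
  \mathrm{Func}(A, x, F)
  \;\iff\;
  x \in K \;\wedge\;
  \Pr\bigl(\mathrm{Etio}(A, y, F) \,\bigm|\, \mathrm{Org}_K(y)\bigr) > \tfrac{1}{2}
\]
```

On this reading, the swamp-bird's appendages have the function of enabling flight because nearly everything organized as it is acquired wings through natural selection. The crude arrowhead-like stone fails the condition because the chance of its accidental production is non-negligible.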


Extrinsic Functions and the Extended Phenotype

It is harmonious systems of Wright-functions that give rise to teleology. These systems are not entirely internal to the body of a given organism: instead, they embrace the pattern of interaction between the organism and its environment. Features that explain their own existence via their contribution to survival and reproduction I call "intrinsic functions." Features that explain the perpetuation of the organism/environment system, but that would still exist in the absence of that system, I call "extrinsic functions." For example, the structure of the human heart has the intrinsic function (relative to the human system) of pumping the blood. The presence of oxygen in the atmosphere (relative to that same system) has the extrinsic function of providing oxygen to the bloodstream through the lungs. The human phenotype extends into the entire ecological niche belonging to the human system. Teleological functions supervene on the present state of the extended phenotype, and not on the internal state of the organism itself. For example, the intrinsic function of the coloration of the viceroy butterfly is to mimic the appearance of the poisonous monarch, while the coloration of the monarch has no such function, and would not have it even if the chemical basis for color were identical in the two species. The same structure can have two different functions, by belonging simultaneously or successively to two different extended phenotypes. The fat of the mouse serves the intrinsic function of storing energy for the mouse and the extrinsic function of providing nutrition to the cat. The shell of the mollusk first serves the intrinsic function of protecting the mollusk and then, after being abandoned and adopted by the hermit crab, it serves the extrinsic function of housing the crab.


Our Knowledge of Teleology

Our knowledge of teleological connections can be either direct or indirect. If a particular instance of functionality is mediated by natural selection or by a designer's intentions, then we can discover the functionality by discovering the mediation. If I can demonstrate that a particular organism would not long survive if a molecule did not have a particular effect, then I can reasonably infer that the molecule has that effect as one of its functions. Similarly, if I learn that a competent designer intended the artifact to crush olives, and it does in fact crush them, and in the way envisaged, then I have learned that crushing olives is one of its functions. It is possible to have direct knowledge of a teleological connection, regardless of whether the connection is itself direct or indirect. This direct knowledge is simply a matter of inferring a simple causal generalization from a large and variegated sample of instances. Suppose that we find, in the members of a population $v$, a large number of factors, $\phi_1, \phi_2, \ldots, \phi_n$, each of which, independently of the others, promotes some effect $\psi$. In such a case, it is reasonable to infer that the presence of each factor $\phi_i$ is caused by the causal connection between $\phi_i$ and $\psi$, that is, that each of the $\phi_i$'s has the function of $\psi$-ing. This conclusion can be further confirmed by finding analogous cases: similar populations $v'_1, \ldots, v'_m$, in each of which there is a family of properties having common effect $\psi_j$ (similar to $\psi$). It can also be confirmed by fitting the functional connection between $\phi$ and $\psi$ into a network of coherent, mutually supportive functions. William A. Dembski's recent monograph (Dembski (1998)) delineates precise criteria for excluding chance as the source of such teleological patterns. It is important to recognize that, despite what Daniel Dennett has said about the "intentional stance," it is not necessary to discover states that optimally realize some end. 
Imperfect functions can be discovered with as much objectivity and certainty as can optimally designed functions. It is not required that the various factors that promote some end $\psi$ do so optimally: all that is required is that there exist a number of separate factors whose existence can be economically explained by reference to their common effect. Once it has been established that such a teleological connection exists, the focus of inquiry can then be shifted to discovering whether the connection is unmediated or mediated by natural selection or explicit intentionality. The conclusion that an unmediated connection exists can be supported not only by the failure to find a mediating mechanism, but also by the discovery of simple laws of teleology from which a wide range of actual connections can be deduced.




Teleological Natural Kinds

A natural kind, especially in biology, is best characterized in terms of the functions of its members. The core of a natural kind is a kind of fixed point. A class A is such a core just in case the following relationship holds. Let f(A) be the set of functional connections instantiated by every member of A. Then, A is a core of a natural kind if and only if A is identical to the set of individuals instantiating every member of f(A). The natural kind of which A is the core consists of all those things instantiating nearly all of the functional connections in f(A) (an admittedly fuzzy set). In the case of sexual and other social animals, the situation is somewhat more complicated. Certain teleofunctions, such as the male and the female sexual function, are exercised not at the level of individual organisms, but rather at the level of groups of individuals. To get a comprehensive picture of the teleofunctions involved, we must move from the organismic level to the population level, as Elliott Sober (1984) has argued. However, Sober erred in setting population thinking against essentialism, as though the two were inherently incompatible. In fact, Aristotle himself recognized, although perhaps with some degree of confusion and indistinctness, that for political animals such as human beings, a full account of their natures could be given only by studying the structures of their societies. Hence, the Politics is the indispensable companion to Aristotle's Ethics (and even in the Ethics, the group phenomenon of friendship plays a crucial role). Even granting the importance of sexual, social, and other population-level functions, it remains the case that there must be a fairly high degree of overlap between the proper functions of the various individual organisms belonging to a single natural kind. If we do not require such overlap or similarity, we will erroneously count members of two symbiotic species as belonging to the same natural kind.
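The fixed-point condition on the core of a natural kind can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the toy individuals and functional connections, the function names, and in particular the numerical `threshold` standing in for "nearly all," which the text deliberately leaves fuzzy.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical toy data: the functional connections each individual instantiates.
INSTANTIATES: Dict[str, FrozenSet[str]] = {
    "robin":   frozenset({"fly", "lay_eggs", "feather_insulation", "sing"}),
    "sparrow": frozenset({"fly", "lay_eggs", "feather_insulation"}),
    "kiwi":    frozenset({"lay_eggs", "feather_insulation"}),
    "bat":     frozenset({"fly", "echolocate"}),
}


def shared_functions(members: Set[str], inst: Dict[str, FrozenSet[str]]) -> FrozenSet[str]:
    """f(A): the functional connections instantiated by every member of A."""
    if not members:
        raise ValueError("A must be non-empty for f(A) to be well defined here")
    return frozenset.intersection(*(inst[m] for m in members))


def instantiators(functions: FrozenSet[str], inst: Dict[str, FrozenSet[str]]) -> Set[str]:
    """All individuals that instantiate every connection in `functions`."""
    return {x for x, fs in inst.items() if functions <= fs}


def is_core(members: Set[str], inst: Dict[str, FrozenSet[str]]) -> bool:
    """A is a core iff A equals the set of individuals instantiating all of f(A)."""
    return instantiators(shared_functions(members, inst), inst) == set(members)


def kind_of(core: Set[str], inst: Dict[str, FrozenSet[str]], threshold: float = 0.6) -> Set[str]:
    """Individuals instantiating 'nearly all' of f(core).

    The numerical threshold is an assumption; the text only says 'nearly all'.
    """
    fa = shared_functions(core, inst)
    return {x for x, fs in inst.items() if len(fa & fs) >= threshold * len(fa)}


print(is_core({"robin", "sparrow"}, INSTANTIATES))          # True
print(is_core({"robin", "kiwi"}, INSTANTIATES))             # False
print(sorted(kind_of({"robin", "sparrow"}, INSTANTIATES)))  # ['kiwi', 'robin', 'sparrow']
```

In this toy data, {robin, kiwi} fails to be a core because the sparrow also instantiates everything the two share. The flightless kiwi still falls within the kind whose core is {robin, sparrow}, since it instantiates two of the three shared connections.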



Causal Theories of Mental Content

In the early 1980s a number of accounts of mental content involving teleology and proper function were developed. I will concentrate here on two representative theories, those of Millikan and Dretske, and on an influential critique of the teleological turn by Fodor.

13.1 Millikan
In Language, Thought, and Other Biological Categories (Millikan (1984)), Millikan proposes both a causal theory of teleological or proper function and a teleological account of intrinsic meaning or intentionality. Very roughly, Millikan proposes that a feature F has proper function P for an organism x (or a member of some other sort of reproductive kind) just in case F's causing P in the ancestral history of x contributed to the survival or reproductive success of x's ancestors. F's causing P is something for which F was selected by nature (in Elliott Sober's terminology). I argued in chapter 14 that Millikan's theory confuses the definition of teleology with its explanation, and that a much simpler and more comprehensive definition is possible along the lines of work by Charles Taylor and Larry Wright. Millikan proposes that the content of a sign consists of those conditions in the world to which the sign is supposed to "map." For instance,

. . . the sense of an indicative sentence is the mapping functions (informally, the "rules") in accordance with which it would have to map onto the world in order to perform its proper function or functions in accordance with a Normal explanation (italics hers). (Millikan, 1984, p. 11)

It is never very clear (at least, not to me) exactly what a "mapping function" is, or what it means for a sign to "map" onto the world. We know that



this is what a sign is supposed to do, and that when it does it, the sign is true (if a complete sentence) or at least significant (bearing a "real value"), but the notion of mapping is seriously underspecified in Millikan's work. In chapter 16, I identify the content of a natural representation with the information that the representation is supposed to carry robustly. The notion of robust carriage of information, in turn, was made precise in chapter 11. My account of information could be taken as one way of filling in the details in Millikan's notion of "mapping."



In Explaining Behavior (Dretske (1988)), Fred Dretske argues that representation is possible only as a result of a learning process by which an organism becomes attuned or calibrated to its environment. Dretske rejects the idea that a teleological explanation based on natural selection and evolutionary history can be explanatory of behavior, and, consequently, he rejects such functionality as the basis for an account of mental content. Dretske bases this rejection of the possibility of inherited intentionality on Sober's distinction in The Nature of Selection (Sober (1984)) between developmental and selectional explanations. Sober argues that natural selection cannot explain why a given organism behaves as it does. Instead, it explains why there exist only organisms who behave in certain ways. Natural selection does not make existing organisms behave the way they do: it eliminates organisms who act differently. I can explain the fact that all my friends drink martinis in two different ways: developmentally, by explaining the origin and evolution of each of my friends' tastes in alcoholic drinks, or selectionally, by demonstrating that any would-be friend who does not drink martinis is eliminated from contention on that basis. Only the former sort of explanation really explains the friends' behavior. However, I think this argument moves a little too quickly. We should distinguish between relational and absolute selection. Relational selection explains why nearly every organism of some kind in a certain location, or bearing some other relation to certain reference markers (e.g., by being a friend of mine), possesses a certain property. Absolute selection explains why nearly every organism of some kind in existence possesses a certain property. Absolute selection does explain the existence of the behavior that has been selected for, and so should count as a legitimate form of explanation of that behavior. 
It is odd, to say the least, to insist, as Dretske does, that the calibration that occurs in the lifetime of an organism confers content on its states, but that the calibration that occurs over the span of evolutionary history cannot. Dretske's account of the basis of the mental content of beliefs is very similar to my own. Dretske proposes that a belief-type M represents that P is the case just in case M has the function of indicating that P is the case, i.e., has the function of carrying the information that P is the case. The account that I develop in chapter 16 differs from Dretske's account in two ways: (1) it defines content in terms of having the function of robustly carrying the information



that P, and (2) it introduces a relation-parameter, fixing the relation between the mental event-token and the event-token that is the object of representation. The first difference means that when a belief fulfills its functions, it becomes thereby not just a true belief, but a case of knowledge. The second difference is critical in order to avoid Liar-like paradoxes, besides the fact that it enables the account to cohere well with recent work in philosophical linguistics (including situation semantics and discourse representation theory). Dretske's account of desires differs quite fundamentally from mine. I take a desire to have representational content: a desire for X is a state that represents the fact that it would be good for me to have X in the near future. Beliefs and desires interact with the faculty of the will, whose function is to generate intentions and volitions that successfully guide one toward the fulfillment of one's functions (eudaemonia). A volition to $\phi$ is a state whose proper function is to cause the performance of an act of $\phi$-ing. Thus, I follow Bratman (1987) in rejecting a simple belief/desire model of intentional action. Dretske (1990) defines a desire for X as a state that reinforces behavior that results in X, or, more precisely, a state D is a desire for X just in case past tokens of D reinforced certain forms of behavior exactly because that behavior resulted in tokens of X. I would contend that Dretske's account is an account of tropism, not desire. Dretske's belief/desire model of behavior is really a perception/tropism model. A desire is something that can be weighed and evaluated, acted upon or resisted. A desire is one way in which a future state can be represented as good, but it is not the only way. A human can act against all of her present desires, when she believes (as a result of inference or intuition) that the greatest good demands doing so. 
Dretske's model leaves out the role of the will entirely, oversimplifying the phenomena. Dretske argues that he can explain the causal efficacy of mental content, since the fact that mental event-type M indicates P can be used in causal explanations of the fact that M results in behavior of type B. The argument goes something like this: The fact that M is a reliable indicator of P is part of the explanation of how M became connected to behavior B as a result of learning (operant conditioning). However, as Lynn Baker and others have pointed out, there are at least two difficulties with this argument. First, at best it shows that what M indicates (the information M carries) is causally relevant to behavior. It does not establish that M's having the content it does, its having the function of indicating P, is causally relevant to behavior. Second, it does not show that the fact that a given token of M indicates P is causally relevant to that token's causing a token of B. What Dretske's argument shows is that the fact that past tokens of M indicated P is relevant to a present token's causing B. Dretske's response to these objections is to argue that the present token of M's having the content it does consists in the facts about the token's history that explain the connection between M and B in terms of M's indicating P. Consequently, to say that a certain fact about past tokens (namely, the fact that these past tokens of M became connected to B in the organism because they indicated P) is causally relevant to the present production of B by a token of M is just to say that the fact that the present token of M has the content



P is causally relevant to the production of B: the two facts are one and the same (Dretske (1991)). However, Dretske is confusing two things: (1) the past history of the present token of M that establishes the connection between M, P, and B and (2) the present token's having this history. The first of these is causally relevant to the production of B, but the second is not. According to Dretske's definition, having the content of P is a kind of "Cambridge property": it is not an intrinsic property of a present token of M. It is like the property of being an x such that Clinton is President: a property that really characterizes Clinton, or the world, not x. In chapter 16, I argue that the causal efficacy of mental content depends on the existence of higher-order functions. For example, our faculties of inference involve such higher-order functions: these faculties have the function of responding to other functional states (beliefs) according to their contents. The contents of beliefs thereby become causally efficacious by virtue of triggering the operation of these higher-order functions. Lower animals such as worms, without such higher-order functions, have states with mental content (perceptions, reflexes, and so on), but these contents are not causally efficacious. They help us to understand the worm's behavior functionally, but they contribute nothing to our understanding of the etiology of the behavior. When dealing with cognitively sophisticated organisms, by contrast, knowledge of mental contents can be used in constructing perfectly accurate stories about the causal links in the production of behavior, since some higher-order functions are associated with dispositions to respond to mental content as such.


Fodor's Critique of Teleological Semantics

In A Theory of Content and Other Essays (Fodor (1990)), Jerry Fodor lodges two principal objections against the teleological account of mental content. First, he argues that this theory is not able to account for the possibility of error or misinformation, since natural selection cannot make distinctions of sufficiently fine grain. Second, he argues that these accounts cannot explain the mental content of thoughts in modalities other than belief and desire. A particular kind of stimulation of the retina of a frog causes a response involving the tongue. Teleologically speaking, the purpose of this response is to catch and eat a fly (part of the frog's normal diet), and the purpose of the retinal stimulation is to carry the information 'Fly at point X'. Natural selection has selected this stimulus-response pathway for the purpose of catching flies, because the typical cause (in the frog's ancestral environment) of this stimulation has been the presence of a suitable fly in the appropriate position. Apparently, we can explain the intrinsic content of the retinal stimulation by reference to this teleological function. Fodor, however, objects that this account attributes too fine an eye for discrimination to Mother Nature. Suppose that the frog's vision cannot distinguish



between flies and BBs. Suppose further, as seems plausible, that hovering BBs have not been common inhabitants of the frog's ancestral niche. In this case, the retinal pattern carries the information, not only that a fly is probably present, but also that either a fly or a BB is probably present. Since things that are flies-or-BBs in the frog's ancestral niche are almost all edible (since they are nearly all flies), detecting and trying to eat flies-or-BBs is something natural selection can select for. Consequently, we have no reason to deny that the retinal pattern means 'Fly or BB at location X'. Since any error will be the result of something present that is indistinguishable from a fly, any erroneous representation will have a perfectly veridical content, formed by the disjunction of 'fly' with whatever actually caused the stimulation. Fodor's objection assumes that causation and teleology cannot distinguish between two properties that are very reliably co-extensive. This is simply wrong. As I argued in part I, causal (and therefore also functional) contexts are extremely sensitive to subtle differences in properties. I showed how not even logically equivalent properties are intersubstitutable salva veritate in causal contexts. From the fact that the presence of flies caused the survival of the stimulus-response pathway, plus the fact that flies-or-BBs are co-extensive with flies (even reliably, counterfactually so), it does not follow that the presence of flies-or-BBs has been a causal factor in the survival of the pathway. Mother Nature, in her guise as guardian of causal relations, has very sharp eyes indeed. Fodor also points out that states with mental contents, such as the idea of a cow, can occur in modalities quite different from belief. For example, I could, in a particularly bucolic mood, daydream about a cow. This daydream carries no information about the real existence of a certain kind of cow anywhere or anytime, nor is it supposed to. 
Daydreams fulfill some radically different purpose. I must concede that Fodor has a good point here, and I do not know of any existing teleological account, including mine, that adequately addresses the basis of mental content in modalities other than beliefs, desires, intentions, and volitions. Perhaps something like Fodor's language of thought hypothesis is correct, and the very same syntactic item occurs in both belief contexts and in daydreams. Or perhaps there is some other sort of systematic connection between beliefs whose content has a certain form and daydreams with isomorphic content. Obviously, a great deal of additional work remains to be done, but the apparent success of a teleological account in explaining the nature of the content of states closely connected with action gives some reason to hope that this success can be repeated with accounts of the content of more remote states.



Teleosemantics of Mental Representations

14.1 An Overview of Representational States

There are a variety of mental states in humans and animals that clearly have representational content: perceptions, actions and attempted actions, memories, thoughts, opinions, doubts, wishes, etc. In this chapter, I will sketch, in a very programmatic way, how my accounts of teleofunctionality and information can be used to account for mental representation.

Mental representations fall into a number of significant categories. A distinction of fundamental importance is that between cognitive and pre-cognitive states. I take the defining characteristic of cognitive states to be the presence of sub-sentential or sub-propositional structure: the existence of identifiable components corresponding to subjects and to predicates, to designating terms and to predicating expressions. I do not intend to prejudge the question of the presence of a language of thought in the brain, realized in a particular kind of symbolic architecture. Even if the brain's architecture is thoroughly connectionist, the phenomena of the articulation of thought into individual and general concepts must be accounted for. It is the presence of sub-sentential structure in cognition that enables us to formulate and use rules, and to engage in induction and abstraction, as well as to reason discursively. A limited amount of reasoning is possible precognitively, to the extent that something analogous to negation and disjunction is present. However, it is the kind of reasoning modeled in predicate logic that gives cognition its characteristic freedom from the particular.

A four-way distinction among mental states is also useful. First, there are states that are immediately involved in action and behavior: motor volitions and preparations. It is these states that occur when one tries to perform some basic action and fails, and, of course, they also occur when one succeeds. Second, there are states that represent some sort of pro-attitude: appetites, desires, goals



and intentions. Third, there are states whose function is to register information: perceptions and perceptual memory, opinions and intellectual knowledge. Finally, there are the second-order cognitions: conjectures, doubts, wishes, fancies, fictions, hypotheses, and the like.

Combining these two sets of distinctions, we can generate the following map of mental representations. At the bottom are motor preparations, neural states that result in reflexes. These are normally considered sub-mental. They are certainly unconscious (lying entirely in the spinal cord and below), but they are representational in my sense. At the next level (still pre-cognitive) we find perceptions (and perceptual memories) and appetites. These two interact to produce motor directions, resulting in behavior. Above these are the four cognitive categories: opinions, intentions, volitions (both motor and mental), and second-order cognitions. There is some connection between perceptions and opinions, and between appetites and intentions, and the output of the cognitive level is volition, resulting in intentional action. However, the interaction at the cognitive level is probably quite complex, involving a good deal of feedback.

Figure 14.1: The Network of the Mind

The account of representationality in each of these categories will be somewhat different. I will focus here primarily on perceptions and opinions, and secondarily on reflexes, appetites, and intentions. I will have little to say about the other categories, except to say that I expect the account of representationality to be parasitic in those cases upon the five that I discuss. Most probably, something like Harman's theory of conceptual-role or functional-role semantics



is appropriate for the elements of higher-order cognitive states. However, the fact that the roles of these states include connections to first-order states, which in turn have information-theoretic and not conceptual-role semantics, suggests a way of breaking into the hermeneutic circle of a system of interrelated roles. There does seem to be some connection between the representational content of higher-order cognitive states and the teleofunction of carrying information about certain possibilities. When I hope for some state φ, my mental state has the function of carrying information about what would be involved in the possible realization of φ. This seems to be true of fears, doubts, hypotheses, and many other cognitive states.


Pre-cognitive Representations

In the case of reflexes, a neural state occurs whose function is to produce a particular movement of the body. The motor preparation (as I call such a neural state) carries some minimal amount of information, such as that something dangerous or harmful, or nutritious (as in the case of the nursing reflex), is located at some relative position to the body. The motor preparation exists because, in part, states of this kind often carry this information robustly. A state carries a piece of information robustly just in case it carries that information, and every super-situation also carries the same information. Robust information is always veridical; ordinary information is reliably, but not infallibly, veridical.

A perception is a state that also carries information (and, possibly, misinformation), and whose function it is to carry this information robustly. Unlike motor preparations, a perception does not have a characteristic movement as its effect. Instead, it interacts with other perceptions and with the appetites to produce motor directions, resulting in goal-directed behavior. Such behavior may reflect operant conditioning, but not any more sophisticated form of learning.

Definition 14.1 (Representational Content of Perceptions) A perceptual type φ has the representational content (R, ψ) just in case there is a type χ such that φ has the function of robustly carrying the information, under the condition χ, that there exists a situation-token of type ψ in relation R to the point of origin.

Typically, the relation R will give some agent-centered location in space and time, e.g., 'left', 'right', 'up', 'down', 'now', or 'very recently'. My account differs in three subtle but important ways from that of Dretske. For Dretske, a type φ represents that p if it is the function of φ to carry the information that p. Thus, for Dretske a representational function is a function of carrying information, not of carrying information robustly.
According to my definitions of information and function, if a state φ had the function of carrying the information p, then that state would always carry the information p, and so malfunctioning in representation would be impossible. The reason for this is that on my account, type φ carries the information p just in case the probability



of p on some state's being φ is infinitely close to 1. Moreover, a type that has a given function fulfills that function in almost all cases (with a probability infinitely close to 1). Thus, a type φ whose function is to carry information p will carry that information with probability 1, which means that the probability of p conditional on φ's being instantiated is always 1. Consequently, this function will always be fulfilled. In contrast, if a state has the function of carrying information robustly, then it will fail to fulfill that function in any case in which it carries the information non-veridically, or even when it carries veridical information but does so in a context in which it would be overridden by misleading non-veridical information (these are cases corresponding to Gettier-like instances of true belief without knowledge). A perception can be 'true' or veridical, even if it fails in its function to carry its information robustly. Gettier cases of veridical misperception illustrate this possibility. If I perceive that the stick is bent, and it really is bent, but the stick would, from other perspectives, appear straight, then my perception is veridical, but not robust. When a representation fails to fulfill its function because the information it carries is non-veridical, I call this failure a case of "malrepresentation." Malrepresentations are relatively infrequent, since the objective relative probability of such a failure is always infinitesimal. The distinction between carrying information and carrying that information robustly is a very fine one. The conditional probability of one on the other will always be infinitely close to one. Nonetheless, causation is sensitive to such extremely fine-grained distinctions, so it is possible for a state to have one but not the other as its function.
The function of a perceptual state is not only to carry certain information, but to do so robustly, since it is the secure possession of truth that contributes to the organism's fitness.

A second point of difference between my account and Dretske's lies in my use of conditional functions. For φ to have the representational content ψ, it is not necessary that φ carry the information ψ absolutely. It is sufficient if φ has the teleofunction of carrying the information ψ on the condition that some condition χ is also realized. For example, suppose that a rabbit always runs in the opposite direction of a certain kind of sound. We should say that this sound represents the presence of a predator in the relevant direction, even if predatory animals are present only occasionally when the sound is heard. The sound represents the location of a predator, since it carries information about the location of the predator on the condition that a predator is in fact nearby. In chapter 9, I developed a theory of such conditional information. On this model, a representation of the content ψ is not simply a state with the function of carrying the information that ψ, but rather a state with the conditional function of carrying the information that ψ in circumstances χ. In the case of the flight mechanism, we could say that the associated perceptual states have the function of carrying information about the approximate location of a predator in circumstances in which a predator is actually present and has actually made some significant noise of the appropriate kind. It is the animal's perceptual state plus the teleologically relevant circumstances that carry the



information that a predator has a particular location. When the perceptual state occurs but the relevant circumstances are not actualized, we can say that the state constitutes a misrepresentation. This notion of conditional function is closely related to the idea of conditional constraints developed by Barwise and Perry.1 We can say that a perceptual state φ of a token s has the conditional function (conditional on circumstances χ) of carrying the information that ψ if and only if the fact that the conjunction φ & χ carries the information (relative to R) that ψ is realized is causally relevant to s's being of type φ.

Definition 14.2 (Conditional Representation) A state φ of situation s has the function of representing the type ψ relative to R and on condition that χ if and only if: (i) the fact that φ and χ are co-instantiated carries the information that ψ is realized in relation R to the site of information, (ii) the informational constraint expressed by (i) is itself causally relevant to s's being of type φ.

When type φ has the teleofunction of robustly carrying the information that ψ is realized in relation R on condition that χ is also realized, I call the type χ a "normality" condition for representation φ. When a representation fails to be accurate because its normality condition is not realized, I call the failure a case of "misrepresentation," as opposed to "malrepresentation," which occurs when the normality condition is present but the information carried fails to be veridical. When the rabbit is startled by a harmless animal, the perception is a case of mis-, rather than mal-, representation. Misrepresentations can be quite common, since there is no requirement that the associated normality condition of a representational type be objectively probable.
The third way in which my account differs from Dretske's is in its explicit use of a relation between the token bearing the information and a token which the information is about. This relation corresponds to the demonstrative component of meaning as described by J. L. Austin and by Barwise and Perry. This relational component undergirds the possibility of de re attitudes and plays a crucial role in averting the Liar paradox. In the case of an appetite, the information that is carried concerns the needs of, or opportunities for some advantage of, the organism. Appetites and aversions can be thought of as a special category of perception: perceptions that are like reflexes, in that they have the additional function of making a particular sort of behavior more likely. Unlike motor preparations or motor directions, however, appetites do not actually carry the information that the characteristic sort of behavior will occur, since the frustration and deferral of satisfaction of appetites is a regular, and not an exceptional, occurrence. In addition, the behavior whose probability is enhanced by an appetite must be described quite abstractly: behavior that is likely to meet the represented need or achieve the

(Barwise and Perry, 1983, pp. 112-114, 270-272) and (Barwise, 1989, pp. 149-151).



represented advantage, given the perceptions and past experience of the organism. Motor directions, like motor preparations, carry information about the future state of the organism's body. Misinformation occurs when something interferes with the execution of the directions. Once again, it is possible for the motor direction to succeed in producing its characteristic behavior, and yet fail to fulfill its function, if it fails to carry the information about the future behavior robustly, if, for example, its success depends on some accidental but fortunate factor.
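The distinctions drawn in this section between robust success, veridical but non-robust perception, malrepresentation, and misrepresentation can be summarized as a small decision procedure. The following Python sketch is only an illustration: the names `Occurrence`, `Outcome`, and `classify` are hypothetical, the three boolean fields are a simplification of the probabilistic and causal notions used in the text, and the treatment of a veridical occurrence whose normality condition is absent (a case the text does not discuss) is an assumption of the sketch.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    ROBUST = "carries its information robustly (function fulfilled)"
    VERIDICAL_NOT_ROBUST = "veridical, but not robust (Gettier-like)"
    MALREPRESENTATION = "malrepresentation"
    MISREPRESENTATION = "misrepresentation"


@dataclass(frozen=True)
class Occurrence:
    """One occurrence of a perceptual state (hypothetical simplification)."""
    normality_holds: bool       # is the normality condition realized?
    content_realized: bool      # is the represented type really realized in relation R?
    survives_enlargement: bool  # is the information still carried by every super-situation?


def classify(o: Occurrence) -> Outcome:
    if o.content_realized:
        # Assumption of this sketch: a veridical occurrence counts as robust only
        # if its normality condition holds and no larger situation overrides it.
        if o.normality_holds and o.survives_enlargement:
            return Outcome.ROBUST
        return Outcome.VERIDICAL_NOT_ROBUST
    # Non-veridical cases: the absence of the normality condition marks a
    # misrepresentation; otherwise the failure is a malrepresentation.
    if o.normality_holds:
        return Outcome.MALREPRESENTATION
    return Outcome.MISREPRESENTATION


# The rabbit startled by a harmless animal: no predator is actually present.
startled_rabbit = Occurrence(normality_holds=False, content_realized=False,
                             survives_enlargement=False)
# The bent stick that would look straight from other perspectives.
bent_stick = Occurrence(normality_holds=True, content_realized=True,
                        survives_enlargement=False)

print(classify(startled_rabbit).value)
print(classify(bent_stick).value)
```

On this sketch the startled rabbit comes out as a misrepresentation and the bent stick as veridical but not robust, matching the verdicts given in the text.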


Cognitive States: Opinions and Intentions

Intentions carry a kind of conditional information about the future state of the organism. An intention to bring about a situation of type φ that is R-related to the intention is a state whose function it is to carry the information that such a situation will be brought about, unless the agent's opinions are mistaken or a new, countervailing intention interposes. Goals can be thought of as second-order intentions: intentions to form intentions of a certain kind, if conditions for realizing the goal are sufficiently favorable.

An opinion is a cognitive state with the function of carrying robust information. Knowledge is the normal case: mere opinion is a case of failed knowledge. Hence, I do not define knowledge as a special kind of true opinion; rather, it is true opinion that must be defined in terms of knowledge. An opinion is a case of knowledge just in case it has the function of robustly carrying the information (R, ψ), and it succeeds in carrying that information robustly. If it carries veridical information, but does not do so robustly, it is a case of true opinion. If it carries non-veridical information, it is a case of false opinion.

Since opinion is a cognitive state, each opinion-token realizes a plurality of representation-types. In the case of opinions whose content is atomic, at least one of these types must be a token-designating type, something analogous to a singular term or to a discourse referent in Kamp/Heim discourse representation theory (Kamp and Reyle (1993), Heim (1990)). Another of the types must be a type-designating type, analogous to a predicate or DRT condition. The function of a token-designating representational type must make reference to the existence of type-designating types, and vice versa. The two sets of functions are interdependent.
Each type-designating type has the function of combining with one or more token-designating types, where each of the latter must belong to a different category, such as agent-designating, patient-designating, means-designating, etc. In this way we can distinguish opinions whose content is 'Antony loves Cleopatra' from those whose content is 'Cleopatra loves Antony'. Hence, every token-designating type must come in several distinct sub-types, corresponding to these different sub-sentential roles.
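The role-marking idea can be pictured as a simple data structure. The sketch below is a hypothetical illustration in Python, not a claim about how such contents are realized in the mind: the class `AtomicContent`, the helper `content`, and the role labels `agent` and `patient` are all assumed names. It shows only that pairing each token-designating element with a distinct role is enough to separate the two contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicContent:
    """An atomic content: one predicate plus role-marked referents (hypothetical)."""
    predicate: str
    roles: frozenset  # set of (role, referent) pairs


def content(predicate: str, **roles: str) -> AtomicContent:
    # Keyword arguments guarantee that each role is filled at most once,
    # mirroring the requirement that each token-designating type belong to
    # a different category.
    return AtomicContent(predicate, frozenset(roles.items()))


antony_loves = content("loves", agent="Antony", patient="Cleopatra")
cleopatra_loves = content("loves", agent="Cleopatra", patient="Antony")
reordered = content("loves", patient="Cleopatra", agent="Antony")

print(antony_loves == cleopatra_loves)  # same constituents, different roles
print(antony_loves == reordered)        # same roles, different order of mention
```

The two opinions share all their constituents and differ only in which referent fills which role; the order in which the constituents are listed makes no difference.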



A token-designating type φ will typically bear as its content some relation R_φ. The relation R_φ picks out those tokens that are potential referents for tokens of type φ: if s is of type φ, and s bears R_φ to s', then s' is a potential referent for s. If φ is a true proper name, R_φ will always pick out a unique object.

In some cases, a type-designating type will bear as its content not a single, unitary type, but a more or less loosely defined family of types. In these cases, certain members of the family play the role of paradigms of the family, and membership in the family of types is determined by phenomenal or cognitive "distance" between each type and each paradigmatic member of the family. In cases in which the space of types is dense (in the mathematical sense), for example, when the space of types constitutes a continuum, this process of defining families gives rise to the phenomenon of vagueness. In a 1994 article in Mind (Koons (1994c)), I argued that vagueness can best be modeled by means of a four-valued logic, and that accepting the existence of first-order vagueness does not force us to postulate an unending regress of higher-order vaguenesses.


Mental Representation and Language

Although I will have little to say in this work about linguistic meaning, I will go so far as to endorse (for the most part) the account of linguistic meaning given by Millikan in Language, Thought and Other Biological Categories (Millikan (1984)). Unlike Grice, I would not attempt to explain the content of the elements of public language by reference to a complicated set of beliefs and intentions on the part of the speakers of that language. In many cases, we use language more or less thoughtlessly. In fact, the metaphor of words' using us to perpetuate themselves (like Dawkins's "selfish genes") is an apt one. Words and syntactic structures have the function of carrying certain kinds of information because their doing so in the past is part of the causal explanation of our present use of them.

At the same time, I am equally adamant in rejecting the view of those (like Sellars, Davidson, or Brandom) who would make mental representation dependent on the social practices constituting a public language. Normativity is generated by teleology, not by sociology. Indeed, if there were no normativity built into the teleofunctional structure of human life, the positive norms of social custom would themselves be impossible. The social practices of assertion and other forms of communication are only fruitful because we have something, namely information, that is worth communicating. No doubt participation in a public language increases the flexibility and subtlety of our repertoire of mental representations. The linguistic division of labor emphasized by Tyler Burge and Hilary Putnam has much to do with explaining this fact. Nonetheless, were there no pre-linguistic content, this multiplier effect would be nugatory.




The Narrowness of Mental Content

Is mental content broad or narrow? That is, does mental content supervene on the intrinsic condition of the human being? On my account, this turns on whether or not teleofunctional states supervene on this same intrinsic condition. I am inclined to say that it does - that the function of a state is determined by the most likely causal history of the organism in which the state occurs, given the intrinsic state of that organism. (See chapter 14.) This would mean that all mental content is narrow. However, some of our mental contents are linguistically realized. The function of a word or other linguistic unit does not supervene on the intrinsic condition of the word or symbol, nor does it depend only on the intrinsic condition of the individual language-user. We must instead look at the linguistic system as a whole, as a communal institution. I would argue that the representational contents of words supervene on the present condition of the entire language community. However, this means that the contents of linguistically-mediated mental states are broad rather than narrow.


Teleosemantics and the Liar Paradox

Since I have given a kind of definition of 'true opinion', there is the danger that I will run afoul of Tarski's theorem to the effect that truth is indefinable. In particular, it would seem to be possible to construct a Liar opinion-token, true just in case it is false. However, it is essential to my account that every opinion-token realize two kinds of types: type-designating and token-designating. Even opinions whose content seems to be abstract and eternal, like '2 + 2 = 4', are always opinions about some concrete token or other. In the case of mathematical opinions, the token in question is an atemporal one, supporting certain kinds of modal constraints. In the case of semantic opinions, the situation-token will be a complex one, supporting certain facts about some opinion-token, and also certain modal and causal facts of a general, atemporal nature. To say that some opinion-token p is true is always to say, of some situation-token s, that it is of the making-p-true type. It is the addition of this token parameter that enables me to escape the Liar paradox, by simply borrowing the work done on such parametric theories of the Liar as those of Barwise and Etchemendy (1987) and myself (Koons (1992), Koons (1994a)). If we try to formulate a Liar token, an opinion p that carries the content that some situation s is of the not-making-p-true type, we reach the conclusion that p is actually true, since s is not of that type. However, there is some larger situation s' that does make p true. When we reflect upon p, we are talking about this larger situation s'. Hence, there is no contradiction: s is of the not-making-p-true type, and s' is of the contrary type. (If we formulate the Liar in terms of falsity, instead of non-truth, we reach the conclusion that the Liar is made false in the larger, but not in the smaller, situation.)
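The role of the token parameter can be illustrated with a toy model. The Python sketch below is a deliberately crude simplification and not the formal theory of Barwise and Etchemendy or of the works cited: situations are modeled as sets of facts, "making p true" is modeled as containing the fact that p's content holds, and every name (`Opinion`, `content_fact`, `makes_true`, `liar_is_true`, the labels `s` and `s_prime`) is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    name: str
    about: str  # label of the situation-token the opinion is about


def content_fact(p: Opinion) -> tuple:
    # The Liar's content: the situation it is about is of the
    # not-making-p-true type.
    return ("not-making-true", p.about, p.name)


def makes_true(facts: frozenset, p: Opinion) -> bool:
    # Toy assumption: a situation makes p true iff it supports the fact
    # that p's content holds.
    return content_fact(p) in facts


def liar_is_true(p: Opinion, situations: dict) -> bool:
    # p is true iff the situation it is about really is not of the
    # making-p-true type.
    return not makes_true(situations[p.about], p)


situations = {"s": frozenset({("snow-is-white",)})}
p = Opinion(name="p", about="s")

# The larger situation s' extends s with the fact that s does not make p true.
situations["s_prime"] = situations["s"] | {content_fact(p)}

print(makes_true(situations["s"], p))        # the smaller situation
print(liar_is_true(p, situations))           # the Liar's own truth value
print(makes_true(situations["s_prime"], p))  # the larger situation
```

In this toy model the smaller situation does not make p true, so p's content holds and p is true, while the larger situation does make p true. No contradiction arises, because p is about s and not about the larger situation.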


A Causal Theory of Logical and Mathematical Cognition

15.1 The Need for a Causal Theory

Gettier examples in epistemology indicate that a causal element is needed to distinguish knowledge from true opinion. The distinction between knowledge and true opinion pertains to the domain of logic and mathematics, as well as to domains of contingent and temporal truths. Hence, we need to be able to appeal to the existence of causal connections of the appropriate kind between mathematical facts and our beliefs about them. Paul Benacerraf (1983a) has pressed this point as a basis for a critique of the realist conception of mathematical truth: if mathematical facts are causally inert, we cannot know them.

Additionally, a causal theory is needed to provide an account of how we are able to refer to particular mathematical objects. There are infinitely many mathematical structures that provide models of our theories of arithmetic: how, apart from a causal connection, are we to explain the fact that we refer to exactly one of these structures in arithmetic? This point, like the last one, has roots in the work of Benacerraf, in particular his "What Numbers Could Not Be" (Benacerraf (1983b)).

As soon as we try to do so, however, we face a dilemma. If we try to identify logical and mathematical facts with contingent, spatiotemporal facts, we distort the nature of mathematics and lose that which distinguishes it from other sciences, as is illustrated by John Stuart Mill's version of mathematical empiricism. Alternatively, if we locate mathematical fact in a timeless, necessary Platonic heaven, we face the daunting task of finding a ladder to make possible commerce between the Platonic heaven and cognitive states on earth. Merely talking about "a priori knowledge," or vague allusions to a special capacity



of sight or touch (seeing that 2 + 2 = 4, or grasping a mathematical truth) fails to give us the kind of substantive theory capable of sustaining the knowledge/opinion distinction. I will attack one horn of the dilemma, the horn that has rarely been challenged, to my knowledge (the sole exception I know of is Penelope Maddy's work, about which I will have some comments in section 15.7 below). I will argue that numbers and other mathematical objects are real and are causally effective. Information about mathematical objects is conveyed to us causally, not by some mysterious faculty of "mathematical intuition," but through our interactions (both sensory and active) with ordinary, everyday situations. My view might be described as a kind of naturalistic Platonism, as opposed to the mystical Platonism of those who postulate a mysterious, non-natural channel to the mathematical as the unique possession of the human mind.


Logico-Modal Facts as Causes

I suggest that it is modal facts that provide Jacob's ladder between temporal events and Platonic truths. In chapters 4 through 7, I developed an account of causation according to which modal facts can act as causes of contingent, temporal events. Logical and mathematical facts have causal efficacy in their modalized forms. For example, consider the law of excluded middle. If a token is of type φ ∨ ¬φ, then it must be either of type φ or of type ¬φ, since the relation between tokens and types is governed by the extension of the strong Kleene truth tables to four-valued logic (the Dunn-Belnap tables). The purely disjunctive type never figures as such in any causal chain, as I explained in section 4.8.1. An instance of the law of excluded middle is a paradigm case of an unnatural or gerrymandered type. However, a situation token can be of the type □(φ ∨ ¬φ), without being of type φ or ¬φ. This modalized disjunction can figure significantly in causal chains. Suppose that we have causal laws of the following forms:

Let us suppose that situation s supports both of these laws, plus the types x and p. Is s then a total cause of a succeeding state s' of type ψ? Not necessarily. In addition to the two causal laws above, s must also support the modal type □(φ ∨ ¬φ). Without this modal fact, the existence of a situation of type ψ does not follow (see section 8.3). For example, there could be an impossible situation-token s' that, from the limited perspective of s, is counted as possible. This token s' might support both φ and ¬φ. Consequently, it falsifies both of these types. The two causal laws above could still hold at s, even if s' is not followed by a token of type ψ, since s' falsifies the antecedents of both laws. In contrast, if s did support the modalized instance of the law of excluded middle,



this would be impossible, and every s-possible token would be followed by a token of type ψ. If we assume that causal connections between tokens are always associated with causal connections between the corresponding types (what I called "Hume's hypothesis" in chapter 5), this would mean that token s cannot cause a token s' of type ψ unless it supports the modal/logical type □(φ ∨ ¬φ). Hence, we can correctly say that the modalized version of this instance of the Law of Excluded Middle was causally relevant to the production of the concrete, temporally located situation s' of type ψ.

The existence of this vertical causal connection does not exclude the simultaneous existence of an ordinary, horizontal connection. To return to the example above, there will be in the world a situation s' that is either of type φ or of type ¬φ. If this situation s' also supports the relevant causal law, then it will constitute a total cause of the existence of a situation of type ψ. This will not, however, be a case of overdetermination, since the two causal connections occur at different levels. The vertical causal connection involving the modal property □(φ ∨ ¬φ) presupposes the existence of some horizontal connection involving either φ or ¬φ.

My account of logical knowledge depends on two things: postulating the existence of logically complex situation-types (negative types, disjunctive types, etc.), and postulating that the support relation between tokens and complex types is governed by a four-valued interpretation scheme, namely, the Dunn tables (and their extensions to the first-order case).1 In some actual situations, the facts are partial: if φ represents the presence of hydrogen, then neither φ nor ¬φ, its negation, may be supported in certain parts of the world (parts representing features other than chemical ones, for instance).
There are no actual, nor even any possible, situation-tokens supporting both φ and ¬φ; however, this impossibility is itself a fact that may be supported in some situations and not in others. In order to represent such modal partiality, it is convenient to use impossible, overdetermined situation-tokens in our models. I am a realist about modality, but (unlike David Lewis) not about possible-but-not-actual situations. What is really possible is the realization of a certain type: it is convenient to model this fact through the use of fictional objects such as possible-but-not-actual situation-tokens. Similarly, in order to model modal partiality, it is convenient to make use of the fiction of impossible situation-tokens.

One major advantage to the Dunn tables (as well as to their three-valued counterpart, the strong Kleene tables), for my purposes, is that no formula is true in every interpretation, or, in my case, no type is supported by every token in every model. There are non-trivial logical consequences in partial logic; for example, φ&ψ entails ψ&φ. However, there are no logical validities in this logic, no conclusions that can be validly drawn from an empty set of premises. This means that every classical validity (every type that is supported by every totally defined token) corresponds to a piece of modal information that

See appendix A.



may or may not be supported by a given token. If we slap a □ in front of a classically valid type φ, we produce a piece of logical information that can enter into causal explanations of concrete events, including our own perceptions and beliefs.
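The two claims made about the four-valued scheme, that it has non-trivial consequences but no validities, can be checked mechanically for the propositional connectives. The Python sketch below uses the standard Dunn-Belnap tables, encoding each value as a pair (supported, rejected). The function names and the decision to count a type as holding exactly when it is supported are assumptions of the sketch, and it makes no attempt to model the modal operator or the first-order extension.

```python
from itertools import product

# Each value is a pair (supported, rejected).
T, F = (True, False), (False, True)   # classical values
N, B = (False, False), (True, True)   # neither (partial), both (overdetermined)
VALUES = [T, F, N, B]


def neg(a):
    return (a[1], a[0])


def conj(a, b):
    return (a[0] and b[0], a[1] or b[1])


def disj(a, b):
    return (a[0] or b[0], a[1] and b[1])


def valid(formula, n_atoms):
    """True iff the formula is supported under every four-valued assignment."""
    return all(formula(*v)[0] for v in product(VALUES, repeat=n_atoms))


def entails(premise, conclusion, n_atoms):
    """True iff every assignment supporting the premise supports the conclusion."""
    return all(conclusion(*v)[0]
               for v in product(VALUES, repeat=n_atoms)
               if premise(*v)[0])


def excluded_middle(p):
    return disj(p, neg(p))


# Excluded middle is not supported when the atom is neither supported nor rejected.
print(excluded_middle(N) == N)
print(valid(excluded_middle, 1))

# Conjunction still commutes.
print(entails(lambda p, q: conj(p, q), lambda p, q: conj(q, p), 2))
```

Because the value N is closed under all three connectives, assigning N to every atom leaves every formula unsupported, which is why no formula built from these connectives can be a validity here.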


Knowing How to Infer Correctly

It is important, for my purposes, to distinguish between logical knowing-how (knowing when it is proper to draw a particular inference) and logical knowing-that (knowing the necessity of a given classical validity). Knowing-how is to be defined in terms of a reliable disposition to draw only the correct (strong Kleene or Dunn valid) inferences, where this disposition has logical reliability as its proper function (in the sense given in chapter 12). Knowing-that involves knowledge of particular modal facts, which entails the existence of an appropriate causal link between the particular fact known and the knowing of it. A pattern of inference is knowledge-conferring for a given person only if it has, as realized in the dispositions of that person's mind, the teleofunction of preserving robust (or, at least, veridical) information. A necessary condition of such teleofunctionality is that the cognitive disposition be of such a kind as is typically caused by a corresponding constraint in the world. We can see the need for such a causal connection by considering Gettier-like examples of failed inferential knowledge. Suppose that Max has proved a mathematical theorem by means of the inference rule of modus ponens. However, Max used modus ponens only because the rule was recommended to him by his astrologer, Morris, and Morris recommended that Max use modus ponens only because Max is a Pisces. Had Max been born under any other sign of the zodiac, Morris would have recommended, and Max would have used, other, unsound rules of inference. Under such circumstances, Max's use of modus ponens is not knowledge-conferring, and so Max's would-be proof provides him with no knowledge of the truth of the theorem.
For inferential knowing-how to be possible, we must have two constraints, one a modal constraint involving logical or mathematical form, and the other a causal constraint linking different beliefs in the mind of the mathematician on the basis of their content.² Take, for instance, the inference pattern corresponding to the T axiom of modal logic: inferring φ from □φ. This inference pattern is realized in the mind of the mathematician as a causal constraint between belief-types: Bel(□φ) |~ Bel(φ). This inference pattern is an instance of knowing how to reason correctly only if it has the teleofunction of mirroring a corresponding constraint in the world, namely, □(□φ → φ). This modal constraint is supported by any situation-token sufficiently rich in its support of modal facts. For the causal constraint in the mathematician's mind to have the appropriate teleofunction, it must be of such a kind as to be apt to be caused by the corresponding modal constraint. Fortunately, the account of vertical or higher-order causation developed in part I is adequate to the task of postulating such higher-order causal constraints between modal constraints on the one hand and cognitive causal constraints on the other. In this case, there are general facts about natural selection, or general facts about the human capacity for trial-and-error learning, that can provide the basis for such a higher-order causal constraint. Inference patterns that are reliably truth-preserving are the sort of thing nature selects for, and they are also the sort of thing that humans are apt to discover on the basis of experience. In some cases, our disposition to reason correctly in a certain way is innate, to be explained by natural selection, and in other cases, it is learned from the individual human being's experience. Which inference patterns are which is a question to be settled by empirical cognitive psychology.
In logic, mathematics, and other formal disciplines, knowing-that can be seen as merely a special case of knowing-how. To know a logical truth or mathematical axiom, all one needs is to know how to infer that truth from the empty set of assumptions. Where the set of assumptions is empty, the modal constraint collapses into the simple necessity of the corresponding logical or mathematical type, and the cognitive causal constraint is simply the disposition to believe the axiom without proof.

² See Barwise and Seligman's recent book (Barwise and Seligman (1997)) on information flow for a powerful mathematical theory of such constraint-mirroring.


Is Logic Factual?

I am claiming that the subject matter of logic is a domain of fact, specifically, of modal fact. There is a long tradition in philosophy, beginning at least with Hume, that divides truths into two categories: matters of fact and matters of the relation of ideas. Logic is the paradigmatic example of the second category. Hume's distinction depends on the assumption that there can be no necessity in the world other than that which is projected on the world by some sort of psychological necessity. This in turn was based on Hume's sensationalist theory of concepts: since we have no sensation of necessity (outside of introspection), we can have no real concept of it (as applying to external realities). It is hard to see how Hume's distinction can be defended, since if there really is no necessity in the world, then there is no psychological necessity either, and hence no necessary relation of ideas. Conversely, once we acknowledge that some mental representations are possible and others are impossible, what principled reason do we have for not extending this distinction to extra-mental event-types?
Another objection to a factual theory of logical truth proceeds in this way: whatever is factual is contingent, logical truths are not contingent, and therefore logic is not factual. I deny the first step: many modal facts (perhaps all of them, if the S5 axioms are sound) are necessary.




The Form/Content Distinction

A Kantian objection to the factuality of logic would be based upon the form/content distinction. Logical truth has to do only with the form, and not with the content, of the relevant propositions. Factual truth, in contrast, depends on the content as well as the form. Some logical forms, like that of a self-contradiction, are incoherent, and so do not represent any possible state of affairs. Negations of such incoherent forms are true by virtue of form alone, and hence convey no information about the actual world.
I have no problem with the hypothesis that there is such a thing as logical form. My claim is simply that the impossibility of the actualization of an inconsistent form is itself a matter of fact, by which I mean the sort of thing that can figure in causal explanations. Forms are themselves parts of the world, even if (as the Kantian might suppose) only as parts of the mind. The impossibility of assembling a representation with a certain logical form, or the impossibility of a representation with such a form applying veridically to the world, is itself a modal fact about the world (as a whole). Hence, the Kantian has not given an account of logical necessity that is innocent of commitment to modal facts. The Kantian makes claims about the necessary coherence or incoherence of certain forms, and these claims themselves are necessarily true by virtue of their content: they make claims about logical form, but do not themselves have the form of logical tautologies.


A Critique of Conventionalism

Are the principles of logic true by convention, or by virtue of the meanings of the words involved, and not by any kind of correspondence to the world? I find Quine's objections to these theses in his classic, "Truth by Convention" (Quine (1949)), to be decisive. If we say that the principles of logic are merely rules that we adopt, we face two embarrassing questions: (1) for what purpose do we adopt these rules? and (2) what determines what follows from the rules we adopt? On the first question, surely we use the rules of logic we do because we believe that they are reliably truth-preserving, in both absolute and hypothetical contexts. Modal facts about truth and validity, then, stand prior to and apart from our choices of logical systems. On the second question, it is surely a matter of logic to decide what does and does not follow from any set of rules. Hence, logic itself cannot be merely a set of rules.
If we suppose that the logical conventions we adopt are not themselves logically complex propositions (e.g., 'every sentence of the form (p → p) is true'), but instead consist merely in a pattern of linguistic behavior, we face the problem that our past behavior is finitary, but the set of logical truths and logical consequences of our language is infinite. It is impossible for a finite set of decisions ('this shall be true', 'that shall be false') to determine the extension of logical truth, unless negation, conjunction, quantification, and the other logical primitives have a pre-linguistic existence. If (and only if) the latter is so, our conventions can link particular English words and symbols to these pre-existing logical operations. But, in this case, the laws of logic are not themselves the product of our conventions but have an independent existence.
If we suppose that the truths of logic are given by a conventional acceptance of the classical truth tables, we face the problem that the truth tables presuppose the principle of bivalence: the general fact that every proposition is either true or false, but not both. This general fact of bivalence is not something we can make true by collective fiat. If we say that bivalence is entailed by what we mean by "proposition," then we have no basis for assuming that the class of propositions (so specified) is really closed under the operations of negation, conjunction, and so on. Every time we encountered a novel sentence, we would have to first verify somehow that it really expressed a proposition (according to our bivalent conception) before we could confidently apply the principles of logic to it.

Transcendent-Basis and Immanent-Basis Conventionalisms

It is clear that in some sense all of the truths of logic involve conventions, as do all the true sentences of any actual language. However, there are two ways to understand the role of convention. On the first, the transcendent-basis conception (hereafter, 'TB conventionalism'), conventions link the elements of our language to pre-existing forms or structures. Only finitely many links are needed, each established via the combination of teleology and information (as I described in chapter 14). There are infinitely many truths of logic expressible in English because there are infinitely many logical facts comprising the finitely many logical elements associated with the logical constants of English. The truth-makers of the theorems of logic exist independently of our language and its conventions: conventions serve only to link sentences to appropriate truth-makers.
On the opposing view, the immanent-basis conception (hereafter, 'IB conventionalism'), there are no such convention-independent facts or truth-makers. Our conventions in and of themselves ground the truths of all "analytic" sentences, including all of the theorems of logic. These truths are somehow immanent in our practices, taken as contingent facts of social practice, unrelated to a realm of Platonic objects and their relations. As I have argued above, it is impossible for finitary social practices, by themselves, to ground an infinity of logical and mathematical truths. To make this clearer, I will consider in more detail two versions of the immanent constructionist conception, one drawing on the later work of Wittgenstein, and the other on Carnap's philosophy.

Quasi-Wittgensteinian Conventionalism

It is unclear whether Wittgenstein, in the Philosophical Investigations (Wittgenstein (1953)) or Remarks on the Foundations of Mathematics (Wittgenstein (1978)), embraces the IB conventionalist view. His primary target in these works seems to be the Cartesian epistemology of Bertrand Russell, especially his emphasis on "knowledge by acquaintance." In fact, I will argue that IB conventionalism is inconsistent with certain key Wittgensteinian tenets. Nonetheless, it is possible to use some of Wittgenstein's ideas to fill out the immanent constructionist view, and so I will consider such a quasi-Wittgensteinian view in this section.
A quasi-Wittgensteinian could try to turn aside my objections to immanent constructionism by denying that there is in fact an infinity of logical truths. He could argue that the infinity is merely potential and so deny that there is any infinitary fact in need of explanation. I will argue in this section that the quasi-Wittgensteinian conventionalist is implicitly committed to the existence of an actual infinity of rules.
For the quasi-Wittgensteinian, logical and mathematical truths are normative, rather than descriptive. They do not say anything; instead, they merely show or express the rules or norms inherent in our language-game. Thus, the quasi-Wittgensteinian is committed to the thesis that actual human practices embody certain norms: that particular practices follow certain rules.
What sort of thing is a rule, as an element of Wittgenstein's philosophy? We can take a first step toward answering this question by focusing on what I shall call the infinitary upshot of a rule. With each rule we can associate an upshot, a function that assigns certain values (such as 'correct', 'incorrect', 'borderline', and so on) to an infinite set of actions in context (past, present, and future, actual and hypothetical). The IB conventionalist must simply identify a rule with its corresponding upshot.
If we were to acknowledge that a rule is something over and above this upshot, a thing that somehow by itself determines its upshot, we would have moved from the IB to the TB version of conventionalism, since the rule would then be a transcendent object to which we become connected by engaging in a certain social practice.
The fact that a particular rule (i.e., a particular infinitary upshot) is in practice in a particular community (concretely specified) is a fact of a kind that I shall call a practice fact. If the quasi-Wittgensteinian is to have any advantage over the TB conventionalist, it is essential that these practice facts be epistemically accessible: they must be the sort of thing that we can come to know. Wittgenstein assumes that practice facts, like psychological facts, are not something we perceive directly. Instead, there are empirical criteria associated with each such possible fact. When we observe positive criteria for the practice fact, we have good (albeit defeasible) grounds for accepting the corresponding proposition (namely, the proposition that rule R is in practice in community C), and we do not need to ground this appeal to these criteria on anything more fundamental (such as the observation of a positive correlation between the satisfaction of the criteria and the truth of the associated proposition).
A criterial system is a pair of sets: a set of positive criteria and a set of negative criteria. Each criterion in turn is a finite set of observable, effectively decidable conditions. A criterion cannot itself be infinite, or otherwise inaccessible to us, since the whole point of postulating criteria is to give us something we can use to arrive at reasonable opinions concerning the corresponding fact.
Each possible practice fact must somehow be linked to a criterial system. The crucial question is: how are we to account for these linkages? What do they consist in? The quasi-Wittgensteinian must insist that the linkage of criterial systems to practice facts is itself a normative, rather than a descriptive, matter. It is the rules of our language-game, and not any language-independent fact of the matter, that constitute these linkages. This means that the linkage or association of criterial systems with practice facts is grounded in a set of actions that we collectively take, guided by some shared rule. These actions include such things as affirming a practice proposition on the basis of a positive criterion, rejecting such a proposition on the basis of a negative criterion, appealing to criteria in settling questions about practice facts, etc.
An action of associating a criterial system with a practice fact concerning some domain of action is a kind of meta-action. More formally:
Definition 15.1 (The Meta Relation) If action a belongs to the domain of the upshot U of rule R, and action b links a practice fact concerning R with a criterial system C, then b stands in the meta relation to a.
This quasi-Wittgensteinian version of IB conventionalism is incoherent, because the following six propositions are inconsistent:
1. All practice facts (for example, the fact that rule R is practiced in community a) are empirically accessible.
2. A practice fact can be empirically accessible only if there exists a linkage of the fact to some criterial system.
3. Such linkages are entirely the product of rule-governed human actions.
4. The transitive closure of the meta relation between actions is a strict partial order (transitive and irreflexive).
5. There are at most finitely many rule-governed actions.
6. There exist rule-governed actions (i.e., some practice fact is actual).
Propositions 1 through 3 guarantee that, if there are any rule-governed practices at all, there must exist rule-governed meta-actions that link the practice fact to a criterial system. The existence of these rule-governed meta-actions constitutes a further practice fact, the fact that the rules governing these meta-actions are in fact in practice in the community in question. This meta-level practice fact must, in turn, be linked by rule-governed human actions to a meta-level criterial system.
According to proposition 4, the meta relation constitutes a strict partial order, and so there must be an infinite hierarchy of distinct practice facts, with an actual infinity of linking actions. Were the hierarchy to terminate after finitely many stages, the highest-level meta-practice fact would be actual but epistemically inaccessible, contrary to proposition 1. Moreover, proposition 5 states that actual human practices cannot encompass such an infinity of actual actions. Consequently, the Wittgensteinian must give up proposition 3, and with it, the IB conventionalist conception.
If the Wittgensteinian tried instead to give up either proposition 1 or 2, he would be faced with a dilemma. Either he would have to postulate some mysterious form of non-empirical intuition by which we come to have grounds for accepting practice propositions, or he would have to admit that there are truths, expressible in our language, that are in principle inaccessible to us. This would mean that the Wittgensteinian was no better off than the most dogmatic Platonist. We might as well accept that we have a mysterious faculty for intuiting logical and mathematical truths, or that we have a mysterious ability to think about things that are utterly inaccessible to us.
Proposition 4 is perhaps the most doubtful of the six. Couldn't there be a fixed point of the meta relation, an action a that linked a criterial system C to some practice fact concerning a rule R, where a itself belongs to the domain of the upshot of R? Let's call an action embodying such a fixed point a self-determining action. I will contend in the following section that a self-determining action is clearly impossible, given the IB conventionalist hypothesis.

Reference and Self-Reference without Transcendent Objects

At this point, we must return again to the question of the ontological status of rules. If rules are practice-independent objects that autonomously fix their own infinitary upshots, then a self-determining action a might be possible, since a could make reference to a rule R, and R in turn might determine an upshot whose domain included a.
However, admitting that rules exist and determine their upshots independently of our social practices is to abandon the immanent-constructionist hypothesis. Rules so conceived are transcendent, Platonic objects which are associated, by convention, with certain kinds of human action.
Alternatively, if rules are not transcendent objects that determine their upshots independently of human action, then we must simply identify a rule with its upshot. On this conception, rules do not determine but simply are their upshots. However, if rules and their upshots are identical, then self-determining actions are clearly impossible. A meta-action a that associates a practice fact involving rule R with a criterial system C is an action that operates upon a certain rule (R), taken as a completed entity. If R is simply identical to its upshot U, and if the meta-action a is included in the domain of U, then a cannot have reference to the rule R without vicious circularity. In such a case, action a cannot associate a practice fact involving R with a criterial system C, since the status of a is itself an integral part of R.
In order to make this point clearer, I must back up a little and discuss how the IB conventionalist accounts for the relation of reference for general terms.



Reference is a relation between a physical state (a vocalization or a graphical production or a neurological state) and a set of objects (the extension of th [...] it with the meaning-encumbered act (about which more is to follow). For the sake of argument, I am willing at this point to grant to the IB conventionalist the right to treat this reference relation as a brute, irreducible fact, brought about in some way or other by our shared social practices. In contrast, the TB conventionalist or Platonist will understand the reference relation to involve an intermediary object, the transcendent type or universal, which in turn determines an extension independently of our practices. On the TB account, our practices establish a connection (in my view, a causal connection) between the bare proto-signifier and this transcendent object.
The bare proto-signifier by itself has no meaning. It is only the combination of the proto-signifier with the reference-induced extension that can be thought of as meaningful. Thus, for the IB conventionalist, the meaning-encumbered act can be identified with the sum of the bare proto-signifier with the set of objects in its extension. The bare proto-signifier itself can be a member of this extension; there is no reason to insist that the reference relation be irreflexive. However, the meaning-encumbered act cannot belong to its own extension (on pain of vicious circularity), since it is in part constituted by that very extension. The meta relation as I have defined it above is a relation between meaning-encumbered acts. Only a meaning-encumbered act can associate a criterial system with a practice fact. Hence, for the IB conventionalist, the meta relation must be a strict partial ordering.
Perhaps a legal analogy will be helpful here. Suppose we had a law that stated, 'This act shall be legally binding when it is passed by the legislatures of three-fourths of the states'.
This law, call it L, attempts to incorporate within itself a rule of recognition (in H. L. A. Hart's sense) that is to apply to itself.³ A document is not legally binding (it is not a law-encumbered document) unless it meets the conditions of some binding rule of recognition. Hence, no document can provide a legally binding rule of recognition that provides the basis for its own legality. There must be a document or an unwritten rule with a prior basis of legality to supply the conditions of recognition for L. The law cannot be socially constructed or positive rules all the way down. At some point, we must arrive at rules of natural law, which provide a practice-independent basis for recognizing certain positive rules as binding. The legal positivist and the IB conventionalist face precisely analogous infinite regresses.⁴
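As an illustrative aside, the structural core of the regress can be checked mechanically. The following Python sketch is not part of the original argument: the function names and the three-action example are hypothetical, and it simplifies by treating each action as standing in for the practice fact it falls under. It models the meta relation as a finite set of (meta-action, action) pairs, tests whether its transitive closure is irreflexive (proposition 4), and lists the actions that no meta-action stands above.

```python
from itertools import product


def transitive_closure(pairs):
    """Return the transitive closure of a set of (meta_action, action) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_strict_partial_order(pairs):
    """Proposition 4: the closure (transitive by construction) is irreflexive."""
    return all(a != b for a, b in transitive_closure(pairs))


def unlinked_actions(actions, pairs):
    """Actions that no meta-action stands above (a simplified stand-in
    for practice facts left without a criterial linkage)."""
    linked = {lower for _, lower in pairs}
    return set(actions) - linked


# Hypothetical finite practice: a1 is meta to a0, a2 is meta to a1.
actions = {"a0", "a1", "a2"}
meta = {("a1", "a0"), ("a2", "a1")}

print(is_strict_partial_order(meta))    # does the hierarchy respect proposition 4?
print(unlinked_actions(actions, meta))  # which actions lack a meta-action above them?

# Closing the hierarchy into a loop links every action but violates proposition 4.
looped = meta | {("a0", "a2")}
print(is_strict_partial_order(looped))
print(unlinked_actions(actions, looped))
```

Any non-empty finite strict partial order has a maximal element, so a finite stock of actions (proposition 5) always leaves some action with nothing above it, contrary to what propositions 1 through 3 require. Only an infinite ascending chain or a loop avoids this, and the loop is exactly the self-determining action ruled out above.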
³ Compare Article VII of the Constitution of the United States.
⁴ See chapter 22 for further discussion of the incoherency of legal positivism.

The Quasi-Wittgensteinian and Infinity

The quasi-Wittgensteinian is therefore faced with an actual, and not merely a potential, infinity of rules and practices. For a practice fact to be actual, there must be a further actual practice (at a higher level in the meta hierarchy) assigning criteria to the original practice fact, or else it would be in principle unknowable.
It is arguable that Wittgenstein himself would have rejected proposition 3, and with it, the IB conventionalist programme. In the Investigations, Wittgenstein postulates that it is our shared form of life as human beings that provides the linkages between practice facts and empirical criteria. This notion of a form of life is a notoriously obscure one, and one could take the TB conventionalist account developed in this chapter as a fleshing out of Wittgenstein's proposal. The TB conventionalist avoids the infinite regress that plagues the IB conventionalist, since practice facts are, from the TB perspective, ordinary, natural facts. There is no need for human action to make these facts epistemically accessible through the conventional association to them of empirical criteria. Instead, the connection between actual practice and logical facts is sustained by causal connections (including the teleology of mental mechanisms), which can be investigated by the normal methods of natural science.

Kripke's Quasi-Wittgensteinian Solution

In his 1982 work on Wittgenstein and rule-following, Saul Kripke (1982) interprets Wittgenstein as offering a "sceptical solution" to the problem of practice facts. Kripke's Wittgenstein hypothesizes that there are no such facts, that practice propositions have no truth-conditions. Instead, they have only assertibility-conditions. In a recent book, Robert Brandom (1994) suggests a similar strategy. Our social practices generate rules for attributing practice facts to portions of reality, given sets of empirical data. However, the Kripke-Brandom solution only pushes the problem back a step. Kripke and Brandom concede that there is no IB conventionalist solution to the problem of constructing the binary relation between bare proto-signifiers (observable behavior) and infinitary upshots.
They recommend instead that we suppose that our social practices construct a ternary relation between empirical data, bare behavior, and infinitary upshots. (The ternary relation tells us which upshot is rationally attributable to which behaviors, given a body of empirical data.) However, this ternary relation is no less infinite or open-ended than is the original, binary relation. If finite social practices cannot provide a basis for assigning truth-conditions to practice propositions, then, for the very same reason, they are inadequate to the task of constructing infinitary assertibility-conditions for those propositions. Just as only meaning-encumbered acts can associate bare behaviors with infinitary upshots, only meaning-encumbered acts can associate empirical data, bare behaviors, and infinitary upshots. To avoid the infinite regress, we need some meaning-encumbered acts that do not derive their meaningfulness from meaning-encumbered social practices.

Carnapian Conventionalism


A similar problem can be uncovered by examining Carnap's theory of logical and mathematical truths in, for example, The Logical Syntax of Language (Carnap (1937)). For Carnap, logical and mathematical truths are analytic, given somehow by the rules of a language system. The truth of a logical or mathematical sentence is an "internal question," to be answered with reference to the associated conventions and norms. We cannot raise the question of the truth of these sentences ("But are there really such numbers?") as an external question. Instead, our external questions must concern only the usefulness in practice of one language system or another. A crucial question for the Carnapian is the status of what I called practice propositions above. When I ask, "Is language L being used in community C1?" am I raising an internal or an external question? It seems that this must be an external question, since it is inseparable from the problem of judging the relative usefulness of one language over another. If I say that language L is more useful than language L', my evidence must consist of actual or hypothetical populations putting one language or the other into use. If we allowed practice propositions to be treated as internal questions, we would produce some very anomalous results. For example, suppose language L contains the meaning postulate that any successful population is a population using language L. This postulate would seem to trump the crucial external question, making language L maximally useful by self-proclamation. Consequently, it seems we must treat practice propositions about L as external to L. However, a question that is external to L must be internal to something. There must be a second, meta-language L' in which the external questions about L can be formed. This is in effect the point pressed by Quine in his attack on Carnap's analytic/synthetic distinction. 
Quine insisted that the meta-language must, by Carnap's own principles, be a behavioristic one, and Quine demonstrated that practice propositions (propositions concerning which meaning postulates are actually being used in a given population) cannot be decided behavioristically.
There is an independent, and less ad hominem, objection to the Carnapian that can be pressed at this point. By Carnap's principles, there must be questions that are external to the meta-language L', questions concerning practice propositions or usefulness assessments about L'. These external questions must be internal to a meta-meta-language L", and so on to infinity. Whether or not we are behaviorists, it is surely implausible to think that human social practice is adequate to support an actual infinity of meaning postulates, belonging to an infinite hierarchy of meta-languages.
Why must each of these meta-languages be actually used by us? Why couldn't Carnap willingly concede (as Tarski did) that there exist, as abstract objects, an infinite series of linguistic systems, each external to its predecessor? The inadequacy of this response is evident upon simple reflection. For any given linguistic system L, there are an infinite number of external meta-languages, assigning different truth values to the sentence 'L is useful'. In assessing the actual usefulness of L, we must make use of the external meta-language that we ourselves speak. Hence, the infinite regress must be a regress of languages in actual use.
The only way to stop the regress would be to postulate a language for which there are no external questions. However, this would be to abandon a fundamental feature of Carnap's philosophy. Moreover, such a thing is impossible, if we accept the IB conventionalist view. The only language about which external questions cannot be raised is one so impenetrably obscure that no one can understand it, like the language of Hegel's philosophy.
Baseball and other Platonic Objects

Conventionalists often appeal to the phenomenon of rule-governed games as an analogy to the rules of language. Supposedly, no one would be tempted to postulate an eternal, Platonic entity corresponding to something as mundane and contingent as the game of baseball. If we can understand a rule-governed activity like baseball as grounded exclusively in immanent social practices, though, the need for a transcendent basis for language is undercut. However, I would argue that games like baseball are a perfect example of the need for transcendent objects. Indeed, one of the social purposes of having baseball and similar sports is to teach the young the reality of practice-transcending rules. Children seem to begin life as social constructivists (or IB conventionalists). When learning to play a game, children are skeptical about the existence of disinterested appeals to rules: they suspect that what makes something an out depends solely on some local, social consensus. Hence, they often try to manipulate the game for their own advantage, or accuse others (even adults who are making fair and disinterested applications of the rules) of doing so. At some point, a light turns on, and the young player grasps the fact that baseball has an integrity that transcends our fallible attempts to realize it. At this point, they become zealous for the strict and disinterested applications of those rules, even when doing so means a personal loss. They see that the value of playing baseball depends on playing by the rules. Now, of course, there is an element of conventionality to the rules of baseball. We could have played a game with five bases instead of four, or one requiring five strikes instead of three for a strikeout. However, the existence of the rules of baseball depends on the real existence of logically complex types, of negations and conditionals, as well as natural (non-gerrymandered) types involving motion and location and time. 
Without a transcendent basis consisting of these Platonic facts, and without our causal access to them, a genuine game like baseball would be impossible.


Logic as the Precondition of Thought

There is another distinction between logical and factual truths that could be contended for: that logical falsehoods are unintelligible, while merely factual falsehoods are intelligible. I agree that logical impossibilities are unintelligible,

but I do not accept the further inference that this makes logic non-factual. There are certain facts, namely the necessary ones, that it is unintelligible to deny. I would not limit this to logical necessities: it is unintelligible to deny any necessity, whether this is physical or causal or some other sort. It is unintelligible to deny that water is water, and it is also unintelligible to deny that water is H2O. There is a difference between the two: I learn that the one would-be conception is unintelligible by learning logic, and I learn that the other is unintelligible by learning chemistry. A mental representation can represent a real possibility (and so represent intelligibly) only if there is, in the realm of modal reality, a possible situation for the representation to be about. Which representations represent intelligibly is itself a factual matter, a matter to do with the structure of the world. When we say that a logical falsehood is "unintelligible" or "incoherent," there are three things we might mean:

1. It literally cannot be thought.
2. Believing it would make one vulnerable to Dutch book strategies (in which it is possible to lose but impossible to win).
3. It is logically false.

The third sense of "incoherent" of course trivially distinguishes logical impossibilities from other impossibilities. I am not denying that the class of logical necessities forms a natural and interesting class; I am merely denying that the account of logical truth is radically different from the rest of semantics. I would deny that logical falsehoods are incoherent in the first sense above. People do in fact believe logical falsehoods, and this is an important and causally relevant fact about them. I agree that believing logical falsehoods makes one vulnerable to Dutch books, but so does believing any impossibility. The incoherency comes from believing the impossible, not the illogical.
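The second sense of "incoherent" can be made concrete with a small sketch. It assumes, purely for illustration, that possibilities are modeled as truth assignments to a list of atoms and that an agent with credence c in a proposition will pay c times the stake for a bet that pays the stake if the proposition is true. The names `net_payoffs` and `is_sure_loss` are hypothetical and do not come from the text.

```python
from itertools import product

def net_payoffs(proposition, credence, atoms, stake=1.0):
    """Net gain, in each modeled possibility, of buying a bet on `proposition`.

    The agent pays credence * stake and receives stake if the proposition
    is true in that possibility (an assumed, simplified betting model).
    """
    results = []
    for values in product([True, False], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        payoff = stake if proposition(world) else 0.0
        results.append(payoff - credence * stake)
    return results

def is_sure_loss(payoffs):
    """True when the bettor loses in every modeled possibility."""
    return all(p < 0 for p in payoffs)

# A contingent proposition: the bet can be won or lost.
contingent = lambda w: w["A"]
# An impossibility: no modeled possibility makes it true.
impossible = lambda w: w["A"] and not w["A"]

print(is_sure_loss(net_payoffs(contingent, 0.3, ["A"])))
print(is_sure_loss(net_payoffs(impossible, 0.3, ["A"])))
```

On this toy model, any positive credence in a proposition that is true in no possibility yields a losing bet everywhere. The sketch does not care whether the impossibility is logical or of some other kind, which is the point made above.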
In any case, even if logic is in some special sense a precondition of all thought, this fact is irrelevant to the project of explaining the possibility of thinking about and knowing logical truths. If logic is a precondition of all thought, this may give me a reason to think logically, but it does not (by itself) explain how it is that I know logical truths, or what it is that I am talking about when I am doing logic.


The A-Priority and Unrevisability of Logic

Although I am defending a causal theory of logical and mathematical knowledge, it does not follow that I am committed to an empiricist account, a la John Stuart Mill. It is quite possible, and I think probable, that elementary logic and mathematics are knowable a priori, and, moreover, that they are in fact unrevisable, hard-wired into our minds. I am also not claiming that we know the truths of logic by abduction, by inference to the best explanation. Sophisticated scientific inferences like inference to the best explanation already presuppose a


substantial body of logical knowledge, since otherwise we would not be able to judge what is implied by or contrary to a given hypothesis. Logic is largely pre-scientific. My point is that the content and the epistemic status of such a priori convictions stand in need of some kind of causal explanation. The faculty of imagination plays a critical role in the acquisition of new logical and mathematical knowledge. I can discover that the sum of five and seven is twelve, even though I have never encountered twelve identifiable things in one setting. I can imagine two disjoint sets, one of five and the other of seven, and discover that their union must consist of twelve individuals. No manipulation of physical objects is needed. Nonetheless, we can ask: how is it that such features of our faculty of imagination are knowledge conferring? It must have something to do with the origin of the human mind, whether Darwinistically or otherwise. The formation of our faculty of imagination must somehow have been influenced by the relevant logical and mathematical facts, perhaps as these facts were causally efficacious in various episodes in our evolutionary history.
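The imagined exercise with two disjoint sets can be mirrored in a few lines. This is only a sketch: the element labels and the helper name `union_size` are hypothetical, and running the code of course settles nothing about how the corresponding knowledge is acquired.

```python
def union_size(left, right):
    """Size of the union of two collections, which must be disjoint."""
    if not left.isdisjoint(right):
        raise ValueError("the two collections must share no members")
    return len(left | right)

# Hypothetical labels standing in for the imagined individuals.
five = {"a1", "a2", "a3", "a4", "a5"}
seven = {"b1", "b2", "b3", "b4", "b5", "b6", "b7"}

print(union_size(five, seven))  # prints 12
```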


Logical and Physical Necessity

Heretofore I have emphasized the similarities between logical and physical necessity. Both are knowable via their causal influence on sequences of concrete events. Nonetheless, there are clearly different forms of modality: logical and merely physical, to take two examples. It is physically impossible, but logically possible, that I should travel faster than the speed of light. Can I give an account of the difference between the two? It is important in this context to be very clear about what sort of thing it is to which we are attributing possibility or impossibility. For example, is it a situation-token, a situation-type, or a proposition (the combination of a token or tokens with a type)? As an actualist, I believe that the only real tokens are actual ones. So, I view merely possible tokens as some sort of logical construction, built up from actual tokens and various situation-types. Such a construction represents a real possibility just in case the types involved have the modal property of being possibly instantiated (or possibly instantiated by or in a certain relation to certain actual tokens). Similarly, a proposition is possibly true just in case its type is possibly instantiated by its token. Thus, modality is primarily a category of property of situation-types. A situation-type represents a logical possibility just in case some type logically isomorphic to it is possibly instantiated. (By logically isomorphic, I mean that one can be transformed into the other through the substitution of nonlogical elements.) Dually, a situation-type represents a logical impossibility just in case no type logically isomorphic to it is possibly instantiated. Analogously, a type constitutes a physical possibility just in case some type physically isomorphic to it is possibly instantiated. (Physical isomorphism means that one can transform one into the other by substitution of non-physical-type elements.) We normally include logical structure in our definition of physical structure, resulting in the inclusion of all physical possibilities within the class of logical possibilities. However, we need not do so: we could countenance certain types as physically possible but logically impossible. For example, it is physically possible for an electron to have spin +1/2, and it is physically possible for it not to have spin +1/2. We could count the type according to which the electron both has and does not have spin +1/2 as physically possible, but logically impossible. My point is that possibility and necessity tout court are the basic realities. Logical modality and physical modality are two kinds of structure we find within the modal reality of the world. They are distinct, but not fundamentally different in kind. It may be that there is a further difference between logical and physical necessity. It may be that physical laws are only contingently necessary, while the truths of logic are necessarily necessary. This could happen if we find that the laws of physics are themselves the resultant of some more fundamental fact (such as the will of God), while the laws of logic (or some of them) are absolutely underived. A standard distinction between logical and non-logical necessity relies on Tarski's reduction of logical necessity to 'truth in every model'. The inadequacy of such a model-theoretic approach to logical necessity can be seen by considering propositional logic and truth tables. Suppose we try to identify logical truth in propositional logic with true in every interpretation, with the interpretations of the logical connectives simply stipulated by displaying the corresponding truth functions. This theory of logical truth can work only by asserting (if only implicitly) that the rows of the truth tables are necessarily exclusive and exhaustive of all possibilities. This is something that cannot be simply stipulated to be the case. For example, consider just negation.
If by 'false', we mean 'not true', then the fact that the two rows of the standard truth table for negation are exclusive and exhaustive is itself a prior logical necessity, and not simply the product of our stipulating a meaning for 'not'. Alternatively, if 'false' does not simply mean 'not true', then the standard truth table presupposes a substantive thesis of bivalence. In this case also, the mutually exclusive and exhaustive nature of the rows is not merely a product of convention. The semantic fact of bivalence is now something with which we must have some kind of epistemic contact, and this fact of bivalence is itself modal in character: we need to know, not only that every sentence in a certain class is in fact either true or false and not both, but that this holds of necessity. Once again, we encounter a modal fact to which we must have epistemic access.
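The dependence of the truth-table method on the exhaustiveness of its rows can be seen in a small sketch. The third value (`None`, read as "neither true nor false") and the strong Kleene tables used for it are assumptions introduced here for contrast; they are not part of the argument above. The names `k_not`, `k_or`, and `valid` are hypothetical.

```python
from itertools import product

def k_not(a):
    """Negation: classical on True/False, leaves the third value alone."""
    return None if a is None else (not a)

def k_or(a, b):
    """Disjunction on the strong Kleene tables (assumed for illustration)."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def valid(formula, atoms, values):
    """True when `formula` comes out True on every row built from `values`."""
    rows = product(values, repeat=len(atoms))
    return all(formula(dict(zip(atoms, row))) is True for row in rows)

def excluded_middle(v):
    return k_or(v["p"], k_not(v["p"]))

print(valid(excluded_middle, ["p"], [True, False]))        # two rows admitted
print(valid(excluded_middle, ["p"], [True, False, None]))  # a third row admitted
```

The same stipulated tables deliver different verdicts depending on which rows are admitted. The checker itself is silent on whether the two classical rows exhaust the possibilities; that has to be known from elsewhere.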


From Logic to Arithmetic

When compared to our knowledge of logic, our knowledge of arithmetic poses a new challenge. Arithmetic involves the existence and properties of things, the numbers, that seem to exist in a realm causally isolated from our own.


However, this appearance may be deceiving. A number is simply a natural kind of quantifier complex.5 Numbers and their properties are thereby contained in modalized logical facts. Whenever a situation supports a modalized logical fact involving quantifiers and identity, that situation also supports an arithmetical fact involving one or more numbers. For example, the logical type:

corresponds to the arithmetical type 1 + 1 > 1. The number n is simply a kind of quantifier complex occurring in modalized logical facts, a complex consisting of n quantifiers whose variables are declared to be pairwise distinct. For instance, the following type is of type 3:

The existence of logically complex types of this kind is not the result of any human doing. Our capacity to speak a recursive language and to think thoughts of arbitrary logical complexity all depends on the prior existence, in reality, of corresponding logical complexity. The commitment to an infinity of numbers and the commitment to the recursive nature of language are essentially the same thing, as Gödel and Poincaré long ago realized. If we believe in the existence of a recursively defined language containing quantifiers and identity, we have already accepted the existence of the number series, since each number is simply a kind of quantifier complex producible in such a language. In the ancient world, the Pythagoreans and the Eleatic philosophers argued over which was more fundamental: numbers or logic. As T. K. Seung (Seung, 1996, pp. 194-195) has argued, the later Plato reached the conclusion (expressed in his "Parmenides") that the two are inseparable and interdependent. As soon as we admit into our logic formulas of arbitrary complexity, i.e., as soon as we recognize that we are working with a syntax and semantics that can be defined only recursively, we are already committed to the real existence of the natural numbers. Thus, numbers do have causal influence on the world: they do so by figuring in modalized logical facts that constrain what can happen. To posit that every number has a successor is to hypothesize that there exist real modal constraints of this kind of arbitrary logical complexity. Thus, contrary to Hartry Field, arithmetic is not a conservative extension of physical theory. Rather, the axioms of arithmetic are an especially bold conjecture, a set of infinitary generalizations based on our knowledge of their instances. These arithmetical conjectures are confirmed every time we encounter novel situations of great complexity and are able to navigate through these situations successfully, with arithmetic as one of our guides.
5 Alternatively, it may be that each number is the causal ground of the existence of a family of equinumerous quantifier complexes.
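One way to picture a quantifier complex of the kind described in this section, in a notation that is an assumption of this sketch rather than the author's own, is the familiar "at least n" construction: n existential quantifiers whose variables are declared pairwise distinct. The symbols \exists^{\geq n} and \varphi are introduced here only for illustration.

```latex
% Assumed notation: \exists^{\geq n} abbreviates a complex of n existential
% quantifiers with pairwise distinct variables; \varphi is an arbitrary predicate.
\[
  \exists^{\geq 3} x\, \varphi(x) \;:=\;
  \exists x_1 \exists x_2 \exists x_3
  \bigl( x_1 \neq x_2 \wedge x_1 \neq x_3 \wedge x_2 \neq x_3
         \wedge \varphi(x_1) \wedge \varphi(x_2) \wedge \varphi(x_3) \bigr)
\]
\[
  \exists^{\geq n} x\, \varphi(x) \;:=\;
  \exists x_1 \cdots \exists x_n
  \Bigl( \bigwedge_{1 \le i < j \le n} x_i \neq x_j
         \;\wedge\; \bigwedge_{1 \le i \le n} \varphi(x_i) \Bigr)
\]
```

On this rendering, each further number requires one more quantifier and one more batch of distinctness clauses, so a language that can build such formulas recursively already contains a complex for every natural number.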


The Infinity of the Universe

Peano's axioms assert that every number has a successor. We find this law confirmed in our experience, but our experience is (perhaps) limited to rather small numbers. If the universe were finite, all of the logical types containing quantifier complexes of greater cardinality than that of the universe would be uninstantiated. It might be argued that we have little ground for believing in the existence of such uninstantiated types. If these types do not exist, then neither do the corresponding numbers, resulting in a counterexample to Peano's successor axiom. Moreover, the existence of this infinity of numbers is not a contingent fact about the world. We need an argument for the necessary existence of an infinity of numbers. First of all, it is quite plausible to say that science gives us good reason to believe in the existence of infinite pluralities, and so in transfinite cardinal numbers. Consider for example, the usefulness of the real continuum in physics. This argument, however, does not provide grounds for a belief in the necessary existence of the numbers. Secondly, we can turn to the ancient trick, deployed by Plato in the dialogue "Parmenides" and used famously by Frege in the Grundlagen, of using the numbers to number the numbers themselves. If we have two things that are not numbers, then we can infer that the number two exists. This means that there are at least three things: the two non-numbers, and the number two. This type is grounded in the number three, which is provably distinct from the number two. We now have four things, etc. Thus, if at least two things exist necessarily, then an infinity of things do. (In fact, if one thing exists necessarily, and it is not a numerical type and so distinct from each of the numbers, including the property of oneness, then we can get Plato's cascade as a necessary existent.) How do we know that there exist things that are not numbers? How, to use Frege's example, do we know that Julius Caesar is not a number?
The causal/modal theory of mathematics gives us a good answer to this admittedly odd question. Numbers are necessary existents that impinge upon our experience through their incorporation in modal facts. Julius Caesar, and other spatiotemporally located substances, are contingent, and are themselves caused to be by temporally prior events. These facts give us at least a strong presumption in favor of treating the two classes as disjoint.
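The cascade obtained by using the numbers to number the numbers themselves can be sketched as a simple loop. The sketch assumes what the argument requires, namely that each newly counted number is distinct from everything already in the collection; this is modeled by tagging numbers as tuples. The function name `cascade` and the second non-number are hypothetical.

```python
def cascade(non_numbers, steps):
    """Repeatedly add the number that counts everything collected so far."""
    things = list(non_numbers)
    for _ in range(steps):
        # The tag keeps each number distinct from the non-numbers,
        # mirroring the premise that the two classes are disjoint.
        things.append(("number", len(things)))
    return things

for item in cascade(["Julius Caesar", "the Rubicon"], 3):
    print(item)
```

Starting from two non-numbers, the first pass adds the number two, giving three things; the second adds the number three, giving four; and so on without end. The loop halts only because `steps` is finite.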


Kripke and Wittgenstein on Rule-Following

Kripke finds in the work of Wittgenstein (Wittgenstein (1953)) a novel puzzle: how is it that a finite number of acts can fix the content of the rule being followed in a given practice? In the case of arithmetic, the set of arithmetical calculations that ever have or ever will be performed is finite. There are infinitely many different extensions of these data points to the entire three-place Cartesian product of numerals. For example, the "quus" function differs from addition only on pairs of numbers so large that no one will ever use them. What makes it the case that we are following the rule of addition instead


of its counterpart quaddition? Kripke's puzzle seems to put the order of explanation the wrong way around. It is because we mean addition by "plus" that we are (or should be) following the addition rule, not vice versa. The fact that the linguistic and cognitive operations in question represent addition is determined by systematic causal connections between them and facts in the realm of logical necessity. Our arithmetical calculations are (teleologically speaking) supposed to connect in a particular way with the set of first-order logical necessities. Natural numbers can be systematically translated into strings of quantifiers, qualified with suitable identity or non-identity statements, as in Frege's logicist programme. Arithmetical calculation is supposed to facilitate efficient computation of logical necessities via these translations. If these cognitive operations represented quaddition instead, these systematic causal connections would be quite different (and a good deal more complicated). I cannot think of any way of making sense of direct causal connections between bare mathematical facts (situations including only certain numbers and some mathematical relations between them, such as '7 + 5 = 12' or '3 < 5') and temporally-located events and processes. Instead, the connection is more indirect and holistic. Particular logical facts impinge directly on concrete events and processes. Implicit in these logical facts are numbers (types of quantifier complexes) and their mathematical relations (such as succession and inclusion). Representations of numbers and their relations in the mind (which we might call "cognitive arithmetic") are confirmed by their reliability and fruitfulness in generating information about first-order logical necessities (via the translation of numbers into quantifiers restricted by identity and distinctness conditions). 
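The translation of a sum such as '2 + 3 = 5' into a first-order claim about quantifier complexes can be illustrated by brute force over small finite domains. The sketch reads "exactly n things are F" as "there are n pairwise distinct Fs and not n + 1 of them", and checks that whenever exactly m things are F, exactly n are G, and nothing is both, exactly `total` things are F or G. The names `at_least`, `exactly`, `subsets`, and `sum_translation_holds` are hypothetical. A search over domains of bounded size is of course no proof of logical necessity; it only exhibits instances.

```python
from itertools import combinations, product

def at_least(n, extension):
    """There exist n pairwise distinct members of `extension`."""
    return any(True for _ in combinations(extension, n))

def exactly(n, extension):
    """At least n members, and not at least n + 1."""
    return at_least(n, extension) and not at_least(n + 1, extension)

def subsets(domain):
    """Every candidate extension of a predicate over `domain`."""
    for mask in product([False, True], repeat=len(domain)):
        yield frozenset(x for x, keep in zip(domain, mask) if keep)

def sum_translation_holds(m, n, total, max_domain_size):
    """Check the translated claim for m + n = total on all small domains."""
    for size in range(max_domain_size + 1):
        domain = list(range(size))
        for F in subsets(domain):
            for G in subsets(domain):
                antecedent = exactly(m, F) and exactly(n, G) and not (F & G)
                if antecedent and not exactly(total, F | G):
                    return False  # a countermodel was found
    return True

print(sum_translation_holds(2, 3, 5, 6))
print(sum_translation_holds(2, 3, 6, 6))
```

Calculating 2 + 3 = 5 is far cheaper than surveying models in this way, which is one sense in which arithmetical calculation facilitates the efficient computation of logical necessities.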
Figure 15.1: Flow of Arithmetical Information

Thus, there are two systems, real arithmetic and cognitive arithmetic, whose agreement is caused and sustained by a finite number of causal interactions between first-order logical facts, facts about concrete necessities and possibilities, and cognitive facts constituting our knowledge of these modal facts. Arithmetical facts are knowable by virtue of a systematic translation between atomic facts about numbers (facts about the value of particular sums and products) and modal facts of first-order logical necessity. Each atomic fact about numbers can be mapped to a corresponding set of theorems of first-order logic (as in standard logicist treatments of arithmetic). However, isn't this systematic translation between arithmetic and logic itself a rule that can be quusified? The translation is an infinitary rule, but our actual mathematical practice concerns only finitely many instances of this translation scheme. Doesn't the Kripke/Wittgenstein problem arise at this point? The answer to this deviant translation problem is to posit that the numbers really exist, and really participate in those modal situations to which the translation scheme links them. That is, the numbers 2, 3, and 5 are real constituents of the modal situation-token that supports the logical necessity of the translation of '2 + 3 = 5' into first-order logic. Moreover, since these modal situation-tokens enter into causal relations with ordinary events and processes, the individual numbers are also implicated in these causal relations. However, we still must confront the fact that we humans have interacted in this way with only finitely many natural numbers. What determines the extension of the translation scheme to numbers with which we have had no such interaction? The answer would have to take something like the following form. The successor relation is a real, quasi-causal relation between numbers. The successor relation enters into the very being of all numbers other than 0. Thus, in interacting with numbers larger than 0, we are interacting with the successor relation itself. Once we reach the point of forming universal generalizations about numbers, such as the generalization that every number has a successor, our representational state is informed by a causal connection to this successor relation (as a constituent of the relevant modal facts). In other words, succession is a natural kind, like gold or aardvark. This is not to deny that quusified successor-like relations exist. It is merely to deny that these quusified relations enter into causal connections with our epistemic states when we form generalizations about numbers and their successors. Moreover, there are important causal/explanatory asymmetries between succession and quuccession. The fact that every number has a successor explains why every number has a quuccessor, but not vice versa. This asymmetry is crucial, because we can then appeal to Occam's razor (see appendix B) to explain why it is objectively far more likely that our thoughts about succession are caused


by succession and not quuccession. Occam's razor tells us to minimize the factors that we take to be causally relevant to the phenomena to be explained. If we hypothesize that quuccession is causally relevant to our thoughts about succession, then it would follow that succession is also relevant, since succession is needed to explain the structure of quuccession. In contrast, if we hypothesize that it is succession that is directly relevant to the causal explanation of our thoughts of succession, we need not suppose that quuccession is also relevant. Hence, Occam's razor dictates that the direct connection to succession is the best explanation of our basic arithmetical beliefs.
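The asymmetry between succession and its deviant counterpart can be pictured in a toy sketch. The particular deviant rule used here (agree with successor below a threshold, return 0 from the threshold on) and the tiny threshold are assumptions chosen only so that the divergence is visible; the text requires only that the deviant relation differ on numbers no one will ever use. The names `make_quuccessor` and `iterate` are hypothetical.

```python
def successor(n):
    return n + 1

def make_quuccessor(threshold):
    """Build a deviant relation that agrees with successor below `threshold`."""
    def quuccessor(n):
        # Note the direction of dependence: the deviant rule is stated
        # in terms of successor, not the other way around.
        return successor(n) if n < threshold else 0
    return quuccessor

def iterate(step, start, times):
    """Apply `step` to `start` the given number of times."""
    value = start
    for _ in range(times):
        value = step(value)
    return value

quuccessor = make_quuccessor(10)  # hypothetical, unrealistically small threshold

print(iterate(successor, 3, 4), iterate(quuccessor, 3, 4))  # agree below the threshold
print(iterate(successor, 8, 5), iterate(quuccessor, 8, 5))  # diverge once it is crossed
```

Every finite record of calculations that stays below the threshold fits both rules equally well. What separates them in the sketch is that `quuccessor` cannot be written down without `successor`, while the converse dependence is absent, which is the structural point that the appeal to Occam's razor exploits.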


Set Theory and Other Branches of Mathematics

Not all of the branches of mathematics will succumb to the same treatment as does arithmetic. In the case of geometry and real analysis, for instance, a structuralist theory along the lines of Dedekind (1888) seems called for: Euclidean geometry is about any structure that satisfies its axioms, and similarly for non-Euclidean geometries and real analysis. The sort of "elimination" of real numbers proposed by Hartry Field (1980) fits into this structuralist pattern: physics postulates the existence of various systems of physical quantities (distance, duration, mass, field intensity) that validate the axioms of real analysis (under suitable interpretation). Thus, physics, and other empirical sciences, verify the existence of structures of certain kinds, and the various branches of structuralist mathematics investigate the logical consequences of being structures of the postulated kind. Consequently, there is no such thing as the real numbers, or Euclidean space, as subsistent objects. In real analysis, we do not study the properties and relations of a collection of objects (the real numbers); instead, we study the properties of any of a class of structures, each of which validates the axioms of analysis. In contrast, arithmetic is primarily the study of the natural numbers, which are definite objects in their own right, although, secondarily, its results can be applied to any omega-sequence (that is, any sequence sharing the structure of the natural numbers). The difference between arithmetic and analysis lies in the tightness of the connection between arithmetic and logic, via the definability of finite cardinality in first-order predicate logic. Set theory constitutes a more difficult case. I would lean toward classifying it with arithmetic (as having an absolute domain, the sets), rather than with the structuralist branches, such as analysis and geometry. 
Just as numbers can be thought of as the grounds of natural families of logical types (types involving quantifier complexes of the corresponding cardinality), so too can sets be thought of as grounds of natural families of disjunctive logical types. For example, suppose the set A is {a_0, a_1, ..., a_α}. Set A is the cause of the existence of a family of co-extensive types, of which the following is an instance:

Let Av be the family of types co-extensive with this one. We can define membership as follows:

Sets, even infinite sets, are essentially things that can be given by exhaustive enumeration. For this reason, something like Zermelo-Fraenkel set theory must be correct, since an enumeration can include only things that exist prior to the enumeration, a restriction reflected in the iterative hierarchy of Zermelo-Fraenkel theory. How is it that we are influenced causally by sets and their mathematical properties? Quantum mechanics may play a key role in elucidating this connection. As David Bohm and Basil Hiley have argued, quantum mechanics concerns the process by which physical wholes are formed and dissolved (Bohm and Hiley (1993)). Thus, in quantum mechanics, it is sets of physical objects, and not just objects taken individually, that are causally efficacious. The possibility of the formation of physical wholes presupposes a prior metaphysical composition of individuals into sets. When a set of physical systems meets certain conditions, it constitutes a quantum whole, whose properties are not decomposable into the properties of the parts. Our experience of this process of physical composition gives evidence of an underlying metaphysical process by which arbitrary collections of things constitute a metaphysical whole, i.e., a set. Even a nominalist like Field (Field, 1990, page 214), for instance, confidently makes use of an axiom asserting the existence of arbitrary sums of spacetime regions. Presumably, it is this feature of quantum mechanics that is ultimately responsible for the Gestalt phenomenon in perceptual psychology. The formation of a quantum whole, comprising a system of perceptual objects and the perceiver's sensory system, is a precondition of the perception of a holistic Gestalt. Gestalt perception is thus literally the perception of sets of objects, and not just of the objects individually.
Penelope Maddy (1990) has argued for this latter point: the perceptibility of sets of physical objects.6 However, this perceptibility is not sufficient to provide an account of our knowledge of set theory. We need to have some explanation for our knowledge of general facts about sets, of the kind represented by the axioms of set theory. In addition, our perception of sets of physical objects does not seem to have any direct bearing on our knowledge of the higher ranks of set theory, or of the existence of unit sets or the null set.
6 Maddy (Maddy, 1990, page 87) argues that set theory is metaphysically prior to arithmetic, since she postulates that numbers are properties of sets. However, the truths of arithmetic are already implicit in first-order logic with identity, even without any apparent ontological commitment to the existence of sets. See Field's "Reply to Maddy" (Field, 1990, pp. 208-209).


I think the solution to these problems can be found by thinking of our knowledge of sets of higher ranks as analogous to our knowledge of the future. We know the future, not by virtue of any effect of the future upon our minds, but through the influence on our minds and on the future by certain common causes. Similarly, our experience of the metaphysical combination of physical objects into sets, without limitation, provides us with access to a general tendency in things toward agglomeration. This same process of agglomeration is also responsible for the formation of arbitrary sets of higher ranks. Our knowledge of the axioms of powersets, separation, choice, and replacement reflects our knowledge of the universality of this agglomeration process. The iterative hierarchy of Zermelo-Fraenkel set theory reflects a quasi-causal process by which new sets, of higher and higher rank, are generated through the successive agglomeration of sets of lower rank. (This generation of successive ranks does not take place in time, but in an order orthogonal to our time line.) Thus, we know that sets of sets exist, even though we have no direct contact with such things, since the very same tendency of individuals to form sets is at work, both in the physical world and in the realm of sets of higher ranks. This view of mathematical ontology and epistemology has no revisionist implications for the practice of mathematics, unlike many philosophical positions, such as intuitionism, finitism, or modal-structuralism. Mathematics is a healthy, progressive science, exploring the structure of modal reality in much the same way as physics explores the structure of physical forces and interactions. Mathematicians need no help from philosophers to do their job. However, this version of set-theoretic realism does not by itself settle the question of bivalence.
A proposition in the language of set theory is determined to be true if it is true in the universe limited to a particular rank in the iterative hierarchy and its truth is necessarily preserved by the process of agglomeration that leads to still higher ranks. A proposition is determined to be false when its negation becomes determinately true. There could, in principle, be propositions whose truth value never stabilizes in this way but instead fluctuates from truth to falsity and back again as the ranks accumulate.
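The idea of a truth value that stabilizes as ranks accumulate can be illustrated on the first few finite ranks of the hierarchy. The sketch assumes the usual construction in which rank 0 is empty and each later rank is the powerset of the one before, and it evaluates one sample proposition ("some set has exactly two members") with the universe limited to each rank in turn. The names `powerset`, `rank`, and `truth_by_rank` are hypothetical, and a handful of finite ranks can only suggest, not establish, that a truth value is preserved at all higher ranks.

```python
from itertools import chain, combinations

def powerset(universe):
    """All subsets of `universe`, each as a frozenset."""
    items = list(universe)
    tuples = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return frozenset(frozenset(t) for t in tuples)

def rank(n):
    """The n-th finite rank: start from nothing and agglomerate n times."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

def has_two_element_set(universe):
    """Sample proposition: some set in the universe has exactly two members."""
    return any(len(x) == 2 for x in universe)

def truth_by_rank(proposition, max_rank):
    """Truth value of `proposition` with the universe limited to each rank."""
    return [proposition(rank(n)) for n in range(max_rank + 1)]

print(truth_by_rank(has_two_element_set, 4))
# prints [False, False, False, True, True]
```

The sample proposition is purely existential, so once a witness appears at rank 3 it is retained in every later rank. Propositions with a more complicated quantifier structure need not behave this way, which is where the question of bivalence remains open.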


Alternatives to Mathematical Realism


Recent so-called nominalists, such as Hartry Field, have argued that the usefulness of arithmetic and the rest of mathematics in calculating logical consequences does not depend on the postulates of arithmetic's actually being true: it is enough that they are conservative extensions of our non-mathematical theories. A mathematical theory can be a conservative extension only if it is consistent. Hence, we must be able to discover, presumably by scientific induction, that the postulates of arithmetic are consistent. Consistency and conservativeness are themselves mathematical (metalogical) properties. Field is not a fictionalist about these properties, nor about

the infinitely large collections (physical theories) that bear these properties. This seems an arbitrary and unmotivated distinction. Why is the fact that an unsurveyable theory is consistent any less problematic ontologically or epistemologically than any fact in number theory? As we know from Gödel's work, metamathematical facts such as a theory's consistency can in fact be represented by theorems in number theory. Moreover, can we really find reason for believing the postulates of a mathematical theory to be consistent without simultaneously finding reason for believing them to be true? We find that, time and time again, postulating the axioms of arithmetic and using arithmetic in our computations of logical consequence is reliable. This sort of reliability is exactly the phenomenon that leads us to accept (provisionally) the truth of scientific hypotheses outside of mathematics. The only reason Field has for treating mathematics differently is his concern to preserve the causal theory of knowledge. Consequently, to the extent that I can account for the possibility of causal influence on the part of numbers, I have removed all motivation for mathematical anti-realism. In any case, Field does not attempt to provide a fictionalist account of our logical knowledge. Field considers logic (at least, first-order logic) to be epistemologically and ontologically unproblematic. He takes for granted, for example, that the language of our physical theories is recursive, comprising an infinity of sentence types. Only finitely many of these types have concrete instantiations: how does the nominalist explain our cognitive and epistemic access to this actual infinity of logical types? Finally, Field argues that we have good reason to believe that the axioms of standard mathematics are logically consistent. This reason is empirical: the mathematical community over time has succeeded in weeding out many inconsistencies.
If our surviving mathematical theories were inconsistent, someone would have discovered this inconsistency by now. Field's argument depends on an appeal to the causal efficacy of consistency and inconsistency. His inference to the consistency of mathematics is an inference to the best explanation. Such inferences presuppose that the consistency of mathematics can genuinely cause our repeated failures to find an inconsistency. This means that there must be causal connections between logical features (consistency, inconsistency) of certain abstract objects (mathematical theories) and our minds and behaviors qua mathematicians. If our mental states can be causally connected in this way to theories, why not to numbers and sets?



According to structuralism, mathematics is really the study of certain kinds of structures, namely those structures that satisfy the axioms of the theory. As I indicated earlier, I find a structuralist account quite plausible as an account of many branches of mathematics, like geometry or analysis. There are only three parts of mathematics toward which I am inclined to take a straightforwardly realist attitude: logic, arithmetic, and set theory. As far as I know, no one has offered a structuralist account of logic. It is possible to think of arithmetic


as the theory of omega-sequences, but this seems to miss something important: namely, the use of numbers in counting things. As Frege and Russell recognized, there is a tight connection between numbers and quantifiers, so tight that I have identified numbers with natural kinds of quantifier complexes. There is a different problem with taking a structuralist approach to set theory: structuralism seems to involve treating special mathematical theories as sub-theories of a more general theory of structures. However, set theory is itself a theory of possible structures. There is no more general theory in which set theory can be embedded.



The modalists, such as Putnam and Hellman (1989), take mathematics to be the study of possible structures. This enables them to avoid the commitment to the actual existence of infinitary structures. However, they are still vulnerable to the other objections mentioned in the last section. In addition, the modalists owe us an explanation of the possibility of our modal thought and knowledge. Merely possible structures can have no effect on our minds; how, then, is reference to or knowledge of them possible?


Why the Human Mind Is Not a Turing Machine

If we learn about mathematical facts by interacting with modal facts, then there is no in-principle upper bound to the set of learnable mathematical truths. There are no mathematical truths that are in principle unknowable. In light of Gödel's incompleteness theorems, this means that the set of mathematically learnable truths is not recursively enumerable, which in turn means that the human mind, qua unbounded learner of mathematics, cannot be modeled as a Turing machine. This rejection of the Turing machine model as adequate for the representation of the mathematical mind does not necessitate speculation about exotic forms of quantum causation, as Roger Penrose has suggested. It is consistent with the sort of causal Platonism that I am defending to hold that the human brain can be modeled as a Turing machine, or even as a finite automaton, without remainder. The human person, characterized teleologically, cannot be extricated from the human environment. Unlike that of premathematical animals, the human environment is, thanks to its logical/modal component, infinitary in character. It is essential to the Turing machine model that only finitely many squares on the memory tape are actually written on. This means that a Turing machine is always being represented as interacting with a finite environment. The memory tape is infinitely long, but only a finite segment of the tape is used on any actual run of the machine. Since the human person cannot be extricated from the human environment, and the human environment is infinitely rich in information, any Turing model

of the human mind leaves out something essential. The transition from the Turing machine model to the Platonic model is analogous to the transition that occurred from the finite automaton model of Skinnerian behaviorism to the Turing machine model of computational psychology. In both cases, a certain kind of idealization takes place, but the idealization is essential to the illuminating of essential features of the mind. Presumably, there is some finite bound to potential human memory, given the finitude of the cosmos and the ever-encroaching fate of thermodynamic heat death. Thinking of the human mind as a Turing machine involves ignoring this actual bound on potential memory, since this bound is inessential to what the mind is doing when it performs computations. Similarly, there may well be a bound, fixed by the physics of the cosmos, on the set of mathematical truths that human beings can learn. The Platonic model involves ignoring this accidental bound, since doing so is essential to a correct characterization of the nature of mathematical thought and knowledge.
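The remark that only a finite segment of the tape is used on any actual run can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not a model of the mind: the function name `run`, the transition-table encoding, and the one-rule machine `writer` are hypothetical choices introduced for this example. The tape is unbounded in principle, but the sketch stores only the squares actually written on and records which squares the head has reached.

```python
def run(transitions, steps, start_state="q0", blank="_"):
    """Run a one-tape Turing machine for at most `steps` steps.

    `transitions` maps (state, symbol) to (new_state, symbol_to_write, move),
    where move is "L" or "R". The tape is stored sparsely: only squares
    that have been written on appear in `tape`.
    """
    tape = {}        # square index -> symbol, written squares only
    head = 0
    state = start_state
    visited = {0}    # squares the head has occupied so far
    for _ in range(steps):
        symbol = tape.get(head, blank)
        rule = transitions.get((state, symbol))
        if rule is None:
            break    # no applicable rule: the machine halts
        state, write, move = rule
        tape[head] = write
        head += 1 if move == "R" else -1
        visited.add(head)
    return tape, visited


# Hypothetical machine that never halts: it writes "1" and moves right forever.
writer = {("q0", "_"): ("q0", "1", "R")}

for n in (0, 10, 1000):
    tape, visited = run(writer, n)
    # After n steps, at most n squares are written and at most n + 1 visited.
    print(n, len(tape), len(visited))
```

Even for a machine that never halts, each finite run touches only finitely many squares. This is the sense in which the model represents the machine as interacting with a finite environment.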



A Teleological Theory of the Mind

In this chapter, I want to look at a number of problems in the philosophy of mind, and discuss briefly the relevance of a teleological theory of mental representation to these problems, including downward causation, qualia, and free will.


The Irony of Non-Reductive Materialism

It's inevitable that a discussion of the mind/body problem begin with Descartes. It was Descartes who crystallized the mind/body problem as it exists for the modern mind. Descartes's view, of course, is radically dualistic: the mind and body are two separate substances, with radically different essences (the one, thought, and the other, extension). From Descartes's point of view, this dualism represented an essential first step out of the confusion of the medieval synthesis, in which matter was endowed with mind-like attributes (teleological properties) and many functions of the mind (such as sensation) were believed to be essentially dependent on the cooperation of matter. Of course, this dualism extracts a price, a price that the inheritors of modernity have, in general, been unwilling to pay. The price to be paid is the loss of any intelligible and plausible story about the causal links between the mind and the body. Descartes resorted to his infamous speculations about the pineal gland, while Malebranche took the extreme expedient of denying secondary causation altogether. It was Hobbes who prefigured the consensus of the twentieth century by identifying the activities of the mind with certain motions of matter. Twentieth-century materialism has taken a variety of forms: behaviorism, brain state identity theory, functionalism, and non-reductive or supervenient materialism. In each of these cases, the materialist has accepted Descartes's anti-Aristotelian conception of matter, disputing only his account of the mind. However, since a



simple identification of the mind with matter has proved impossible, the dualistic challenge to mental causation remains unmet. Mental descriptions cannot be reduced to or translated into physical descriptions because mental descriptions involve a different level of abstraction. The very same mental states and operations can, in principle, be realized in an infinite number of different physical media. This multiple realizability holds even between mental facts and facts about algorithms or computational processes. If mental descriptions involve simply a re-description of the physical facts at a very high degree of abstraction, they would seem to be causally redundant. All the real work of making things happen takes place at the level of the fully determinate physical facts. Mental facts merely supervene upon these purely physical goings-on and, therefore, are merely epiphenomenal, a causally inert shadow cast by physical reality on a linguistic-conceptual screen that abstracts from that reality only some general features. The intentional stance (as Dennett puts it) is no doubt a useful stance to take, but it gives us no access to the causally relevant features of the situation. Non-reductive materialism, then, despite its origins in dissatisfaction with the failure of Cartesian dualism to explain mind/body interaction, is saddled in the end with an equally insoluble version of the very same problem. The solution lies in reconsidering both parts of Descartes's legacy to the modern world: not only his account of the mind, but also his account of the physical world. Only by taking seriously the realm of teleology in nature can the paradox of mind/body dualism be overcome.


Supervenience and Type and Token Identity

Supervenience is a relation between classes of types. One class of types is said to supervene on a second when, necessarily, which type from the first class is realized in a given token is always determined by which type from the second class is realized by the same token (or possibly, which types are realized by the same and other actual tokens). The key to acquiring a clear conception of supervenience is to clarify the meaning of 'determined by'. We get two quite different pictures, depending on how we understand this relation of determination. The simplest model of supervenience, given the existence of tokens, types, and a three- or four-valued interpretation linking them, is to say that one class of types is determined by a second just in case, whenever a token has a member $\phi$ of the first class, it also has some set C of types from the second class such that every token having all the types in C necessarily realizes type $\phi$. Let's call this relation strict supervenience.

Strict Supervenience: Class A of types strictly supervenes on class B iff A and B are disjoint, and for every possible state-token $s$ and every type $\phi$ in A, if $s \models \phi$, then: there exists a subset C of B such that $s$ supports C and, for every possible token $s'$, if $s'$ supports C, then $s'$ supports $\phi$.¹
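The definition of strict supervenience can be checked mechanically on a finite toy model. The following Python is a minimal sketch under simplifying assumptions that go beyond the text: the space of possible tokens is taken to be a finite dictionary, support is treated as two-valued rather than three- or four-valued, and the names `strictly_supervenes`, `MENTAL`, `PHYSICAL`, and the sample types are hypothetical. The sketch uses the fact that, if any witness set C exists for a token, then the set of all B-types that the token supports is also a witness.

```python
def strictly_supervenes(class_a, class_b, tokens):
    """Check strict supervenience of class_a on class_b in a finite model.

    `tokens` maps each possible token to the frozenset of types it supports.
    """
    if class_a & class_b:
        return False  # the definition requires A and B to be disjoint
    for s_types in tokens.values():
        for phi in class_a & s_types:
            # Largest candidate for C: every B-type that s supports.
            witness = class_b & s_types
            for other_types in tokens.values():
                # Any token supporting all of C must also support phi.
                if witness <= other_types and phi not in other_types:
                    return False
    return True


# Hypothetical type classes and models, for illustration only.
MENTAL = frozenset({"pain"})
PHYSICAL = frozenset({"c_fibers", "silicon_state"})

model_ok = {
    "s1": frozenset({"pain", "c_fibers"}),
    "s2": frozenset({"pain", "silicon_state"}),
    "s3": frozenset(),
}
# Adding a token with the physical type but without the mental type
# breaks strict supervenience.
model_bad = dict(model_ok, s4=frozenset({"c_fibers"}))

print(strictly_supervenes(MENTAL, PHYSICAL, model_ok))
print(strictly_supervenes(MENTAL, PHYSICAL, model_bad))
```

In the first model the mental type is realized in two different physical ways, yet each physical realizer suffices for it, so the condition holds. In the second model a token supports the same physical type without the mental type, so the condition fails.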


Not every situation-token has a spatial or temporal location. Some tokens, for example, realize eternal, non-local facts, such as modal or causal facts. Let's say that two tokens coincide just in case they share the same spatiotemporally located parts. Coincidence is thus a weaker relation than identity. We can use this weaker relation to define a form of loose supervenience. A class A loosely supervenes on class B if and only if, whenever $s$ is a possible token realizing some $\phi$ in A, and $s$ is part of some world $w$, there must be a token $s'$ that coincides with $s$ and is also part of $w$, and some set C of types in B realized by $s'$, such that in every world in which some token realizes every member of C, there is some coincident token of type $\phi$.

Loose Supervenience: Class A of types loosely supervenes on class B iff A and B are disjoint, and for every possible token $s$, every type $\phi$ in A, and every world $w$, if $s$ is part of $w$ and $s$ supports $\phi$, then: there exists a subset C of B and a token $s'$ that is part of $w$, coincides with $s$, and supports C such that for every world $w'$ and every token $s_1$ that is part of $w'$ and that supports C there exists a token $s_2$ that is also a part of $w'$, that coincides with $s_1$, and that supports $\phi$.

Where strict supervenience holds, it warrants an ontological conclusion: the members of the supervening class of types are each identical to some (possibly infinitary) construction from the members of the second class of types. This entailment is supported by the following definition of type identity: Type $\phi$ is identical to type $\psi$ iff every possible token supporting $\phi$ also supports $\psi$, and vice versa. Loose supervenience, in contrast, warrants no such ontological conclusion. I will argue that the relation between the mental and the physical is one of at most loose, not strict, supervenience. 
For example, in the case of the perception of color, there is good reason to expect a very reliable connection between the quality of the experience and certain neural event-types, since the whole point of color perception is to bring us into attunement with certain physical regularities in our environment. If there were significant breakdown in the determination (loosely speaking) of the mental by the physical, perception could not perform its proper function. Whenever a mental event of a certain perceptual type occurs, there will be a coincident physical event belonging to one of a definite class of neurological types, and whenever one of these neurological types occurs, a coincident mental state-token of the corresponding perceptual type will also occur.
1 Strict supervenience corresponds to Kim's modal operator definition of strong supervenience (Kim, 1997a, p. 188).


Functional types are higher-order types, involving quantification over first-order types. A functional type has the logical form:

$\lambda x : \exists \nu\,(\phi(\nu) \;\&\; (x \models \nu))$

In this formula, $\phi$ is a metalinguistic variable, placing some condition on the type-variable $\nu$. A token $s$ belongs to this higher-order type just in case it belongs to some type $\nu$ meeting the condition $\phi$. A crucial question concerning type-identity is this: supposing that the set A consists of all the types meeting the condition $\phi$, and $\bigvee A$ is the disjunction of the members of A, is the higher-order type identical to the disjunctive type $\bigvee A$? For example, suppose that A consists of an infinite set of physical types. Is the type $\bigvee A$ also a physical type, and is the higher-order functional type $\lambda x : \exists \nu\,(\phi(\nu) \;\&\; (x \models \nu))$ identical to $\bigvee A$? In order to decide this question, we must ask whether there is in fact a set containing all the physical types that could possibly satisfy condition $\phi$. If we are moderate actualists, we should hold that a type exists in actuality only if it has a token,² but we should also concede that there could have existed types that do not in fact exist. If there are no non-actual types, neither are there any sets, such as A, containing non-actual types. Consequently, neither is there such a type as the disjunctive type $\bigvee A$. Thus, the higher-order property $\lambda x : \exists \nu\,(\phi(\nu) \;\&\; (x \models \nu))$ exists in actuality, but there is no disjunctive type $\bigvee A$. I will, therefore, deny that higher-order types, including teleofunctional types, are identical to any physical type. Another argument for the same conclusion is independent of actualism. Even if all types that exist in any possible world exist in this one, there may not be a set of possible physical types. Consequently, there might exist no disjunctive physical type equivalent to a given higher-order type. Now I will turn to the thesis of token identity. What does it mean to say that mental states are token-identical to physical states? One interpretation of this claim is that every token realizing a mental type also realizes at least one physical type. However, this interpretation is too weak. 
Suppose every token realizing a mental type had two parts, one physical and one super-physical. In this case, every mental token would realize some physical type, by virtue of its physical part, but it would be misleading to say that mental tokens are just physical tokens. A stronger version of the thesis of token-identity would go something like this. Class A of types is token-identical to class B of types just in case, for every possible token $s$ and every type $\phi$ in A, there is a type $\psi$ in B such that $s$ supports type $\psi$, but no proper part of $s$ does so. In the remainder of this chapter, I will lay out my reasons for denying both the strict supervenience of the mental on the physical and the token-identity
2 Alternatively, we might hold that a type exists only if there is some token of some type that is a determination of the same determinable. All that is needed to make this argument work is to assume that physical types exist contingently.

thesis. At the same time, I will avoid any form of substance dualism or interactionism. The secret to this solution is the introduction of teleological types, which are a kind of higher-order causal property. Mental types are identified neither with physical types (not even infinitely complex disjunctions of physical types) nor with some super-physical, first-order property (as in classical Cartesianism), but with types involving higher-order causation.


Downward Causation versus Epiphenomenalism

The problem of downward causation is a major source of worry about causal, functional, and teleological theories of the mind. My own account of mental representation makes the representational character of a neural state a higher-order property of the state, loosely supervenient on the first-order properties and the causal and modal facts of the world. This at least suggests that the mentality or representationality of the neural states is itself causally inert, riding epiphenomenally on top of the "real" causal story, which occurs exclusively at the level of physics. This is a caricature of the account I have given, however. Mental states are causally efficacious, even efficacious at the physical level, and mental properties can figure as such in perfectly good causal explanations of physical events. The account is quite far from being epiphenomenal. There are two separate issues that need to be examined: (1) do mental situations cause physical situations, and (2) do mental situation-types figure in genuine causal explanations of the instantiation of physical types? What are mental-state tokens? They include, but are not identical to, physical-state tokens. My position is an inclusion thesis, not an identity thesis. In addition to some physical or neural state, a mental state includes information about the somatic and environmental context of this state and about the causal/modal structure of the world, with sufficient information to make it objectively likely that the causal antecedents of the state fulfill Wright's definition of teleofunction. Thus, a mental situation-token can be decomposed into three parts, one entirely atemporal, one temporal and remote, and one temporal and local. The atemporal part is the situation supporting the modal and causal facts that link the physical-state type with its characteristic effect. 
The second part gives the relevant physical context of the immediate physical state, supporting the fact that there is an objective likelihood that the first part (the causal connections) was indeed involved in causing that immediate state (the third part). The local, temporal part is the physical state in the brain that has the representational function. The second part of the mental token is the somatic context of the third part: those features of the human body that, taken as a whole together with the relevant causal constraints (the first part) make it objectively likely that the


third part satisfies the Wrightian definition for proper function.³ The first and third parts of the mental-state token are always directly involved in producing the succeeding mental states and behavior of the agent, and the second part is at least indirectly involved.⁴ Hence, the mental states are causally relevant to physical events. Epiphenomenalism, in the classic sense, is avoided. In fact, the main difficulty is explaining not how mental tokens can be causes, but how they can be effects. Only one component of mental events can be caused by changes both within and without, namely, the localized physical component. The other components are unchanged and unaffected through these vicissitudes. This account of mental token-causation could be labeled the 'catalytic theory', since the eternal and holistic (non-local) components of the mental events participate in the causal process without being affected themselves, as catalysts participate in chemical processes without undergoing changes. Mental state M1 causes the local/physical component of mental state M2. All of the components of M1 are involved in the causal connection, but only one component of M2 is an effect. However, the other two components of M2 are exactly the same as the corresponding components of M1. These two components are the constant element in the process. The following diagram illustrates this catalytic theory:

Figure 16.1: Mental Causation

The diagram illustrates a typical mental-state/mental-state causal interaction. A mental-state token M1 consists of three components, an eternal, modal component E1, a holistic, physical component H1, and a localized, physical
3 See my non-retrospective definition of teleological function in section 12.3.
4 In light of the measurement problem of quantum mechanics, it may be that the wider physical context, including this second part of the mental token, is always directly involved in producing the behavior. In other words, QM is inconsistent with the kind of causal locality that underlies an atomistic account of physical causation. See also section 18.5.

component P1. The three jointly produce a localized effect, P2, which in turn forms part of a new mental state, M2, when combined with the two unchanging, "catalytic" elements E1 and H1. M2 can, in turn, act as a cause of further state-tokens. In the case of causal explanation (involving types as well as tokens), there is in principle no reason why an instantiation of a mental property could not be part of a genuinely causal explanation of the instantiation of some physical property. In section 5.6.1, I gave an account of heterogeneous causal explanations, in which the system of classification used in the explanans differs from that used in the explanandum. Mental-to-physical explanation is an example of such heterogeneous explanation. Nonetheless, there is a legitimate basis for worry, since it seems that it is the first-order causal properties of a mental-state token that are needed in explaining behavior, and not the second-order and holistic properties that are involved in determining the token's teleological and representational character. A neural state of a certain physical kind will have the same effect on behavior, whether or not it is likely to have been caused in such a way as to have the function of representing some information. The representational aspect of the token seems to be exclusively hypothetical or backward-looking and to have no bearing on the effects of the mental state. This worry is quite warranted in the case of organisms with very primitive functional organization. For example, an organism that exhibits only reflexes has representational states, but the representational character of these states is irrelevant to explaining its behavior. The explanation of behavior is possible only by making reference to the causal properties of its neural states, not the teleological or representational properties. 
However, when an organism has a more interesting mental life by virtue of a recursively structured functional organization, functional properties can be used in explaining its behavior. For example, if the organism has both perceptions and appetites/aversions, the function of the appetites and aversions is, to some extent, definable only in terms of the presence or absence of perceptions of certain kinds. In other words, we begin to encounter higher-order functions (although not yet higher-order representations). An appetite for some state $\phi$ is a state that tends to lead to behaviors that are represented internally as likely to lead to $\phi$: hence, the effect of the appetite depends on the functional states of the organism's perceptual faculties. This means that, in the presence of an appropriate appetite, the functional/representational character of a mental state can be used to explain the organism's behavior. (For a more detailed discussion of this sort of example, see section 8.5. See also Tyler Burge's classic paper on the subject, "Individuation and Causation in Psychology" (Burge (1989)).) By way of illustration, consider the higher-order property of fragility. In explaining why a particular object breaks, there is no need to posit the existence of a distinct, dispositional property, over and above the particular physical and chemical features of the object that explain its fragility. However, suppose there is a warehouse to which a variety of kinds of glass objects are sent. As each shipment arrives, samples of the objects are subjected to a variety of tests, whose


function is to identify the property of fragility. Shipments determined to be fragile are stored in the north side of the warehouse, and non-fragile shipments are stored in the south side. In this case, thanks to the participation of the objects in a higher-order functional system, the property of fragility is causally explanatory of the location of shipments within the warehouse, and hence we have good reason to posit the existence of such a dispositional property. It is true that whenever the organism's behavior can be explained causally in terms of the functional and representational properties of its internal state, the behavior can also be explained solely in terms of the physical and first-order causal properties of that state. There is also some sense in which the first-order explanation is "more fundamental" than the higher-order explanation. However, this fact does not render the explanation in functional terms non-causal, or merely heuristic. Nor does it entail the occurrence of some odd sort of overdetermination. The two explanations do not compete with each other, as two independent physical explanations would do. Genuine overdetermination (along with the possibility of competition) exists only when one of two conditions is met: (1) the situation-tokens involved in the two explanations are mereologically disjoint and causally independent, or (2) the situation-types involved in the two explanations are logically unrelated, in particular, neither is an instance (at a lower level) of the other.⁵ In the case of a functional and an underlying physical explanation, neither of these two conditions is met. First, the local physical token is part of the functional token. In addition to the local physical token, the functional token includes components that support the modal and contextual facts needed to give the physical type its functional characteristic. Second, the physical type is an instantiation of the functional type. 
The functional type includes quantification over physical types, and the corresponding physical explanans involves a physical type that is the relevant instance of the generalization contained in the functional type. Hence, the two explanations are too intimately connected to compete with one another. Mental states can be used in genuine causal explanations insofar as they participate in a higher-order functional system. For example, the inferential system has as one of its functions the detection of beliefs with related logical forms that match one of its inference schemata. The contents of logically related beliefs can thus figure in genuine causal explanations of the production of new beliefs through logical inference. Similarly, beliefs, desires, and intentions can enter into causal explanations of the production and revision of intentions. Jaegwon Kim (1997b) has argued that higher-order types can never figure in genuine type-level causal explanations, because they are gerrymandered or unnatural or unprojectible, in the same way as genuinely disjunctive types. I have already given an account of genuinely disjunctive types in section 4.8.2. According to that account, what makes a type genuinely disjunctive is the separability of any causal constraint involving that type into two or more constraints, each involving only one of the disjuncts. If higher-order types, such as functions

5 See also section 4.8.2.

or representations, figure in causal constraints that are not separable in this way, then we have a principled basis for distinguishing them from the sort of genuinely disjunctive types that cannot be causally efficacious. The existence of higher-order functions, functions that take as inputs other functions, including representations, would support such non-separable causal constraints. Hence, since we have good reason, from biology and cognitive psychology, to believe in such higher-order functions, we have good reason to believe that at least some higher-order types are causally relevant.


Two Further Problems of Mental Causation

Jaegwon Kim has very helpfully identified three problems of mental causation: the anomalism problem, the problem of syntacticalism, and the explanatory exclusion problem (Kim (1991)). In the previous section, I explained how my account handles the explanatory exclusion problem. In this section, I will take up Kim's other two problems. The anomalism problem arises from Davidson's thesis that mental properties do not obey strict, exceptionless laws. I have argued that exceptionless laws are not a necessary condition of causal connection. Indeed, I argued in part I that there is a much better fit between causation and non-strict, defeasible laws than there is between causation and strict laws (see chapters 4 and 5). The problem of syntacticalism is perhaps best thought of as a problem of causal locality. Syntactic properties of brain states are intrinsic properties, while semantic properties often have a historical and relational component. Only intrinsic states can contribute directly to the causation of behavior. Hence, mental-content states play no immediate causal role in behavior. On my account, mental-content tokens, like other teleofunctional tokens, include components that are intrinsic states of the human body and of the relevant part of the brain. Hence, they do participate, via these components, in the production of behavior. Moreover, when the behavior is characterized functionally, all of the components of mental-state tokens are involved in the causal process, since without the modal components, the mental-state tokens would not carry the appropriate content and so would not be appropriate inputs to the higher-order functional system that includes the functionally characterized behavior. For example, if the behavior in question is a token of the action of expressing disapproval, then the cause of the behavior will typically include a mental state of disapproval (or the intention to mimic disapproval). 
Without a mental state with such a content, the higher-order function of expressing disapproval would not be triggered into action. Hence, the semantic content of the mental state is causally efficacious.




In experiencing phenomenal qualia, we experience our own mental states as representational. Does this mean that we have some independent access to the causal facts and historical connections that make these perceptual states representational? This is not necessary. In the simplest cases, our perceptual states simply are states of awareness of some feature of the environment. Some materialists, such as Smart and Armstrong, have compared apperception to proprioception, the perception of the internal states of our own body. Apperception is sometimes likened to the result of some kind of "brain scanning" faculty of the brain. This seems to be a mistake. My access to my own qualia is more privileged and less fallible than my access to my own bodily states, including my own brain states. Apperception should not be thought of as perception of perception. Instead, I would suggest that we borrow an idea of Donald Davidson's, viz., his 'paratactic' theory of indirect quotation (Davidson (1967)). Davidson argues that in uttering a sentence like 'Jones says that it is raining', we are using, not merely mentioning, the sentence 'it is raining', and that we then point to that very speech-act by means of the pronoun 'that' in asserting 'Jones says that'. Whether or not this theory will work for quotation, something analogous is quite promising as an account of apperception. The act of apperception actually incorporates the perception as one of its parts, thereby incorporating the representational content of the perception as part of its more complex content. An act of apperception is something like a thought with the form 'I am experiencing that', where the pronoun 'that' points to a perceptual state. A paratactic theory of apperception can explain the difference between apperception and the perception of a perception, and so can explain the privileged access each person has to his own mental states. 
When I perceive one of my own bodily states, that state has itself no representational content. My perception of the bodily state must include an original act of representation, with the attendant possibility of misrepresentation or error. Similarly, when I perceive that you perceive something, my perception cannot include your perception as a part, and so my perception must include a separate representation of the content of your perception, again opening up the possibility of error. However, when I apperceive one of my own perceptions, the act of perception can become incorporated into the apperception, and the content of the apperception can be determined directly by the content of the perception. Error is impossible, at least at this stage. When I conceptualize my experience, categorizing it as an experience of 'red', for example, it is possible that I will make an error, just as I might misapply 'red' to an external object. This account of qualia depends on the assumption that all qualia correspond to mental representations. This must include the secondary qualia of color, taste, odor, and so on. In order for these states to be representational, they must represent real qualities of the perceived objects. What property of a perceived surface is the quality of color? I would suggest that colors are extrinsic teleological properties. Very roughly, a surface has the quality of red just in case it



has some physical property with the extrinsic function (qua part of the ecological niche of humankind) of stimulating the human visual cortex in a particular way. This is a perfectly objective, although admittedly anthropocentric, property of the surface. I would reject attempts to treat secondary qualities as unreal projections of human sensibility. For example, it is a mistake to suppose that something is red if and only if it appears red to normal observers in standardized circumstances. In addition to the normality of the observer and of the environment, it is also necessary to add that the way in which the appearance of the object is caused must accord with the proper function of the faculty of color perception. For example, suppose that it turned out that a certain iron ore appeared green in standard conditions to normal observers, not by virtue of reflecting the right sort of light to the observers' retinas, but by virtue of generating an unusual sort of magnetic field that directly stimulates the visual cortex of observers in such a way as to make everything look greenish. Such an ore would not really be green, since it does not have the proper (extrinsic) function of causing green sensations in humans. The causal pathway is functionally deviant.

16.6 Problem Cases

16.6.1 The Inverted Spectrum

Functionalist accounts of consciousness cannot account for what seems to be a genuine possibility: individuals whose behavior is indistinguishable from that of normal human beings, but who systematically experience things as differently colored from the way they are. They use the same words to describe the colors of things as normal speakers do, but they experience red things as blue, blue things as orange, and so on. No amount of observation of behavior could reveal such facts, but they seem to be possible nonetheless. On a teleological account of consciousness, such inverted spectrum cases are quite possible, and are even discoverable in principle, although not through the observation of behavior alone. If a person is in a neurological state with the teleofunction of representing things as blue whenever she observes a red object in normal circumstances, then there is indeed a mismatch between experience and reality, even though this mismatch is entirely covert. Ned Block (1990) has produced an interesting variation on the inverted spectrum problem, one intended to demonstrate the falsity of accounts like mine that attempt to explain qualia in terms of intentional contents. A normal human undergoes a spectrum inversion operation and is simultaneously transported to Inverted Earth, a planet on which things similar to things on earth really have colors that are inverted relative to their counterparts on earth. On Inverted Earth, the sky is yellow, grass is red, etc. To the transportee, everything appears normal on Inverted Earth, thanks to his color-inversion operation. He notices no change in his internal qualia. However, Block argues, as time progresses, the intentional contents of the transportee's sensory states shift, so that



blue experiences carry information about the yellowness of their objects, green experiences carry information about the redness of their objects, etc. If experiential qualia are determined by intentional contents, then we would have to say that the subject's qualia gradually shift their external correlations, without the subject's being able to detect any difference. This seems quite improbable. Since I have taken the position in chapter 14 that teleofunctional states are narrow states, supervening on the intrinsic character of the human body (together with certain facts about modality and objective chance), I deny that transportation to Inverted Earth has or could have any effect on the intentional contents of the perceptual states. The interesting question for me is this: when does a spectrum-inversion operation affect intentional contents and thereby experiential qualia? From my perspective, everything in this case depends on the exact nature of the spectrum-inversion operation. If the operation consists in adding a special lens to the eye that transforms the incoming light, then I would guess that neither the qualia nor the intentional contents change. In this case, the most likely scenario for the coming-to-be of the altered body would be one in which ordinary vision (without the added lens) is the normal state, favored by natural selection, and the lens, which introduces only unnecessary complication, is some sort of adventitious addition. Consequently, it would be ordinary sense perception sans lens that would fix the intentional contents of the perceptual states. In contrast, suppose that the inversion operation involved a complex process of rewiring the subject's rods and cones. If the result is indistinguishable from a vision system that might well have arisen directly in nature, then I would argue that the operation changes both the intentional contents of and the qualia associated with the resulting brain states. 
This is impossible if we suppose that qualia must supervene on the local state of the central nervous system. However, I see no reason for making this assumption. The subject (assuming now that he is not transported to Inverted Earth) will report an inversion of qualia associated with seeing familiar objects. There are, however, two possible explanations of these reports: (1) the associated qualia have really been inverted by the operation, and (2) the subject's memories of the qualia associated with seeing familiar objects in the past have been systematically perverted by the operation. In the case of the rewiring of the retina, I would claim that (2) is the correct explanation. After the operation, the subject no longer has a human visual system. His system is now that of a distinct species. The qualia that he experiences when observing familiar objects are now utterly incommensurable with human color qualia. He now experiences qualia that are ineffably different from any we have experienced, or that he experienced in the past. His memories of his own qualia, experienced when he had a human visual system, are systematically in error, based on a faulty identification of those old qualia with the new qualia he experiences after the operation.




Mary the Color Scientist

Frank Jackson (1996) asks us to imagine Mary, a color scientist who knows everything there is to know about the physiology and physics of color and color perception, but who, sadly, is congenitally color blind. Jackson argues that there is something we know about the color red that Mary does not know: namely, how red things look (to a human with normal color vision). This fact is somehow extra-physical and extra-causal, since, by hypothesis, Mary knows all relevant physical and functional facts. To make this case especially relevant in the present context, we can imagine that Mary knows all causal and teleofunctional facts as well. How then can it be that redness is simply a teleofunctional property? I think Jackson's problem is one of the most difficult, but also one of the most important, problems for the philosophy of mind. I do not have a fully satisfactory solution, but I will sketch out briefly the best account I can find. I am inclined to say that although Mary knows all the physical facts of the situation, there are certain causal and teleofunctional facts that she, as a colorblind person, cannot have access to.6 I am inclined to believe that the experience of red (or of a specific shade of red) carries information about the world that is not representable in the language of physics, not even the language of a hypothetical, ideal physics. Mary is ignorant, not only of certain psychological facts (how red things look), but also of certain facts about colored surfaces (that this chair is red). She can learn that the chair is a color that is commonly called 'red' (or, more precisely, that the chair has a property commonly called 'the color red'), but she cannot learn that the chair is red. 
Even if it is true, as I am willing to grant, that color properties loosely supervene on physical properties, so the chair could not be a different color without differing in some relevant physical properties, it does not follow that exactly the same information carried by the state of perceiving the redness of the chair can be carried by some sentence of physics (even of the ideal physics). The boundary of the region in physical state-space upon which the shade of red supervenes is very probably fractal (infinitely complex). In addition, the indeterminacies and aspects of vagueness associated with the color red almost certainly have no exact isomorph among the concepts of physics. One thing that makes the Mary case so intractable is the Cartesian error of locating the secondary qualities entirely in the mind. If Mary were omniscient about the extra-mental objects of perception, there would be no way to limit her knowledge of the contents of mental states. However, if we can look at physics, not as a potentially exhaustive treasury of truths about the mental world, but rather as the product of a ruthless degree of abstraction and idealization, resulting in a very narrow but extremely useful mode of description, then we can understand why the mental cannot be reduced to the physical. What bars Mary from access to the properties of red and of being appeared to redly is the existence of a certain kind of epistemic circle (analogous to the

6 This is exactly the position that Gilbert Harman (1990) takes.



problem of the "hermeneutic circle"). Red can be identified with the teleofunctional state whose extrinsic proper function is to produce (via the normal operation of the visual system) the state of being-appeared-to-redly in humans. The state of being-appeared-to-redly can be identified as the teleofunctional state in humans whose intrinsic proper function is to convey information about the contemporary presence of a red object in a certain place. The two functions are constitutionally intertwined: each exists for the sake of the other.7 There is no real mystery about how this came to be: color perception enables us to classify and track physical objects more effectively, and shared color perception enables us to use colored objects and color terms to enhance communication (a red sign means Stop!). Nonetheless, the tightness of the circularity bars those lacking normal human color perception from any direct cognitive access to the two properties, and thus to any facts including either one. In the case of colors, the microphysical structures that underlie the sensible qualities are of no interest to us, except insofar as they stably and reliably support the same colors as context and perceiver are varied. Color perceptions convey but do not represent information about the reflectances of different wavelengths of light, since it is not part of the function of color perception to carry this information, but only to carry information about how surfaces normally appear to human observers. Color perception and the perception of auditory qualities are perhaps distinctive in this respect: our perception of primary qualities, and also our perception of many smells and tastes, do often have the function of carrying information about the underlying physical and chemical properties of the perceived object. 
For example, the function of the taste of saltiness is to indicate the presence of NaCl and similar compounds. We can distinguish two components of the representational content of sensory qualia: (1) the intersubjective component, and (2) the purely physical component. The content of every form of qualia contains an intersubjective component. Some forms of qualia have a content with a purely physical component, while others, such as color, do not. In any case, it is knowledge of the intersubjective component that depends (for us, at least) on actual experience of the sensations in question. No quantity of information about the physical component of the content, or the physical properties upon which the quality supervenes, can provide knowledge of this intersubjective component. The circularity of secondary qualities and their corresponding phenomenal appearances poses a problem at the semantic or ontological level. If redness and the appearance of redness are complementary, interdependent functions, there would seem to be a problem about fixing the extension of words like 'red' in natural language. If 'red' is defined as referring to the extrinsic function of producing reddish appearances, and 'reddish appearance' is defined as the intrinsic function of a mental state to carry information about the location of a
7 To put the matter formally, the property of redness can be identified with the extrinsic teleological type $\exists Y\,(Y \mathbin{\&} \Gamma_{ext}(Y, \text{human}, \text{appeared-to-redly}))$, and the property of being-appeared-to-redly with the intrinsic teleological type $\exists Y\,(Y \mathbin{\&} \Gamma_{int}(Y, \text{human}, (Y \leadsto_{R} \text{red})))$.



red surface, then we are faced with the puzzle of how to understand the resulting circularity. In a recent book on the Liar paradox, Gupta and Belnap (1993) develop a theory that vindicates the legitimacy and usefulness of circular definitions of exactly this kind. The circular definitions can be thought of as revision rules, rules for revising initial guesses about the correct extensions of the defined predicates. A defensible interpretation of the predicates consists of a fixed point in this revision process. In the case of sensory qualities, the definitions of all of the quality/phenomenon pairs take exactly the same form. Different qualities correspond to different fixed points in the Gupta/Belnap revision procedure. Thus, the circularity in conception is not semantically vicious. Materialist responses to the Mary problem have all involved introducing some sort of distinction between real, coarse-grained facts and merely conceptual, fine-grained facts (Tye (1995)), or between facts and "phenomenal information" (Lycan (1996)). Mary is supposed to know all the real facts about redness and red experience, but lack some sort of phenomenal or conceptual information. Lycan uses the analogy of de se knowledge: I can know that Rob Koons is overpaid without knowing that I am overpaid, if I have forgotten my identity. I possess all the facts, but I am still missing some potentially useful information, namely, that I am Rob Koons. It is hard to see how this sort of materialist response can possibly succeed. If Mary is lacking in conceptual or phenomenal information, then there is some fact of which she is ignorant, namely, the fact that a certain phenomenal concept (namely, red) applies to certain actual experiences and is associated with a certain English word. 
She knows these experiences under their neurophysiological descriptions, and she knows that there is a phenomenal concept corresponding to the word 'red', but she cannot know de re of the phenomenal concept red, that it applies to these experiences, since she can have no de re attitudes involving the phenomenal concept at all. The central issue here concerns the nature of facts. I would propose that we think of facts as the combination of a situation-token and a situation-type, where the token is actual and supports the type. To know a fact, one must be able to represent it. To represent a fact, one must be able to represent both the token (via, perhaps, some relation between the token and one's representation-token) and the type. Mary cannot have mental states that directly represent the type red. She can represent this type indirectly, by means of a representation equivalent in content to the definite description "the surface-property designated in English by the word 'red'" or to similar descriptions. But this does not give her access to the fact that the chair is red, only to the (distinct) fact that the chair has the surface-property designated in English by the word 'red'.
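
The appeal to revision rules earlier in this section can be made concrete with a small sketch. It is not the Gupta/Belnap theory itself, which works with revision sequences over arbitrary (possibly infinite) domains; it is only a finite toy in which a hypothesis about the extensions of a quality and its paired appearance is revised once, and the hypotheses left unchanged by revision are collected. The domain (`s1`, `s2`, `m1`, `m2`) and the functional links between its members are invented for the illustration.

```python
from itertools import chain, combinations

# Hypothetical toy domain: two surface properties and two sensory states.
SURFACES = ("s1", "s2")
STATES = ("m1", "m2")

# Assumed functional links, for illustration only.
PRODUCES = {"s1": "m1", "s2": "m2"}   # extrinsic function of each surface property
INDICATES = {"m1": "s1", "m2": "s2"}  # intrinsic function of each sensory state


def revise(quality, appearance):
    """Apply the circular definitions once to a hypothesis (quality, appearance).

    A surface property counts as having the quality if the state it functions
    to produce is in the hypothesized appearance extension; a state counts as
    the appearance if the property it functions to indicate is in the
    hypothesized quality extension.
    """
    new_quality = frozenset(s for s in SURFACES if PRODUCES[s] in appearance)
    new_appearance = frozenset(m for m in STATES if INDICATES[m] in quality)
    return new_quality, new_appearance


def subsets(items):
    """All subsets of `items`, as frozensets."""
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def fixed_points():
    """Hypotheses that the revision rule leaves unchanged."""
    return [(q, a) for q in subsets(SURFACES) for a in subsets(STATES)
            if revise(q, a) == (q, a)]


if __name__ == "__main__":
    for quality, appearance in fixed_points():
        print(sorted(quality), sorted(appearance))
```

In this toy, the hypothesis pairing `s1` with `m1` and the hypothesis pairing `s2` with `m2` are both stable under revision, which mirrors the claim that definitions of the same form can settle on different fixed points for different qualities. A mismatched hypothesis, such as `s1` with `m2`, is not stable.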


Killer Yellow and Magnetic Green

A killer yellow object is one which is in fact yellow in color, but which cannot be perceived by human beings because it emits such powerful and lethal radiation



that any human would be vaporized long before he or she got close enough to the object to observe its color. The possibility of killer yellow objects, whose postulation is attributed to Saul Kripke, raises serious problems for accounts that analyze colors as dispositions to cause perceptions of certain kinds in normal observers under normal circumstances. Killer yellow objects are yellow even though they have no disposition to cause yellow sensations in normal humans under normal circumstances; they have instead the disposition to vaporize such normal observers under such circumstances. Killer yellow objects are yellow because they have some physical property that has (when realized by other, less lethal objects) the teleofunction of interacting with the human visual system in a certain way to produce yellow sensations. Yellowness is inherited by an object because of its possession of a higher-order type, not because the object itself would or even could fulfill the corresponding function. A magnetic green object is a colorless object that causes green sensations in normal observers under normal circumstances, but does so by means of a powerful magnetic field that induces visual hallucinations. To make the case especially devious, suppose that the magnetic field always produces a greenish hallucination in exactly the part of the visual field of the subject that would be occupied by the magnetic green object. Magnetic green objects also pose a challenge to dispositional accounts of color, since they are not green, despite their possessing the right disposition. Magnetic green objects are not green because of the deviancy of the causal chain leading to the green sensation. They subvert and do not fulfill the human capacity for color sensation.
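
The contrast between the two accounts in these cases can be laid out in a small sketch. It is a toy model under stated assumptions, not a formalization of the theory: the property-type labels (`P_yellow`, `P_magnetic`) and the table assigning teleofunctions to types are hypothetical. The dispositional account reads color off the sensation an object would actually cause; the teleofunctional account reads it off the function of the physical property type the object instantiates.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical table: which sensation each physical property type has the
# teleofunction of producing. Types without such a function are absent.
TYPE_FUNCTION = {"P_yellow": "yellow"}


@dataclass(frozen=True)
class Surface:
    name: str
    property_type: str                 # assumed label for the underlying physical type
    sensation_caused: Optional[str]    # what normal observers would actually experience


def dispositional_color(surface: Surface) -> Optional[str]:
    """Color according to a simple dispositional account."""
    return surface.sensation_caused


def teleofunctional_color(surface: Surface) -> Optional[str]:
    """Color according to the function of the object's property type."""
    return TYPE_FUNCTION.get(surface.property_type)


LEMON = Surface("lemon", "P_yellow", "yellow")
KILLER_YELLOW = Surface("killer yellow", "P_yellow", None)       # observers are vaporized
MAGNETIC_GREEN = Surface("magnetic green", "P_magnetic", "green")  # deviant causal pathway

if __name__ == "__main__":
    for s in (LEMON, KILLER_YELLOW, MAGNETIC_GREEN):
        print(s.name, dispositional_color(s), teleofunctional_color(s))
```

The two functions agree on the ordinary case and come apart on exactly the two problem cases: the killer yellow object inherits yellowness from its type although it causes no yellow sensation, and the magnetic green object causes green sensations although its type has no such function.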



Is it possible that an organism could be physically identical to a human being and yet be a zombie, experiencing no qualia whatsoever? The answer to this question depends on what is involved in being physically identical. Suppose that there is a possible world w in which all of the physical properties of the actual world exist, but in which the modal and stochastic facts, and the associated causal laws, are radically different. In world w, there could exist a physical duplicate of me whose physical states are entirely lacking the higher-order, teleofunctional properties that my actual physical states possess. It might be that some such physical duplicate would be entirely lacking in teleofunctional properties. If so, the duplicate would lack all mental properties, including those of experiencing qualia. Of course, the behavioral dispositions of the duplicate would be radically different, as well. It is this sort of conceivability of physical-duplicate zombies that explains why there is an explanatory gap between the properties of physics and those of consciousness. It is not so clear, however, that there is any such gap between teleofunctional properties and the properties of consciousness.




The Correlation of Qualia and Physiology

Qualia, such as being appeared to greenly, are regularly associated with certain physiological states. What accounts for this reliable correlation? If some property-identity thesis were true, an answer would be readily available: the qualia properties are identical to the neurophysiological properties in question. However, the example of Mary the color scientist demonstrates that no such identity thesis is correct. This means we have a substantial problem of explaining the regular association of pairs of quite different properties. For dualists, this correlation must be accepted either as a brute fact, or as the product of divine fiat (see Adams (1987), Swinburne (1979)). For materialists, only the brute fact option is available. However, on the teleological account of qualia, a non-trivial explanation (without resort to divine intervention) is readily available. The state of being-appeared-to-greenly is supposed to carry robustly the information that a green object is present and visible. An investigation of human neurophysiology can explain how it is that the presence of green objects, in normal circumstances and with respect to normal human observers, causes the particular neurophysiological state regularly associated with the appearance of greenness. The availability of this explanation depends on two facts: (1) that the very essence of the property of being-appeared-to-greenly includes a certain content, namely, the visible presence of a green object, and (2) that the corresponding external property (greenness) really exists. If we acknowledge these two facts, then there is no explanatory gap between the neurophysiological property and the qualia, despite their distinctness.


Free Will

The will is a faculty whose function is to make apt choices, choices that further the agent's good. A "free" will is a will that is not disabled at the point of action, a will that really does select one from a list of many options. A perfectly unfree will is a will that, at the point of action, is disabled. In cases of unfreedom, no choice is made: the agent was capable of considering and acting out only one course of action. Freedom of will is limited when the will is unable to adopt courses of action that it should (ideally) be able to adopt. Being unable to think the unthinkable is not a case of unfreedom. If I am unable to consider the possibility of abandoning my child, this is not a case of unfreedom, since this is not the sort of option that my will is supposed to be able to consider (i.e., that it must consider if all of my primary teleofunctions are to be fulfilled). One who is able to consider unthinkable options has a will that is not thereby freer, but merely more licentious. As Aristotle noted, freedom or voluntariness is neither sufficient nor necessary for responsibility. One can be responsible for an unfree act if one was responsible for causing the unfreedom. One can fail to be responsible for a free



act if one was, through no fault of one's own, ignorant of the nature or consequences of the act freely taken. In part I, I gave several reasons for rejecting determinism in the strict sense, according to which every state has a strictly necessitating cause. In the end, determinism in this sense is incompatible, not only with free will, but with the central features of causation itself. However, there is a more modest version of determinism that seems quite coherent: one that combines an indeterministic conception of causation with the theses of the soundness and completeness of causal explanation. Even though no cause strictly necessitates its effects, it might still be the case that for every wholly contingent state there is an actual causal explanation of that state that is adequate and undefeated. This is the hypothesis that I called the Completeness of Explanation in section 5.4. If we combine the necessary completeness of explanation with the indeterministic model of causation, we end up with a mitigated form of determinism. Causal explanation was defined in part I (section 5.4.2) as a prior state that is both defeasibly sufficient for the explanandum and actually undefeated. The explanans need not necessitate the explanandum, since there could have existed defeaters of the explanans. However, if we add to the explanans the negative information that no such defeaters exist, the resulting situation would, given the necessity of explanatory completeness and the existence of some effect of the explanans, necessitate the existence of the explanandum. Thus, there would be only two differences between strict and mitigated determinism: (1) Strict determinism implies that the continuation of the course of the world is itself necessary, whereas mitigated determinism implies only that if the course of the world continues, it must continue in a unique way. 
(2) Strict determinism implies that effects are necessitated by their causes, whereas mitigated determinism implies that the effects are necessitated by the sum of their causes plus a background situation rich enough to exclude the existence of any possible defeaters of the cause. These differences do not seem to be great enough to secure the kind of openness of the future, the real possibility of alternative courses of action, that genuine free will seems to require. One possible alternative would be to suppose that the hypothesis of the completeness of explanation is only contingently true, but then it is hard to see how we could have any basis for confidence in its truth in the actual world. However, there is a third alternative. It would be quite reasonable to take the completeness of explanation in any given case to be a truth with a very high objective probability, perhaps infinitely close to 1. If we embraced such a modest determinism, which we might call default determinism, we could still confidently expect to find causal explanations of every fact, and still be able to affirm that many things could have gone otherwise than they did. Default determinism is thus fully compatible with the real possibility that things could have gone otherwise, even without any change in causally antecedent facts. This real possibility of alternative courses is important, not



only because without it we feel a reluctance to assign moral responsibility, but also for a more fundamental reason. The veridicality of our deliberations entails that the various options we represent to ourselves are really possible in our actual circumstances. For example, in causal decision theory (Gibbard and Harper (1981), Lewis (1981)), we evaluate a decision situation in three steps: first, discover which actions are possible in the present circumstances; second, discover the probable consequences of each possible action; and finally, assign a utility value to each of the possible outcomes, evaluating a possible action by means of the probability-weighted average of the utility of its possible results. If necessitarian determinism were true, then all representations of the possibility of options other than the one actually taken would be illusory. This would mean that all deliberation was shot through with error, making the evaluation of decisions as optimal or sub-optimal impossible. Since all deliberation aims at action that is objectively optimal in the circumstances, necessitarian determinism would entail that all deliberation is aiming at a will-o'-the-wisp. In addition, without real possibilities that are alternative to the actual future, it would be impossible for our representations of alternatives to have the function of carrying robust information about the existence of such alternative futures. This would mean that we could not give a realist semantics for our apparent beliefs about what is still possible. Determinists might object that all we need is the doxastic or epistemic possibility of alternative futures: all that is needed is that the deliberator be ignorant of which of several alternative futures is already determined to be realized. However, this seriously misrepresents the nature of deliberation. In deliberation, we are not interested in finding out only what alternatives are possible, for all we presently know. 
We are interested in discovering which alternatives are really possible. We actively seek information that would improve and correct our current beliefs about the range of available options. The determinist cannot explain our interest in gaining new modal information. In addition, ignorance about the future course of things is not a necessary condition for deliberation. Suppose that I have already made up my mind to take the train tomorrow, confident that this is the optimal choice. You challenge my reasoning, arguing that I have made a mistake, and that the truly optimal choice for me is to take an airplane instead. I can refute you by engaging in a process of re-deliberation, and I can do so without in the slightest degree reducing my confidence that I will in fact take the train. What is important is not that there is any practical doubt in my mind about whether I will take the train, but that it is a genuine ontic possibility that I do otherwise. This I can grant, while admitting only an infinitesimal degree of doubt about the actual course of action I shall take. Suppose the determinist concedes that deliberation requires the acknowledgment of the genuine possibility of several alternative futures but continues to insist that the laws of nature and the history of the world so far predetermine a unique course. He could do so by postulating that it is a contingent matter whether the laws of nature will continue to hold. This defense of determinism would have the bizarre consequence that I would have to take into account,



in my deliberations, possible futures in which the laws of nature are violated. Surely, once I know something is a law of nature, it would be irrational for me to consider its violation a real possibility in the future.
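
The three-step evaluation borrowed from causal decision theory earlier in this section can be sketched in a few lines. The sketch simply takes the relevant probabilities as given; it does not model how causal decision theory derives them. The option names echo the train and airplane example, but the probabilities and utilities are hypothetical numbers chosen only to make the arithmetic visible.

```python
def expected_utility(outcomes):
    """Probability-weighted average utility of one action.

    `outcomes` is a list of (probability, utility) pairs, one pair per
    possible result of the action.
    """
    return sum(p * u for p, u in outcomes)


def best_action(options):
    """Return the name of the action with the highest expected utility."""
    return max(options, key=lambda name: expected_utility(options[name]))


# Step 1: the actions taken to be really possible (hypothetical).
# Step 2: the probable consequences of each action, as probabilities.
# Step 3: a utility for each possible outcome.
OPTIONS = {
    "train": [(0.9, 10.0), (0.1, 2.0)],
    "plane": [(0.7, 12.0), (0.3, 0.0)],
}

if __name__ == "__main__":
    for name, outcomes in OPTIONS.items():
        print(name, expected_utility(outcomes))
    print("chosen:", best_action(OPTIONS))
```

The point made in the text shows up in the first step: an action enters the comparison only if it is listed as possible, so if no alternative to the actual action were really possible, there would be nothing for the comparison to range over.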


Teleological Reliabilism
17.1 Reliabilism: The Reference Class Problem

Gettier's seminal article on justified true belief (Gettier (1963)) forced a rethinking of the conditions of knowledge.1 So-called external factors and analyses came to the fore. The distinction between knowledge and mere true opinion turns on the way in which the belief was formed and how it is sustained. If the formation and maintenance of the belief reliably tracks the truth, then the belief constitutes knowledge. Knowledge is reliably formed and maintained belief (Goldman (1979)). Reliability is a matter of probability. A way of forming beliefs is reliable if the objective probability of a belief's being true, given that it is a product of that way, is very high. This means that everything turns on how we specify the various 'ways' of forming beliefs. Without any principled answer to this question, issues of reliability can be settled any way we please by simply jury-rigging the definition of the 'way' in which the belief was formed. Suppose I believe that it will rain tomorrow because my friend said so. Which of the following is the 'way' in which this belief was formed?

By testimony
By testimony from a friend
By testimony from my friend
By testimony from my friend or the Encyclopedia Britannica
By testimony that happens in this case to be true

1 Gettier demonstrated that justified true belief is not sufficient for knowledge. He gives the example of Smith, who is justified in believing Jones owns a Ford. Smith uses the laws of logic to derive the belief Jones owns a Ford or Brown is in Barcelona, having no idea where Brown in fact is. As it happens, Smith's belief about Jones is false, since Jones recently sold his Ford but lied to Smith about it, but Smith's belief in the disjunction is true, since, by coincidence, Brown happens to be in Barcelona. Smith's belief in the disjunction is true (by virtue of the truth of the second disjunct) and justified (by virtue of his justified belief in the first disjunct), but this justified true belief does not constitute knowledge. What is missing is the right sort of connection between the truth-maker of the disjunction (Brown's location) and Smith's belief-state.

One could get very different answers to the question of probability, from nearly 0 to 1, depending on one's choice of the reference class. Is there any principled way to choose a (or the) correct class?

Teleology to the rescue! As Alvin Plantinga (1993) has convincingly argued, what makes for knowledge is not merely reliability, but the fulfillment of the function of reliability. Every belief is formed by a combination of a number of neural states with the intrinsic function of carrying reliable information, and a number of environmental factors with the extrinsic function of conveying reliable information to us. When all of these functions are fulfilled, the resulting belief is a state of knowledge. When malfunction occurs, the belief is a mere opinion, true if it happens to coincide in content with what would have been believed had there been no malfunction, false otherwise.

To return to the case of testimony, malfunction could occur at a number of points. There might already have been malfunction in my friend's belief formation. My friend might be lying, which would be a malfunction in the extrinsic function of my friend's speech as part of my environment. I might have misunderstood what my friend said. Or, there might be some other malfunction in my processing of my friend's statement. For example, I might wrongly believe both that he's lying and that he's always wrong about the weather. A great many things can be involved in the 'way' in which I form my belief, but not just anything. A factor can be introduced as relevant to the classification of my belief as knowledge only if there is some teleological connection between that factor and my belief.
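The reference class problem described in this section can be made concrete with a small numerical sketch. Everything in it is an assumption introduced for illustration: the track record of past testimony is invented, and the names `RECORD`, `REFERENCE_CLASSES`, and `reliability` are hypothetical. It shows only that one and the same belief receives different reliability scores depending on which 'way' of forming it is chosen as the reference class.

```python
from fractions import Fraction

# Hypothetical track record of past testimony: (source, was_the_claim_true).
# These entries are invented purely for illustration.
RECORD = [
    ("my friend", True), ("my friend", True), ("my friend", False),
    ("a stranger", False), ("a stranger", False), ("a stranger", False),
    ("encyclopedia", True), ("encyclopedia", True), ("encyclopedia", True),
]

# Each candidate 'way' of forming the belief picks out a reference class.
REFERENCE_CLASSES = {
    "testimony": lambda source, true: True,
    "testimony from my friend": lambda source, true: source == "my friend",
    "testimony from my friend or the encyclopedia":
        lambda source, true: source in ("my friend", "encyclopedia"),
    "testimony that happens to be true": lambda source, true: true,
}

def reliability(in_class):
    """Fraction of true claims among the recorded cases that fall in the class."""
    outcomes = [true for source, true in RECORD if in_class(source, true)]
    return Fraction(sum(outcomes), len(outcomes))

RESULTS = {name: reliability(test) for name, test in REFERENCE_CLASSES.items()}

for name, value in RESULTS.items():
    print(f"{name}: {value}")
```

On this invented record the scores range from 5/9 for testimony in general up to 1 for testimony that happens to be true, which is the kind of spread the text describes. Nothing in the arithmetic itself selects one class as the correct one.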


Grue, Bleen, and the New Riddle of Induction

As Plantinga has argued, a teleological epistemology has a fairly simple answer to the Humean puzzle about induction. Induction is reasonable because induction accords with the teleofunction of certain belief-forming processes in the human mind. Reason is not to be identified with deductive logic alone: reason is itself an inescapably teleological concept. We think reasonably when we think in accordance with the proper, intrinsic functions of our mind. No further justification of reason is needed or possible. Nonetheless, we are still left with the substantive task of characterizing the inductive functions of the mind. Nelson Goodman's new riddle of induction shows us that the definition of induction is no trivial task. We cannot simply say that induction consists in inferring that unobserved tokens will resemble



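observed ones. We must say something substantive about what sorts of resemblances are, and what sorts are not, reasonable to project in induction. For example, if we define the property of "grue" as follows:

$$\mathrm{Grue}(x) \leftrightarrow \bigl(\mathrm{Green}(x) \wedge \mathrm{Obs}(x)\bigr) \vee \bigl(\mathrm{Blue}(x) \wedge \neg\mathrm{Obs}(x)\bigr)$$
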
where "Obs(x)" means that object x will have been observed at least once before the year 2001, then we will find that all heretofore-observed green things have also been grue. In particular, all emeralds so far observed have been grue. If we assume the future will be like the past, should we predict that emeralds first discovered after the year 2000 will be green or grue? For emeralds discovered after 2000, these two predictions are incompatible, since an emerald that is grue and unobserved until 2001 must, according to the definition, be blue. There is a simple answer that teleological epistemology can give to this question as well (in fact, Plantinga (1993, pp. 133-136) offers exactly this answer). We can say that for a mind to use a property like grue in induction is to malfunction. However, in this case, I am not content with stopping with this simple answer, since we would like to know what exactly constitutes the malfunction in the case of properties like grue. The correct diagnosis depends on the fact that the property of grue is disjunctive in a way that the property green is not. If we grant this assumption for a moment, the problem with the grue inference is that we are using a disjunctive property in our projection, despite the fact that all of the observed instances fall under only one of the two disjuncts. In the absence of any knowledge linking the two disjuncts, this is an unreliable procedure, and hence one that we can imagine that natural selection has disfavored. What if we drop the troublesome disjunct? Why is it unreasonable to infer that all the emeralds in the world will have been observed before the year 2001, given that all observed emeralds have been so observed? In this case, it is the conjunctive nature of the property that is the source of the problem, since one of the conjuncts is illegitimately egocentric and time-bound.
Once again, it is easy to see how allowing such properties to occur in inductive procedures would lead to unreliable results. However, is it so clear that grue is a disjunctive property, in fact, a disjunction of two conjunctive properties? Doesn't this involve a kind of category mistake? Isn't it predicates or phrases, and not properties, that can be disjunctive or conjunctive? If we concede that this is so, we are immediately in trouble, since we can imagine a language in which 'grue' is the primitive and 'green' is defined in terms of 'grue' and 'bleen'. There is, however, a transcendental argument that demonstrates that the kind of nominalism, like Goodman's, that denies the distinction between disjunctive and simple properties is incoherent. Every proposed solution to the grue puzzle makes a covert appeal to a Platonic distinction between simple and complex properties. If we distinguish between green and grue on phenomenological or epistemological or historical-cultural grounds, we ignore the fact that the



categories of phenomenology, epistemology, and history can also be subjected to grue-like transformations. For example, suppose we try to distinguish between green and grue on the grounds that there is a phenomenological quality that corresponds to the first and not the second. This ignores the fact that there is a phenomenological quality which corresponds to grue, the property of 'being appeared to gruely'. One is appeared to gruely if and only if one is either being appeared to greenly not as a result of observing an emerald first observed (if at all) after 2000, or one is being appeared to bluely as a result of observing an emerald first observed after 2000. If one objects that being appeared to gruely is not a genuine phenomenological property, because, perhaps, it is not introspectible, I can simply respond by asking, Upon what basis do you say that it is being appeared to greenly, and not being appeared to gruely, which we are able to introspect? Every introspective act in response to an event in which the one property was instantiated was also an act in response to an event in which the other was instantiated. Why does the epistemologist identify the content of these introspections with the quality of green instead of grue? It must have to do with the fact that being appeared to greenly is an intrinsically simpler property than being appeared to gruely. Even Goodman's solution in terms of entrenchment is vulnerable to this attack. On what basis can we say that it is green rather than grue which has been entrenched by our past practice? Since the two properties have been so far coextensive, it would be equally charitable to interpret the established English word 'green' as signifying grue or as signifying green. In order to abort this infinite regress, one must appeal at some point to a difference in intrinsic simplicity between the two properties.
This solution does not depend on what we might call "strong" Platonism or Aristotelian essentialism: the view that the world contains ready-made, pre-cut categories to which we must simply attach labels. Instead, what is necessary is that there be a mind-independent abstraction space, in relation to which we can distinguish those possible categories that are convex and those that are not. The category 'green' corresponds to a single convex region in this abstraction space (a single "regularity" to which agents might be "attuned"), whereas 'grue' corresponds to two disparate regions.2
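The structure of the grue puzzle can be checked mechanically. The sketch below assumes the reading of grue used in this section: an object is grue if it is green and first observed before 2001, or blue and not so observed. The emerald data and the names `is_grue`, `OBSERVED`, and `predicted_colours` are hypothetical, introduced only for illustration.

```python
CUTOFF_YEAR = 2001  # Obs(x): x is first observed before this year

def is_grue(colour, first_observed):
    """Grue(x): (green and Obs(x)) or (blue and not Obs(x)).

    first_observed is the year of first observation, or None if never observed.
    """
    observed = first_observed is not None and first_observed < CUTOFF_YEAR
    return (colour == "green" and observed) or (colour == "blue" and not observed)

# Hypothetical emeralds observed so far: (colour, year first observed).
OBSERVED = [("green", 1950), ("green", 1987), ("green", 2000)]

# Every observed emerald confirms both generalisations equally well.
ALL_GREEN = all(colour == "green" for colour, _ in OBSERVED)
ALL_GRUE = all(is_grue(colour, year) for colour, year in OBSERVED)

def predicted_colours(hypothesis, first_observed):
    """Colours a new emerald may have if the projected hypothesis is true."""
    candidates = ("green", "blue")
    if hypothesis == "green":
        return [c for c in candidates if c == "green"]
    return [c for c in candidates if is_grue(c, first_observed)]

print(ALL_GREEN, ALL_GRUE)
print(predicted_colours("green", 2005), predicted_colours("grue", 2005))
```

Both hypotheses fit the observed sample, yet for an emerald first observed in 2005 projecting green predicts a green stone and projecting grue predicts a blue one. The code does not settle which projection is reasonable; it only displays that the evidence is symmetric while the predictions are not.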


Curve-Fitting: The Problem of Mathematical Simplicity

It has been known from antiquity that the simplest explanation is the best. This is often encapsulated as Occam's razor: do not multiply entities needlessly. In appendix B, I will argue that the crucial thing that we must minimize in our explanations is the extension of the causal priority/relevance relation. It is not that we should suppose that only those particulars exist that are needed in our causal explanations, but rather that only those particulars are relevant to a particular explanandum that are needed in explaining it. This minimization of causal relevance extends to the mathematical domain as well. We should suppose that a mathematical structure is causally relevant to a particular phenomenon only if it is needed in constructing an adequate explanation of the phenomenon. This means that we should not use integers if the natural numbers will do, or real numbers if the integers will do. We should not use multiplication if addition alone is needed, or exponentiation if addition and multiplication are adequate. It is not that we are supposing that the reals do not exist if they are not needed in formulating a particular explanation. Rather, it is that we should suppose that the reals are causally irrelevant to the phenomenon unless they are needed. Since in my view, mathematical objects can be causally connected to physical events through the medium of modality, the very same Occamist considerations that lead us to prefer the physically simplest theory should lead us to prefer the mathematically simplest theory. If a linear relationship is adequate to explain the data points, then we should infer that the causal structure of the world in this domain is linear in character, despite the fact that there are infinitely many curves passing through those same data points.

2 See also section 5.8.1 for a further discussion of the causal irrelevance of merely disjunctive properties.
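The remark that infinitely many curves pass through the same data points can be illustrated with a minimal sketch. The data points and the names `linear`, `rival`, and `RIVALS` are hypothetical. The construction adds to a line any multiple of a polynomial that vanishes at every observed x-value, which is one standard way, chosen here for convenience, of producing rival curves that fit the data exactly.

```python
# Hypothetical data points, all lying on the line y = 2x + 1.
DATA = [(0, 1), (1, 3), (2, 5)]

def linear(x):
    """The simplest candidate: a straight line through the data."""
    return 2 * x + 1

def rival(c):
    """Return a cubic that agrees with `linear` at x = 0, 1, 2 for any c.

    The added term c*x*(x-1)*(x-2) is zero at each observed x-value.
    """
    def curve(x):
        return linear(x) + c * x * (x - 1) * (x - 2)
    return curve

# Three of the infinitely many rivals, one per choice of coefficient.
RIVALS = [rival(c) for c in (1, -3, 10)]
CANDIDATES = [linear] + RIVALS

# Every candidate fits the observed data exactly...
FITS = [all(f(x) == y for x, y in DATA) for f in CANDIDATES]
# ...but they disagree about the unobserved point x = 3.
PREDICTIONS_AT_3 = [f(3) for f in CANDIDATES]

print(FITS)
print(PREDICTIONS_AT_3)
```

Fit to the data cannot separate the line from its rivals, since all four candidates match every observation. Only a further criterion, such as the preference for the mathematically simplest structure defended above, picks out the linear prediction.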


The Reliability of Simplicity as a Criterion of Truth

Inference to the best explanation involves an application of Occam's razor, since the best explanation is the simplest explanation, the one that assumes the smallest set of causally relevant factors. Presumably, Occam's razor is a principle governing the proper functioning of the human mind. However, even if this is true, we still face the difficulty of meeting Hume's challenge: how do our super-empirical concepts acquire the power to designate what they do? I must meet this challenge in essentially the same way as that in which I explained the representative character of perceptual ideas. A non-perceptual or theoretical mental state $\psi$ carries the representational content that a token is of type $\phi$ just in case it is the teleofunction of state $\psi$ to carry the information that a token is of type $\phi$. In order for the data to carry the information that the simple explanation is veridical, it must be the case that the objective probability of the simple explanation, conditional on the data, is very high (in fact, infinitely close to one). We can ask, under what conditions is this the case? Two things must be true:

1. There must be a significant (finite) probability that any phenomenon we encounter is in fact caused by a relatively simple ensemble of relevant factors.



2. The probability that a set of data that is in fact caused by an extremely large and complex causal mechanism should be amenable to a simple would-be explanation must be extremely low (infinitesimal).

In any world that is only finitely complex, there will be a quite large set of phenomena that are caused by a small number of factors. However, what is crucial is that we should encounter a significant number of such phenomena at the normal scale of human/environment interaction. Since human beings are themselves composed of a large number of complex parts, relatively little of what we encounter is actually simple in the absolute sense. There is a qualified sense of simplicity, however, that does characterize many commonly encountered phenomena. There are many systems that, although they are composed of a very large number of parts, can be analyzed as composed of a few intermediate-scale aggregates, each of which can be treated as causal units (at least, with approximate success and over a fairly wide range of conditions). Thus, the possibility of theoretical representations (and, therefore, the possibility of scientific knowledge) depends on the contingent fact that the world we inhabit is uniformly inhabited by systems characterized by such intermediate-scale simplicity. This contingent uniformity must itself have an ultimate causal explanation, and it is this causal foundation that undergirds and informs human cognition.

Condition (2) consists in the requirement that the probability of pseudo-simplicity is infinitesimal. A phenomenon instantiates pseudo-simplicity when it is caused by a complex set of factors, there is no intermediate scale at which these factors can, even with approximate success, be aggregated into a small number of units, and yet the phenomenon is amenable to a simple putative explanation.
If the phenomenon consists of an infinite collection of data, then the probability of pseudo-simplicity drops to an infinitesimal level, unless many causal factors involved are themselves somehow coordinated so as to mimic the simpler mechanism. The absence of such a mimicry mechanism (e.g., Descartes's evil genius) is an important precondition, not only of our theoretical knowledge, but also of theoretical cognition of any kind.
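One way to see why the two conditions together yield a conditional probability close to one is to read them as inputs to Bayes' theorem. The following is a sketch under assumptions that go beyond the text: $S$ is the hypothesis that the phenomenon has a simple cause, $D$ is the event that the data are amenable to a simple explanation, $p = P(S)$ is the finite probability of condition (1), $\varepsilon = P(D \mid \neg S)$ is the probability of pseudo-simplicity from condition (2), and $q = P(D \mid S)$ is assumed to be positive and not small.

```latex
\begin{align*}
P(S \mid D)
  &= \frac{P(D \mid S)\,P(S)}{P(D \mid S)\,P(S) + P(D \mid \neg S)\,P(\neg S)}
   = \frac{q\,p}{q\,p + \varepsilon\,(1 - p)}, \\[4pt]
1 - P(S \mid D)
  &= \frac{\varepsilon\,(1 - p)}{q\,p + \varepsilon\,(1 - p)}
   \;\le\; \frac{\varepsilon\,(1 - p)}{q\,p}.
\end{align*}
```

With $p$ and $q$ fixed and positive, the shortfall from one is bounded by a constant multiple of $\varepsilon$, so an infinitesimal probability of pseudo-simplicity gives a posterior infinitely close to one. If instead $p$ were itself infinitesimal, the bound would fail, which is why condition (1) is needed alongside condition (2).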


The Incompatibility of Materialism and Scientific Realism

Whenever philosophers bother to offer a defense for philosophical materialism, they typically appeal to the authority of natural science. Science is supposed to provide us with a picture of the world so much more reliable and well supported than that provided by any non-scientific source of information that we are entitled, perhaps even obliged, to withhold belief in anything that is not an intrinsic part of our best scientific picture of the world. This scientism is taken to support materialism, since, at present, our best scientific picture of the world is an essentially materialistic one, with no reference to causal agencies other than those that can be located within space and time.



This defense of materialism or "naturalism" presupposes a version of scientific realism: unless science provides us with objective truth about reality, it has no authority to dictate to us the form that our philosophical ontology and metaphysics must take. Science construed as a mere instrument for manipulating experience, or merely as an autonomous construction of our society, without reference to our reality, tells us nothing about what kinds of things really exist and act. In this section, I will argue, somewhat paradoxically, that scientific realism can provide no support to philosophical materialism. In fact, the situation is precisely the reverse: materialism and scientific realism are incompatible. Specifically, I will argue that (in the presence of certain well-established facts about scientific practice) the following three theses are mutually inconsistent:

1. Scientific realism
2. Materialism (ontological naturalism, the thesis that the world of space and time is causally closed)
3. Representational naturalism (the thesis that there exists a correct naturalistic account of knowledge and intentionality)

By scientific realism, I intend a thesis that includes both a semantic and an epistemological component. Roughly speaking, scientific realism is the conjunction of the following two claims:

1. Our scientific theories and models are theories and models of the real world, including its laws, as they exist objectively, independent of our preferences and practices.
2. Scientific methods tend, in the long run, to increase our stock of real knowledge.

Ontological naturalism is the thesis that nothing can have any influence on events and conditions in space and time except other events and conditions in space and time. According to the ontological naturalist, there are no causal influences from things "outside" space: either there are no such things, or they have nothing to do with us and our world.
Representational naturalism is the proposition that human knowledge and intentionality are parts of nature, to be explained entirely in terms of scientifically understandable causal connections between brain states and the world. Intentionality is that feature of our thoughts and words that makes them about things, that gives them the capability of being true or false of the world. I take philosophical naturalism to be the conjunction of ontological and representational naturalism. The two theses are logically independent: it is possible to be an ontological naturalist without being a representational naturalist, and vice versa. For example, eliminativists like the Churchlands, Stich, and (possibly) Dennett are ontological naturalists who avoid being representational naturalists by failing to accept the reality of knowledge and intentionality. Conversely, a Platonist might accept that knowledge and intentionality are to be understood entirely in terms of causal relations, including, perhaps, causal connections to the Forms, without being an ontological naturalist. I will argue



that it is only the conjunction of the two naturalistic theses that is incompatible with scientific realism. Many philosophers believe that scientific realism gives us good reason to believe both ontological naturalism and representational naturalism. I will argue, paradoxically, that scientific realism entails that either ontological naturalism or representational naturalism (or both) is false. I will argue that nature is comprehensible scientifically only if nature is not a causally closed system, that is, only if nature is shaped by supernatural forces (forces beyond the scope of physical space and time). My argument requires two critical assumptions:

PS: A preference for simplicity (elegance, symmetries, invariances) is a pervasive feature of scientific practice.
ER: Reliability is an essential component of knowledge and intentionality, on any naturalistic account of these.


The Pervasiveness of Simplicity

Philosophers and historians of science have long recognized that quasi-aesthetic considerations, such as simplicity, symmetry, and elegance, have played a pervasive and indispensable role in theory choice. For instance, Copernicus's heliocentric model replaced the Ptolemaic system long before it had achieved a better fit with the data because of its far greater simplicity. Similarly, Newton's and Einstein's theories of gravitation won early acceptance due to their extraordinary degree of symmetry and elegance. In his recent book Dreams of a Final Theory, the physicist Steven Weinberg included a chapter entitled "Beautiful Theories," in which he detailed the indispensable role of simplicity in the recent history of physics. According to Weinberg, physicists use aesthetic qualities both as a way of suggesting theories and, even more importantly, as a sine qua non of viable theories. Weinberg argues that this developing sense of the aesthetics of nature has proved to be a reliable indicator of theoretical truth.

The physicist's sense of beauty is ... supposed to serve a purpose: it is supposed to help the physicist select ideas that help us explain nature. (Weinberg, 1993, p. 133)

... we demand a simplicity and rigidity in our principles before we are willing to take them seriously. (Weinberg, 1993, pp. 148-149)

For example, Weinberg points out that general relativity is attractive not just for its symmetry, but for the fact that the symmetry between different frames of reference requires the existence of gravitation. The symmetry built into Einstein's theory is so powerful and exacting that concrete physical consequences, such as the inverse square law of gravity, follow inexorably. Similarly, Weinberg explains that the electroweak theory is grounded in an internal symmetry between the roles of electrons and neutrinos.



The simplicity that physicists discover in nature plays a critical heuristic role in the discovery of new laws. As Weinberg explains, Weirdly, although the beauty of physical theories is embodied in rigid, mathematical structures based on simple underlying principles, the structures that have this sort of beauty tend to survive even when the underlying principles are found to be wrong.... We are led to beautiful structures by physical principles, but the beauty sometimes survives when the principles themselves do not. (Weinberg, 1993, pp. 151-152) Weinberg notes that the simplicity that plays this central role in theoretical physics is "not the mechanical sort that can be measured by counting equations or symbols" (Weinberg, 1993, p. 134). The recognition of this form of beauty requires an act of quasi-aesthetic judgment. As Weinberg observes, There is no logical formula that establishes a sharp dividing line between a beautiful explanatory theory and a mere list of data, but we know the difference when we see it. In claiming that an aesthetic form of simplicity plays a pervasive and indispensable role in scientific theory choice, I am not claiming that the aesthetic sense involved is innate or a priori. I am inclined to agree with Weinberg in thinking that "the universe acts as a random, inefficient, and in the long-run effective teaching machine" (Weinberg, 1993, p. 158). We have become attuned to the aesthetic deep structure of the universe by a long process of trial and error, a kind of natural selection of aesthetic judgments. As Weinberg puts it, Through countless false starts, we have gotten it beaten into us that nature is a certain way, and we have grown to look at that way that nature is as beautiful ... Evidently we have been changed by the universe acting as a teaching machine and imposing on us a sense of beauty with which our species was not born. Even mathematicians live in the real universe, and respond to its lessons. (Weinberg, 1993, pp. 
158-159) Nonetheless, even though we have no reason to think that the origin of our aesthetic attunement to the structure of the universe is mysteriously prior to experience, there remains the fact that experience has attuned us to something, and this something runs throughout the most fundamental laws of nature. Behind the blurrin' and buzzin' confusion of data, we have discovered a consistent aesthetic behind the various fundamental laws. As Weinberg concludes, It is when we study truly fundamental problems that we expect to find beautiful answers. We believe that, if we ask why the world is the way it is and then ask why that answer is the way it is, at the end of this chain of explanations we shall find a few simple principles


of compelling beauty. We think this in part because our historical experience teaches us that as we look beneath the surface of things, we find more and more beauty. Plato and the neo-Platonists taught that the beauty we see in nature is a reflection of the beauty of the ultimate, the nous. For us, too, the beauty of present theories is an anticipation, a premonition, of the beauty of the final theory. And, in any case, we would not accept any theory as final unless it were beautiful. (Weinberg, 1993, p. 165)

This capacity for "premonition" of the final theory is possible only because the fundamental principles of physics share a common bias toward a specific, learnable form of simplicity.


The Centrality of Reliability to Representational Naturalism

The representational naturalist holds that knowledge and intentionality are entirely natural phenomena, explicable in terms of causal relations between brain states and the represented conditions. In the case of knowledge, representational naturalism must make use of some form of reliability. The distinction between true belief and knowledge turns on epistemic norms of some kind. Unlike some Platonists, representational naturalists cannot locate the basis of such norms in any transcendent realm. Consequently, the sort of rightness that qualifies a belief as knowledge must consist in some relation between the actual processes by which the belief is formed and the state of the represented conditions. Since knowledge is a form of success, this relation must involve a form of reliability, an objective tendency for beliefs formed in similar ways to represent the world accurately. Thus, if representational naturalism is combined with epistemic realism about scientific theories, the conjunction of the two theses entails that our processes of scientific research and theory choice must reliably converge upon the truth.

A naturalistic account of intentionality must also employ some notion of reliability. The association between belief-states and their truth-conditions must, for the representational naturalist, be a matter of some sort of natural, causal relation between the two. This association must consist in some sort of regular correlation between the belief-state and its truth-condition under certain conditions (the 'normal' circumstances for the belief-state). This reliability may be only a conditional reliability: reliability under teleologically normal circumstances. This condition provides the basis for a distinction between knowledge and true belief: an act of knowledge that p is formed by processes that reliably track the fact that p in the actual circumstances, whereas a belief that p is formed by processes that would reliably track p in normal circumstances.
It is possible for our reliability to be lost. Conditions can change in such a way that teleologically normal circumstances are no longer possible. In such



cases, our beliefs about certain subjects may become totally unreliable. As Papineau observed,

It is the past predominance of true belief over false that is required ... [This] leaves it open that the statistical norm from now on might be falsity rather than truth. One obvious way in which this might come about is through a change in the environment. (Papineau, 1993, p. 558)

In addition, there may be specifiable conditions that occur with some regularity in which our belief-forming processes are unreliable.

This link is easily disrupted. Most obviously, there is the point that our natural inclinations to form beliefs will have been fostered by a limited range of environments, with the result that, if we move to new environments, those inclinations may tend systematically to give us false beliefs. To take a simple example, humans are notoriously inefficient at judging sizes underwater. (Papineau, 1993, p. 100)

Finally, the reliability involved may not involve a high degree of probability. The correlation of belief-type and represented condition does not have to be close to 1. As Millikan has observed, "it is conceivable that the devices that fix human beliefs fix true ones not on average, but just often enough" (Millikan, 1989a, p. 289). For example, skittish animals may form the belief that a predator is near on the basis of very slight evidence. This belief will be true only rarely, but it must have a better-than-chance probability of truth under normal circumstances, if it is to have a representational function at all. Thus, despite these qualifications, it remains the case that a circumscribed form of reliable association is essential to the naturalistic account of intentionality. The reliability is conditional, holding only under normal circumstances, and it may be minimal, involving a barely greater-than-chance correlation.
Nonetheless, the representational naturalist is committed to the existence of a real, objective association of the belief-state with its corresponding condition.


Proof of the Incompatibility

I claim that the triad of scientific realism (SR), representational naturalism (RN), and ontological naturalism (ON) is inconsistent, given the theses of the pervasiveness of the simplicity criterion in our scientific practices (PS) and the essentiality of reliability as a component of naturalistic accounts of knowledge and intentionality (ER). The argument for the inconsistency proceeds as follows.

1. SR, RN, and ER entail that scientific methods are reliable sources of truth about the world. As I have argued, a representational naturalist must attribute some form of reliability to our knowledge- and belief-forming practices. A scientific realist holds that scientific theories have objective truth-conditions, and that our scientific practices generate knowledge. Hence, the combination of scientific realism and representational naturalism entails the reliability of our scientific practices.

2. From PS, it follows that simplicity is a reliable indicator of the truth about natural laws. Since the criterion of simplicity as a sine qua non of viable theories is a pervasive feature of our scientific practices, thesis 1 entails that simplicity is a reliable indicator of the truth (at the very least, a better-than-chance indicator of the truth in normal circumstances).

3. Mere correlation between simplicity and the laws of nature is not good enough: reliability requires that there be some causal mechanism connecting simplicity and the actual laws of nature. Reliability means that the association between simplicity and truth cannot be coincidental. A regular, objective association must be grounded in some form of causal connection. Something must be causally responsible for the bias toward simplicity exhibited by the theoretically illuminated structure of nature.

4. Since the laws of nature pervade space and time, any such causal mechanism must exist outside spacetime. By definition, the laws and fundamental structure of nature pervade nature. Anything that causes these laws to be simple, anything that imposes a consistent aesthetic upon them, must be supernatural.

5. Consequently, ON is false. The existence of a supernatural cause of the simplicity of the laws of nature is obviously inconsistent with ontological naturalism. Hence, one cannot consistently embrace naturalism and scientific realism.
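The logical shape of the five steps can be checked in a proof assistant. The Lean 4 sketch below is only a propositional skeleton. The letters SR, RN, ON, PS, and ER come from the text, while `Rel`, `SimpRel`, `Mech`, and `Super` are hypothetical names introduced here for the intermediate conclusions (scientific methods are reliable, simplicity is a reliable indicator, a causal mechanism grounds that reliability, and that mechanism is supernatural). Each hypothesis `h1` to `h5` stands for one numbered step and is simply assumed.

```lean
-- A propositional skeleton of the argument. The hypotheses h1..h5 are the
-- five numbered steps taken as premises; nothing here argues for them.
theorem incompatibility
    (SR RN ON PS ER Rel SimpRel Mech Super : Prop)
    (h1 : SR → RN → ER → Rel)   -- step 1: scientific methods are reliable
    (h2 : Rel → PS → SimpRel)   -- step 2: simplicity reliably indicates truth
    (h3 : SimpRel → Mech)       -- step 3: reliability needs a causal mechanism
    (h4 : Mech → Super)         -- step 4: the mechanism lies outside spacetime
    (h5 : Super → ¬ON) :        -- step 5: so ontological naturalism fails
    SR → RN → PS → ER → ¬ON :=
  fun hSR hRN hPS hER => h5 (h4 (h3 (h2 (h1 hSR hRN hER) hPS)))
```

The check shows only that the conclusion follows once all five steps are granted, so the argument is valid in form. Whether each step is true, and steps 3 and 4 in particular, is a substantive question that no formalization of this kind can decide.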


Papineau and Millikan on Scientific Realism

David Papineau and Ruth Garrett Millikan are two thoroughgoing naturalists who have explicitly embraced scientific realism. If the preceding argument is correct, this inconsistency should show itself somehow in their analyses of science. This expectation is indeed fulfilled. For example, Papineau recognizes the importance of simplicity in guiding the choice of fundamental scientific theories. He also recognizes that his account of intentionality entails that a scientific realist must affirm the reliability of simplicity as a sign of the truth. Nonetheless, he fails to see the incompatibility of this conclusion with his ontological naturalism. Here is the relevant passage: It is plausible that at this level the inductive strategy used by physicists is to ignore any theories that lack a certain kind of physical simplicity. If this is right, then this inductive strategy, when applied to the question of the general constitution of the universe, will

inevitably lead to the conclusion that the universe is composed of constituents which display the relevant kind of physical simplicity. And then, once we have reached this conclusion, we can use it to explain why this inductive strategy is reliable. For if the constituents of the world are indeed characterized by the relevant kind of physical simplicity, then a methodology which uses observations to decide between alternatives with this kind of simplicity will for that reason be a reliable route to the truth. (Papineau, 1993, p. 166)


In other words, so long as we are convinced that the laws of nature just happen to be simple in the appropriate way, we are entitled to conclude that our simplicity-preferring methods were reliable guides to the truth. However, it seems clear that such a retrospective analysis would instead reveal that we succeeded by sheer dumb luck. By way of analogy, suppose that I falsely believed that a certain coin was two-headed. I therefore guess that all of the first six flips of the coin will turn out to be heads. In fact, the coin is a fair one, and, by coincidence, five of the first six flips did land heads. Would we say in this case that my assumption was a reliable guide to the truth about these coin flips? Should we say that its reliability was 5/6? To the contrary, we should say that my assumption led to very unreliable predictions, and the degree of success that I achieved was due to good luck, and nothing more. Analogously, if it is a mere coincidence that the laws of nature share a certain form of aesthetic beauty, then our reliance upon aesthetic criteria in theory choice is not in any sense reliable, not even minimally reliable, not even reliable in ideal circumstances. When we use the fact that we have discovered a form of "physical simplicity" in law A as a reason for preferring theories of law B, which have the same kind of simplicity, then our method is reliable only if there is some causal explanation of the repetition of this form of simplicity in nature. And this repetition necessitates a supernatural cause. Papineau recognizes that we do rely on such an assumption of the repetition of simplicity. The account depends on the existence of certain general features which characterize the true answers to questions of fundamental physical theory. Far from being knowable a priori, these features may well be counterintuitive to the scientifically untrained. (Papineau, 1993, p. 166)
Through scientific experience, we are "trained" to recognize the simplicity shared by the fundamental laws, and we use this knowledge to anticipate the form of unknown laws. This projection of experience from one law to the next is reliable only if there is some common cause of the observed simplicity. Similarly, Millikan believes that nature has trained into us (by trial-and-error learning) certain "principles of generalization and discrimination" (Millikan, 1989a, p. 292) that provided us with a solution to the problem of theoretical



knowledge that was "elegant, supremely general, and powerful, indeed, I believe it was a solution that cut to the very bone of the ontological structure of the world" (Millikan, 1989a, p. 294). However, Millikan seems unaware of just how deep this incision must go. A powerful and supremely general solution to the problem of theory choice must reach a ground of the common form of the laws of nature, and this ground must lie outside the bounds of nature. Papineau and Millikan might try to salvage the reliability of a simplicity bias on the grounds that the laws of nature are, although uncaused, brute facts, necessarily what they are. If they share, coincidentally, a form of simplicity and do so non-contingently, then a scientific method biased toward the appropriate form of simplicity will be, under the circumstances, a reliable guide to the truth. There are two compelling responses to this line of defense. First, there is no reason to suppose that the laws of nature are necessary. Cosmologists often explore the consequences of models of the universe in which the counterfactual laws hold. Second, an unexplained coincidence, even if that coincidence is a brute-fact necessity, cannot ground the reliability of a method of inquiry. A method is reliable only when there is a causal mechanism that explains its reliability. By way of illustration, suppose that we grant the necessity of the past: given the present moment, all the actual events of the past are necessary. Next, suppose that a particular astrological method generates by chance the exact birthday of the first President of the United States. Since that date is now necessary, there is no possibility of the astrological method's failing to give the correct answer. 
However, if there is no causal mechanism explaining the connection between the method's working and the particular facts involved in Washington's birth, then it would be Pickwickian to count the astrological method as reliable in investigating this particular event. Analogously, if the various laws of nature just happen, as a matter of brute, inexplicable fact, to share a form of simplicity, then, even if this sharing is a matter of necessity, using simplicity as a guide in theory choice should not count as reliable. In my chapter, "The Incompatibility of Naturalism and Scientific Realism," in the forthcoming anthology Naturalism: A Critical Appraisal (Craig and Moreland (2000)), edited by William Lane Craig and J. P. Moreland, I give a fuller version of this argument. I also deal in that chapter with alternative accounts of the role of simplicity, such as those of Forster and Sober (1994), Reichenbach (1956), and Turney (1990). I show that, in each case, the rationales given for the use of simplicity as a criterion are inadequate to salvage a genuine scientific realism.


The Ramsey-Lewis Account of Laws

Frank Ramsey (1990) and David Lewis (1994) have proposed an account of the nature of natural law that would dispose of any need to explain the reliability of simplicity as an indicator of genuine lawhood. Their account simply identifies the laws of nature with the axioms of the best theory



of the world, where best is cashed out in terms of such virtues as simplicity, strength, and fit with the empirical data. Hence, it becomes an analytic truth that simplicity is a criterion of the lawfulness of a confirmed generalization. However, the Ramsey-Lewis account fails to satisfy my definition of scientific realism, since on their account, what the actual laws of nature are is not a fully objective matter. Which generalizations are laws depends in part on our preferences and practices, in particular, on our preferences for certain kinds of simplicity. Lewis suggests (Lewis, 1994, p. 479) that if nature is "kind", the subjectivity of his account can be somewhat mitigated: it may be that there is a single system of laws that is robustly best, best under a variety of conceptions of simplicity. This is only a somewhat mitigated form of subjectivism, however, since if the simplicity criterion is to have any real bite, our preference for simplicity must play an ineliminable role in determining which generalizations are in fact laws of nature. In addition, Lewis cannot take seriously Weinberg's suggestion that we are learning the correct aesthetic in response to our more and more extensive interactions with nature. Weinberg's view takes for granted that it is a very specific and constrained conception of simplicity that guides science at each point, that this conception of simplicity changes substantially over time, and that as we learn more about the aesthetic properties the true laws of nature share, we become better at identifying new laws of nature. The Ramsey-Lewis account assumes that the relevant conception of simplicity is generic and fixed, and it provides no way of making sense of a learning process by which our aesthetic sense becomes better attuned to that of the universe. 
In case any doubt remains about the lack of objectivity in our knowledge of natural law on the Ramsey-Lewis account, consider this fact: there is no possibility on the Ramsey-Lewis account of a causal connection between the facts about natural law and our opinions about those facts. For Ramsey and Lewis, Humean supervenience is a given: the modal and stochastic facts are wholly determined by the distribution of occurrent properties in the actual world. Which statements are in fact laws of nature depends on the whole course of the actual world, past, present, and future. A genuine law of nature must fit the occurrent facts of the future as well as the past. Thus, the fact that some L is a law of nature supervenes on occurrent facts spread throughout time. This latter fact cannot be a cause of our current opinions, since much of it lies in the future, causally posterior to our opinions. The need for a causal connection between the laws of nature and our scientific beliefs (the kind of connection that the Ramsey-Lewis account precludes) can be seen by considering, once again, Gettier-like examples of failed knowledge. Consider the following counterfactual world. Newton bases his theory of the inverse square law of gravitation almost entirely on observations of the movements of the planets, which exactly match the observed movements of the planets in the actual world. However, in this hypothetical world, the planets move the way they do because they are firmly attached to a system of elliptical rail lines in space, constructed millions of years ago by visitors from the Andromeda galaxy. These Andromedans built the rail lines in conformity to



certain complex religious beliefs they held, and the fact that the lines force the planets to move in exactly the orbits they do has in fact nothing whatsoever to do with the force of gravity. In such a case, Newton's beliefs about gravity would be as true and as justified as they are in the actual world, but they clearly would not constitute knowledge of the nature of gravity, because of the lack of the right sort of causal connection between Newton's theory and the inverse square law itself. This lack of causal connection between the laws of nature and our scientific beliefs means that Ramsey and Lewis cannot give a teleological account of either the content of our beliefs about natural law or of our knowledge of natural law. If we identify what is rational with what is required by the "design plan" (to use Plantinga's phrase) of our mind, that is, with the fulfillment of the proper functions of our ratiocinative faculties, then our reasoning about natural law falls outside the scope of reason. To treat natural laws as Ramsey and Lewis suggest we do must be to follow some purely positive social norm, to conform to a practice that is a non-adaptive product of historical accident. Our beliefs about natural law are, therefore, not genuinely about the world at all, but merely embodiments of this non-functional practice. In contrast, a modal realist can give a perfectly good account of our law-inducing practices in terms of natural selection. Since natural law (and its corollary, objective chance) causally affects the future course of events, it is plainly a matter of adaptive fitness to be well attuned to the actual laws of nature and the actual objective chances. These same laws and chances have been active in shaping the past, so it is possible for nature (despite her myopia about the future) to have succeeded in selecting for this attunement.
Observed patterns and frequencies in the past are, thanks to the reality of law and chance, fallible but reliable indicators of future patterns and frequencies. Since reason is simply the fulfillment of the mind's proper functions, it is paradigmatically rational for the mind to practice inference to the best theory (including the progressive improvement in the standards of goodness in theories that Weinberg describes).


When Does Bayesian Learning Constitute Knowledge?

Bayesian learning consists in updating one's subjective probabilities in light of new evidence by conditionalizing, that is, by making the posterior probability P'(A) be equal to the prior conditional probability P(A/E), where E represents the new information learned with certainty. There are a number of Bayesian convergence results, demonstrating that, in the infinite long run, the probability of empirical hypotheses (hypotheses for which there is no underdetermination of theory by data) will converge (with a subjective probability of 1) to a single value, washing out the effects of differences in the original priors. However, we are interested in more than just convergence to agreement: we



are also interested in convergence to knowledge. When does Bayesian learning converge in the long run to a state of knowledge? Let us suppose that the Bayesian is estimating the probability of a given outcome (such as "heads") in each of a series of 'exchangeable' trials (such as flips of the same coin in the same conditions).3 In the long run, all Bayesian learners will converge to the same value as the true, 'objective' probability of this outcome in these trials. From the viewpoint of teleological reliabilism, a convergence to a range r of objective probabilities constitutes knowledge when the following six conditions are met: 1. The trials in the series are objectively exchangeable, in the sense that they all have approximately the same objective probability, and this objective probability of heads is in fact in the range r. 2. The subjective exchangeability of the trials (the symmetry of the probabilities of permutations of outcomes in the subjective prior) carries robustly the information that these trials are objectively exchangeable. 3. The subjective exchangeability has the proper function of carrying this information robustly. 4. The actual outcomes of the trials were causally irrelevant to the determination of which of the trials were observed (i.e., there was no causally grounded bias in the selection of the observed cases). 5. The number of observed trials was great enough to make the objective probability of convergence to r very high (this is the condition to which various convergence results, including the law of large numbers, are relevant). 6. The fact that the hypothesis that the objective chance of "heads" had a finite (non-zero) prior probability was itself an instance of partial knowledge. The last condition introduces the notion of partial knowledge.
Partial knowledge of p consists in a state in which p is true, and p belongs to some set π of mutually exclusive propositions, where each member of π is given a finite, nonzero probability, and where this assignment of prior probabilities to the members of π has the proper function of robustly carrying the information that the disjunction of the members of π is true. In other words, p itself is a cause of the assignment of positive probability to p, and this causal chain accords with the proper functioning of the believer's subjective-probability state. The believer must know (in the teleological-reliabilist sense) that the disjunction of π is true.
3 In de Finetti's convergence result (de Finetti (1980)), a series of trials is exchangeable just in case the prior probability of any two series of outcomes is equal whenever the series are permutations of each other, that is, whenever the number of the various outcomes is the same in each series. Exchangeability represents a kind of symmetry in the assignment of prior probabilities.
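The conditionalization rule and the exchangeability symmetry described above can be illustrated with a small sketch. It assumes, purely for illustration, that each learner's prior over the chance of "heads" is a Beta(a, b) distribution; the function names and the two example priors are hypothetical and are not drawn from the text. Under that assumption the probability of a sequence depends only on the counts of outcomes (exchangeability), and two learners with very different priors end up with nearly the same estimate after enough trials.

```python
from fractions import Fraction

def predictive_heads(a, b, heads, tails):
    """Probability of heads on the next trial for a Beta(a, b) prior
    after conditionalizing on the observed counts. This also equals
    the posterior mean of the chance of heads."""
    return Fraction(a + heads, a + b + heads + tails)

def sequence_probability(a, b, outcomes):
    """Prior probability of an exact sequence such as "HHT",
    built up one conditional probability at a time."""
    prob = Fraction(1)
    heads = tails = 0
    for outcome in outcomes:
        p_heads = predictive_heads(a, b, heads, tails)
        if outcome == "H":
            prob *= p_heads
            heads += 1
        else:
            prob *= 1 - p_heads
            tails += 1
    return prob

def posterior_mean(a, b, outcomes):
    """Estimate of the chance of heads after seeing all outcomes."""
    heads = outcomes.count("H")
    return predictive_heads(a, b, heads, len(outcomes) - heads)

# Hypothetical data: the same 7-in-10 pattern, short run versus long run.
short_run = "HHHTHHTHHT"
long_run = short_run * 100

# Two hypothetical learners: one open-minded, one strongly expecting tails.
gap_short = abs(posterior_mean(1, 1, short_run) - posterior_mean(2, 20, short_run))
gap_long = abs(posterior_mean(1, 1, long_run) - posterior_mean(2, 20, long_run))
```

Here `gap_long` is much smaller than `gap_short`, which is the washing out of priors. Nothing in the sketch addresses the six conditions themselves: it shows convergence to agreement, not convergence to knowledge.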



This sixfold condition is needed to exclude Gettier-like examples of Bayesian convergence to the truth that fails to constitute knowledge. Consider the following Gettier cases: 1. The Bayesian converges to the correct objective probability, but does so because some all-powerful genie made visible a series of outcomes selected because of their conforming to the genie's favorite pattern, which just happened to coincide statistically with the objective chance. 2. The Bayesian's normal prior probability function would have assigned zero to the probability of the hypothesis that the objective chance was r, but, fortunately, a blow to the head caused the Bayesian to assign a non-zero probability to this hypothesis. 3. The trials in the series were both subjectively and objectively exchangeable, but they were subjectively exchangeable only because the Bayesian learner wrongly thought that each of the trials occurred on a weekday. Had the Bayesian learner discovered that many of the trials actually occurred on weekends, the trials would no longer have been subjectively exchangeable, and no convergence to the truth would have occurred. In each of these cases, the Bayesian would have converged to the truth as a result of perfectly correct applications of conditionalization, but the resulting rational true belief with certainty would not have constituted knowledge of the objective chance.
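The second Gettier case turns on a formal feature of conditionalization: a hypothesis that starts with prior probability zero stays at zero whatever the evidence. The following is a minimal sketch of that feature, assuming a hypothetical finite grid of candidate chances and made-up observation counts; none of the names or numbers come from the text.

```python
def conditionalize(prior, heads, tails):
    """Posterior over candidate chances of heads, given observed counts.

    `prior` maps each candidate chance to its prior probability."""
    weights = {
        chance: p * (chance ** heads) * ((1 - chance) ** tails)
        for chance, p in prior.items()
    }
    total = sum(weights.values())
    return {chance: w / total for chance, w in weights.items()}

# Hypothetical candidate values for the objective chance of heads.
CANDIDATES = [0.1, 0.3, 0.5, 0.7, 0.9]

# A prior that leaves every candidate open.
open_prior = {chance: 0.2 for chance in CANDIDATES}

# A prior that rules out 0.7 in advance (the learner's "normal" prior in case 2).
closed_prior = {0.1: 0.25, 0.3: 0.25, 0.5: 0.25, 0.7: 0.0, 0.9: 0.25}

# Hypothetical evidence: 70 heads in 100 exchangeable trials.
open_post = conditionalize(open_prior, heads=70, tails=30)
closed_post = conditionalize(closed_prior, heads=70, tails=30)
```

With the open prior, nearly all of the posterior lands on 0.7. With the closed prior it remains exactly zero there and the learner settles on a neighboring value instead, which is why convergence to the truth in case 2 depends entirely on how the non-zero prior came about.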


Objective Chance and Empiricism

The notions of metaphysical necessity and objective chance play fundamental roles in my account of causation and, consequently, in my accounts of knowledge and the mind. A number of epistemological challenges to modality and objective chance have been lodged in recent years by empiricists such as John Earman, Bas van Fraassen, and David Lewis. These include the non-supervenience of chance on occurrent fact and the problem of finding a rational basis for a connection between subjective and objective probability. Earman (1984) argues that objective chance cannot be acceptable to an empiricist unless it supervenes on occurrent facts. He calls this the "acid test" of empiricism. I am dubious about the viability of a distinction between 'occurrent' and 'dispositional' or 'modal' facts or properties. It may be that all the properties with which we are familiar are at least partly dispositional or modal in character. However, for the sake of argument, I am willing to concede that we can make some sort of sense of an occurrent/dispositional distinction. On any reasonable view of objective chance, objective chance does not supervene on such occurrent facts. Earman's insistence on supervenience assumes that the only properties that can be observed are occurrent properties. This is, of course, the essence of



Humean philosophy. It may be, as Hume thought, that no dispositional property is perceived qua dispositional property in its guise as the particular dispositional property it is in single, isolated cases of sensory perception. Even if we grant this, it does not follow that dispositional facts and properties (including facts about objective chance) are not perceived at all. There remain at least two possibilities: (1) we perceive the dispositional property in a single case, but we do not perceive its internal structure (we do not perceive it as dispositional, and we do not perceive which disposition it is), or (2) our perception of the dispositional property qua disposition emerges in the context of perceiving a series of relevant cases. It is the second possibility that I want to pursue here. Humeans may object that I am not entitled to use the word 'perception' in describing knowledge of objective chance that arises from a long series of separate observations. I would ask the Humean to consider the following illustration. I cannot perceive the Louvre in a single experience; my perception of the Louvre only emerges through a long series of separate experiences, experiences of various aspects of its exterior and of the contents of its various salons. Nonetheless, it would seem odd to insist that the Louvre is imperceptible. Similarly, the objective chance of a tossed coin's landing heads cannot be perceived in a single observation, but our perception of it emerges from a series of separate observations of coin tosses. What makes the objective chance perceptible is the existence of a causal chain of the right kind between the objective chance and corresponding mental states. This is also what makes various occurrent properties perceptible. Earman may be assuming that only occurrent properties can enter into such causal connections. One of the principal tasks that I undertook in part I was to expose the groundlessness of this assumption.
Earman could still insist that our beliefs about objective chance are better described as formed by inference rather than by perception. Once again, this is a distinction of dubious value. From the point of view of teleological reliabilism, what matters is whether a belief is formed in the proper, reliable manner, informed by the appropriate factual situation. Whether this process is best described as one of 'perception' or 'inference' is a secondary matter. However, once again, let's set these caveats aside for the sake of argument and suppose that beliefs about objective chance are based on inferences from observations. Why does this necessitate a principle of modal and stochastic supervenience? Earman seems to be assuming that the only form of inference that can ground inferential knowledge is deductively valid inference, inference in which there is an absolute guarantee that truth is preserved. This is, of course, another typically Humean dogma. Unless Earman assumes this, there is no reason why I cannot say both that dispositional and other modal beliefs are inferred from observations of occurrent fact, and that modal and dispositional facts do not supervene on the occurrent facts. Supervenience is a very strong condition: it means that it is impossible for the dispositional facts to vary once the occurrent facts are fixed. Supervenience could fail, and it could still be true that one can reliably infer the dispositional facts from the occurrent ones.



As a reliabilist, I would respond by insisting that the Humean sets the standard for the reliability of inference far too high. I can know, on the basis of a large number of observations of occurrent fact, that the objective probability of some outcome lies in range r, even though it is possible for the very same occurrent facts to be actual in a world in which the objective probability falls outside of r. The mere metaphysical possibility of error is not sufficient grounds for denying a claim to knowledge, unless we insist on repeating Descartes's fundamental error. In characterizing the reliability of statistical inference, it is important to bear in mind two facts. First, all information is relational, in the sense that an information connection always takes this form: a fact (s, φ) carries the information that some token of type ψ exists in relation R to s. Second, information is often abstract, involving quantification over types. Thus, I do not acquire the information from a series of observations that the objective probability of some type φ lies in a range r: instead, I acquire the information that there is some type χ in relation R to the token-observations such that the objective probability of φ conditional on χ is in range r. This means that the reliability of an information channel can be evaluated without bringing in higher-order objective probabilities. That is, I do not want to say that there is some objective chance that the objective chance of ψ lies in r, conditional on the observation series s, since this presupposes that it makes sense to talk about the objective chance of the objective chance of ψ. Instead, I want to say that there is a conditional objective chance that, given series s of observations realizing type φ, there exists a type χ that is realized in relation R to each member of the series s such that the real objective chance of ψ conditional on χ lies in the range r.
The reliability of a statistical method is measured, not by hypothetically varying the world's objective chance function, but by varying the values of the parameters of the trials which determine the objective chances of the outcome in situ. To evaluate the reliability of some belief-forming process, we need to discover whether it causes belief to track the truth. This means that we must consider the beliefs that would be formed across a range of hypothetical situations. In evaluating statistical inference, where the beliefs that are formed are judgments of objective probability, we must decide how to vary the objective probabilities being estimated. It would be problematic to make the objective chance function itself vary, since this would require us to make judgments of higher-order chance, the chance that objective chance might vary in certain ways. The alternative that I am suggesting involves varying some parameter shared by the trials for which the objective chance of the outcome is being estimated. This means leaving the objective chance function itself unchanged but instead changing the condition on which objective chance is being evaluated to a different condition on which the objective chance function determines a different probability-value for the relevant outcome. By way of illustration, suppose that I am trying to estimate the objective chance of "heads" in a series of identical tosses of an unchanging coin. Suppose that I observe thirty "heads" outcomes in a row. I estimate that the objective chance of "heads" is at or very near 1. To test the reliability of my method of



forming such estimates, we would consider what estimates I would have formed had certain relevant parameters of the coin or the tosses been different. We can estimate the objective chance of these alternative parameter-values, by considering the processes by which the coins or the tosses were produced (the distribution of weight in the coin, how hard the coin is tossed, etc.). We can then measure the reliability of my inference by measuring how likely it is that my estimate would have varied significantly from the objective chance that would have resulted from these alternative parameter-values, weighted by the objective chance of the alternatives. Van Fraassen's problems with objective chance (van Fraassen, 1987, pp. 38-39, 80-86) lie in a different quarter. Van Fraassen poses a dilemma for the objectivist: he must solve either the identification problem or the inference problem. The identification problem is the problem of identifying what sort of facts in the world make claims about objective chance true. The demand for a solution of the identification problem is essentially a demand for the reduction of stochastic facts to occurrent facts. I take modal and stochastic facts as primitive, irreducible constituents of the world, so (unlike Armstrong and Tooley) I decline to attempt a solution to the identification problem. Since I decline the identification problem, I must face van Fraassen's inference problem. In the case of objective chance, the inference problem is the problem of explaining the basis for a rational constraint on the relationship between subjective and objective probability. I go along with most objectivists in accepting Miller's principle (which David Lewis calls the "Principal Principle"):
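Stated in symbols, with P for my subjective probability and Ch for objective chance (this notation is an assumption, reconstructed from the verbal statement of the principle that follows), the principle can be written roughly as:

```latex
P\bigl(\varphi \mid Ch(\varphi) \in r\bigr) \in r
```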

According to Miller's principle, my subjective probability for φ, conditional on the supposition that the objective probability of φ lies in the interval r, must itself lie in the interval r. The problem that van Fraassen raises is this: what is the basis for this "must"? Appeals to pragmatic coherency, including appeals to the rational necessity of immunity to Dutch books, are of no avail in supporting Miller's principle in this form, since all such appeals are concerned solely with the relation of subjective probabilities to other subjective probabilities. None of these arguments can ground a constraint on the relationship between subjective and objective probabilities. If a solution to van Fraassen's inference problem must take the form of such an appeal, then no solution is possible. However, I am a thoroughgoing primitivist. I claim that Miller's principle is a primitive demand of reason, for which no further justification is necessary or possible. I thus decline van Fraassen's second problem as well. Why does van Fraassen think that the rationality of Miller's principle needs to be grounded in an argument from consistency? Like Earman, and like all Humeans, van Fraassen thinks that the only form of inference for which no justification is needed is deductive inference. This means that the only form of rational coherency that van Fraassen can recognize is some form of deductive consistency, including the probabilistic generalization of deductive consistency, namely, absolute immunity to Dutch books.



In contrast, I see deductive and non-deductive inference as very much on a par. In both cases, the inferences are rational because they are required by the fulfillment of the proper functions of the human mind. Since the point of human inference is the extension of knowledge, we can infer that any rational form of inference will be a reliable form of inference. In deductive inference, this reliability reaches its apex: the metaphysical impossibility of error. In many cases, the reliability of rational inference does not reach so high. Analogously, the price of deductive inconsistency, including probabilistic inconsistency, is very high: one is in a state that is necessarily less than optimal from the point of view of truth. Other forms of rational incoherency, such as the violation of Miller's principle, come with a lower, but still very substantial, price. There is no proof that I will necessarily go wrong if I violate Miller's principle, but the objective probability that I will go wrong is high, and the greater my deviation from the principle, the higher it is. One who conforms to Miller's principle is more likely (objectively speaking) to succeed in the long run. This, together with the fact that the proper function of subjective probability consists in aiming at the maximization of the objective expectation of utility, is enough to ground the rationality of Miller's principle. Van Fraassen insists on an internal characterization of rationality: ultimately, one that can be cashed out in terms of internal consistency. It is not surprising that from such a perspective van Fraassen cannot make sense of any primitive rational constraints linking subjective and objective probability. In contrast, the teleological reliabilist's approach to the characterization of rationality is thoroughly externalist. Rationality is about getting to the truth with a high degree of objective rationality. Violations of Miller's principle violate this external end: hence, they are irrational.
If Earman, van Fraassen, and Lewis are right in rejecting objective chance, then there is no possibility of a teleological account of our inductive practices. We might call this the Darwinian problem of induction: how can we explain induction as something for which nature selects? If there is no objective chance, then there is no causal factor lying behind both past, observed frequencies and future, to-be-encountered frequencies. Nature only cares about the latter, but she has access only to the former. Nature selects for things that contribute to our fitness, which is forward looking. The fact that our subjective beliefs are well attuned to past frequencies has nothing to do with our present reproductive fitness, which has to do with the chances of our survival and reproduction in the future. However, it is impossible for natural selection to bring about directly our attunement to future frequencies, since the events constituting these frequencies are causally posterior to our current beliefs and inferential faculties. Induction contributes to our fitness only if there is a causal explanation that links attunement to past frequencies with attunement to future frequencies. This causal explanation must make reference to objective chance as the tertium quid. Why then is it reasonable for me to conform to Miller's principle? The function of subjective degrees of belief is to be the best possible estimate of objective chance. The more closely our subjective degrees of belief approximate the objective chances, the greater is the objective chance that the act that maximizes our subjective expectation of success will also maximize our objective expectation of success. To fail to follow Miller's principle is to guarantee a discrepancy between subjective belief and objective chance. To do this is to frustrate the mind's proper function, a paradigmatic case of irrationality.



Enduring Substances and Their Identities

18.1 Substances as Logical Constructions

The ontologically basic components of the world are situation-tokens and situation-types. Situation-tokens encompass such things as events, states, processes, histories, and (in one sense of the word) facts. Situation-types are predicables: multiply-instantiated properties and relations of situation-tokens. The most prominent inhabitants of the world are enduring substances, spatially extended things, typically composed of matter, that experience change, that are sometimes created and destroyed, and that have histories. Enduring substances include such things as living organisms, artifacts, discrete, homogeneous masses, and social institutions. Enduring substances, along with facts about their identities and varying properties and relations, are logical constructions from situation-tokens and types. In the final analysis, reality consists merely of situation-tokens. Substances are really structures of situation-tokens considered or treated in a special way.

The first step in this construction is the definition of a substance history. A substance history is a stable, causally connected series of situation-tokens. Formally, a substance history is a pair $\langle C, \phi \rangle$, where $C$ is a finite sequence $\langle c_1, \ldots, c_n \rangle$ of situation-tokens, and $\phi$ is a situation-type, where:

1. each $c_i$ is of type $\phi$,
2. for each $i < n$, $c_i$'s being of type $\phi$ causally explains $c_{i+1}$'s being of type $\phi$, and
3. there is no sequence $C'$ meeting conditions (1) and (2) such that $C$ is a proper sub-sequence of $C'$.
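The three conditions on a substance history can be checked mechanically over a small, finite stock of tokens. What follows is only a minimal sketch and not part of the formal apparatus of this chapter: the token labels, the set `typed` (the tokens that are of the sortal type), and the relation `explains` (pairs (a, b) such that a's being of the type causally explains b's being of the type) are hypothetical stand-ins supplied by hand, and sequences are assumed not to repeat tokens.

```python
def is_chain(seq, typed, explains):
    """Conditions (1) and (2): every token is of the type, and each token's
    being of the type explains its successor's being of the type."""
    return (len(seq) > 0
            and all(tok in typed for tok in seq)
            and all((a, b) in explains for a, b in zip(seq, seq[1:])))


def is_proper_subsequence(short, long):
    """True if `short` can be obtained from `long` by deleting at least one
    element while keeping the order of the rest."""
    if len(short) >= len(long):
        return False
    remaining = iter(long)
    return all(tok in remaining for tok in short)


def chains(typed, explains):
    """Enumerate every non-repeating sequence that meets conditions (1) and (2).
    Brute force: only suitable for very small examples."""
    def extend(path):
        yield tuple(path)
        for nxt in typed:
            if nxt not in path and (path[-1], nxt) in explains:
                yield from extend(path + [nxt])

    for start in typed:
        yield from extend([start])


def is_substance_history(seq, typed, explains):
    """Conditions (1)-(3): a chain that is not a proper sub-sequence of any
    other chain over the same tokens."""
    seq = tuple(seq)
    if not is_chain(seq, typed, explains):
        return False
    return not any(is_proper_subsequence(seq, other)
                   for other in chains(typed, explains))


# Hypothetical example: three successive tokens, each of the sortal type,
# each explaining the next.
typed = {"m1", "m2", "m3"}
explains = {("m1", "m2"), ("m2", "m3")}

print(is_substance_history(("m1", "m2", "m3"), typed, explains))  # maximal chain
print(is_substance_history(("m1", "m2"), typed, explains))        # extendable, so not maximal
```

The maximality test is what excludes arbitrary fragments of a longer history: a chain that could be extended, or that skips over intermediate tokens of a longer chain, fails condition (3).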




A substance history is a self-perpetuating stability. The situation-type $\phi$ is the sortal whose persistence unifies the history.

For example, suppose we have an isolated blob of mercury (causally isolated from any other mercury) with one gram of mass. There is a series of situation-tokens of a particular type, namely, being homogeneously mercurial and having one gram of mass, and for each member of the series, the fact that it is of this type explains, via the law of conservation of mass and various principles of inertia, the fact that its successor is also of this type. If the blob were not causally isolated from a second gram of mercury, then it would not be the case that there is a substance history associated with the first gram of mercury alone. Unless we can causally isolate the two grams of mercury (say, at the atomic level), we cannot use laws of conservation to provide separate explanations of the persistence of each of the two grams of mercury. Instead, the persistence of the two grams of mercury would constitute a single, indivisible substance history.

For a second example, we could consider Aristotle's bronze statue. In the actual world, the history of the mass of bronze overlaps with the history of the statue per se. The mass of bronze could have existed, as a discrete, homogeneous quantity of alloy, before the statue came to exist, and it could go on existing after the statue ceased to exist. Conversely, the statue can continue to exist even after some part of the mass of bronze has corroded away. Qua mass of bronze, the sortal unifying one history is that of being a homogeneous mass of bronze of a certain quantity. Qua statue, the unifying sortal is a higher-order, functional property: that of serving some public or aesthetic purpose.

The histories of living organisms provide very clear examples of substance histories. The type that is sustained throughout the substance history of an organism is a conjunction of higher-order, teleofunctional types.
This accounts for the fact that we can continue to have a single history, even though the stages vary widely in many first-order physical properties (from weighing only a few grams to weighing far too many kilograms, for instance). Living systems are self-perpetuating at the functional level: the fulfillment of biological functions at one stage causally explains their fulfillment at the succeeding stage.

To each substance history $C_{\phi}$, there is a corresponding substance $C^{\phi}$. Substances must not be identified with substance histories simpliciter: substance histories are series of situation-tokens, and substances are not. Substance histories are spread out across time, while substances endure through time. Substances are a kind of logical construction out of substance histories. The properties of substances must be explained in terms of corresponding properties of histories. My aim is to reduce substance-talk to situation-talk. If I am successful, I will have saved the appearances, in the sense that our ordinary uses of substance language will come out as largely true under my analysis.

To each situation-type $\psi$, there is a corresponding substance-type $\psi^{\#}$. These two types are not identical: they characterize different categories of beings and have quite different relations to locations in space and time. If $C_{\phi}$ is a substance history, $s$ is a constituent token of $C_{\phi}$, and $\psi$ is a situation-type, then:

$$ s \models (C^{\phi} \models \psi^{\#}) \quad \text{if and only if} \quad s \models \psi. $$



Informally, substance $C^{\phi}$ has property $\psi^{\#}$ at situation-token $s$ just in case $s$ belongs to the corresponding substance history $C_{\phi}$ and $s$ is of the corresponding situation-type $\psi$. A similar account of relations between substances can be given in terms of relations between token-constituents of the two substance histories.

Change with respect to $\psi^{\#}$ between situations $s$ and $s'$ can be accounted for as follows: $s$ and $s'$ are both constituents of substance history $C_{\phi}$, with $s$ causally prior to $s'$, $s$ is of type $\psi$ and $s'$ is of type $\neg\psi$. In such a case, we say that substance $C^{\phi}$ has changed from $\psi^{\#}$ to $\neg\psi^{\#}$. It is important to bear in mind that a substance-property such as $\psi^{\#}$ is not a relational property: if $s \models (C^{\phi} \models \psi^{\#})$, we are not to think of $\psi^{\#}$ as a relation between $C^{\phi}$ and $s$. Rather, the attribution of $\psi^{\#}$ to $C^{\phi}$ is true at $s$. The whole construction $(C^{\phi} \models \psi^{\#})$ is essentially a complex situation-type.

Substance $C^{\phi}$ is identical to substance $D^{\psi}$ if and only if $\phi$ and $\psi$ are necessarily co-extensive, and there is some initial segment $C_{<i}$ of sequence $C$ and some initial segment $D_{<j}$ of $D$ such that $C_{<i} = D_{<j}$. This stipulation captures the Kripkean view that it is the sortal and the origin, and only these, of a substance that are essential to its identity. If $C$ and $D$ share some initial segment, and both are substance histories relative to some sortal $\phi$, then they represent two possible life-stories for the substance $C^{\phi}$: one, for example, in which the substance becomes a philosopher, and another in which it becomes a stockbroker. It is the shared origin, together with the shared sortal $\phi$, that makes these two possible stories of the same substance.

Tensed properties, such as 'having been $\psi^{\#}$' or 'going to be $\psi^{\#}$', can also be constructed in a similar fashion. In the case of future-tensed properties, we must make reference to a world (selecting one of the possible futures of the situation-token), as well as to the situation-token itself.
Let 'P' be an operator representing the simple past, and 'F' an operator representing the simple future. Then we can introduce truth definitions as follows:

Informally, a past-tense attribution of a property to an enduring substance is true at a given situation-token if there is some token in its past at which the untensed attribution is true. A future-tense attribution of a property to an enduring substance is true at a given situation token and a given world if there is some token contained by that world and posterior to that token at which the untensed attribution is true. Reference to a world is ineliminable in the case of a future-tense attribution, since there are many alternative futures for any given token, but only one past.
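The informal gloss just given can be written out in symbols. The two clauses below are a sketch under assumed notation, not the author's own formulation: $C^{\phi}$ stands for the substance, $\psi^{\#}$ for the substance-property, $s' \prec s$ for "$s'$ is causally prior to $s$", and $s' \sqsubseteq w$ for "$s'$ is contained by world $w$". The last two symbols are introduced here purely for illustration.

```latex
% Past tense: evaluated at a token s alone.
s \models P\,(C^{\phi} \models \psi^{\#})
  \iff \exists s' \,\bigl[\, s' \prec s \;\wedge\; s' \models (C^{\phi} \models \psi^{\#}) \,\bigr]

% Future tense: evaluated at a token s together with a world w.
(s, w) \models F\,(C^{\phi} \models \psi^{\#})
  \iff \exists s' \,\bigl[\, s' \sqsubseteq w \;\wedge\; s \prec s' \;\wedge\; s' \models (C^{\phi} \models \psi^{\#}) \,\bigr]
```

The asymmetry between the two clauses mirrors the point of the gloss: only the future-tense clause needs the extra world parameter, because a token has a single past but many alternative futures.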



A tensed attribution of a property to a substance can be true at a situation-token, even if that token is not part of the corresponding substance history. Consequently, a past-tensed attribution of a property to a substance can be true long after the substance has ceased to exist, and a future-tensed attribution can be true before the substance has come into being.
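The untensed constructions of this section (property attribution, change, and identity) can be illustrated with a toy model. This is a minimal sketch under stated assumptions rather than a rendering of the author's formalism: tokens and types are plain labels, causal priority is read off the order of the sequence, negation is modelled as an explicit type `("not", psi)`, and necessary co-extensiveness of sortals is approximated by a hypothetical `coextensive` predicate that defaults to equality of labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class History:
    tokens: tuple   # situation-token labels, listed in causal order
    sortal: str     # label of the unifying situation-type


def neg(psi):
    """Explicit negative type; a token may support psi, neg(psi), or neither."""
    return ("not", psi)


def has_property(history, token, psi, supports):
    """The substance has psi# at `token` iff the token belongs to the history
    and supports the situation-type psi."""
    return token in history.tokens and psi in supports.get(token, set())


def changed(history, s, s2, psi, supports):
    """Change with respect to psi#: s and s2 both belong to the history,
    s comes earlier, s supports psi and s2 supports neg(psi)."""
    toks = history.tokens
    if s not in toks or s2 not in toks or toks.index(s) >= toks.index(s2):
        return False
    return psi in supports.get(s, set()) and neg(psi) in supports.get(s2, set())


def same_substance(h1, h2, coextensive=lambda a, b: a == b):
    """Identity: co-extensive sortals plus a shared non-empty initial segment.
    Two sequences share such a segment exactly when their first tokens match."""
    if not h1.tokens or not h2.tokens:
        return False
    return coextensive(h1.sortal, h2.sortal) and h1.tokens[0] == h2.tokens[0]


# Hypothetical data: two possible life-stories with a common origin.
h1 = History(("birth", "school", "philosophy"), "human")
h2 = History(("birth", "school", "brokerage"), "human")
h3 = History(("other_birth", "school2"), "human")
supports = {
    "birth": {"thin"},
    "school": {"thin"},
    "philosophy": {neg("thin")},
}

print(has_property(h1, "school", "thin", supports))
print(changed(h1, "school", "philosophy", "thin", supports))
print(same_substance(h1, h2), same_substance(h1, h3))
```

Modelling negation as a separate type, instead of as mere absence of support, keeps room for tokens that neither verify nor falsify a given attribution.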


Change and the Johnston Paradox

Mark Johnston (1984) and David Lewis (1986a) have argued that the commonsense theory of enduring substances is incoherent. They argue that one and the same substance cannot have (as a whole) two contradictory properties at different times, since there is no way to make coherent sense of this tensed predication of different properties to the same entity. Their solution is to identify substances with substance-histories and to argue that it is temporal parts of the substance/history that have contradictory properties, not the whole. Peter Simons (Simons, 1991, pp. 134-135) has produced a convenient typology for classifying different approaches to this problem of tensed predication. We can parse a proposition of the form A is F at t in six different ways:

1. A is-F-at t
2. A is F-at-t
3. A is-at-t F
4. (A is F) at t
5. A ((is F) at t)
6. A-at-t is F

Johnston and Lewis advocate option 6, in which A-at-t is taken to refer to the temporal part of A located at time t. They take option 1 as misrepresenting the intrinsic property F (like having two legs or weighing ten stone) as a relational property is-F-at, a relation between a thing and a time. Options 2, 3, and 5 also seem to deny the possibility of change, since each substance timelessly and eternally possesses the property of being F-at-t, or the property of being-at-t F. My own account is clearly one that takes option 4: (A is F) at t. Tensed predication of an intrinsic property to a substance is merely one special case of a universal phenomenon: the supporting of situation-types by situation-tokens. The time index t is merely a stand-in for some particular situation-token located at t. The predication (A is F) is itself a situation-type, verified by some situation-tokens, falsified by others, neither verified nor falsified by still others.




Zeno's Paradox and the Instant of Change

The ancient Eleatic philosopher Zeno raised a number of difficult paradoxes concerning the logic of change. One of these concerned the classification of the substance at the instant of change. For example, at the moment of death, is the human alive or dead? If we say alive, then it would appear that death has not yet occurred at the moment of death. If we say dead, then it would appear that death has already occurred at the moment of death. Each of these conclusions is contradictory. Zeno's paradox is an artifact of our imposing the continuum of metrical time upon what is in reality a set of discrete phenomena. In reality, a state of life is immediately followed by a state of death. In locating these states within a system of time measured by the real numbers, we introduce a new, virtual state, the instant of death, that has no real existence. Hence, the question of whether the human is alive or dead at that instant has no principled answer. Since the instant is the artifact of our own theorizing, we are free to stipulate any answer we please.

What happens when a situation-token s includes temporal parts both before and after the change? Should we say that it supports contradictory types, both 'the human is alive' and 'the human is not alive'? No, we must recognize that these types are not mereologically persistent: each is supported by a part of s, but not by the whole. We can find persistent types for such a case: 'the human is alive at some time' and 'the human is not alive at some time'. Both of these non-contradictory types are supported by s.
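The contrast between persistent and non-persistent types can be made concrete in a toy model. The sketch below rests on modelling assumptions that are not in the text: a token is represented as a tuple of stages, each stage recording whether the human is alive in it; the parts of a token are its non-empty sub-collections of stages; and a token supports 'the human is alive' only if every one of its stages is a living one. All function names are hypothetical.

```python
from itertools import combinations


def parts(token):
    """All non-empty sub-collections of stages (stand-ins for mereological
    parts), including the token itself."""
    indices = range(len(token))
    for k in range(1, len(token) + 1):
        for combo in combinations(indices, k):
            yield tuple(token[i] for i in combo)


# Non-persistent types: supported only if they hold throughout the token.
def alive(token):
    return all(token)


def not_alive(token):
    return not any(token)


# Persistent types: existentially quantified over the token's stages.
def alive_at_some_time(token):
    return any(token)


def not_alive_at_some_time(token):
    return not all(token)


def persistent_on(situation_type, token):
    """True if the whole token supports the type whenever one of its parts does."""
    return all(situation_type(token)
               for part in parts(token) if situation_type(part))


# A token spanning the change: two living stages followed by two dead ones.
s = (True, True, False, False)

print(alive(s), not_alive(s))                            # neither is supported by the whole
print(alive_at_some_time(s), not_alive_at_some_time(s))  # both are supported
print(persistent_on(alive, s), persistent_on(alive_at_some_time, s))
```

On this model the spanning token supports neither of the contradictory types as a whole, while it supports both of the existential types, which do not contradict one another.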


Hard Cases for Substance Identity

Autocatalysis and Biological Reproduction

There are several examples that suggest that my definition of substance history (and, consequently, of substance identity) is too liberal. First, consider processes of autocatalytic reactions. The presence of a particular molecule in solution causes new instances of that molecule to come into existence. We do not want to say that the original molecule is identical to each of the products. A similar example would be the replication of a particular crystalline form during the solidification of some liquid. I would argue that in each of these cases, the linkage between one situation-token and the next is too weak. The existence of the molecule is not, all by itself, a causal explanation of the subsequent existence of the duplicate. What is needed is the existence of a favorable environment for autocatalysis. We could identify the entire system of molecules and solution as an enduring substance, but this seems all right. Similarly, in the case of the replication of a crystalline arrangement, the seed crystal is not by itself a sufficient causal explanation for the subsequent crystalline layers.

But now, have I made the condition too stringent? Is the existence of a living organism at one moment a sufficient causal explanation of its existence at the next? Doesn't the environment contribute something to the survival of the organism, just as it does in the case of autocatalysis or crystalline replication? This objection assumes that there is no information about the environment in the substance history of a living organism. What persists from one moment to the next is a system of functional organization, including both intrinsic functions (within the body of the organism) and extrinsic functions (in its environment). There is a single, indivisible system of organization for each living organism, but these systems overlap considerably in their shared environment.

Asexual biological reproduction provides another difficult test case. The existence of the living parent at one stage is causally sufficient for the existence of each of the offspring at a later stage. Now, clearly there is something that endures through reproduction, namely, the species itself as a substance. However, the individual organism does not survive mitosis, since the normal organization of the parent does not directly cause the corresponding organization of the children. There is an intermediate stage during which the normal functioning of the parent is disrupted. During that stage, the parent ceases to exist, and after the division is complete, two or more new individuals come into existence. This analysis agrees with that of Peter van Inwagen (van Inwagen (1990)), who also locates the persistence of living things in the continuity of the process of life.

To take another example, suppose a woman dies at the moment of giving birth (or at whatever moment we take as constituting the beginning-to-be of the woman's child). Does the woman-child history constitute the basis for the existence of a single human organism?
No, because we do not have direct causal connections between successive stages of the various processes that make up human life: respiration, locomotion, digestion, perception, and so on. The woman's last state of respiration is not directly connected to any of the initial states of biological activity of the child, and the same thing holds for the other organic processes. The causal connection is between a process of reproduction on the woman's side and life on the child's, not between one stage of life (as a whole) and the next.


Fission and Fusion of Individuals

The philosophical literature on personal identity is full of science-fiction scenarios involving the transplantation of all or part of a human brain into another human body. Since humans can apparently survive with somewhat less than half of their cerebral cortex intact, these scenarios raise the possibility of duplicating one person, or fusing several people together. These kinds of radical surgery introduce ruptures in the causal connectedness of the organism's history. In fact, any medical intervention poses at least some threat to organic continuity, since the organism's future functioning is no longer explained entirely in terms of its earlier functioning. We can salvage personal identity only by subtly shifting the relevant sortal, incorporating the practice of medicine as a normal part of our ecological niche. However, organ transplantation poses a serious challenge to this strategy, since we begin to lose our grip on the definition of the system whose survival is being preserved by the medical intervention. It is not medicine per se that poses a challenge to personal identity, but medicine that is transformational, rather than merely restorative or ameliorative. It is difficult to say at exactly what point the challenge becomes insuperable, but I would argue that we are well over the line when we reach the transplantation of significant parts of the cerebrum. At that point, I think we must say that there are no survivors, and that the products of the operation are no longer human. They should be thought of as new, artificial persons.


Star Trek Transporter Accidents

In the TV/movie series Star Trek, there is a transporter device that apparently takes one apart, atom by atom, sends the structural information through space, and re-assembles a duplicate at the other end. From time to time, the transporter malfunctions, producing two or more duplicates of the original, with the consequent confusions. Prima facie, the teleporter is fatal to its customers. The functional continuity of the human organism is completely broken, with the teleporter intruding as a tertium quid. As in the case of medical intervention, it may be that a shift in the sortal, from human to human*, where the continuity of the history of human*s includes the normal operation of the teleporter, will provide substances whose identities can endure the teleportation. What then happens in the case of teleporter duplication? At this point, I think we have to bite the bullet and say that Spock (or whoever) is literally at two different places, with incompatible properties, at the same time. I will explore some of the implications of this possibility in the next subsection.


Time Loops and Other Anomalies

Suppose Captain Kirk travels through a wormhole, ends up in the past, and meets an earlier version of himself. How many humans are standing on the bridge? Do we count Captain Kirk twice? In my account of the attribution of properties to substances, I made these attributions true or false, not at a moment of time, but at a particular situation-token. Normally, one organism will not be at two places at one time, since this can only happen in the presence of a temporal anomaly: normally, contemporaneous, spatially separated tokens are causally isolated from one another. However, time travel (through wormholes or whatever) can make very abnormal things possible. We must say not only that Kirk was thin then but heavy-set now, but also that Kirk is now bewildered over there, but not now bewildered over here. Incompatible attributions are logically consistent, even when they are attributed to the same substance at the same time, so long as they are attributed to different tokens at different places, and so long as these tokens are, through some anomaly, causally related.



When counting the number of people on the bridge, we should count Kirk only once. Since Kirk now acts in many ways as though he were (per impossibile) two persons, it may be pragmatically convenient to miscount for certain purposes. In addition, Kirk can have a number of relations to himself that are normally possible only between distinct people. He can, for example, clap his right hand against itself and generate noise, should a Zen-like mood pass over him. Although Kirk has only two hundred pounds in mass, he can stand on a scale and make it register four hundred pounds in weight (so long as he cooperates with himself in climbing on the scale). If temporal anomalies were common, we would have to be a lot more careful how we formulated various physical laws involving substances. What happens in the case of situations, like that of t, the present time, that include both versions of Kirk? Must we say that this token supports the contradictory type Kirk is angry and not angry? No, once again we must recognize that the types Kirk is angry and Kirk is not angry are not mereologically persistent. If we want to use only persistent types, we would have to say that Kirk is angry somewhere and not angry somewhere, which is not self-contradictory.


Substrate Theory

I have to confess to having qualms about the reduction of substances to situation histories that I have proposed in this chapter. Especially in the case of personal identity, I share the widespread sense of uneasiness with the idea that personal identity consists in nothing above and beyond causal connections of the right kind between distinct situations. I can feel the pull toward some postulation of a single entity that lies somehow at the bottom of personal identity. Let us call such a unifying entity the substrate of the person or organism. If such substrates exist, they might be situation-tokens of a special kind. These substrate-tokens would be timeless and non-spatial, but they would be causally efficacious. The substrate of person X might be a token that is causally responsible for the continuity and perpetuation of the personal substance history of X. In other words, the substrate would pull some real weight in the causal structure of the world, explaining the coherency over time of certain mental or even physiological processes. If living organisms in general have substrates, they could be thought of as an individual elan vital associated with each organism, a causally necessary condition of the sustaining of the organism's biological functions. Although I can see some attraction to such a substrate theory, it would seem to be highly speculative. A good case for such a theory would have to make good on at least one of the following claims: The existence of substrates is a deliverance of common sense. We have direct awareness of at least one substrate (perhaps one's self). The existence of substrates can be supported as an inference to the best explanation of some phenomenon.



However, these claims seem at best doubtful. Probably the strongest case for substrate theory would make use of resilient intuitions we have about personal identity.



An enduring substance need not be composed of anything. Electron holes in semiconductor media and real holes (containing nothing but a vacuum) in materials are two examples. We can trace the history of an electron hole (the absence of an electron from a positively ionized atom) as it moves through the semiconductor. The hole moves as electrons jump from its next position to its current position. The chain of electron-absences is a causally connected history of the appropriate kind; so, the hole counts as an enduring substance. Similarly, a hole in a material substance endures from one moment to the next by the same principles of inertia that explain the endurance of the isolated quantity of material stuff.


Quantum Reality and the Foundations of Materialism

At a bare minimum, materialism entails that the only things that can be causally efficacious are things that have spatial and temporal location. (See, for example, David Charles's discussion of physicalism (Charles, 1992, p. 280).) What is so special about spatiotemporal location? Why should that be thought to be essential to causal efficacy? There are two sorts of answers the materialist can give to these questions. On the one hand, the materialist could argue that we have no positive evidence that atemporal causation exists, that situations without spatiotemporal location are ever causally efficacious. On the other hand, the materialist could argue that the success in the history of science of a particular explanatory strategy, one that presupposes the spatiotemporal character of causation, provides a powerful argument in favor of minimal materialism.

It is one of the main burdens of this book to argue that the first materialist response is mistaken: that we have in fact ample evidence of the existence of atemporal causal connections.[1] First of all, the existence of teleological connections, attested to throughout the biological and human sciences, points to the causal efficacy of certain causal and modal facts (section 7.4 and chapter 12). Second, the existence of logical and mathematical cognition and knowledge is best explained by positing the causal efficacy of facts about logical necessity (section 7.3 and chapter 15). Third, the possibility of theoretical cognition and knowledge in science requires the existence of an atemporal causal explanation of the relative simplicity of causal mechanisms discovered across a wide variety of disciplines (section 17.5). Fourth, I argued in chapter 8 that a reasonable extrapolation of our success in discovering causes leads to the inferring of the existence of an uncaused first cause, without spatiotemporal location. The existence of these independent evidences for extra-spatiotemporal causes means that the materialist cannot rely upon an appeal to ignorance. Positive reasons for limiting causation to spacetime must be given.

[1] For a summary of these arguments, see section 21.3.

The second materialist response does involve giving such a positive reason. The materialist can point to the progressive success of a particular model of scientific explanation, stretching continuously from Democritus to Einstein. I will call this the DTE model, for "Democritus to Einstein." The model consists of a potentially fruitful strategy for explaining all natural phenomena whatsoever. This DTE model depends on the truth of four theses:

1. The finite complexity of nature: Every phenomenon consists of a finite number of simple parts.
2. Spatial compositionality: Facts about wholes supervene on intrinsic facts about their parts, together with the spatial relations between these parts.
3. Every projectible correlation or other regularity has a causal explanation (Reichenbach's rule).
4. Causal locality: No action at a spatial or temporal distance.

If these four theses are true, then we can hope to find a complete causal explanation of any observable phenomenon by following three steps: (1) analyze the phenomenon into its simple parts, (2) find complete causal explanations of each of the parts of this region in terms of contiguous facts, and (3) explain each feature of the whole region in terms of the features and spatial relations of the parts. Materialism entails that a Laplacean intelligence, if supplied by an oracle with all facts about the physical characteristics of the ultimate simples (unlimited measurement) and with all mathematical truths (unlimited computation), would be able to explain all macroscopic properties and all projectible patterns and correlations.
Both Democritean atomism and Einstein's theory of general relativity, as well as many physical theories entertained between these two, fit this general materialistic strategy.[2] For example, general relativity tries to explain all properties of complex physical objects in terms of the intensities of fields at the constituent spatiotemporal points and provides a deterministic theory of the evolution of these field strengths.

[2] Although Newtonian mechanics violates principle 4 (locality), due to the instantaneous action of gravity, the inverse square law guarantees that any violations of locality due to changes in remote gravitational forces will be negligible. The gravitational attraction of massive distant objects is an essentially uniform influence on all the particles of an isolated system.

If any of the four theses is denied, then the resulting view cannot be characterized as one of strict materialism. If, for example, we reject thesis (1), then we open the door to supra-physical influences. Lord Kelvin, as Crosbie Smith discusses in his biography of Kelvin (Smith (1989)), recognized that an infinitely complex nature opened the door to the influence of a supra-physical free will on natural processes, without necessitating any violations of physical law. The operation of an infinitely intricate mechanism may be unpredictable in principle, even if the basic physical laws are (for any finite system) as deterministic as classical Newtonian physics.

Similarly, if we reject thesis (2), we admit the possibility of strongly emergent properties, properties whose existence and causal powers are unpredictable, even in principle, from the properties and spatiotemporal arrangements of the parts of their possessors. Such strongly emergent properties could be radically nonphysical in nature and include mental or spiritual qualities.

If we reject thesis (3), we are admitting that there is more in heaven and earth than is dreamt of in our best causal theories. Reliable patterns and correlations exist that cannot be explained by the operation of any physical mechanism. This again could open the door to irreducibly immaterial facts and explanations, explanations involving interpretation and verstehen, for example.

Finally, if we reject thesis (4), then we must agree that there is no connection between causality and spatiotemporal contiguity. Consequently, there is no way to exclude the possibility of the causal influence of entities with no spatiotemporal location whatsoever. If causal locality is not even approximately true (if differential causal influence does not go to zero as distance increases), then there is no conflict with physical theory in supposing that there is influence from states at an infinite distance, i.e., from states outside of spacetime altogether. Moreover, rejecting thesis (4) moves us in the direction of an ontological monism, since the existence of non-local influences challenges the basis of the individuation of material bodies. Without the principle of locality, the universe would be a single, evolving unity, of which individual material bodies are merely partial manifestations.
In this section, I want to raise some doubts about the second, third, and fourth of the assumptions of the materialistic strategy, namely, the conjunction of causal locality, spatial compositionality, and Reichenbach's rule. These principles face a serious challenge from quantum mechanics, in particular, from Bell's theorem and the empirical confirmation of the violation of Bell's inequalities. Bell's results (see Mermin (1981) and van Fraassen (1982)) conclusively refute the conjunction of locality and compositionality, since they entail that either (1) each thing in the universe is causally non-localizable, or (2) that macroscopic objects (classical systems) are localizable, but their mereological parts are not. If we take option (1), then the very notion of spatiotemporal location is undermined. (As I argued in section 5.10.2, action at a distance is impossible, because distance is by definition that at which there is no action.) If instantaneous action at a distance were possible, then spatial compositionality would be of no value: one could reduce the present features of the whole to the present features of its parts, but this would not constrain the future states of the whole in any meaningful way. Option (2) corresponds to Heisenberg's interpretation of quantum mechanics (Heisenberg (1958)). This interpretation has not been popular with philosophers, largely, I think, because of its blatant inconsistency with mereological compositionality. According to Heisenberg, properties such as definite


position, velocity, and momentum are strongly emergent properties of classical, macroscopic systems, not fixed by the properties of the quantum-level, microscopic parts of those systems (which are characterized by probabilistic quantum wave functions, not by the classical properties of definite position and velocity). The emergentism of the Copenhagen interpretation leads to the so-called measurement problem: at exactly what point in a micro/macro interaction does the transition from quantum properties to classical properties take place? If we assume that what is fundamentally real is the causal and mereological structure of situation-tokens and their intrinsic types, then the superimposition of a metric spacetime upon the world is a matter of finding the most favorable balance between simplicity and empirical adequacy. It would be unreasonable to expect that empirical adequacy should always trump the issue of the simplicity of the spacetime geometry. The very simple geometries of a single time line and Euclidean space (in pre-relativistic physics) and of Minkowski spacetime (in special relativity) are viable only through ignoring the occasional misfit between these geometries and the actual structure of events. Consequently, the attribution of classical spatiotemporal properties to macroscopic objects always involves a certain amount of oversimplification. As the scale shrinks, the mismatch between classical geometries and the causal structure becomes greater and greater, until the fit breaks down entirely at the level of subatomic interactions (as revealed by the failures of the Bell inequalities).3 The boundary between the classical world and the quantum world is consequently a vague one, and the measurement problem admits of no precise solution. As Bohm and Hiley (1993, p. 94) have argued, quantum mechanics is essentially about "the process of forming and dissolving wholes," wholes for which mereological compositionality fails.
Such wholes play no role in classical or relativistic mechanics. However, if we try to avoid the rejection of spatial compositionality that is explicit in the Heisenberg interpretation by trying to force definite spatiotemporal properties and relations on the objects at the quantum level, the result is no better for materialism. The Bell inequality results force, by a kind of recoil, the non-locality of the quantum systems to rise to the level of the macroscopic. Without causal locality, the very notion of the location of a macroscopic substance becomes problematic. We can no longer analyze spatiotemporal location in terms of causal relations, since there is no longer any correlation between causal relatedness and spatial distance. Location would have to be treated as a purely subjective phenomenon, simply a matter of how things appear to us. This would mean, moreover, the abandonment of even minimal materialism, since it presupposes the objectivity of location. Moreover, the limitation of causation to the spatiotemporal is plausible only if causal order agrees in every case with temporal order. If temporally reversed
3. In The Undivided Universe, David Bohm and Basil Hiley (Bohm and Hiley, 1993, p. 378) suggest just such an emergence theory of Cartesian space. They suggest that the difficulty of interpreting quantum mechanics is the result of "trying to force quantum laws into a framework of a Cartesian order that is really only suitable for classical mechanics."

causation is acknowledged, as in Cramer's interpretation of QM (Cramer (1986)), it seems arbitrary to exclude atemporal causation. If the materialist tries to preserve compositionality by giving up causal locality and admitting superluminal influences, then the theory of relativity entails that he has also acknowledged temporally reversed causation, since, if no light signal can reach B from A, then in some reference frames A precedes B, and in other frames B precedes A. If A causally influences B, then there is a cause that is (in some frames of reference) later than its effect.

On the Heisenberg interpretation, we can still interpret location as an objective property of macroscopic systems. Moreover, we can continue to hold the principle of causal locality: two systems with spatial location (i.e., two macroscopic systems) can influence one another directly only if they are spatially contiguous. Indirect influences via microscopic parts escape the constraints of locality, since the parts have no location, strictly speaking. However, under most ordinary conditions, such quantum-level effects are negligible.

There are in fact three levels at which we encounter the primary qualities of position, shape, velocity, volume, and so on:

1. The mereotopological level, the "commonsense," qualitative geometry attributable to the network of causal relations.
2. The metrical geometry of classical physical theory (including Newtonian and relativistic mechanics).
3. The extrinsic functional properties of observable objects in the natural human environment that correspond to our perceptions of location, shape, and volume.

At each level, we can talk about such things as relative distance. At the level of mereotopology, this talk reflects the causal relations actually holding between event-tokens.
Spatiotemporal relations at this level accurately reflect the underlying causal realities (at both the macroscopic and the microscopic levels), but the relations have few formal properties and cannot sustain a simple metrical geometry. At the level of physical geometry, we sacrifice some accuracy and comprehensiveness for the sake of finding a simple geometry with strong formal properties. Classical mechanics, by ignoring the microscopic level, can successfully impose a very elegant geometry on the network of causal relations at the macroscopic level. Finally, just as there are functional properties of observables corresponding to the secondary qualities of sensation, so are there functional properties corresponding to the visual and tactile qualities of shape and location. Objects of certain types have the extrinsic function (as parts of the natural environment of humans) of stimulating certain kinds of sensory representations in the human mind, representations corresponding to the primary qualities. At this level, Berkeley was quite right to insist that there is no essential difference between primary and secondary qualities. The difference between the two kinds of


qualities consists in the fact that there are systems of properties roughly corresponding to the sensible primary qualities at the level of causal mereotopology and of physical theory, while there are no such properties corresponding to the secondary qualities.


The Many Worlds Interpretation

There is one interpretation of quantum mechanics that would seem to provide some hope for materialism: the many-worlds interpretation of Everett and DeWitt. According to this interpretation, there is no collapse of the wave function. Instead, the superposition of eigenstates represented by the quantum wave function corresponds to the existence of multiple states of the universe. Individual particles can have many different positions, momenta, and other characteristics at the same time. Since the evolution of the wave function is fully local, the many-worlds interpretation seems to preserve both locality and mereological compositionality. On the Everett interpretation, measurement brings about a splitting of the world into many worlds or many 'relative states' of the world. One fundamental question to be faced is this: is world-splitting a causal process or not? If it is not a causal process, then Reichenbach's rule is violated, since splitting into coordinated states is used to explain observed correlations. However, if world-splitting is a causal process, then it is a very peculiar one. For example, the non-occurrence of an interaction can cause a world-splitting, such as our failure to observe an electron at one of the slits in the two-slit experiment (Bohm and Hiley, 1993, pp. 123-124). In effect, what is going on in other worlds or remote parts of this world can produce world-splitting in this world, a very radical sort of violation of locality. The greatest difficulty with the many-worlds interpretation, noted by Bell, Bohm and Hiley, van Fraassen (van Fraassen, 1987, p. 85), and many others, concerns the interpretation of quantum probabilities. Suppose that quantum theory predicts that a particular measurement has two possible results, one having a probability of 75%, the other 25%. According to the many-worlds interpretation, both observations will actually be made, each in a different world.
What then does it mean to say that the first is three times more likely than the second? There is no non-circular answer to this question that can be given in terms of the standard many-worlds interpretation. Another problem for the standard many-worlds interpretation is that of specifying a privileged basis for decomposing the quantum function into a plurality of precisely defined worlds or "relative states" (in Everett's terminology). The wave function does not by itself determine which operators represent real properties, properties that are fully and determinately realized in each of the many worlds. Thus, many-worlds theorists must supplement quantum mechanics with some sort of metaphysical principle for determining this basis. Albert and Loewer (1989) have devised a variant, the many-minds interpretation, that provides an intelligible meaning for quantum probabilities and defines a privileged basis for the decomposition of the wave function. Suppose

that each brain state of a certain kind is inhabited by infinitely many minds, all in exactly the same mental state, which state is determined by the underlying brain state. In the example given above, each mind makes a transition to one of the two observation states. Each mind has an objective chance of 75% of ending up in the first state and a 25% chance of ending up in the second state. The Albert/Loewer theory is an unusual mixture of physicalist and dualist elements. The mental state of each mind is wholly determined by the underlying physical state. However, the continuity through time of each mind is wholly unrelated to any physical substratum. Albert and Loewer have solved the probability-interpretation problem only to create a still more serious problem: the problem of accounting for the diachronic identity of individual minds. A causal explanation of mental identity through time is excluded, since such causal links would have to be independent of physics, and yet the synchronic state of each mind is wholly determined by the corresponding physical state. A spatiotemporal or corporeal basis for diachronic identity is obviously not available, since each of the successor minds has exactly the same spatiotemporal relations as the infinitely many duplicate minds occupying the same physical state.


Eudaemonism and the Objectivity of Value

The objective reality of value is a given of human experience. The task of philosophy is not to explain away the objectivity of value, but to reconcile the objective existence of value with four apparent problems. These four problems include two forms of "queerness" about objective values noted by J. L. Mackie, plus the problems of semantics and epistemology. Mackie argued that objective values would be queer, incapable of being integrated into a rationally coherent ontology, because, unlike all other facts, they (1) provide categorical reasons for acting, and (2) are intrinsically motivating. Many have noticed the epistemological problems inherent in postulating objective values, since they seem to be imperceptible and without definite location. Finally, the most serious problem of all is the semantic one: given that there are objective values, how do our thoughts and words become linked to one rather than another? By bringing on board the theory of teleofunctions, it is possible to revive the eudaemonistic theory of objective value developed by Plato and Aristotle. This eudaemonistic theory is able to solve the four problems in a systematic and principled way.


Objectified Subjectivity: A Dead End

Is the distinction between a good life and a bad one an objective distinction, or does "thinking make it so"? This is the fundamental question of meta-ethics. I want to defend a robustly objectivist thesis, much more so, I think, than many others who are often associated with objective or naturalistic ethics. I do not want to identify the objective good with an idealized version of subjective good. That is, I do not identify the good with what a suitably idealized person would want or value. This sort of approach can achieve only pseudo-objective value, not the real thing. Proposals of objectified subjective value fall into two categories. First, there


are the intersubjective theories, such as that of Hume's impartial spectator or Mill's competent judges (in chapter 2 of Utilitarianism). Second, there are the individualized versions, such as those of Brandt or Railton. The intersubjective theories fall prey to two objections. First of all, the intersubjective approach can give us no good reason to believe that all idealized subjects will want the same thing. Even if it turned out that they did, this would be merely a fortunate accident and would still not secure the objectivity of value. Convergence of ideal inquirers on some opinion is evidence that the opinion is true, because the truth of the opinion is part of the best explanation of the convergence. The objective fact with which the opinion is concerned comes first, causally speaking: the convergence is causally dependent upon it. This is why we cannot simply identify the objective good with that value upon which idealized agents would converge, since the convergence cannot explain itself. Second, as my colleague T. K. Seung (1993) has persuasively argued, the definition of "ideal" as used in every version of the intersubjective approach always incorporates the theorist's prior opinions about what is in fact good. The only sort of ideal agent that could be relevant to the project of grounding a theory of objective value would be an objectively ideal agent, and this means that we must have at least one objective value whose existence is not explained by intersubjective convergence. If at least this one, why not many more? An ideal agent is one whose cognitive faculties are fulfilling their proper functions. Why not identify an objectively good life with one in which some class of proper functions is fulfilled? The idealized self of Brandt (1979) and Railton (1986a) also fails as a basis for genuine objectivity. Brandt and Railton identify what is objectively good for me with what my ideal self would want, or what my ideal self would want me to want. 
As in the last case, there are at least potentially some problems with giving a value-neutral justification for one form of idealization. Railton argues that, since he is trying to give an ontological account and not a conceptual analysis of value, the circularity of his theory is not vicious. In effect, Railton gives a recursive definition of value: my good is whatever my ideal self would want, where the characteristics of my 'ideal' self are also determined by what my 'ideal' self (in the same sense) would want. This is a coherent theory of idealized subjectivity. An objectivist would, with some plausibility, argue that the principle that value is coextensive with the preference of an ideal self is plausible only when the standards defining the ideal self are themselves objectively valid. However, the more serious problem for Railton's account concerns the possibility of the sort of semantic reference to the good that Railton postulates. The ideal-self theorist faces a dilemma. She must either identify the property of being good with the property of being wanted by her ideal self, or she must identify that property with what Railton calls the "reduction basis" of that property, that is, with the conjunction of physical and psychological facts that make it the case that her idealized, fully informed self would want what it does. In either case, the ideal-self theorist cannot give a viable account of both the semantics and the epistemology of ethics. In particular, she cannot explain how the property of being good can be causally relevant to our experience. Railton



(Railton, 1986b, p. 142) explicitly takes such causal relevance of goodness to be a necessary condition of ethical objectivity: "it is such and we are such that we are able to interact with it, and this interaction exerts the relevant sort of shaping influence or control upon our perceptions, thought, and action." In the first case, I think, the problems are fairly clear. It is hard to see how the property of being wanted by an idealized self could be causally efficacious in the real world, since it makes reference to the activity of a purely hypothetical being. If goodness is not causally efficacious, then facts about goodness cannot be involved in my forming opinions about my good, and we run afoul of the Gettier examples, being unable to distinguish knowledge from right opinion. In this case, ethical inquiry is not the investigation of a condition ontologically distinct from the result (under ideal conditions) of the inquiry itself. However, the second case is even more problematic.1 The reduction basis of the good varies wildly from individual to individual, and from time to time, in the case of a given individual. The property of goodness must then be identified with an infinite disjunction of conjunctions of possible reduction bases with the corresponding fulfillment of the ideal desires (or equivalently, with an infinite conjunction of material conditionals of the form, if in physical state Ri, then in state Gj). Such an infinite disjunction would be a paradigm of a genuinely disjunctive or gerrymandered type, which, as I argued in section 4.8.1, can never be causally relevant or efficacious, contrary to Railton's intentions. Moreover, this means that our idea of the good must carry the corresponding infinitary information, and must have the robust carrying of this information as its proper function. 
However, it is surely impossible for us, given our finite capacities and finite evolutionary history, to have any proper function that is infinitely complex in nature. How could the fact that a state carries some infinitely complex information possibly contribute causally to the existence of humankind? The idea of the good has been involved in only finitely many events in the evolutionary history of mankind. It is impossible that every disjunct in the infinitary disjunction played a causal role in contributing to the successful propagation of the species. Hence, the representational content of the idea must be unitary. In addition, there is an even more fundamental objection to Railton's account. There is no reason to believe that there actually exists a disjunctive type that is equivalent to the higher-order type being desired by one's ideal self, as I have already argued in section 16.1. There could have existed many physical types that do not in fact exist. Only physical types that exist in actuality can be components of actually existing disjunctive types. Hence, no disjunctive type, not even an infinitary one, is equivalent in intension to the higher-order type. In some worlds, the physical type realizing the reduction basis for the ideal-self preferences will be a type that does not even exist in our world and that therefore cannot be included in any actual disjunctive type. Consequently, the objectification of subjective preference or choice is inadequate as a basis for objective value. The theories of Hume, Mill, Brandt, and

1. It is this horn of the dilemma that Railton grasps (Railton, 1986a, p. 25).


Railton have value as accounts of the epistemology of ethics. Where they fall short is in accounting for its metaphysics: what is it that constitutes the reality of objective value?



The identification of good with the fulfillment of natural functions originates with Plato and is carried on in the eudaemonistic tradition of Aristotle, Aquinas, and Butler, to name a few. Eudaemonia is the state of living a perfectly good life. In order to define a good life in terms of teleofunctions, we must first distinguish between primary and secondary functions. A secondary function is one whose proper operation presupposes the failure of some function. For example, the function of wound healing is secondary, since its fulfillment presupposes that the body has suffered some damage, preventing it from fulfilling one or more of its natural functions. The operation of antibodies is a secondary function, presupposing the presence of disease. Anger is a secondary function, presupposing that some function has been thwarted by the unjustified action of others. A primary function is any function that is not secondary. Clearly it would be a mistake to identify eudaemonia with the fulfillment of all of one's functions. This is an impossible state, since the fulfillment of any secondary function presupposes the failure of some other function. Hence, I will define eudaemonia as the state in which all of an organism's primary functions have been fulfilled. In chapter 12, I developed a moderate, Aristotelian account of teleofunctionality. On the basis of that account, we would expect eudaemonia to consist in the fulfillment of a largely harmonious system of functions. Whether or not all of these functions can be explained in terms of their contribution to the reproductive fitness of humans (that is, in terms of natural selection) is an empirical, and not a conceptual question.
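
The definitions in the preceding paragraph can be set out schematically. What follows is only a rough sketch and not the author's own formalism: the symbols F_o (the set of teleofunctions of an organism o), Ful (is fulfilled), Sec, Prim, and Eud are notation assumed here for illustration, "presupposes" is read as a strict (necessary) implication, and temporal indexing is ignored.

```latex
% Assumed notation, introduced here for illustration only:
%   F_o      : the set of teleofunctions of organism o
%   Ful(f)   : function f is fulfilled
%   \Box     : necessity (so "presupposes" is read as strict implication)
\begin{align*}
\mathrm{Sec}(f)  &\iff f \in F_o \;\wedge\; \exists g \in F_o \,\bigl( g \neq f \;\wedge\; \Box\,(\mathrm{Ful}(f) \rightarrow \neg\,\mathrm{Ful}(g)) \bigr) \\
\mathrm{Prim}(f) &\iff f \in F_o \;\wedge\; \neg\,\mathrm{Sec}(f) \\
\mathrm{Eud}(o)  &\iff \forall f \in F_o \,\bigl( \mathrm{Prim}(f) \rightarrow \mathrm{Ful}(f) \bigr)
\end{align*}
% Consequence: if some s in F_o is secondary, then the fulfillment of
% every function of o at once is impossible.
\[
\bigl( \exists s \in F_o \;\, \mathrm{Sec}(s) \bigr) \;\rightarrow\; \neg\Diamond\, \forall f \in F_o \;\, \mathrm{Ful}(f)
\]
```

On this reading, the final line records why eudaemonia cannot be identified with the fulfillment of all functions: fulfilling a secondary function s would require that some other function g go unfulfilled, so the definition quantifies over primary functions only.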


The Connection between Eudaemonia and Motivation

In Ethics: Inventing Right and Wrong (Mackie (1977)), J. L. Mackie argued against the existence of objective good by charging that such a thing would be "queer" in two ways: first, it would, by virtue of simply existing and being recognized as such, have an essential power of engaging anyone's motivations. Second, it would, in the same way, have an essential power of providing anyone with reason for doing something, independent of his desires or tastes. With a eudaemonistic theory of objective good, one can provide an explanation of these two special powers that dispels Mackie's charge of queerness. What is it that people really want? Is it that all of their present desires, aspirations, and intentions be fulfilled? No, because it is possible that after having



fulfilled these, people can find themselves dissatisfied, even miserable, coming to realize that their desires were faulty, erroneous, and in need of revision. Is the goal of all action then the feeling of satisfaction and contentment, or attendant feelings of pleasure and ease? Again, this is clearly wrong. If one were offered a pill that would guarantee a lifetime of warm and fuzzy feelings, but would also guarantee that one would remain idle, ignorant, and friendless, it would be foolish to accept such a pill. The feelings of satisfaction and pleasure are reliable but fallible indicators of what it is we really want. What, from an anthropological point of view, is the function of our will, of our ability to want and to desire? It must be to coordinate our actions so that we have the greatest chance of fulfilling all of our functions, including digestion, metabolism, and reproduction. The point of having a will is to aim at eudaemonia. Desires and feelings of pain, pleasure, contentment, and dissatisfaction are all mechanisms whose proper function it is to move us in the right direction. There is a non-vicious circularity built into the object of our will. We aim at the fulfillment of all our functions, including the function of the will (which is to aim at the fulfillment of all our functions). In other words, the content of eudaemonia for reflective organisms like humans must be specified recursively. Kant was wrong to contend that the only good thing is a good will, since in that case there would be no base case with which to start the induction. However, he was right to include a good will as a part of the good. There is an ineluctability to our wanting eudaemonia. We do not choose to aim for it, nor could we choose not to. Eudaemonia is the end for whose sake the faculty of choosing exists. When the will, the faculty for planning and choosing, is functioning properly, it is aiming at eudaemonia.
Humans can vary in the degree to which they understand what eudaemonia consists in and how best to achieve it, but all humans want eudaemonia as their sole ultimate end. Although eudaemonia is an ineluctable end for normal human beings, the connection between eudaemonia and human motivation is not an unbreakable one. It is possible to be in the grip of defective desires: desires for conditions that are destructive of eudaemonia, even desires that are known to be destructive of one's fulfillment. For example, an addict may find himself with an overwhelming desire for cocaine, despite the fact that the addict is well aware of the fact that cocaine is not good for him. However, it is too much to expect a theory of objective good to support an unbreakable connection between objective good and human desire. It is enough if it supports a natural and essential connection between the two, and that eudaemonism can supply. It is also important to realize that eudaemonia is an inclusive, and not a superordinate, end. I am not claiming that everything we want we want as a means to the end of eudaemonia. Rather, I am claiming that all of our natural desires are desires either for a constituent of eudaemonia or for something that tends to contribute to eudaemonia. Hume famously argued that reason is the "slave of the passions," that reason uncovers only facts, and that facts cannot motivate without the cooperation of desire. A eudaemonist does not entirely disagree with Hume. The goodness of


eudaemonia is not entirely independent of the fact that we all (qua humans) desire it. The fact that eudaemonia is good for us depends in part on the structure of our faculty of desire. Indeed, Hume recognized two functions of reason with respect to the good: the verification of the real existence of the proper object of a passion, and the selection of adequate means to the attainment of that object (Hume, 1969, p. 463, book II, part III, section 3). Thus, reason's function is not wholly instrumental: it must also apprehend the proper object of the subject's passion. The crucial issue that divides the Humean from the eudaemonist is this: are the proper objects of our passions invariably specifiable in ethically neutral language? If so, there is no room for a normative science of ethics. Reason must simply introspect and discover the ethically neutral objects of our passions and then apply itself to selecting the most effective means for securing those objects. However, suppose that some of our most important passions or desires have, as their proper objects, conditions that can only be specified in ethical terms, such as wisdom, virtue, or true happiness? Then reason's non-instrumental function in guiding action is not merely one of introspection; it also includes the science of ethics, an investigation into what it is we truly want. For Hume, there is such a thing as the science of ethics, consisting in the discovery of those things that human beings invariably desire and approve. However, this science guides our actions only instrumentally, by informing us about how we and other humans are likely to act. For the Aristotelian eudaemonist, ethics is action-guiding in a more direct way: it clarifies for us the nature of our end, which is not (contra Hume) transparent to introspection. Mackie's second charge of queerness concerned the power of objective good to provide reasons for action. 
Mackie followed Hume in adopting a purely instrumental conception of reason: the faculty of knowledge has as its purpose the selection of means, not of ends. However, on a teleological conception of human nature, we have access to a substantive conception of reason. Knowledge includes knowing what to want, as well as how to get what one wants. The qualification for participating in rational dialogue is the possession of properly functioning cognitive faculties. These cognitive faculties include our wantings, valuings, and preferrings. One whose mind is diseased in its conative faculties is as much disqualified from full membership in the institution of reason-giving and reason-taking as is one whose capacities for deductive logic are damaged.2 In response to eudaemonism, Mackie argues that the theory depends on a confusion of two possible senses of the phrase, the good for man. This could mean either (1) "what men in fact pursue or will find ultimately satisfying, " or (2) "man's proper end, ... what he ought to be striving after, whether he is or not" (Mackie, 1977, pp. 46-47). An account of eudaemonia in the first sense is a descriptive statement, with no implications for normative ethics. An account
2. At the moment, I'm assuming that all deviations from reason are by way of a defect. I'm setting aside the possibility of the teleological suspension, for the individual, of the ethical and the rational, a possibility I will take up in section 20.8.



of eudaemonia in the second sense already presupposes some positive theory of the good, and so cannot inform ethics. Clearly, an Aristotelian must insist that he uses the word eudaemonia, or the phrase the good for man, in the first, descriptive sense. It makes no sense to say that we ought to pursue eudaemonia, since we must do so, and oughts only apply to things we both can do and can fail to do. However, Mackie simply begs the question by claiming that an investigation of eudaemonia in the first sense has no implications for normative ethics. Mackie assumes here the very descriptive/normative dichotomy that is in dispute.


Nature and Nurture

The nature of an organism (in the normative sense of the word) is constituted by the totality of the functional organization of the organism as situated in its environment. Nature in this sense cannot be identified with the organism's genetic endowment, nor can it be assumed to be utterly disjoint from the effects of the organism's social and cultural environment. It is natural for human beings to have parents, to learn a language, to acquire a history and a network of social relationships, no less natural than to eat and drink, or to have a heart that circulates the blood. Socialization and acculturation are themselves natural processes, processes with natural teleofunctionality. A human organism's capacity to exercise its functions can be damaged by a bad upbringing, as much as by defective genes. Ethics, as a part of dialectic, is addressed to humans whose rational faculties are in good working order. Hence, a reasonably good upbringing is a prerequisite for full participation in the practice of ethics as a science.


The Unity and Universality of Good

In the Summa Theologiae, Aquinas asks two critically important questions: Does every human have a single ultimate good? Do all human beings have the same ultimate good? Eudaemonism is committed to answering "Yes, for the most part" to both questions. The unity of the human organism depends on the harmony, the mutual compatibility and even interdependence, of that organism's primary teleofunctions. To the extent that these functions are compatible, eudaemonia constitutes a coherent system of ends. If the system of ends of a human being becomes radically discordant, falling into two or more mutually exclusive sets, then the unity of personality has disintegrated, resulting in incoherent and mutually adversive actions. The science of ethics presupposes that its participants are rational, cognitively healthy individuals. Hence, ethics presupposes that each has a single, approximately coherent end.



There is a special branch of ethics, which we might call "corrective ethics," that deals with the situation of persons who fall far from the ideal of rational coherency. This branch is concerned with the identification and study of secondary functions, whose operation presupposes the existence of some degree of irrationality or personal disintegration. The study of such phenomena as guilt, shame, remorse, and regret would fall within this branch.

Do all human beings share the same ultimate end? To the extent that all teleofunctions are explained by means of Darwinian selection, this must be so, at least approximately. All humans share a common evolutionary background, and nearly all of the functional aspects of the human personality were fixed in these prehistoric times. Many of these natural teleofunctions are schematic or abstract, in the sense that their concrete realization depends heavily on historical and cultural contingencies. For example, every human has the natural function of speaking a language, but there is no particular language which it is natural for all (or even some) humans to speak. It is natural for every human to have a conception of his family's history, but there is no particular historical conception that it is natural to have. It is this schematic character of human nature that makes possible the wide variety of culture across human societies and periods of history.

These underlying anthropological functions provide a standard for evaluating various cultures. We can ask how well some aspect of a human culture fulfills the abstract teleofunction it serves. We can evaluate, for example, how well a given language fulfills the teleofunctions of language: how well it avoids ambiguity and confusion, how efficiently it conveys important information, how aesthetically pleasing are its sounds and cadences. The ethical norms and counsels of a culture can be subjected to evaluation on similar grounds.
Eudaemonia is impossible apart from the resources that culture provides, but some cultures do a better job of this than others.

Can members of certain nations acquire entirely new functions in the course of cultural evolution? Certain patterns of behavior could become fixed on a widespread scale in a society because that behavior promoted some effect valued only in that society, for special, historical reasons. Do these cultural functions become incorporated into a specialized, historically conditioned form of eudaemonia? The answer depends on the extent to which the fulfillment of these functions becomes incorporated into the function of the will itself. My hunch is that the human mind is so constructed as to resist the incorporation of novel elements into the constitution of eudaemonia. The coherency of human action, or, equivalently, the unity of the human personality, is biologically imperative. A human who is constantly working at cross-purposes enjoys less success in survival and reproduction than one whose personality is directed to a single, coherent end. The extent to which the fundamental orientation of the will is subject to change in history is a matter for empirical investigation: it may be that, as long as the new elements can be readily harmonized with the old, the will is subject to a limited degree of historical malleability.

As we saw in chapter 12 (section 6), natural kinds must be identified at the level of populations and not individuals. There can be natural sub-types within a given natural kind, varying in some degree in the proper functions that individuals in each sub-type realize. The most obvious example of this is the division of the two sexes. There may be other natural sub-types within humankind: a natural pattern of distribution of talents that promotes a socially advantageous division of labor. To the extent that humans do fall into such distinct natural sub-types, their natural ends will also vary. Ultimately, eudaemonia is something that can be fully realized only at the level of complete societies (as Aristotle himself at least partially realized). Nonetheless, there must be a limit to this variation among sub-types, if humankind is to constitute a single natural kind, and not merely a cluster of symbiotic kinds. There is good reason to doubt that the variation is so wide as to include humans who are slaves by nature (as in Aristotle's account in the Politics [3]) or naturally decadent (as in Nietzsche's view).


Indeterminacy and Objectivity

In his work on anti-realism, Michael Dummett suggested that the defining characteristic of a realist is her commitment to bivalence: to the thesis that every proposition in the relevant domain is either true or false, and not both. However, as a number of critics have pointed out, there is no necessary connection between this criterion and the requirement of ontological commitment. A realist is one who believes that there is some reality, potentially exceeding in its richness our capacities to investigate it, that determines which of our propositions in the relevant domain are true and which are false. If realism is true, then when we agree about some proposition, on the basis of shared knowledge, the fact about which we are agreeing plays an indispensable role in the causal explanation of our agreement. Realism in this sense (the indispensability of reference to corresponding facts in fashioning complete causal explanations of our cognitive practices) is fully compatible with a failure of bivalence. A proposition of a realistic domain can fail to be either true or false, if the corresponding set of facts is itself incomplete. Indeed, it is possible to be a realist while accepting that certain propositions in the domain are both true and false, so long as there exist facts in a suitable state of overdetermination. T. K. Seung and I (Seung and Koons (1997)) have argued for just such a realistic overdetermination of fact in the domain of ethics. I would argue that the law of non-contradiction is a highly reliable, and yet fallible, rule. There is another dimension of indeterminacy in ethics that should be acknowledged. The existence of an objective ideal of eudaemonia entails only the existence of a partial preference ordering of states, based on the set of primary functions that are fulfilled in each state. One state is clearly to be preferred to a second if the set of functions fulfilled in the first is a superset of the set of those fulfilled in the second. 
Many states will be mutually non-comparable, as Isaiah Berlin argued.
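
The superset criterion can be made concrete with a small sketch. The following Python fragment is only an illustration, not part of the theory's formal apparatus: it assumes that a state can be represented by the set of primary functions fulfilled in it, and the function labels (health, friendship, knowledge) are placeholders chosen for the example.

```python
from typing import FrozenSet, Optional

# Assumption for illustration: a state is represented by the set of
# primary functions fulfilled in it. The labels below are hypothetical.
State = FrozenSet[str]


def preferred(first: State, second: State) -> Optional[State]:
    """Return the clearly preferable state, or None if neither is.

    One state is clearly preferable when the functions it fulfills form
    a proper superset of those fulfilled in the other. Identical sets,
    and sets where neither contains the other, yield None.
    """
    if first > second:   # proper superset
        return first
    if second > first:
        return second
    return None


a = frozenset({"health", "friendship", "knowledge"})
b = frozenset({"health", "friendship"})
c = frozenset({"health", "knowledge"})

print(preferred(a, b) == a)   # a fulfills everything b does, and more
print(preferred(b, c))        # neither contains the other: non-comparable
```

Because the ordering is only partial, a pair such as b and c is left unranked, which is the kind of non-comparability described above.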

[3] See the discussion of slavery and natural rights in (Arnhart (1998)).



When we find ourselves in a state in which it is impossible to fulfill all of our primary functions, certain secondary functions are activated. These secondary functions guide decision making in sub-optimal conditions. The proper operation of these secondary functions, when risk is involved, provides a normative standard from which we can construct a cardinal utility or welfare function. Presumably, this welfare function will be highly underdetermined by the set of secondary decision-making functions. We could model such an underdetermined function by a set of acceptable functions, or by a single, interval-valued function.
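
The two modeling options mentioned here can be sketched side by side. The fragment below is a hedged illustration only: it assumes that each acceptable welfare function can be written as a table from state labels to numbers, and every label and value in it is invented for the example.

```python
from typing import Dict, List, Tuple

# Assumption for illustration: each acceptable welfare function is a
# table from state labels to numbers. All values are invented.
WelfareFunction = Dict[str, float]


def welfare_interval(state: str, acceptable: List[WelfareFunction]) -> Tuple[float, float]:
    """Interval-valued welfare: the lowest and highest acceptable value."""
    values = [w[state] for w in acceptable]
    return min(values), max(values)


def determinately_better(x: str, y: str, acceptable: List[WelfareFunction]) -> bool:
    """x is determinately better than y if every acceptable function agrees."""
    return all(w[x] > w[y] for w in acceptable)


acceptable = [
    {"a": 0.9, "b": 0.5, "c": 0.4},
    {"a": 0.45, "b": 0.3, "c": 0.7},
]

print(welfare_interval("a", acceptable))
print(determinately_better("a", "b", acceptable))
print(determinately_better("a", "c", acceptable))
```

In this toy case the intervals for a and b overlap even though every acceptable function ranks a above b, so the set-of-functions model can settle comparisons that the single interval-valued function leaves open.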


The Semantics and Epistemology of Ethics

Since practical reason has guiding us toward eudaemonia as its principal proper function, the exercise of practical reason is a reliable means of achieving eudaemonia. Insofar as our practical judgments are the product of healthy, well-functioning cognitive faculties, they will tend to be accurate. All of our thoughts are causally connected to the property of eudaemonia, since eudaemonia is the natural end to which our faculties are ordered. This means that our concept of goodness will have an exceptional semantic basis. We can distinguish the concept of the good from other concepts by virtue of its cognitive-functional role. Characterizing an act or state as good is to qualify it as a possible goal for action. Characterizing an act as best, all things considered, is tantamount to adopting the act as an intention.

Among the sources of knowledge of the good, the first and most important is personal ethical development. One grows in one's knowledge of the good as one's cognitive faculties develop and mature. Since this intellectual maturity is itself a great good, growth in the knowledge of the good depends on success in achieving the good. As one's capacities for the exercise of one's functions improve, this brings with it a fuller expression of one's cognitive and conative faculties. This ethical development depends in large part on a good upbringing. The example of mature character in one's elders and instruction through story and precept both play an indispensable role.

A second important source of ethical knowledge is the testimony of the wise, of persons who have already achieved a high level of developmental maturity and experience. There is a certain degree of unavoidable circularity in appealing to the wise, since the ability to recognize wisdom is itself a product of ethical knowledge. Nonetheless, this circularity can lead to a beneficial recursion, rather than vicious stagnation.
Thirdly, the experiences of pleasure and displeasure, contentment and dissatisfaction, and other subjective impressions of one's degree of welfare are natural indicators of eudaemonia, reliable although fallible.

Fourthly, there is the possibility of scientific knowledge of eudaemonia, based on the empirical investigation of the teleological generalizations and connections associated with human nature. Medicine and physiology can clearly give us insight into the nature of health, a key component of eudaemonia. Sociobiology and evolutionary psychology can reveal which aspects of human behavior and social organization have been adaptive.


Eudaemonism versus Evolutionary Ethics

It is important to distinguish teleological eudaemonism from the various versions of evolutionary ethics that have been proposed since the time of Darwin. I will group evolutionary ethics into three categories: (1) the Whig interpretation of evolution (WIE), (2) nature as a moral paradigm, and (3) survival value as the ultimate value.

There are elements of the Whig interpretation of evolution in Herbert Spencer, but I think it has been more influential in popular culture than among scientists or philosophers. The WIE is implicit in the first series of Star Trek, where the characters are constantly discussing whether or not some alien species is 'more evolved' than they are. What I mean by WIE is the view that evolution has a determinate direction, an absolute Up and Down, with human beings on the top (at least so far). Earlier forms of life are early stages of a process with a predetermined terminus, namely, civilized Homo sapiens (the Victorian Englishman), or, perhaps, some form of Uebermensch just over the horizon. The WIE is not incompatible with teleological eudaemonism, but the two are logically independent. The kind of cosmic teleology implied by the WIE is not part of the immanent teleology of human functioning required by eudaemonism. The Whig interpretation of evolution is in conflict with eudaemonism if it asserts that higher forms of life are better, and that we have some sort of obligation to further the process of evolution. Eudaemonism is essentially conservative, backward looking: what determines eudaemonia for us is what functions have been realized so far in human nature, not what new functions may arise in new forms of life in the future.

A number of ethicists have proposed that nature, as described by Darwinism or evolutionary biology, be taken as an ethical paradigm to be imitated. Some have urged that since natural selection exists, it must be right, so it is right for us to be tough-minded and allow inferior humans to perish.
Dewey, believing that evolution is a Good Thing, adopts flexibility and mutability as the ultimate values of human life. All of these forms of evolutionary ethics are guilty of a fairly crude version of the "naturalistic fallacy": taking whatever exists (on a large and permanent enough scale) to be ethically paradigmatic. Teleological eudaemonism is committed to no such inference. The measure for human happiness is human nature, not some cosmic phenomenon. We are to be fulfillers of human nature, not imitators of nature. Some evolutionary ethicists, such as B. F. Skinner, have taken survival, or the survival or reproduction of human genes, to be the ultimate value, in terms of which all other projects and intentions are to be evaluated. According to



this view, survival value is the only value, objectively speaking. Knowledge, friendship, music, and creativity are all valuable, if at all, only as means to survival. Survivalism of this sort is not compatible with eudaemonism. According to eudaemonism, one's survival and reproduction are real values, since most if not all of our functions have these as natural ends, but these are not the only values. All of the components of eudaemonia, including survival, friendship, virtue, productive work, and knowledge, are equally ultimate. We could imagine a creature so constructed by nature that each of its choices was made in light of their reproductive expected value, but human beings are clearly not such creatures. Nor is it clear that such creatures would be better at reproducing themselves than we are: it is often the case that one's chances of achieving X are increased if one forgets about X and seeks Y instead. I might add here that, contrary to the views of genetic imperialists such as Richard Dawkins, there is no value to the reproduction of one's genes per se, nor do genes have self-reproduction per se as their natural function. For example, suppose a wealthy industrialist decides against having children, but instead builds several large factories producing tons of DNA that duplicates his own genes. He has these frozen, loaded into canisters, and launched into deep space. Have the industrialist's genes fulfilled their natural functions? Far from it. The natural end of reproduction is the reproduction of a form of life, not of a quantity of chemical compound. Survivalism is based on a confusion between survival's being the ultimate function of some feature and survival's being the ultimate value for human choice. If Darwinism is true, then all functions can be explained in terms of natural selection. This entails that all functional features have, as their ultimate function, the function of contributing to the reproduction of some biological kind. 
The heart has the function of pumping blood, and it also has the ultimate function of enabling us to survive and reproduce. However, the existence of this ultimate function does not annihilate the reality of the proximate function. It would be wrong to say that the function of the heart is to ensure reproduction, and not to pump the blood. Similarly, our capacity to love our families and friends presumably enhances our reproductive fitness, but these functions are fulfilled only when we really do love our friends and families, and not when we cynically use them to maximize our chances for reproduction. To enjoy eudaemonia, we must fulfill our proximate functions as well as our ultimate ones, and the fulfillment of proximate functions can be, from the perspective of choice, every bit as much an ultimate value as is reproduction itself.


Moore and the Indefinability of Good

In Principia Ethica (Moore (1922)), G. E. Moore argues that our idea or notion of good is indefinable. He applies a very high standard for a correct definition of a concept: the definiens must be inter-substitutable salva veritate in any cognitive context. If the good were definable, then the definition must have exactly the same cognitive significance as the concept of good itself. Moore suggests that the idea of a horse is definable: that there is some ensemble of descriptions of parts and their properties and relations that would be cognitively equivalent to the concept horse. This seems doubtful. Given any such complex description, it seems one could always meaningfully ask, yes, but is every such thing a horse? It would seem that Moore's 'Open Question' criterion sets an impossibly high standard.

Moore's account of the naturalistic fallacy depended on a particular theory of intentionality with respect to properties or qualities. Moore believed that, whenever we think of a quality, one of two conditions must be met: (1) we have direct acquaintance with that quality, or (2) we have a definition of the quality in terms of qualities with which we have direct acquaintance. Moore assumed that knowledge by acquaintance consists in having the quality itself (in toto) immediately present to the mind, perhaps via the presence of a mental datum instantiating the quality, or perhaps via the bare presence of the uninstantiated universal. In either case, Moore's argument depends on assuming that the presence-of-x-in-the-mind context is a fully extensional context. In other words, if x = y, and x is fully present to the mind, then so is y. Let M(Q) represent the mental state-type of having the quality Q present to the mind. Moore's assumption (which we may call the extensionalist fallacy) is that if P = Q, then M(P) = M(Q). This then provides us with a clear criterion for the distinctness or non-identity of qualities. To prove that P ≠ Q, all we need to do is prove that M(P) ≠ M(Q). How do we establish the non-identity of mental state-types such as M(Q) and M(P)? The observation of a difference between the two state-types via introspection would be sufficient. Thus, if we can distinguish M(P) from M(Q) in introspection, this is sufficient (for Moore) to establish that P ≠ Q. It is relatively easy for Moore to then establish the indefinability of good, since the simple idea of good is easily distinguished, in introspection, from any proposed definiens.

Moore's theory of intentionality is not an especially attractive one. The notion of the bare presence of a quality to the mind is a fairly mysterious one. The account of intentionality developed in chapter 14 avoids such mysterious postulations. In any event, why should we suppose that the presence-of-x context is an extensional one? Couldn't a quality be present to the mind, but always in some guise or under some mode of presentation or other? Couldn't there be two distinct intentional state-types, M_1 and M_2, each of which had, as its immediate object, the same quality Q? If so, demonstrating the distinctness of M_1 and M_2 (where, say, M_1 involves a simple notion while M_2 involves a complex one) would have no bearing on whether they denoted the same quality. Hence, introspection alone provides no criterion for discovering the ontological simplicity or complexity of real qualities.

If the Aristotelian account of teleofunctions developed in chapter 12 is correct, then goodness is definable (complex). The primary use of good would be in specifying what is good for this or that organism, and something would be good for an organism just in case it is or leads to the fulfillment of the organism's



teleofunctions. On the Wrightian account, the teleofunctions of an organism can be specified purely in causal terms, without reference to what is good for the organism. Hence, in this case, a non-circular account of the nature of good could be given. The definiens would not be cognitively equivalent to the concept of goodness, but it would tell us all there is to know about what goodness is, as a feature of the world.


Moral Theory as the Teleology of Character

20.1 Virtue as Both Means and End

Moral virtue is the disposition to fulfill those of one's teleofunctions concerned with choice and intentional action. As such, the exercise of moral virtue is an important component of eudaemonia. The primary teleofunctions of the will are included in the set of primary teleofunctions whose fulfillment constitutes eudaemonia. At the same time, the exercise of moral virtue is a reliable means for the attainment of the whole of eudaemonia. The system of primary teleofunctions of a human being forms an approximately coherent whole: fulfilling any subset of the primary teleofunctions assists in (or at least, does not detract from) the fulfillment of the others. Moral virtue is no exception. Although acting virtuously may involve some short-term cost in terms of the fulfillment of other functions, and although in exceptional cases this cost may be permanent and large, still, for the most part, acting virtuously leads in the long term to something close to complete eudaemonia.


Eudaemonism versus Egoism

Although each organism has, as its ultimate end, the achievement of its own state of eudaemonia, Aristotelian eudaemonism [1] is not a form of egoism (either ethical or psychological), since one human being's eudaemonia always includes the fulfillment of the eudaemonia of certain others as one of its most important constituents. A parent's primary functions can be fulfilled only if her children live and flourish. The primary functions of a human being include the capacity for true friendship (in the sense of Aristotle's Nicomachean Ethics, Book X).

[1] In contrast to hedonistic forms of eudaemonism, such as that of Epicurus.




The parent does not use the child's eudaemonia as a means to the end of her own eudaemonia: instead, the child's eudaemonia is an ultimate end for the parent, as part of the parent's eudaemonia. Similarly, true friends do not use each other as means. Instead, the constitution of eudaemonia for each friend expands to include the eudaemonia of the other.


Is and Ought

As Kant proposed, morality contains categorical imperatives. On a teleological account, these imperatives are rooted in our human nature. Every human being, as such, is ordered to the fulfillment of the human form of eudaemonia as the ultimate end, and it is the ineluctability of eudaemonia as our end, together with the inclusion of moral action within eudaemonia, that gives moral requirements their categorical nature. We can, therefore, derive 'ought' from 'is', so long as the 'is' includes the specification of the teleological structure of the agent. Oughts are, in fact, a part of what is. For example, consider the following argument:

1. Human beings necessarily pursue eudaemonia (this pursuit is constitutive of their being human, and so of their being simpliciter).
2. Mary is a human being.
3. Therefore, Mary necessarily pursues eudaemonia.
4. Moral wisdom is a necessary condition of attaining eudaemonia.
5. Therefore, Mary ought to attain moral wisdom.

The premises are all factual. The conclusion involves an all-in, categorical sense of 'ought', since none of the premises concerns contingent facts about Mary's peculiar goals or interests. [2] Instead, they are concerned only with the concern that is constitutive of Mary's very being, namely, the pursuit of eudaemonia.

Am I not assuming, without basis, that we ought to fulfill our human nature, or that we ought to aim at so doing? I am asserting that we ought to fulfill our human nature, but not that we ought to aim at so doing. That we ought to fulfill our natures is tautologous. 'Oughts' apply only to things with teleofunctions, and something ought to be the case for such an organism just in case it involves the fulfillment of as many of the primary functions of the organism as possible. It makes sense to say that one ought to do something only if it is possible not to do so. We cannot say that we ought to travel slower than the speed of light, if it is impossible for us to do otherwise. Similarly, since it is impossible for us not to aim at eudaemonia, it makes no sense to say that we ought to aim at it. Since it is possible for humans to be confused or ignorant about what constitutes eudaemonia, it does make sense to say that we ought to have eudaemonia (or more precisely, an accurate representation of eudaemonia) in mind. We ought to make our decisions in light of an explicit and accurate representation of eudaemonia as a goal. In other words, we ought to act with practical wisdom.

[2] Perhaps this is not all that Kant intended in his use of the word categorical. I am not claiming that the practical necessities of achieving eudaemonia are Kantian categorical imperatives, only that they possess all the categoricality that is really needed in moral 'oughts'.


Sociobiology, Game Theory, and Species Relativity

The science of human sociobiology can shed considerable light on the matter of morality by revealing which features of human behavior and social interaction are adaptive, that is, have in fact contributed to the reproductive fitness of their carriers. As I have argued earlier, it is not the gene (in the sense of a DNA molecule) whose "selfishness" provides the basis for the definition of adaptation, but rather the "selfishness" of a form of life, including certain patterns of behavior. These "selfish" forms of behavior, that is, forms of behavior that successfully cause their own replication, include purely altruistic action, such as the love of one's kin or kin-surrogates, and the genuine reciprocity of friendship.

Some moral virtues are concerned with the management of conflict and the generation of cooperation. These are virtues of fairness. The mathematical theory of games provides a very illuminating way of analyzing social interaction in terms of strategic equilibria. A strategic equilibrium is a distribution of behavioral patterns throughout a population that is self-sustaining. Under favorable conditions, equal distributions of benefits constitute a uniquely stable equilibrium (see, for example, Brian Skyrms's "Sex and Justice" (Skyrms (1994)) or my "Gauthier and the Rationality of Justice" (Koons (1994b))). Thus, game theory can explain how the exercise of fairness can be adaptive and, hence, functional.

Once fairness becomes functional, it becomes part of the end, and not merely a means. Once human life acquired fairness as one of its functions, fairness became part of that which is reproduced, and not merely a factor contributing to the reproduction of something wholly distinct from it. Therefore, it is perfectly rational to be fair, even when fairness does not carry any collateral advantages, since the exercise of fairness is itself an advantage.
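
The idea of a self-sustaining distribution of behavioral patterns can be illustrated with a toy simulation. What follows is a minimal sketch under stated assumptions, not a reconstruction of any model in the works cited: it assumes a "divide the cake" bargaining game with three possible demands, in which a player receives her demand only when the two demands together do not exceed the whole, and it assumes a simple discrete replicator update in which each strategy's share grows in proportion to its average payoff.

```python
from fractions import Fraction
from typing import List

# Assumption for illustration: a three-strategy "divide the cake" game.
# Each strategy is a fixed demand for a share of one unit of benefit.
DEMANDS = [Fraction(1, 3), Fraction(1, 2), Fraction(2, 3)]


def payoff(mine: Fraction, theirs: Fraction) -> float:
    """A player receives her demand only if the two demands are compatible."""
    return float(mine) if mine + theirs <= 1 else 0.0


def step(shares: List[float]) -> List[float]:
    """One generation of discrete replicator dynamics."""
    fitness = [
        sum(share * payoff(mine, theirs) for theirs, share in zip(DEMANDS, shares))
        for mine in DEMANDS
    ]
    average = sum(s * f for s, f in zip(shares, fitness))
    return [s * f / average for s, f in zip(shares, fitness)]


def evolve(shares: List[float], generations: int) -> List[float]:
    """Apply the replicator update repeatedly."""
    for _ in range(generations):
        shares = step(shares)
    return shares


# Start with the three demands equally common in the population.
final = evolve([1 / 3, 1 / 3, 1 / 3], 200)
print([round(s, 3) for s in final])
```

Starting from an even mix, the equal-split demand comes to dominate the population in this sketch. The outcome depends on the starting distribution, which is one way of reading the qualification "under favorable conditions."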
Since human morality makes reference to those teleofunctions of character that are actually realized in human life, the principles of morality pertain only to our own species. Different standards of morality apply to other animals who engage in decision making, if there are any. Does the specific nature of morality somehow undermine its objectivity, or suggest that commonsense morality is (as Michael Ruse and E. O. Wilson have described it) an "illusion foisted on us by evolution"? Certainly not. A competent Martian anthropologist should evaluate the morality of human actions and characters exactly as we do, and we should evaluate the morality of Martian characters as they do, in each case applying the appropriate set of standards.



Game-theoretical considerations suggest that many moral principles will be, at some level of abstraction, universal, or nearly so. It is difficult to imagine a stable form of social life lacking any notion of fairness or loyalty, or in which cruelty is an end in itself.


Elements of a Teleo-Ethological Morality

An Aristotelian morality of the kind sketched above will include many virtues that are self-regarding and that have little or no relation to fairness and rights. For example, many of the virtues discussed by Aristotle and by Hume will clearly fall within the scope of a teleo-ethological morality: not just fairness, reciprocity, and compassion, but also courage, temperance, and resilience, as well as humor, gaiety, industriousness, and fidelity. Love of one's children, and fidelity to one's spouse, will play a central role in moral theory, and will not have to be relegated to a footnote or somehow deduced from a wholly disinterested love of humanity in general. A proper degree of self-love and of persistence in one's own commitments and projects will be recognized as more than merely permissible. The study of punishment and guilt falls within the science of secondary functions, functions whose operation presupposes the failure of some primary function. Punishment plays an important role in sustaining the practice of fairness, respect for others, and peacefulness, since in the absence of punishment, the relative cost of virtue can climb to excessive heights. A willingness to bear a fair share of the burden of punishing wrongdoers is itself an important component of social responsibility. Guilt can be thought of as a form of preemptive self-punishment, reducing, through its obvious presence, the need for costly punishment transactions. The artificial virtues of justice and of good manners are rooted in the social teleofunctions of human nature. Humans have a need to be rooted in a culture and a history and to internalize the specific norms and standards of that culture. Moreover, as social animals, humans require customs that regulate cooperation and conflict with a considerable degree of precision and clarity. 
So long as these specific norms are consonant with the general requirements of human nature, the pursuit of artificial virtue is an indispensable part of the pursuit of natural virtue.


Politics and the Natural Law

Since human beings are, by nature, political animals, the operation of the state fulfills certain teleofunctions. A just, well-ordered state is both a means and an end in itself. A just state maximizes the chances for the fulfillment of other, nonpolitical teleofunctions, and participation in a just state is itself a component of eudaemonia. Consequently, a purely instrumental view of the state, as in the political theories of Hobbes or of Bentham, is fundamentally mistaken. The state



does not exist merely to secure peace or maximize pleasure: instead, the proper functioning of the state is in itself an indispensable part of human welfare. [3] The state has certain natural functions, such as the regulation of conflict and the punishment of wrongdoers. The requirements of these natural functions constitute the basis for natural law. Far from being "nonsense on stilts" (as Bentham labeled it), natural law and natural rights are the consequence of the teleological structure of human life, given our natural sociability.

An application of this theory solves an outstanding problem for rule consequentialism (of the sort defended by Mill in On Liberty, for example), namely, how to justify the rationality of following the rule in instances where doing so causes a net loss in the ultimate value for which the rule exists. If the status of the rule or institution is merely instrumental, then the problem is insoluble: it is irrational to pursue something as the means to an end when one knows that doing so frustrates the end in question. However, from a teleological point of view, we can see that certain rules or institutions (establishing certain rights, for example) exist because they further certain ends, where the 'because' here is causal, and at the same time we can recognize that following these rules has value as an end in itself, as partly constitutive of our eudaemonia as citizens. Once we recognize that such rule-following has value in itself, it is rational to forgo other values (even those values explaining the existence of the rule) in order to obtain the value of political justice through conforming to right rules.


The Incoherency of Legal Positivism

As I noted in chapter 15, there is an analogy between legal positivism (as represented by John Austin, H. L. A. Hart, and Hans Kelsen) and a certain kind of conventionalism about logic and mathematics (which I called immanent-basis conventionalism). In both cases, there is an attempt to ground normativity (in the form of rules with definite content) exclusively in social practices, without reference to anything transcendent (logical and mathematical facts in the one case, principles of natural law in the other). In both cases, a vicious regress threatens the coherency of the project. The problem can be seen quite clearly by examining the legal positivism of Kelsen (1967). Kelsen recognizes that not every pattern of behavior, and not even every pattern that is coercively enforced, counts as a legally valid rule. There must be meta-rules, norms of legal validity or recognition, that bestow legal validity upon the law. So, for example, in Britain the principal norm of recognition consists of the principle that a law is whatever has been passed by Parliament. In the United States, a statute is recognized as federal law when it has been passed by both houses of Congress and signed by the President, or passed by a supermajority in both houses overriding a Presidential veto, so long as the statute has not been declared unconstitutional by the federal courts. These norms are, Kelsen recognizes, themselves rules of law. Their validity must be grounded in some yet deeper norm. Ultimately, we reach what Kelsen called

3 See Arnhart's recent book (Arnhart (1998)) for a fuller development of this theme.



the Grundnorm of the legal system, a rule whose validity is somehow given independently of other norms. In the analysis of the Grundnorm, Kelsen faced a dilemma. Is the Grundnorm itself a valid rule of law, or a raw exercise of power? Qua raw exercise of power, the Grundnorm has no legal validity, and so cannot convey any such validity to any other rule. The very distinction between the validity of rules in the system and the invalidity of patterns outside it comes crashing down. However, qua legal rule, the Grundnorm must derive its legal validity from some outside source. By hypothesis, the social practices in play provide no more fundamental basis than that provided by the Grundnorm. Hence, there must be some principle of natural law that bestows upon the Grundnorm whatever legal validity it has.


Justice toward Future Generations

A sharp difference between a teleo-ethological theory of justice and various social-contract and libertarian theories, such as those of Rawls, Nozick, or Gauthier, emerges quite clearly when we examine the problem of justice toward future generations. Suppose that we are contemplating two policies, A and B. The two policies will have profound effects on the welfare of people living one hundred years from now. If we follow A, the welfare of the present generation is maximized, but the people alive 200 years from now will live lives that are just barely worthwhile. If we follow B, the welfare of the present generation would be slightly lower, but future generations, including those 200 years in the future, will live lives of comparable value. Let G(A) be the set of people who would be alive 200 years from now if we now adopt policy A, and let G(B) be the set of people who would be alive then if we adopt policy B. The differences in the course of history depending on which policy is followed will be profound: who marries whom, and when and under what circumstances they conceive and bear children, will depend on which policy we adopt. Let us suppose, as seems plausible, that the sets G(A) and G(B) are disjoint: the populations that would exist under the two different policies have no common members. In this case, it is very hard for a social-contract or a rights-based theory to say that there is anything wrong with policy A, since no one is injured or wronged by pursuing it. No actual present or future person would be better off if policy B had been adopted instead: actual present people would be worse off, and actual future people would not exist at all. If the lives of the actual future population are worthwhile, we might even say that they would have been harmed had policy B been adopted instead. Moreover, social-contract theories are typically stymied by considerations of our asymmetrical relations to our future progeny, since there is no possibility of quid pro quo.
In social-contract theories from Hobbes to Gauthier, the existence of mutual vulnerability and the possibility of mutual advantage lie at the heart of the concept of justice. This mutuality is notoriously absent in dealing with



the stakes of future generations. The situation looks entirely different from a teleo-ethological point of view. Natural selection would certainly prefer populations whose characters include a profound concern for the prosperity of their posterity. The love of a parent for her children and grandchildren, and the collective concern of communities for their long-term futures, would lie at the very heart of moral theory, rather than being something attached as an afterthought in the third appendix, as is typical in modern ethical theory.


Kierkegaard and the Teleological Suspension of the Ethical

If all teleology has a Darwinian foundation (as per the thin theory of teleology), then all human beings share the same teleofunctions, since our divergence from a common ancestral population is too recent for significant differences at the level of function to have arisen. Even if that divergence were much earlier, an exclusively Darwinian account of teleology entails that, for each individual, there is some population that shares all of its teleofunctions. This is so because Darwinism operates exclusively at the level of populations. Darwinism can explain nothing about an individual except by referring it to some population. However, if there are fundamental, extra-Darwinian instances of teleological connection, there is the possibility of radically individualized teloi. It would be possible for an individual human being to have a teleofunction that cannot be derived from any teleofunctions shared by other individuals. This could involve the existence of a higher-order causal law that makes reference to the individual in question, a personalized causal law, in other words. Alternatively, it could be that each individual is, qua that very individual, the product of a one-off intelligent design by a creator, in which the intentions of that creator would have a unique kind of authority over that individual (as Kierkegaard believed). If we identify ethics with the study of the teleofunctions that are common to us as humans, or even with the study of those teleofunctions that are common to us as members of a given culture, a Sittlichkeit, then the possibility of individualized teloi would open up the possibility of the teleological suspension of the ethical. That is, the fulfillment of an individual's eudaemonia might require what would be in other humans the violation of some teleofunction. Kierkegaard (1985) imagines just such a possibility in Fear and Trembling in describing Abraham's willingness to sacrifice his son Isaac. 
Once again, if we identify reason with the fulfillment of those cognitive functions that are common to the most specific relevant population of humans, then the possibility of individualized eudaemonia would entail the possibility of nonrational, even anti-rational, knowledge, which we could call "faith."4
4 For a discussion of the rationality of relying upon such non-rational sources of knowledge, see my article "Faith, Probability, and Infinite Passion" (Koons (1993)).



A concrete application of Kierkegaard's ideas can be seen in Dietrich Bonhoeffer's decision to participate in the plot to assassinate Hitler. In his Ethics (Bonhoeffer (1955)), Bonhoeffer develops a Kierkegaardian moral theology in some detail, and in the decision to take part in the plot, Bonhoeffer put this theology into practice. According to Bonhoeffer, the decision to attempt to take Hitler's life cannot be justified by recourse to any set of rules derivable from general features of human life. Like Abraham in his decision to prepare Isaac for sacrifice, the plotters had to be prepared to act as "responsible" men, responding to an evident calling specific to them and their concrete circumstances, without justification that appeals to timeless or general principles. The radical self-sacrifice of heroes and saints, such as Gandhi, Martin Luther King, Jr., or Mother Teresa, provides additional (and non-lethal) examples of the same kind of individualized calling or vocation that cannot be reduced to a pursuit of eudaemonia, generically considered. Surprisingly enough, the possibility of such extra-rational and extra-ethical functions is amenable to scientific investigation, and Kierkegaard himself points the way. It is quite possible that a number of generally human teleofunctions presuppose the existence, in each case, of functional elements that are radically individualized. That is, it may be that we have, as humans, a general need to transcend the general. In The Sickness unto Death (Kierkegaard (1989)), Kierkegaard describes this interplay between the general and the individualized as the need for both freedom and necessity, or both possibility and necessity. We need a structure of general teleofunctions to provide unity to our lives, especially diachronic unity (our need for "necessity").
At the same time, we cannot bear to sacrifice our individuality utterly in the pursuit of a rigid and unvarying ideal of human eudaemonia (hence, our need for "freedom" or "possibility"). According to Kierkegaard, radical evil can be described as the perverse fulfillment of our need for freedom. Evil involves the wholesale rejection of anything independent of my will as the objective and ineluctable end of choice. Evil seeks freedom from the constraint of generalized eudaemonia, not by apprehending an individualized eudaemonia, but in rebelling against eudaemonia itself. This rebellion is impossible, since even in rebelling against the pursuit of eudaemonia, we are acting in pursuit of one aspect of it, namely, the need for freedom. Carried to its ultimate conclusion, evil leads to the annihilation of personhood, since the will cannot bind its own future decisions, and so the unity of the person through time that is provided by the pursuit of a single end is lost. Kierkegaard's conception of evil or "the demonic" prefigures Nietzsche's ideal of the creation of new values and Sartre's analysis of humans as "condemned to be free" (Sartre (1956)).


A Coherent Realism Is a Comprehensive Realism

21.1 The Four Waves of Anti-Realism

A comprehensive form of realism, as exemplified by Plato, Aristotle, and Boethius, was the dominant school in Western philosophy from the time of Augustine until that of Scotus. Today, a comprehensive form of anti-realism, as exemplified by Rorty, Foucault, or Derrida, is at or near dominance in the academy. The transition from the first state to the last took place in four great waves: Occam, Bacon, Hume and the post-modernists. These waves correspond to the dismantling, one by one, of Aristotle's four causes: formal, final, efficient, and material.1 Nominalists such as Occam rejected the real existence of properties, types, and other universals. All that exists is individual: all predicates and other general terms refer distributively to their many satisfiers, not to a single universal entity. Thus, nominalists denied the reality of Aristotle's formal cause: form as such does not exist. Although it took several hundred years for this conclusion to be explicitly drawn, it follows from the rejection of form that there can exist no real final causes. Final causation implies a real relationship between an individual and a form that is only partially or imperfectly realized in the present state of that individual. If forms are unreal, so are such relationships. Descartes, Bacon, and Galileo urged that final causation be banished from natural philosophy. This was to some extent justified by the over-reliance of Aristotelians on final causation, especially in physics. Moreover, the concentration of scientific research on matters of efficient causation undoubtedly contributed to the rapid growth of physical and chemical sciences in the early
1 Students of Richard M. Weaver will recognize the influence of his analysis of modern history in Ideas Have Consequences (Weaver (1948)).




modern period. However, the banishment of final causation to the realms of a priori psychology and revealed theology was unjustified and has done great harm to both philosophy and science. Bacon and Descartes did not deny the existence of final causation absolutely, but they denied its existence within nature. All final causation was made dependent on the intentions of conscious agents, whether human or divine. Anything that is not a human artifact could have a proper function only by reference to the design intentions of God. This identification of final causation with divine intention led to the subsequent confusion by many of teleological explanation with the attribution of perfection or optimization. Once final causation was relegated to revealed theology, it was inevitable that a Hume would appear, who would attempt a thoroughly non-teleological account of the human mind. Epistemology thus became the study of the operations of the human mind, without reference to the proper functions of the human faculties. As Hume so clearly saw, the operationalist empiricism that results undermines the rationality of induction and renders causal connection inaccessible. Consequently, the third of Aristotle's causes, the efficient cause, went under. Kant attempted to minimize the damage of this loss by making causation an unavoidable projection of the finite understanding, rather than the accidental result of associations in this or that individual human being. With Hume and Kant representing the two alternative poles, one of individualistic subjectivism, and the other of universal, inter-subjective anti-realism, modern philosophy has sought out many devices for reconstructing epistemology and ethics without the use of either final or efficient causation, without notable success. Post-modernism has been the natural response to the evident failures of modern philosophy.
Without final or efficient causation to tie human ideas to objective reality, the materialistic story of modern scientific philosophy becomes merely one story among many equally legitimate alternatives. Since truth is impossible, reason becomes optional. Post-modernism will turn out, I believe, to be a transitional episode, and not a permanent condition. The absolute indifference to intellectual discipline that post-modernism fosters will inevitably provoke a reaction in the opposite direction. Indeed, the reaction has already begun, as evidenced by the Australian materialism of David Armstrong, Frank Jackson, and others, and the teleological naturalism of Millikan, Dretske, and Papineau. A coherent and viable alternative to the failures of modern philosophy and the vacuity of postmodernism must, and I think will, be built on the restoration of all four of Aristotle's causes. By recognizing that our cognitive faculties are objectively ordered to the end of truth, and by recognizing that universal types are every bit as real as particular instances, we can successfully depend on the possibility of both truth and knowledge. Moreover, since our volitional faculties are also objectively ordered to a systematic end, human eudaemonia, we can close the infamous fact/value gap and restore ethics to its rightful place among the sciences.




A Prolegomenon to Any Future Critique of Metaphysics

Since the work of Hume and Kant, the work of metaphysics has taken place under a cloud of suspicion. Empiricists and positivists have held metaphysics to be unscientific because it postulates entities, causal connections, substances, universals, numbers, etc., that are not directly verifiable by the senses. Metaphysicians have not been alone in this predicament. Scientists who insist on interpreting the theoretical entities of science realistically fall under the same suspicion. Locke was skeptical not only about scholastic metaphysics, but also about Newton's mechanics, and van Fraassen rejects not only universals and causal connections, but also electrons and magnetic fields. The central dogma underlying the positivist critique of metaphysics is the privileged status of sense perception. Whatever can be justified can be justified (according to the positivist) in terms of sense perception, or sense perception plus deductive logic. The positivists owe the rest of us an explanation of why we should grant this exclusive privilege to one or two modes of knowing, at the expense of all others. The basis for the privileging of sense perception seems to be the matter of reliability. There are two reasons for thinking that our knowledge of our own sensory surface stimulations (to use Quine's phrase) is more reliable than our knowledge of other facts: causal distance and inferential distance. The process conveying information to me from a rock or an electron is much longer than the process conveying information to me from the immediate environment of my sense organs. Similarly, the process carrying information to my own sense organs is much shorter than the process by which natural selection conveys information to the innate structures of my natural kind. A longer process is more susceptible to malfunction, ceteris paribus. Hence, the shorter process is more reliable.
Similarly, any knowledge gained by inference from sensory knowledge involves additional steps, during which additional errors can occur. However, ceteris is not always paribus. As Fred Dretske has pointed out, our knowledge of distal facts is often much more reliable than our knowledge of proximal stimulations. I am much better at learning the pattern of the distribution of furniture in my office than I am at learning the pattern of stimulation of my retina. My innate knowledge of arithmetic is more reliable still, and much of our inferential knowledge, for instance, our knowledge of the power of gravity, is more reliably formed than our knowledge of the results of any single experiment. Where positivists and empiricists are right is in insisting that there be the possibility of some kind of causal connection, direct or indirect, between us and any postulated entity. In the absence of such a causal connection, there can be no reliability, and where there is no reliability, there can be no knowledge. Where they are wrong is in limiting this causal connection to the five senses. A philosopher who is empirical in spirit rejects a priori certitudes in philosophy as bad methodology. This must include rejecting the a priori certitudes of empiricism.




Causalism, Yes! Materialism, No!

On the questions of the philosophy of mind, analytic philosophers have tended to divide into two camps: the naturalists and the mysterians. The naturalists hold to some form of the mind/brain identity thesis, insisting that all the facts there are can be accommodated within materialism. The mysterians insist, to the contrary, that there is a subjective, introspectible aspect to consciousness, and, perhaps, that there is a phenomenon of basic and underived intentionality that cannot be accounted for by materialistic theories. I am by and large sympathetic to the strategy undertaken by contemporary naturalists: (1) to explain the phenomenal character of conscious experience in terms of intentionality, (2) to explain intentionality by means of information and proper function, and (3) to give a causal account of both information and proper function. This strategy, however, is available to non-materialists as well. The resulting account of the mind is better termed the "causalist theory of the mind," rather than the "materialist theory of the mind." A non-materialist causalism has one major advantage over the materialist account of the mind: explaining the causal efficacy of mental states. On the causalist account, a token mental state consists of two parts: one supporting certain physical and physiological types relating to the central nervous system, and one supporting higher-order causal connections between those states and their intrinsic purposes (the carrying of information or the execution of behavior). Since a materialist is committed to the dogma that only spatiotemporally located tokens can be causally efficacious, he must hold that only the physical component of a mental state causes behavior; the teleofunctional component is causally otiose. 
On the causalist account, in contrast, higher order tokens can interact with tokens whose functions are themselves higher-order in nature, i.e., second-order tokens can be causally efficacious in interactions with third-order tokens. In any event, there are several good reasons for rejecting materialism that are entirely independent of the issues in the philosophy of mind:
The causal efficacy of modal facts, and, through them, of logical and mathematical facts (chapter 15).
The constructibility of metrical spacetime as a simple approximation to the qualitative relations determined by the causal network (section 4.10.2).
The existence of an extra-spatial first cause of the cosmos (chapter 8).
The existence of a cause of the simplicity of the causal structures underlying many observable phenomena, as required for a realistic interpretation of scientific theory (section 17.5).
The refutation of the principles of materialistic compositionality and causal locality by the failure of Bell's inequalities for quantum phenomena (section 18.5).



In part I, I demonstrated that modal and causal facts that are themselves not spatially or temporally located can act as causes of concrete events. In chapter 12, I used higher-order causation to give an account of the teleological connections we find throughout the biological and human worlds. In chapter 15, I argued that such an account of higher-order causation is needed if we are to have an account of the possibility of logical and mathematical thought, in particular, in order to solve Benacerraf's problem of the indeterminacy of reference in mathematics. These non-spatiotemporal facts are counter-examples to materialism, which is committed to the view that only entities located in space and time can be causally efficacious. In section 4.10.2 of part I, I indicated that a mereotopological approach to qualitative, commonsense spatial and temporal relations could be based on the theory of causation. I further suggested that the metrical spacetime of physical theory is based on a much simplified picture of these qualitative spatiotemporal relations. Consequently, it is unreasonable to assume that spacetime is a universal receptacle into which all situation-tokens must be fit. The most we can ask of the spacetime of physics is that it provide a very simple, useful framework into which many tokens and their relations can be fit with at least approximate success. In chapter 8, I argued that considerations based on the apparent universality of causation should lead us to the conclusion that there is a necessary fact that is the uncaused first cause of all wholly contingent states. This necessary fact most probably involves no entities that are material or spatiotemporal in character, since any fact involving such entities would be at least partially contingent.
In section 17.5 and in a forthcoming article (Koons (2000)), I argued that a realistic interpretation of scientific theory depends on the objective reliability of our inductive methods, including our preference for simplicity. This objective reliability in turn depends on the existence of a cause that explains why many observable phenomena are the product of relatively simple causal structures. This cause of the uniform simplicity of observable phenomena must itself be non-physical in nature. Finally, I have argued in chapter 18 that the failure of Bell's inequalities strengthens the case for rejecting the ontology of materialism. The attractiveness and plausibility of materialism depends on the principle of materialistic compositionality: the thesis that any fact about any composite entity can be explained (without remainder) in terms of intrinsic facts about its parts and their spatiotemporal relations. Materialistic compositionality is analogous to compositionality in linguistics. The meaning of complex expressions in a compositional language can be deduced from the intrinsic meaning of the parts of the expression, together with their spatiotemporal relations. This means that the meanings of complex expressions are never strongly emergent, so we do not have to resort to some non-recursive faculty of interpretation to explain our understanding of novel sentences. Similarly, materialistic compositionality means that we can explain any novel physical phenomenon on the basis of a classification of its parts by their intrinsic characters, an account of their spatiotemporal relations, and the



use of a finite number of functions. The connection between materialistic compositionality and materialism lies in the assumption that the only relations that matter are spatiotemporal relations. If we permit other relations between parts to figure in our canonical explanations, we open the door to ghostly and mysterious relations, like being neurons in the relation that carries the intrinsic meaning red. Both Democritean atomism and Einsteinian field theory satisfy the principle of materialistic compositionality. In the case of field theory, the number of parts of a material object is non-denumerable, since each point in spacetime constitutes a part of the field. However, given the field strengths at each point and the spatiotemporal relations between the points, we have all we need to explain every physical phenomenon, according to general relativity. However, materialistic compositionality is demonstrably false, thanks to the falsification of the Bell inequalities by quantum results. The failure of the Bell inequalities leaves us with only four options:
1. Reject the thesis that quantum objects have spatiotemporal relations (the Copenhagen interpretation).
2. Allow for superluminal influences (pilot waves or Everett-style worldsplitting).
3. Allow for backward causation (the influence of present experimental settings on past events).
4. Recognize the existence of irreducibly non-spatiotemporal relations among distant physical parts (holism).
Of these, only (2) and (3) are compatible with materialistic compositionality. Both (1) and (4) explicitly contradict compositionality: (1) because it denies that all of the parts of a physical system have spatiotemporal relations to one another, and (4) because it requires that relations other than spatiotemporal ones play an irreducibly real role in physical causation. Options (2) and (3), however, still undermine materialism, since they entail that no two physical systems can be causally isolated from each other.
This means that the resolution of a complex phenomenon into simple parts gets us no closer to a complete explanation of the phenomenon, since each of those parts can interact with an unlimited number of remote factors. The Bell inequalities force us to recognize strongly emergent properties in each classical system, that is, properties that do not supervene on the intrinsic characters (stable states) plus the spatiotemporal relations (if any) of their quantum-level parts. If we adopt, as seems most reasonable, some variant of the Copenhagen interpretation (such as Heisenberg's), we are left with the conclusion that position and velocity themselves are such strongly emergent properties. Our quantum-level parts have no position, although we may attribute something like position to them when they interact with a classical measurement system. This attribution of intermittent position to quantum particles should be thought



of as only analogous to the attribution of position to the measurement device itself. Quantum-level phenomena are unimaginably strange. The principle of no-action-at-a-distance simply does not apply to them, since they do not really stand at a distance from one another or from us. My own actual position in space is not determined by any feature of my quantum-level parts, since compounding quantum systems yields only more complex probabilistic wave functions, never definite spatiotemporal properties. Consequently, there is no reason to assume that my psychological state is determined by the physical features of my body's parts. Since the physical world is itself divided, home to strongly emergent properties, respect for physics provides no reason to rule out the possibility of properties that are strongly emergent relative to both the quantum-level and classical-level physical attributes of the body. I have discussed strict materialism, rather than "physicalism," because physicalism is too vague and indeterminate a doctrine to serve as the topic of a coherent philosophical discussion. If "physicalism" means that science will eventually be unified, with a single set of laws and concepts, then no one today defends such a doctrine (to my knowledge). If physicalism means that everything that exists and every actual causal explanation has a "complete" description in the "ideal physics of the future," then the doctrine is so vague as to be essentially meaningless. Does this doctrine entail the spatio-temporal contiguity of causes and effects, or mereological compositionality? Does it exclude the existence of an uncaused first cause? Who knows? Who can tell? The history of physics over the last one hundred years gives us reason to believe that the physics of the future will be unimaginably different from today's theories. 
If "physicalism" means that a complete and sound description of all causal connections can be given in terms of today's best physical theories, the doctrine is, besides being wildly implausible, still seriously underdefined, since today's physics includes quantum mechanics, which is subject to a wide variety of metaphysical interpretations. Metaphysical theory is radically underdetermined by an austere mathematical formalism such as quantum mechanics. For these reasons, instead of criticizing physicalism, I have chosen to criticize a precise metaphysical position, that of strict materialism.
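The failure of Bell's inequalities invoked in this section can be made concrete with a small numerical sketch. It assumes the CHSH form of the inequality and the standard singlet-state correlation E(a, b) = -cos(a - b); neither is specified in the text, and the function names and measurement angles are illustrative choices only. The first function enumerates every assignment of definite outcomes (+1 or -1) to the four measurement settings, which is what a locally compositional account would require, and the second evaluates the quantum prediction.

```python
import math
from itertools import product

def classical_chsh_bound():
    """Largest |S| over all assignments of definite outcomes (+1/-1)
    to the four settings A, A', B, B' (an assumed local model)."""
    return max(
        abs(A * B - A * Bp + Ap * B + Ap * Bp)
        for A, Ap, B, Bp in product((-1, 1), repeat=4)
    )

def singlet_correlation(angle_a, angle_b):
    """Quantum correlation for spin measurements on a singlet pair
    along directions at the given angles (in radians)."""
    return -math.cos(angle_a - angle_b)

def chsh_value(a, a_prime, b, b_prime, correlation):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return (correlation(a, b) - correlation(a, b_prime)
            + correlation(a_prime, b) + correlation(a_prime, b_prime))

# Illustrative measurement angles that maximize the quantum violation.
a, a_prime = 0.0, math.pi / 2
b, b_prime = math.pi / 4, 3 * math.pi / 4

quantum_s = chsh_value(a, a_prime, b, b_prime, singlet_correlation)

print("classical bound on |S|:", classical_chsh_bound())
print("quantum |S|:", abs(quantum_s))
```

On these assumptions every assignment of definite local outcomes keeps |S| at 2, while the quantum correlation reaches 2√2. The sketch shows only the numerical gap; which of the four options listed earlier should be drawn from it is a matter for the argument of the text.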


Anti-Realist Obscurantism

Arguments for anti-realism typically take the following form:
1. It is difficult to account for our epistemic access to facts about X.
2. Therefore, we have no epistemic access to facts about X.
3. Therefore, either (a) there are no facts about X, or (b) the facts about X are merely projections of our own judgments about X, when made under ideal circumstances.



For many domains (ethics, mathematics, theory of universals, causation), premise (1) is clearly true, and the inference from (2) to (3) seems correct. However, the inference from (1) to (2) is clearly a weak point. Anti-realism is simply a strategy for dodging the hard problems of epistemology, for taking by theft what ought to be earned by hard toil (in Russell's phrase). The difficulties referred to in premise (1) are due to inadequacies in our models of causation and of knowledge itself. In this book, I have begun to develop a conception of causation that is flexible enough to accommodate real causal connections between concrete events and timeless conditions, such as modal constraints. I hope that I have at least provided some basis for hope that the difficulties the anti-realist points to are not insurmountable.


Is the Theory Naturalistic?

Naturalism is all the rage these days, so it is natural to wonder whether the theory I have sketched in this book qualifies as naturalistic. There seem to be three characteristics shared by most who consider themselves naturalists:
1. The rejection of a scientifically inaccessible realm of subjectivity; causal relevance as the criterion for knowability.
2. The continuity of philosophical method with the methods of natural science.
3. A physicalist, or at least materialist, ontology.
By these standards, my theory is two-thirds naturalistic, since I fall in line with the first two characteristics. I am very resistant to acknowledging the existence of subjective facts that are accessible only from a first-person perspective. Reality (insofar as we can know it) consists of a causally connected network. The very notion of reliability has no application to an irreducibly first-person mode of knowing, as Wittgenstein argued in Philosophical Investigations. Since knowledge entails reliability, this means that such a concept of first-person knowledge is incoherent. I also follow the same method in philosophy, namely, inference to the best explanation, that characterizes good methodology in science. There may be some difference between my use of this method and that of many philosophical naturalists, since I take the data of philosophy to include more than sensory observation. Non-inferential knowledge of logic, mathematics, and ethics also counts as legitimate data for philosophical theorizing. As I have already made clear, I reject physicalism and materialism. Causal efficacy is not limited to space and time. Modal facts can be causally effective, despite the fact that they are timeless and placeless. Moreover, there are genuine instances of higher-order causation, in which timeless causal facts impinge upon the concrete events of spacetime. Metrical space and time are constructs that merely approximate the richness and complexity of the world's

Coherent Realism


causal structure. The failure of Bell's inequalities in the case of quantum phenomena provides decisive evidence against the principle of metaphysical compositionality that forms the core of the materialist's research program.2

2 For further arguments against the objectionable sort of naturalism, see the forthcoming volume, Naturalism: A Critical Appraisal (Craig and Moreland (2000)), edited by William Lane Craig and J. P. Moreland.


Appendix A

Partiality, Modality, and Conditionals

A.1 Partial Propositional Logics

It is conventional in philosophical logic to refer to both three- and four-valued logics as "partial logics." A three-valued logic recognizes the values true, false, and neither (undetermined). Four-valued logic adds the possible value of both true and false. Three-valued logics are useful in representing partially undefined or ontologically incomplete situations. Four-valued logics enable us to deal both with incomplete and with logically impossible situations. For most purposes, three-valued situation theory will suffice, since all actual and possible situation-tokens bear one of three relations to each situation-type (verify, falsify, or neither). However, there are two reasons for being interested in four-valued logic. First, there are concerns of symmetry and elegance. The four values form a kind of lattice, and in many cases, the logic and semantics of four-valued systems are simpler than those of the corresponding three-valued systems. Second, I am interested in situations that are partial with respect to information about logical necessity. Such situations do not recognize the impossibility of certain logically impossible situations. In order to model such logical impossibility, it is convenient to make use of the fiction of logically impossible (four-valued) tokens. Logically partial situation-tokens will enable us to recognize the causal efficacy of logical necessities, which in turn will make possible a causal theory of logical reference and knowledge (chapter 15). Here again are the three-valued (strong Kleene) truth tables for negation, disjunction, and conjunction.




The corresponding tables for four-valued logic (the Dunn (1976) tables) are as follows:

The unifying idea behind both the strong Kleene and the Dunn tables is simply this: one computes truth-values as one would in classical semantics, except that one separates the determination of truth and of falsehood:
1. ¬φ is (at least) true if φ is (at least) false.
2. ¬φ is (at least) false if φ is (at least) true.
3. φ & ψ is at least true if both φ and ψ are.
4. φ & ψ is at least false if either φ or ψ is.
5. φ ∨ ψ is at least true if either φ or ψ is.
6. φ ∨ ψ is at least false if both φ and ψ are.
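The six clauses can be turned into a short computation. The following is only a sketch: it assumes that each value is encoded as a pair (at least true, at least false), so that T, F, U, and B become the four possible pairs, and it reads the clauses as exhaustive. The names `neg`, `conj`, `disj`, and `table` are illustrative and do not come from the text. Restricting the inputs to T, F, and U yields the strong Kleene tables; allowing B as well yields the Dunn tables.

```python
# Each value is a pair (at_least_true, at_least_false).
# This encoding is an assumption of the sketch, not notation from the text.
T, F, U, B = (True, False), (False, True), (False, False), (True, True)
NAMES = {T: "T", F: "F", U: "U", B: "B"}


def neg(a):
    # Clauses 1 and 2: negation swaps the truth and falsity components.
    return (a[1], a[0])


def conj(a, b):
    # Clauses 3 and 4: true if both are true, false if either is false.
    return (a[0] and b[0], a[1] or b[1])


def disj(a, b):
    # Clauses 5 and 6: true if either is true, false if both are false.
    return (a[0] or b[0], a[1] and b[1])


def table(op, values):
    """Return a binary connective's table as {(row, column): result}."""
    return {(NAMES[a], NAMES[b]): NAMES[op(a, b)] for a in values for b in values}


KLEENE = (T, F, U)      # three-valued case
DUNN = (T, F, U, B)     # four-valued case

if __name__ == "__main__":
    for label, op in (("&", conj), ("v", disj)):
        for a in DUNN:
            print(label, NAMES[a], [NAMES[op(a, b)] for b in DUNN])
```

On this encoding the value B can only arise from an input that is already B, which is why the three-valued tables are simply the restriction of the four-valued ones.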



By 'at least true', I mean either T or B (either true only or both true and false), and by 'at least false' I mean either F or B (either false only or both true and false). A proposition that receives no truth-value from these principles retains the classification U.

The use, within semantic models, of impossible situation-tokens is no more troubling, ontologically speaking, than the use of possible-but-not-actual tokens. The only real tokens are actual tokens. Non-actual tokens, whether possible or not, are merely useful fictions. Modality (possibility, contingency, necessity) pertains primarily to actual situation-types. A paradigmatic modal fact is something like: type φ is possibly instantiated. Since the work of Kripke and Kanger, it is widely recognized that it is useful, in representing the semantics and logic of modal facts, to construct models containing indices that stand for merely possible worlds. Each possible world represents the compossibility (from the point of view of certain worlds) of a set of types. Similarly, in four-valued modal logic, I will use impossible situation-tokens to represent the lack of the impossibility (from the point of view of logically partial situations) of the co-exemplification of certain types.

The classical connectives (negation, conjunction, etc.) are functionally complete with respect to classical two-valued interpretations. Every classical truth-function is definable by means of negation and conjunction alone (and also by means of negation and disjunction alone, as well as several other combinations of classical connectives). The classical connectives are not, however, functionally complete with respect to three-valued or four-valued interpretations. To achieve functional completeness, we would have to add several new, non-classical connectives.
However, as Thijsse (1992) has proved, the classical connectives are functionally complete with respect to an important class of three-valued functions, namely, those that have the properties of truth-functional persistence, classical closure, and freedom. A truth-function is persistent just in case, whenever the truth-values of the inputs are enriched, the truth-value of the output is also enriched. By enriching a truth-value, I mean moving from undefined to one of the other truth values, or moving from one of the classical truth values to the value both true and false. Classical closure requires that whenever the inputs are limited to the classical truth-values, the output is also classical. Finally, the property of freedom entails that whenever the truth-values of the inputs are all undefined, the truth-value of the output must also be undefined.

I have not found any need for logically complex types whose relation to their constituent types does not satisfy all three of these requirements. In particular, the property of truth-functional persistence is closely related to the interpretation of the part-whole relation on situation-tokens. Each coherent situation is part of a possible world: as we move from less to more inclusive situations, the associated truth function should converge to the classical (bivalent) case. Hence, we want formulas to have the property of mereological persistence. A class of formulas is mereologically persistent if whenever a situation s verifies a formula φ and s ⊑ s′, then s′ also verifies φ. If we stipulate that any interpretation of the language makes the atomic formulas mereologically persistent, and we use only truth-functionally persistent connectives, then it follows that all formulas of the language are mereologically persistent. Classical closure is important, since we want any use of the overdetermined value (both true and false) to be forced by an overdetermination of the value of some atomic formula. The property of freedom will have little bearing on the project, since I will assume that every situation verifies some formula. Consequently, I will deal only with the classical connectives throughout this project.

In the case of four-valued interpretations, the situation is even simpler. If we define the dual relation as holding between true and false, and also between undefined and overdefined (both true and false), then Thijsse (1992) has proved that the classical connectives are functionally complete for the class of persistent, duality-preserving truth functions.

In partial logic, there are no logically true propositions: no proposition is true in every three- or four-valued interpretation. We do, however, have nontrivial logical implication. In fact, there are a variety of relations that are species of logical implication or consequence in partial logic. In three-valued logic, there are three notions of logical consequence that seem most natural: verification validity, falsification validity, and double-barreled validity. A set Γ verifiably entails a set Δ just in case every interpretation that verifies every member of Γ verifies some member of Δ. A set Γ falsifiably entails a set Δ just in case every interpretation that falsifies every member of Δ falsifies some member of Γ. The relation of double-barreled implication (first suggested by Blamey (1986)) holds between a set Γ and a set Δ just in case Γ both verifiably and falsifiably implies Δ. Whenever I talk about implication in three-valued logic, I will mean double-barreled implication, since this comes closest in many ways to the classical case.
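The three notions of consequence can be compared by brute force over all three-valued interpretations of the atoms involved. The sketch below makes some assumptions that are not in the text: formulas are encoded as nested tuples, the function names are hypothetical, and falsification is taken to flow from conclusions back to premises (the direction used in the definition of consequence for the hybrid logic of section A.2.1).

```python
from itertools import product

T, F, U = "T", "F", "U"


def neg(a):
    return {T: F, F: T, U: U}[a]


def conj(a, b):
    # Strong Kleene conjunction.
    if a == F or b == F:
        return F
    if a == T and b == T:
        return T
    return U


def disj(a, b):
    # Strong Kleene disjunction, obtained by De Morgan duality.
    return neg(conj(neg(a), neg(b)))


def ev(f, v):
    """Evaluate a formula: an atom is a string, a compound is a tuple."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return neg(ev(f[1], v))
    if f[0] == "and":
        return conj(ev(f[1], v), ev(f[2], v))
    if f[0] == "or":
        return disj(ev(f[1], v), ev(f[2], v))
    raise ValueError(f)


def atoms(f):
    return {f} if isinstance(f, str) else set().union(*(atoms(g) for g in f[1:]))


def interpretations(formulas):
    names = sorted(set().union(*(atoms(f) for f in formulas)))
    for vals in product((T, F, U), repeat=len(names)):
        yield dict(zip(names, vals))


def verif_entails(gamma, delta):
    # Truth is preserved from all premises to some conclusion.
    return all(any(ev(d, v) == T for d in delta)
               for v in interpretations(gamma + delta)
               if all(ev(g, v) == T for g in gamma))


def falsif_entails(gamma, delta):
    # Falsity is preserved from all conclusions back to some premise (assumed direction).
    return all(any(ev(g, v) == F for g in gamma)
               for v in interpretations(gamma + delta)
               if all(ev(d, v) == F for d in delta))


def double_barreled(gamma, delta):
    return verif_entails(gamma, delta) and falsif_entails(gamma, delta)


contradiction = ("and", "p", ("not", "p"))
excluded_middle = ("or", "q", ("not", "q"))
```

With these definitions, `contradiction` verifiably entails `q` (vacuously, since no three-valued interpretation verifies it) but does not falsifiably entail it, so the double-barreled relation rejects explosion. Likewise `p` does not entail `excluded_middle`, which illustrates the absence of logical truths.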
In four-valued logic, the situation is much simpler, in that verification implication, falsification implication and double-barreled implication all coincide (Muskens, 1995, p. 77). Muskens (1995) has proved that the following system of rules (rL+*) is complete for double-barreled implication in three-valued propositional logic:


(R3) (R4) (R5)

(R6) If (R7) If (R8) (R9) If and then and and and and then then



(R10) Γ ⊢ Δ if and only if there are nonempty sets {α₁, . . . , αₘ} ⊆ Γ and {β₁, . . . , βₙ} ⊆ Δ such that

For four-valued logic, Muskens demonstrates that the system rL is sound and complete, where rL is rL + * (R8*).

A.2 Partial Modal Logics

I take it as obvious that there is some important connection between causation and modality. As Hume famously observed, causation involves some kind of "necessary connection." Consequently, I need to make use of a partial version of modal logic. Fortunately, the groundwork for partial modal logic has already been laid by Thijsse (1992) and Muskens (1995). In this work, I will follow Muskens very closely.

A model in partial modal logic shall consist of a quintuple ⟨Sit, R↑, R↓, I, ⊑⟩. Sit is the set of situation-tokens, which are essentially partial (incomplete or overdetermined) worlds. The part-whole relation ⊑ is a partial ordering on the class Sit. The interpretation function I assigns truth-values (true, false, neither, or both) to atomic symbols, representing persistent situation-types. Whenever s ⊑ s′, I will require that I(φ, s′) be an enrichment of I(φ, s) for every atomic symbol φ. (Remember: every other value is an enrichment of the value undefined, and the value both is an enrichment of the two classical values.)

The relations R↑ and R↓ are binary relations on Sit. These are the outer and inner accessibility relations. If we have sR↑s′, then we fail to falsify the accessibility of s′ from s: the model treats the accessibility of s′ from s as either definitely established or undefined. Dually, if we have sR↓s′, then we definitely verify the accessibility of s′ from s. This gives us four possible values for the accessibility of s′ from s:
The relation is undefined: we have sR↑s′, but not sR↓s′.
The relation is verified only: we have both sR↑s′ and sR↓s′.
The relation is falsified only: we have neither sR↑s′ nor sR↓s′.
The relation is both falsified and verified: we have sR↓s′, but not sR↑s′.
Whenever a situation-token s is logically possible, we have both that I(φ)(s) is true, false, or undefined for every φ (never both true and false), and that the image R↓[{s}] is a subset of the image R↑[{s}] (no situation is both definitely accessible to s and definitely not accessible to s). I will require that modal facts be persistent. So, if s ⊑ s′, then R↑[{s′}] ⊆ R↑[{s}], and R↓[{s}] ⊆ R↓[{s′}]. In other words, as we move from a smaller to a larger situation-token, the set of definitely inaccessible tokens monotonically increases, as does the set of definitely accessible tokens. The truth and falsity definitions for the modal operators are quite simple:
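As a rough executable illustration, the clauses can be read as follows: the necessity of a formula is true at s when no token in the outer accessibility set of s falsifies the formula, and false at s when some token in the inner accessibility set falsifies it. The sketch below assumes that reading, encodes values as (at least true, at least false) pairs, and uses a hypothetical class `PartialModel` with a small four-token example; none of these names come from the text.

```python
# Values as (at_least_true, at_least_false) pairs; an assumed encoding.
T, F, U, B = (True, False), (False, True), (False, False), (True, True)


class PartialModel:
    def __init__(self, interp, outer, inner):
        self.interp = interp  # interp[s][atom] -> value pair (missing atoms are U)
        self.outer = outer    # outer[s]: tokens not definitely inaccessible from s
        self.inner = inner    # inner[s]: tokens definitely accessible from s

    def value(self, f, s):
        if isinstance(f, str):
            return self.interp[s].get(f, U)
        op = f[0]
        if op == "not":
            t, fl = self.value(f[1], s)
            return (fl, t)
        if op == "and":
            a, b = self.value(f[1], s), self.value(f[2], s)
            return (a[0] and b[0], a[1] or b[1])
        if op == "box":
            # True: no token in the outer set falsifies the body.
            is_true = all(not self.value(f[1], t)[1] for t in self.outer[s])
            # False: some token in the inner set falsifies the body.
            is_false = any(self.value(f[1], t)[1] for t in self.inner[s])
            return (is_true, is_false)
        if op == "dia":
            # Possibility as the dual of necessity.
            return self.value(("not", ("box", ("not", f[1]))), s)
        raise ValueError(op)


# Toy model: s is part of s2; a verifies p, b falsifies p.
# Moving from s to s2 shrinks the outer set and never shrinks the inner set.
m = PartialModel(
    interp={"s": {}, "s2": {}, "a": {"p": T}, "b": {"p": F}},
    outer={"s": {"a", "b"}, "s2": {"a"}, "a": set(), "b": set()},
    inner={"s": {"a"}, "s2": {"a"}, "a": set(), "b": set()},
)
```

In this toy model the necessity of p is undefined at s (b is not yet excluded, and no definitely accessible token falsifies p) and true at the larger token s2, so the modal fact is enriched rather than revised as the token grows.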



The formula □φ (the necessity of φ) is true in a model at a situation s just in case φ is never falsified by any token in the outer accessibility set of s (the set of tokens that are not definitely inaccessible). This formula is false in a model at s if φ is falsified by some situation-token in the inner accessibility set of s (the tokens that are definitely accessible). The possibility operator ◇ is defined as the dual of □. In these definitions, I deviate from the pattern of both Thijsse and Muskens, since for my purposes, it is essential that I make all modal facts persistent (with respect to moving up the part-whole ordering ⊑), which the Thijsse-Muskens truth definitions fail to do. Nonetheless, it is easy to demonstrate that the logic of double-barreled consequence in the three-valued case is characterized by the Thijsse system MK, and the four-valued logic is characterized by the system M , one of Thijsse's systems (Thijsse, 1992, p. 104). The system MK consists of rL+* plus the following rules:

(KNec) If φ is a theorem of the classical (two-valued) system K, then


The system MK agrees with the classical modal logic K with respect to all theorems that are preceded by a box, since it is impossible to find coherent models that falsify any classical validity, and the truth definition of □φ guarantees its truth whenever φ cannot be falsified. The system M , which characterizes four-valued modal logic, consists of rL plus rules (R11) through (R14), plus two additional rules, (R16) and (R17):

The completeness of these two systems for their respective sets of models can be demonstrated quite easily by means of the standard construction of canonical models. In the case of three-valued (or coherent) modal logic, the set Sit in the canonical model consists of the set of consistent, saturated theories of the logic MK. A theory of a modal system is a set of sentences that is closed under the rules of the system. A theory is consistent if it does not contain both φ and ¬φ, for any formula φ. A theory is saturated if it contains either φ or ψ whenever it contains the disjunction φ ∨ ψ. The interpretation function I for the canonical model is defined thus: for each atomic formula φ, if φ ∈ Γ, then I(φ)(Γ) = T; if ¬φ ∈ Γ, then I(φ)(Γ) = F; and otherwise I(φ)(Γ) = U. Two theories Γ and Δ are in the part-whole relation ⊑ just in case Γ is a subset of Δ. Finally, we need to define the two partial accessibility relations R↑ and R↓ for the canonical model:

In the case of four-valued (general) partial modal logic, the canonical model is constructed in exactly the same way, except that the class Sit is the set of all saturated theories of M~~, whether or not they are consistent. For these completeness proofs, we need a partial version of Lindenbaum's Lemma:

Lemma A.1 (Lindenbaum's Lemma) If Γ ⊬ Δ, then there is a saturated theory Γ′ such that Γ ⊆ Γ′ and Γ′ ∩ Δ = ∅.

This partial Lindenbaum's lemma can be proved by a slight modification of the usual proof (Muskens (1995)). We then prove by induction a canonical model theorem, showing that for every situation s in the canonical model (that is, for every saturated theory), and every formula φ of the language, φ ∈ s if and only if M_can, s ⊨ φ, and ¬φ ∈ s if and only if M_can, s ⊨ ¬φ. This proof follows the usual one in the atomic cases and in the cases corresponding to the propositional connectives. In the case of the modal connectives, we must make use of the partial-logic Lindenbaum's lemma. For example, in the case of the necessity operator □, we must show that □φ ∈ s if and only if □φ is true at s. If □φ ∈ s, then the fact that □φ is true at s follows immediately from our definition of R↑ for the canonical model, and from the truth definition for □. If □φ ∉ s, then we must use Lindenbaum's lemma to construct a saturated theory Γ containing ¬φ and disjoint from the set A(s) = {ψ : □¬ψ ∈ s}. We know that we cannot derive any member of A(s) from ¬φ because, if we could, then, since derivation satisfies contraposition, there would be a formula ψ such that □ψ ∈ s and ψ ⊢ φ. By rule (R17), this would mean that □φ was in s, contrary to our assumption.

There is an alternative way of seeing that the partial modal logics must be axiomatizable. We can make use of a translation from partial modal logic into classical modal logic, using a technique developed by Gilmore (1974) and Feferman (1984). For each atomic formula φ in the language of partial modal logic, we introduce two distinct formulas into the classical language: φ⁺ and φ⁻. We also introduce into the classical logic two independent modal operators, □↑ and □↓, together with their duals. It is convenient to define two complementary translations, + and −. Each atomic formula is translated via + by means of the corresponding positive version in the classical language and via − by means of the corresponding negative version of the formula.

In the classical language, we assign independent truth values to the positive and negative versions of each atomic formula, and we use two independent accessibility relations, one for the ↑ modalities and one for the ↓ modalities. A set Γ entails Δ in partial modal logic if and only if two conditions are met: (1) the translations of Γ entail the translations of Δ in classical modal logic, and (2) the translations of the negations of Δ entail the translations of the negations of Γ in classical modal logic. In the case of four-valued modal logic, these two conditions coincide. This correspondence between partial modal logic and classical logic enables us to transfer to partial modal logic many of the familiar results of classical modal logic, such as decidability and the finite model property (Muskens (1995)).
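For the propositional fragment, the idea behind the two translations can be checked mechanically. The sketch below is an illustration under stated assumptions, not the definition from the text: it assumes the usual clauses (negation switches between the two translations, and the negative translation exchanges conjunction and disjunction), encodes formulas as nested tuples, and writes the classical atoms as `p+` and `p-`. The modal clauses are omitted.

```python
from itertools import product


def plus(f):
    """Classical formula intended to hold exactly when f is at least true."""
    if isinstance(f, str):
        return f + "+"
    if f[0] == "not":
        return minus(f[1])
    if f[0] == "and":
        return ("and", plus(f[1]), plus(f[2]))
    if f[0] == "or":
        return ("or", plus(f[1]), plus(f[2]))
    raise ValueError(f)


def minus(f):
    """Classical formula intended to hold exactly when f is at least false."""
    if isinstance(f, str):
        return f + "-"
    if f[0] == "not":
        return plus(f[1])
    if f[0] == "and":
        return ("or", minus(f[1]), minus(f[2]))
    if f[0] == "or":
        return ("and", minus(f[1]), minus(f[2]))
    raise ValueError(f)


def classical(f, v):
    """Two-valued evaluation; translated formulas contain no negation."""
    if isinstance(f, str):
        return v[f]
    a, b = classical(f[1], v), classical(f[2], v)
    return (a and b) if f[0] == "and" else (a or b)


def partial(f, v):
    """Four-valued evaluation; v maps atoms to (at_least_true, at_least_false)."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        t, fl = partial(f[1], v)
        return (fl, t)
    a, b = partial(f[1], v), partial(f[2], v)
    if f[0] == "and":
        return (a[0] and b[0], a[1] or b[1])
    return (a[0] or b[0], a[1] and b[1])


def agrees(f, atoms):
    """Check that the translations track truth and falsity in every valuation."""
    for bits in product((False, True), repeat=2 * len(atoms)):
        pv = {a: (bits[2 * i], bits[2 * i + 1]) for i, a in enumerate(atoms)}
        cv = {}
        for a in atoms:
            cv[a + "+"], cv[a + "-"] = pv[a]
        t, fl = partial(f, pv)
        if classical(plus(f), cv) != t or classical(minus(f), cv) != fl:
            return False
    return True
```

Because the positive and negative atoms are assigned independently, every classical valuation of them corresponds to exactly one four-valued valuation of the original atoms, which is what `agrees` exploits.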

A.2.1 Reflexive Models

If we require, as seems natural, that the accessibility relations be reflexive, then we can strengthen the two systems MK and M . In the case of three-valued modal logic, the class of reflexive models can be characterized by adding two new rules:

(Refl1)
(Refl2)

In the four-valued case, I do not believe that we can characterize the class of models in which R↑ is reflexive. One solution to this latter problem is to introduce a hybrid logic. In models of this logic, there is a single designated situation g (the actual situation, intuitively). We can require that g be coherent, in the sense that I(g) assigns only T, F, or U to all atomic formulas, and R↓[{g}] ⊆ R↑[{g}], and in addition, we could require that every situation in R↓[{g}] (the situations that are definitely accessible to g) be similarly coherent. Other situations in the model, however, could be logically incoherent, requiring the use of the four-valued truth functions. We could define logical consequence for this system by reference to these distinguished worlds g: Γ entails Δ if and only if:
1. every model M such that M, g verifies every member of Γ is such that M, g verifies some member of Δ, and
2. every model M such that M, g falsifies every member of Δ is such that M, g falsifies some member of Γ.

The logic for this system, MH , would consist of rules (R1) through (R17), but would lack rules (K) and (KNec). In addition, we could characterize the class of reflexive models by using rules (Refl1) and (Refl2). If we restrict our attention to possible worlds, situation-tokens that are coherent and complete, then we would return to the classical two-valued modal logics, such as T, S4, or S5.

My intention is not to argue that partial modal logic should replace classical modal logic. We still need classical modal logic to characterize a certain kind of validity. For instance, the inference from □φ to φ is not locally valid, since there are many situation-tokens that verify the first but not the second. However, this same inference is globally valid, since any token that verifies the first is embedded in a possible world that somewhere verifies the second. The corresponding T axiom, (□φ → φ), is not verified at every situation-token, but only because many tokens contain only partial information about modality. As the modal information supported by a token is enriched, we approach classical modal logic at the limit. Partial modal logic is important in representing facts about causal connections between partial tokens, as we saw in chapter 7.

A.3 Partial Conditional Logics

In chapters 4 and 5, I argued for the possibility that all of the causal laws of our world are "oaken" rather than "iron" (to use D. M. Armstrong's distinction), that is, that all of the actual causal laws admit of exceptions. In addition, I argued that causes do not strictly necessitate their effects. Instead, I developed in chapter 7 an indeterministic model of causation, in which causes make their effects extremely probable but not absolutely certain. In constructing such a model, I made use of what Michael Morreau (1997) has called "fainthearted conditionals." These conditionals, which I have symbolized by means of □→, have a logic and a semantics that are very similar to those of the counterfactual or subjunctive conditionals investigated by Robert Stalnaker and David Lewis. Ernest Adams (1975) was the first to investigate the properties of such probabilistic, fainthearted conditionals and to note their logical similarities to the Stalnaker/Lewis conditionals.



David Lewis (1973) developed a "system of spheres" semantics for the subjunctive conditional. The spheres of these models were intended by Lewis to represent varying degrees of similarity of the contained worlds to a designated world. This semantics can, however, also be given a different, probabilistic interpretation. In the case of a finite model, each sphere can be thought of as representing the intersection of all those sets whose probability is greater than or equal to 1 − ε, where ε belongs to some fixed order of infinitesimals. In this way, Lewis's system-of-spheres semantics can be given an interpretation in terms of qualitative probabilities: φ □→ ψ represents the condition that the probability of φ & ψ is infinitely greater than the probability of φ & ¬ψ, as was demonstrated by Lehmann and Magidor (1992) in appendix B of their 1992 essay, "What does a conditional knowledge base entail?" (Adams (1975) is the locus classicus of this approach; see also Vann McGee's very insightful paper, McGee (1994).) Lehmann and Magidor proved that the conditional logic VW~ is both sound and complete for an interpretation of the conditional in terms of non-standard probability theory. In such an interpretation, we assign a non-standard probability space to each world in the model.

Non-standard probability theory is an application of the work by Abraham Robinson (1966) on non-standard analysis. (For more details see Keisler (1976) and Cutland (1983).) We know from model theory that there are non-standard models of number theory: models in which there are non-standard natural numbers, numbers that are larger than any finite natural number. These model-theoretic results extend, as Robinson showed, to real analysis. For example, the number 1/h, where h is a non-standard natural number, is a non-standard, infinitesimal rational number. Suppose that we fix on a particular non-standard model of the real numbers, ℛ* = ⟨R*, +*, ×*, <*, 0, 1⟩.
This model consists of an ordered field, together with a map * that takes members of R into members of R*, relations on R into relations on R*, etc. More precisely, * is a function from the superstructure of R into the superstructure of R*, where the superstructure V∞(X) is defined recursively as follows:
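A standard way of writing the recursion, offered here as an assumption about the intended display (it agrees with the mention of the power set P(Z) just below), is the following:

```latex
\begin{align*}
V_0(X) &= X, \\
V_{n+1}(X) &= V_n(X) \cup \mathcal{P}\bigl(V_n(X)\bigr), \\
V_\infty(X) &= \bigcup_{n \in \mathbb{N}} V_n(X).
\end{align*}
```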

where P(Z) is the power set of Z. The map * is such that, for every x ∈ R, x* = x, and for every bounded formula φ,
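In its usual textbook form (an assumed reconstruction of the display, with the postfix star used for the map), the transfer condition says that a bounded formula holds of standard objects exactly when it holds of their images:

```latex
V_\infty(R) \models \varphi[a_1, \dots, a_n]
\quad\Longleftrightarrow\quad
V_\infty(R^*) \models \varphi[a_1^*, \dots, a_n^*],
\qquad \text{for all } a_1, \dots, a_n \in V_\infty(R).
```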

This latter correspondence is known as "the Leibniz principle." The Leibniz principle guarantees that the non-standard counterpart of any standard notion shares all of its important (measure-theoretic) properties. The set N* is the counterpart of the natural numbers. The non-standard natural numbers are those that belong to N* but not to N. An object x in the superstructure of ℛ* is called internal if x ∈ y* for some y ∈ V∞(R). An internal object A is hyperfinite just in case there exist an internal function f and a number h ∈ N* such that f is a one-one mapping of h onto A. In other words, the hyperfinite sets are those that are treated as though they were finite in the non-standard model.

An ℛ* probability space is a triple ⟨X, ℱ, Pr⟩, where X is a nonempty set, ℱ is a Boolean sub-algebra of P(X), and Pr is a function from ℱ into R* that meets the usual requirements on a probability function, namely:

Pr(A) ≥ 0, for all A ∈ ℱ.

Pr(X) = 1.

Pr(A ∪ B) = Pr(A) + Pr(B), for A and B disjoint members of ℱ.

Conditional probability can be defined in the usual way:
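The usual ratio definition, stated here on the assumption that it is the one intended (the slash notation Pr(A/B) follows the usage in Definition A.1), is:

```latex
\Pr(A / B) \;=\; \frac{\Pr(A \cap B)}{\Pr(B)}, \qquad \text{provided that } \Pr(B) \neq 0 .
```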

A non-standard probabilistic model of conditional logic is one in which an ℛ* probability space is assigned to each world in such a way that the field ℱ includes every set of worlds definable in the language, and in which the truth conditions of the conditional □→ are defined as follows:

Definition A.1 (Non-standard probabilistic truth conditions) M, w ⊨ (φ □→ ψ) if and only if 1 − Pr_{M,w}(‖ψ‖/‖φ‖) is infinitesimal, or Pr_{M,w}(‖φ‖) = 0.
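A finite toy model makes the definition concrete. The sketch below assumes, purely for illustration, that each world's probability is proportional to c·ε^k for a positive standard coefficient c and a natural-number order k, with ε a fixed infinitesimal. Under that assumption the probability of the exceptions to ψ among the φ-worlds is infinitesimal relative to the probability of φ exactly when every exception has strictly higher order than the most probable φ-worlds. The names `WEIGHTS`, `min_order`, and `conditional` are hypothetical.

```python
def min_order(worlds, weights):
    """Lowest infinitesimal order among worlds of positive weight, or None."""
    orders = [weights[w][1] for w in worlds if weights[w][0] > 0]
    return min(orders) if orders else None


def conditional(phi, psi, weights):
    """Fainthearted conditional; phi and psi are sets of worlds."""
    antecedent = min_order(phi, weights)
    if antecedent is None:
        # The antecedent has probability zero.
        return True
    exceptions = min_order(phi - psi, weights)
    # The ratio Pr(phi and not psi) / Pr(phi) is infinitesimal exactly when
    # the exceptions are of strictly higher order than the antecedent.
    return exceptions is None or exceptions > antecedent


# weights[w] = (coefficient, order): probability proportional to
# coefficient * eps**order (an assumed representation).
WEIGHTS = {
    "w1": (1, 0),  # a bird that flies
    "w2": (1, 1),  # a penguin: a bird that does not fly
    "w3": (1, 0),  # not a bird
}
BIRD = {"w1", "w2"}
PENGUIN = {"w2"}
FLIES = {"w1"}
NOT_FLIES = {"w2", "w3"}
```

In this model "birds fly" holds, while "penguins fly" fails and "birds that are penguins do not fly" holds, so strengthening the antecedent is not valid for the fainthearted conditional.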

We can define probabilistic consequence, ⊨_pr, in the usual way: Γ ⊨_pr Δ if and only if every non-standard probabilistic model that verifies every member of Γ also verifies some member of Δ. Lehmann and Magidor have proved that the relation of probabilistic consequence is captured by the logical system VW~. The rules and axioms of VW~ consist of the following:
(RCEC) From to infer
(RKC) From to infer

Theorem A.1 (Lehmann and Magidor 1992) For countable language L, Γ ⊨_pr Δ if and only if Γ ⊢ Δ in VW~.

Strictly speaking, Lehmann and Magidor proved the theorem for a language L that contains no nested conditionals. However, their result can be easily extended to the more general case by simply assigning an ℛ* probability space to each world in the model, instead of constructing a single space for the entire model, as Lehmann and Magidor do.



In their proof of the completeness theorem, Lehmann and Magidor show how to construct, given a consistent theory Γ of VW~, and given a non-standard model of analysis ℛ*, an ℛ* probabilistic model M such that M ⊨ Γ. There are two model conditions that Lewis imposed in his work on counterfactuals that no longer make sense when the spheres are interpreted in terms of qualitative probability. These conditions are strong and weak centering. Weak centering means that the smallest sphere in the system associated with world w must contain w. Strong centering means that the smallest sphere associated with w must contain nothing but w. These two requirements correspond to two axioms of Lewis's system:

Neither of these axioms is valid when the conditional □→ is interpreted as a fainthearted conditional. Thus, nothing corresponding to them will show up in my partial conditional logic. For simplicity's sake, I will not make use of Lewis's system-of-spheres semantics. In its place, I will use the more flexible set-selection function semantics. In this case, each model contains a set-selection function f. This function takes two arguments: a world, and a set of worlds. The function's output is always a set of worlds. In classical conditional logic, we define ‖φ‖ to be the set of worlds in the model that verify φ. The truth conditions for the conditional can then be given very simply:
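In the standard form of set-selection semantics, which is assumed here to be the clause intended, the conditional is true at a world when every selected antecedent-world verifies the consequent:

```latex
M, w \models \varphi \mathrel{\Box\!\!\rightarrow} \psi
\quad\text{if and only if}\quad
f(w, \lVert \varphi \rVert) \subseteq \lVert \psi \rVert .
```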

Under the interpretation by means of extreme probabilities, we should think of f(s, ‖φ‖) as the intersection of all of the propositions whose probability, conditional on φ, is, from the perspective of s, infinitely close to 1. Various logical properties of the conditional can then be represented by placing corresponding conditions on the selection function f. For instance, the probabilistic Adams conditional requires the following conditions:
1. f(s, A) ⊆ A.
2. …
3. If f(s, A) ⊆ B and f(s, B) ⊆ A, then f(s, A) = f(s, B).
4. If f(s, A) ∩ B ≠ ∅, then f(s, A ∩ B) ⊆ f(s, A).
5. f(s, A) ⊆ R[{s}].

The first condition guarantees that all of the normal A worlds are indeed A worlds. The second condition ensures that if χ is extremely probable on φ ∨ ψ, then it must be extremely probable on either φ or on ψ (or both). The third condition indicates that probabilistically equivalent propositions can be substituted for one another in conditional antecedents. The fourth ensures that if the probability of φ on ψ is finite, then any condition that is extremely probable on condition ψ is also extremely probable on condition φ & ψ. Finally, the fifth condition guarantees that impossible worlds have zero probability.

In moving from classical to partial conditional logic, two changes are needed. First, we must replace the set of worlds with a set of situations, to which we assign three- or four-valued interpretations of the atomic formulas. Second, we must replace the single selection function of the classical model with two selection functions, f↑ and f↓. When we apply f↑ to a situation s and a set of situations A, we get as output the set of situations that are not definitely excluded from the set of the most normal or probable A-situations. When we apply f↓ to a situation s and a set of situations A, we get as output the set of situations that are definitely included in the set of the most normal or probable A-situations. If s ⊑ s′, then, for every set A,

These conditions guarantee that the truth values of the conditionals are persistent. In partial logic, we can define the set ‖φ‖↑ to be the set of situations in the model that do not falsify φ. The truth and falsity conditions of the conditionals in partial logic are the following:

There are some standard conditions on the set-selection function that we must impose. These conditions, like their analogues in the classical case, are supported by our interpretation of the selection function in terms of qualitative probability. However, just as we could not characterize reflexivity in partial modal logic, we cannot characterize three of the conditions that constitute Adams's logic. Thus, I will impose analogues only of conditions 1 and 2. In addition, we need a third condition, P3, that represents a special case of condition 4. P3 is needed to validate the contrapositives of the rules validated by P1.



The canonical model for partial conditional logic is similar to that for partial modal logic. In the case of four-valued logic, the class of situations in partial conditional logic is the set of saturated theories of the logic. The two set-selection functions in the canonical model can be defined as follows:

For four-valued interpretations, the conditional logic can be axiomatized by the following rules, the system rC:

Rules (C1) and (C3) correspond to condition P1, and the contrapositives (C2) and (C4) correspond to condition P3. Rules (C11) and (C12) correspond to condition P2. The other rules hold in any model, without special conditions on the selection function. A partial version of Lindenbaum's lemma can once again be proved, and by means of that lemma, we can prove the usual canonical model theorem, and, hence, the completeness of the calculus rC. Another way to establish the axiomatizability of the logic, as well as such properties as decidability, is to extend the translations + and − from partial conditional logic to classical conditional logic. All we need to do is add two clauses to the definition I gave in the last section:

Once again, a set of formulas Γ entails Δ in (four-valued) partial conditional logic if and only if the translations of Γ entail the translations of Δ in classical conditional logic.




Classical Conditional Logic

… have not characterized proof-theoretically.
1. f(s, A) ⊆ A.
2. If f(s, A) ⊆ B and f(s, B) ⊆ A, then f(s, A) = f(s, B).
3. If f(s, A) ∩ B ≠ ∅, then f(s, A ∩ B) ⊆ f(s, A).
These three conditions correspond (respectively) to the following axioms of the logic VW~:

The situation here seems to be similar to that of reflexive modal logics and the axiom T. We should not expect axioms like (Id), (CV), or (CSO) to be verified by every situation token, although we can expect them to be verified by every possible world, including the actual one. As situations gain in modal information, they will verify more instances of classical axioms like these.

A.4 Partiality and Quantificational Logic

Partial modal and conditional logics, as important as they are, are not in themselves sufficient for formulating an adequate model of causation. In addition to the representation of necessity and objective probability, we must also be able to talk about the situations that are causing and being caused, and we must be able to represent some situations as parts of others. Consequently, in this section I will develop a partial quantificational logic. This quantificational logic will be different from standard quantified modal logics in that the individuals being named and quantified over will be indices (situations and worlds) and not ordinary substances (like people and organisms).

One might think that quantification over situations makes the modal operators redundant, since we could define necessity by simply using a universal quantifier. However, replacing modality with explicit quantification over situations would eliminate a critical element implicit in the use of modal operators: the indexicality of modal properties. We could re-introduce this element of indexicality by adding a special indexical constant to our language, something that intuitively picks out "this situation." The necessity of $\phi$ could then be defined as $\phi$'s holding in every situation accessible from this situation. However, this fix would introduce another problem: many formulas involving the constant this situation would be non-persistent. We could have the formula "$\phi$ does not hold in this situation" holding in s but not holding in a strictly larger situation s' that contains s. Thus, I will use a language that contains both modal operators (with their implicit indexicality) and terms and variables that stand for situations (and do so non-indexically). In addition to situation constants, variables, and quantifiers, I will add two new kinds of atomic formulas (t and t' are situation constants and $\phi$ is any formula):

We can define the part-whole relation $\sqsubseteq$ by means of these elements:

In turn, the identity predicate can be defined in terms of $\sqsubseteq$: $(t = t') \Leftrightarrow_{def} (t \sqsubseteq t' \;\&\; t' \sqsubseteq t)$. The first new kind of atomic formula is an object language counterpart to the verification relation between situation tokens and types. For simplicity's sake, I will assume that the logic of these two kinds of formulas is entirely classical (bivalent). I will assume that if one situation is part of another, or if one situation verifies a formula, then these facts are supported in every situation in the model. For some purposes it might be useful to make situations partial in their mereological or classificatory information (for example, this might be very important in modeling certain propositional attitudes), but I have not found this additional flexibility necessary in dealing with the concept of causation. Thus, the truth and falsity conditions for these formulas are the following (where $\|t\|$ represents the designatum of constant t in the model $\mathcal{M}$):

The atomic formula At means that the situation t is definitely part of the actual world (from the perspective of the indexed situation). Such a formula is verified by a situation s just in case the situation $\|t\|$ is a part (proper or improper) of s. A model structure for the language must include a binary relation $A^-$ to provide falsity conditions for the A predicate.



To ensure that the logic of situations is axiomatizable, it is essential that we place certain conditions on the relation $A^-$. First, there is a fixed-point condition: whenever two situations verify contradictory formulas of any kind (including formulas involving A itself), the ordered pair of the two situations must belong to $A^-$. Second, if a situation s does not support $\neg\phi$, then there must exist a token s' such that s' supports $\phi$ and (s, s') does not belong to the relation $A^-$. Third, if there is no situation accessible to situation s (either by the inner relation $R^+$ or by the outer relation $R^-$) that extends both s and s', then the pair (s, s') belongs to $A^-$. Finally, the relation $A^-$ must be mereologically persistent.

1. (Fixed point condition) If $\mathcal{M}, s \models \phi$ and $\mathcal{M}, s' \models \neg\phi$, then $(s, s') \in A^-$.
2. If $\mathcal{M}, s \not\models \neg\phi$, then there exists an s' such that $\mathcal{M}, s' \models \phi$ and $(s, s') \notin A^-$.
3. (Modal condition) If $\neg\exists x\,(x \in R^+[\{s\}] \cup R^-[\{s\}] \;\&\; s \sqsubseteq x \;\&\; s' \sqsubseteq x)$, then $(s, s') \in A^-$.
4. If $(x, y) \in A^-$, and $x \sqsubseteq z$ and $y \sqsubseteq w$, then $(z, w) \in A^-$.

In the case of the quantifiers, the truth and falsity conditions reflect the usual extension of the Dunn truth tables.

$\mathcal{M}, s \models \forall x\phi \Leftrightarrow$ for every situation s' in Sit, $\mathcal{M}_{s'}, s \models \phi[t/x]$, where $\mathcal{M}_{s'}$ is a model that differs from $\mathcal{M}$ only in assigning s' to a new constant t (a constant not occurring in $\phi$).

$\mathcal{M}, s \models \neg\forall x\phi \Leftrightarrow$ for some situation s' in Sit, $\mathcal{M}_{s'}, s \models \neg\phi[t/x]$, where $\mathcal{M}_{s'}$ is a model that differs from $\mathcal{M}$ only in assigning s' to a new constant t (a constant not occurring in $\phi$).

The logic that corresponds to this semantical system will be called $\mathcal{L}_{Sit}$. In addition to the rules of the system rL for partial propositional logic, M for partial modal logic, and rC for partial conditional logic, we need to add, first, the following quantifier and identity rules:

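[The displayed quantifier and identity rules are missing here; their surviving side conditions require that t does not occur in the relevant formulas.]

where $\phi[x//t]$ is the result of replacing one or more occurrences of t in $\phi$ with x.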



In order to capture the classicality of atomic formulas involving $\models$, we must add the following two rules: (Q8) If $\Gamma$ classically entails $\Delta$, and the atomic formulas in $\Gamma$ and $\Delta$ involve only $\models$, then $\Gamma \vdash \Delta$.

Rule (Q10) connects impossibility with the support of non-actuality. Since all the formulas of our language are persistent (with respect to the part-whole ordering), we must add rules that ensure that if a situation t is part of the current index, and t supports formula $\phi$, then the current index also supports $\phi$. Also, if the current index supports $\phi$, and t supports $\neg\phi$, then the current index supports $\neg At$. Thirdly, every situation verifies the formula that it is actual. Rules (Q11) and (Q12) guarantee that actuality has the right kind of fixed-point character.

The rules governing the support relation |= ensure that the formulas supported by a situation form a saturated theory.

I will assume that all mereological and classificatory facts are supported by all tokens, and that they hold of necessity:

where $\phi$ contains only $\models$ atoms.



Finally, I will stipulate that the domains of quantification associated with all situations are the same. In each case, we will be quantifying over all situations, actual and non-actual, possible and impossible. We can of course express the actuality of a situation by the formula $At$ and its possibility by $\Diamond At$. Since the domains of quantification are constant, we can add two rules corresponding to the Barcan and converse Barcan formulas of standard quantified modal logic.

In a canonical model for the logic $\mathcal{L}_{Sit}$, the set of situations consists of a certain set of supersaturated theories. A theory $\Gamma$ is supersaturated if and only if it meets the three conditions:

1. If $\forall x\phi \notin \Gamma$, then for some t, $\phi[t/x] \notin \Gamma$.
2. If $\exists x\phi \in \Gamma$, then for some t, $\phi[t/x] \in \Gamma$.
3. If $(\phi \vee \psi) \in \Gamma$, then either $\phi \in \Gamma$ or $\psi \in \Gamma$.

To prove the completeness of the logical rules, we must show that if $\Gamma \nvdash \Delta$, then $\Gamma \nvDash \Delta$. To begin with, we will add a new constant, t*, to stand for the current index, the situation token at which the types in $\Gamma$ are verified, and the types in $\Delta$ are not verified. For each type $\phi$ in $\Gamma$, we shall add the type $(t^* \models \phi)$, producing a new set, $\Gamma^+$. We can easily prove a lemma to the effect that if $\Gamma \nvdash \Delta$, then $\Gamma^+ \nvdash \Delta$, using rules (Q4) and (Q14), given the fact that t* does not occur in $\Gamma$. We can then easily prove a generalization of the partial version of Lindenbaum's lemma:

Lemma A.2 (Generalized Lindenbaum's Lemma) If $\Gamma^+ \nvdash \Delta$, then $\Gamma^+$ can be extended to a supersaturated theory $\Gamma^*$ such that […]

This can be proved by the construction of a series of pairs $(\Gamma_i, \Delta_i)$, such that $\Gamma_i \nvdash \Delta_i$, $\Gamma^+ \subseteq \Gamma_i$, $\Delta \subseteq \Delta_i$, and $\Gamma^* \cap \Delta^* = \emptyset$. The construction is based, as usual, on an enumeration of the language in which every formula comes up infinitely often. When we reach formula $\phi_i$, we apply the following rules:

1. If $\Gamma_i \vdash \phi_i$, then […]
2. If $\phi_i \vdash \Delta_i$, then […]
3. Otherwise: if $\phi_i = \psi[t'/v]$, $\exists v\psi \in \Gamma_i$, $\Gamma_i, \phi_i \nvdash \Delta_i$, and there is no t such that $\psi[t/v] \in \Gamma_i$, then […]
4. If $\phi_i = \psi[t'/v]$, $\forall v\psi \in \Delta_i$, $\Gamma_i \nvdash \Delta_i, \phi_i$, and there is no t such that $\psi[t/v] \in \Delta_i$, then […]



We can then let $\Gamma^* = \bigcup_i \Gamma_i$. It is easy to show that $\Gamma \subseteq \Gamma^*$, $\Gamma^* \cap \Delta = \emptyset$, and that $\Gamma^*$ is a supersaturated theory. This supersaturated theory will contain a model base $B(\Gamma^*)$, consisting of a consistent and complete assignment of classical truth values to the mereological and classificatory atomic formulas (those containing $\sqsubseteq$ and $\models$). In the canonical model for the logic $\mathcal{L}_{Sit}$ based on $\Gamma^*$, the set of situations consists of all the supersaturated theories that extend the model base $B(\Gamma^*)$. In this canonical model, we can associate each term t with a canonical situation-token (a supersaturated theory) in $Sit_{\mathcal{M}_c}$ in the following way:

The supersaturation of $\Gamma^*$ and axioms (Q13) through (Q17) ensure that each of these associated sets is a supersaturated theory. Rules (Q18) and (Q19) guarantee that these theories also extend $B(\Gamma^*)$ and so are all members of $Sit_{\mathcal{M}_c}$. The part-whole relation $\sqsubseteq$ in the canonical model is given by the subset relation between theories in $Sit_{\mathcal{M}_c}$. The interpretation function $I$ assigns truth values to simple atomic formulas by reference to the inclusion or non-inclusion of the formula and its negation in each theory.

It is then straightforward to prove the usual truth theorem for the canonical model: a formula (whether atomic or complex) is at least true at a situation-theory just in case it is included in the situation-theory, and it is at least false just in case its negation is included in the situation-theory. The situation corresponding to the special constant t* has been constructed so as to include every member of $\Gamma$ and exclude every member of $\Delta$. From the truth theorem for the canonical model, it follows that $\Gamma \nvDash \Delta$. This suffices (since $\Gamma$ and $\Delta$ were arbitrary sets of formulas such that $\Gamma \nvdash \Delta$) to prove the completeness of the inference rule set. An alternative route to this completeness result is to make use again of the Gilmore-Feferman technique of translating partial logic into classical logic. In the case of $\mathcal{L}_{Sit}$, the translations $+$ and $-$ must be extended as follows:



Muskens has proved that $\Gamma$ entails $\Delta$ in partial predicate logic if and only if $\Gamma^+$ entails $\Delta^+$ in classical predicate logic (Muskens, 1995, p. 54). As corollaries of this result we have partial versions of the compactness theorem and the Löwenheim-Skolem theorem, and a second proof that partial predicate logic can be recursively axiomatized. In addition, the translation enables us to transfer other standard results in classical predicate logic, such as the Löwenheim-Skolem theorems.

A.5 First-Order Quantification over Situation-Types

I take the formulas of our language to correspond to something in the world: what situation theorists like Barwise refer to as "situation-types." These situation-types are closed under such logical operations as negation, conjunction, disjunction, and generalization. In addition, there are specifically modal and conditional situation-types, corresponding to formulas containing $\Box$ and $\Box\!\!\rightarrow$.


Situation-types are needed to account for certain kinds of causal connections, in particular the connection I will call causal explanation. Unlike many Platonists, but like some contemporary realists such as Armstrong and Hochberg, I do not insist that every meaningful predicate or open formula of our language must correspond to a situation-type. There are purely formal or logical properties and relations that can be defined but may not correspond to a real type in the world. In this way, I can avoid Russellian paradoxes involving properties, like the supposed property of heterologicality (the property of being something that cannot be truthfully predicated of itself). Although I am a realist about types and think of them as universals (things that can be multiply instantiated in the world), this realism about universals is not essential to the program I am undertaking in this book. If one prefers to think of atomic situation types as natural classes, classes bound together by especially close relations of natural similarity, as does, for example, David Lewis



(1983), I have no deep-seated objection. I am inclined to think that relations of natural similarity must be grounded in the co-instantiation of some universal, rather than the other way around, but I am not confident of having a definitive argument for settling this ancient dispute.

There will be no need in this volume for any quantification over types, and relatively little need for it in the next volume. However, in chapter 15, in developing a Platonistic theory of the natural numbers, I did make use of quantification over types. I do not, however, require the full force of standard second-order quantification, in which the second-order variables are taken as ranging over arbitrary classes of the first-order domain. Instead, I can make use of what is essentially a first-order theory of situation-types. Since I distinguish between real and merely logical types, my logic will not include an axiom of abstraction, positing the existence of a type corresponding to every open formula of the language. Lambda abstraction will not be assumed to lead unfailingly to the specification of a real situation-type. Thus, I need not accept heterologicality, defined as $\lambda x : \neg(x \models x)$.

We can take situation-types to be a special class of entities, closed under certain logical operations. We do not have to think of the logical and modal operators as literally parts of "complex" types. Instead, we can take negation, for example, to be a particular relation between types. Types that are logically equivalent (in partial logic), such as $\phi$ and $\neg\neg\phi$, or $\phi \,\&\, \psi$ and $\psi \,\&\, \phi$, may, if we wish, be identified with one another. We also do not have to reify the variables of predicate logic as parts of generalized types. Instead, we can take substitution to be a ternary relation (definable recursively) between two types and an individual token, and we can take existential generality to be another ternary relation between two types and an individual, definable in terms of substitution.
Quantification over types can be enabled simply by taking the set of types to be a subset of the set of tokens. The same first-order variables can then be taken as ranging both over ordinary particulars and over abstract types.

Appendix B

A Causal Calculus
B.1 Causation and Projectible Statistics

The power of causation comes from its impact on statistical inference. We can now specify exactly what impact causation has on projectible statistics by formalizing the principles known as "Markov's rules." Markov's principles tell us that when one fact a screens off one of its effects b from some other fact c, then b is statistically independent of c, given a. In standard treatments of causal inference, Markov's principles are formulated for the special case in which the causes and effects are random variables. We need to formalize Markov's principles for the general case, in which causes and effects are represented by types of arbitrary logical complexity. This task of formalizing Markov's principles depends on expressing a relation between types of situations, and not merely a relation between situation-tokens. I defined such a relation between situation-types in chapters 4 and 5. In this appendix, I will illustrate the advantages of such an account.

For simplicity's sake, I will assume that all of the situation-tokens in the relevant class of models are modally complete and coherent. Consequently, I will make use of a fully classical, bivalent conditional logic, the logic $VW^-$, which includes the following axioms and rules:

(RCEC) From … to infer …
(RKC) From … to infer …




The logic $VW^-$ corresponds to the interpretation of the conditional in terms of extreme probabilities proposed by Ernest Adams (1975) and Judea Pearl (1988) (as I discussed in A.3). (See also Lehmann and Magidor (1992).) This logic corresponds to David Lewis's logic VW, minus the MP axiom (that is, minus $(\phi \,\&\, (\phi \,\Box\!\!\rightarrow \psi)) \rightarrow \psi$). Once we have formalized such generalized Markov principles, these principles can enable us to specify a nonmonotonic logic that is adequate to the task of reasoning about dynamic situations. Consider, for example, the infamous Yale Shooting Problem, due to Hanks and McDermott (1987). I will formalize the problem, using the modal/statistical conditional $\Box\!\!\rightarrow$ in the statement of the defeasible rules. We start with a set of facts, {A, L, S}, standing for the fact that the victim is initially alive, the gun is initially loaded, and that an event of shooting takes place (after a short period of waiting). We have three defeasible rules:

The first two of these rules are instances of the so-called law of inertia. The third rule is a causal law specifying that the firing of a loaded gun overrides the inertia of being alive, resulting in the victim's being dead in the succeeding state. In standard nonmonotonic logics, this set of facts and rules has two permissible extensions. In one of these extensions, the gun stays loaded and the victim is killed. In the other, the gun mysteriously becomes unloaded before the shooting, and the victim remains alive. Each of these scenarios involves the occurrence of something unexpected, either the overriding of the inertia of being alive, or the overriding of the inertia of being loaded. By taking into account the causal structure of the situation, we can apply Markovian rules to derive a principled solution to the problem. Consider the following causal structure diagram. Notice that L screens off A and $\neg A'$ from L'. This means that, by means of Markov's principle, we can strengthen the antecedent of the law of inertia as applied to L, resulting in the new rule:

This rule now takes priority in many nonmonotonic logics (such as Pearl's System Z or Asher/Morreau's Commonsense Entailment) over the law of inertia as applied to A. This means that we can throw out the second scenario, and we can successfully infer that the victim is not alive.

Figure B.1: The Yale Shooting Problem

Suppose that we had instead the following causal structure, where Z represents the presence of zealous police protection for the would-be victim. The presence of these zealous bodyguards is causally prior both to A (since it is a possible explanation of the victim's surviving up to the present time) and, let us assume, to L' (since the bodyguards have access to the loaded gun during the quiescent interval). In this case, L no longer screens A and $\neg A'$ off from L', and we can no longer apply Markov's principles. In this case, we cannot infer that the victim is dead, which seems to be the intuitively correct result. Of course, we were not told in the original story that there was no such police bodyguard. Apparently, nonmonotonic reasoning about dynamic situations involves two processes: first, assuming the minimality of the causal structure of the current situation, and second, using that causal structure (in combination with Markov's principles and various defeasible rules) to infer the probable consequences.
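The statistical content of Markov's principle (when a screens off its effect b from some other fact c, b is independent of c given a) can be checked numerically on a toy model. The following is only a minimal sketch, not part of the formal theory: it assumes four hypothetical binary facts z, a, c, b with the causal structure z → a → b and z → c, and the conditional probabilities are invented for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical causal structure: z -> a -> b and z -> c.
# All numbers below are illustrative assumptions, not taken from the text.
P_Z = Fraction(3, 10)                                    # P(z)
P_A = {True: Fraction(9, 10), False: Fraction(1, 5)}     # P(a | z)
P_C = {True: Fraction(4, 5), False: Fraction(1, 10)}     # P(c | z)
P_B = {True: Fraction(7, 10), False: Fraction(1, 20)}    # P(b | a)


def bern(p, value):
    """Probability that a binary fact with chance p takes the given value."""
    return p if value else 1 - p


def joint(z, a, c, b):
    """Joint probability, factored along the assumed causal structure."""
    return bern(P_Z, z) * bern(P_A[z], a) * bern(P_C[z], c) * bern(P_B[a], b)


def prob(event):
    """Sum the joint probability over all assignments satisfying the event."""
    return sum(joint(*w) for w in product([True, False], repeat=4) if event(*w))


def cond(event, given):
    """Conditional probability P(event | given)."""
    return prob(lambda *w: event(*w) and given(*w)) / prob(given)


p_b = prob(lambda z, a, c, b: b)
p_b_given_c = cond(lambda z, a, c, b: b, lambda z, a, c, b: c)
p_b_given_a = cond(lambda z, a, c, b: b, lambda z, a, c, b: a)
p_b_given_a_c = cond(lambda z, a, c, b: b, lambda z, a, c, b: a and c)

# b depends on c unconditionally, but not once a is given.
print(p_b, p_b_given_c, p_b_given_a, p_b_given_a_c)
```

In this toy model, learning c changes the probability of b, but once a is known, c carries no further information about b. That is the sense in which the antecedent of a defeasible rule can be strengthened without loss.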


Some Other Well-Known Puzzles

Judea Pearl (1988) discusses a very simple problem illustrating the necessity of introducing causal information into the formalization of common-sense reasoning. Consider a sprinkler and a sidewalk. We have two defeasible rules: (Sprinkler-on $\Box\!\!\rightarrow$ Wet) and (Wet $\Box\!\!\rightarrow$ Rain). Suppose we know that the sprinkler is on. We can reasonably conclude that the sidewalk is wet. However, if we go on to infer that it probably rained in the recent past, something has gone wrong. We need to make some sort of use of the fact that both the sprinkler's being on and the rain are causally prior to the wetness of the sidewalk.

Another well-known problem is that of the lamp, discussed by Vladimir Lifschitz (1990). Suppose we have a lamp connected to a pair of switches. If both switches are up, or both are down, then the lamp is on; otherwise, it is off. Suppose that the first switch is down and the second is up. The lamp is off. Now, suppose we flip the first switch up. We want to derive the consequence that the lamp comes on. However, a result that is equally in accord with the defeasible rules is one in which the second switch moves down. Again, we need to take into account the fact that the state of the lamp, but not that of the second switch, is causally posterior to the position of the first switch.

Finally, there is the problem of the emperor's colored blocks, proposed by Lin and Reiter (1994). There are two red blocks. The emperor has decreed that either both blocks or neither shall be yellow at any time. Suppose we try to perform the action of painting just one of the blocks yellow. The correct conclusion to draw is that the action will fail. We must somehow avoid the conclusion that painting the first block will, in and of itself, cause the second block to become yellow as well.

Figure B.2: The Yale Shooting Problem, With Bodyguards


Screening Off

I will now define the relation, represented by $\sigma(s_1, s_2, s_3)$, according to which one token is screened off from a second by a third. I will make use of two new abbreviations: $P_w$ and $N_w$. $P_w(s)$ is the sum of the tokens that are parts of world w and immediately prior to token s, and $N_w(s)$ is the sum of the tokens that are parts of world w and immediately posterior to token s.

Throughout these definitions, I will assume the thesis I have argued for in chapters 5 and 8, namely, the thesis that the existence of a situation-token necessitates the existence of every token causally prior to it. If we drop this thesis, we must take into account the possibility of counterfactual, non-actual causes of actual token events. The causal antecedents of a given token would then include non-actual, as well as actual, prior tokens. In order to screen off the probability of the effect, we would have to have complete information about the occurrence and non-occurrence of each of its causal antecedents. It seems that this is wrong: it is sufficient to take into account the occurrence of the actual causes of an actual event. If so, we must postulate that actual events have no non-actual causal antecedents, and this postulation, in turn, would make sense only if each event necessitates all of its causal antecedents.

Definition B.1 (Backward Causal Chain) A sequence of tokens c is a backward causal cone relative to w, starting from s, $B_w(c, s)$, if and only if the following conditions are met:
1. $c(0) = s$
2. $\forall i < dom(c)$: $c(i + 1) = P_w(c(i))$

Definition B.2 (Forward Causal Chain) A sequence c is a forward causal cone relative to w starting from s, $F_w(c, s)$, if and only if the following conditions are met:
1. $c(0) = s$
2. $\forall i < dom(c)$: $c(i + 1) = N_w(c(i))$

Definition B.3 (Common Cause) Token $s_1$ is a common cause of $s_2$ and $s_3$, $CC(s_1, s_2, s_3)$, if and only if: there exist sequences c and c' and world w such that the conditions $B_w(c, s_2)$ and $B_w(c', s_3)$ hold, together with the conditions $s_1 \sqsubseteq c(i)$ and $s_1 \sqsubseteq c'(j)$, for some i and j.

Definition B.4 (Trajectory) Situation $s_1$ stands athwart the trajectory from $s_2$ to $s_3$, $Tr(s_1, s_2, s_3)$, if and only if: there exist sequences c and c' and world w such that the conditions $F_w(c, s_2)$ and $B_w(c', s_3)$ hold, together with the conditions $s_1 \sqsubseteq c(i)$ and $s_1 \sqsubseteq c'(j)$, for some i and j.

Definition B.5 (Causal Screening Off) Token $s_1$ is screened off from $s_2$ by $s_3$, $\sigma(s_1, s_2, s_3)$, if and only if $s_3 \prec s_2$ and:

$\forall x\,(CC(x, s_1, s_2) \rightarrow Tr(s_3, x, s_2))$

In other words, $s_1$ is causally screened off from $s_2$ by $s_3$ just in case $s_3$ is prior to $s_2$ and $s_3$ stands athwart the trajectory from any common cause of $s_1$ and $s_2$ to $s_2$ itself.
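Definitions B.1 through B.5 can be made concrete on a finite causal graph. The sketch below is a simplification under stated assumptions, not the formal apparatus itself: a single world is fixed, tokens are atomic and named by strings, a mereological sum is represented as a set of tokens (so parthood becomes set membership), and the graph is assumed to be acyclic. The function names and the two example graphs are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

# token -> set of its immediately prior tokens; every token must appear as a key.
Parents = Dict[str, Set[str]]


def invert(parents: Parents) -> Parents:
    """Build the 'immediately posterior' map from the 'immediately prior' map."""
    children: Parents = {t: set() for t in parents}
    for token, priors in parents.items():
        for p in priors:
            children.setdefault(p, set()).add(token)
    return children


def chain(step: Parents, s: str) -> List[FrozenSet[str]]:
    """c(0) = {s}; c(i+1) = all tokens one step away from c(i) (Definitions B.1, B.2)."""
    layers = [frozenset([s])]
    for _ in range(len(step)):  # an acyclic graph has no path longer than its size
        nxt = frozenset(n for t in layers[-1] for n in step.get(t, ()))
        if not nxt:
            break
        layers.append(nxt)
    return layers


def in_chain(layers: List[FrozenSet[str]], s: str) -> bool:
    return any(s in layer for layer in layers)


def common_cause(parents: Parents, s1: str, s2: str, s3: str) -> bool:
    """CC(s1, s2, s3): s1 lies in the backward chains of both s2 and s3."""
    return in_chain(chain(parents, s2), s1) and in_chain(chain(parents, s3), s1)


def trajectory(parents: Parents, s1: str, s2: str, s3: str) -> bool:
    """Tr(s1, s2, s3): s1 lies in the forward chain of s2 and the backward chain of s3."""
    return (in_chain(chain(invert(parents), s2), s1)
            and in_chain(chain(parents, s3), s1))


def prior(parents: Parents, a: str, b: str) -> bool:
    """a is causally prior to b: a occurs in b's backward chain after step 0."""
    return in_chain(chain(parents, b)[1:], a)


def screened_off(parents: Parents, s1: str, s2: str, s3: str) -> bool:
    """sigma(s1, s2, s3), following Definition B.5."""
    return prior(parents, s3, s2) and all(
        trajectory(parents, s3, x, s2)
        for x in parents
        if common_cause(parents, x, s1, s2)
    )


# z -> a -> b and z -> c: a stands between the common cause z and b.
chain_graph: Parents = {"z": set(), "a": {"z"}, "c": {"z"}, "b": {"a"}}
# z -> b directly and a -> b, with a not downstream of z: a does not intercept z.
bypass_graph: Parents = {"z": set(), "a": set(), "c": {"z"}, "b": {"z", "a"}}

print(screened_off(chain_graph, "c", "b", "a"))
print(screened_off(bypass_graph, "c", "b", "a"))
```

Note that, as in Definition B.3, step 0 of a backward chain is the token itself, so a token counts as lying in its own chain. The sketch follows the definitions literally on this point rather than adding a stricter reading.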




Conditions on Hyperfinite Probability Functions

The system-of-spheres semantics corresponds to an interpretation of the $\Box\!\!\rightarrow$ conditional in terms of extreme probabilities:

When spheres A and B belong to a Lewis system S, and $A \subseteq B$, this can be interpreted as representing the situation in which the probability of A, given B, is infinitely close to 1. Tim Fernando (1998) has recently demonstrated that any model with a finitely additive hyperreal-valued probability function is elementarily equivalent to such a system-of-spheres model. There are four additional conditions that must be imposed upon the hyperreal probability function:

1. Miller's principle, a principle of higher-order probability enunciated by Brian Skyrms.
2. Markovian locality.
3. Reichenbach's rule.
4. Occam's razor.

B.4.1 Miller's Principle

Miller's principle requires that the first-order probability weights can be recovered from higher order probabilities through integration. In fact, I believe that facts about modality, including facts about normality and objective chance, are necessary. In this case, we should adopt the extension of S5 to conditional logic, holding both of the following axioms:

These S5-like axioms entail Miller's principle. Hence, if we wish to be very cautious, we can adopt Miller's principle as a minimal constraint on the relation between first-order and higher-order modalities. Miller's principle can be stated as follows:

Hypothesis B.1 (Miller's Principle) Let $[W]_\mu$ be the partition of W by probabilistic agreement, i.e., $w \sim_\mu w'$ iff $\forall w''\,(\mu_w(w'') = \mu_{w'}(w''))$. If $A \subseteq [W]_\mu$, then:



In the case of finite models, the consequent of Miller's principle can be represented as:

The following axioms, which I call "Skyrms's axioms," are the proof-theoretic analogue to Miller's principle. They are restricted forms of what is often called absorption.

where K is in each case a Boolean combination of T and $\Box\!\!\rightarrow$-formulas. The operator $\Diamond\!\!\rightarrow$ can be defined in terms of $\Box\!\!\rightarrow$ in the usual way:

Miller's principles have a number of other interesting implications for conditional logic. The following theorems of conditional logic exploit some of this power: Theorem B.I (Implications of Miller's Principle) Miller's principle ensures the validity of the following two axioms:

Proof: (a) By axiom S1, we have that $(\top \,\Box\!\!\rightarrow (\phi \,\Box\!\!\rightarrow \psi))$ is logically equivalent to $((\top \,\&\, \phi) \,\Box\!\!\rightarrow \psi)$. Since $\phi$ is logically equivalent (in classical logic) to $\top \,\&\, \phi$, we have the desired result. (b) By idempotence, we have:

By axiom S1, we derive the theorem. QED

The second theorem is very important, since it suggests that we can fruitfully define a kind of nonmonotonic consequence in terms of the logical properties of the $\Box\!\!\rightarrow$ conditional.

Definition B.6 (Nonmonotonic Consequence)



Given the probabilistic interpretation of $\Box\!\!\rightarrow$, this definition stipulates that a conclusion is nonmonotonically derivable from a set of premises just in case the probability of the conclusion, conditional on the conjunction of the premises, is infinitely close to one. We have, thus, a very well-motivated norm of nonmonotonic reasoning: defeasible inference is just a special case of Bayesian conditioning, for which we have Dutch-book arguments as justification. Furthermore, the consequence relation so defined is cumulative (in Gabbay's sense) and preferential (in the sense of Kraus, Lehmann, and Magidor). It obeys such principles as Cut, OR, and Cautious Monotonicity. The second theorem of the Skyrms system gives us a defeasible form of modus ponens:

In order to derive a particular conclusion nonmonotonically from a set of premises, it is sufficient to demonstrate that the premises are logically equivalent to the left side of this defeasible-MP schema. What is needed to effect this transformation are equivalence-preserving principles governing the strengthening of the antecedent of $\Box\!\!\rightarrow$ conditionals. For example, suppose we have the premises $\phi$, $\psi$, $(\phi \,\Box\!\!\rightarrow \chi)$. We can nonmonotonically derive the conclusion $\chi$ by defeasible MP just in case we can prove that $(\phi \,\Box\!\!\rightarrow \chi)$ is logically equivalent to $((\phi \,\&\, \psi) \,\Box\!\!\rightarrow \chi)$. This can be done if, for example, $\phi$ logically entails $\psi$. It can also be done if the premises logically entail $\phi \,\Box\!\!\rightarrow \psi$. However, to get interesting nonmonotonic consequences, we need much stronger rules for finding logically equivalent sets of propositions in which the antecedent of some conditional has been strengthened in the second set. The principles of Markovian locality and Occam's razor will give us such rules.
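Why strengthening the antecedent is not automatically equivalence-preserving can be seen in a small model of the sprinkler puzzle discussed earlier. The sketch below swaps in a standard stand-in for the hyperreal probability function: an integer ranking of worlds in the style of Spohn's ranking functions and Adams/Pearl epsilon-semantics, where a world of rank k is read as having probability on the order of the k-th power of an infinitesimal. The particular ranks are hypothetical, chosen only so that both defeasible rules of the puzzle hold.

```python
from itertools import product

ATOMS = ("sprinkler", "rain", "wet")
WORLDS = [dict(zip(ATOMS, values)) for values in product([False, True], repeat=3)]


def rank(w):
    """Hypothetical degree of surprise of a world (0 = entirely normal)."""
    if w["wet"] != (w["sprinkler"] or w["rain"]):
        return 5  # wet without any cause, or dry despite a cause
    # Assumption: a running sprinkler is more surprising than rain.
    return 2 * w["sprinkler"] + 1 * w["rain"]


def kappa(prop):
    """Rank of a proposition: the rank of its least surprising world."""
    ranks = [rank(w) for w in WORLDS if prop(w)]
    return min(ranks) if ranks else float("inf")


def normally(antecedent, consequent):
    """antecedent []-> consequent: P(consequent | antecedent) is infinitely close to 1.

    Holds when the best antecedent-and-consequent world is strictly less
    surprising than the best antecedent-and-not-consequent world.
    """
    return (kappa(lambda w: antecedent(w) and consequent(w))
            < kappa(lambda w: antecedent(w) and not consequent(w)))


def sprinkler(w):
    return w["sprinkler"]


def rain(w):
    return w["rain"]


def wet(w):
    return w["wet"]


print(normally(sprinkler, wet))   # the first defeasible rule
print(normally(wet, rain))        # the second defeasible rule
print(normally(sprinkler, rain))  # the unwanted chained conclusion
print(normally(lambda w: wet(w) and sprinkler(w), rain))  # strengthened antecedent
```

Both rules hold in this model, yet neither the chained conclusion nor the rule with the strengthened antecedent does. A principled account of when an antecedent may be strengthened is therefore needed, which is the role the text assigns to Markovian locality and Occam's razor.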


Markovian Locality

In the chapter on the indeterministic model of causation, I already introduced the principle of probabilistic locality. This principle requires that the probability of the actuality of a given token is independent of the actuality of any nonposterior token, given the actuality of its immediate cause. (In this definition, I make use of the mereological sum operator, $\hat{x}\phi$, the mereological sum of all the tokens supporting type $\phi$.) I will also make use of the symbol $\succeq$, causal non-posteriority, defined as follows:

Definition B.7 (Causal Non-Posteriority) […]

Hypothesis B.2 (Probabilistic Markovian Locality) If […] and $s_1$, $s_3$, and $s_4$ are compossible, then, for any world w, […]

Hypothesis B.3 (Reichenbach's Rule) If $s_1$ and $s_2$ have no common cause, and neither is prior (even in part) to the other, and $s_3$ is not posterior to both $s_1$ and $s_2$, then for any world w, $Pr_w(As_1 \,\&\, As_2 / As_3) = Pr_w(As_1 / As_3) \times Pr_w(As_2 / As_3)$.
Theorem B.2 (From Causal to Probabilistic Screening Off) If $\sigma(s_1, s_2, s_3)$ and […], then $Pr_w(As_2 / As_1 \,\&\, As_3 \,\&\, As_4) = Pr_w(As_2 / As_3 \,\&\, As_4)$.

Theorem B.3 (Soundness of Modular Inferences) The following are logically equivalent, given probabilistic (Markovian) locality and Reichenbach's rule (where $\Phi$ is any modally closed formula):¹

Proofs of these theorems appear in section B.7.


Occam's Razor

Inductive inference is inference to the simplest, most economical explanation. This preference for simple hypotheses reflects a logical requirement on probability functions, one that incorporates Occam's razor: a rational probability function gives infinitely greater probability to propositions that entail fewer tokens, fewer causal connections, or fewer types classifying a given structure. The requirement of Occam's razor can be formalized in three steps. First, we must define a partial ordering on worlds:

A world w is weakly preferred to a world w' just in case there is a structure-preserving homomorphism from the parts of w into the parts of w'. Strict preference, $\prec$, is defined in terms of weak preference in the usual way. Next, we extend this partial ordering to sets of worlds:

A set A is weakly preferred to set B just in case for every member w of B there is a member w' of A such that w' is weakly preferred to w.

Definition B.8 (Occam's Razor) A model satisfies Occam's razor if and only if $\forall w \in W\, \forall A\, \forall B\, (A \prec B \;\&\; B \subseteq \mathrm{dom}$ […]

¹ A formula is modally closed just in case every occurrence of an actuality type At occurs within the context of a modal operator.



Causal minimization means that we can assume that where our premises are silent about the existence of causal connection or the supporting of a type by a token, the causal connection does not exist, and the type is not supported. This maximizes the extension of the screening-off relation and thereby maximizes the nonmonotonic consequences that are legitimated.

Theorem B.4 (Soundness of Causal Projection) If every maximally preferred world in any model that verifies a modally closed formula $\Phi$ also verifies $\sigma(s_1, s_2, s_3)$ in that same model, then the following inference is nonmonotonically correct (in the class of models satisfying Occam's Razor, Miller's principle, Reichenbach's rule, and Markov locality):

This theorem provides a paradigm for causal projection. First, we use Occam's rule to minimize the causal connections between situation-tokens, maximizing the extension of the screening-off relation. We then apply Markov locality and Reichenbach's rule to strengthen the antecedents of the nonmonotonic conditionals in our premise sets, until the antecedents contain every premise that is not modally closed. Finally, we use Skyrms's axioms to derive a nonmonotonic conditional whose antecedent contains all of the premises and whose conclusion contains the desired conclusion. This demonstrates that the inference is nonmonotonically correct, that the probability of the conclusion, conditional on the premises, is infinitely close to one.
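The minimization step can be illustrated with a brute-force check of the preference ordering on very small worlds. This is only a sketch under assumptions that go beyond the text: a world is represented as a finite set of tokens, each labelled with the types it supports, plus a set of cause-effect pairs; the structure-preserving map is assumed to be injective (so that a world with fewer tokens comes out strictly preferred); and the two example worlds, loosely modelled on the Yale Shooting Problem with and without bodyguards, are hypothetical.

```python
from itertools import permutations
from typing import Dict, FrozenSet, Iterable, NamedTuple, Tuple


class World(NamedTuple):
    types: Dict[str, FrozenSet[str]]      # token -> types it supports
    causes: FrozenSet[Tuple[str, str]]    # (cause, effect) pairs


def embeds(w: World, v: World) -> bool:
    """True if some injective map of w's tokens into v's tokens preserves
    every supported type and every causal connection (assumed reading of
    'structure-preserving')."""
    source = list(w.types)
    for image in permutations(v.types, len(source)):
        h = dict(zip(source, image))
        types_ok = all(w.types[t] <= v.types[h[t]] for t in source)
        causes_ok = all((h[a], h[b]) in v.causes for a, b in w.causes)
        if types_ok and causes_ok:
            return True
    return False


def weakly_preferred(w: World, v: World) -> bool:
    """w is weakly preferred to v: w's structure fits inside v's."""
    return embeds(w, v)


def strictly_preferred(w: World, v: World) -> bool:
    return embeds(w, v) and not embeds(v, w)


def set_weakly_preferred(A: Iterable[World], B: Iterable[World]) -> bool:
    """A is weakly preferred to B: every member of B is matched by some
    member of A that is weakly preferred to it."""
    A = list(A)
    return all(any(weakly_preferred(a, b) for a in A) for b in B)


# Hypothetical worlds: the plain shooting story, and one with a bodyguard token Z.
lean = World(
    types={"L": frozenset({"loaded"}), "S": frozenset({"shooting"}),
           "D": frozenset({"dead"})},
    causes=frozenset({("L", "D"), ("S", "D")}),
)
guarded = World(
    types={"L": frozenset({"loaded"}), "S": frozenset({"shooting"}),
           "D": frozenset({"dead"}), "Z": frozenset({"bodyguard"})},
    causes=frozenset({("L", "D"), ("S", "D"), ("Z", "L")}),
)

print(strictly_preferred(lean, guarded))
print(set_weakly_preferred([lean], [guarded]))
```

On this reading, the world without the extra bodyguard token and the extra causal link is strictly preferred, which matches the informal claim that, when the premises are silent about a causal connection, it is assumed not to exist.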


The Yale Shoo