RAMSEY’S LEGACY
MIND ASSOCIATION OCCASIONAL SERIES
PREFACE
This volume contains revised versions of ten of the thirteen original papers
given and discussed at the Frank Ramsey Centenary Conference, held in
Newnham College, Cambridge, from 30 June to 2 July 2003. This conference,
organised by the editors, was the first of three conferences held to
commemorate the centenary year of Ramsey’s birth. (The other two were
held later in the year in Paris and Vienna.) The Cambridge conference was
generously supported by the Mind Association, the Analysis Trust, the
Aristotelian Society, the British Society for the Philosophy of Science, and
the Faculty of Philosophy, and the Centre for Research in the Arts, Social
Sciences and Humanities, of the University of Cambridge. We are grateful to
all these bodies, and especially to the contributors, and everyone who
attended and helped us to organise the conference, who made it, we think,
not unworthy of its subject.
H.L.
D.H.M.
Cambridge
February 2005
CONTENTS
Notes on the Contributors
Introduction
HALLVARD LILLEHAMMER
Success Semantics
SIMON BLACKBURN
Ramsey on Universals
FRASER MACBRIDE
References
Index
NOTES ON THE CONTRIBUTORS
SIMON BLACKBURN is the Professor of Philosophy at the University of
Cambridge, and a Fellow of the British Academy. His books include
Spreading the Word, The Oxford Dictionary of Philosophy, Ruling Passions, and
Think, and he has contributed widely to issues in the philosophy of
language, epistemology, and metaphysics.
Frank Ramsey’s brief publishing career lasted for eight years from 1922 to
his death in 1930 at the age of 26.1 During this time Ramsey produced
ground-breaking work in philosophy as well as mathematics and economics.
The chapters in the present volume testify to the lasting significance of
Ramsey’s work in each of these disciplines, with an emphasis on Ramsey’s
ideas in the core philosophical areas of logic, metaphysics, and the
philosophy of mind.
1 For details on Ramsey’s life and work, see the introductions to Ramsey (1978) and
(1990), and Sahlin (1990). For electronic resources, see
<http://www.fil.lu.se/sahlin/ramsey/> and <http://www.dspace.cam.ac.uk/handle/1810/3484>. Ramsey’s writings,
including work unpublished in his lifetime, are collected in Ramsey (1931, 1978, 1990, 1991).
the collective truth of which guarantees the success of his action. Thus, my
action of raising the glass puts me in a position to know that the glass was
not glued to the table while I was acting. The action of raising the glass is
itself a source of knowledge about the absence of any obstacles to the action
of raising the glass. Most of the relevant beliefs comprising this knowledge
will not be explicitly formed. Nevertheless, were they to be explicitly formed
they would be justified by the experience of acting. According to Dokic and
Engel, this avoids the apparent difficulty. On their view, Ramsey’s Principle
should be understood as a claim about beliefs accessible to the agent,
whether he holds them explicitly or not. It is this set of mainly implicit
beliefs the truth of which entails the success of the agent’s action.
Blackburn agrees with Dokic and Engel that true beliefs contribute to the
success of an indefinite number of actions. But he rejects their claim that
true beliefs are a guarantee of success when agents act. According to
Blackburn, success in action is also sensitive to the agent’s environment and
his other mental states. However, while he rejects Ramsey’s Principle as
interpreted by Dokic and Engel, Blackburn does not reject success
semantics. Instead, he puts forward a compositional version of success
semantics, according to which a representational feature of a representing
vehicle, such as a name or a predicate, refers to some entity if and only if
actual and possible actions based upon that vehicle are typically successful,
when they are, at least partly because of something about that entity.
Blackburn calls this the Fundamental Schema. According to Blackburn, one
advantage of the Fundamental Schema is that it retains the notion of success
in action as fundamental to the theory of representation without thereby
implying that true beliefs guarantee success in action.
While avoiding the problem addressed by Dokic and Engel, Blackburn’s
account runs into other difficulties. Perhaps the most serious objection is
that success semantics can easily appear to presuppose what it tries to
explain (see Papineau 1993). The explanation of the content of beliefs in
terms of successful action is vulnerable to the charge that we can only
understand what successful action is if we have an antecedent grasp of the
satisfaction conditions of desires. Yet desire satisfaction is as much a
representational notion as truth is. Blackburn’s response to this objection is
that if the Fundamental Schema were intended as a reductive account of all
representation, then its reliance on the representational character of desire
would indeed refute it. However, the schema can also be thought of in a less
ambitious way. According to Blackburn, for any representing vehicle (e.g. a
belief), the Fundamental Schema can be used to account for its
representational power. In applying the schema, the representational power
of other representing vehicles (e.g. desires) may need to be presupposed.
But the representational power of these vehicles can then be explained by
reapplying the Fundamental Schema. Thus, by means of a diachronic
application of the schema to an agent’s overall mental economy, a plausible
picture of what the agent believes and desires will eventually emerge. While
Blackburn admits that his epistemological solution to the problem leaves
other, metaphysical, problems unanswered, he suggests that once the main
problem is seen in its proper light the remaining metaphysical worries begin
to disappear.
Another influential and intriguing claim from ‘Facts and Propositions’ is
that ‘It is, perhaps, also immediately obvious that if we have analysed
judgement we have solved the problem of truth’ (p. 39). According to
Ramsey, there is no separate problem of truth, but merely a ‘linguistic
muddle’. Locutions involving the concept of truth do not introduce a new
subject matter, but merely allow us to reaffirm what is being said or thought.
Edgington’s chapter confronts this ‘minimalist’ account of truth with
another of Ramsey’s influential claims, namely that indicative conditionals
do not express genuine propositions. Taken at face value these claims
conflict, because it is clearly possible both to affirm and to reject an
indicative conditional such as ‘If I eat some pie, I will get a bellyache’.
Drawing parallels with the later work of Quine and Lewis, Edgington argues
that Ramsey’s view is entirely consistent on this point. While Ramsey was a
minimalist about truth, he was not a minimalist about what it takes to be a
bearer of truth. While, in a loose sense, indicative conditionals can be
affirmed and denied, they cannot strictly speaking be negated. Disagreement
about an indicative conditional is not disagreement about whether that
conditional or its negation is true, but rather about the non-equivalent
question of the conditional probability of the consequent given the
antecedent. In her chapter Edgington traces the subsequent history of the
analysis of indicative conditionals after Ramsey. She also draws attention to
a number of features of conditionals as construed by Ramsey that support
the claim that they do not express genuine propositions.
1 For the reasons why it is only a quasi-redundancy view, see Dokic and Engel (2002:
23).
Ramsey’s Principle Resituated 9
results of action, which change according to the desire (or set of desires)
involved. They are to be identified with the invariant conditions in the world
that guarantee success whatever goal is pursued. According to Ramsey’s
Principle, these conditions are nothing but the state of affairs corresponding
to the belief or, more simply and less emphatically, the belief’s truth
conditions. Typically, the truth conditions which RP promises to derive
from the conditions of success of actions are not those of our actual actions,
but they are the truth conditions of the beliefs which would lead to actions.
(This disposes of the familiar objection to pragmatism that a number of our
beliefs which are actually useful in such or such circumstances turn out to
be false.)
Second, RP, as stated above, applies to full beliefs, those which we are
disposed to judge as true or false, period. A common mistake consists in
supposing that it applies to partial beliefs, or to our subjective degrees of
belief, as well. But if it did apply to these, RP would immediately turn out
to be false, for the degree of our belief cannot guarantee the success of
action. Suppose, for instance, that the degree of my belief that it will rain
tomorrow is only 0.5. It can combine with a desire not to risk a wet picnic,
which in turn can cause me to stay at home tomorrow. But we cannot say
that it is part of the success condition of my staying at home that either it
will rain tomorrow or not (Whyte 1990: 156). Could we say, however, that a
sufficiently high degree of belief (say 0.6) could guarantee the success of our
actions? Couldn’t we say that beliefs are more likely to be true when they
lead to success more often than not, or even typically? Certainly such a
relation seems plausible, given RP. But a high degree of partial belief does
not automatically warrant the success of all the actions to which it leads.
This may seem to be a threat to the correctness of RP, since most of our
beliefs are partial ones, even if we do not hold them consciously. After all,
did not Ramsey himself famously say that ‘all our lives we are in a sense
betting’ (Ramsey 1926c: 79)? But this kind of objection rests upon a
misunderstanding of Ramsey’s Principle. In order to assign any degree to a
belief, one must be able to give a content to that belief, and RP tells us that
this belief’s content or truth conditions are those which suffice for the
success of the actions to which it would lead if it were a full belief. In this sense
RP is presupposed by decision theory when it assigns degrees to our beliefs.
The assignment of content to our beliefs through their success conditions is
thus more fundamental than the assignment of degrees to these through the
actions that we perform.2
2 Although we cannot argue for this here, this feature is closely related to the fact that
the step of practical reasoning leading to action—what Searle (1983) calls the ‘intention in
action’—is made through categorical judgements. See below, however, about the connections
between knowledge and action.
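The rain example above can be made concrete with a small expected-utility calculation. What follows is only an illustrative sketch: the utility values are assumptions introduced here, not figures given in the text, and b simply names the degree of belief that it will rain.

```latex
% Illustrative only: the utilities u(.) are assumed values, not from the text.
% b = degree of belief that it will rain tomorrow (0.5 in the example).
% Requires the amsmath package.
\begin{align*}
  b &= 0.5,\\
  u(\text{picnic},\text{rain}) &= -10, \quad
  u(\text{picnic},\text{no rain}) = 5, \quad
  u(\text{stay home}) = 0,\\
  EU(\text{picnic}) &= b\cdot(-10) + (1-b)\cdot 5 = -2.5,\\
  EU(\text{stay home}) &= 0 > EU(\text{picnic}).
\end{align*}
```

On these assumed numbers the partial belief rationalises staying at home, yet nothing in the calculation picks out a state of the world whose obtaining would guarantee that staying at home succeeds. The content ‘it will rain’ has to be in place before the degree 0.5 can be attached to it.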
10 Jérôme Dokic & Pascal Engel
It is now customary to call ‘success semantics’ the philosophical project
of deriving truth conditions from success conditions. According to many
writers, Ramsey’s Principle should be supplemented by a teleological
account of our beliefs and desires. Success semantics, they claim, is
necessarily a ‘teleosemantics’, for the contents of our beliefs (and desires)
are determined, at least in part, by their biological functions, including
adaptative ones.3 RP seems to fit quite well in the teleosemantical picture.
However, there is some controversy about the nature of the relationship
between Ramsey’s Principle and teleology. There are at least four options:4
(a) The contents of beliefs and desires are directly defined by their
biological functions or purposes.
(b) Teleological considerations are relevant for explaining the normal
functioning of the formation mechanisms of beliefs and desires, in
particular their causal roles in the production of action.
(c) Ramsey’s Principle needs a teleological definition of the satisfaction
conditions of desires, from which it can derive a definition of the
truth conditions of beliefs.
(d) Ramsey’s Principle, in its absolute version, is in fact false. Truth
guarantees success only in a normal context, and teleological
considerations are needed to define what a normal context is
(relative to the organism).
The first option has been defended by Papineau (1987, 1993). According
to him, beliefs have the biological functions of leading to success when they
are true. This is what he calls their primary purposes, to be distinguished
from their secondary purposes. For instance, the belief that one is not going
to be injured in the ensuing conflict, though false, has the secondary
purpose of getting people to fight effectively. Success semantics comes into
the picture precisely to isolate primary purposes, since it equates the truth
conditions of beliefs specifically with the conditions under which beliefs
contribute to the satisfaction of desires.
According to defenders of option (b), beliefs do not have biological
functions from which one could directly read off their contents. For
instance, Millikan argues that the content of a representation does not rest
‘on the function of the representation or of the consumer, on what these
do’. There is no such thing ‘as behaving like a representation of X or as
being treated like a representation of X’ (1993: 89). Millikan introduces a
distinction between the production and the consumption of representations in a
cognitive system. She deplores, rightly to our mind, that most theories of
3 Sahlin (1990: 72) mentions this point about the famous chicken–caterpillar example
given by Ramsey in ‘Facts and Propositions’. Defenders of teleosemantics include Millikan
(1984), Dretske (1988), McGinn (1989), Papineau (1987, 1993), Jacob (1997).
4 We do not exclude that some of these options can be combined.
representation almost exclusively focus on the production conditions of
representations to the detriment of their consumption conditions. This is
particularly true of ‘informational’ or ‘causal covariance’ theories, which try
to define the content of a representation by reference to what causes the
representation.5 In contrast, according to Millikan, the content of a
representation is entirely fixed by the ways it is used in the cognitive system
to which it belongs. Of course, one can invoke teleological considerations in
order to deal with the conditions under which the representation is
produced. For instance, one can suppose that one of the functions of the
visual system is to produce representations that accord with reality, in other
words, veridical representations. This function of the visual system, though,
does not enter the definition of the content of a particular visual
representation, which is determined by the way it is consumed, eventually by
the kinds of behavioural control it can exert.
Millikan nonetheless claims that other teleological considerations are
relevant to defining the contents of our beliefs. The consumer part of a
cognitive system has a biological function which has been selected for by
evolution. According to Millikan, it is not directly the function of the
consumer part which determines the content of a belief, for the use of a
given belief can have an indefinite number of results, depending on the
subject’s context and other propositional attitudes. Rather, the content of a
belief is determined by the ‘Normal conditions’ of functioning of the
consumer part. The phrase ‘Normal conditions’ is a term of art in Millikan’s
account. The conditions under which a system is functioning ‘Normally’ are
not necessarily those in which the subject is most often (this would
correspond to the statistical sense of ‘normally’), but (roughly) those in
which it exerts the function which it or its ancestors have been selected for
in the past.
Options (c) and (d) constitute quite different arguments for teleosemantics.
According to (c), Ramsey’s Principle cannot get off the ground
without an independent account of the satisfaction conditions of desires. RP
defines the contents of beliefs in terms of the satisfaction conditions of the
underlying motivating desires, i.e. of their contents. The partnership
between RP and teleology should be understood as follows: success
semantics is a theory of the contents of beliefs, and teleosemantics is an
account of the contents of desires.
According to (d), RP should be rejected in its absolute form. Truth does
not guarantee success in every situation; at best, truth leads to success in a
normal environment. RP should then be relativised to the context. Now, an
5 Roughly, informational or causal covariance theories define the content of a token
representation by reference to the information it carries, or the information that any token of
that type has the function of carrying. Such information is defined in its turn by the laws,
most often causal, which link the referent to the production of a token representation. See
Fodor (1990a); Jacob (1997).
environment is ‘normal’ only if the agent has been adapted to it. The notion
of adaptation is teleological, which means that success semantics must also
be a teleosemantics.
We cannot consider all these options here. In Dokic and Engel (2002) we
argue, against (c), that one cannot have an independent account of the
satisfaction conditions of desires. Moreover, it could be argued (Whyte
1993) against Papineau’s version that his distinction between normal or
primary purposes of beliefs has the effect of making teleology redundant.
RP just explains truth conditions in terms of the fulfilment of desires. But
adding that desires must bring about ends which are favoured by natural
selection adds nothing. Our argument here will be targeted specifically at
(d): we shall claim that when the notion of adaptation is well understood,
there is no need to relativize Ramsey’s Principle to circumstances. This
leaves us with options (a) and (b). We cannot go into details here, but let us
remark that, on either account, there is a sense in which teleological
considerations play only a ‘pre-semantic’ role. Perry (1993) introduces the
distinction between semantic and pre-semantic uses of context. In the case
of the interpretation of utterances, context is used pre-semantically in order
to determine the language, the words, and the linguistic meaning. For
instance, the considerations that make a given proper name, say ‘Émile
Ajar’, connected to a particular man, in the case in point Romain Gary, play
a pre-semantic role according to Perry. In general, the considerations
operating at the pre-semantic level do not have to enter the definition of the
propositional contents of utterances, which is a semantic matter. Thus, it is
not part of the meaning of the proper name ‘Émile Ajar’ that it has been
introduced as a guise by Romain Gary. Similarly, although teleological
considerations are relevant to determine the contents of our beliefs and
desires, they play a pre-semantic role. Given an organism with beliefs and
desires, i.e. given that their normal causal roles in an organism are in place,
RP can be used to derive their truth and satisfaction conditions. Teleological
factors are not part of the contents of our beliefs and desires. The success
conditions of an action are facts which are coeval with the action; typically,
in themselves they have nothing to do with the historical conditions in
which our cognitive system has evolved. In this respect our defence of RP
does not commit us to a form of naturalistic account of content (although
we do not claim that it is incompatible with such an account).
Perhaps the divergence between our understanding of success semantics
and the various versions of teleosemantics can be formulated thus. Both
success semantics and teleosemantics give sense to the familiar claim that
‘truth is the aim of belief’: truth is what our beliefs are directed to if our
actions are to succeed. In this sense, both make room for the idea that truth
is in some sense ‘normative’ for belief formation. According to Papineau, a
teleosemanticist can perfectly account for this normative feature by arguing
that the general fact that we value true beliefs simply flows from the very
connection that success semantics postulates between true beliefs and the
satisfaction of desires:
If you act appropriately on true beliefs, then your actions are guaranteed to satisfy
your desires, and indeed . . . this pragmatic connection [is] a crucial component in
the analysis of truth conditional content . . . And this pragmatic connection does
mean that there is always a species of derived personal value to truth in beliefs that
are relevant to action, for such truth will always help you to find a way to satisfy
whatever desires you have. (1999: 26)
Now Papineau emphasises the fact that, on this view, truth, as a value or
a norm, is the external aim or goal of belief. On our view, however, Ramsey’s
principle flows from an internal relation between the truth of beliefs and
their success, and truth is the internal ‘aim of belief’.6 When an agent acts,
and when his action is successful, the very fact that it is so implies that his
beliefs are true. We could stress the contrast by saying that for the
teleosemanticist truth is the distal, or external, aim of belief, whereas for success
semantics as we conceive of it, truth is the proximal, or direct, aim.7
6 On this internal reading of the aim of beliefs, as opposed to the teleological external
one, see Engel (2004). The fact that success semantics allows us to account for the
truth-directedness of belief makes room for the normative dimension of belief. This point is also
emphasised, although in a different way, by Simon Blackburn in his contribution to this
volume.
7 In this sense, there is some truth to Horwich’s criticism of a principle which is close to
RP (Horwich 1998), which he claims to be trivial and merely a logical consequence of the
fact that our actions presuppose true instrumental beliefs. According to Horwich, his own
minimalist conception of truth has no difficulty in explaining the desirability of truth, for any
instrumental belief of the form ‘If I do A, then I will get result R’ will, if the action is
successful, be true. Call such a belief D. Then an agent who wants to satisfy a desire to get R
will want it to be the case that If I believe that D, then D, which, by the familiar equivalence
principle, is equivalent to If I believe that D, then it is true that D. According to Horwich,
generalising leads us to conclude that All our directly action-guiding (instrumental) beliefs are true.
Horwich concludes that there is no need to postulate an external and intrinsic goal of truth.
We agree, but it does not make RP and success semantics trivial for that. On the contrary, we
say that it allows for a substantial link between belief (and knowledge) and action.
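Horwich’s reasoning, as reported in this note, can be laid out step by step. This is a schematic reconstruction only: the symbols B (‘the agent believes that’) and T (‘it is true that’) are notational assumptions introduced here, not Horwich’s own.

```latex
% Schematic reconstruction; B and T are notation assumed for illustration.
% Requires the amsmath package.
\begin{align*}
  &D := \text{`If I do $A$, then I will get result $R$'}
     && \text{(instrumental belief)}\\
  &B(D) \rightarrow D
     && \text{(what an agent who desires $R$ wants to be the case)}\\
  &D \leftrightarrow T(D)
     && \text{(equivalence principle)}\\
  &B(D) \rightarrow T(D)
     && \text{(from the two preceding lines)}\\
  &\forall D\,\bigl(B(D) \rightarrow T(D)\bigr)
     && \text{(generalising over directly action-guiding beliefs)}
\end{align*}
```

The last line states the content of what the agent wants to be the case. It is not a fact established by the derivation.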
It is of course plausible that some failures can be traced to false beliefs. I
try to drink from a particular glass because I believe that it contains
something that will quench my thirst. If my belief is false and the glass is
empty, I won’t get what I want. However, it is much less plausible, from a
cognitive point of view, to suppose that any possible failure of an action
corresponds to some false belief or representation on the agent’s part.
Robert Brandom remarks that ‘ignorance is no less a threat than error to the
positive guarantee of practical success that [Ramsey’s Principle] seeks to
identify with truth’ (1994: 175–6). Suppose that I do not get what I want
because the glass is glued to the table. According to Ramsey’s Principle, it
seems that I should have the belief that the glass is not glued to the table,
whose falsity explains the failure of my action. However, the fact that I tried
to raise my glass shows at best that I did not have the positive belief that it
was glued to the table, but it in no way indicates my having the negative
belief needed to vindicate Ramsey’s Principle, namely the belief that it was
not glued to the table. In general, there is no guarantee that, in every
particular case of action, there is a plausible cognitive level intermediary
between a general but trivial belief that there are ‘no impediments’ and a
non-denumerable set of beliefs corresponding to each possible failure of the
action.
In the same vein, John Perry contends that Ramsey’s Principle in its
absolute form amounts to ‘overburdening’ belief. He writes:
let us first note how unrealistic it would be to suppose that the content of beliefs fix
all of the circumstances relevant to the success of our action. Consider the force of
gravitation. If I am in space or on the moon or in some other situation where
gravitational forces are much diminished, the movement we envisage me making in
the example will not lead to getting a drink; the water would fly out of the glass all
over my face—or perhaps I would not even grab the glass, but instead propel
myself backwards. If all possible failures are to be accounted for by false beliefs, the
corresponding true beliefs must be present when we succeed. So, when I reach for
the glass, I must believe that the forces of gravity are just what they need to be for
things to work out right. (1993: 202)
According to Perry, the gap between action and success cannot be bridged
by the agent’s cognitive state only (i.e. the set of her beliefs). At best, the
truth of a belief guarantees the success of an action only relative to a normal
context (for instance, on earth), whose identity conditions need not be
known by the agent.
Of course, Ramsey himself would not be much impressed by Brandom’s
and Perry’s objections from situated cognition. If Ramsey’s Principle is
relativised to circumstances, it becomes false by definition; any reference to a
normal context should be blindly included in the belief’s truth conditions.
However, even if this response is (as we think) correct, it does not go far
enough. Brandom and Perry make appeal to our pre-theoretical intuitions
about the contents of our beliefs. They argue that Ramsey’s Principle
delivers truth conditions which are at odds with these intuitions. The
principle would be strengthened if we could show that it is in fact
compatible with them.
10 Williams (1991, ch. 8). In fact, Williams’s target is a different version of the Principle
of Epistemic Closure, according to which if someone knows that p, and knows that p implies
q, then she knows that q. PEC in the text is stronger than this principle, on two counts: it
takes into account a larger set of alternatives (namely all alternatives incompatible with one’s
knowing, which includes but is not restricted to the set of alternatives incompatible with
what is known), and it does not require that the subject know that the alternatives are
incompatible with her putative knowledge.
Williams rejects the KK Principle (the principle that if one knows, one knows that one
knows), which is a consequence of PEC. We cannot go into the discussion of this principle
here. See in particular Williamson (2000). Actually Williams intends to defend a contextualist
conception of knowledge, whereas our view about knowledge and action here is
non-contextualist.
Perhaps PEC should be modified to block the possibility of bootstrapping oneself into
knowing that one knows. However, the principle that if someone knows that p, and knows
that p implies q, then she knows that q, is too weak, for it neglects the possibility of reflective
knowledge, such as the knowledge that I am not hallucinating based on my perceptual
experience. If the neutralist conception of experience is rejected, it can be argued that my
perception that p, which is essentially factive, is accessible to reflection or introspection, and
thus can indirectly justify the belief that I am not hallucinating.
consideration is only that if the subject were to form the corresponding
beliefs, they would be justified by the very same experience which justifies
her actual belief that it is raining. Second, the justification of the former
beliefs need not be as direct as the justification of the latter belief. The
exclusion of an alternative to the subject’s claim to knowing can be indirectly
justified on the basis of her perceptual experience. Indirect justification can
be inferential or reflective. One can gain knowledge that this is not fake rain
by inferring it from one’s perceptual knowledge that it is raining. More
controversially, one can gain knowledge that one is not dreaming by reflecting
on one’s experience with a non-sceptical attitude.
On such a view, PEC is essentially correct, but it needs a less misleading
formulation in terms of implicit knowledge:
(PEC*) If I know that p, and q implies that I do not, I at least implicitly
know that q is not the case.
In general, I have at least implicit knowledge that p if and only if I am in a
position to acquire such knowledge, whether or not I exercise the inferential
and reflective capacities needed actually to know that p.11
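Purely as a reading aid, (PEC*) and the accompanying gloss on implicit knowledge might be written schematically as follows. The operators K (explicit knowledge), K^i (at least implicit knowledge) and Pos (being in a position to) are notation assumed here, not the authors’ own, and the inner arrow is to be read as implication in whatever sense the principle itself intends.

```latex
% Notation assumed for illustration: K, K^{i}, \mathit{Pos} are not from the text.
% Requires the amsmath package.
\begin{align*}
  \text{(PEC*)}\quad
    & \bigl(K p \land (q \rightarrow \lnot K p)\bigr) \rightarrow K^{i}\lnot q\\
  \text{(implicit knowledge)}\quad
    & K^{i} p \leftrightarrow \mathit{Pos}(K p)
\end{align*}
```

Read this way, (PEC*) requires only that the subject be in a position to rule out q, not that she has actually done so.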
It seems to us that (PEC*) is consonant with Ramsey’s famous account
of knowledge, when he says: ‘We say “I know”, however, whenever we are
certain, without reflecting on reliability. But if we did reflect, then we should
be certain if, and only if, we thought our way reliable’ (1929b: 110). Here
Ramsey rejects explicitly the condition (known as the ‘KK Principle’) that in
order to know that p one needs to know that one knows that p. In
Williams’s terminology the reliability conditions for a given item of
knowledge do not number among the entailments of what is known (1991:
347).12 (PEC*) does not imply the KK Principle.
The temporal interpretation of PEC is naturally associated with a neutralist
conception of perceptual experience. According to this conception,
perception is not a genuine source of objective knowledge. The best that I
can learn from my experience of looking out of the window is that it seems to
be raining. Perceptual experience is neutral with respect to the truth of the
objective beliefs that are normally grounded on it, such as the belief that it is
raining. Whether or not my belief is true, my experience remains in essence
the same.
In contrast, the logical interpretation of PEC is naturally associated with
the rejection of the neutralist conception of perception, more precisely with
what is sometimes called a ‘disjunctive’ theory of experience.13 When my
11 Compare Williamson’s (2000: 128) remarks about the distinction between knowing and
being in a position to know (neither of which, according to him, implies the KK Principle).
12 See Dokic and Engel (2002: 29).
13 Hinton (1973); McDowell (1982). However, see Williamson (2000, ch. 1) for doubts
about some versions of disjunctive theories. What is important for our purpose is the
rejection of so-called ‘conjunctive’ theories, such as the neutralist conception of experience.
perceptual experience is veridical, the perceived fact that p manifests itself to
me, so that the proposition ‘It seems to me that p’ is not the most precise
characterisation of what is going on in my cognitive space. There is a real,
cognitive distinction between a situation in which a fact manifests itself to
me in perception, and a situation in which I am only under the impression
that this is so. As a consequence, a transition from my experience of looking
out of the window to a belief that I am not hallucinating would be
warranted. In the terminology of Burge (1993), I am entitled to make such a
transition, given that the occurrence of the experience implies the truth of
the belief.
We are aware that much more needs to be said about epistemic principles
of closure. However, our aim in this chapter is not to defend a detailed
epistemological outlook, but to point out an analogy between knowledge
and action. The analogy we are interested in is between PEC and the
following Principle of Pragmatic Closure:
(PPC) If I am intentionally doing p, and q implies that I am not, I know
that q is not the case.
Here the phrase ‘doing p’ is used to imply success: just like knowing that p,
doing p implies p. So q can be any alternative to the success of the action of
doing p.14 PPC is not exactly analogous to PEC, for it does not state that in
order to do p, I must do whatever is necessary to lift any obstacle to my
making it the case that p. This would be utterly implausible, leading to
permanent procrastination. PPC is not so obviously wrong. It states that if q
implies the failure of my action of doing p, I must know that q is false. PPC
is in fact a stronger version of Ramsey’s Principle, according to which the
beliefs underlying a particular action should amount to knowledge, or at
least should be sufficiently warranted. It is not enough that the agent holds
the beliefs whose collective truth guarantees success; the action counts as
intentional only if these beliefs are themselves epistemically well grounded.
PPC is consonant with the spirit of Williamson’s (2000) claim that the
place of belief and desire in the economy of mental life depends on their
connection with knowledge and action and with the idea that knowledge is
prior to belief in the understanding of action. Belief emerges only when
mind is maladapted to world, just as desire emerges only when world is
maladapted to mind. On this view, PPC is not just a variant of Ramsey’s
Principle; on the contrary, the versions of Ramsey’s Principle formulated in
terms of belief and desire are derived from the more fundamental PPC.
15 As Bermúdez (1998: 118) rightly says, ‘To say that affordances are directly perceived is
precisely to say that instrumental relations can feature in the content of perception.’
Can all the beliefs underlying an action be implicit? The answer might be
positive for spontaneous actions, if they exist. Searle pointed out that there are
actions which are not caused by any prior intentions, such as the
spontaneous action of pacing about the room while reflecting on a
philosophical problem (Searle 1983: 84). If these actions are genuinely
intentional, they must be able to ground a set of beliefs whose collective
truth guarantees success. However, none of these beliefs need to be formed
before acting.
The distinction between a neutralist and a disjunctive account of
perceptual experience has an analogue in the action case. According to a
neutralist conception of action, the best that I can do is try to move my
body. Action is neutral with respect to its success conceived as the
satisfaction of the underlying objective desires, such as the desire to raise my
arm. Whether or not I succeed in actually raising my arm, I am doing
essentially the same thing, namely trying to raise it. This conception is
naturally associated with the temporal interpretation of PPC, for there is no
physical action such that I can know in advance that there won’t be any
obstacles to its success. Such knowledge is possible only for tryings to move
one’s body, which in a sense cannot fail.
In contrast, the rejection of the neutralist conception of action is in line
with the logical interpretation of PPC. According to a disjunctive account of
action, a particular trying is either a mere trying, which is a failed action, or a
genuine (i.e. successful) action. So an action can have intrinsic success
conditions which go beyond the mere trying to do something. The
possibility is then open that one’s experience of acting, which is essentially
psychophysical, is a source of knowledge about the action’s external success
conditions.
CONCLUSION
To sum up, Ramsey’s Principle in its absolute form is untouched by
considerations about situated cognition. In particular, the objection of
cognitive overload is answered by distinguishing between implicit and
explicit knowledge. Ramsey’s Principle and the stronger Principle of
Pragmatic Closure concern in fact all warranted beliefs accessible to the agent,
whether or not she actually holds them. The agent must only have the means
of forming a set of warranted beliefs whose truth guarantees the success of
her action.
However, the best argument in favour of Ramsey’s Principle is
transcendental, in the sense that it embodies a condition of possibility of
intentional action. Some of those, like Perry, who want to relativise the
principle to circumstances invoke the agent’s adaptation to her environment
in order to justify their claim that the agent does not act with a full
awareness of all possible obstacles. Ironically, the objection of cognitive
overload does not stand precisely because agents are normally adapted to
their environment. Adaptation is not a purely external relation between an
agent and its environment, as if the former happened to ‘fit’ the latter.
Rather, adaptation manifests itself in the fact that action is normally a source
of knowledge about its own success conditions. This is another aspect of
the internal relation between knowledge and action which Ramsey much
emphasised. Our actions’ success conditions reflect themselves in the
subject’s cognitive state, if only implicitly, because the agent’s contribution
and that of Mother Nature are so intertwined that it is impossible to tell
them apart.16
16 The argument in this chapter derives from Dokic and Engel (2002). We would like to
thank Hugh Mellor, Nils-Eric Sahlin, and Simon Blackburn for their comments on this
article and for their encouragement.
Success Semantics
SIMON BLACKBURN
THE THEORY
Any theory of mind that takes our representational or intentional capacities
as something to be explained seems likely to work in terms of some kind of
distinction between vehicle and content, and that is what I shall do. The
vehicle of representation, or what Ramsey called ‘the subjective factor’, is
usually thought of in terms of the sentence, identified by features other than
those intrinsically connected with meaning. So it is contingent whether a
sentence has the content that it does. This standard approach need not
preclude a wider theory, according to which there might be or actually are
other kinds of representational vehicles. For example, there may be non-
linguistic vehicles, or we might want to work towards a theory in which the
whole person represents things, without there being anything as it were
smaller to count as a specific vehicle at all. We come to say something about
such extensions in due course, but for the moment it will do no harm to
think in terms of sentences as paradigm representational vehicles.
So consider a subject S. S gets about the world, and we suppose that
some of her actions are successful. She achieves what she desires. And
suppose some of her actions are based upon a vehicle V. It is not going to
be easy to say exactly what that means, but at a first pass it may mean that it
is because of an event whereby V becomes salient in her overall psychological
state. Some writers like to think in terms of a sentence, such as V
may be, entering S’s ‘belief box’. Without being so literal, or geometric, we
can use that as a model, again if only for the purpose of approaching a wider
theory. A slightly more realistic version for humans might be that S gets into
a state in which, were she to be asked why she is doing what she is, the
answer would contain V as an ineliminable ingredient. As for what
distinguishes one’s belief box from one’s entertainment box, containing
vehicles of content which we entertain without believing, we should look to
functionalism. We should concentrate upon force, meaning that a belief
differs from a mere entertainment of a thought precisely in that beliefs are,
as it were, in gear. They are playing a role in the machinery of agency. So for
the moment we are to think of an event, which I shall call the tokening of a
vehicle, precipitating a vehicle into that machinery.
In order to come at the idea of V bearing a content (being a
representation, having intentionality) we think in terms of explanation. What
explains S’s success as she acts upon the belief expressed by ‘The university
library is over there’? In the typical or paradigm case, she is successful
because the university library is over there. She is not, on the other hand,
typically successful because of the whereabouts of Heffers or Trinity College
or Grantchester. Why were S and R successful in meeting this afternoon?
Because they exchanged tokens of ‘Let’s meet at the university library’ and
the university library was where they then went. Once more, it is their going
to the university library, not their going past Heffers or through Trinity, that
explains their successful tryst. Why was S successful in her shopping?
Because she said ‘Can I have some haddock?’ and haddock was what she
got. The properties of neighbouring halibut and cod are not typically
relevant to the success of the actions based upon that tokening.
We could at this point go directly for an attempt to describe the
representative content of the whole vehicle, the sentence. We might try
something like this:
A vehicle V has the content p if and only if behaviour based on V is
typically successful, when it is, because p.
However, difficulties lie down that direct road. I have in mind difficulties
connected with the utility of false belief, which can accrue in various ways.
Consider, for instance, the vehicle ‘I am popular’. Psychologists say that this
is a useful thing to get into your belief box. It promotes your ability to get
on well, even if it is false (maybe, especially if it is false). So it will not be
true that behaviour based on tokening this sentence is typically successful,
when it is, because the subject is popular. Yet this is the content of the
sentence.
It would be possible to try to handle this kind of example as Dokic and
Engel do, by bringing into view the variety of possible desires that might
accompany the tokening. Then, while a false content explains occasional
success, only true content could explain a general pattern of success across
all these possible applications. I think this may work, although it takes us
some distance from our actual evidence. We have no general access to the
requisite patterns. We have to invent scenarios in which the tokening of ‘I
am popular’ conspires with other desires to generate a whole pattern of
actions, most of which are unsuccessful if, but only if, it is false.
A different range of problems comes into view if we think of
approximations. Behaviour based on tokening the standard expression of
the Boyle–Charles gas law (Pressure times volume = constant times temperature)
is typically successful, when it is, because of the truth of the van der
Waals equation.2 But the sentence does not express the same thing as that
equation. Here expanding our gaze to take in possible but non-actual desires
does not seem likely to help, since it will always be true that it is the more
complex relationship between the magnitudes involved that explains success
in relying upon the simpler relationship.
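For readers who want the two relationships side by side, they can be written out in their usual textbook forms. This is only an illustrative sketch: the symbols n (amount of gas), R (the gas constant), and the substance-specific corrections a and b are the standard ones and are assumptions of this illustration, not notation used in the chapter.

```latex
% Simple law: pressure times volume = constant times temperature,
% with the constant written as nR.
PV = nRT

% Van der Waals equation: a corrects for attraction between molecules,
% b for their finite volume.
\left(P + \frac{a n^{2}}{V^{2}}\right)\left(V - nb\right) = nRT
```

Setting a = 0 and b = 0 recovers the simpler law, which is one way of seeing why reliance on the simpler sentence succeeds, when it does, because of the more complex relationship.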
In addition, there will be sentences that are too seldom tokened for there
to be a typical way in which behaviour based on them is successful, let alone
an explanation of any such pattern of success in terms of their truth. All in
all, then, a direct approach looks unlikely to give us what we want.
If we want to stay more closely to the evidence, the remedy must be to
go compositional. We do not want to ignore the structure of the
representing sentence. So let us look at reference first, and try what I shall
call the Fundamental Schema:
Suppose the presence of ‘a’ is a feature of a vehicle ‘a . . .’. Then ‘a’
refers to a if and only if actual and possible actions based upon the
vehicle ‘a . . .’ are typically successful, when they are, at least partly
because of something about a.
Here we imagine a sentence containing a name. Actions are sometimes
based upon it. When they are successful, this is typically at least partly
because of something about some object. And that is the object that is
referred to in the sentence.
At this point we may wonder why ‘success’ is allowed to muscle its way
to the front. After all, ‘a’ may represent something and then actions based
upon a tokening that includes it would typically fail, when they do, at least
partly because of something about whatever it represents. The university
library being far from a mile away would explain why I failed to get there,
acting on the tokening ‘The university library is about a mile away’. Perhaps
‘action semantics’ would be a better title than ‘success semantics’. I think
this point is right, as far as it goes. But I also think that success in action is
the fundamental notion: like Davidson and Wittgenstein I incline to think
that failure only exists against a background of success. It is only because of
our successes that the representational powers we have are adaptive, and
exist in the first place. So I shall retain the title, while remembering that it is
the place of representation in the overall life of an agent that is the focus.
2 The more complex equation that corrects for the finite volume of gas molecules, and
the attraction between them, which are ignored in the Boyle–Charles Law.
For any actual term, there will of course be a huge variety of possible
sentences in which it may occur. So the pattern of success illustrated here
for any one particular sentence can be enormously bolstered by thinking of
other sentences alike in containing the term ‘a’—enough so that the
credentials of the object a as the focus, the uniquely invariant explanans of a
huge variety of doings, will be abundantly established.
Some might worry that the ‘something about a’ introduces something
suspicious and unscientific, such as surreptitious mention of facts.3 But that
is just an artefact of the generality of the proposal. The fundamental schema
collects together a pattern of explanation, and, as is often the case, to
generalize in this way we need mention of truth or fact. But in any particular
case, the explaining is done without anything suspicious of the kind. Why
was John’s action based on tokening ‘The university library is about a mile
away’ successful? Because the university library is about a mile away. In
other words, the introduction of a vocabulary of fact or truth is necessary
for theorists generalizing about the phenomenon. But it does not indicate
any mysterious residue in the phenomena themselves.
Before expanding this, and confronting objections, we should notice a
few points. Some are obvious enough, but others deserve separate mention.
And so to difficulties. Some are easy to cope with, but others less so. The
hardest, I believe, is voiced by Papineau. Papineau talks of Ramsey’s
different suggestion, criticized above. But the present proposal is just as
vulnerable to the objection. Papineau complains: ‘It explains truth for
beliefs, only by assuming the notion of satisfaction, for desires. Yet
satisfaction is as much a representational notion as truth, and so ought itself
to be explained by an adequate philosophical theory of representation’
(1993: 70–1). So, for instance, consider our agent who wants a particular
book, believes that the book is in the university library, and that the
university library is in some direction from where he stands. Suppose all
goes well. We can say that his success is explained by the book being in the
university library, and the university library being where he expected. But his
success is identified in terms of getting what he wanted, and that requires
content or intentionality: he wanted a particular book, which he therefore
had to represent to himself. If we cannot say that much about him, we have
no reason, it seems, to talk of success at all. But to say that much requires
some pre-existing representation, and that vitiates the proposal as a general
account.
Should this objection silence us? It does not falsify the fundamental
schema, but only suggests a limit to its utility. Yet how severe is this limit? If
we were trying to give a reduction of all intentionality at a blow, it would be
serious. But perhaps we do not have to claim any such ambition. It remains
true that for any particular representative feature of a vehicle, we can use the
fundamental schema to give a truth condition or account of its representative
power. That account only works, it is true, by imagining the feature
embedded in the psychology of an active, desiring agent. And it is true that
when we turn to the fact of desire, other representative powers will be
implicated. But these in turn can be explained by a reapplication of the
schema. Suppose the book our agent desired was Emma, and suppose his
desire was activated by the tokening of a representative vehicle: ‘I must read
Emma’. Then the fact that the term ‘Emma’ represents Emma is given,
according to the fundamental schema, by the fact that actual and possible
actions based upon the vehicle ‘Emma . . .’ are typically successful, when
they are, because of something about Emma. Notice that among these
examples of success we can number the very occasion under discussion: the
agent’s success on this occasion arose because Emma was in the university
library. Faced with this, it is not very clear how damaging Papineau’s
problem is. But, in addition, we can approach it from a different angle.
Papineau’s problem will probably seem most intractable if we think
synchronically. We might imagine the simultaneous tokening of two
vehicles, VB carrying the content of the belief, and VD carrying the content
of a desire. And we perplex ourselves because ‘success’ underdetermines the
identity of these two things together. ‘Success’ could consist in the belief
having one content, and the desire a related content, or the belief having a
different content with an accommodating difference in desire, and so on
indefinitely. Underdetermination stares us in the face.
But suppose we think a little more diachronically. We find out what baby
wants by finding what brings peace. We could be wrong: baby may have
wanted a biscuit, but be pacified by a rattle. But as the days go by, typical
patterns emerge. If ‘biccy’ reliably correlates with pacification by a biscuit,
we get one entry into our lexicon. If when baby seems to want a biscuit we
direct his attention successfully by saying where it is, and the words we use
become part of baby’s repertoire, then we take them to be representing
wherever it is. And so it goes, entry by entry. But at the end of the process
there is only one thing to think, sometimes, about what the emergent child
believes and wants. And by then representative features of vehicles are
available either to enter the function of pushing and pulling, the ‘desire
box’, or the function of guiding the actions appropriate to the pushings and
pullings, the ‘belief box’.
This solves the epistemological problem. We play off macro behaviour
and microstructure of vocabulary, and, just as with a crossword puzzle, clue
at a time, fallibly, but eventually uniquely, a solution emerges. But does it
solve the metaphysical or ontological problem? Does it tell us what
representation is, or how intentionality is possible? Does it, for instance, make
room for misrepresentation?
I believe so. Consider misunderstanding first. Suppose subjects S and R
want to meet, and S says ‘Let’s meet in New York’ and R hears ‘Let’s meet
in Newark’. They will fail to meet. S intended R to token something with
one kind of power, and he tokened something with a different kind. Instead
of directing him to New York, the event set him off towards Newark. It is
an event which reliably does that, because there is a feature of the vehicle
(which might be ‘Newark is the place to go’), and actual and possible
actions based upon the vehicle are typically successful, when they are,
because Newark is the place to go. On this occasion, it is not, and action
will fail.
With falsity we imagine an agent whose tokenings of ‘a’ and of ‘F ’
generally slot into the fundamental schema so as to compel interpretations
as referring to a, and to the property F, respectively. We suppose that the
(syntactic) structure (or some other feature) of the vehicle ensures its
indicative form. So the subject bases action on ‘Fa’, interpreted as a being
F.4 In other words, he acts on the belief that a is F. Unfortunately a is not F.
So either the subject will be unsuccessful, or his success will not be
explicable in the typical, disquotational fashion. He is not successful because
a is F, but in spite of a not being F. There is no principled difficulty about
isolating such cases and saying the right thing about them.
I
Two ideas associated with Frank Ramsey have been very influential, and
further developed in recent years, one about conditional judgements, the
other about truth. The two ideas can appear to be in tension with each
other, for the former has a consequence which it has seemed hard to square
with the latter. Philosophical descendants of Ramsey on truth have found it
hard to be philosophical descendants of Ramsey on conditionals.1 It is this
tension which I want to discuss and try to defuse.
A footnote in Ramsey’s ‘General Propositions and Causality’ (1929a) has
had a great impact on thinking about indicative conditionals since the 1960s.
The idea it propounds has come to be known as the Ramsey Test. Here it is:
If two people are arguing ‘If p will q?’ and are both in doubt as to p, they are adding
p hypothetically to their stock of knowledge and arguing on that basis about q . . .
they are fixing their degrees of belief in q given p. (p. 155)
Now there are proofs, the first due to David Lewis (1976), which appear to
show that, on this way of thinking about conditionals, conditionals do not
express propositions, evaluable in terms of truth. This came as a surprise to
the interested part of the philosophical community. I think Ramsey would
not have been surprised. There are numerous indications in the 1929 paper
that he accepted that treating conditional judgements this way was not to
treat them as expressing propositions. I don’t say that he had a proof to this
effect, but I think he saw that the conclusion was correct.
Ramsey is also a source of a family of views on truth called ‘minimalist’
or ‘deflationist’ or ‘redundancy’ theories: ‘It is true that Caesar was
murdered’ means no more than that Caesar was murdered. The word ‘true’
has the useful function of enabling us to say things like ‘That’s true’,
‘Einstein’s Theory is true’, ‘Everything he said was true’. These locutions do
not introduce a special subject matter of truth. They provide us with a way
of affirming what was said, or Einstein’s Theory, or in the last case, of
generalising—committing oneself to all instances of ‘If he said that p, then
p’. As the names suggest, this is meant to be a thin and undemanding
account of truth, compared to its rivals. It is hard to see how this minimalist
II
Return to the Ramsey Test. What it proposes is that in assessing a
conditional, we suppose that the antecedent is true, and consider what we
think about the consequent, under that supposition. That sounds innocuous
enough. The second part of the quotation is more specific about what this
comes to: we are ‘fixing [our] degrees of belief in q given p’. This notion,
‘degree of belief in q given p’, was introduced in Ramsey’s earlier paper
‘Truth and Probability’ (1926c), and one of his ‘fundamental laws of
probable belief’ is
Degree of belief in (p and q) = degree of belief in p × degree of belief in q given p.
(p. 77)
Substitute ‘probability’ for ‘degree of belief’, and you have a well-known
law of probability, the term on the right being a conditional probability.
There was nothing novel in this fundamental law of conditional probability,
which was standard since the eighteenth century. What was novel in
Ramsey’s 1926 paper was the interpretation of probability, and conditional
probability, as partial belief, and partial conditional belief, the principles of
probability yielding what he called a ‘logic of partial belief’ (p. 53). And
what was novel in the 1929 footnote was the linking of ‘degree of belief in q
given p’ with our ordinary, typically uncertain, conditional judgements
expressed using ‘if’.
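In modern notation the law just quoted can be written as follows. This is a sketch for orientation only: the symbol P for a degree-of-belief function and the rearranged ratio form (which requires that p not have degree of belief zero) are assumptions of this illustration, not Ramsey's own notation.

```latex
% Ramsey's fundamental law, with P read as degree of belief.
P(p \wedge q) = P(p)\, P(q \mid p)

% Equivalent ratio form, available only when P(p) > 0.
P(q \mid p) = \frac{P(p \wedge q)}{P(p)}
```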
[FIG. 1. Two partitions. Left: A; ¬A. Right: A&B; A&¬B; ¬A.]
One way of explaining the idea is shown above in Figure 1, which shows
two partitions: sets of exclusive and exhaustive possibilities. The probabilities
of, or one’s degrees of belief in, the members of a partition (to the
extent that they can be made precise) should sum to 100%, the degree of
belief in a certainty. That is all there is to treating degrees of belief as
probabilities. Coming to a view about A is comparing A and ¬A: seeing
Conditionals and Truth 39
them as in competition for your belief. If you think that A is 4 times more
likely than ¬A, that is to think that it is 80% likely that A.
In coming to a view about B on the assumption that A, you ignore the
¬A-possibility—hypothetically eliminate it. Just focusing on the A-
possibilities, you compare B to ¬B; that is, you compare A&B with A&¬B.
If you think that A&B is 4 times more likely than A&¬B, you think it is 4
to 1 that B if A; that is, B is 80% likely on the supposition that A.
Restricting attention to the A-possibilities, your degree of belief in B is 80%.
For example, if you think it’s 80% likely that she will be cured if she has the
operation, you think (she has the operation and is cured) is 4 times more
likely than (she has the operation and is not cured).
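The arithmetic of these comparisons can be made explicit. The following is a minimal sketch under stated assumptions: the function names `odds_to_probability` and `conditional_belief` are hypothetical, introduced only for illustration, and exact fractions are used so that no rounding intrudes.

```python
from fractions import Fraction


def odds_to_probability(ratio):
    """Turn 'X is `ratio` times more likely than not-X' into a probability for X."""
    ratio = Fraction(ratio)
    return ratio / (ratio + 1)


def conditional_belief(weight_a_and_b, weight_a_and_not_b):
    """Degree of belief in B given A from the relative weights of A&B and A&not-B.

    Only the ratio of the two weights matters, so the not-A possibility
    is simply left out of the calculation.
    """
    total = Fraction(weight_a_and_b) + Fraction(weight_a_and_not_b)
    if total == 0:
        raise ValueError("A must not be ruled out entirely")
    return Fraction(weight_a_and_b) / total


# A is 4 times more likely than not-A: 80% likely that A.
print(odds_to_probability(4))      # 4/5

# A&B is 4 times more likely than A&not-B: B is 80% likely given A.
print(conditional_belief(4, 1))    # 4/5
```

Because `conditional_belief` takes only relative weights, the same answer results from weights 4 and 1 as from 8 and 2; nothing about the likelihood of A itself enters.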
This way of looking at it is not the most basic way, for two reasons. First,
you can have a degree of belief in B on the supposition that A without
having a degree of belief in A (and hence without having degrees of belief in
A&B and A&¬B). Ramsey noted this, considering conditionals like ‘If I do
p, q will probably result’. He says ‘Here the degree of probability is clearly
not a degree of belief in “Not p, or q” (i.e. the material implication) but a
degree of belief in q given p, which it is evidently possible to have without a
definite degree of belief in p, p not being an intellectual problem’ (1929a:
154).
Second, to arrive at a degree of belief in A&B, one typically uses the
notion of a conditional probability, according to Ramsey’s Test, and asks,
how likely is it that A? And how likely is it that B if A? So the basic thought
experiment is simply to assume the antecedent and consider how likely the
consequent is, under that assumption (as Ramsey said). On reflection, that
can be seen to be equivalent to judging that A&B is (say) 4 times more
likely than A&¬B, even if you have no degrees of belief regarding the
components of this comparison. To repeat the example, if I think that it is 4
to 1, or 80% likely, that she will be cured if she has the operation, I think
(operation and cured) is 4 times more likely than (operation and not cured),
even if I don’t have a degree of belief that she will have the operation, let
alone that she will have it and be cured.
In the passage just cited, Ramsey makes the point that it is our uncertain
conditional judgements that make this approach essential. If I am sure that
if I do p, q will result, there is no harm, he says, in considering this as belief
in the material implication, that is, the disjunction ¬p or q (‘but it differs’ he
says ‘from an ordinary disjunction in that one of its members is not something
of which we are trying to discover the truth but something within our
power to make true or false’). But when it is ‘If p, q will probably result’, this
is not a degree of belief in ¬p or q, he says, but a degree of belief in q given
p. ‘And our conduct is largely determined by these degrees of hypothetical
belief’ (1929a: 154).
Indeed, provided that the antecedent does not get zero, the probability of
the material conditional and the conditional probability coincide in the case
of certainty. Return to the partition on the right of Figure 1. If p(A&¬B) is
40 Dorothy Edgington
0, p(A⊃B) and p(B given A) are both equal to one. There is only one other
case in which they coincide: when the probability of the antecedent is 1. In
all other cases, the probability of the material implication exceeds the
conditional probability, because it gets added probability from the case in
which the antecedent is false. They come spectacularly apart when the
probability of ¬A is high yet p(A&B) is low relative to p(A). For example,
let p(¬A) = 90%, p(A&B) = 1%, and p(A&¬B) = 9%. Then p(A⊃B) = 91%;
p(B given A) = 10%. And the latter, not the former, matches how we do
assess conditionals. For instance, I judge that it’s 90% likely that Jane won’t
be offered the job, 10% likely that she will, 1% likely that she will be offered
and decline, 9% likely that she will be offered and accept. I think it’s 10%
likely that she will decline if she is offered the job, while the probability of
the corresponding material implication is 91%.
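The figures in the job example can be checked directly. This is a small sketch, assuming the three-cell partition given in the text; the variable and function names are hypothetical and chosen only for readability.

```python
from fractions import Fraction

# Degrees of belief over the partition in the job example.
# A = Jane is offered the job, B = she declines.
p_not_a = Fraction(90, 100)        # not offered
p_a_and_b = Fraction(1, 100)       # offered and declines
p_a_and_not_b = Fraction(9, 100)   # offered and accepts

# The members of a partition must sum to certainty.
assert p_not_a + p_a_and_b + p_a_and_not_b == 1


def material_conditional(p_not_a, p_a_and_b):
    """Probability that either A is false or A and B are both true."""
    return p_not_a + p_a_and_b


def conditional_probability(p_a_and_b, p_a_and_not_b):
    """Probability of B restricted to the A-possibilities."""
    p_a = p_a_and_b + p_a_and_not_b
    if p_a == 0:
        raise ValueError("the antecedent must not have probability zero")
    return p_a_and_b / p_a


p_material = material_conditional(p_not_a, p_a_and_b)
p_conditional = conditional_probability(p_a_and_b, p_a_and_not_b)

print(float(p_material))      # 0.91
print(float(p_conditional))   # 0.1
```

The gap between the two values comes entirely from the 90% assigned to the case in which the antecedent is false, which counts in favour of the material implication but is ignored by the conditional probability.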
This would appear to rule out that conditionals are material implications.
If conditionals were material implications, we should judge them to be
probable when the material implication is probable. But we don’t. Hence
conditionals are not material implications (though they may harmlessly be
treated as such in some contexts). It does not, of course, rule out that
conditionals are some fancier sort of proposition. Nevertheless, it gives the
shape of the stronger results to follow: for any proposed truth condition, it
can be shown that the probability of its obtaining can come apart from the
probability of consequent given antecedent.
III
In the 1960s Ernest Adams developed a theory of conditionals conforming
to Ramsey’s footnote (Adams 1965, 1966). Most importantly, he developed
a logic for arguments with conditionals among their premises or conclusion,
whose rationale is as follows. First take an uncontentious classically valid
argument without conditionals: one such that it is impossible that its
premises be true and conclusion false. Ask this question: suppose I am close
to certain, but not quite certain, that its premises are true. Then what should
I think about its conclusion? It is easy to prove the following: it is
impossible that the improbability of the conclusion should exceed the sum
of the improbabilities of the premises (where improbability is one minus
probability). Thus, arguments which necessarily preserve truth, necessarily
preserve probability in the sense that there can be no more improbability in
the conclusion than there is in all the premises together. So, in a two-
premise valid argument each of whose premises gets a probability of 99%,
the worst-case scenario for the conclusion is that it gets 98%. This
vindicates the use of deduction from uncertain premises, provided that they
are not too uncertain, and provided that there are not too many such
premises. The only way that you can validly deduce an improbable—
perhaps zero-probability—conclusion from highly probable premises is
where there are a great many such premises, as in the Lottery Paradox. On
the other hand, if you have just two premises but each is just 50% probable,
the conclusion can get 0, for instance, ‘The coin will land heads; it will land
tails; so it will land heads and tails’.
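The bound described above can be illustrated numerically. The following is a minimal sketch rather than Adams's own formulation: the function name `conclusion_lower_bound` is hypothetical, and the figure of one hundred premises in the last example is an illustrative assumption standing in for the 'great many' premises of a lottery-style case.

```python
from fractions import Fraction


def conclusion_lower_bound(premise_probabilities):
    """Worst-case probability of the conclusion of a valid argument.

    The improbability of the conclusion cannot exceed the summed
    improbabilities of the premises, where improbability is one
    minus probability. The bound cannot fall below zero.
    """
    total_improbability = sum(1 - p for p in premise_probabilities)
    return max(Fraction(0), 1 - total_improbability)


# Two premises at 99% each: the conclusion gets at least 98%.
print(conclusion_lower_bound([Fraction(99, 100)] * 2))     # 49/50

# Two premises at 50% each: the conclusion can get 0.
print(conclusion_lower_bound([Fraction(1, 2)] * 2))        # 0

# Many highly probable premises: the guarantee disappears.
print(conclusion_lower_bound([Fraction(99, 100)] * 100))   # 0
```

The function reports only what validity guarantees in the worst case; in a particular argument the conclusion may of course be more probable than the bound.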
This is all good stuff: an elaboration of the use of deduction under
conditions of uncertainty. Now turn to conditionals. The truth-functional
account does badly when we consider uncertain conditional judgements, as
we saw above. Indeed, whatever plausibility it has comes from ignoring
uncertain judgements. Adams’s view in the 1960s could be put like this: we
don’t have a theory of the truth conditions of conditionals such that we
believe them to the extent that we think they are probably true. But we do
have an excellent way to assess their probability (or the degree to which they
are accepted)—Ramsey’s way. So let us use the criterion of probability
preservation (strictly, probability or conditional probability preservation) as
the criterion of validity of arguments with conditionals. And a nice logic
emerges. For example, modus ponens is valid: demonstrably, if p(A) is high
and p(B given A) is high, p(B) is high. On the other hand, there are the
counter-examples to strengthening of the antecedent, to transitivity, and to
contraposition, which are now well known. (Note: the logic is for sentences
in which the conditional, if it occurs, occurs as the main connective. Since
conditionals, on this view, lack truth conditions, we have no automatic
treatment of embedded conditionals, a point to which I return.)
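Both claims can be illustrated on a small probability distribution. The sketch below is only an illustration under stated assumptions: the match example and the particular numbers are hypothetical, and the helper names `prob` and `cond_prob` are introduced here. It checks the probabilistic form of modus ponens on this distribution and exhibits a failure of strengthening of the antecedent.

```python
from fractions import Fraction as F

# Hypothetical distribution over worlds (A, B, C), where
# A = the match is struck, B = it lights, C = it is wet.
worlds = {
    (True, True, False): F(90, 100),
    (True, False, False): F(2, 100),
    (True, False, True): F(5, 100),
    (False, False, False): F(3, 100),
}

def prob(event):
    """Probability of an event, given as a predicate on worlds."""
    return sum(p for w, p in worlds.items() if event(w))

def cond_prob(consequent, antecedent):
    """p(consequent given antecedent); assumes p(antecedent) > 0."""
    return prob(lambda w: antecedent(w) and consequent(w)) / prob(antecedent)

A = lambda w: w[0]
B = lambda w: w[1]
A_and_C = lambda w: w[0] and w[2]

# Modus ponens: the improbability of B is at most the improbability of A
# plus the improbability of B given A.
p_if_A_B = cond_prob(B, A)
mp_ok = (1 - prob(B)) <= (1 - prob(A)) + (1 - p_if_A_B)

# Strengthening of the antecedent fails: 'if A, B' is highly probable,
# yet 'if A and C, B' has probability zero.
p_if_AC_B = cond_prob(B, A_and_C)

print(p_if_A_B, p_if_AC_B, mp_ok)
```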
Robert Stalnaker was also impressed by Ramsey’s footnote, and by
Adams’s work. His project in the late 1960s was to fill the gap noted by
Adams: to find a proposition such that the probability of its truth is
measured in Ramsey’s and Adams’s way (see Stalnaker 1968, 1970). He was
working in the framework of possible-worlds semantics, popular and
exciting at the time through Richard Montague’s work. Plausibly, he was
working with a richer notion of a proposition than was available to Ramsey.
And he thought he had found the requisite proposition. He did find a
proposition which generated a logic that agreed with Adams’s: this was a
constraint on the success of his project—for propositions, necessary
preservation of truth and necessary preservation of probability go hand in
hand.
Then David Lewis proved that the project must fail. There is no
proposition such that, necessarily, the probability of its truth is the
conditional probability of something given something. Conditional
probabilities cannot be made to behave as unconditional probabilities, the
latter being probabilities of the truth of propositions. There are now many
ways of proving this result: thinking that B is probable on the supposition that
A is not equivalent to thinking that something or other is probable,
simpliciter.
IV
I shall try to give some indication of why this negative result holds. Let us
try to construct the required proposition. Take two logically independent
propositions, A and B. Suppose, then, that there is a proposition ‘If A, B’,
which I shall call X_{A,B}, or just X for short, such that necessarily, i.e. in all
probability distributions, the probability of X = the probability of B given A.
First we prove three entirely to be expected things about the logical shape of
X: (1) X must be entailed by A&B; (2) ¬X must be entailed by A&¬B; (3)
both X and ¬X are compatible with ¬A, i.e. neither X nor ¬X is entailed by
¬A.2
(1) If X were not entailed by A&B, certainty that A&B would not
require certainty that X, i.e. there would be probability distributions
in which p(A&B) = 1 and p(X) is less than 1. But if p(A&B) = 1,
p(B given A) = 1, and therefore, on the assumption that p(X) =
p(B given A), p(X) = 1.
(2) If ¬X were not entailed by A&¬B, certainty that A&¬B would not
require certainty that ¬X, i.e. there would be probability distributions
in which p(A&¬B) = 1 but p(¬X) is less than 1, so p(X) is
greater than 0. But if p(A&¬B) = 1, p(B given A) = 0, and so, on
the assumption that p(X) = p(B given A), p(X) = 0.
(3) If X were entailed by ¬A, then, given (1) and (2), X would be A⊃B.
But in general, p(A⊃B) ≠ p(B given A), as we saw above. If ¬X
were entailed by ¬A, then, given (1) and (2), X would be A&B. But
in general, p(A&B) ≠ p(B given A).
The logical relations between X, A&B, A&¬B, and ¬A are represented
in Figure 2 below. (Please ignore for the moment the dotted line.)
So far so good. We have a proposition of the expected logical shape.
But—Result 1—it is simply and blatantly false that there is a proposition X
of this logical shape such that in all probability distributions p(X) =
p(B given A). Figure 2 shows a partition of four exclusive and exhaustive
possibilities: (i) A&B&X; (ii) A&¬B&¬X; (iii) ¬A&X; (iv) ¬A&¬X. A
probability distribution is any assignment of non-negative numbers to the
members of a partition that sum to 1 (or 100%). Some such assignments
will make p(X)—the sum of the numbers assigned to (i) and (iii)—equal to
p(B given A), the number assigned to (i) divided by the sum of the
numbers assigned to (i) and (ii), and other assignments will not. For
instance, let the numbers assigned to the four possibilities be respectively
0.4, 0.1, 0.25, 0.25. Then p(B given A) = 0.8, p(X) = 0.65. The trouble is
that p(B given A) depends solely on how probabilities are distributed among
the A-possibilities; X may be true or may be false if ¬A, and so p(X)
depends also on how probabilities are distributed among the
¬A-possibilities. But there are probability distributions which agree on the
A-possibilities, hence agree about p(B given A), yet disagree on the
¬A-possibilities, hence disagree on p(X).
2 I assume, for simplicity, that X is a classical, bivalent proposition, such that in any
possible situation in which X is not true, it is false, i.e. ¬X is true. But the arguments that
follow would go through just the same without that assumption, reading ‘p(X)’ as the
probability that X is true, and ‘p(¬X)’ as the probability that X is not true. Even if it need not
be false whenever it is not true, we would still show, in the same way, that there is no
proposition the probability of whose truth necessarily equals the probability of B given A.
[FIG. 2. Partition of the possibilities: within A, B and X hold in (i), ¬B and ¬X in (ii); within ¬A, X holds in (iii) and ¬X in (iv).]
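The arithmetic behind Result 1 is easily checked. The following is a minimal sketch in Python. The dictionary keys name the four cells of Figure 2, the first distribution is the one given in the text, and the second is a hypothetical variant introduced here. The two agree on the A-possibilities but not on the ¬A-possibilities.

```python
from fractions import Fraction as F

def p_X(dist):
    """p(X): the sum of the numbers assigned to cells (i) and (iii)."""
    return dist["i"] + dist["iii"]

def p_B_given_A(dist):
    """p(B given A): cell (i) divided by the sum of cells (i) and (ii)."""
    return dist["i"] / (dist["i"] + dist["ii"])

# The distribution used in the text: 0.4, 0.1, 0.25, 0.25.
text_example = {"i": F(40, 100), "ii": F(10, 100),
                "iii": F(25, 100), "iv": F(25, 100)}

# A hypothetical variant: same A-possibilities, different split of not-A.
variant = {"i": F(40, 100), "ii": F(10, 100),
           "iii": F(40, 100), "iv": F(10, 100)}

for name, dist in [("text", text_example), ("variant", variant)]:
    print(name, p_B_given_A(dist), p_X(dist))
```

In the variant the two quantities happen to coincide at 0.8. In the distribution from the text they come apart, 0.8 against 0.65, although nothing about the A-possibilities has changed.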
Perhaps Result 1 does not refute what Stalnaker intended. Perhaps his
intention was that the holding of the equation p(X) = p(B given A) should
be an additional constraint on probability functions: we are to add a new
rule, which stipulates that the probability assigned to the conditional
proposition X must always equal p(B given A).
Stipulations have consequences, and this stipulation has unacceptable
ones. You can make the stipulation for a single case in a single probability
distribution, but it remains to be seen whether this clashes with the use we
make of probability distributions, or whether one can consistently stipulate
such a proposition for all conditional probabilities. Result 2: Lewis (1976) in
effect showed that, given the stipulation, the only consistent place to divide
p(¬A) into p(¬A&B) and p(¬A&¬B) is along the dotted line, making B
equivalent to X and hence p(B) = p(X) in all probability distributions. For,
being certain that B makes p(B given A) = 1 (provided p(A) ≠ 0). If B&¬X
were possible, being certain that B would be consistent with having p(X)
less than 1. Similarly, being certain that ¬B makes p(B given A) = 0. If
¬B&X were possible, being certain that ¬B would be consistent with
having p(X) greater than 0. This conclusion is an absurdity. Conditionals are
obviously not equivalent to their consequents.
Stalnaker himself, in the wake of Lewis, showed—Result 3—that if the
equation holds for given propositions A, B, and X, there can be other, more
complex propositions in the same probability distribution, C and D—truth-
functional compounds of A, B, and X—such that we cannot construct a Y
such that p(D given C) = p(Y).3
I have given the following argument—Result 4 (see e.g. Edgington 2001).
You have a distribution as in Figure 2. Then you rule out line (ii): you rule
out A&¬B, no more, no less. This makes p(B given A) = 1. But it does not
make p(X) = 1, because you have not ruled out the possibility that
¬A&¬X.
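Result 4 can be made concrete with numbers. The sketch below rests on assumptions that should be flagged. The starting distribution (0.4, 0.1, 0.4, 0.1 over the four cells of Figure 2) is hypothetical, chosen so that the equation holds to begin with. And 'ruling out line (ii), no more, no less' is modelled as setting that cell to zero and rescaling the others in proportion. The helper name `rule_out` is introduced here.

```python
from fractions import Fraction as F

def p_X(dist):
    return dist["i"] + dist["iii"]

def p_B_given_A(dist):
    return dist["i"] / (dist["i"] + dist["ii"])

def rule_out(dist, cell):
    """Set one cell to zero and rescale the rest so they still sum to 1."""
    remaining = sum(v for k, v in dist.items() if k != cell)
    return {k: (F(0) if k == cell else v / remaining) for k, v in dist.items()}

# Hypothetical starting point in which p(X) = p(B given A) = 4/5.
before = {"i": F(4, 10), "ii": F(1, 10), "iii": F(4, 10), "iv": F(1, 10)}

# Rule out A & not-B, i.e. line (ii) of Figure 2.
after = rule_out(before, "ii")

print(p_B_given_A(before), p_X(before))  # 4/5 4/5
print(p_B_given_A(after), p_X(after))    # 1 8/9
```

After the update p(B given A) is 1, but p(X) is only 8/9, because cell (iv), ¬A&¬X, retains some probability.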
And so it goes on. The source of the difficulty is, as I said above,
estimating how likely it is that B on the assumption that A is an exercise
which concerns only the A-possibilities, the possibilities in which A is true.
Any candidate proposition X is true in some but not all ¬A-possibilities and
so estimating it includes deciding how likely it is to be true if ¬A. Fixing it,
on a one-off basis, so that p(X) = p(B given A) makes things go wrong
elsewhere in one’s thinking.
V
Everyone finds the result puzzling. Two opposing philosophical attitudes
find it particularly obnoxious or threatening: that of the ‘staunch truth-
conditional semanticist’, as for example Bill Lycan describes himself in his
book Real Conditionals (2001), who does not want to forgo the virtues of
explanations in terms of truth conditions; and that of the minimalist about
truth, for whom truth is so innocuous that you shouldn’t be able to rule out
its application to some bit of language to which it prima facie applies. I shall
set aside the former, as not my present concern. Let us turn to minimalism
about truth. Those who defend it take Ramsey’s remarks in ‘Facts and
Propositions’ (1927) as a source. The aim of that paper is to give an analysis
of judgement, belief, and assertion. And he says:
But before we proceed with the analysis of judgement, it is necessary to say
something about truth and falsehood, in order to show that there is really no
separate problem of truth but merely a linguistic muddle. Truth and falsity are
ascribed primarily to propositions. The proposition to which they are ascribed may
be either explicitly given or described. Suppose first that it is explicitly given; then it
is evident that ‘It is true that Caesar was murdered’ means no more than that Caesar
was murdered, and ‘It is false that Caesar was murdered’ means that Caesar was not
murdered. . . .
In the second case in which the proposition is . . . not explicitly given we have
perhaps more of a problem, for we get statements from which we cannot in
ordinary language eliminate the words ‘true’ and ‘false’. Thus if I say ‘He is always
right’, I mean that the propositions he asserts are always true, and there does not
seem to be any way of expressing this without using the word ‘true’. But suppose
we put it thus: ‘For all p, if he asserts that p, p is true’, then we see that the
propositional function p is true is simply the same as p, as e.g. its value ‘Caesar was
murdered is true’ is the same as ‘Caesar was murdered’. We have in English to add
‘is true’ to ‘p’ to give the sentence a verb, forgetting that ‘p’ already contains a
(variable) verb. . . .
. . . If we have analysed judgement we have solved the problem of truth. (pp.
38–9)
3 This proof of Stalnaker’s is published as a letter to van Fraassen in van Fraassen (1976:
303–4). The reasoning is reproduced in Gibbard (1981: 219–20), Edgington (1995: 276–7),
and Bennett (2003: 71–3).
About the second case, Ramsey seems to be saying that it is only for
syntactic reasons that we need to use the word ‘true’—to ‘give the sentence
a verb, forgetting that “p” already contains a (variable) verb’. If our language
were endowed not only with pronouns but with ‘prosentences’, the word
‘true’ would not be needed. ‘Everything he says is true’ is just a way of
generalising over all instances of ‘If he said that p, then p’. In David Lewis’s
words:
The mention of truth lets us formulate generalisations that make long stories short,
but the long stories made short are not about truth. [For example] ‘Whatever the
Party says is true’ is equivalent to an infinite bundle of conditionals ‘If the Party
says that two and two make five, then two and two make five’, ‘If the Party says
that we have always been at war with Eastasia, then we have always been at war
with Eastasia’. And so on, and so forth. (2001: 279)
tells us nothing about truth. It is about the existential grounding of the purring of
cats. All other instances of the truthmaker principle are likewise equivalent, given
the redundancy biconditionals [of the form ‘It is true that p iff p’], to this infinite
bundle of conditionals not about truth but about the existential grounding of all
manner of other things: the flying of pigs, or what-have-you. (2001: 279)
VI
If truth is such a thin notion, can it be right to say that it doesn’t apply to
conditionals (and other types of judgement where its application is
controversial in philosophy, such as ethical judgements)? Doesn’t
‘Everything he said on that occasion was true’ fulfil its sole function when
what he said was that today is Sunday, what you are doing is wrong, and if
you keep on doing that you will be punished? Can there be a case for
denying truth to conditionals while still being a minimalist about truth?
I shall cite a number of philosophers, including Ramsey, who would
answer ‘yes there can’. And then I shall turn to some more signs of the non-
truth-bearing behaviour of conditionals construed in Ramsey’s way.
To be a minimalist about truth is not necessarily to be liberal about what
it takes to be a bearer of truth. This point is argued by Jackson, Oppy, and
Smith (1994). And here is Lewis to this effect:
I take our topic to be, in the first instance, the truth of propositions. Sentences, or
sentences in context, or particular assertions of sentences, or thoughts, can
derivatively be called true; but only when they succeed in expressing determinate (or
near enough determinate) propositions. A sentence (or . . .) might fail to express a
proposition because it is ambiguous; or because it is vague; or because it is
paradoxical or ungrounded; or because it is not declarative; or because it is a mere
expression of feeling in the syntactic guise of a declarative sentence.4 Such a
sentence (or . . .) is not a candidate for the status of truth simpliciter. But it might be a
candidate for a related status such as truth on all or some of its disambiguations, or
truth on all or most or few of its precisifications, or . . . make-believe truth. (2001:
276)
(Or, one might add, following a couple of examples from Hartry Field
(1994), truth relative to a set of norms; truth relative to a frame of reference;
or truth relative to a supposition.) For Lewis, clearly, minimalism about
truth is compatible with stringent requirements on when a sentence or
utterance is a candidate for truth.
4 Lewis adds a footnote here: ‘When people in philosophy books go to the footy, they
express their feelings by saying “Boo!” or “Hooray!”. Real people use a wider range of
expressive locutions. Some of them have at least the superficial form of declarative sentences:
“Leeds boot boys rule” or “Collingwood sucks”. A (pompous) bystander might indeed
respond to such a sentence by saying “That’s true” or “That’s false”, but calling it doesn’t
make it so. Unless the sentence did after all express a true or false proposition, the
bystander’s response would be just a piece of make-believe.’
Ramsey’s view was similar. Ramsey did not say that there was no problem
of truth, but no separate problem of truth—separate from the analysis of
judgement. ‘Truth and falsity are ascribed primarily to propositions’, he also
said (1927: 38). Much of ‘General Propositions and Causality’ is devoted to
arguing that ‘Many sentences express cognitive attitudes without being
propositions; and the difference between saying yes or no to them is not the
difference between saying yes or no to a proposition. This is even true of the
ordinary hypothetical’ (1929a: 147–8, my emphasis). Perhaps the presence of
‘cognitive’ in the above quotation is at first sight surprising. But there must
be a good sense in which conditional statements ‘If you strike it, it will
light’, ‘If Mary didn’t cook the dinner, John did’, express cognitive
attitudes—express beliefs. Read Ramsey’s way, they do not express
categorical beliefs about the way things are; but they express a conditional
belief in the consequent, under the supposition that the antecedent is true.
Ramsey stresses that disagreement about conditionals is not disagreement
about whether a proposition p or its negation ¬p is true:
If B says ‘If I eat this mince pie I shall have a stomach ache’ and A says ‘No you
won’t’ he is not really contradicting B’s proposition … [This example] asserts
something for the case when the [antecedent] is true: we apply the Law of Excluded
Middle not to the whole thing but to the [consequent] only. (1929a: 147–8)
That is, we both suppose that he eats the pie. Under that supposition, either
he will get a stomach ache or he won’t, and we disagree about which, or
their relative likelihoods; we have different views about the likelihood of
(eating it and getting a stomach ache) relative to that of (eating it and not
getting a stomach ache).
And about a similar case, Ramsey says: ‘Before the event we do differ
from him in a quite clear way: it is not that he believes [a proposition] p, we
¬p; but he has a different degree of belief in q given p from ours; and we
can obviously try to convert him to our view’ (1929a: 155). And then comes
the famous footnote.
The main business of this paper of Ramsey’s is to argue that judgements
of causal law, also called variable hypotheticals, are not judgements about
propositions:
Variable hypotheticals are not judgements but rules for judging ‘If I meet a φ, I shall
regard it as a ψ’. This cannot be negated but it can be disagreed with by one who does
not adopt it . . . A variable hypothetical is not strictly a proposition at all . . . The
difficulty comes fundamentally from taking every sentence to be a proposition.
(1929a: 149, 159, 162)
VII
It would be nice to be able to state in an informative and illuminating way
the necessary and sufficient conditions for being a truth bearer. I don’t
know how to do that. In this section I draw attention to a number of further
features of conditionals as construed by Ramsey which seem to differentiate
them from propositions.
VIII
Consistently with the Ramsey Test, we may partially reinstate truth values
for conditionals, and say that a conditional is true if it has a true antecedent
and consequent, false if it has a true antecedent and false consequent, and
has no truth value if its antecedent is false. All the problems with assigning
truth values arise for the case in which the antecedent is false. If we do say
this, we need to give up some cherished notions. It is no fault in a
conditional that it has a false antecedent and hence is not true. It is no merit
in a conditional that it has a false antecedent and hence is not false. Beliefs
and assertions do not aim at truth. Rather, we want our conditional
judgements to be true on the assumption that they have a truth value, true
on the assumption that they are either true or false, i.e. we want the
consequent to be true on the assumption that the antecedent is true. The
probability of a conditional is not the probability that it is true, but the
probability that it is true given that it is either true or false. This is equivalent
to Ramsey’s idea.
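The equivalence can be spelled out in one line of arithmetic. The derivation below is a sketch under the partial assignment just described, on which ‘If A, B’ is true in the A&B cases, false in the A&¬B cases, and without truth value in the ¬A cases. The labels ‘true’ and ‘true or false’ are shorthand introduced here, and the proviso that p(A) is positive is assumed.

```latex
\begin{align*}
p(\text{true} \mid \text{true or false})
  &= \frac{p(A \,\&\, B)}{p(A \,\&\, B) + p(A \,\&\, \neg B)} \\
  &= \frac{p(A \,\&\, B)}{p(A)} \\
  &= p(B \text{ given } A), \qquad \text{provided } p(A) > 0 .
\end{align*}
```

The ¬A cases drop out of both numerator and denominator, which is why the measure is insensitive to how probability is distributed among them.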
A minimalist about truth will not like this option, however, for it departs
from the central tenet of minimalism. On this option ‘It is true that if A, B’
is not equivalent to ‘If A, B’. Ramsey’s own view, I have argued, is quite
consistent: minimalism about truth combined with the denial that our
hypothetical judgements behave like truth bearers.
What is Squiggle? Ramsey on
Wittgenstein’s Theory of Judgement
PETER M. SULLIVAN
I
At the age of 20, and fresh from his undergraduate studies in mathematics,
Ramsey set about writing what would be his first substantial publication, his
1923 Critical Notice of Wittgenstein’s Tractatus (hereafter TLP). It is hard
for modern students of that book, who negotiate its obscurities with
generations of previous commentary to serve as guides, to appreciate the
task Ramsey confronted; and, to the extent that one can appreciate it, it is
hard not to feel intimidated by the brilliance of his success. His Critical
Notice made Ramsey the first of Wittgenstein’s interpreters.1 In my view it
makes him, still, the best.
I want to illustrate that here by considering what light his remarks cast on
a single passage of the book, in which Wittgenstein advances what might be
called his theory of judgement.
5.54 In the general propositional form, propositions occur in a proposition only
as bases of the truth-operations.
5.541 At first sight it appears as if there were also a different way in which one
proposition could occur in another.
Especially in certain propositional forms of psychology, like ‘A thinks that p
is the case’, or ‘A thinks p’, etc.
Here it appears superficially as if the proposition p stood to the object A in a
kind of relation.
(And in modern epistemology (Russell, Moore, etc.) those propositions
have been conceived in this way.)
5.542 But it is clear that ‘A believes that p’, ‘A thinks p’, ‘A says p’, are of the
form ‘ “p” says p’, and here we have no co-ordination of a fact and an
object, but a co-ordination of facts by means of a co-ordination of their
objects.
5.5421 This shows that there is no such thing as the soul—the subject, etc.—as it is
conceived in contemporary superficial psychology.
A composite soul would not be a soul any longer.
1 Russell was too much a collaborator, I think, to be counted as first among
Wittgenstein’s interpreters.
Part of what is going on here is well understood. This is, roughly, that a
fact can be represented only by a fact that manifests in itself the structure of
the fact represented, and thus that whatever represents a state of affairs has
to be complex in the same way as a proposition that might represent that
state of affairs. That was first spelled out by Ramsey, and is now common
ground amongst interpreters. Away from that common ground, though,
interpretations soon diverge. In particular, they diverge even over such a
seemingly basic question as whether Wittgenstein’s analytical proposal is
intended to make clear the sense that propositional attitude statements have,
or whether instead it is intended to make clear that they have no sense. I’ll
suggest that this question and others that have seemed pressing to later
commentators have done so because the focus and intention of
Wittgenstein’s proposal have been misidentified, in ways that attention to
Ramsey can correct.
For the purpose of this illustration I’ll begin by contrasting the accounts
of the passage offered by two of the best of later commentators on the
Tractatus, Elizabeth Anscombe and Anthony Kenny. Although they wrote
forty and thirty years ago, their discussions have not been surpassed. For
Ramsey’s views I will draw on, in addition to the 1923 Critical Notice, his
1927 paper ‘Facts and Propositions’. Only the first of these is explicitly
exegetical. In the second Ramsey is addressing the issues in his own right.
Towards the end of this paper, though, Ramsey makes the following very
suggestive acknowledgement: ‘In conclusion, I must emphasize my
indebtedness to Mr Wittgenstein, from whom my view of logic is derived.
Everything that I have said is due to him, except the parts which have a
pragmatist tendency, which seem to me to be needed in order to fill up a
gap in his system’ (1927: 51). This suggests a simple subtractive scheme for
arriving at an exegesis from a non-exegetical treatment:
X–Y = Z,
where
X = the total view of ‘Facts and Propositions’,
Y = the ‘parts which have a pragmatist tendency’,
so that
Z = what Ramsey took to be the Wittgensteinian basis to which he was
adding.
The pragmatist theory of content that Ramsey proposed to fill the ‘gap’ is
evidence of his genius as an original philosopher in his own right. What I
want to emphasize about Ramsey as an interpreter is his recognition of the
gap, or, in other words, his understanding of what kinds of issues and
questions Wittgenstein’s account aimed to settle and which it left open.
The pragmatist elements of ‘Facts and Propositions’ have more than one
role. The most obvious but least important such role is to distinguish
different attitudes—judgement, supposition, hope, expectation—towards, as
we say, the same propositional content. This role I will ignore throughout
the chapter. Wittgenstein’s ‘theory of judgement’, like Russell’s, is indifferent
to such contrasts, so might be better called a theory of understanding (cf.
Russell 1992: 107). The second and most important role is to make good a
shortcoming in Wittgenstein’s account that Ramsey had already pointed to
in the Critical Notice, extending this theory of understanding from
elementary propositions to propositions in general. For much of this
chapter (Sections V–VIII) I will be able to adopt the simplifying assumption
that this is the role of Ramsey’s pragmatist innovations, so that removing
these additions returns us to the elementary case. In the final section of the
chapter, though, I will acknowledge a role that Ramsey rightly found for his
pragmatist tendency in the elementary case too.
II
Anscombe’s reading of our passage has two main planks. The first is the
common ground I’ve already sketched, but which she sets out more
thoroughly and clearly in the following passage.
‘it is clear’, [Wittgenstein] says; and of course what was clear to him was that for
anything to be capable of representing the fact that p, it must be as complex as the
fact that p; but a thought that p, or a belief or statement that p, must be potentially a
representation of the fact that p (and of course actually a representation of it, if it is
a fact that p). It is perhaps not quite right to say that ‘A judges p’ is of the form ‘ “p”
says that p’; what he should have said was that the business part of ‘A judges that
p’, the part that relates to something’s having as its content a potential
representation of the fact that p, was of the form ‘ “p” says that p’: ‘A believes p’
must mean ‘There occurs in A or is produced by A something which is (capable of
being) a picture of p’. (Anscombe 1959: 88)
Summarizing this, we might say that on Anscombe’s analysis the gist of
‘A judges p’ is to be given in the pattern:
(AA) A’s mental bits are configured thus and so, and the fact that they
are so configured represents that p.
As Wittgenstein told Russell (Wittgenstein 1995: 125), and as Ramsey seems
to have known without needing to ask, it doesn’t matter at all what these
mental bits might be.
A thought is a type whose tokens have in common a certain sense, and include the
tokens of the corresponding proposition, but include other non-verbal tokens;
these, however, are not relevantly different from the verbal ones, so that it is
sufficient to consider the latter. (Ramsey 1923: 274)
Anscombe follows this recommendation in formulating the second plank of
her reading, that what she takes to be the ‘business part’ of (AA), the part
that follows the comma, is by the lights of the Tractatus a significant, bipolar
proposition. She models this part of her discussion on the illustration
Wittgenstein presents at TLP 3.1432,
that ‘a’ stands in a certain relation to ‘b’ says that aRb,
asking: What actually is that certain relation? There seem to be true and false
answers to this question. Given the notational conventions she supposes in
force, for instance, the following is true (where ‘^’ indicates concatenation):
that ‘a’ stands to ‘b’ in the relation one establishes between names n and
m by writing n^‘R’^m says that aRb,
while the following is false:
that ‘a’ is separated by one character from ‘b’ says that aRb.
With different conventions different instances of the ‘business part’ of (AA)
would hold. For instance, in the loose style that many of us adopt, one that
tolerates either of ‘aRb’ or ‘Rab’ indifferently, we should have:
that ‘a’ occurs in a three-character string also containing ‘R’ and ending
with ‘b’ says that aRb.
According to Anscombe’s reading, then, a specific instance of the pattern
(AA)—or, as she puts it, some particular ‘interpretation’ of ‘ “p” says that p’,
in which ‘ “p” ’ is replaced by a formulation of the representing fact that
constitutes that proposition (1959: 90)—will be a contingent, bipolar
proposition giving the content of a given report, ‘A judges that p’.
III
There are three obvious criticisms to be brought against Anscombe’s
reading. Two are raised by Kenny, the third not.
1. The first questions whether the examples just given should persuade us
that ‘ “p” says that p’ can be the form of a contingently true statement. To
imagine a change from true to false in such a statement amounted, we saw,
to imagining a change in linguistic conventions. On the conventions
Anscombe supposes in force, the last example,
that ‘a’ occurs in a three-character string also containing ‘R’ and ending
with ‘b’ says that aRb,
would be false; but with the laxer conventions I mentioned it would be true.
The subject matter of this sentence must therefore be something that can
remain the same while the conventions governing it are altered. It must be,
in Wittgenstein’s terminology, a sign rather than a symbol (TLP 3.321–
3.322). But Wittgenstein seems to count it a mistake to ascribe any
representational character to a mere sign. And if that is so, then the way
Anscombe aims to provide for such a statement to be merely contingently
true will, by Wittgenstein’s lights, prevent it from being a truth at all. Thus
Kenny:
‘ “p” says that p’ does not have true–false poles. For what appears within the nested
quotation marks is either—as Anscombe understands it—a description of
accidental features of the propositional sign, in which case the proposition is always
false; or it is a description which identifies ‘p’ precisely as the proposition that says
that p; in which case the proposition is necessarily true (and therefore, for
Wittgenstein, a pseudo-proposition). (1984: 7)
This complements an earlier and simpler argument to the same
conclusion:
But ‘ “p” says that p’ must be a pseudo-proposition, since a proposition shows its
sense and cannot say that it has it (TLP 4.022). (Kenny 1973: 101)
Neither argument is altogether persuasive. The thought of TLP 4.022,
that a proposition shows its sense, emphasizes Wittgenstein’s conception of
the proposition as a standpoint of representation (TLP 2.173), a transparent
medium through which we are presented with reality. Yet that conception
seems not to preclude another straightforwardly empirical standpoint, one
that focuses on the medium rather than looking through it. TLP 4.022
might be better rendered ‘A proposition shows the state of affairs that is its
sense’, rather than ‘A proposition shows that this state of affairs is its sense’.
Kenny would need the second, rather than the first, to argue that, since it is
shown what sense ‘p’ has, it cannot be said what sense it has.
As for the dilemma argument, it is certainly true that for Wittgenstein a
symbol, and not a mere sign, is the proper bearer of meaning: that means,
firstly, that only a symbol can have the kind of internal relation to reality
that Wittgenstein takes to be involved in the philosophically important
notion of meaning (TLP 3.31); and, secondly, that all manner of philosophical
confusions result if that primary notion of meaning is imagined to
attach to a mere sign (TLP 3.324). But again, accepting the priority of the
meaning of symbols seems not to preclude, but to make possible, a
secondary notion attaching to signs, when those signs are, contingently, the
‘visible parts’ of meaningful symbols (TLP 3.32).2 It was, Wittgenstein said,
for the empirical science of psychology to determine what are the actual
constituents of thought and the kind of relation they bear to things (1995:
125). Yet Kenny’s argument leaves to psychology no form in which to
report its results.
IV
Although the previous section casts doubt on some of his arguments,
Kenny’s overall picture is, I believe, much closer to Wittgenstein’s
intentions.
Suppose that I think a certain thought: my thinking that thought will consist in
certain psychic elements—mental images or internal impressions, perhaps—
standing in a relation to each other. That these elements stand in such and such a
relation will be a psychological fact; a fact in the world, within the purview of the
natural sciences; just as the fact that the penholder is on the table is a physical fact
within the purview of the natural sciences. But the fact that these mental elements
have the meaning they have will not be a fact of science, any more than the fact (if
it is a fact) that the penholder’s being on the table says that the cat is on the mat (if
the appropriate code is in force). (Kenny 1984: 8)
This section does not aim to challenge that picture, but only to question
whether Kenny’s discussion can properly lead us to it.
As we’ve already seen, Kenny denies that anything of the pattern ‘ “p”
says that p’ can be a contingent truth. From that he concludes that ‘belief
propositions must be pseudo-propositions’—
or more precisely, they will be the conjunction of a genuine proposition and a
spurious one. The proposition that Jones believes that grass is green will be a
conjunction of (1) the proposition that certain mental elements in Jones’s mind are
related in a certain way, and (2) the pseudo-proposition that their correlation in that
way says that grass is green (1973: 101).
And this, Kenny contends, removes the original problem that Anscombe’s
analysis merely relocated:
if ‘ “p” says that p’ is only a pseudo-proposition, and ‘A believes that p’ is of that
form, it is easy to see how propositions reporting beliefs are no exception to the
rule that propositions can occur in other genuine propositions only as the bases of
truth-functional operations. (1984: 7)
But, now, is the original problem, the appearance that ‘p’ occurs as
argument to a non-extensional function in ‘A judges p’, really solved? That
depends, I think, on what we take the original problem to have been: does it
turn on the apparent non-extensionality of the argument-place occupied by ‘p’,
or is the problem that the non-extensional place occupied by ‘p’ appears to
be an argument-place? Kenny’s solution presumes the first. He is content to
leave us with the appearance that ‘p’ does indeed occur in whatever analyses
‘A judges p’, while protecting the ‘rule’ of extensionality for ‘genuine’
propositions: on his account, one might say, ‘p’ does occur non-extensionally
in a non-proposition. (Since ‘A judges p’ is not a genuine proposition, it has
no truth value, so no truth value that varies independently of that of the
contained proposition ‘p’.) If, on the other hand, the second formulation of
the problem is the right one, we should expect a solution of it to reveal that
‘p’ does not occur as an argument at all in whatever genuine proposition
replaces ‘A judges p’. And the history of our passage, which shows
Wittgenstein to be concerned in the first instance to dispel the impression
that ‘p’ has in a belief report an occurrence comparable to that of the name
‘b’ in ‘aRb’, suggests strongly that the second formulation is the right one.3
The question arises how can one proposition (or function) occur in another proposition?
The proposition or function can’t possibly stand in relation to other symbols. (1961a:
118)
There are internal relations between one proposition and another; but a proposition
cannot have to another the internal relation which a name has to the proposition of which
it is a constituent, and which ought to be meant by saying it ‘occurs’ in it. In this sense
one proposition can’t ‘occur’ in another. (1961a: 116)
(I rely here on the premise that any relation to a proposition p would be expressed in a
proposition in which p occurred as relatum.)
4 This point happily allows me to avoid the general question, much discussed in recent
work on the Tractatus, whether there is confusion in relying on the idea that a piece of
nonsense might be ‘correct’.
V
Ramsey finds the business part of a belief report where several of our
considerations have suggested it ought to be found, in the description of A’s
mental set-up. The first passage of the Critical Notice to make this plain is
not focused primarily on our passage, but occurs in Ramsey’s account of the
picture theory; he is expounding the claim that ‘the representing relation’,
the correlations between picture elements and objects, ‘belongs to the
picture’ (TLP 2.1513).
. . . this, I think, means that whenever we talk of a picture we have in mind some
representing relation in virtue of which it is a picture. Under these circumstances we
say that the picture represents that the objects are so combined with one another as
are the elements of the picture, and this is the sense of the picture. And I think this
must be taken to be the definition of ‘represents’ and of ‘sense’; that is to say, that
when we say that a picture represents that certain objects are combined in a certain
way, we mean merely that the elements of the picture are combined in that way, and
are co-ordinated with the objects by the representing relation which belongs to the
picture. (That this is a definition follows, I think, from 5.542.) (1923: 271)
Ramsey is here concerned with two forms of statements. The first,
a picture represents that certain objects are combined in a certain way,
shares with ‘ “p” says that p’ the shape: subject–verb–complement. Its
subject seemingly refers to a representing item (a picture, proposition, or
thought); its clausal complement then formulates the represented state of
affairs. The second,
the picture elements are combined in that way, and are co-ordinated with the
objects,
begins by replacing the apparent reference to a representing item (“p”) by
the schema of a sentence that would state the representing fact; it then
dispenses altogether with the clausal complement by which the represented
fact would be formulated (‘that p’), and thus with the representational verb
that introduces it (‘says’). Ramsey cites our passage as entailing that the first
of these statements is defined by, or reduces to, the second. For that to be
so, our passage must have the following implications:
1. No representational relation between ‘p’ and the possible fact that p
figures fundamentally in what is asserted by ‘ “p” says that p’ (nor
therefore in ‘A judges p’).
2. Nor is there in that statement any mention of, or formulation of,
the possible fact that p; i.e. the apparent occurrence of ‘p’ on the
‘right-hand side’ of ‘ “p” says that p’ (and therefore of ‘A judges p’)
will, on a right account, simply disappear.
VI
The major differences between Ramsey’s two discussions occur in parts that
I intend to cut away, and that allows me to swap backwards and forwards
between them. I’ll give priority to the account in the Critical Notice, drawing
on ‘Facts and Propositions’ for comparisons and confirmation. This earlier
discussion starts under the simplifying assumption that we have to deal with
only one logical symbolism, so that
apart from variation in the names used, there would be a rule giving all
propositional signs which, in that symbolism, had a certain sense, and we could
complete the definition of ‘sense’ by adding to it these rules. Thus ‘ “p” says that
~aRb’ would, supposing us to be dealing with the symbolism of Principia
Mathematica, be analysed as follows: call anything meaning a, ‘a’, and so on, and call
‘a’ ‘R’ ‘b’,7 ‘q’; then ‘p’ is ‘~q’ or ‘~~~q’ or ‘~q ∨ ~q’ or any of the other symbols
constructed according to a definite rule. (1923: 278)
7 The expression, ‘ “a” “R” “b” ’ might look to be simply a string of quotations, but what
is intended is a proposition. Ramsey uses ‘ “R” ’ here, and I will use it in what follows, as a
relational expression: it expresses the relation between names defined in the previous
quotation from Ramsey (1923: 275).
Let’s first strip away from this all the complexities introduced by
Ramsey’s attempt to deal with truth-functional complexity. That will allow
us to drop the material about rule-generated equivalents of ‘~q’, at the same
time swapping the example from ‘~aRb’ back to the elementary ‘aRb’. The
result is:
‘“p” says that aRb’ is to be analysed: call anything meaning a, ‘a’, and so
on; then ‘p’ is ‘a’ ‘R’ ‘b’.
Now convert this—in a fashion Ramsey clearly anticipates later in the same
paragraph—to an analysis of ‘A judges p’. We then have:
Call that by which A means a, ‘a’, and so on; then ‘A judges aRb’ is to
be analysed: ‘a’ ‘R’ ‘b’.
The central feature I’ve aimed to preserve through this succession of cuts
and simplifications is marked in Ramsey’s original statement of his proposal
by the striking formulation, ‘then “p” is “~q” or “~~~q” [and so forth]’.
This formulation embodies the idea, which grounds the implications 1–3 of
the previous section, that to say what a propositional sign says, given an
allocation of names, is just to say which propositional sign it is. Adapted to an
account of belief in accordance with TLP 5.542, this central idea becomes:
to report a belief is simply to report, in that identifying way, the occurrence
of a belief token; or in other words, to say what A believes is simply to say
how his mind is set.
The same conclusion can be reached from the account in ‘Facts and
Propositions’.
If then I say about someone whose language I do not know ‘He is believing that
not-aRb’, I mean that there is occurring in his mind such a combination of a feeling
and words as expresses the attitude of believing not-aRb, i.e. has certain causal
properties, which can in this simple case, be specified as those belonging to the
combination of a feeling of disbelief and names for a, R, and b, or, in the case of
one who uses the English language, to the combination of a feeling of belief, names
for a, R, and b, and an odd number of ‘not’s. (1927: 44–5)
Again, strip away from this everything introduced only to deal with truth-
functional complexity (and thus the parts with a pragmatist tendency), and
what remains is:
To say of someone ‘He is believing aRb’ is to say that there is occurring
in his mind a combination of names for a, R, and b.8
So again, we reach the conclusion that to report his belief is simply to say
how his mental bits are configured.
8 Along with—if you must—a feeling of belief. Ramsey allows his reader the freedom to
substitute for his talk of feelings ‘any other word . . . which the reader prefers, e.g. “specific
quality” or “act of assertion” and “act of denial” ’ (1927: 144 n.). Suppressing the pragmatist
tendency—and thus reassuming Wittgenstein’s indifference between believing, assuming,
suspecting, denying, or whatever—I have preferred in the main text to substitute nothing for
it.
VII
The account of judgement Ramsey finds in Wittgenstein is thus strictly
parallel to the theory advanced by Geach in Mental Acts (1957, sect. 14).9
Geach speaks of Ideas of objects, where by an Idea is meant ‘the exercise of
a concept in judgement’ (1957: 53). An Idea of a would thus be, in Ramsey’s
way of speaking, a mental tokening of a name of a.10 Geach also introduces
an undefined operator on relations ‘§( )’, pronounced ‘squiggle’, such that ‘if
a relational expression is written between the brackets, we shall get a new
relational expression of the same logical type as the original one. If “R” is
dyadic, so is “§(R)”, . . . and so on’ (1957: 52). In Geach’s stark presentation
‘§( )’ is formally undefined, but the intention is clear: for A’s Ideas of a and b
to stand in the relation §(R) will be for A to judge that aRb. Geach’s relation
§(R) is thus Ramsey’s relation ‘R’: it is a mental tokening of a name of R.11
So, expressing Ramsey’s theory in Geach’s terms, we have:
A judges that aRb =def I(a) §(R) I(b).
Readers of Geach have naturally enough asked: What is squiggle? Its
behaviour is in some respects unusual. It is, as Geach points out (1957: 53),
clearly non-extensional: that §(cordate) I(a) will not amount to anyone’s
judging that a is a renate. And its syntax is, to put it mildly, non-Fregean:
what type of function is it that takes arguments of different types and
delivers values of correspondingly different types?12
9 Though Geach later acknowledges the influence of the Tractatus on his theory (1957:
101), he does not mention Ramsey at all in Mental Acts ; I take that merely to illustrate the
minimal referencing conventions in vogue in 1957.
10 I am here bending, perhaps distorting, what Geach says. Geach’s Ideas are intrinsically
general: they are Ideas of any knife, or of some spoon, rather than of spoon a or knife b (1957:
53–4); or they are Ideas simply of flash and bang, rather than of this flash or that bang (1957:
63). There are two reasons for this. The less relevant is the grinding of Geach’s customary
axe, that name, and not proper name, is the fundamental syntactic category. The more relevant is
his view that ‘there is more hope that an account designedly adequate for general judgements
will turn out to be adaptable to singular judgements, than there is of the reverse adaptation’
(1957: 63). That reverse adaptation is of course Ramsey’s aim in the parts of ‘Facts and
Propositions’ that I have cut out. I am similarly cutting out Geach’s scepticism about its
prospects in translating his theory into an account of singular judgements.
11 Remember again that a name of a relation is itself a relation. As Geach makes plain
elsewhere (1961), this is in harmony with Frege’s view that the name of a function is itself a
function. Geach points out there that a name of a function can occur in a written expression
when there is no ink-mark of it to impress itself on the eye, as the name of the
exponentiation function occurs in ‘3²’. Similarly a mental name of a relation can occur when
there is no phenomenological ‘ink-mark’ of it to impress itself on the inner eye.
12 Geach in the previous section complains that Russell’s multiple relation theory
‘require[s] different relations of judging (differing as to the number and logical types of the
terms between which they hold) for every different logical form of sentences expressing
judgements’ (1957: 49), e.g. a triadic relation for ‘s judges that Fb’, J(s, F, b), and a tetradic
relation for ‘s judges that aRb’, J(s, a, R, b). If we were to construe ‘§( )’ as any kind of
functional expression, it would seem in hardly better shape.
If these are legitimate concerns, then they should spread to embrace I( ).
No function delivers, for a as argument, the concept of a, or the mental
name of a; for there is no such thing as the concept or mental name of a.
And unless it is presumed, as in a Tractarian context it should not be, that
all objects are of a single type, I( ) must float across types as freely as §( ).
Geach seems untroubled by such thoughts. But do we really know what §( ),
or I( ), is meant to be?
VIII
Surprisingly, in the face of very similar questions Ramsey shares Geach’s
equanimity. Recall that the analytical proposal of the Critical Notice, quoted
in Section VI, assumed that ‘we had only to deal with one logical symbolism’
(1923: 278). At some point that false assumption has to be lifted: so long as
it is in force we have an analysis only of ‘A asserts p using such and such a
logical notation’, not of the neutral ‘A asserts p’; and, as Ramsey says, to
pass off the first for the second would have such effects as that ‘the
evidently significant fact that Germans use “nicht” for not becomes part of
the meaning of such words as “believe”, “think” when used of Germans’
(1923: 278). In the corresponding passage from ‘Facts and Propositions’,
also quoted in Section VI, Ramsey illustrates how, in a simple case,13 the
assumption can be lifted: the account there exploits the causal equivalence
of the attitudes of believing ~q and disbelieving q to avoid having to make
explicit reference to any negative particle. The two passages show that
Ramsey’s concern to eliminate unwanted language-relativity from his
analysis is focused solely on the language’s logical vocabulary. Surely,
though, the point that motivates the concern is broader than that. ‘But we
may very well know that a Chinaman has a certain opinion without having
an idea of the logical notation he uses’ (1923: 278), Ramsey says. Equally,
one would have thought, one might know that without knowing what
names he has for things; and certainly, if I ever know a Chinaman’s
unexpressed opinion, I know it without having any notion at all of what
mental bits are involved.
13 For the importance of the qualification, see Ramsey (1927: 149 n.).
If ‘ “a” “R” “b” ’, or the Geachian ‘I(a) §(R) I(b)’, is to give the content
of my report of A’s judgement, then in it ‘ “a” ’, or ‘I(a)’, must function as
my name of A’s name for a. But how am I to name that when I am not
acquainted with A’s name, when in truth I have no notion what it might be?
The Geachian formulation, in which, as we saw, ‘I( )’ and ‘§( )’ function
exactly as Ramsey’s quotation marks, might prompt the thought that my
names for A’s names are what Russell called ‘descriptive names’. But that
thought would work only if I( ) and §( ) were functional, and we’ve seen that
they are not. The suggestion would in any case not suit the Tractarian
context. Most importantly, though somewhat vaguely, it would undermine
the appeal of Ramsey’s analysis, which lies in the idea that my portrayal of
A’s judgement pictures it precisely as it in turn pictures reality: if the
apparent complexity in ‘I(a) §(R) I(b)’ were really operative, so that my
judgement and A’s differed essentially in their multiplicity, then my report
would no longer display how A’s mental bits are configured. Less importantly,
though more concretely, complex terms such as ‘I(a)’ would on this account
be, have no place in the language envisaged in the Tractatus.
A natural response to the points just made would be to retreat to a
generalization, so that my report would have the form
(∃x, y, S): x names a . y names b . S names R . x S y,
and this is indeed one of the ways in which Ramsey presented Wittgenstein’s
analysis in his lectures.14 In the face of those points, though, this formulation
can be only a temporary retreat, as can most easily be shown by
reviewing the sign–symbol dilemma that structured the dispute between
Anscombe and Kenny. The quantified variables the generalization employs
cannot be supposed to range over symbols. That would raise again Kenny’s
worries over whether its first three conjuncts are attempts to say what can
only be shown. More definitively, in my view, this worry would now clearly
also afflict the fourth conjunct: while an arrangement of signs can be
described in a proposition, the logical combination of symbols cannot (TLP
4.21ff.). Ramsey, in any case, is clear throughout that his analysis concerns
configurations of signs. But that alternative now also seems to be
unworkable. The retreat to a generalization was forced by the thought that I,
as reporter, need have no notion of what verbal or mental signs A happens
to employ. Equally, though, I have no notion of how those signs, whatever
they might be, are connected with things. If the variables in this formulation
range over signs, then the naming it speaks of is an external relation. What
relation this is, just as much as what those signs are, is something for
psychology to find out (Wittgenstein 1995: 125). How can I report that
things stand in this relation if I have no notion what the relation might be?
The question ‘What is the naming relation?’ is the form now taken by the
question ‘What is squiggle?’. Our analysis seems both to need and to
preclude an answer to it.
14 Ramsey expounded Wittgenstein’s analysis in his Lent Term 1925 lectures entitled
‘Foundations of Mathematics’. Notes of these lectures, taken by L. H. Thomas, read:
The meaning of such a proposition as ‘A asserts aRb’ is now analysed as
(∃ ‘a’, ‘b’, R′): ‘a’, ‘b’ are in A’s mind . ‘a’ M a . ‘b’ M b . R′ M R . ‘a’ R′ ‘b’.
In this analysis no facts occur and it does not presuppose aRb. (1925b: 40)
(It is explained that M represents ‘means’; that M is of different type from M; and that R′
represents the relation that signs ‘a’ and ‘b’ have in ‘aRb’.)
Ramsey’s focus at this point in his exposition is most clearly brought out by his first
comment on the displayed formulation (effectively embodying implications 1–3 of Section V
above), that ‘in this analysis no facts occur’. Later in the notes Ramsey turns to the shortcoming
in Wittgenstein’s account that he had identified in the Critical Notice and was to
address with the pragmatist innovations of ‘Facts and Propositions’. The notes then read,
very much in the vein of Anscombe’s analysis (see again (AA) in Section II above), and in
apparent tension with the key point that ‘no facts occur’:
We can reduce ‘A asserts p’ to ‘p’ says p just as before—in A’s mind there is a symbol ‘p’
and ‘p’ says p. (p. 45)
The apparent tension with the earlier point is, surely, no more than that—i.e. it is simply a
consequence of Ramsey’s shifting his attention to a new point. In just the same way, I think,
the fact that Ramsey was content to make that earlier point by means of a formulation of
Wittgenstein’s proposal similar to that discussed in the text does not show that he would
have accepted this formulation as adequate in every respect. Each of these two formulations is
adequate to the point in hand at the relevant stage of Ramsey’s exposition. (I am grateful to
Michael Potter for reminding me of the need to square the points made in the text with
Ramsey’s presentation of the issues in these lectures.)
IX
Ramsey, as I noted, is no more fazed by the question than Geach is. He
glides past the difficulties just outlined with the phrase ‘apart from variation
in the names used’ (1923: 278), spoken in the tone of ‘apart from negligible
details’. Those difficulties don’t worry him, I think, because they are
irrelevant to the ‘formal standpoint’ (1927: 41) his discussion adopts.
To make this clearer it helps to note an oddity of the way Ramsey
introduces his quotational names for a judger’s names, his equivalent of
Geach’s I-terms and §-terms. He says, ‘call anything meaning a, “a”, and so
on’ (1923: 278, quoted in context in Section VI). You might compare that to
an injunction ‘Call any elephant “Nellie” ’, to which a natural response is
that it asks you to do something you cannot do: you cannot name an
elephant ‘Nellie’ unless you know which elephant you are naming, and you
don’t know that just by being told that ‘Nellie’ is to name any elephant. At
any rate, that would be a reasonable response if it were supposed that
‘Nellie’ is to be really a name. It would, on the other hand, be an inept
response if ‘Nellie’ were intended merely as the kind of dummy-name
introduced in the course of a proof. (Compare ‘Let n be a real number
between 0 and 1’. ‘Which real number?’) Dummy-names like that are
schematic in two ways: they achieve generality, and they cover ignorance.
They are introduced on the back of premises enough to ensure that there is
something of a relevant kind to be named, but not enough to make its
naming really possible. Ramsey’s formulation seems suited only to
introducing a dummy-name like that. If that is what it actually does, it will
follow that subsequent analyses, which include these ‘names’, will inherit
their schematic character.15
A modern reader expects to extract from Ramsey’s discussion an account
of reports of belief, an explicit semantic explanation of how expressions
function in indirect speech. To someone with that expectation Ramsey’s
easy resort in the analysans to a device of quotation that is unexplained, and
that threatens to be inexplicable, is bound to appear as a serious flaw in his
account. But Ramsey’s concerns are different from this modern reader’s. He
announces his target as ‘the logical analysis of . . . judgement’, not of reports of
judgement (1927: 34, my emphasis). From the very traditional opening pages
of ‘Facts and Propositions’, and the character of his engagement with
Russell in them, it seems plain that Ramsey understood his problem very
much as Russell did when he wrote: ‘The problem at issue is the problem of
the logical form of belief, i.e. what is the schema representing what occurs
when a man believes’ (Russell 1922: 19). The question for both of them is:
What kind of thing is going on, or what kind of fact is it, when a man
believes something? And Ramsey’s answer to that runs, for the atomic case:
it is for him to have names of the things his belief concerns combined in his
mind in the way he believes those things to be combined. It is not a fault in
this answer that it presents the fact in question only by description, or that
the names that would be involved in the fact are not actually named, but are
mentioned only descriptively as names of such and such things. That
complexity attaches only to the analyst’s characterization of the fact, and not
at all to the fact characterized.
The same holds, I think, when the analysis of ‘A judges that aRb’ is
presented notationally by ‘ “a” “R” “b” ’, or by ‘I(a) §(R) I(b)’. Modern
conceptions of what such an analysis is intended to achieve, of what
questions it is supposed to answer and how, lead us to question how the
terms in the analysans are supposed to work. So we ask, ‘What is the semantics
for quotation here exploited?’, or ‘What relation is reckoned to hold
between I(x) and x?’, or again, ‘What is squiggle?’. Those questions miss
Ramsey’s drift, since for him ‘ “a” ’ is not genuinely a complex term at all. It
is just a stopgap, a stand-in for a name one is not in a position to know.
I’m sure that’s how Ramsey thought of his Wittgensteinian analysis; and
I’m sure he had Wittgenstein right in thinking of it that way. There is,
though, a natural thought that suggests this cannot be the end of the matter.
This natural thought, which I’m inclined to accept as correct, is that an
expedient is only properly counted an expedient if it is temporary. It implies
that one can brush aside questions about the complexity of such apparent
names as ‘ “a” ’ only if, in theory at least, those stopgap names could
eventually be replaced by real ones. And to the question of what theory
could meet that ‘in theory’ obligation, the only feasible answer is
psychology.
15 The point is not just that Ramsey’s analyses will be presented schematically, so as to
cover in one go a range of propositions. That much would be true of ‘x is a bachelor =def x is
male and x is unmarried’, a schema whose instances are full-fledged meaningful propositions.
The consequence drawn above is that the instances of Ramsey’s analysis will have a schematic
character.
There are passages in ‘Facts and Propositions’ where Ramsey clearly
envisages this kind of supplement from psychology.16 In those passages he is
certainly going beyond anything said in the Tractatus, but I see no reason to
hold that he is going against anything in Wittgenstein. What those passages
envisage might now go under the title of a ‘naturalistic theory of content’, an
empirical identification of a human being’s mental bits, and a description of
their external relations to things. Wittgenstein would of course have thought
it no business of a philosopher to supply such a thing. But the ‘gap’ for it is
there, as Ramsey says; and perhaps noting that is more important than
arguing over what to call people who try to fill it.17
16 With this point we abandon the simplifying assumption introduced in Section I, that
the pragmatist elements of ‘Facts and Propositions’ can be cut away by limiting oneself to the
elementary case: p. 45 of that article clearly envisages that a causal theory would extend to the
names occurring in an elementary proposition.
17 Thanks to Alan Millar and Michael Potter for comments on a draft, and to participants
at the Cambridge conference on Ramsey for challenging questions.
Ramsey’s Transcendental Argument
MICHAEL POTTER
One of the papers of Ramsey’s Nachlass which his widow, Lettice, sold to
the Hillman library at Pittsburgh is a set of notes entitled ‘The Infinite’.
Embedded in these notes is the following curious argument for the Axiom
of Infinity:
We can say that the idea of infinity proves its existence. (Wittgenstein’s extra prop).
But the sign proves nothing. We can prove it this way. It is clear that there may
be an ∞ of atoms and whether there are or not is an empirical fact, and this
possibility implies an ∞ of objects, as it were to be the possible atoms. In this way it
is clear that transcendentally taken the axiom of infinity is true, though empirically it
is doubtful.
I
Let us begin, then, by getting clear about the argument itself. What is clear
straight away is that the context Ramsey intends is the system of the
Tractatus, in which he had been immersed since he prepared the first draft of
its English translation early in 1922. So we cannot hope to understand
Ramsey’s argument without first going some way into this context. Now it
would be a brave man who confidently asserted what the key idea of the
Tractatus is. (Certainly what Wittgenstein himself calls his key idea—that
logical constants do not refer—is rather hard to present in a way that makes
it anything like the lynchpin of the book.) But it is at any rate one of the key
ideas of the Tractatus that the task of a proposition is not merely to say how
things stand in the world but to contrast the way they do stand with other
ways they could have stood but don’t. The job of a proposition, that is to
say, is to carve up the ways things might stand into two classes: the proposition
is then true or false according as the way things are is in one or other
of these two classes. And Wittgenstein takes it that these different ways
things might stand—possible worlds, to use the modern jargon—must have
something in common, in order that they should be different ways our world
could be rather than just wholly distinct worlds with nothing whatever to do
with one another. The elements which different possible worlds have in
common Wittgenstein calls objects. ‘Object’ is thus for Wittgenstein a
technical term, referring to whatever it is that our language presupposes in
order that it should be significant.
We should grant one thing straight away: it is hard to be confident that
the existence of objects really does flow from the key idea just alluded to.
Certainly the existence of objects is one of the first claims of the Tractatus
that Wittgenstein himself publicly renounced, and the only argument he ever
offers for believing it—the argument for substance of Tractatus 2.0211–2—is
notoriously brief and problematic. Nonetheless we must grant Wittgenstein’s
claim for the time being if we are to be in a position to appreciate
Ramsey’s argument, since it is a claim which Ramsey simply presupposes.
And if we do grant Wittgenstein’s claim, it is easy to see that a great deal
follows. It follows at once, for instance, that objects are necessarily existent,
just because they are by definition what different possible worlds have in
common. (And presumably, therefore, most of the humdrum things we
knock against in our mundane lives are not objects in Wittgenstein’s
technical sense, since it seems we can quite coherently express the possibility
of their non-existence.) It follows, too, that there is no genuine relation of
identity that can hold or fail to hold between objects: what changes in the
transition between possible worlds is how objects are combined with one
another to form atomic facts; what the objects are does not change, because
they are the hinges about which the possibilities turn and hence are
constant.
But although there are according to the Tractatus no genuine identity
statements, there are many propositions that are apparently of this form.
What appear to us to be meaningful identity statements linking proper
names actually involve disguised descriptions to be analysed by means of
Russell’s famous rewriting device. But at this point there is a difficulty.
Russell analyses g(the f ) as
(∃x): fx . (y)(fy ⊃ x = y) . gx ,
which still contains the symbol for identity. So if, as Wittgenstein claims,
there is no relation of identity, the analysis is still strictly meaningless.
Wittgenstein’s solution is to adopt a new convention for interpreting
quantified variables: where one variable occurs in the scope of another,
Wittgenstein assumes (unlike Russell) that the ranges of interpretation of
the two variables do not overlap. Consider, for example, (∃x, y) . xRy: for
Russell this means that something is R-related to something; whereas for
Wittgenstein it means that something is R-related to something else. This
elegant notational device allows Wittgenstein to re-express the Russellian
analysis of g(the f ) as
(∃x): fx . gx . ~(∃x, y) . fx . fy.
(In words: something is both f and g , and there are not two things which are
both f.)
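The difference between the two conventions can be checked mechanically on a finite domain. What follows is only a minimal sketch under stated assumptions, not anything found in Russell or Wittgenstein: the function names, the three-element domain, and the sample relation R are all hypothetical and chosen for illustration.

```python
from itertools import permutations, product


def exists_inclusive(domain, arity, body):
    """Russell's reading: different variables may take the same value."""
    return any(body(*values) for values in product(domain, repeat=arity))


def exists_exclusive(domain, arity, body):
    """Wittgenstein's reading: variables in one another's scope take distinct values."""
    return any(body(*values) for values in permutations(domain, arity))


domain = ["a", "b", "c"]      # hypothetical domain of three objects
R = {("a", "a")}              # hypothetical relation: a is R-related only to itself


def xRy(x, y):
    return (x, y) in R


# 'Something is R-related to something' versus '... to something else'.
print(exists_inclusive(domain, 2, xRy))   # True
print(exists_exclusive(domain, 2, xRy))   # False


def the_f_is_g_russell(domain, f, g):
    # Something is f and g, and whatever is f is identical with it.
    return any(
        f(x) and g(x) and all((not f(y)) or x == y for y in domain)
        for x in domain
    )


def the_f_is_g_wittgenstein(domain, f, g):
    # Something is both f and g, and there are not two things which are both f.
    something_is_f_and_g = exists_exclusive(domain, 1, lambda x: f(x) and g(x))
    two_things_are_f = exists_exclusive(domain, 2, lambda x, y: f(x) and f(y))
    return something_is_f_and_g and not two_things_are_f
```

On any finite domain the two renderings of g(the f) agree, although only the first mentions identity inside the formula; in the second the work of the identity sign is done entirely by the choice of distinct values.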
Notice also that since which objects there are does not vary between
worlds, how many there are does not vary either. And if how many objects
there are does not vary between worlds, there cannot be a genuine
proposition which expresses how many there are. The best we could do in
this regard would be to say something which presupposes for its significance
that there are a certain number of objects. If we did that, we would show
but not say how many things there are.
Now you might very well think that what I have just said is wrong and
that there is in fact a way of saying how many things there are: you might
think indeed that Wittgenstein’s own notation for avoiding the identity sign
allows us to express exactly this. If fx is any propositional function,
(∃x1, …, xn) . fx1 . … . fxn
apparently says that there are at least n things that are f. So if Tx is some
propositional function which is true of every object of a certain sort (e.g.
fx ∨ ~fx), and if we let
pn =def (∃x1, …, xn) . Tx1 . … . Txn ,
then pn seems to say just that there are at least n things.
Seems to, but does not. What has always to be borne in mind is that
Wittgenstein’s way of reading nested quantifiers is a device. We can invent all
manner of such devices—all manner of combinations of signs—but
whether any such combination succeeds in saying something significant
depends on whether it carves up the ways the world could be into two
classes, those in which it is true and those in which it is false. (This,
remember, was part of what I earlier called Wittgenstein’s big idea.) And if
there are not enough objects, his device will not say something false but will
simply not say anything at all. Suppose, for instance, that there are only
three objects a, b, and c. Then p2 says the same as Ta . Tb ∨ Tb . Tc ∨ Tc . Ta,
which is a tautology, and p3 says the same as Ta .Tb .Tc, which is also a
tautology. But what does p4 say? Nothing remotely similar to the preceding
sentences is available. So we are forced to conclude that p4, despite
appearances, is not a proposition at all but merely a jumble of signs without
significance. More generally, the pattern is this. If there are n things in the
world, the sequence
p1, p2,…
starts with n ways p1, p2,…, pn of expressing tautology; but from then
onwards the signs pn+1, etc., rather than being, as we previously thought,
ways of expressing something false, are in fact ways of expressing nothing at
all.
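The pattern can be made concrete with a small sketch. It rests on a modelling assumption that should be flagged: the Tractarian verdict that a sign 'says nothing' is represented here by returning None, which is a stipulation of the sketch rather than something the code derives. The name expand_p and the three-object domain are hypothetical.

```python
from itertools import combinations


def expand_p(n, objects):
    """Disjuncts of pn under the exclusive reading of the variables.

    Each disjunct conjoins T over n distinct objects. When there are fewer
    than n objects no disjunct can be written down, and the sketch returns
    None to model 'no proposition expressed' (an assumption of the sketch).
    """
    disjuncts = [
        " . ".join(f"T{name}" for name in combo)
        for combo in combinations(objects, n)
    ]
    return disjuncts or None


objects = ["a", "b", "c"]   # hypothetical world with exactly three objects
for n in range(1, 5):
    expansion = expand_p(n, objects)
    status = "tautology" if expansion else "no proposition expressed"
    print(n, expansion, status)
```

A standard truth-functional evaluation would count an empty disjunction as false, and that is exactly the tempting reading the text warns against: on the Tractarian view there is nothing left for p4 to be false of.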
Do not be too hard on yourself if you made the mistake, though:
Wittgenstein says nothing in the Tractatus to guard against it, and when
Ramsey went to visit Wittgenstein in Austria in September 1923, he
evidently persuaded Wittgenstein how easy a mistake it is to make. (It may
indeed be that Ramsey himself had made it.) For in Ramsey’s copy of the
Tractatus, at the point where the text explains that one cannot say ‘There are
100 objects’ or ‘There are 0 objects’, Wittgenstein added an extra proposition
intended to clear up the confusion (see Lewy 1967): ‘The proposition
“there are n things such that . . .” presupposes for its significance, what we try
to assert by saying “there are n things”.’ We may be sure, then, that even if
Ramsey did not understand the point before he went to Austria, he certainly
did when he returned to Cambridge to begin his first term of study as a
graduate student in October 1923.
This, then, is the Tractarian background. With it in place, Ramsey’s
argument is quickly explained. Let qℵ₀ be the claim that there are infinitely
many empirical things (electrons, protons, or whatever). This may well, as a
matter of fact, be false. But what is clear, Ramsey thinks, is that it is
significant. And if it is significant, the sentence pℵ₀ is also significant. But, as
we have seen, the signs
p1, p2, …, pℵ₀
are all either tautological or meaningless. So in particular pℵ₀ cannot be significant
without being true. Since it is significant, therefore, it is true. But pℵ₀
is just the Axiom of Infinity. (Or, more strictly, it shows what the Axiom of
Infinity tries, illegitimately, to say.) So we may conclude that the Axiom of
Infinity is true.
II
‘The Number of Things in the World’ is not a finished article ready for
publication: it launches into its subject far too abruptly to be that. Nonetheless,
it is quite close to being in a suitable form for inclusion as a section
in a longer article. Yet Ramsey never published it. Why not?
There could be any number of mundane reasons for this, of course, but
what I want to show here is that Ramsey’s paper ‘The Foundations of
Mathematics’, which he published in 1926, contains the clues to a
particularly straightforward explanation for his abandonment of the
transcendental argument. That paper is nowadays famous principally (and
for many readers, I suspect, only) for the distinction Ramsey draws between
the set-theoretic and the semantic paradoxes. This distinction enables him
to argue that a simple theory of types suffices to solve the set-theoretic
paradoxes, leaving the semantic paradoxes to be solved at the level of
meaning, with the advantage that the simple theory of types has no need of
Russell’s problematic Axiom of Reducibility.
But it is actually a little strange that this is nowadays seen as Ramsey’s
principal achievement in the philosophy of mathematics, since the idea is
not really his: the distinction between two types of paradox had already been
made by Peano, as Ramsey knew, and the observation concerning the
simple theory of types which he drew from it is not in itself especially deep.
But ‘The Foundations of Mathematics’ does contain another big idea,
and it is this other idea that may have led him to abandon his transcendental
argument. We have already seen that Wittgenstein’s notation allows us to
form the string of signs
pn =def (∃x1, …, xn) . Tx1 . … . Txn ,
which seems to say that there are at least n things but, if there are not n
things, is actually meaningless. Already in ‘The Number of Things in the
World’ Ramsey notes that this lurking possibility that we are talking
gibberish is very inconvenient, since ‘in making complicated signs, if we are
not careful’, we shall involve such forms as this. He then argues that it
would be far more convenient if we could give the string of signs a meaning
and suggests that ‘the most suitable meaning to give it is that of contradiction’
(1991: 172). But this does not yet overturn Ramsey’s transcendental
argument because, as he observes, p2 .~p3, for instance, is ‘not really the
expression of a proposition “There are exactly two things”, and yet it is
possible to treat it symbolically exactly as if it was’. Thus Ramsey’s position
at this time remains that p2 ‘has no meaning (unless we define it arbitrarily to
mean contradiction)’ except in the case in which there are at least two
things.
But what happened, and provides a sufficient explanation for Ramsey’s
abandonment of his transcendental argument, was that he adopted a new
notation which allowed him, as he thought, to define non-arbitrarily a
sequence of propositions which switches from tautology not to meaninglessness
but to contradiction. To explain Ramsey’s new idea we need to
recall one more item from the theory of quantification in the Tractatus. One
of the ways envisaged there of forming quantified expressions is to take a
proposition p and replace some name ‘a’ in it with a variable x. The result is
an instance of what Ramsey calls a predicative function. It is a symbolic
notation whose role is to pick out a certain class of propositions, namely all
those which are just like p except that they may have in place of ‘a’ some
other name of the same type. As Ramsey points out in ‘The Foundations of
Mathematics’, if fx is a predicative function, there is a clear sense in which fa
says the same about a as f b says about b. The fact which obtains if fa is true
has just the same structure as the fact which obtains if fb is true: the only
difference is that the latter fact has b in it where the former has a. What
Ramsey did was to introduce a quite different notation for picking out a
class of propositions. A propositional function in extension (Ramsey 1925a: 215)
is a notation ex such that, for any name ‘a’ of the appropriate type, ea
expresses a proposition involving a. There is no longer any requirement that
ea should say about a the same as eb says about b.
What matters here is that this notion of propositional function in
extension enables Ramsey to define a propositional function
T(x, y) =def (e) . ex ≡ ey
with the property that T(a,a) is a tautology and T(a,b) is a contradiction for
any two distinct objects a and b. He can then define p2 to be the logical sum
of all the propositions of the form ~T(x , y). Similarly pn can now be defined
to be the logical sum of all propositions of the form ~T(x1, x2) . ~T(x2, x3) .
… . ~T(xn−1, xn). And pℵ₀ is the logical product of the propositions pn for all
finite n. The result of all this is that with these new definitions Ramsey’s
sequence
p1, p2, …, pℵ₀, …
goes from tautology not to meaninglessness but to contradiction, thus
pulling the rug from under Ramsey’s argument.
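The switch from tautology to contradiction can be illustrated on a finite domain. The following is a hedged sketch and not Ramsey's own construction: T is modelled by a Boolean stand-in (True for tautology, False for contradiction), and the sketch conjoins ~T over every pair of variables, which is an assumption about how the abbreviated formula is meant, since a chain of adjacent pairs alone would not exclude repeated values. The function names and the three-object domain are hypothetical.

```python
from itertools import combinations, product


def T(x, y):
    """Boolean stand-in for T(x, y): True (tautology) for one and the same
    object, False (contradiction) for two distinct objects."""
    return x == y


def p(n, objects):
    """Logical sum, over all n-tuples of objects, of the logical product of
    ~T(xi, xj) for every pair of positions i < j (assumption of the sketch)."""
    return any(
        all(not T(x, y) for x, y in combinations(values, 2))
        for values in product(objects, repeat=n)
    )


objects = ["a", "b", "c"]   # hypothetical world with exactly three objects
print([p(n, objects) for n in range(1, 6)])
# [True, True, True, False, False]
```

Because the variables now range over all objects without any exclusivity convention, every pn has a definite value whatever the size of the domain, so the later members of the sequence come out false rather than senseless.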
III
Before we go on let us, as Ramsey would say, look around and see where we
have got to. Ramsey’s transcendental argument is as follows:
(1) If pℵ₀ is meaningful, it is true.
(2) If qℵ₀ is meaningful, pℵ₀ is meaningful.
(3) qℵ₀ is meaningful.
So pℵ₀ is true.
As we have seen, it is a sufficient explanation for Ramsey’s abandonment of
the argument that he adopted a notion—that of propositional functions in
extension—which makes premise (1) false.
For Ramsey the point of the notion of propositional function in
extension was that it was central to his attempt to show that mathematics
consists of tautologies. His difficulty, however, was that although the notion
was essential to his project, it vitiated his argument for something else that
was essential, namely the Axiom of Infinity. He was therefore reduced to
offering, in the last paragraph of the essay as published, what is little more
than a rhetorical flourish: ‘The Axiom of Infinity . . . if it is a tautology,
cannot be proved, but must be taken as a primitive proposition. And this is
the course which we must adopt, unless we prefer the view that all analysis
is self-contradictory and meaningless’ (1925a: 224). This is not satisfactory,
and Ramsey knew it. About the same time as he was correcting the proofs
of ‘The Foundations of Mathematics’, Ramsey delivered a paper to the
British Association in Oxford in which he admitted that ‘there still remains
an important point in which the . . . theory must be regarded as unsatisfactory,
and that is in connection with the Axiom of Infinity’ (1926a: 241).
Ramsey’s failure to find an argument for the Axiom of Infinity that is
cotenable with his theory of propositional functions in extension is thus not
peripheral but a mortal blow to his version of logicism.
But can we date Ramsey’s argument? I think that we can, but to do so we
need to continue the narrative a little beyond Ramsey’s return from his first
meeting with Wittgenstein in September 1923.1
It is clear that Ramsey started on the work which became his famous
article on ‘The Foundations of Mathematics’ soon after he had returned
from Austria. He had told his mother in a letter written during the visit that
he would ‘try to pump [Wittgenstein] for ideas for its further development
which I shall attempt’, and this seems to be just what happened. We know
that he was preoccupied for some time with issues arising from the Wittgensteinian
notation for identity. In November 1923 he wrote to Wittgenstein
(McGuinness and von Wright 1995: 191) about what he thought was a
difficulty of expressing ‘Something other than a is f ’. Wittgenstein wrote
back immediately with the answer
fa .⊃. (∃x, y) . fx . fy : ~fa .⊃. (∃x)fx ,
and Ramsey had to admit (McGuinness and von Wright 1995: 194) that he
had not fully understood the notation.
By the time Ramsey was ready to depart for his second visit to Austria in
March 1924, he had made a breakthrough. In a letter to Moore in February
of that year 2 asking for a reference in support of his application for an Allen
Scholarship, Ramsey reported that ‘I have got on W[ittgenstein]’s principles
a new theory of types without any doubtful axiom which gives all the results
of Russell’s one, and solves all the contradictions.’ What Ramsey is referring
to here, of course, is his use of Peano’s distinction between types of paradox
to argue for a simple theory of types without the need for reducibility.
Ramsey must therefore have come upon this argument quite early in his
graduate work.
But he can hardly at this point have come upon his other big idea,
propositional functions in extension. For he goes on to tell Moore that
‘Wittgenstein and I think it wrong to suppose with R[ussell] that mathematics
is more complicated formal logic (tautologies); and I am trying to
make definite the vague ideas we have of what it does consist of.’ And the
whole point of the notion of a propositional function in extension in ‘The
Foundations of Mathematics’ is that it is what Ramsey uses to show that
mathematics consists of tautologies.
1 Ramsey’s notes on the Axiom of Infinity are certainly later than this since they refer
explicitly to the ‘extra proposition’ which Wittgenstein wrote in Ramsey’s copy of the
Tractatus during that visit.
2 Letter in Moore papers, Cambridge University Library.
‘The Foundations of Mathematics’ did not appear in print until late in
1926, but almost all the work that went into it was done much earlier. After
six months in Vienna (during which he had to field various complaints from
his over-anxious mother that he wasn’t doing enough work and would have
to answer to the trustees of the Allen Scholarship for misuse of their
money) Ramsey returned to Cambridge in October 1924 and immediately
took up a teaching fellowship at King’s College. It seems very unlikely that
he would have had very much time for research during his first term in this
post. Fellows of King’s were worked quite hard in those days and he would
probably have had twelve hours a week of undergraduate supervisions to
give in his first term.
After his first term as a teaching fellow was over, though, he would have
had a little time to write up the work he had done before and during his stay
in Vienna in the form of an essay. We know from a letter he wrote to
Lettice 3 (with whom he had at this point only just started a relationship) that
he sent the essay to be typed on 31 December 1924. This was just in time
for it to be submitted as an entry for the Smith’s Prize (a competition for
dissertations by beginning graduate students in the Cambridge Mathematics
Faculty) at the beginning of the Lent Term (i.e. mid-January) 1925.
The essay did not win the Smith’s Prize, which went to a contemporary
of Ramsey at St John’s College called Gerald Room.4 The following
summer, however, Ramsey decided to submit the essay for publication. The
impetus for this was the reforms imposed on the university by the Oxford
and Cambridge Act of 1925. Until then the vast majority of the teaching
staff at Cambridge did not hold office in the university itself but received
their earnings by means of stipends from the colleges at which they held
their fellowships, which they supplemented by charging a guinea for each
student who attended one of their lecture courses. The new Act of Parliament
led to the establishment of a reformed employment structure (which
has survived in its essentials to the present day) in which the normal post
for most of the university’s academic staff was to be the office of University
Lecturer. Ramsey intended to apply for one of these newly created posts,
but if he was to do so he needed some more publications. One product of
this sudden need to publish was his paper ‘Universals’, which he wrote
(apparently in something of a rush) and submitted to Mind in the summer of
1925. Another was that Ramsey decided to try to publish his prize essay. In
those days before double-blind refereeing, however, he was worried that a
journal which had not heard of him might reject it, so on 24 July 1925 he
wrote to Russell 5 asking for a letter of support to be included with the paper
so as to ensure that the journal editor took it seriously.
3 Letter in Modern Research Archive, King’s College, Cambridge.
4 Room’s essay was called ‘Varieties Generated by Collinear Stars in Higher Space’. It
would of course make a good story if Room had sunk without trace, but in fact he had a
distinguished career as a geometer, was a founding Fellow of the Australian Academy, and
had the mathematics library at the University of Sydney named in his honour.
Russell’s reply has not survived, but Ramsey submitted his paper to the
Proceedings of the London Mathematical Society, with or without Russell’s
testimonial: it must have been accepted in September or October of 1925,
since we know that it was read at a meeting of the London Mathematical
Society on 12 November 1925. As I have been unable to trace a copy of the
prize essay in the form it was originally submitted as an entry for the Smith’s
Prize, it is a matter of conjecture how much Ramsey altered it between then
and when it was published, but circumstantial evidence suggests that any
changes were very minor. Certainly by the time he corrected the proofs of
the published article, in late July 1926, he had ‘thought of ever so many ways
in which if I hadn’t been damned slack I’d have made it better. That always
happens at least also with my Universals paper; I never write anything
except in a hurry because it is pressing and then am too slack and self-satisfied
to improve it afterwards at my leisure.’ 6 And there are various
places in which the published article betrays its origins as a prize dissertation.
It starts with a table of contents, for instance, which is a little unusual
in a paper of this length (forty-seven pages).
What is at any rate clear is that the overall content of the published paper
did not advance significantly beyond the prize essay. We know this because
in the Lent Term 1925, immediately following his submission of the essay,
Ramsey gave for the first time a lecture course entitled ‘Foundations of
Mathematics’ and lecture notes taken by a student at these lectures survive.7
Ramsey included in this course a summary of his own work and included in
this summary all the key ideas of the published paper.
Where does all this leave the dating of our transcendental argument? As
we have seen, Ramsey adopted the idea of propositional functions in
extension some time between February and September 1924, most likely
during his stay in Vienna. The transcendental argument must date from
before this adoption. On the other hand, a set of notes entitled ‘Identity’,
with which his notes entitled ‘The Infinite’ are closely related, make use of
Wittgenstein’s translation of ‘Something other than a is f ’, which Ramsey
did not receive until about the beginning of December 1923. All of this
suggests that the argument dates from some time between January and
September 1924. But the influence of Kant is also evident in the notes, not
only in the form of the argument itself, but in a contrast Ramsey draws
between intuitive and discursive mathematics, and we know from his diary
that Ramsey was reading Kant early in 1924. This makes it rather plausible
that ‘The Infinite’ may well be what Ramsey is referring to in his diary entry
for 28 January 1924: ‘Wrote after tea some notes on formal logic
(abstraction, identity, axiom of infinity).’8
5 Letter in Russell Archives, McMaster University.
6 Letter to Lettice Ramsey in the Modern Research Archive, King’s College, Cambridge,
quoted by kind permission of the Provost and Scholars of King’s College, Cambridge.
7 L. H. Thomas papers, Special Collections Department, North Carolina State University
Library.
IV
The conclusion Ramsey was trying to substantiate, that mathematics
consists of tautologies, is one to which Wittgenstein was fundamentally
opposed. It was therefore important to him to object to some part of
Ramsey’s theory. However, what he objected to was not so much Ramsey’s
failure to provide a good argument for the Axiom of Infinity as the other
part of his account, the theory of propositional functions in extension.
Some of his objections to this theory are contained in a letter he dictated to
send to Ramsey in July 1927 (McGuinness and von Wright 1995: 216–18)—
just about the first evidence we have of Wittgenstein doing serious philosophy
after his long sabbatical in Lower Austria. But Wittgenstein did not
rest there: he returned to the issue in Philosophical Remarks (§120) and
Philosophical Grammar (pt. II, ch. iii, §16), struggling to find a formulation
which expressed his objection clearly.
The issue of whether Ramsey’s notion of a propositional function in
extension is coherent lies at the centre of deep difficulties in modern set
theory on which I cannot arbitrate here. But there is at the very least reason
to think that Wittgenstein may have been right (see Sullivan 1995). In which
case it behoves us to return to Ramsey’s argument and ask, if we suppose
that its first premiss is reinstated, whether its second and third premisses are
likewise in good order.
Ramsey’s second premiss, let us recall, was that if it is meaningful to say
that there are infinitely many empirical entities (e.g. physical atoms or
protons), then there must be some type of which it is meaningful to say that
there are infinitely many objects of that type. Or, more briefly, if qℵ₀ is
meaningful, pℵ₀ is meaningful. Is this true?
There have of course been at various times scientists who have been
physical atomists—who have supposed, that is to say, that the physical
world is made up of irreducible entities of certain kinds. What these kinds
have been has varied. At one time the irreducible entities were thought to be
atoms (hence the name). During most of the twentieth century schoolchildren
were taught that the world is made up of electrons, protons, and
neutrons. Nowadays the fundamental particles are much more exotic. But at
any of these stages in scientific development it would have been possible to
1. INTRODUCTION
Is there a fundamental division of objects into two classes, particulars and
universals? This was the question that Ramsey set out to address in his 1925
Mind paper ‘Universals’. After considering a variety of different arguments
in favour of a fundamental division he came to the sceptical conclusion that
there was no reason to suppose such a division between particulars and
universals obtained. The theory of universals, Ramsey declared, was
‘nothing but a muddle’ (1925c: 30).
‘Universals’ has not received the critical attention it deserves. Despite the
great burgeoning of interest in universals over the last twenty-five years and
the fact that so many contemporary theoretical developments presuppose
the existence of a fundamental division between particulars and universals,
Ramsey’s views have received little or no attention.1 Why has ‘Universals’
been so neglected? Partly because many of Ramsey’s commentators have
thought it possible to say simply, shortly, and decisively why the arguments
of ‘Universals’ are mistaken. But Ramsey has been ill served by his
commentators. They have failed to appreciate the significance of
‘Universals’ because they have treated its arguments in isolation not only
from one another but also from arguments employed by Ramsey in other
writings and the evolving views of Russell and Wittgenstein that influenced
Ramsey. When placed in this wider context, it is evident that the arguments
of ‘Universals’ cannot be so easily dismissed. It also becomes evident that
‘Universals’ continues to bear significance for contemporary debate.
5 Several of Ramsey’s critics have rejected these assumptions, arguing that the same
proposition may admit multiple parsings where different parsings reveal the presence of
different constituents ( R b on one parsing, a R on another, etc.) but where the proposition
nevertheless still says the same thing. See Moore (1962: 297); Anscombe (1959: 95); Geach
(1975: 146); Dummett (1981: 264–6); Oliver (1992: 95–6). But unless we enquire into the
underlying motivations that shape Ramsey’s argument we will be unable to assess the relative
merits of endorsing or rejecting alternative views.
intuition on his side when he makes this claim. Consider a situation in which
a cup is on top of a saucer. Intuitively the cup’s being on top of the saucer is
the same external state of the environment—the very same ‘chunk of
reality’—as the saucer’s being underneath the cup.6 Similarly, it may be
argued, looking away from everything psychological and concerning only the
external fact in virtue of which it is true to say that aR b, it seems plain that
this fact consists of just a, R , and b, and that whether we choose to describe
it as ‘R holds between the terms a and b’, ‘a possesses the complex property
of “having R to b” ’ or ‘b has the complex property that a has R to it’ is a
mere matter of language. Consider a situation in which a cup is next to a
saucer. Intuitively the holding of the next to relation between the cup and the
saucer, the possession by the cup of the property being next to the saucer, and
the saucer’s having the property that the cup is next to it, are the very same
state of the external environment, a chunk of reality that consists of nothing
but the cup and the saucer in adjacency.
Whatever the intrinsic merits or defects of these arguments they cannot
suffice as interpretations of Ramsey. It is critical to the structure of
Ramsey’s argument that the admission of complex universals results in ‘an
incomprehensible trinity’ of propositions. After all, the argument is intended
to be a reductio ad absurdum of the assumption that complex universals exist.
But neither of the interpretations offered makes sense of this. The trinity of
propositions (or facts) that result from the admission of complex universals
are—so far as these interpretations go—redundant or superfluous rather
than incomprehensible.
In order to understand why Ramsey should have thought that the
admission of complex universals results in ‘an incomprehensible trinity’ it is
necessary to appreciate the influence that Wittgenstein exerted upon him
during the period in which ‘Universals’ was composed. In his Notebooks
1914–1916 Wittgenstein set out to investigate whether negative propositions
correspond to negative facts. Raphael Demos had proposed that a proposition
of the form ‘~p’ does not correspond to a negative fact, but rather
corresponds to some true proposition ‘q’ that is incompatible with ‘p’
(Demos 1917). Russell was later to criticise Demos’s account on the
grounds that ‘it makes incompatibility a fundamental and objective fact
which is not so much simpler than allowing negative facts’ (Russell 1918:
213). Russell therefore admitted negative facts alongside positive facts to
correspond to negative and positive propositions respectively. However, just
as Demos’s account provides no explanation of the incompatibility of the
propositions ‘p’ and ‘q’ but leaves this a brute fact of nature, Russell’s
account likewise provides no explanation of the incompatibility of the
positive and negative facts p and ~p.
6 See Fine (2000: 2–6) for a similar argument against converse relations. Williamson
(1985) arrives at the same conclusion via linguistic considerations.
When Wittgenstein came to reflect upon these issues, he found that
fundamental incompatibilities of this kind offended against his Humean
scruples. Just as Hume could not understand how there could be brute
necessary connexions between causes and effects, Wittgenstein could not
understand how brute necessary incompatibilities could obtain between
positive and negative facts. At first Wittgenstein struggled to find a way to
avoid such incompatibilities:
The question is really this: Are there facts beside the positive ones? (For it is
difficult not to confuse what is not the case with what is the case instead of it.) . . . It
is the dualism, positive and negative facts, that gives me no peace. For such a
dualism can’t exist. But how to get away from it? (Wittgenstein 1961: 25.11.14)
But Wittgenstein was soon to light upon an account of the role of ‘~’ and
the other logical constants that obviated the need to posit a dualism of
positive and negative facts, an account that addressed his Humean concerns.
The account was to become the Grundgedanke of the Tractatus:
4.0312 . . . My fundamental idea is that the ‘logical constants’ are not
representatives . . . (Wittgenstein 1922)
It was the assumption that the negation sign contributes to the content of
‘~p’ that led Demos and Russell to affirm fundamental incompatibilities
between propositions and facts. By both accounts ‘~’ combines with ‘p’ to
produce a representation of a fact different from p—either a positive fact q
that is incompatible with p or the negative fact ~p. A dualism of
incompatible facts is thereby induced. To avoid this dualism Wittgenstein
denied that the negation sign contributes to the content of the sentences in
which it occurs. He proposed instead that both ‘p’ and ‘~p’ have the same
content, but that they represent this content in different modes. On
Wittgenstein’s account ‘~’ switches the mode in which the content is
represented; what ‘p’ represents to obtain, ‘~p’ represents to be absent.
The influence of Wittgenstein is evident in Ramsey’s 1927 paper ‘Facts
and Propositions’. In this paper Ramsey endeavours to combat the idea that
the logical constants are representatives of logical objects. Suppose, for the
sake of argument, that the negation sign is the name of a logical object not.
Then the sentence ‘~~p’ expresses a proposition that contains a constituent
not that is missing from the proposition expressed by ‘p’. ‘p’ and ‘~~p’
therefore express different propositions (see Ramsey 1927: 43). But if they
express different propositions, then it appears impossible to explain the fact
that they are mutually entailing; the account that treats logical constants as
representatives of logical objects leaves this a brute necessary connexion
between distinct propositions. Ramsey concluded that the logical constants
‘must function in some different way’:
I find it very unsatisfactory to be left with no explanation of formal logic except
that it is a collection of ‘necessary facts’. The conclusion of a formal inference must,
I feel, be in some sense contained in the premises and not something new. I cannot
92 Fraser MacBride
believe that from one fact, e.g. that a thing is red, it should be possible to infer an
infinite number of different facts, such as that it is not not-red, and that it is both
red and not not-red. These, I should say, are simply the same fact expressed by
other words; nor is it inevitable that there should be all these different ways of
saying the same thing. We might, for instance, express negation not by inserting a
word ‘not’ but by writing what we negate upside down. (Ramsey 1927: 42; cf.
Wittgenstein 1922: 5.43)
Ramsey thus seeks to avoid brute necessary connexions by insisting that
the conclusion of an inference must be contained in its premises.7 In this
way Ramsey tries to ensure that the connexions between propositions are
intelligible rather than brute.
The underlying Humean concerns that shape Ramsey’s perspective re-
emerge later in ‘Facts and Propositions’ when he shifts his attention from
the logical constants to the quantifiers. Frege and Russell had maintained
that the universal and existential quantifiers denote higher-order properties.
Their theory assigns quantified sentences the form ‘F(f)’: according to the
theory a universal quantification ‘For all x, fx’ ascribes the higher-order
property of universal application to the lower-order property f whereas an
existential quantification ‘There is an x such that fx’ ascribes the
higher-order property of merely having application to f. Ramsey rejects this
account in favour of Wittgenstein’s theory that ‘For all x, fx’ is equivalent
to the infinite conjunction of all the values of ‘fx’ (‘fa & fb & fc & ...’)
whereas ‘There is an x such that fx’ is equivalent to their infinite
disjunction (‘fa ∨ fb ∨ fc ∨ ...’). Ramsey endorses Wittgenstein’s account because
It is the only view which explains how ‘fa’ can be inferred from ‘For all x, fx’, and
‘There is an x such that fx’ from ‘fa’. The alternative theory that ‘There is an x such
that fx’ should be regarded as an atomic proposition of the form ‘F(f)’ (f has
application) leaves this entirely obscure; it gives no intelligible connection between a
being red and red having application, but abandoning any hope of explaining this
relation is content merely to label it ‘necessary’. (Ramsey 1927: 48–50)
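The expansion Ramsey endorses can be set out schematically. The following is only a sketch, assuming (as the passage does) that every object has a name a, b, c, ...; the symbols ∧ and ∨ stand in for the ‘&’ and the disjunction sign used above.

```latex
\begin{align*}
\forall x\, fx \;&\equiv\; fa \land fb \land fc \land \dots \\
\exists x\, fx \;&\equiv\; fa \lor fb \lor fc \lor \dots
\end{align*}
```

On this reading ‘fa’ is one conjunct of the universal quantification and one disjunct of the existential quantification, so the inference from ‘For all x, fx’ to ‘fa’ is conjunction elimination and the inference from ‘fa’ to ‘There is an x such that fx’ is disjunction introduction. In each case the conclusion is, in Ramsey’s phrase, ‘in some sense contained in the premises’.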
8 Of course Ramsey was to reject theories of this kind on Humean grounds in his
later paper ‘General Propositions and Causality’: ‘But may there not be something which
might be called real connections of universals? I cannot deny it, for I can understand nothing
by such a phrase; what we call causal laws I find to be nothing of the sort’ (1929a: 160).
itions. The only way to explain this mutual entailment is to insist that these
sentences express the same proposition (recall Ramsey’s injunction: the
conclusion ‘must be in some sense contained in the premises and not
something new’ (1927: 42)). But if they express the same proposition, then
the different predicates—‘ R ’, ‘ R b ’, and ‘aR ’—embedded in these
sentences cannot refer to different universals. Such complex predicates must
be incomplete symbols, neither representatives nor names of complex
universals. It is noteworthy that it will not do to respond to this Humean
argument that one and the same proposition may say three different things
at once. This relieves the pressure upon the advocate of complex universals
to explain the mutual entailment between three distinct propositions. But
this response still leaves in place a comparable explanatory burden: the
requirement to account for the tangle of necessary connexions that still
obtain between the constituents R , Rb, aR , a, and b even when they
belong to a single proposition (the necessary connexions that oblige R to
be instantiated if Rb is, and so on).
If this interpretation is correct, it is an underlying Humeanism that is
ultimately responsible for not only Ramsey’s rejection of complex universals
but also his claim that function signs that contain a negation are incomplete
symbols. And it is because he denies that such signs are referring devices
that Ramsey has no reason to suppose that the difference between signs that
can be negated and those that cannot corresponds to a difference in the
functioning of objects that make up atomic facts. Consequently, Ramsey
may accept with equanimity the claim of the first generation of
commentators that there is a subject–predicate distinction based upon the
distinction between expressions that can be negated (predicates) and
expressions that cannot (names). He may endorse this claim while still
doubting whether there is a fundamental division of objects into two classes,
particulars and universals.9
9 See MacBride (1999, 2001) for further Humean arguments against the particular–
universal distinction of this general kind. See also MacBride (2005), which seeks to show that
Dummett’s and Geach’s arguments fail to demonstrate the necessity for a subject–predicate
distinction based upon the distinction between expressions that can be negated and those
that cannot.
Ramsey on Universals 95
They denied this claim because they rejected the broadly Fregean
conception that ties ontology to language. From this perspective the malady
that afflicts ‘Universals’ is one of flawed conception. It presupposes that
ontological distinctions are tied to logical or linguistic ones. This appears to
set Ramsey at odds with much of contemporary ontology:
Contrary to what one might call the classical stance in analytic philosophy, of which
Ramsey in ‘Universals’ is one of the most brilliant representatives, much of
contemporary ontology rejects the assumption that one can get at ontological issues
through linguistic distinctions, and conversely that one can get rid of the latter
through the former. (Dokic and Engel 2002: 40–1)
12 It should be noted that Russell maintained at one time or another a variety of other
conceptions of the particular–universal distinction. Compare Russell (1911: 124; 1912: 53;
1919: 286–7).
13 For reasons of space I cannot offer a thoroughgoing reconstruction of the view Russell
historically held in 1925. Instead I will confine myself to drawing attention to some of the
interpretative issues that arise. Note first that Russell defines the particular–universal
distinction relative to the occurrence of particulars and universals in ‘propositions’. Under the
influence of Wittgenstein, Russell had adopted the view that propositions are types of mental
or linguistic representations (see Russell 1918: 184–5, 196; 1919: 315–19; 1921: 240–2, 273–4;
Whitehead and Russell 1925: 406–7). According to this view, it is the symbolic representatives
of the particulars and universals that occur in the propositions that concern them
rather than the particulars and universals themselves. This makes Russell’s 1925 claim to
define the particular–universal distinction relative to the different ways in which particulars
and universals occur in propositions puzzling—puzzling because particulars and universals do
not occur in propositions. Second, contra Ramsey’s gloss, Russell does not simply define
particulars ‘as terms which can occur in propositions with any numbers of terms’ and
universals as n-termed relations that can ‘only occur in a proposition with n + 1 terms’
(Ramsey 1925c: 29; 1926b: 31). Rather Russell invokes the further idea that universals are
items that not only occur in propositions with a fixed number of constituents but also occur
in propositions in a certain distinctive manner; they are kinds of thing that ‘occur as’
relations. Unfortunately Russell does not make clear what it means to occur as a relation.
Conceive of atomic propositions as worldly complexes—non-linguistic,
non-mental items—that actually contain the constituents they are about.
Then if the space of atomic propositions consists of the sequence of forms
(A), particulars (x, y, z, ...) may be defined as entities that can occur in
propositions with any number of constituents. By contrast, universals (R₁,
R₂, R₃, ...) may be defined as entities that can only occur in propositions
with n-ly many constituents (where n may be finite or infinite). Call those
entities unigrade. Unigrade entities have a definite degree or adicity; they are
either monadic or dyadic or triadic ... or n-adic. Entities that enter into
propositions with differing numbers of other entities are not n-adic for any
number. Call these entities multigrade.14 So whereas particulars are multigrade,
universals are unigrade.
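The unigrade–multigrade contrast can be illustrated with a small sketch. It rests on assumptions that go beyond the text: atomic propositions are modelled as tuples of constituent names, the names and the sample are hypothetical, and actual occurrence in a finite sample stands in for the modal notion of what an entity can occur in.

```python
from collections import defaultdict


def occurrence_sizes(propositions):
    """Map each constituent to the set of proposition sizes it occurs in."""
    sizes = defaultdict(set)
    for proposition in propositions:
        for constituent in proposition:
            sizes[constituent].add(len(proposition))
    return dict(sizes)


def classify(propositions):
    """Label a constituent 'unigrade' if every proposition it occurs in has
    the same number of constituents, and 'multigrade' otherwise."""
    return {
        constituent: "unigrade" if len(sizes) == 1 else "multigrade"
        for constituent, sizes in occurrence_sizes(propositions).items()
    }


# Hypothetical toy sample; each tuple lists the constituents of one proposition.
SAMPLE = [
    ("R1", "x"),
    ("R1", "y"),
    ("R2", "x", "y"),
    ("R2", "y", "z"),
    ("R3", "x", "y", "z"),
]

# A hypothetical space in which every proposition has exactly two constituents.
UNIFORM = [("R4", "x"), ("R5", "y"), ("R6", "z")]

if __name__ == "__main__":
    print(classify(SAMPLE))
    print(classify(UNIFORM))
```

In the first sample the entities named R1, R2, and R3 come out unigrade and x, y, and z multigrade. In the second every constituent comes out unigrade, so the criterion no longer separates two classes of entity.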
Russell neglected to provide any direct motivation in Principia for the
view that particulars are multigrade, universals unigrade. But this leaves one
wondering from where his insistence derives that the space of atomic
propositions exhibits the structure (A). If no assurance can be given that
atomic propositions are one or other of the forms (A) depicts, then this
conception of the particular–universal distinction can carry no conviction.
For all that has been established so far the space of atomic propositions
may exhibit an indefinite variety of other structures contrary to (A). For
example, Russell has not shown that there is anything to prevent the atomic
propositions from all exhibiting the same n-adic form. Nothing has been
done to rule out the epistemic possibility that the atomic propositions are
composed entirely of two constituents,
(B) R₄(x) R₅(y) R₆(z) ...
or that they are composed entirely of three constituents,
(C) R₇(x, y) R₈(y, z) R₉(z, w) ...
Of course (B) and (C) are not the only structures that reality might
exhibit contrary to (A). According to Ramsey ‘we cannot even tell that there
are not atomic facts consisting of two terms of the same type’ (1925c: 29). It
might be thought that self-predication of this kind will eventually result in a
version of Russell’s paradox (‘a vicious circle contradiction’). But, as we
have already seen, Ramsey endeavours to block the relevant version of the
paradox by denying that negative predicates are referring devices. So far as
Ramsey is concerned, we cannot then even rule out the epistemic possibility
that the atomic propositions are composed as follows:
(D) f(f) a(a) f(a) ...
Yet even if we restrict our attention to alternatives less radical than (D), it
is evident that the mere possibility of (e.g.) (B) or (C) will serve to
5. CONCLUSION
Ramsey did not hold any of the claims usually attributed to him; two
generations of commentators have gone astray. Of course this does not
settle what Ramsey really claimed and why he wished to claim it.15 Ultimately
it will only be possible to come to a definitive judgement of the kind in the
context of a fuller study that would address, amongst other things, the
influence that other works of the period bore upon ‘Universals’. These
include W. E. Johnson’s Logic (1921–2), A. N. Whitehead’s Principles of
Natural Knowledge (1919) and The Concept of Nature (1920), and G. E. Moore’s
(1923) polemic against G. F. Stout that Ramsey took to have ‘already
sufficiently answered’ the view that properties are particular tropes.
16 Of course, one may question whether form a circle is truly multigrade rather than (e.g.)
the monadic property of an aggregate or plurality. But (a) there appears to be no obligation to
conceive of form a circle this way. Moreover, (b) there are other kinds of plausible examples of
multigrade universals that cannot be so dismissed.
concepts apply. By contrast, the arguments of ‘Universals’ point towards a
more thoroughgoing Humeanism that applies even to such notions as
particular and universal, perhaps the most fundamental categories of all.17
1. INTRODUCTION
Suppose we propose a new scientific theory. Presumably we will do this
because we want to explain some facts we already know about, and we will
have a language to describe these facts that doesn’t get its meaning from our
theory. This language we can call the ‘primary system’. However, typical
theories also introduce new, or ‘secondary’, terms that are not in the primary
system. These secondary, theoretical, terms tend to refer to things that are
abstract and distant from observation. A plausible claim is that their
meanings in some way derive from their connection with primary terms. The
question arises how we are to understand this dependence.
This, roughly speaking, is the question Ramsey considers in his 1929
paper ‘Theories’. Having introduced the primary–secondary distinction,
Ramsey considers whether it is possible to regard secondary terms as defined
in terms of primary—a natural suggestion given their lack of independent
meaning. However, this turns out to be problematic. First of all, secondary
terms will require very disjunctive definitions, which basically amount to
lists of all the primary manifestations that each secondary concept has. If we
needed these definitions, the theory would be pointless, since it would be
simpler just to list its consequences directly in primary vocabulary. The
second problem is that definitions of secondary vocabulary in terms of
primary cannot cope with changes in the defining theory. Discovering a new
manifestation of some secondary concept, for example, would entail
alteration to the definitions of all the secondary terms in the theory, and
thus to the meanings of all the theory’s terms. Somehow, then, it needs to
be shown how secondary terms can function in a newly introduced theory
without requiring that they be defined from primary terms, or at least,
without requiring that whatever definitions are available are explicitly kept in
mind by those who use the theory.
Ramsey’s alternative proposal begins with the observation that
The best way to write our theory seems to be this (∃ α, β, γ): dictionary . axioms.
(1929c: 131)
In effect, Ramsey’s idea was that the theory should be assimilated to what
was later dubbed its ‘Ramsey sentence’, the sentence formed by replacing
the secondary terms in the theory with variables, then prefixing the resulting
formula with a corresponding number of existential quantifiers. On this
106 Pierre Cruse
view, as Ramsey points out, propositions containing secondary terms are
not seen as ‘propositions by themselves’, but rather as propositional
functions, which only gain meaning when added within the scope of the
existential quantifiers prefixing some particular theory. Secondary terms are
not therefore names of properties or objects but are more like schematic
names that stand for whatever happens to play some role in realizing the
theory in which they occur. Terms can thus serve as abbreviatory devices
allowing us to work out deductive relations between parts of theories
without having to consider everything the theory says. However, in order to
state the content of a statement within a theory we need to go back to what is
asserted to exist by the entire theory in which it occurs—in other words, the
Ramsey sentence. Thus, as Ramsey puts it, ‘the incompleteness of the
“propositions” of the secondary system affects our disputes but not our
reasoning’ (1929c: 132).
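The construction of a Ramsey sentence described above can be sketched mechanically. This is a minimal illustration under stated assumptions, not Ramsey’s own notation: the theory is given as a plain string, the function name `ramsify`, the variable names X1, X2, ..., and the example predicates are all hypothetical, and secondary terms are assumed to be whole words that do not clash with the generated variable names.

```python
import re


def ramsify(theory, secondary_terms):
    """Replace each secondary term with a fresh variable, then prefix the
    result with one existential quantifier per variable."""
    body = theory
    variables = []
    for index, term in enumerate(secondary_terms, start=1):
        variable = f"X{index}"
        # Whole-word replacement, so 'Charged' does not match inside 'Uncharged'.
        body = re.sub(rf"\b{re.escape(term)}\b", variable, body)
        variables.append(variable)
    if not variables:
        return body
    prefix = " ".join(f"∃{variable}" for variable in variables)
    return f"{prefix} ({body})"


# Hypothetical toy theory: 'Electron' and 'Charged' are secondary terms,
# 'CausesTrack' belongs to the primary system.
THEORY = (
    "forall x (Electron(x) -> Charged(x)) & "
    "forall x (Charged(x) -> CausesTrack(x))"
)

if __name__ == "__main__":
    print(ramsify(THEORY, ["Electron", "Charged"]))
```

The output contains only the primary term and bound variables. A single clause of the toy theory, such as ‘Electron(x) -> Charged(x)’, becomes an open formula once its secondary terms are replaced, which is one way of picturing the claim that such clauses are propositional functions that gain meaning only within the scope of the quantifiers.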
In this chapter I propose to examine some of the arguments for and
against Ramsey’s conception of theories. In particular, I will focus on
attempts that have been made to defend empiricist conceptions of scientific
theories, for reasons I will explain shortly.1 On balance, I think it is fair to
say that the contemporary view is largely negative in this regard, holding that
Ramsey sentences only ever promised to help doctrines that are now
outdated, and even did that rather unsuccessfully.2 While I think that these
criticisms are correct when aimed at existing applications of Ramsey’s view
in this context, they undermine neither the view itself, nor the aim to use it
to defend an empiricist conception of theories. In this chapter I will explain
why this is, and propose a broadly empiricist framework in which, I will
claim, Ramsey’s account is a successful explanation of the semantics of
theories.
1 There are also non-empiricist views of theories inspired by Ramsey’s view, for example,
Lewis (1970). However, I will restrict my attention in this chapter to versions of the theory
that have aimed to defend some form of empiricism.
2 That the form of empiricism Ramsey’s account helps defend is defunct has been very
widely argued; see e.g. Suppe (1971) for a summary of some of the main lines of argument.
That Ramsey’s account doesn’t help defend it is pressed, for example, by Demopoulos and
Friedman (1985), Ladyman (1998), and Psillos (1999).
Empiricism and Theories 107
therefore why there should be any such distinction in the first place, and
even if there is, why it should be of semantic significance.
Ramsey, as far as I can see, didn’t commit himself on this question; he
gives little explicit attention to the question of what sort of thing might
count as a primary term. However, an obvious answer surfaced when it was
noticed that Ramsey’s account might be used to defend empiricist
conceptions of scientific theories. If you are an empiricist—especially of the
positivistic variety—you will likely be worried by the following problem. If
empiricism is true, then meaning is essentially connected to observation. In
order for a term or concept to have independent meaning it must be an
observational term, and refer to something directly observable. On the other
hand, the empiricist stance involves a strong commitment to viewing science
as the paradigm of meaningful discourse, in contradistinction to
metaphysics, which is not genuinely meaningful. If both these claims are
true, it follows that everything science says must be expressible using
observational terms alone (plus logic and mathematics).
The problem, of course, is that this doesn’t seem to be the case. Science
is replete with theoretical terms like ‘electron’, which refer to things that are
not by any stretch of the imagination observable. So either much of science
is meaningless, or we have no way of explaining what is wrong with
metaphysics. Neither conclusion is palatable to the positivist–empiricist,
hence the problem. To solve it some way must be found of showing how
the content of theories can be expressed using nothing more than
observational terms, logic and maths.
This is where Ramsey’s view of theories promises to help. Ramsey’s view
of theories, as we have seen, does two things. First, it explains why
theoretical terms are necessary to science. They are necessary because we
need abbreviatory devices to work with theories—it would be hopelessly
complicated to list all the primary consequences of a given theory every time
we needed to use any part of it. Second, it entails that the meanings of
secondary terms depend entirely on their connections with primary terms. If
Ramsey’s account of theoretical terms is true, everything a theory says can
be said in the primary system alone, using the Ramsey sentence. Thus, if we
identify the ‘primary system’ with the language of logic, maths, and
observation terms, and the secondary system with the language of
theoretical terms, we are in a position to explain (a) why there are theoretical
terms, and (b) how the content of theories is still entirely derivative from
observational terms. Thus, we are in a position to make the role of
theoretical terms in theories consistent with empiricism.
6. AN ALTERNATIVE CONCEPTION OF
OBSERVATIONAL TERMS
I want to respond to this problem by putting forward a version of Ramsey’s
view of theories that I think gets round these problems. My proposal will be
a version of the empiricist idea that theoretical terms have to have their
meanings given in some way with reference to experience or observation.
However, I think it is possible to hold a view that makes sense of this
empiricist intuition without committing yourself to the kind of
observational–theoretical distinction that the Newman argument requires to
get off the ground.6
The conception of observational terms I am getting at was suggested by
Grover Maxwell in 1962, before he (erroneously, as I have argued) went on
7 This relates to Ketland’s (2004) version of the Newman result as follows. Ketland
shows that if a Ramsey sentence is empirically adequate, then (subject to cardinality
constraints) it is true. In this proof, a theory is regarded as empirically adequate (following
van Fraassen 1980) if and only if it has a model whose ‘observational reduct’ is isomorphic to
the world’s observational reduct, where an observational reduct of a model is the model we
get by removing from the model any individuals not in the union of the fields of the
observational—i.e. unramsified—terms in our theory. However, if some of our theory’s
observational terms are mixed terms, then the union of the fields of our observation terms
contains unobservable objects. Thus, although it is true that given the way we have defined
empirical adequacy, a theory’s empirical adequacy implies the truth of its Ramsey sentence,
empirical adequacy fails to have its intended sense of ‘truth only about what is observable’.
Rather, it requires the theory to be true about the objects in the extensions of the theory’s
mixed terms, some of which objects are unobservable.
this issue turns out. However, I think there is enough evidence for it that it
is worth proceeding on the assumption that it is true (for further defence,
see especially Fodor 1983, 1984).
If the modular theory is true, exactly what does count as observable by
this criterion is also an empirical issue. But, however this issue turns out, it
seems very likely that at least some observational properties are going to
come out as mixed. For a start, if there are perceptual modules at all, their
function is surely going to be to produce representations of physical
properties of a subject’s environment rather than just proximal patterns of
stimulation. One of the major arguments for the claim that perception is
modular is that it is obviously adaptive for an organism to have a
mechanism which produces accurate representations of its immediate
environment independently of what further beliefs it may hold, simply
because its survival will depend on what is actually in front of it rather than
what it thinks is in front of it (see Fodor 1984: 38). But if perception is to do
this, it must at the very least succeed in producing representations of
properties of the environment rather than just of proximal patterns of
stimulation.
Once you admit that external physical properties can be perceptually
represented, there seems little reason to think that these properties cannot
be mixed. For example, suppose for the sake of argument that the shapes of
objects count as observable on the definition just given. Now a shape
property such as square is, on the face of it, a mixed property, since it is just
as meaningful to attribute squareness to an unobservable thing as an
observable one. Denying this would involve claiming that in fact only the
property square-and-observable-to-humans is represented, rather than squareness
itself. But this just seems like unnecessary double-counting: we can explain
why square objects are observable by the fact that they are observable and
square, without the additional assumption that only their observable
squareness is actually perceived. Without some strong reason to think
otherwise, then, there is surely a strong default presupposition in favour of
the assumption that at least some mixed properties can be perceptually
represented.
10 Brian Greene (1999) describes the current situation (or at least, the situation as it was in
1999) as follows:
Without monumental technological breakthroughs, we will never be able to focus on the
tiny length scales necessary to see a string directly … As the Planck length is some 17
orders of magnitude smaller than what we can currently access, using today’s technology
we would need an accelerator the size of the galaxy to see individual strings . . . If we are
going to test string theory experimentally, it will have to be in an indirect manner. (p. 215)
It may, Greene says (pp. 224–5), turn out to be possible to gain evidence of the presence
of strings indirectly, via, for example, their cosmological implications. I am not sure whether
this kind of observation would suffice for a CI explanation of the content of SUPERSTRING.
But even if it would, it seems implausible that the contentfulness of the concept
SUPERSTRING turns wholly on whether such observations turn out to be possible.
11 This is a criticism of the CI theory only if that theory claims that it is a necessary
condition for the assignment of content that there exist appropriate causal correlations
between tokenings of the concept and the presence of its referent. Fodor (1984) claims that
this is only a sufficient condition, so this isn’t strictly a criticism of his theory. However, I
take it that he would claim that it is actually plausible to assign content to theoretical concepts
like ELECTRON using a causal theory, which I deny.
there is causal covariance between a concept and its referent doesn’t make
appealing to the Ramsey sentence of the theory in which a concept occurs
any less of an explanation of its content, even if it now makes another
explanation available. But if this is the case, then the CI explanation is
threatened with redundancy, since the Ramsey-style account applies both in
cases where the CI account applies, and in those in which it doesn’t.
However, we should also ask whether there are any cases in which the CI
theory could account for the possession of a theoretical concept but
Ramsey’s theory could not. We might begin by noticing that in typical cases
where the CI theory might apply, we are imagining the presence of a theory
whose Ramsey sentence would contain enough detail to assert the existence
of the supposed unobservable entity that is the referent of the concept in
question. For example, consider the case of the electron, described above.
According to the CI story in that case, we are able to token the concept
ELECTRON precisely because we have a theory which entails that electrons
cause certain observational effects. But then the Ramsey sentence of that
theory must at least assert the existence of something that causes those
observational effects. This kind of case cannot therefore decide which of the
two theories is true. To decide between the theories we would have to think
of a case in which there was causal covariance between a concept and its
referent in the absence of a Ramsey sentence that asserted the existence of
anything like that referent.
What would such a case be like? In the case of electrons, say, we would
presumably have to imagine someone who tokens the concept ELECTRON
almost as a kind of reflex in the appropriate cases, without having an explicit
conception of what is actually present when they are responding in this way.
However, I doubt whether it is possible to conceive of the appropriate kind
of case. Can we really imagine someone who is disposed to recognize the
experimental manifestations of electrons without any sort of Ramsey
sentence-type representation of what they are responding to? The problem
with this supposition is that electron manifestations (cloud-chamber tracks,
Geiger counter clicks, and so on) have nothing superficially in common. It is
therefore difficult to see how someone could come to token the same
concept in more than one of these cases without simply learning
enumeratively to do so. But if they had been trained in this way—think
ELECTRON in this kind of case, and in this kind of case, and . . .—we would
be more inclined to say that they had a disjunctive concept of being a cloud-
chamber track, or a Geiger counter click, or whatever, even if electrons are
in fact the causal source of all these things. One way to bring this out is by
noticing that such a person would not be able to recognize new
manifestations of electrons without being explicitly trained to do so. What is
lacking is precisely the explicit conceptualization of the different phenomena
as causal manifestations of the same kind of thing.
Admittedly we could imagine that someone had a genuine causal
sensitivity to electrons in the absence of a Ramsey-style theory if we
supposed that they could actually perceive electrons. However, this would not
demonstrate that causal connection in the absence of a Ramsey-style theory
is sufficient for a non-observational concept to have content, since we
would in such a case merely attribute to them an observational concept.
Ultimately, as I suggested above, the proposed theory suggests that which
concepts are observational is essentially an empirical issue. However, the
way that humans are constructed presumably prevents us from possessing
an observational concept of an electron, so the case is merely hypothetical.
In the case of theoretical concepts, then, I think there is a strong case for
saying that, while a Ramsey-style account of theoretical terms can apply
when the CI theory does not, the converse is not true, as there are no cases
(given the way in which humans are constructed) in which we would
attribute a theoretical concept on the basis of causal covariance in a case
where no Ramsey-style account would apply. However, if this is true, it
leaves us requiring some account of the semantics of observational
concepts. It is in this context that I think that the CI theory is very plausible.
One way to bring this out is by comparing what we would want to say
when the causal covariance required by the CI theory is not present in the
case of a theoretical concept and an observational concept. In the case of
theoretical concepts, I have argued that this does not impugn the claim that
those concepts genuinely have content. You can have a theoretical concept
without any disposition to token it when its referent is present if you aren’t
aware of any of its observable manifestations, provided you have a theory
that describes its referent in observational (in this case mixed) terms.
However, the same does not seem true in the case of observational
concepts. Were someone to fail to be caused to token a concept such as
SQUARE when squares were present in the optimal conditions, this would
give us strong grounds for saying that they don’t possess the concept at all.12
This difference suggests that it is plausible to think that different conditions
have to be met for observational and theoretical concepts to have content.
In the case of observational concepts it is necessary and sufficient for
possession of a concept that one is reliably caused to token it by the
presence of its referent in appropriate circumstances. In the case of
theoretical concepts, it is necessary and sufficient for possession of a
concept that one thinks of it in Ramsey’s way, as abbreviating whatever
plays a certain role in a theory.
I will finally note that if the CI theory of content does apply to
observational terms, this will give us further justification for thinking that
observational terms can be mixed. The CI theory, as we have seen, explains
12 This is complicated slightly by the fact that someone might conceivably have a
theoretical concept of a square, e.g. if they could provide a mathematical definition.
However, assuming that we are talking about an observational concept of a square, I think
the point holds.
the content of a concept with reference to the state of the world that reliably
causes it. But presumably the properties that are efficacious in causing us to
token perceptual mental representations are going to be mixed properties
such as squareness rather than truncated properties such as
square-and-observable-to-humans. Square-and-observable-to-humans is too
anthropocentric to count as a genuinely causally efficacious property. If this is true,
there is no reason to deny that mixed properties can be observed.
In summary, then, the theory I want to put forward is this. There are two
kinds of concepts, observational and theoretical. Observational concepts are
those which are represented by perceptual modules. They are concepts
which we acquire in the absence of explicitly formulated theory, and their
content is to be explained by their causal sensitivity to what they represent,
along the lines of the CI theory. Theoretical concepts, on the other hand,
are those that denote things that are not represented by perceptual modules.
They should have their meanings explained using Ramsey’s method, with
reference to their place in the theory in which they occur. But this requires
that theoretical concepts ultimately can have their meanings specified with
reference to observational concepts alone. Thus, if this theory is true, it
suggests that Ramsey’s theory of theories can be used to defend quite a
strong form of concept empiricism.
8. CONCLUSION
I have argued that, despite extant objections, there remains a strong case for
saying that an empiricist account of theoretical concepts based on Ramsey’s
account of theoretical terms is true. For this claim to be defensible we have
to be careful about what sort of empiricism we are trying to defend. First of
all, it entails that we understand empiricism as an epistemological and
semantic doctrine about how we acquire concepts, and not an ontological
doctrine about whether ‘theoretical entities’ exist, a doctrine which, as we saw,
Ramsey’s view cannot help to defend. Second, empiricism must allow some mixed
concepts to be acquired directly through experience, without allowing direct
reference to theoretical entities.
However, I have tried to suggest that there is justification for thinking
that a form of empiricism is true that meets both these conditions. The
form of empiricism I have in mind derives in the first instance from the
claim that perception is modular, and uses this to define observational terms
as those terms which refer to properties that perceptual modules can
represent. I have argued that, if this view is true, there are strong reasons to
think that observational terms so defined can be seen as getting their
meaning directly, simply in virtue of being reliably caused by the things they
represent. This strongly suggests that at least some observational terms are
mixed, as required. However, I have also tried to argue that this explanation
of content only plausibly applies to observational concepts of this form,
entailing that we need a different explanation of how theoretical concepts
acquire their content. The best account here, I claim, is Ramsey’s theory that
the content of theories is always expressible as an existential claim in which
only observational and logico-mathematical terms occur.13
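The form of such an existential claim can be sketched as follows. This is the standard textbook presentation of a Ramsey sentence; the symbols T, \tau_i, o_j and x_i are introduced here purely for illustration and may differ from the notation used elsewhere in the chapter. Suppose a theory is written as a single sentence in which the theoretical terms \tau_1, ..., \tau_n and the observational terms o_1, ..., o_m occur:

```latex
\[
  T(\tau_1, \dots, \tau_n;\; o_1, \dots, o_m)
\]
% Replace each theoretical term by a variable and bind the variables
% with existential quantifiers to obtain the Ramsey sentence:
\[
  \exists x_1 \dots \exists x_n \; T(x_1, \dots, x_n;\; o_1, \dots, o_m)
\]
```

On this reading the only non-logical vocabulary left in the second sentence is observational, which is the feature the empiricist account relies on.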
13 The research for this chapter was carried out with the aid of a prix FSR at the Centre
de Philosophie des Sciences, Université catholique de Louvain. Thanks to Jeff Ketland and
Stathis Psillos, with whom I have had very helpful correspondence on the issues raised in this
chapter.
Ramsey Sentences and Avoiding the
Sui Generis
FRANK JACKSON
I
Many philosophers say that the history of conceptual analysis is a history of
failure. This is an exaggeration. The foundations of logic and mathematics
contain many successful analyses. But you can understand why they say
what they say. There have been many attempts to analyse knowledge since
Gettier (1963) disturbed the ‘justified true belief’ conventional wisdom and
we are still arguing the toss. And this is not an isolated example. The
following seems a fair statement of the situation we find ourselves in. There
are many important concepts—examples would include knowledge,
intelligence, rationality, probability, pain, personal identity, life—which we
appeal to in characterising elements of our world in the sense of classifying
them: the intelligent are alike in a way that marks them off from the
unintelligent, cases of knowledge differ from cases where we lack
knowledge, pains differ from other feelings, and so on. But there are no
generally accepted analyses of these concepts despite many attempts by
many clever philosophers. To borrow from Steve Stich, in a contest
between someone offering an analysis and someone searching for a counter-
example, the smart money is on the person looking for a counter-example.
Many view this situation with equanimity. In Knowledge and its Limits,
Timothy Williamson records his view that most concepts are unanalysable,
and regards this as an interesting but unworrying fact (Williamson
2000: 77, 100). I am sure his attitude is widely shared. As against this, I think
we should worry. This chapter divides into three parts. I start by saying why
we should worry. I then suggest a way out. I finish by noting that the way
out would not be available if Ramsey had not told us about the sentences
named after him.
II
Much of language is a system of representation. What one or another
sentence represents is the putative information about how things are that we
use that sentence to convey. If you are wondering whether to turn left or
right to find the coffee, a few words will give you the answer. If you are
wondering what kind of animal is about to crawl up your leg, the sentence
‘It is a ferret’ will give you the far from glad tidings.
The representational view of language has of course been controverted.
In ‘Epistemology and Truth’, Donald Davidson argues as follows.
The correct objection to correspondence theories is . . . that such theories fail to
provide entities to which truth vehicles (whether we take these to be statements,
sentences, or utterances) can be said to correspond. As I once put it, ‘Nothing, no
thing, makes our statements true.’ If this is right, and I am convinced it is, we ought
also to question the popular assumption that sentences, or their spoken tokens, or
sentence-like entities, or configurations in our brains can properly be called
‘representations’, since there is nothing for them to represent. If we give up facts as
entities that make sentences true, we ought to give up representations at the same
time, for the legitimacy of each depends on the legitimacy of the other. (1988: 184)
But surely maps and diagrams represent. People who use, and the people
who create, the familiar map of the London underground take it for granted
that it represents the relative positions of the stations, and they are right to
do so. But once this is conceded, it is hard to see why we should make a big
thing of the difference between maps and diagrams as opposed to sentences
and words, for the job we do with maps can be done in many cases by
prose, and in fact we often use words to assist people new to the map of the
London underground to grasp its representational content. We might well
have a debate about the right entities to be what a map or sentence
represents—facts, events, worlds, sets of worlds, propositions, mereological
sums, etc.—but that there is something to debate here should not, I think,
make us think that it is open to serious doubt that maps and sentences (and
thoughts if it comes to that) represent.
III
In order to use language to pass on information, we need to know what the
relevant descriptive words stand for. In order to use Morse code or
semaphore to pass on information, it is vital to know what the various
configurations stand for. This is why we go to classes on Morse code and
semaphore—or at least we did when those systems were still in general use.
The same goes for words. For example, we need to know what the word
‘electron’ stands for in order to be able to use it as part of a system of
representation for exchanging putative information about what our world is
like, every bit as much as we need to know what the various arm positions
stand for in order to use semaphore to exchange information. It would be
very strange if we didn’t need to know what words stand for; we would be
giving words special powers denied to other physical structures that we use
to transmit information.
IV
I hope these remarks sound like commonplaces. Is it news that if you—
potential giver and potential receiver of information alike—don’t know
what a physical structure stands for, then that structure is not much use for
passing on information? Did anyone need reminding about the value of
going to classes on Morse code? But now we have enough to disturb
equanimity about the problems for conceptual analysis. I will illustrate with
the case of knowledge but very similar points apply to personal identity,
pain, life, and all the rest.
We use, and are justified in using, the word ‘knowledge’ to pass on
information about how things are, especially concerning the epistemic states
of humans. (Or at least we do unless some kind of expressivism about
sentences containing the word ‘knowledge’ is correct.) In order to do this,
we need to know, or maybe have true justified belief concerning, what the
word ‘knowledge’ stands for. That’s the commonplace. But what does this
knowledge come to? Let’s review some possible answers.
V
All we know is that the word ‘knowledge’ stands for the property it stands for. That’s all
we can give by way of answer.
But we know that speakers of other languages can pass on the very same
information about what our world is like without using the word
‘knowledge’. It would be an extreme form of linguistic chauvinism to say to
ourselves ‘How lucky we are to be English-speakers, because if we were not,
there would be important information we could not pass on.’ Moreover, we
would not accept this kind of answer from one who claimed to understand
Morse code. Instead we would conclude that they did not understand Morse
code. The reason, of course, is that knowing that a word stands for what it
stands for is a trivial item of knowledge, whereas understanding a language
is in general a highly non-trivial matter.
VI
We know the property the word ‘knowledge’ stands for. But the property is a sui generis
unanalysable property. This means that all we can say by way of answer requires us to
use the very word itself or some synonym. The property is distinct from the word or words
we use to tell people about it, but we can do no better by way of words than to use the very
words themselves. In this sense, knowledge is unanalysable.
There are two problems for this answer. The first is that knowledge does
not seem to be the right sort of property to be sui generis. What a person
knows supervenes on enough information concerning truth, belief,
justification, defeasibility, reliability, counterfactual dependence, flukiness,
and the like. In all the surveying of possible cases prompted by Gettier’s
paper, one thing we take for granted is that being a case of knowledge is a
derivative or grounded property. It is a priori that no two cases can differ
only in the fact that one is, and the other is not, a case of knowledge; in
addition, there must be a difference in one or more of: how fluky the case is,
the counterfactual connections between fact and belief, the degree and
nature of justification, and so on and so forth. Exactly which items need to
go into this list is controversial—we nearly all agree that truth and belief had
better be there but quarrel about other candidates for inclusion—but it is
not controversial that there is an illuminating list of knowledge-making
properties. How so if knowledge is sui generis?
The second problem is the threat of scepticism. We can agree that we
have true justified belief on occasion, that we have true beliefs reached by
reliable processes, that we have true justified beliefs which are non-
accidentally so and which have desirable anti-defeasibility properties, that
sometimes we have true beliefs in situations where the possibility that they
might be false has been excluded, that we have justified true beliefs not
derived from false premises, and so on. We can agree, that is to say, that we
have on occasion true beliefs that satisfy all the at all plausible constraints
that have been suggested by the many who have sought an analysis of
knowledge. But if knowledge is sui generis, none of these agreements
amounts to agreeing that we have knowledge. How then can we ever be
certain that we have knowledge by contrast with being certain that we have
true justified belief, true belief reliably acquired, true belief with various anti-
defeasibility properties, etc.?
We can put it this way. Let K_1, ..., K_n be the sum total of all the sensible
suggestions that have been or might be put forward as analyses of
knowledge. Let knowledge_i be knowledge analysed according to K_i. We can
be confident that we sometimes have knowledge_i for each i. The sceptical
challenge is to provide a reason for saying that, moreover, we sometimes
have knowledge itself—the allegedly sui generis unanalysable property distinct
from all the analysable knowledge_i's on the suggestion under consideration.
I mentioned Williamson earlier as someone who views failures of analysis
with equanimity. My sense is that he would confront the sceptical challenge
by insisting that knowledge has an explanatory and predictive value that
belief, for example, lacks, and this gives us good reason to accept knowledge
as a feature of our world over and above one or another construction out of
true belief and whatever. Knowledge earns its keep by playing important
explanatory and predictive roles that cannot be handed across to the
varieties of belief. However, the examples he gives are not especially
compelling. Here is one (I think similar points apply to all the examples he
gives—see Jackson 2002):
How long would we expect a fox to be willing to search for a rabbit in the wood
before giving up, assuming initially (a) that the fox knows that there is a rabbit in
the wood, or (b) that the fox believes truly that there is a rabbit in the wood? In (b)
but not (a), the fox’s initial true belief may fail to constitute knowledge because the
true belief is essentially based on a false one, for instance, a false belief that there is
a rabbit in a certain hole in the wood. When the fox discovers the falsity of that
belief, the reason for the search disappears. That will not happen in (a), because a
true belief essentially based on a false one does not constitute knowledge. Thus,
given plausible background conditions, more persistence is to be expected in (a)
than in (b). In many such cases, lengthy persistence is better explained by initial
knowledge than by initial true belief. (Williamson 2000: 86–7)
The trouble with this example is that Williamson tells us the problem with
the explanation in terms of (b) without mentioning knowledge. The problem is, to
quote, that ‘the true belief is essentially based on a false one’. In
consequence, the case gives no reason for favouring a knowledge story over
a belief story with the defect remedied in terms that make no mention of
knowledge and which Williamson himself provides. In terms borrowed
from the final sentence of the quotation: in many such cases, lengthy
persistence is better explained by initial true belief that is not essentially
based on a false one than by initial true belief simpliciter.
VII
You are right that we had better know what ‘knowledge’ stands for and that it had better
not stand for something sui generis. But this does not mean, as you seem to be implying,
that there should be other words that we might use instead of the word ‘knowledge’, words
that, suitably assembled, would play the same informational role. For perhaps we refer to
knowledge via some guise G or other. We know what the word stands for—it stands for
the property that is G—but we cannot, as of now anyway, say what the property itself is.
But if this is right, we can state the circumstances in which someone will use
the word ‘knowledge’. They will use it when they think something has the
property which is G. The word ‘knowledge’ will be a good word for
receiving and passing on the information that something has the property,
whatever that property is, which is G, and that is what we mean here by a
word standing for a property. So the suggestion itself delivers the words we
need to do the informational job that the word ‘knowledge’ plays. Maybe
when we learn which property is G, we will shift our use of the word
‘knowledge’ to that property, but that’s another question.
VIII
There are two things you might mean by saying that ‘knowledge’ stands for a sui generis
unanalysable property. One is that it stands for the kind of property Moore thought
goodness to be. You are right that we should reject this idea. The other is that although
each and every case of knowledge can be fully described in all relevant respects in terms
that do not include ‘knowledge’ or a near-synonym, there is no pattern capturable in these
terms. This view involves no mysterious extra properties—each case is fully describable in
terms that do not include ‘knowledge’—but because there is no pattern statable in terms
that do not include ‘knowledge’ or a near-synonym, there is no question of giving an
analysis of knowledge. Knowledge is a patternless infinite disjunction, as we might put it.
There are three ways we might spell out this suggestion (which is of course
modelled on some versions of autonomy theses about the relation of the
moral to the non-moral, and of the mind to the physical). On one, the idea
is that there is no pattern at all, not just no pattern in terms of the features
of the disjuncts. This spelling-out raises serious questions about how we
could have acquired the concept of knowledge and learnt to use the word,
as cases where there is no pattern are cases where one cannot pick up a
word by reflection on examples, and also raises serious questions about the
utility of our talk of, and our concept of, knowledge. The more patternless a
collection of items is, the less interest and theoretical value it has for us qua
collection. And I think we should resist any suggestion that the applicability
of the word ‘knowledge’ in itself creates the interest. To say that would go
against our earlier point that we use words to tell about the world.
The second way of spelling out the suggestion affirms that there is a
pattern but not one capturable at the level of the disjuncts; it is not
capturable in the terms of belief, truth, and the like—the more fine-grained
features on which knowledge supervenes. But this is to return to the view
that knowledge is an extra property, the view suggested by Moore’s view of
goodness, the view we are supposed at this point in the discussion to be
avoiding. Our topic here is not Platonism about properties or properties as
the universals that serve to carve nature at its fundamental joints. For us,
whenever there is a pattern unifying a set of items, there is a property in the
wide sense of a way that things might be. What one then says, moving from
speculative cosmology to analytic ontology, about whether we should think
of this pattern in terms of a common relation that the items have to a
universal, or in terms of resemblance nominalism, or in terms of an
immanent shared thing which inheres in every item, or ... is another
question.
The third way of spelling out the suggestion affirms that there is a pattern
at the level of the disjuncts capturable in principle in terms of the fine-
grained features but it is not capturable by us in these terms. Perhaps God
can see the pattern at the level of the disjuncts but we cannot. But now it is
unclear what we are supposed to be using the word ‘knowledge’ to stand for
in the sense of that which we tell about when we use the word. If we come
across a tribe that cannot detect a certain feature, we can be confident that
they lack a word for it, or if they somehow do have one, their claims made
using the word will be very unreliable. Obviously, we do not want to say
that our situation with the word ‘knowledge’ is at all like this.
Perhaps the suggestion is that we know that there exists a pattern at the
level of the disjuncts and that creatures with special powers could articulate
the pattern in the relevant terms, but all we can do is say, justifiably, that
there exists a pattern. Experienced tennis players can tell whether a ball
coming towards them will go out or go in with remarkable reliability. They
know their judgement is triggered by a pattern at the level of direction, spin,
velocity, height over the net, and the like, and that a brilliant cognitive
scientist might find the pattern at this level after a lot of work. At the same
time, the tennis players do not, and know that they do not, know what the
pattern is at the level of velocity etc. Perhaps our use of the word
‘knowledge’ is something like that. But the tennis example is patently not an
example where we do not know or cannot say what unifies the disjuncts.
What unifies them is their association with whether the ball lands in or out,
and we all know that the information we pass around with the words ‘in
ball’ and ‘out ball’ concerns where the ball is likely to land in relation to the
lines, and, moreover, we can specify the relevant location without using the
words ‘in’ or ‘out’ (so any suggestion that the property is sui generis is not to
be taken seriously). There are, and we know that there are, two unifiers for
the disjuncts. The one we cannot give with any exactitude is the one
involving spin, height above the net, and so on; the other is in terms of
where the ball lands, and that we can give and is the one experienced tennis
players give putative information about when they say ‘in ball’ or ‘out ball’ in
response to an approaching ball.
IX
The idea that we ask after the feature of the world that a word like ‘knowledge’ stands for
in order to understand the information it serves to pass around is a hangover from an
unduly regimented view of language. In some cases we did get together and agree to use one
or another physical structure to stand for this, that or the other property. Morse code and
semaphore are examples. But mostly we should think of words in terms of knowledge how
and not knowledge that, as a matter of exercising abilities, especially recognitional ones,
and not in terms of flagging features.
The emphasis on recognitional abilities seems absolutely right, especially
when one thinks of the tennis example above, but it does not address our
problem. Our problem is, What information do we pass around by using
words, the word ‘knowledge’, in particular? And it is not plausible that what
we pass around is information like that we have recognitional abilities, or that we
are currently exercising the ability that underlies our use of the word ‘knowledge’. That
all pertains to how it is that we have the ability to pass around information
using the word, not to the information we pass around.
Likewise, one of the things we are able to recognise is similarity to one or
another degree, and it is plausible that sometimes the information we pass
around is to the effect that some item is, to some degree or other, similar to
certain exemplars, but again what we pass around is that there is a similarity,
not that we are recognising it. Of course we can pass around information
about our abilities—as when we say that we can recognise Tony Blair in a
photograph—but when we do, we use words that stand for the relevant
recognitional features. Recognising Blair is something we tell about by using
words we know stand for our acts of recognition.
X
This whole discussion is in the grip of an outmoded theory of reference. It is being assumed
(presumed) that the reference of a word like ‘knowledge’ is given by a descriptive condition
associated with the word. The talk of the property the word stands for is nothing more
than the description theory in other words. What we learn from Saul Kripke and Hilary
Putnam’s work on reference is that the reference, and hence what is being said about how
things are, by the use of a word like ‘knowledge’ may be quite opaque to users of the word.
The right theory of reference is an a posteriori matter, and this means that competent
users of a descriptive word need not know what that word stands for. What a word stands
for awaits the delineation of the right theory of reference and the discovery of the relevant
empirical facts. For example, if the right theory of reference for the word ‘knowledge’ is
that it refers to the property that stands in causal relation R 5 to users of the word, then to
know what the word stands for we need to know both this and the property that stands in
R 5.1
This is not the place to argue the theory of reference as such, but let me
indicate why this seems to me to be a perverse moral to draw from recent
work on reference. If what people are saying about how things are depends
on something we philosophers know about but the folk do not, we had
better hurry up and solve the theory of reference. The folk use words all the
time to say how they take things to be and they would like to know what it
is that they are saying about how things are. Should we put notices in the
papers to warn them that they do not know what they are saying and ask
them to watch out for the results of the next workshop on the theory of
reference in the hope that it will settle the matter once and for all? I hope
this will strike you as absurd.
XI
We find ourselves in the following situation. There are many words that
patently serve the function of passing around information about what the
world is like. This requires that, in coming to understand them, we grasp
what features they stand for. For a few of these words, it is plausible that the
corresponding features are fundamental features that cannot be thought of
as complexes of more basic features. But mostly the features are not
fundamental in this sense. To suppose otherwise is, as we’ve seen, to make a
mystery of the way these features supervene on more fine-grained ones and
to raise the bogey of scepticism. But this means that, provided we have
sufficient linguistic resources, we should be able to identify the feature
words like ‘knowledge’, ‘intelligence’, ‘life’, ‘personal identity’ stand for,
using words other than the very words themselves; and do so not in the
boring sense of using other words that are synonyms or near-synonyms, but
in the interesting one which is what we have in mind when we offer
conceptual analyses. We should be able to find illuminating alternative ways
of saying that something is known, is probable, is intelligent, is alive, is the
same person as, and so on and so forth. Why then is it—conceptual
analysis—so hard?
1 Kripke (1980); Putnam (1975). I do not know if Kripke or Putnam would agree with
this use of their work on reference but some certainly seem to draw the opacity moral from
their work.
XII
If, as some hold, myself included, whenever linguistic structures S1 and S2
are alternative ways of saying the same thing about how the world is, then
‘S1 iff S2’ is a priori, we have the traditional connection between analysis and
the a priori. But notice that we raised the puzzle simply in terms of the way
language passes around putative information about how things are. There is
a puzzle here independently of where one stands on the a priori. Some think
that their robust rejection of the a priori means that they can think of the
problems for conceptual analyses as ripples on a discredited backwater, but
in fact there’s an issue for anyone who sees language as being like Morse
code and semaphore in being a system of representation.
XIII
There is even an issue for wide-ranging expressivists. By wide-ranging
expressivists I mean those who hold that very many of the terms
philosophers have found hard to analyse, not merely the normative and
ethical ones of classical expressivism, serve to express rather than report or
describe. The issue for wide-ranging expressivists is not, of course, to
capture the information putatively passed around by the use of the
problematic terms. Their view is that there is no such information, and to
ask after it is to misunderstand the role of the terms in our language. All the
same, expressivists allow, have to allow, that the terms in question figure in
language fragments that serve to make claims about how things are. For
example, if ‘knowledge’ is a term that wide-ranging expressivists hold
expresses rather than reports, they must allow that the sentence ‘The word
“knowledge” in English serves to express an attitude rather than to make
claims about how things are’ serves to make claims about how things are. The
question for them, accordingly, is what to say about the attitude in question.
Is it sui generis, or is it subject to such and such an analysis, or ... ? Maybe
wide-ranging expressivists have less trouble with these questions than the
rest of us; maybe not. That’s an interesting question for another time.
XIV
My answer to why conceptual analysis is hard and why there is so much
controversy takes off from a picture that underlies much of what I have said
already.
The world is a huge complex entity spread out in space and time. We
make sense of it by finding patterns. If we did not discern patterns, we
would be overwhelmed by the complexity. Finding patterns is a matter of
carving out similarity regions, and we use physical configurations—maps,
colouring conventions, and above all words—to tell about these similarity
regions. Globes of the world once coloured the parts of the British Empire
red. Here we have, first, the world. Second, those parts of the world that are
alike in being one or another part of the British Empire—the similarity
region is the scattered object that is the British Empire. And, third, the
words and the colouring on the globes that serve to tell about that scattered
object: where it is, its shape, how it waxed and waned. The same goes for
more philosophically interesting examples. We can think of pain as a huge
scattered object united by each element being a case of pain, of knowledge
as that which unites in the relevant respect all the bits of the world that
know such and such, of life as that which unites in the relevant way all the
things that are alive, and so on.
There are infinitely many similarity regions in space and time, especially
when you bear in mind that anything that can be captured in a system of
representation counts for our purposes, and that systems of representation
carve out regions in logical space as well as actual space. We are not talking
about carving nature at especially natural joints or anything like that, and we
are talking about commonalities across possible worlds. This means that any
system of representation that captures a good number of these regions has
to be structured for the same reason that number systems have to be
structured. We can tolerate a certain number of primitives, but, as everyone
knows, after that we have to have a finite set of rules for operating on a
finite list of primitives to form terms for the indefinitely many similarity
regions (or numbers).
Now a conceptual analysis is nothing other than a claim that there are
two different ways of capturing the same similarity region, with the proviso
that we are not interested in boring cases. The region carved out by the
word ‘sibling’ is the union of the regions carved out by the word ‘sister’ and
the word ‘brother’, and that is why we can analyse ‘x is a sibling’ as ‘x is a
brother or a sister’.
There is, therefore, no mystery about why conceptual analysis is
sometimes hard. There is no reason why it should typically be obvious that
different sets of ingredients put together in appropriately different ways
carve out the same region. After all, in the main, the equations for the conic
sections are not especially obvious, and they are cases where geometric and
algebraic systems of representation carve out the same curves. Why should
the situation be greatly different for representation by words and sentences?
Another way of putting the point is to note that when we asked at the
beginning rhetorical questions like ‘Surely we know what words like “pain”
and “knowledge” stand for, otherwise we would not know what we were
saying about how things are when we use them?’ to answer that of course
we do is not to say that we know off the bat interestingly different ways of
representing what we know in other words.
XV
This tells us why conceptual analysis is often hard, and perhaps is all we
need to say about Moore’s famous Paradox of Analysis.2 But it does not tell
us why there is so much disagreement, and why the smart money is on the
counter-exampler. Finding the algebraic way to represent a curve that we
already know how to represent geometrically may be tricky, but there is
plenty of agreement once we’ve pulled it off. In order to explain the
ubiquity of disagreement, we need a point about many concepts brought to
our attention by many philosophers including Ramsey, and which famously
played a central role in the arguments for materialism as a philosophy of
mind in the hands of David Armstrong and David Lewis.3
I said we need to spot patterns if we are to make sense of our world.
Very often the patterns we need, or which do the job best, form a theory in
the sense of patterns that are identified by their relations to other patterns.
They come as package deals, as David Armstrong likes to put it. What unites
the husbands is that each has a wife, and what unites the wives is that each
has a husband. Other familiar examples are the relation between force,
mass, and acceleration in Newtonian mechanics, and the way belief–desire
psychology finds patterns in the way we move through the world. It is, I
take it, a contingent fact about the world we live in that many of the most
useful patterns are package deal ones. In order to understand British
politics, you need to understand the patterns picked out by terms like
‘cabinet’, ‘minister’, ‘backbencher’, ‘electorate’, and so on, and they make up
a package deal. But this is a contingent feature of British politics.
This gives us two sources of controversy when we come to do
conceptual analysis. Both arise from the implicit nature of the package deals
in the philosophically interesting cases.
XVI
The package deal that makes up Newtonian gravitational field theory has
been written down. But the package deals that make up belief–desire
psychology, rationality, being intelligent, knowing something, and so on
have never been written down. Of course bits have been written down.
Most accounts of rationality include clauses to the effect that one ought not
believe contradictions other things being equal. Most accounts of chance
include clauses connecting credences concerning chances of outcomes to
2 But note that what we mean by analysis is not what Moore meant; see Moore (1942:
660).
3 Armstrong (1968); Lewis (1966, 1970). The importance of Ramsey’s contribution is
highlighted by Lewis.
credences concerning outcomes simpliciter (see e.g. Mellor 1971). All
accounts of knowledge make knowledge factive. But the absence of
canonical statements means that two things can and do happen.
One is that different people can have different theories without this
being obvious. The other is that people can change their theories without
noticing that they have. I think both are illustrated in the debate over the
analysis of knowledge that started with the Gettier examples. The Gettier
examples are often cited as one of the few examples of knockdown
refutations in philosophy.4 But in fact there are three possibilities, only one
of which is the knockdown refutation case.
Case 1. Roderick and Alfred believe that the representational content they
give to the word ‘knowledge’ is the intersection of the contents they give to
the words ‘true’, ‘belief’, and ‘justified’. They are told about the Gettier
cases and realise the error of their ways. Perhaps they explain that what
tricked them is that they’ve always granted the link between knowing and its
not being a fluke that one’s belief is true but thought that being justified
rules out being right by fluke. The Gettier cases show this is a mistake. (I
have 90% credence that I am an example of case 1.)
Case 2. Roderick and Alfred rightly believe at t that the representational
content they give to the word ‘knowledge’ is the intersection of the contents
they give to the words ‘true’, ‘belief’, and ‘justified’. They are told at t + ε
about the Gettier cases and immediately realise that there is an interesting
epistemological state to be in that is distinct from true justified belief and
which is in some ways superior. In the act of realising this, they switch,
starting pretty much from t + ε, their usage of the term ‘knowledge’ to this
state they have only just discerned but without realising that they are making
a switch rather than correcting an error. The fact that a change in usage has
occurred is easy to overlook because Gettier cases are unusual and because
of the lack of any explicit agreement. (I have a 10% credence that I am an
example of case 2, or, better, that it is vague whether or not I am a case 1 or
a case 2.)
Case 3. Roderick and Alfred rightly believe that the representational
content they give to the word ‘knowledge’ is the intersection of the contents
they give to the words ‘true’, ‘belief’, and ‘justified’. They are told about the
Gettier cases, are unconvinced, and become well known for articles
defending the true justified belief analysis against Gettier and related cases.
(I am certain that I am not an example of case 3 but I think that there are
examples of case 3.)
So why is there so much disagreement? Part of the answer is that there
isn’t. What there is is an awful lot of apparent disagreement. Philosophers
XVII
What’s all this got to do with Ramsey? Ramsey sentences tell us how having
a place in a network can deliver an analysis in the sense of an account that
reduces the number of unanalysable notions we have to admit in our
account of what our world is like (Ramsey 1929c).
The natural first thought on being told that some concept is defined by
its place in a network is that vicious circularity threatens. Isn’t it circular to
define C1 in terms of C2, and then turn around and define C2 in terms of
C1? And defining C1 to Cn in terms of their places in a network looks
suspiciously like this, except that more concepts are in play. But Ramsey
sentences tell us that this need not be the case. To sketch the familiar story,5
let T(C1, ..., Cn) be the sentence that gives the network, with the ‘Ci’s
thought of as names of the kinds corresponding to the concepts. If each Ci
is defined by its place in T, then the content of T(C1, ..., Cn) is its Ramsey
sentence, namely (∃x1) ... (∃xn) T(x1, ..., xn). For T(C1, ..., Cn) simply says that
there are kinds standing thus and so to one another, which is what the
Ramsey sentence says. But then to be Ci is to be that which is in the i-th
place if such there be, for each i. That is to say, y is Ci iff (∃x1) ... (∃xn) [y has
xi & T(x1, ..., xn)], where each xj is in Cj’s place in T. As the right-hand side
of this biconditional contains no occurrences of any Cj, we see how a
network story can avoid circularity, and how a network story allows us to
reduce the number of sui generis concepts we need to admit.
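The way a place in a network yields a non-circular definition can be illustrated on a toy finite world. What follows is only a sketch under stated assumptions, not anything found in Ramsey or Lewis: the three-person domain, the "old" notions `MALE` and `MARRIED`, and all function names are hypothetical. The husband and wife package deal is written as a theory T(x1, x2) with variables in place of the two terms, and the existentially quantified right-hand side is checked by brute force over every candidate kind (every subset of the domain).

```python
from itertools import chain, combinations

# Hypothetical toy world: three individuals and two already-understood notions.
DOMAIN = ("al", "bea", "cy")
MALE = {"al", "cy"}
MARRIED = {("al", "bea"), ("bea", "al")}  # symmetric relation


def kinds(domain):
    """Every subset of the domain, each a candidate value for a variable x_i."""
    return [frozenset(c) for c in chain.from_iterable(
        combinations(domain, r) for r in range(len(domain) + 1))]


def theory(x1, x2):
    """T(x1, x2): the network, with variables where 'husband' and 'wife' stood."""
    if not x1 or not x2:
        return False
    first_ok = all(a in MALE and any((a, b) in MARRIED for b in x2) for a in x1)
    second_ok = all(b not in MALE and any((b, a) in MARRIED for a in x1) for b in x2)
    return first_ok and second_ok


def ramsey_sentence():
    """(Ex1)(Ex2) T(x1, x2), checked over all pairs of candidate kinds."""
    ks = kinds(DOMAIN)
    return any(theory(x1, x2) for x1 in ks for x2 in ks)


def occupies(y, i):
    """y is C_i iff (Ex1)(Ex2)[y has x_i & T(x1, x2)]; i is 1 or 2."""
    ks = kinds(DOMAIN)
    return any(y in (x1, x2)[i - 1] and theory(x1, x2) for x1 in ks for x2 in ks)
```

In this toy world `occupies("al", 1)` holds and `occupies("cy", 1)` does not, yet neither `theory` nor `occupies` uses the words 'husband' or 'wife': the two notions are fixed purely by their places in T. As in the text, no uniqueness requirement is imposed.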
Without Ramsey’s insight, we would, I think, have had to embrace one
of: language is not a system of representation; or it is, but it represents
things as having implausibly many sui generis features and, in consequence,
misrepresents a lot of the time and makes a mystery out of our
supervenience intuitions. Ramsey shows us how to get out of a nasty dilemma.6
5 Best known through Lewis (1970). I omit the uniqueness requirement that Lewis
includes, as indeed did Ramsey. The point at issue is independent of it.
6 I am indebted to many discussions with supporters and opponents, too many to list.
My general debt to David Lewis’s writings on Ramsey sentences will be obvious.
What Does Subjective Decision
Theory Tell Us?
D. H. MELLOR
1. THE QUESTION
By ‘subjective decision theory’, or ‘SDT’ for short, I shall here mean the
common core of the subjective decision theories of Ramsey (1926c), Savage
(1972), Jeffrey (1983), and others, ignoring differences of detail. This core
theory bases an assessment of decisions to act on two features of the
possible outcomes of alternative actions: how probable they are and how
valuable they are—or rather, how probable and valuable we think they are as
we make our decisions. For the probabilities and values which SDT invokes
are not objective chances or values, if such there be. They are measures of
how strongly, while deciding how to act, we believe in and desire various
possible outcomes of our actions. This is why the theory is called
‘subjective’: in it, the values of these outcomes are just the so-called subjective utilities
which they have for us in advance, and their probabilities are just the
different degrees of belief, or credences, that we have in them.
Although these features of SDT are contentious, I shall take them for
granted in what follows, since what concerns me here is how we should read
the theory, so understood. Should we read it normatively, as saying, rightly
or wrongly, how we should act, or would act if we were rational; or
descriptively, as saying, rightly or wrongly, how in fact we do act? Jeffrey and
most other modern subjective decision theorists read it normatively, and
take Ramsey to have done so too. I, like Blackburn (1998, ch. 6), think they
are wrong on both counts: Ramsey read his theory descriptively, and was
right to do so. The theory, as he presents it, is not normative: it is a
descriptive theory that forms part of a functionalist account of states of mind. And
that, I shall argue, is how we should read SDT; for only on this descriptive
reading is it defensible.
Since the issues that will concern us arise in even the simplest cases,
those are all I shall consider. So suppose, for example, that I am trying to
decide whether to stop smoking tobacco in order to avoid getting cancer:
that is the intended end (call it E) to which my stopping smoking is a means
(call it M). Suppose too that my nicotine addiction prevents me from
smoking less, so that unless I stop smoking altogether I will carry on as
before. This therefore is the relevant alternative to M (call it ¬M), just as my
getting cancer is the relevant alternative to E (call it ¬E). Then SDT says
that whether I will or should ‘do M’ (i.e. make M the case) depends on what,
at the time and in the circumstances, are the utilities for me of the four
possible upshots of my action—M & E, M & ¬E, ¬M & E, and ¬M & ¬E—
and on what (I now think) my credences in E and ¬E will be if I do M and
if I do ¬M.
Specifically, SDT says that whether I will or should do M depends on the
expected utilities for me of M and of ¬M. M’s expected utility for me is the
average of the utilities for me of M & E and of M & ¬E, weighted by what
my credences in E and in ¬E will be if M is done; and similarly for ¬M.
Then SDT says that I will or should do whichever of M or ¬M has the
greater expected utility. That is, I will or should do M if M’s expected utility
for me exceeds ¬M’s, and I will or should do ¬M if ¬M’s expected utility
for me exceeds M’s. (If the two expected utilities are equal, SDT says
nothing either way.) This is the principle of maximising subjective expected
utility or, for short, the maximising expected utility principle, or MEUP. The
question then is whether we should read MEUP normatively, as saying that
we should maximise our expected utilities, or descriptively, as saying that we
will maximise them.
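The arithmetic behind MEUP in the smoking example can be sketched as follows. This is a minimal illustration only: the utilities and credences are made-up numbers, not values given in the text, and the function names are hypothetical. Each expected utility is the credence-weighted average of the utilities of the two upshots of an act, and the principle picks the act with the greater expected utility, staying silent on ties.

```python
def expected_utility(u_with_e, u_without_e, credence_in_e):
    """Average of the two upshot utilities, weighted by credence in E and in not-E."""
    return credence_in_e * u_with_e + (1.0 - credence_in_e) * u_without_e


def meup(eu_m, eu_not_m):
    """Return the act with the greater expected utility, or None if they are equal."""
    if eu_m > eu_not_m:
        return "M"
    if eu_not_m > eu_m:
        return "not-M"
    return None  # SDT says nothing either way


# Hypothetical utilities for the four upshots (M = stop smoking, E = avoid cancer).
UTILITY = {
    ("M", "E"): 80.0,
    ("M", "not-E"): 0.0,
    ("not-M", "E"): 100.0,
    ("not-M", "not-E"): 20.0,
}
# Hypothetical credences in E conditional on each act being done.
CREDENCE_IN_E = {"M": 0.9, "not-M": 0.6}

eu_m = expected_utility(UTILITY[("M", "E")], UTILITY[("M", "not-E")],
                        CREDENCE_IN_E["M"])
eu_not_m = expected_utility(UTILITY[("not-M", "E")], UTILITY[("not-M", "not-E")],
                            CREDENCE_IN_E["not-M"])
choice = meup(eu_m, eu_not_m)
```

With these assumed numbers the expected utility of M is about 72 and that of ¬M about 68, so MEUP selects M. Nothing in the calculation settles whether that output is to be read as a prediction or as a prescription, which is the question at issue.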
1 This chapter is derived from papers given to the Birkbeck College Philosophy Society
in London on 18 March 2003, the Frank Ramsey Centenary Conference in Cambridge on 1
July 2003, Tokyo University on 7 October 2003, the Ramsey Conference in Paris on 24
October 2003, and the Durham University Philosophy Society on 6 November 2003. I am
greatly indebted to all those who took part in the discussion of these talks.
Three Conceptions of
Intergenerational Justice
PARTHA DASGUPTA
1. INTRODUCTION
How should we measure human well-being over time and across
generations? In which way ought the interests of people in the distant future
be taken into account when we make our own decisions today? In which
ethical language should citizens deliberate over the rate at which their
society ought to invest for the future? In which assets should that
investment be made? What should the balance between private and public
investment be in the overall investment that a generation makes for the
future?
Frank Ramsey’s paper of 1928 in the Economic Journal (‘A Mathematical
Theory of Saving’) constructed a framework in which these questions can
be asked in a form that is precise and tractable enough to elicit answers.
Although very famous today, the article had no initial impact. In the years
following its publication, a period now known as the Great Depression, the
central economic conundrum in Western industrial countries was to find
ways of increasing immediate employment. Factories and machinery lay idle,
as did people. The policies that were needed then were those that would
help to create incentives for employers to hire workers. This, however, was
a short-term problem. With the emergence of post-colonial nations
following the Second World War, long-run economic development became
a focus of political interest. By the early 1960s Ramsey’s paper came to be
acknowledged as the natural point of departure for exploring the normative
economics of the long run. The number of trails the paper laid was
remarkable. In academic economics it is probably one of the dozen most
influential papers of the twentieth century.
I don’t recall ever reading Ramsey’s article until preparing for the
Centennial Conference on Ramsey. Classics typically don’t get read by us
economists: we come to know them from subsequent developments of the
subject and from textbook accounts. The paper has all the hallmarks of a
classic and then some more. What has struck me most on reading the work
is that it reads as though it could have been written last year. The techniques
are thoroughly contemporary. Moreover, there is a self-conscious attempt at
identifying a parsimonious body of assumptions that lead to the
conclusions: the paper has no fat in it.
Ramsey’s conception of intergenerational justice is grounded firmly on
the Utilitarian calculus. In what follows, I first present an account of
Ramsey’s formulation of the problem of optimum saving and sketch its
most dramatic implications (Sections 2–4). As we will see, they look odd and
are at variance with ethical intuition in plausible worlds. The theory is even
incoherent in some worlds. Therefore, in Section 5 I explore one particular
interpretation of a dominant alternative ethical theory, that of Rawls (1972),
which defines just rates of saving to be the ones that would be ‘agreed’ upon
by all generations behind a veil of ignorance—the hypothetical social
contract. In keeping with Rawls’s reading of what members of a given
generation would agree to be a just intragenerational distribution of
resources, I take the Rawlsian principle of just saving to recommend
maximising the well-being of the least well-off generation—the Difference
Principle.1 I show that, in plausible worlds, the implications of Rawls’s
theory are very odd and are at variance with both ethical intuition and
actual, reflective practice.
So, in Section 6, I turn to a formulation of the concept of justice among
generations that was developed by a great twentieth-century economist, the
late Tjalling Koopmans. Koopmans was moved to reformulate the problem
of intergenerational justice because of the latent incoherence in Ramsey’s
conception, mentioned above. Although Ramsey’s and Koopmans’s
conceptions lie at different interpretative ends (Rawls would call the former
‘teleological’, the latter ‘intuitionist’), Koopmans (1960, 1972) showed that,
mathematically, the two are very similar and that Ramsey’s techniques for
identifying optimum rates of investment are usable in his own formulation
(Koopmans 1965). In Section 6 I confirm that Koopmans’s formulation is
sufficiently flexible to permit us to derive conclusions that do not jar against
considered judgement.
The common mathematical structure of Ramsey’s and Koopmans’s
conceptions has been found to have wide applicability—so wide that within
modern economics there is no rival formulation for evaluating the
intergenerational distribution of benefits and burdens. Today, we
economists who work on the concept of justice among the generations refer
to that overarching mathematical structure as the Ramsey–Koopmans
construct, even though the interpretation we give to that mathematical
structure is the one advanced by Koopmans.
It is a significant feature of Koopmans’s conception that the well-beings
of future generations are discounted at a positive rate. This has been
regarded by many to be cause for concern. In Section 7 I argue otherwise.
In Section 8 I show, more generally, that the obsession in both the
philosophy and economics literatures over the question of whether it is
ethically justifiable to discount the well-beings of future generations has
been misplaced. Koopmans’s formulation shows that there are at least two
1 Rawls (1972) uses the language of ‘primary goods’, not utility, nor well-being. At this
point I am regarding well-being as an index of Rawlsian primary goods.
ethical parameters that reflect considerations of intergenerational equity, the
discount rate being one. There is, however, another parameter, that is in
some sense dual to discounting, in that many of the demands made by
considerations of intergenerational equity that can be achieved by
manipulating the discount rate can also be achieved by manipulating the
other parameter. That Koopmans’s conception insists on positive
discounting would, therefore, seem to be of less moment than it has been
taken to be.
For ease of comparison among the formulations of Ramsey, Rawls, and
Koopmans, I shall assume, until Section 9, that population is constant and
that societies face no uncertainty. So, in Sections 9 and 10, I extend
Koopmans’s formulation to include population change and uncertainty,
respectively. The main conclusions are summarised in Section 11.
3 $\sum_{\tau=t}^{\infty}$ is the summation sign, from t to infinity. Thus, in equation (1), $\tau$ signifies dates that
go from t to infinity.
4 Note that the first element of the sequence is generation 1’s consumption.
5 Looking backward, therefore, it would reason that generation 0 had ‘done the right
thing’ by consuming $\hat{C}_0$. Note too that generation 1 would find it optimum to choose the
level of consumption generation 0 had planned for it, namely, $\hat{C}_1$.
Plainly, then, generation 1 would consume $\hat{C}_1$, invest accordingly, and pass
on the optimum stocks of capital assets to generation 2. Denote the set of
feasible consumption streams for generation 2 to be $\Xi_2$. A typical member
of $\Xi_2$ can be written as $(C_2, C_3, \ldots, C_\tau, \ldots)$.6 The problem to be faced by
generation 2 will be to identify that element of $\Xi_2$ that maximises $V_2$. It is an
interesting and important feature of expression (1) that generation 2 would
identify the optimum consumption stream to be $(\hat{C}_2, \hat{C}_3, \ldots, \hat{C}_\tau, \ldots)$.7 Plainly,
then, generation 2 would consume $\hat{C}_2$, invest accordingly, and pass on the
optimum stocks of capital assets to generation 3. And so on. The ethical
viewpoints of the succeeding generations are congruent with one another.
Each generation chooses its level of consumption and leaves behind capital
assets that can sustain the subsequent stream of consumption levels that it
deems to be just, aware that succeeding generations will choose in
accordance with what it had planned for them. In modern game-theoretic
parlance, Ramsey’s optimum consumption stream is a so-called ‘non-
cooperative’ Nash equilibrium among the generations. If expression (1) is
the coin by which generation t interprets intergenerational well-being (for all
$t \ge 0$), and if every generation can be expected to choose ethically, then
there is no need for an intergenerational ‘contract’. That it is not possible for
the generations to devise a binding agreement among themselves is of no
moment.
for $t \ge 0$, where $\beta = 1/(1+\delta) < 1$. In expression (2), $\delta$ is the discount rate and
$\beta$ is the resulting discount factor.9
To some economists, Ramsey’s stricture forbidding the discounting of
future well-beings reads like a Sunday pronouncement. Solow (1974a: 9)
expressed this feeling when he wrote, ‘In solemn conclave assembled, so to
speak, we ought to act as if the [discount rate on future well-beings] were
zero.’ But there is a deeper problem with the stance. In such complex
exercises as those involving the use of resources over a very long time
horizon, in a world where investment in capital has a positive return (the
latter reflecting an in-built bias in favour of future generations), it is foolhardy
to regard any ethical judgement as sacrosanct. This is because one can never
8 Their position has been re-examined and endorsed by a number of modern
philosophers; see Feinberg (1980) and Broome (1992). For wide-ranging discussions among
economists on this question, see Lind (1982) and Portney and Weyant (1999).
9 The discount rate in expression (2) is constant. Arrow (1999) has appealed to agent-
relative ethics to explore the consequences of using a variable discount rate. The variation he
explored arises from the idea that each generation should award equal weight to the well-
beings of all subsequent generations, but should award its own well-being a higher weight
relative to that awarded to the subsequent generations. In this chapter I am exploring the
concept of intergenerational well-being, an essential ingredient in Government House Ethics.
It is doubtful that agent-relative ethics would be appropriate for such exercises as
Government House would be required to conduct.
know in advance what it may run up against. A more judicious tactic than
Ramsey’s would be to play off one set of ethical assumptions against
another in not implausible worlds, see what their implications are for the
distribution of well-being across generations, and then appeal to our
intuitive senses before arguing over policy. The well-being discount rate may
well be too blunt an instrument to settle questions of intergenerational
equity.
Consider, for example, the following ethical tension:
(A) Low rates of consumption by generations sufficiently far into the
future would not be seen to be a bad thing by the current generation
if future well-beings were discounted at a positive rate. It could then
be that, by applying positive discount rates, the present generation
finds it acceptable to save very little for the future—it may even find
disinvestment to be justifiable. But if that were to happen, the
demands of intergenerational equity would not be met. This
suggests that we should follow Ramsey and not discount future well-
beings.
(B) As there are to be a lot of future generations in a world that faces an
indefinite future and where the return on investment is positive, not
to discount future well-beings could mean that the present
generation would be required to do too much for the future; that is,
they would have to save at too high a rate. But if that stricture were
to be obeyed, the demands of intergenerational equity would not be
met. This suggests that we should abandon Ramsey and discount
future well-beings at a positive rate.
The force of each consideration has been demonstrated in the economics
literature. It has been shown that in an economy with exhaustible resources
and ‘low’ productive potentials for manufactured capital assets, optimum
consumption declines to zero in the long run if the future well-beings are
discounted at a positive rate, no matter how low the chosen rate is
(Dasgupta and Heal 1974), but increases indefinitely if we follow Ramsey in
not discounting future well-beings (Solow 1974b). This finding was the
substance of Solow’s remark (Solow 1974a), that, in the economics of
exhaustible resources, whether future well-beings are discounted can be a
matter of considerable moment. In recent years environmental and resource
economists writing on sustainable development have taken this possibility as
their starting point (e.g. Bromley 1995).
On the other hand, if the Ramsey requirement, that future well-beings are
not discounted, is put to work in a close variant of the model economy
Ramsey himself studied in his paper, it recommends that every generation
should save at a very high rate. For classroom parameterisations, the
optimum saving rate has been calculated to be in excess of 60 per cent of
gross national product. In a poor country such a figure would be
unacceptably high, requiring the present generation to sacrifice beyond the call of
duty.10 The real problem is that no one, not even Ramsey, could be expected
to know in advance how to capture the right balance between the claims of
the present generation and those of future ones. The issues are far too
complex, especially in infinite horizon models. Unaided intuition is suspect.
Rushing to Utilitarianism with no discounting can be treacherous. What the
quantitative exercises in Dasgupta and Heal (1974) and Solow (1974b) tell us
is that the long-run features of optimum saving policies depend on the
relative magnitudes of the rate at which future well-beings are discounted
and the long-term productivity of capital assets.
In fact there is a deeper problem with Ramsey’s stricture that future well-
beings should not be discounted. Koopmans (1965) showed that
consideration B can even overwhelm the stricture and render expression (1)
incoherent. Zero discounting can imply that there is no best policy; that, no
matter how high is the rate of saving, saving a bit more would be better. To
see how and why, imagine a world where goods are completely perishable.
Consider an economic programme where consumption is the same at every
date. Now imagine that an investment opportunity presents itself in which,
if the present generation were to forgo a unit of consumption, a perpetual
stream of additional consumption µ (>0) would be generated.11 Suppose
intergenerational well-being is represented by expression (1). Then, no
matter how small is µ, future generations, taken together, would experience
an infinite increase in well-being as a consequence of the investment, the
reason being that µ ‘multiplied’ by infinity is infinity. So, for any level of
consumption, no matter how low, a further reduction in consumption
(possibly short of a reduction that brings consumption down to zero) would
be desirable. As a piece of ethics, this is clearly unacceptable. Ramsey’s
conception simply does not do.
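The divergence behind this argument can be made concrete with a small numerical sketch. It rests on assumptions that are not in the text: a logarithmic well-being function, a finite horizon standing in for the infinite sum, and made-up values for consumption and for µ. The function `net_gain` (a hypothetical name) adds up the change in well-being when generation 0 forgoes one unit of consumption and every later generation consumes µ more.

```python
import math


def net_gain(c, mu, horizon, delta=0.0, u=math.log):
    """Change in summed well-being over dates 0..horizon.

    Generation 0 consumes c - 1 instead of c; each later generation
    consumes c + mu instead of c. Future changes are weighted by
    beta**k with beta = 1 / (1 + delta); delta = 0 means no discounting.
    """
    beta = 1.0 / (1.0 + delta)
    loss_today = u(c - 1.0) - u(c)      # negative: the sacrifice
    gain_per_date = u(c + mu) - u(c)    # positive: the perpetual benefit
    future = sum(beta ** k * gain_per_date for k in range(1, horizon + 1))
    return loss_today + future


# Assumed numbers: consumption 10 at every date, a tiny return mu = 0.01.
undiscounted = [net_gain(10.0, 0.01, h) for h in (100, 1_000, 10_000)]
discounted = net_gain(10.0, 0.01, 10_000, delta=0.05)
```

Without discounting, the net gain is negative over a short horizon but grows without bound as the horizon lengthens, so however small µ is, some horizon makes the sacrifice look worthwhile, and the same reasoning then applies to the next unit forgone. With a positive discount rate the future gains sum to a finite amount, and under these assumed numbers the sacrifice is not worth making.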
5. RAWLSIAN SAVING
In the philosophical literature the only rival to Ramsey’s Utilitarian principle
of optimum saving is probably the principle of just saving in Rawls (1972).
In fact, though, Rawls doesn’t have much of a theory of just saving. The
first half of his second principle of justice, emanating from choice behind
the veil of ignorance (the ‘original position’), alludes to a just savings
principle (Rawls 1972: 302), but he gets nowhere with it (Arrow 1973;
Dasgupta 1974, 1994). For example, he writes:
10 As a matter of comparison, it should be noted that saving rates in the United Kingdom
and the United States are in the range 10–15 per cent of their gross national products.
Interestingly, the fast-growing poor countries of the world in the 1970s (Taiwan, South
Korea, and Singapore) routinely saved at rates in the range 40–5 per cent of their gross
national products.
11 This means that the rate of return on investment is µ. The example has been taken
from Arrow (1999).
The parties do not know to which generation they belong . . . Thus the persons in
the original position are to ask themselves how much they would be willing to save
. . . at any given phase of civilization with the understanding that the rates they
propose are to regulate the whole span of accumulation . . . Since no one knows to
which generation he belongs, the question is viewed from the standpoint of each
and a fair accommodation is expressed by the principle adopted. (Rawls 1972: 287)
But this says nothing of import; it is merely a requirement of
intergenerational consistency, namely, that each generation should find it
reasonable to save at the rate that was agreed upon in the original position.
But we are not told what could be expected to be agreed upon. If Rawls’s
Difference Principle, which is all-important in the rest of his book, were
applied to the saving problem, then for all consumption streams $\{C_0, C_1, C_2,
\ldots, C_\tau, \ldots\}$, the Rawlsian $V_t$ would be
(3) $V_t = \inf\{U(C_t), U(C_{t+1}), \ldots\}$,
where ‘inf’ means ‘greatest lower bound of’. The problem with this
conception is that if savings yielded a return, there would be no ethical
motivation to save: a positive rate of saving, no matter how low, would
mean that the present generation would be worse off than all future
generations, an inequity that it could prevent by not saving at all!
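Why expression (3) removes any motive to save can be seen in a toy calculation. The sketch below makes several assumptions not found in the text: a finite horizon (so that the greatest lower bound is simply a minimum), logarithmic well-being, and a stylised economy in which saving s by the first generation raises every later generation's consumption by µ times s. All names and numbers are hypothetical.

```python
import math


def rawlsian_v(stream, u=math.log):
    """Expression (3) on a finite stream: well-being of the worst-off generation."""
    return min(u(c) for c in stream)


def stream_with_saving(c, s, mu, horizon):
    """First generation consumes c - s; each of `horizon` later ones consumes c + mu*s."""
    return [c - s] + [c + mu * s] * horizon


def best_saving(c, mu, horizon, candidates):
    """Candidate saving level that maximises the Rawlsian index."""
    return max(candidates,
               key=lambda s: rawlsian_v(stream_with_saving(c, s, mu, horizon)))


# Assumed numbers: baseline consumption 10, return on saving 5 per cent.
choice = best_saving(10.0, 0.05, 50, candidates=[0.0, 0.5, 1.0, 2.0])
```

For any positive s the saving generation is itself the worst off, so the index falls below its no-saving value and the maximising choice among the candidates is s = 0, as the text observes.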
Rawls recognised the problem. So he altered the motivation assumption
of individuals and wrote:
The process of accumulation, once it is begun, and carried through, is to the good
of all subsequent generations. Each passes on to the next a fair equivalent in real
capital as defined by a just savings principle . . . Only those in the first generation do
not benefit . . . for while they begin the whole process, they do not share in the
fruits of their provision. Nevertheless, since it is assumed that a generation cares for
its immediate descendants, as fathers say care for their sons, a just savings principle . . .
would be acknowledged. (Rawls 1972: 288; emphasis added)
One could take Rawls to mean by this that generation t’s well-being depends
not only on its own consumption level, but also on its descendants’
consumption levels. Arrow (1973) and Dasgupta (1974) proved that if
parental concerns extend only to a finite number of descendants, the
Difference Principle either implies that no generation should do any saving
(this would be so if the natural concern for descendants is ‘small’), or
recommends a programme of savings and dissavings that would be revoked
by the generation following any that were to pursue it (this would be so if
the natural concern for descendants is not ‘small’). The latter would mean
that Rawlsian savings policies are intergenerationally inconsistent.
On the other hand, if parental concern were to extend to all descendants,
the Rawlsian formulation would look similar to Ramsey’s (expression (1)),
albeit with possible discounting (expression (2)). However, the infinite sum
would now represent a generation’s well-being, not intergenerational well-
being. Given that the Difference Principle is to apply, the Rawlsian
recommendation would be that the rate of saving should be zero: any
saving, whether positive or negative, would create inequity across the
generations.
In short, what Rawls has offered us is either mean-spirited (no saving at
all) or intergenerationally inconsistent. So we must look elsewhere for a
theory of just saving.
18 As noted earlier, this has been shown to be the case in simple economic models
involving exhaustible resources. See Dasgupta and Heal (1979, ch. 10).
19 For example, Heal (1998). Earlier, I called it consideration A.
Therefore, µ > 0. As in Section 3, I assume that generation t’s well-being is
an increasing function of its consumption level (Ct), but that it increases at a
diminishing rate, meaning that U′(C) is positive and U(C) is a strictly
concave function (U″(C) < 0). Define H(C) = G(U(C)). Since G(U) is an
increasing and strictly concave function also, it must be that H is an
increasing and strictly concave function of C. Thus, H′(C) > 0 and
H″(C) < 0. For expositional ease, I now focus on the question of equity
among the generations in the distribution of consumption, rather than well-
being.
The theory of inequality measures has taught us that the correct index of
the degree of concavity of H with respect to C is the absolute value of the
percentage rate of change in H′(C). Let η be that measure. Then we have
(5) $\eta(C) = -C\,H''(C)/H'(C) > 0$.
η(C) is called the elasticity of H′(C). The theory of inequality measures has
taught us that the larger is η(C), the more equality-regarding is the concept
of intergenerational well-being in expression (4). Since η is defined at each
value of C, η is a local measure, which means that in general η is a function
of C.
Now consider generation 0’s ethical problem. It has inherited from its
predecessors a wide array of capital assets. Given this inheritance and the
fact that the rate of return on capital investment is µ, it is faced with a
feasible set of consumption streams, the set labelled in Section 3.
From generation 0’s vantage, a typical consumption stream reads as (C0, C1,
..., Cτ, ...). Imagine now that (Ĉ0, Ĉ1, ..., Ĉτ, ...) is that member of the
feasible set which maximises V0, where V0 is given by expression (4), with
t = 0. Ramsey (1928), Koopmans (1965), and others have shown that the optimum
consumption stream (Ĉ0, Ĉ1, ..., Ĉτ, ...) must be a solution of the equation
(6) $\mu = \delta + \eta(C_t)\,g(C_t)$,
where g(Ct) is the percentage rate of change in consumption between the
consumption levels enjoyed by generations t and t+1.
Equation (6) is fundamental to intergenerational ethics. It has a simple
interpretation. µ is the rate of return on investment, meaning that it is, at the
margin, the percentage rate at which consumption can feasibly be exchanged
among successive generations, t and t+1 (t ≥ 0). The right-hand side of
equation (6) can be shown to be the percentage rate at which, at the margin,
it is ethically permissible to exchange consumption among the successive
generations t and t+1 (see e.g. Arrow and Kurz 1970; Dasgupta and Heal
1979). If the two expressions were not equal, an appropriate reallocation of
consumption between t and t+1 would increase V0. Therefore, the consumption
stream deemed just must satisfy equation (6), and it must satisfy the
equation for every t ≥ 0. In the language of social cost–benefit analysis, the
right-hand side of equation (6) is the social rate at which future consumption
ought to be discounted (in contrast to future well-beings, which are discounted
at the rate δ).
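The marginal reallocation argument behind equation (6) can be sketched as follows. The sketch rests on two assumptions that are not stated in this passage: that well-being in expression (4) is discounted by a factor β = 1/(1 + δ) per generation, and that one unit of consumption forgone by generation t yields 1 + µ units for generation t+1. The final step is a first-order approximation, valid when µ, δ and g are small; in a continuous-time formulation the relation holds exactly.

```latex
% Assumptions (not from the passage): \beta = 1/(1+\delta); a unit saved at t
% returns (1+\mu) units at t+1; g = g(C_t) is the growth rate of consumption.
\begin{align*}
\text{optimality:}\quad
  & \beta^{t} H'(C_t) = \beta^{t+1}(1+\mu)\,H'(C_{t+1})
    \;\Longrightarrow\;
    \frac{1+\mu}{1+\delta} = \frac{H'(C_t)}{H'(C_{t+1})},\\
\text{with } C_{t+1} = (1+g)\,C_t:\quad
  & H'(C_{t+1}) \approx H'(C_t) + H''(C_t)\,g\,C_t
    = H'(C_t)\bigl(1 - \eta(C_t)\,g\bigr),\\
\text{hence}\quad
  & 1 + \mu - \delta \approx 1 + \eta(C_t)\,g
    \;\Longrightarrow\;
    \mu \approx \delta + \eta(C_t)\,g(C_t).
\end{align*}
```

If the left-hand side exceeded the right, shifting a little consumption from t to t+1 would raise V0, and conversely; this is the reallocation the text refers to.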
There is an attractive class of functional forms of H(C) for which
equation (6) simplifies enormously. Consider the form
(7) $H(C) = B - C^{-(\eta-1)}$,
where B > 0 and η > 1 (see note 20). If H(C) satisfies formula (7), the elasticity of
H′(C), which is η(C), is independent of C. In the economics literature,
formula (7) is ubiquitous. As we see below, it offers a most instructive
laboratory for conducting thought experiments.
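That the elasticity is constant under formula (7) can be checked directly. The short derivation below takes formula (7) to read H(C) = B − C^(−(η−1)), with the exponent reconstructed from the properties the text attributes to it, and applies the definition in expression (5).

```latex
\begin{align*}
H(C)   &= B - C^{-(\eta-1)}, \qquad B > 0,\ \eta > 1,\\
H'(C)  &= (\eta-1)\,C^{-\eta} > 0,\\
H''(C) &= -\eta(\eta-1)\,C^{-\eta-1} < 0,\\
-\frac{C\,H''(C)}{H'(C)}
       &= \frac{\eta(\eta-1)\,C^{-\eta}}{(\eta-1)\,C^{-\eta}} = \eta .
\end{align*}
```

The constant B cancels in both derivatives, which is consistent with the remark in note 20 that it plays no role in the analysis.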
On using expression (7) in equation (6) and rearranging terms, we obtain
(8) $g(C_t) = (\mu - \delta)/\eta$.
For vividness, imagine that δ is chosen to be less than µ. Equation (8) tells
us that, since the right-hand side is a positive constant, justice demands that
consumption should increase at the exponential rate (µ − δ)/η. Notice
though that, as δ and η are two free ethical parameters in Koopmans’s
theory, that same growth rate would be implied by an infinite family of
(δ, η) pairs. Presumably, a concern for equity in consumption among the
generations would lead us to insist that g(Ct) should not be too large.
Otherwise, earlier generations would enjoy far lower consumption levels
than later generations. Lowering the right-hand side of equation (8) would
flatten the optimum consumption stream somewhat. But just as g(Ct) would
have a low value if, other things being equal, δ were chosen to be nearly µ,
the same low value would be realised if, other things being equal, η were
chosen to be large. This is the sense in which δ and η are ethically dual to
each other.
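The duality can be made concrete with a small numerical sketch of equation (8). The values of µ, δ and η below are hypothetical, chosen only for illustration, and the function names are not taken from the text.

```python
import math

def growth_rate(mu, delta, eta):
    """Growth rate of consumption implied by equation (8): g = (mu - delta) / eta."""
    if eta <= 1.0:
        raise ValueError("formula (7) assumes eta > 1")
    return (mu - delta) / eta

def consumption_path(c0, g, generations):
    """Consumption levels C_0, C_1, ... growing at the constant rate g."""
    return [c0 * (1.0 + g) ** t for t in range(generations)]

MU = 0.04  # hypothetical rate of return on investment

# Two different (delta, eta) pairs that imply the same growth rate.
g_a = growth_rate(MU, delta=0.02, eta=2.0)
g_b = growth_rate(MU, delta=0.01, eta=3.0)

# Two routes to a flatter stream: delta close to mu, or a large eta.
g_high_delta = growth_rate(MU, delta=0.035, eta=2.0)
g_high_eta = growth_rate(MU, delta=0.02, eta=8.0)

if __name__ == "__main__":
    print("same growth rate:", math.isclose(g_a, g_b))
    print("same flatter rate:", math.isclose(g_high_delta, g_high_eta))
    print("first three levels:", consumption_path(1.0, g_a, 3))
```

An observer who sees only the consumption stream cannot tell which pair generated it; that is the sense in which the two parameters substitute for one another.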
9. POPULATION GROWTH
As Earth is finite, changes in the size of population, when averaged over
time, must be zero over the very long run. The base case we have been
considering so far, that population size remains constant, is thus valid when
the reckoning is the very long run. But for the not so very long run,
population can be expected to change. What is the right concept of
intergenerational well-being when population size is expected to change over
time?
Two alternatives have been much discussed in the literature. Both reduce
to expression (4) if population is constant. After presenting them I
introduce a third conception, which has been shown to be the natural one to
20 The constant B plays no role, in view of what was mentioned in note 16. I have
introduced it, nonetheless, in case the reader feels that H(C) ought to be negative for very
low values of C, but positive for sufficiently large values of C.
adopt when we try to formulate the concept of sustainable development
(Dasgupta 2001). It too reduces to expression (4) if population is constant.
One alternative is to regard the well-being of a generation to be the per
capita well-being of that generation (with no allowance for the numbers
involved) and sum the per capita well-beings of all generations, possibly
using a discount rate. To formalise, imagine for simplicity that members of
the same generation are awarded the same consumption level. Let Uτ be the
well-being of the representative person in generation τ. We then have (Cass
1965; Koopmans 1965)
(9) $V_t = \sum_{\tau=t}^{\infty} G(U_\tau)\,\beta^{(\tau-t)}$, for t ≥ 0,
21 The example is taken from Meade (1955: 87–9) and Arrow and Kurz (1970: 13–14).
22 To prove this, simply maximise [N1U(C1/N1) + N2U(C2/N2)] by suitable choice of C1
and C2, subject to the constraint C1 + C2 = C.
an artefact for the problem in hand. On the other hand, if numbers don’t
count, so that social well-being is taken to be [U(C1/N1) + U(C2/N2)], the
government should distribute less to each person in the more populous
island,23 which is to say that the use of expression (9) discriminates against
more numerous generations. This simply cannot be right. Extending this
example to the case of a sequence of generations, we conclude that, of
expressions (9) and (10), it is the latter that reflects the notion of
intergenerational well-being.
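The two-island comparison can be checked numerically. The sketch below maximises both objectives, the population-weighted sum of note 22 and the unweighted sum [U(C1/N1) + U(C2/N2)], subject to C1 + C2 = C. The well-being function U(c) = −1/c, the population sizes and the total C are hypothetical choices made only for illustration; any increasing, strictly concave U should give the same qualitative result.

```python
def u(c):
    """Illustrative well-being function U(c) = -1/c: increasing and strictly concave."""
    return -1.0 / c

def argmax_concave(f, lo, hi, iters=200):
    """Ternary search for the maximiser of a strictly concave function on (lo, hi)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1  # the maximiser lies to the right of m1
        else:
            hi = m2  # the maximiser lies to the left of m2
    return 0.5 * (lo + hi)

def per_capita_split(n1, n2, total, weighted):
    """Per capita consumption on each island at the welfare-maximising split.

    weighted=True  maximises N1*U(C1/N1) + N2*U(C2/N2)  (numbers count);
    weighted=False maximises U(C1/N1) + U(C2/N2)        (numbers do not count).
    """
    w1, w2 = (n1, n2) if weighted else (1.0, 1.0)

    def welfare(c1):
        return w1 * u(c1 / n1) + w2 * u((total - c1) / n2)

    c1 = argmax_concave(welfare, 1e-9, total - 1e-9)
    return c1 / n1, (total - c1) / n2

# Hypothetical islands: 400 and 100 people sharing 1000 units of consumption.
numbers_count = per_capita_split(400.0, 100.0, 1000.0, weighted=True)
numbers_ignored = per_capita_split(400.0, 100.0, 1000.0, weighted=False)

if __name__ == "__main__":
    print("numbers count:  ", numbers_count)
    print("numbers ignored:", numbers_ignored)
```

With numbers counted, per capita consumption is equalised across the islands; with numbers ignored, each person on the more populous island receives less, which is the discrimination the text objects to.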
Expression (10) measures total (discounted) well-being. But there is
another formulation of the concept of intergenerational well-being which is
equally compelling. It is the average well-being of all who are to appear on the
scene:
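(11) $V_t = \Bigl(\sum_{\tau=t}^{\infty} N_\tau\,G(U_\tau)\,\beta^{(\tau-t)}\Bigr) \Big/ \Bigl(\sum_{\tau=t}^{\infty} N_\tau\,\beta^{(\tau-t)}\Bigr)$,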
10. UNCERTAINTY
How should uncertainty be accommodated? The theory of choice under
uncertainty, in its normative guise, is called the expected utility theory. There is
a large and still-growing experimental literature attesting to the fact that in
laboratory conditions people don’t choose in accordance with the theory.24
But here we are concerned with normative questions. That the choices we
make in the laboratory don’t conform to expected utility theory does not
mean that the theory is not the correct ethical basis for evaluating the policy
alternatives Government House faces.
When applied to the valuation of uncertain well-being streams,
probabilities are imputed to future events. The probabilities are taken to be
subjective, such as those involving long-range climate, although there can be
objective components, such as those involving the weather. Let Et denote
generation t’s expectation. Imagine once again that population remains
constant. Intergenerational well-being can then be expressed as
(12) $E_t\Bigl(\sum_{\tau=t}^{\infty} G(U_\tau)\,\beta^{(\tau-t)}\Bigr)$, for t ≥ 0,
25 The Poisson process is often invoked by economists because of its simplicity—a large
asteroid hitting Earth is a possible interpretation; but there is little else to commend it.
environmental damage than to repair it subsequently. The theory gives
expression to the demand that, in evaluating radically new technology (e.g.
biotechnology), the burden of proof ought to shift away from those who
advocate protection from environmental damage, to those supporting the
new technology.
But these are early days for such theories as Bewley’s. The problem is
that they can be supremely conservative. Admittedly, even the expected
utility theory can be made ultra-conservative if we adopt an infinite aversion
to risk (which is to say that the elasticity of G(U) in expression (12) is
infinity), and imagine that the worst that can happen under any change in
policy is worse than the worst that can happen under the status quo. But it
is difficult to justify such an attitude: we wouldn’t adopt it even in our
personal lives. At the moment we don’t have a theory, normative or
otherwise, that covers long-term environmental uncertainties in a
satisfactory way.
These are some reasons why the expected utility theory remains a popular
framework for evaluating policy options. In practical decision-making,
though, short cuts have to be made. Simple rules of thumb are often
followed in the choice of public policy, for example, setting interest rates so
as to keep the rate of inflation from exceeding, say, m per cent per year. But
the expected utility theory remains the anchor for reasoning about economic
policies. If the probabilities of disasters under radically new processes and
products are non-negligible, the expected utility theory recommends
caution. The theory stresses trade-offs, it asks us to articulate our attitude to
risk, and it forces us to deliberate on the likelihood of various outcomes.
For the moment, it is the only plausible game in town.26
11. CONCLUSIONS
In this chapter I have argued that the formal apparatus Frank Ramsey
introduced to give shape to the question ‘How much of its income should a
nation save?’ can be given a far wider interpretation than the one he gave to
it. Ramsey’s ethics was overtly Utilitarian. Nearly five decades of work by
economists working on the ethics of the long run has shown that that ethics
will not do. It has also shown that, agreeably, there is a compelling ethical
theory that has the same mathematical structure as the one invented by
Ramsey. So, although Ramsey’s ethics cannot be accepted, the techniques he
devised for evaluating the just rate of saving can be adapted for use in
worlds that are ethically far richer than the one he considered.
26 Alternatives to the expected utility theory were much explored during the 1950s. See
Luce and Raiffa (1957, ch. 3) for an axiomatic classification of such theories.
REFERENCES
If a reference contains two dates, the first is that of the first publication (or,
in Ramsey’s case, of composition, where known), the second that of the
publication cited.