VOLUME 86
Edited by
Martin R. Jones
and
Nancy Cartwright
ISSN 0303-8157
ISBN: 90-420-1955-7
Editions Rodopi B.V., Amsterdam - New York, NY 2005
Printed in The Netherlands
Science at its best seeks most to keep us in this simplified, thoroughly
artificial, suitably constructed and suitably falsified world . . . willy-nilly,
it loves error, because, being alive, it loves life.
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Analytical Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Kevin D. Hoover, Quantitative Evaluation of Idealized Models in the
New Classical Macroeconomics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
John Pemberton, Why Idealized Models in Economics Have Limited Use. . 35
Amos Funkenstein, The Revival of Aristotle's Nature . . . . . . . . . . . . . . . . . . 47
James R. Griesemer, The Informational Gene and the Substantial Body:
On the Generalization of Evolutionary Theory by Abstraction . . . . . . . 59
Nancy J. Nersessian, Abstraction via Generic Modeling in Concept
Formation in Science . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Margaret Morrison, Approximating the Real: The Role of Idealizations
in Physical Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Martin R. Jones, Idealization and Abstraction: A Framework . . . . . . . . . . . 173
David S. Nivison, Standard Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
James Bogen and James Woodward, Evading the IRS . . . . . . . . . . . . . . . . . 233
M. Norton Wise, Realism Is Dead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
Ronald N. Giere, Is Realism Dead? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
PREFACE
This volume was conceived over a dozen years ago in Stanford, California, when
Cartwright and Jones were both grappling, each for their own reasons, with
questions about idealization in science. There was already a rich European
literature on the subject, much of it represented in this series; and there was a
growing interest in the United States, prompted in part by new work on
approximation and in part by problems encountered within the American
versions of the semantic view of theories about the fit of models to the world.
At the time, tuned to questions about idealization and abstraction, we began
to see elements of the problem and lessons to be learned about it everywhere,
in papers and lectures on vastly different subjects, from Chinese calendars
to nineteenth century electrodynamics to options pricing. Each, usually
unintentionally and not explicitly, gave some new slant, some new insight, into
the problems about idealization, abstraction, approximation, and modeling; and
we thought it would be valuable to the philosophical and scientific communities
to make these kinds of discussions available to all, specifically earmarked as
discussions that can teach us about idealization. This volume is the result.
We asked the contributors to write about some aspect of idealization or
abstraction in a subject they were studying: idealization as they found it
useful to think about it, without trying to fit their discussion into some
categories already available. As a consequence, most of the authors in this
volume are not grappling directly with the standard philosophical literature on
the problems; indeed, several are not philosophers: we have a distinguished
historian of ideas who specialized in the medieval period, a renowned historian
of physics, an eminent economic methodologist, and an investment director for
one of the largest insurance companies in the world.
The volume has unfortunately been a long time in the making; many of the
papers were written a decade ago. We apologize to the authors for this long
delay. Still, for thinking about abstraction and idealization, the material remains
fresh and original. We hope that readers of this volume will find this diverse
collection as rewarding a source of new ideas and new materials as we ourselves
have.
Two of the papers in particular need to be set in context. Norton Wise's
contribution, "Realism Is Dead," was given at the 1989 Pacific Division
Meetings of the American Philosophical Association, in Berkeley, California, as
part of an "Author Meets the Critics" session devoted to Ronald Giere's book
Explaining Science, which was hot off the presses at the time. (The other
idealization and abstraction in both models and laws, degrees of idealization and
abstraction, and idealization and abstraction as processes. Relations to the work
of Cartwright and McMullin. Three ways in which idealization can occur in laws
and our employment of them: there are quasi-laws, idealized laws, and ideal
laws.
Ancient Chinese astronomy and calendary divided time in simple, idealized ways
with respect to, for example, lunar cycles, the seasons, and the period of Jupiter.
The resulting schemes do not fit the data exactly, and astronomers employed
complex rules of adjustment to correct for the mismatch. Nonetheless, the
simpler picture was often treated as ideally true, and as capturing an order which
was supposed to underlie the untidiness of observed reality.
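The kind of mismatch described here can be made concrete with modern mean values. The figures below, and the 19-year cycle with 7 intercalary months (which the Chinese zhang cycle shares with the Greek Metonic cycle), are standard astronomical facts used for illustration, not drawn from Nivison's paper:

```python
# Idealized scheme: 12 lunar months per year. Mean astronomical values
# show the mismatch that forced complex rules of adjustment
# (intercalary months) on ancient calendar-makers.
tropical_year = 365.2422       # days
synodic_month = 29.5306        # days

schematic_year = 12 * synodic_month               # about 354.4 days
drift_per_year = tropical_year - schematic_year   # about 10.9 days

# A 19-year cycle with 7 intercalary months (19 years = 235 months)
# nearly closes the gap:
cycle_error = 19 * tropical_year - 235 * synodic_month
print(round(drift_per_year, 1), round(cycle_error, 2))
```

The residual error of under a tenth of a day per 19 years is exactly the sort of remaining "untidiness" that further ad hoc corrections had to absorb.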
Critique of, and an alternative to, the view that the epistemic bearing of
observational evidence on theory is best understood by examining Inferential
Relations between Sentences (IRS). Instead, we should attend to empirical
facts about particular causal connections and about the error characteristics of
detection processes. Both the general and the local reliability of detection
procedures must be evaluated. Case studies, including attempts to confirm
General Relativity by observing the bending of light, and by detecting gravity
waves.
The new classical macroeconomics is today certainly the most coherent, if not
the dominant, school of macroeconomic thought. The pivotal document in its
two decades of development is Robert Lucas's 1976 paper, "Econometric Policy
Evaluation: A Critique."1 Lucas argued against the then-reigning methods of
evaluating the quantitative effects of economic policies on the grounds that the
models used to conduct policy evaluation were not themselves invariant with
respect to changes in policy.2 In the face of the Lucas critique, the new classical
economics is divided in its view of how to conduct quantitative policy analysis
between those who take the critique as a call for better methods of employing
theoretical knowledge in the direct empirical estimation of macroeconomic
models, and those who believe that it shows that estimation is hopeless and that
quantitative assessment must be conducted using idealized models. Assessing
the soundness of the views of the latter camp is the main focus of this essay.
1. See Hoover (1988), (1991) and (1992) for accounts of the development of new classical thinking.
2. The Lucas critique, as it is always known, is older than Lucas's paper, going back at least to the work of Frisch and Haavelmo in the 1930s; see Morgan (1990), Hoover (1994).
In: Martin R. Jones and Nancy Cartwright (eds.), Idealization XII: Correcting the Model.
Idealization and Abstraction in the Sciences (Poznań Studies in the Philosophy of the Sciences and
the Humanities, vol. 86), pp. 15-33. Amsterdam/New York, NY: Rodopi, 2005.
16 Kevin D. Hoover
3. See Morgan (1990), ch. 6, for a history of this problem.
Quantitative Evaluation of Idealized Models in the New Classical Macroeconomics 17
4. For the history of autonomy, see Morgan (1990), chs. 4 and 8.
5. See Sheffrin (1983) for a general account of the rational expectations hypothesis, and Hoover (1988), ch. 1, for a discussion of the weaknesses of the hypothesis.
6. Calibration is not unique to the new classical macroeconomics, but is well established in the context of computable general equilibrium models common in the analysis of taxation and international trade; see Shoven and Whalley (1984). All the methodological issues that arise over calibration of new classical macromodels must arise with respect to computable general equilibrium models as well.
evaluations of policies. The central issue now before us is: can models which
clearly do not fit the data be useful as quantitative guides to policy? Who is
right: Lucas and Prescott, or Hansen and Sargent?
7. Cf. Diamond (1984), p. 47.
8. They do not say, however, whether it is actually written in FORTRAN.
9. One might also include Ct-1, because, even though it is the lagged value of Ct (the output), it may be thought of as being stored within the model as time progresses.
Simon factors adaptive systems into goals, outer environments, and inner
environments. The relative independence of the outer and inner environments
means that
[w]e might hope to characterize the main properties of the system and its behavior
without elaborating the detail of either the outer or the inner environments. We might
look toward a science of the artificial that would depend on the relative simplicity of
the interface as its primary source of abstraction and generality (Simon 1969, p. 9).
Simon's views reinforce Lucas's discussion of models. A model is useful
only if it foregoes descriptive realism and selects limited features of reality to
reproduce. The assumptions upon which the model is based do not matter, so
long as the model succeeds in reproducing the selected features. Friedman's "as
if" methodology appears vindicated.
But this is to move too fast. The inner environment is only relatively
independent of the outer environment. Adaptation has its limits.
In a benign environment we would learn from the motor only what it had been called
upon to do; in a taxing environment we would learn something about its internal
structure; specifically, about those aspects of the internal structure that were chiefly
instrumental in limiting performance (Simon 1969, p. 13).
This is a more general statement of principles underlying Lucas's (1976)
critique of macroeconometric models. A benign outer environment for
econometric models is one in which policy does not change. Changes of policy
produce structural breaks in estimated equations: disintegration of the inner
environment of the models. Economic models must be constructed like a ship's
chronometer, insulated from the outer environment so that
. . . it reacts to the pitching of the ship only in the negative sense of maintaining an
invariant relation of the hands on its dial to real time, independently of the ship's
motions (Simon 1969, p. 9).
Insulation in economic models is achieved by specifying functions whose
parameters are invariant to policy.
Again, this is easily clarified with the simple consumption model. If the
parameter were fixed, as it might be in a stable environment in which government
policy never changed, then equation (1) might yield an acceptable model of
consumption. But in a world in which government policy changes, the parameter will also
change constantly (the ship pitches, tilting the compass that is fastened securely
to the deck). The role of equation (3) is precisely to isolate the model from
changes in the outer environment by rendering it a stable function of changing
policy; the parameter changes, but in predictable and accountable ways (the compass
mounted in a gimbal continually turns relative to the deck in just such a way as
to maintain its orientation with the earth's magnetic field).
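The gimbal analogy can be given a rough numerical sketch. The code below is not Hoover's consumption model (his equations (1)-(3) are not reproduced in this excerpt); it is a generic permanent-income-style example, with an invented deep parameter k, showing how a reduced-form coefficient shifts with the policy regime while k itself stays fixed:

```python
import numpy as np

rng = np.random.default_rng(2)
k = 0.9   # deep parameter: consumption out of permanent income (invariant)

def simulate(p_permanent, n=2000):
    """Income = permanent + transitory shock; the policy regime sets
    what share of income variation is permanent."""
    perm = rng.normal(0, 1, n) * np.sqrt(p_permanent)
    trans = rng.normal(0, 1, n) * np.sqrt(1 - p_permanent)
    y = perm + trans
    c = k * perm                # behavior built on the invariant k
    # Reduced-form regression slope of consumption on current income:
    return np.cov(c, y)[0, 1] / np.var(y)

# The reduced-form "marginal propensity to consume" is not invariant:
mpc_regime1 = simulate(0.9)    # roughly k * 0.9
mpc_regime2 = simulate(0.2)    # roughly k * 0.2
print(round(mpc_regime1, 2), round(mpc_regime2, 2))
```

An econometric model built on the regression slope breaks when policy changes; a model built on k, plus a rule relating the slope to the regime, remains stable, which is the chronometer-in-gimbals point.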
The independence of the inner and outer environments is not something that
is true of arbitrary models; rather it must be built into models. While it may be
enough in hostile environments for models to reproduce key features of the
outer environment as if reality were described by their inner environments, it
is not enough if they can do this only in benign environments. Thus, for Lucas,
the as if methodology interpreted as an excuse for complacency with respect
to modeling assumptions must be rejected.
New classical economists argue that only through carefully constructing the
model from invariants (tastes and technology, in Lucas's usual phrase) can the
model secure the benefits of a useful abstraction and generality. This is again an
appeal to found macroeconomics in standard microeconomics. Here preferences
and the production possibilities (tastes and technology) are presumed to be
fixed, and the economic agents' problem is to select the optimal combination of
inputs and outputs. Tastes and technology are regarded as invariant partly
because economists regard their formation as largely outside the domain of
economics: de gustibus non est disputandum. Not all economists, however,
would rule out modeling the formation of tastes or technological change. But for
such models to be useful, they would themselves have to have parameters to
govern the selection among possible preference orderings or the evolution of
technology. These parameters would be the ultimate invariants from which a
model immune to the Lucas critique would have to be constructed.
perhaps, covered in rubbish. To break the code or clear the rubbish, the
experimenter must either insulate the experiment from countervailing effects or
must account for and, in essence, subtract away the influence of countervailing
effects.10 What is left at the end of the well-run experiment is a measurement
supporting the law, a quantification of the law. Despite its tenuousness in
laboratory practice, the quantified law remains of the utmost importance.
To illustrate, consider an experiment in introductory physics. To investigate
the behavior of falling objects, a metal weight is dropped down a vertical track
lined with a paper tape. An electric current is periodically sent through the track.
The arcing of the electricity from the track to the weight burns a sequence of holes
in the tape. These mark out equal times, and the experimenter measures the
distances between the holes to determine the relation between time and distance.
This experiment is crude. When, as a college freshman, I performed it, I
proceeded as a purely empirically minded economist might in what I thought
was a true scientific spirit: I fitted the best line to the data. I tried linear, log-
linear, exponential and quadratic forms. Linear fit best, and I got a C on the
experiment. The problem was not just that I did not get the right answer,
although I felt it unjust at the time that scientific honesty was not rewarded. The
problem was that many factors unrelated to the law of gravity combined to mask
its operation; conditions were far from ideal. A truer scientific method would
attempt to minimize or take account of those factors. Thus, had I not regarded
the experiment as a naive attempt to infer the law of gravity empirically, but as
an attempt to quantify a parameter in a model, I would have gotten a better
grade. The model says that distance under the uniform acceleration of gravity is
gt²/2. I suspect that, given calculable margins of error, not only would my
experiment have assigned a value to g, but that textbook values of g would have
fallen within the range of experimental error despite the apparent better fit of
the linear curve.11
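Hoover's point, that constraining the data to the gravitational model quantifies g even when free curve-fitting prefers a line, can be sketched with simulated tape data. The drag term and noise level below are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

g_true = 9.81                  # m/s^2, the quantity the experiment targets
t = np.arange(1, 11) * 0.1     # ten spark marks at equal 0.1 s intervals
d_ideal = 0.5 * g_true * t**2  # idealized law: d = g t^2 / 2

# Non-ideal conditions: a friction-like drag plus measurement noise
d = d_ideal * (1 - 0.05 * t) + rng.normal(0, 0.002, t.size)

# "Purely empirical" approach: fit the best free curve (here, a line)
lin = np.polyfit(t, d, 1)      # slope and intercept of the best line

# Model-based approach: treat the data as quantifying g in d = g t^2 / 2,
# by least squares the estimate is 2 * sum(d t^2) / sum(t^4)
g_hat = 2 * np.sum(d * t**2) / np.sum(t**4)

print(round(g_hat, 2))         # close to the textbook value of g
```

Even with the masking factors pulling the raw fit toward linearity, the constrained estimate lands within a few percent of the textbook g, which is the sense in which the idealized law "must be quantified."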
Even though the data had to be fudged and discounted to force it into the
mold of the gravitational model, it would have been sensible to do so, because
we know from unrelated experiments (the data from which also had to be
fudged and discounted) that the quadratic law is more general. The law is right,
and must be quantified, even though it is an idealization.
Confirmation by practical application is important, although sometimes the
confirmation is exceedingly indirect. Engineers would often not know where to
begin if they did not have quantified physical laws to work with. But laws as
10. Cartwright (1989, secs. 2.3, 2.4), discusses the logic and methods of accounting for such countervailing effects.
11. An explicit analog to this problem is found in Sargent (1989), in which he shows that the presence of measurement error can make an investment-accelerator model of investment, which is incompatible with new classical theory, fit the data better, even when the data were in fact generated according to Tobin's q-theory, which is in perfect harmony with new classical theory.
It is hard to find them in nature and we are always having to make excuses for them:
why they have exceptions big or little; why they only work for models in the head;
why it takes an engineer with a special knowledge of materials and a not too literal
mind to apply physics to reality (Cartwright 1989, p. 8).
Neither do formal models constitute all of economics. Yet despite the
shortcomings of idealized laws, we know from practical applications, such as
shooting artillery or sending rockets to the moon, that calculations based on the
law of gravity get it nearly right and calculations based on linear extrapolation
go hopelessly wrong.
Cartwright (1989, ch. 4) argues that capacities are more fundamental than
laws. Capacities are the invariant dispositions of the components of reality.
Something like the notion of capacities must lie behind the proposal to set up
elasticity banks to which researchers could turn when calibrating computable
general equilibrium models (Shoven and Whalley 1984, p. 1047). An elasticity
12. In a regression of the logarithm of one variable on the logarithms of others, the elasticities can be read directly as the value of the estimated coefficients.
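The log-log regression fact noted in footnote 12 can be verified with simulated data; the constant-elasticity demand function and its parameter values below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Constant-elasticity demand: Q = A * P^eps, with eps the price elasticity
eps_true = -0.7
P = rng.uniform(1.0, 10.0, 500)
Q = 100.0 * P**eps_true * np.exp(rng.normal(0, 0.05, P.size))

# Regress log Q on an intercept and log P: the slope on log P can be
# read directly as the elasticity
X = np.column_stack([np.ones(P.size), np.log(P)])
beta, *_ = np.linalg.lstsq(X, np.log(Q), rcond=None)

print(round(beta[1], 2))   # approximately -0.7
```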
13. For a full discussion of the relationship between new classical and Austrian economics, see Hoover (1988), ch. 10.
14. In Hoover (1984, pp. 64-66) and Hoover (1988, pp. 218-220), I refer to this as the Cournot problem, since it was first articulated by Augustin Cournot ([1838] 1927, p. 127).
15. Some economists reserve the term "representative-agent models" for models with a single, infinitely-lived agent. In a typical overlapping-generations model the new young are born at the start of every period, and the old die at the end of every period, and the model has infinitely many periods; so there are infinitely many agents. On this view, the overlapping-generations model is not a representative-agent model. I, however, regard it as one, because within any period one type of young agent and one type of old agent stand in for the enormous variety of people, and the same types are repeated period after period.
aggregation requires not only that every economic agent have identical
preferences but that these preferences are such that any individual agent would
like to consume goods in the same ratios whatever their levels of wealth. The
reason is straightforward: if agents with the same wealth have different
preferences, then a transfer from one to the other will leave aggregate wealth
unchanged but will change the pattern of consumption and possibly aggregate
consumption as well; if all agents have identical preferences but prefer different
combinations of goods when rich than when poor, transfers that make some
richer and some poorer will again change the pattern of consumption and
possibly aggregate consumption as well (Gorman 1953). The slightest reflection
confirms that such conditions are never fulfilled in an actual economy.
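A minimal numerical illustration of the second failure mode (identical preferences whose consumption pattern varies with wealth) may help; the satiation-style preferences below are invented purely for illustration:

```python
# Identical, non-homothetic preferences: each agent spends income on
# good 1 up to a satiation level of 1, and the remainder on good 2.
def demand_good1(wealth):
    return min(wealth, 1.0)

wealth_before = [0.5, 1.5]     # two agents, aggregate wealth = 2.0
wealth_after = [1.0, 1.0]      # transfer of 0.5; aggregate wealth unchanged

agg_before = sum(demand_good1(w) for w in wealth_before)  # 0.5 + 1.0 = 1.5
agg_after = sum(demand_good1(w) for w in wealth_after)    # 1.0 + 1.0 = 2.0

# Same aggregate wealth, different aggregate consumption of good 1:
# exact aggregation fails, so no single representative agent can
# rationalize both distributions.
print(agg_before, agg_after)   # 1.5 2.0
```

The transfer leaves aggregate wealth at 2.0 yet moves aggregate demand for good 1 from 1.5 to 2.0, which is exactly the sensitivity to distribution that the Gorman conditions rule out.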
New classical macroeconomists insist on general equilibrium models. A
fully elaborated general equilibrium model would represent each producer and
each consumer and the whole range of goods and financial assets available in
the economy. Agents would be modeled as making their decisions jointly so
that, in the final equilibrium, production and consumption plans are individually
optimal and jointly feasible. Such a detailed model is completely intractable.
The new classicals usually obtain tractability by repairing to representative-
agent models, modeling a single worker/consumer, who supplies labor in
exchange for wages, and a single firm, which uses this labor to produce a single
good that may be used indifferently for consumption or as a capital input into
the production process. Labor, consumption, and capital are associated
empirically with their aggregate counterparts. Although these models omit most
of the details of the fully elaborated general equilibrium model, they nonetheless
model firms and worker/consumers as making individually optimal and jointly
consistent decisions about the demands for and supplies of labor and goods.
They remain stripped-down general equilibrium models.
One interpretation of the use of calibration methods in macroeconomics is
that the practitioners recognize that highly aggregated, theoretical
representative-agent models must be descriptively false, so that estimates of
them are bound to fit badly in comparison to atheoretical (phenomenal)
econometric models. The theoretical models are nonetheless to be preferred
because useful policy evaluation is possible only within tractable models. In
this, they are exactly like Lucas's benchmark consumption model (see section
III above). Calibrators appeal to microeconomic estimates of key parameters
because information about individual agents is lost in the aggregation process.
In general, these microeconomic estimates are not obtained using methods that
impose the discipline of individual optimality and joint feasibility implicit in the
general equilibrium model. Lucas (1987, pp. 46, 47) and Prescott (1986, p. 15)
argue that the strength of calibration is that it uses multiple sources of
information, supporting the belief that it is structured around true invariants. Again this
16. A notable, non-new classical attempt to derive macroeconomic behavior from microeconomic behavior with appropriate aggregation assumptions is Durlauf (1989).
17. Prescott (1983, p. 12), seems, oddly, to claim that the inability of a model to account for some real events is a positive virtue; in particular, that the inability of real-business-cycle models to account for the Great Depression is a point in their favor. He writes: "If any observation can be rationalized with some approach, then that approach is not scientific." This seems to be a confused rendition of the respectable Popperian notion that a theory is more powerful the more things it rules out. But one must not mistake the power of a theory for its truth. Aside from issues of tractability, a theory that rationalizes only and exactly those events that actually occur, while ruling out exactly those events that do not occur, is the perfect theory. In contrast, Prescott seems inadvertently to support the view that the more exceptions, the better the rule.
18. Watson (1993) develops a goodness-of-fit measure for calibrated models. It takes into account that, since idealization implies differences between model and reality that may be systematic, the errors-in-variables and errors-in-equations statistical models are probably not appropriate.
invalid, but useful if they reveal theoretically interpretable facts about the world
and not useful if they do not. Econometrics as measurement treats econometric
procedures as direct measurements of theoretically articulated structures. This
view is the classic Cowles Commission approach to structural estimation that
concentrates on testing identified models specified from a priori theory.19
Many new classicals, such as Cooley and LeRoy (1985) and Sargent (1989),
advocate econometrics as measurement. From a fundamental new classical
perspective, they seem to have drawn the wrong lesson from the Lucas critique.
Recall that the Lucas critique links traditional econometric concerns about
identification and autonomy. New classical advocates of econometrics as
measurement overemphasize identification. Identification is achieved through
prior theoretical commitment. The only meaning they allow for theory is
general equilibrium microeconomics. Because such theory is intractable, they
repair to the representative-agent model. Unfortunately, because of the failure of
the conditions for exact aggregation to obtain, the representative-agent model
does not represent the actual choices of any individual agent. The representa-
tive-agent model applies the mathematics of microeconomics, but in the context
of econometrics as measurement it is only a simulacrum of microeconomics.
The representative-agent model does not solve the aggregation problem; it
ignores it. There is no reason to think that direct estimation will capture an
accurate measurement of even the average behavior of the individuals who
make up the economy. In contrast, calibrators use the representative-agent
model precisely to represent average or typical behavior, but quantify that
behavior independently of the representative-agent model. Thus, while it is
problematic at the aggregate level, calibration can use econometrics as measure-
ment, when it is truly microeconometric the estimation of fundamental
parameters from cross-section or panel data sets.
Calibrators want their models to mimic the behavior of the economy; but
they do not expect economic data to parameterize those models directly.
Instead, they are likely to use various atheoretical statistical techniques to
establish facts about the economy that they hope their models will ultimately
imitate. Kydland and Prescott (1990, pp. 3, 4) self-consciously advocate a
modern version of Burns and Mitchell's "measurement without theory," i.e.,
econometrics as observation. Econometrics as observation does not attempt to
quantify fundamental invariants. Instead it repackages the facts already present
in the data in a manner that a well-calibrated model may successfully explain.
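The two-step procedure described here (first establish atheoretical facts, then calibrate the model to mimic them) can be sketched as follows; the toy model, its smoothing parameter, and the chosen moment are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Step 1 (econometrics as observation): establish an atheoretical fact,
# here the volatility of consumption relative to output in the "data".
y_data = rng.normal(0, 1, 1000)
c_data = 0.5 * y_data + rng.normal(0, 0.1, 1000)
fact = np.std(c_data) / np.std(y_data)

# Step 2 (calibration): choose the model's smoothing parameter so that
# the model reproduces that fact, rather than estimating it by
# regression on the aggregate data.
def model_moment(smoothing, n=5000):
    y = rng.normal(0, 1, n)
    c = (1 - smoothing) * y          # toy model: consumption smooths output
    return np.std(c) / np.std(y)

grid = np.linspace(0.0, 0.99, 100)
best = min(grid, key=lambda s: abs(model_moment(s) - fact))
print(round(best, 2))   # the smoothing value chosen to mimic the fact
```

The data here are used only to supply a target moment; no parameter is directly estimated from them, which is the sense in which calibration "repackages the facts" for the model to imitate.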
19. For a general history of the Cowles Commission approach, see Epstein (1987), ch. 2.
7. Conclusion
20. E.g., Simon (1969, p. 33) writes: "What do these experiments tell us? First, they tell us that human beings do not always discover for themselves clever strategies that they could readily be taught (watching a chess master play a duffer should also convince us of that)."
21. Favero and Hendry (1989) reject the practical applicability of the Lucas critique for the demand for money in the U.K.; Campos and Ericsson (1988) reject it for the consumption function in Venezuela.
different conclusions (e.g., that the universe expands forever or that it expands
and then collapses). Equally, the same values, given the range of competing
models, may result in very different conclusions. Nevertheless, we may all agree
on the form that answers to cosmological or economic questions must take,
without agreeing on the answers themselves.*
Kevin D. Hoover
Department of Economics
University of California, Davis
kdhoover@ucdavis.edu
REFERENCES
Altug, S. (1989). Time-to-Build and Aggregate Fluctuations: Some New Evidence. International
Economic Review 30, 889-920.
Campos, J. and Ericsson, N. R. (1988). Econometric Modeling of Consumers' Expenditure in
Venezuela. Board of Governors of the Federal Reserve System International Finance
Discussion Paper, no. 325.
Cartwright, N. (1983). How the Laws of Physics Lie. Oxford: Clarendon Press.
Cartwright, N. (1989). Nature's Capacities and their Measurement. Oxford: Clarendon Press.
Cooley, T. F. and LeRoy, S. F. (1985). Atheoretical Macroeconometrics: A Critique. Journal of
Monetary Economics 16, 283-308.
Cournot, A. ([1838] 1927). Researches into the Mathematical Principles of the Theory of Wealth.
Translated by Nathaniel T. Bacon. New York: Macmillan.
Diamond, P. A. (1984). A Search-Equilibrium Approach to the Micro Foundations of Macro-
economics: The Wicksell Lectures, 1982. Cambridge, Mass.: MIT Press.
Durlauf, S. N. (1989). Locally Interacting Systems, Coordination Failure, and the Behavior of
Aggregate Activity. Unpublished typescript, November 5th.
Epstein, R. J. (1987). A History of Econometrics. Amsterdam: North-Holland.
Favero, C. and Hendry, D. F. (1989). Testing the Lucas Critique: A Review. Unpublished
typescript.
Friedman, M. (1953). The Methodology of Positive Economics. In: Essays in Positive Economics.
Chicago: Chicago University Press.
Friedman, M. (1957). A Theory of the Consumption Function. Princeton: Princeton University
Press.
Gorman, W. M. (1953). Community Preference Fields. Econometrica 21, 63-80.
Hansen, L. P. and Sargent, T. J. (1980). Formulating and Estimating Dynamic Linear Rational
Expectations Models. In R. E. Lucas, Jr. and T. J. Sargent (eds.), Rational Expectations and
Econometric Practice. London: George Allen & Unwin.
Hoover, K. D. (1984). Two Types of Monetarism. Journal of Economic Literature 22, 58-76.
Hoover, K. D. (1988). The New Classical Macroeconomics: A Skeptical Inquiry. Oxford:
Blackwell.
* I thank Thomas Mayer, Kevin Salyer, Judy Klein, Roy Epstein, Nancy Cartwright, and Steven Sheffrin for many helpful comments on an earlier version of this paper.
Hoover, K. D. (1991). Scientific Research Program or Tribe? A Joint Appraisal of Lakatos and the
New Classical Macroeconomics. In M. Blaug and N. de Marchi (eds.), Appraising Economic
Theories: Studies in the Application of the Methodology of Research Programs. Aldershot:
Edward Elgar.
Hoover, K. D. (1992). Reflections on the Rational Expectations Revolution in Macroeconomics.
Cato Journal 12, 81-96.
Hoover, K. D. (1994). Econometrics as Observation: The Lucas Critique and the Nature of
Econometric Inference. Journal of Economic Methodology 1, 65-80.
Kydland, F. E. and Prescott, E. C. (1982). Time to Build and Aggregate Fluctuations. Econometrica
50, 1345-1370.
Kydland, F. E. and Prescott, E. C. (1990). Business Cycles: Real Facts and a Monetary Myth.
Federal Reserve Bank of Minneapolis Quarterly Review 14, 3-18.
Kydland, F. E. and Prescott, E. C. (1991). The Econometrics of the General Equilibrium Approach
to Business Cycles. Scandinavian Journal of Economics 93, 161-78.
Lucas, R. E., Jr. (1976). Econometric Policy Evaluation: A Critique. Reprinted in Lucas (1981).
Lucas, R. E., Jr. (1980). Methods and Problems in Business Cycle Theory. Reprinted in Lucas
(1981).
Lucas, R. E., Jr. (1981). Studies in Business-Cycle Theory. Oxford: Blackwell.
Lucas, R. E., Jr. (1987). Models of Business Cycles. Oxford: Blackwell.
Morgan, M. S. (1990). The History of Econometric Ideas. Cambridge: Cambridge University Press.
Prescott, E. C. (1983). Can the Cycle be Reconciled with a Consistent Theory of Expectations? or
A Progress Report on Business Cycle Theory. Federal Reserve Bank of Minneapolis Research
Department Working Paper, No. 239.
Prescott, E. C. (1986). Theory Ahead of Business Cycle Measurement. Federal
Reserve Bank of Minneapolis Quarterly Review 10, 9-22.
Sargent, T. J. (1989). Two Models of Measurements and the Investment Accelerator. Journal of
Political Economy 97, 251-287.
Sheffrin, S. M. (1983). Rational Expectations. Cambridge: Cambridge University Press.
Shoven, J. B. and Whalley, J. (1984). Applied General-equilibrium Models of Taxation and
International Trade. Journal of Economic Literature 22, 1007-1051.
Simon, H. A. (1969). The Sciences of the Artificial. Cambridge, Mass.: The MIT Press.
Watson, M. W. (1993). Measures of Fit for Calibrated Models. Journal of Political Economy 101,
1011-41.
This page intentionally left blank
Why Idealized Models in Economics Have Limited Use

John Pemberton
1. Introduction
This paper divides idealized models into two classes, causal and non-causal,
according to whether the idealized model represents causes or not. Although
the characterization of a causal idealized model may be incomplete, it is
sufficiently well-defined to ensure that idealized models specified using
restrictive antecedent clauses are non-causal. The contention of this paper is that
whilst such idealized models are commonly used in economics, they are
unsatisfactory; they do not predict reliably. Moreover, notions of causation that
cut across such models are required to suggest when the idealized model will
provide a sufficiently good approximation and when it will not.
Doubt is cast on the ability of simple causal idealized models to capture
sufficiently the causal complexity of economics in such a way as to provide
useful predictions.
The causalist philosophical standpoint of this paper is close to that of Nancy
Cartwright.
For the purposes of this paper a causal idealized model is an idealized model
that rests on simple idealized causes. The idealized models represent causes, or
the effects of causes, that operate in reality.
The inverse square law of gravitational attraction has an associated idealized
model consisting of forces operating on point masses, and this is a causal
idealized model by virtue of the fact that the forces of the model are causes. The
standard idealized models of the operation of a spring, a pendulum or an ideal
gas are causal by virtue of the fact that there are immediately identifiable causes
that underpin these models. Cartwright shows, in Nature's Capacities and their
In: Martin R. Jones and Nancy Cartwright (eds.), Idealization XII: Correcting the Model.
Idealization and Abstraction in the Sciences (Poznań Studies in the Philosophy of the Sciences and
the Humanities, vol. 86), pp. 35-46. Amsterdam/New York, NY: Rodopi, 2005.
way the validity of the conclusion which must stand or fall in accordance with
available empirical evidence.
Nagel's example of such an assumption is that firms behave "as if they were
seeking rationally to maximize returns" (1963, p. 215). The assumption
concerns the behavior of firms but does not require that they do actually
rationally seek to maximize returns, merely that they behave as if this were the
case.
2. The assumptions are understood as dealing with pure cases and the model
is then descriptive of the pure case. The assumptions are then restrictive
antecedent clauses. Nagel comments: "laws of nature formulated with reference
to pure cases are not useless. On the contrary, a law so formulated states how
phenomena are related when they are unaffected by numerous factors whose
influence may never be completely eliminable but whose effects generally vary
in magnitude with differences in attendant circumstances under which the
phenomena recur. Accordingly, discrepancies between what is asserted for pure
cases and what actually happens can be attributed to factors not mentioned in
the law" (1963, pp. 215-16).
Nagel labels idealized models "pure cases," and suggests that the restrictive
antecedent clauses that define them succeed by eliminating extraneous causes
not dealt with by the law (1963, p. 216). The simple idealized world is tractable
to analysis. The antecedent clauses define idealized models that are clearly non-
causal. The key question is how behavior in such an idealized model relates to
the behavior of systems in the real world. Behavior in the idealized model is
derived using deductive analysis that appeals to the simplifying assumptions,
the restrictive antecedent clauses. In practice, an implicit assumption is made
in economics, but never stated, that the derived behavior pictured in the idealized
model does approximate the behavior of systems in reality; this is here termed
the Approximate Inference Assumption and is discussed below.
Consider, for instance, the idealized model of perfect competition. The
assumptions are normally stated broadly as follows (McCormick et al. 1983,
p. 348):
1. Producers aim to maximize their profits and consumers are interested in
maximizing their utility.
2. There are a large number of actual and potential buyers and sellers.
3. All actual and potential buyers and sellers have perfect knowledge of all
existing opportunities to buy and sell.
4. Although tastes differ, buyers are generally indifferent among all units of
the commodity offered for sale.
5. Factors of production are perfectly mobile.
5. Mathematical Economics
The Black and Scholes (hereafter B&S) paper on option pricing is a first-class
example in its economic field: mathematical in style and almost universally
accepted (Black and Scholes 1973). For this reason it is also a good example of
economics that, for its practical application and in order to claim empirical
content, rests wholly upon the AIA to make inferences from a non-causal
idealized model to reality. This paper uses the B&S argument to illustrate the
use of the AIA and to give an example of its failure.
B&S make the following assumptions:
(a) The short-term interest rate is known and is constant through time.
(b) The stock price follows a random walk in continuous time with a
variance rate proportional to the square of the stock price. Thus the
distribution of the stock price at the end of any finite interval is log-
normal. The variance rate of return on the stock is constant.
(c) The stock pays no dividends or other distributions.
(d) The option is European, that is, it can only be exercised at maturity.
(e) There are no transaction costs in buying or selling the stock or the option.
(f) It is possible to borrow any fraction of the price of a security to buy it or
to hold it at the short-term interest rate.
(g) There are no penalties for short selling. A seller who does not own a
security will simply accept the price of the security from a buyer, and
will agree to settle with the buyer on some future date by paying him an
amount equal to the price of the security on that date.
(a), (e), (f) and (g) are false. (c) is generally false. (d) is false if the option is
American. In the case of (b) a far weaker assumption, namely that the price of
the stock is a continuous function of time, is false. These assumptions function
as restrictive antecedent clauses that define a non-causal idealized model.
Using these assumptions B&S derive a solution to the option-pricing
problem. By the end of their paper it is clear that the authors believe the solution
in the idealized model situation is applicable to real options.
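The closed-form price that the B&S derivation yields for a European call can be sketched directly; the function and parameter names below are mine, not B&S's own notation, and the code is an illustration of the formula rather than of their derivation:

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    # Standard normal cumulative distribution, via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(s, k, t, r, sigma):
    # Black-Scholes value of a European call: s = stock price,
    # k = strike, t = years to maturity, r = short-term interest rate,
    # sigma = volatility of the stock's rate of return.
    d1 = (log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return s * norm_cdf(d1) - k * exp(-r * t) * norm_cdf(d2)
```

Note that the expected return on the stock appears nowhere among the parameters; it is only within the idealized model that this absence licenses the conclusion B&S draw.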
It is rather surprising, and worthy of note, that many followers of B&S have
not only employed, implicitly, the AIA but appear to have used a stronger
assumption, a precise inference assumption, to the effect that the idealized
solution is precise in real situations. Cox and Rubinstein, for instance, write a
section of their leading textbook on option pricing under the title "An Exact
Option Pricing Formula" (1985, pp. 165-252).
B&S themselves conclude that:
the expected return on the stock does not appear in the equation. The option value
as a function of the stock price is independent of the expected return on the stock
(1973, p. 644).
The failure of the expected return on the stock to appear as a parameter of the
value of the option is a direct result of the powerful assumptions that B&S
employ for defining their idealized model. They have not succeeded in showing
that in real situations the expected return on the stock is not a parameter in the
value of an option. This logically incorrect conclusion demonstrates their use of
the AIA. (This is the equivalent of the conclusion that walking uphill is as quick
as walking downhill in the Narrow Island analogy below.)
Many economists have sought to demonstrate the relevance of the B&S
solution to real options by showing that similar results hold under different
(usually claimed to be weaker) assumptions than those used for the B&S
idealized model. Beenstock's attempt in "The Robustness of the Black-Scholes
Option Pricing Model" (1982) is typical. Unfortunately, these relaxations of
the assumptions merely tell us what would happen in ideal circumstances and do
little to bridge the gap to the real world. Beenstock's first conclusion is that
"[o]ption prices are sensitive to the stochastic processes that determine
underlying stock prices . . . Relaxation of these assumptions can produce large
percentage changes in option prices" (1982, p. 40). The B&S solution is not
necessarily a good approximation even in the carefully controlled idealized
situations where all except one of their assumptions are held constant. The AIA
simply does not work.
A more tangible practical example of the failure of the AIA arises when the
stock price movement is discontinuous. Discontinuities arise in a wide range of
In the Southern Seas, some way to the east and south of Zanzibar is a thin strip
of land that modern visitors call Narrow Island. An indigenous people inhabit
the island whose customs derive from beyond the mists of time. At the northern
end of the island is a hill cut at its midpoint by a high cliff which crashes
vertically into the sea. On the headland above the cliff, which to local people is
a sacred site, a shrine has been erected. It is the custom of the island that all
able-bodied adults walk to the shrine to pay their respects every seventh day.
Despite its primitive condition the island possesses the secret of accurate
time-keeping using instruments that visitors recognize as simple clocks. In
addition to a traditional name, each village has a numeric name, which is the
length of time it takes to walk to the shrine, pay due respects and return to the
village. The island being of a fair length, the numeric names stretch into the
thousands.
Many years ago one of the first visitors to the island from Europe was a
traveler the islanders called Professor White. Modern opinion has it that White
was an economist. What is known is that at the time of his visit the local people
were wrestling with the problem of establishing how long it took to walk
between villages on the island.
The argument of Professor White is recorded as follows:
The problem as it stands is a little intractable. Let us make some assumptions.
Suppose that the island is a straight line. Suppose the speed of walking is uniform.
Then the time taken to walk between two villages is half the absolute difference
between their numeric names.
The islanders were delighted with this solution and more delighted still when
they found how well it worked in practice.
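Professor White's solution is a one-line computation. A sketch, assuming with him a straight-line island and uniform walking speed (the function name is mine):

```python
def walk_time(name_a, name_b):
    # A village's numeric name is the round trip to the shrine: twice the
    # one-way walk plus the (constant) time spent paying respects, which
    # cancels in the difference. Hence half the absolute difference of two
    # names is the walking time between the villages -- provided they
    # really do lie on one line through the shrine.
    return abs(name_a - name_b) / 2
```

The model also makes the islanders' "astonishing result" automatic: heights never enter the calculation, so uphill and downhill times come out equal by construction, and two villages on opposite coasts with equal names come out zero time apart.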
It was noted with considerable interest that the time taken to walk between
two villages is independent of the height difference between them. As the island
is quite hilly this astonishing result was considered an important discovery.
Stories tell that so great was the islanders' enthusiasm for their new solution
that they attempted to share it with some of their neighboring islands. To this
day there is no confirmation of the dreadful fate which is said to have befallen
an envoy to Broad Island, inhabited by an altogether more fearsome people,
who were apparently disappointed with the Narrows' solution.
Although the Narrow Islanders have used their solution happily for many
years, more recently doubts have begun to emerge. In some parts of Narrow,
where the island is slightly broader, reports suggest that the time between
villages on opposite coasts is greater than Professor White's solution would
suggest. Others who live near the hills quite frankly doubt that the time taken to
walk from the bottom of the hill to the top is the same as the time taken to walk
from the top to the bottom.
It is a tradition amongst the islanders that land is passed to sons upon their
marriage. Upon the marriage of a grandson the grandfather moves to a sacred
village near the middle of the island which is situated in the lee of a large stone
known as the McCarthy. The McCarthy Stoners report that Professor White's
solution suggests it is quicker to walk to villages further south than it is to walk
to villages that appear to be their immediate neighbors.
The island's Establishment continues to point out that Professor White's
solution gives the time to walk between two villages, so that the time taken on
The Narrow Island analogy shows how the use of a non-causal idealized model
breaks down. It may represent a good approximation most of the time, but on
occasion it is no approximation at all. Moreover, the model itself provides no
clues as to when it is, and when it is not, applicable. We have this knowledge (if
at all) from a consideration of broader factors; it would seem these must include
causal factors.
A causal idealized model always provides a good approximation whenever it
captures enough of the causes sufficiently accurately, and the causes modeled
operate in reality in a sufficiently undisturbed way. Often our knowledge of the
situation will allow us to judge whether this is likely to be the case.
The visitor's solution to the Narrow Island problem is correct; it is a robust
causal solution. A similar solution exists for the option pricing problem.
A pdf (probability density function) may be chosen that makes allowance for all the causes, both known and
unknown, and the discounted expected value calculated to provide a robust
approximate measure of value. Sensitivity to changing assumptions may be
checked.
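The procedure described above can be sketched as a valuation routine. Everything here, the jump frequency, the parameter values, the particular pdf, is an illustrative assumption of mine, not Pemberton's own prescription:

```python
import random
from math import exp, sqrt

def discounted_expected_value(payoff, sample_price, r, t, n=100_000, seed=1):
    # Value a claim as the discounted expectation of its payoff under a
    # chosen pdf for the stock price at maturity.
    rng = random.Random(seed)
    mean_payoff = sum(payoff(sample_price(rng)) for _ in range(n)) / n
    return exp(-r * t) * mean_payoff

def jumpy_lognormal(rng, s0=100.0, mu=0.05, sigma=0.2, t=1.0):
    # An illustrative pdf allowing for a cause B&S exclude: an occasional
    # discontinuous drop on top of a lognormal diffusion.
    s = s0 * exp((mu - 0.5 * sigma ** 2) * t + sigma * sqrt(t) * rng.gauss(0.0, 1.0))
    if rng.random() < 0.05:
        s *= 0.5
    return s
```

For example, `discounted_expected_value(lambda s: max(s - 100.0, 0.0), jumpy_lognormal, r=0.05, t=1.0)` values a call struck at 100; sensitivity to the assumptions is checked simply by re-running with the jump frequency or severity varied.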
In the physical sciences real working systems can sometimes be constructed that
arrange for causes to operate in just the right way to sustain the prescribed
behavior of the system. Pendula, lasers and car engines are examples. Whilst the
systems are working, aspects of their behavior may be described by functional
relationships (again, in Russell's (1913) sense) between variables. The design of
the system ensures consistency, repeatability and reversibility.
Real economies are not so neat; they have a complex causal structure, and
are more akin to machines with many handles, each of which is turned
continually and independently. The state of the machine is continually changing
and evolving as the handles are turned. The effect of turning a single handle is
10. Conclusion
John Pemberton
Institute of Actuaries, London
john.pemberton@talk21.com
REFERENCES
Beenstock, M. (1982). The Robustness of the Black-Scholes Option Pricing Model. The Investment
Analyst, October 1982, 30-40.
Black, F. and Scholes, M. (1973). The Pricing of Options and Corporate Liabilities. Journal of
Political Economy 81, 637-54.
Cartwright, N. (1989). Natures Capacities and their Measurement. Oxford: Clarendon Press.
Cox, J. C. and Rubinstein, M. (1985). Options Markets. Englewood Cliffs, NJ: Prentice-Hall.
Debreu, G. (1959). Theory of Value: An Axiomatic Analysis of Economic Equilibrium. New York:
Wiley.
Friedman, M. (1953). The Methodology of Positive Economics. In: Essays in Positive Economics,
pp. 3-43. Chicago: University of Chicago Press.
Gibbard, A. and Varian, H. (1978). Economic Models. Journal of Philosophy 75, 664-77.
McCormick, B. J., Kitchin, P. D., Marshall, G. P., Sampson, A. A., and Sedgwick, R. (1983).
Introducing Economics. 3rd edition. Harmondsworth: Penguin.
Mill, J. S. (1967). On the Definition of Political Economy. In: J. M. Robson (ed.), Collected Works,
Vols. 4-5: Essays on Economics and Society, pp. 324-42. Toronto: Toronto University Press.
Nagel, E. (1963). Assumptions in Economic Theory. American Economic Review: Papers and
Proceedings 53, 211-19.
Russell, B. (1913). On the Notion of Cause. Proceedings of the Aristotelian Society 13, 1-26.
The Revival of Aristotle's Nature

Amos Funkenstein
1. The Problem
In her recent book on Nature's Capacities and their Measurement (1989), Nancy
Cartwright argued forcefully for the recognition of capacities as an indispensable
ingredient of causal scientific explanations. Idealizations in science assume
them anyhow; abstract laws explain their causal impact; symbolic representation
secures their proper formulation and formalization. So very satisfactory is the
picture of the language and operation of science thus emerging that one wonders
why the language of capacities, which indeed once dominated the language of
science, was ever abandoned. Aristotle had based his systematic understanding
of nature on potentialities (δυνάμεις) and their actualization; his terminology
and perspective ruled throughout the Middle Ages, and were abandoned, at least
explicitly, only in the seventeenth century. But why?
The historical retrospection may also lead us towards some more systematic
insights into the difficulties involved in the notion of natures capacities. These
difficulties may not be insurmountable: and it may or may not turn out to be the
case that we stand more to gain than to lose by readopting the lost idiom with
due modification. With this hope in mind I shall begin a historical account
which, needless to say, is highly schematized and therefore not above suspicion
of bias or error.
p. 402).1 This latter view, I shall argue, eventually drove out the Aristotelian
language of capacities, their realization or the lack thereof (privation).
Indeed, Aristotle's "nature" is rather a ladder of many "natures," classified
in orders down to the most specific nature. The nature of sublunar bodies was
to him unlike the nature of celestial bodies. The former are made of a stuff
which, by nature (φύσει), comes-to-be and decays; and its natural motion is
upwards or downwards in a straight line; sublunar bodies are, of necessity,
either light or heavy (1922, 2.301 a 20ff.). The latter are imperishable, and
move by their nature in a perfect circle.2 Further down the ladder, we come
across more particularized natures (forms) until we reach the most specialized
nature, the species specialissima. In order to secure the objectivity of his
classification of nature, Aristotle commits himself to a heavy ontological, or at
least a priori, assumption: namely, that a specific difference within a genus
(say, rationality within mammals) can never appear in another genus (say,
metals);3 at best he admits analogous formations (1912, A 4.644 a 15ff.).
To this view of natures of diverse groups of things Aristotle's predicate-
logic was a most suitable instrument (organon). All scientific propositions,
inasmuch as they gave account of the nature of things, should be able to be cast
into the form "S is P." But what if S, a subject, actually lacks a predicate which by
nature belongs to it? In the case of such an unrealized capacity Aristotle
spoke of privation (στέρησις). So important to his scientific enterprise was
this concept that he did not shy away from ranking it, together with form (εἶδος)
and matter (ὕλη), as chief causes or principles of all there is (1957,
Λ 2, 1069 b 32-4; Δ 22, 1022 b 22-1023 a 7; and cf. Wolfson 1947). But it is an
ambiguous notion. Already the logicians of Megara recognized that it commits
Aristotle to the strange view that a log of wood, even if it stays under water for
the duration of its existence, is nonetheless "burnable" (Kneale and Kneale
1962, pp. 117-28). Worse still, while the negation of a negation, assuming the
principle of excluded middle, is perfectly unequivocal, the negation of a
privation is not; it either negates the proposition that S is P or says that the
proposition "S is P" is a category mistake, but not both.4
1 I have elaborated this demand for homogeneity in my (1986), pp. 29, 37-9, and 63-97.
2 The Greek obsession with circularity (Koyré) can be said to have ruled, unchallenged, until
Kepler, even while any other astronomical presupposition, e.g., geocentricity, was debated. An
exception of sorts, in very vague terms, were the atomists: cf. Cicero (1933), 1.10.24.
3 Aristotle (1958), Z 6.144 b 13ff., and (1957), Z 12, 1038 a 5-35. That Aristotle actually dropped
this requirement in his biological research was argued by Furth (1988, pp. 97-8). He also joins those
who maintain that Aristotle's main objective was the complete description of species, not their
hierarchical classification. On the methodological level, this principle corresponds to the demand
not to apply principles of one discipline to another: Aristotle (1960), A 7.75 a 38-75 b 6. Cf.
Livesey (1982), pp. 1-50, and Lloyd (1987), p. 184ff.
4 Because of this ambiguity, it could serve Maimonides's negative theology as a way to generate
divine attributes; cf. my (1986), p. 53, nn. 41 and 44.
The coalescence of properties that makes a singular thing into that specific
thing is its form; medieval schoolmen will speak of the substantial form.
The form determines the capacities of a thing: if an essential property is missing
by accident from that thing, we speak of a privation, as when we say that
Homer was blind or Himmler inhuman. For, below the level of the infima
species, what particularizes an object (down to its singular features) is matter,
not form: wherefore Aristotle could never endorse Leibniz's principle of the
identity of indiscernibles. Only some unevenness in their matter explains why
one cow has a birthmark on her left shoulder while her twin has none. Matter,
however, can by definition not be cognited if cognition is, as Aristotle
thought, an assimilatory process of knowing the same by the same, an identity
of the form of the mind with that of the object cognited.5 There is no direct
knowledge of singulars qua singulars. Nor is it needed for the purposes of
science, which always consists of knowledge of common features, of universals.
Such, in broad outlines, was the meaning of nature in the most successful
body of scientific explanations during antiquity and in the Middle Ages. Indeed,
it articulated a deep-seated linguistic intuition. The Greek word φύσις, much as
the Latin natura, is derived from the verb "to be born, to become" (φύεσθαι, nasci).
The "nature" of a thing is the set of properties with which this thing came into
the world, in contrast to acquired, mechanical or accidental properties.
Milk teeth are called, in ancient Greek, natural teeth. A
"slave by nature" is a born slave, as against an accidental slave captured and
sold by pirates or conquerors (Aristotle 1957, A5, 1254 b 25-32, and A6, 1255 a
3ff.). At times, custom will be called "second nature." In short: the term
"nature" was attached, in common discourse, to concrete entities. The set of
properties with which something is born was its nature.6 Aristotle
accommodated this intuition with the important condition that such essential
properties be thought of as capacities, as discernible potentialities to be
actualized.
Were there, in antiquity, other conceptions than Aristotle's hierarchy of
natures, anticipations of a more modern sense of the uniformity of nature?
Indeed there were, but they remained, through the Middle Ages, vague and
marginal. From its very beginning in the sixth century, Greek science sought the
causes (αἰτίαι) of all things. This demand reached its
culmination with Parmenides, who postulated the existence of but one
indivisible being, without differentiation or internal qualification,
5 On the Aristotelian assumption of ὅμοιον ὁμοίῳ see Schneider (1923). It is the basis for
Aristotle's distinction between a passive intellect, which becomes everything, and an active
intellect, which makes everything (τῷ πάντα ποιεῖν) (1961, Γ 5, 430 a 14-15). This
distinction leads to later, far-reaching theories of the active intellect.
6 Hence also the conviction that, explaining the origin of a nation, one uncovers its nature or
character: cf. Funkenstein (1981), especially pp. 57-9.
While these deviant traditions were not without importance in the preparation
for an alternative scheme of nature, a more serious and enduring challenge to
Aristotle arose within the scholastic discourse itself in the turn from the
7 Above, n. 5.
a distinction the ancients never made. Every order or nature was to them at best
an empirical, contingent fact.
How far did the new discourse really permeate the explanation of natural
phenomena? A telling example of the sixteenth century shows how indeed
Aristotle's "natures," "forms," or Thomas's "substantial forms" (with their
natural capacities) were in fact dethroned, and turned into mere properties or
forces. In his treatise on Fallacious and Superstitious Sciences, Benedict
Pereira, whose works were read by Galileo, summed up, among other matters,
the history and the état de la question of discussions concerning the value of
alchemy, and concluded:
No philosophical reason exists, either necessary or very probable, by which it can be
demonstrated that chemical accounts of making true gold are impossible (Pererius
1592, pp. 87-8).
The arguments against turning baser metals into gold that he found in the
Aristotelian-Thomistic tradition (Avicenna, Averroës, Thomas, Aegidius
Romanus) were all variations on the assumption that a substantial form induced
by a natural agent cannot be induced artificially. Artificial gold may, they said,
resemble natural gold, but it will always differ from it in some essential
properties (amongst which is the therapeutic value of gold for cardiac diseases):
just as a louse generated inorganically from dirt will differ in specie from a
louse with respectable organic parents (Pererius 1592, pp. 75-82, esp. p. 78).
Pereira discards this argument as unwarranted. The natural agent that molds
other metals into gold is the sun's heat, a property shared by fire, which can
certainly be induced artificially, at least in principle. Yet Pereira, too, believed
that it is, with our technical means, impossible to do so and that claims of
success are fraudulent. While possible in principle, given our tools, knowledge
and the fact that no case has been proven, it seems to him that it will at best be
much more expensive to generate gold artificially, even should we know one
day how to do so, than it is to mine it.
The "forms" have visibly turned into qualities and forces, here as elsewhere
in the natural philosophy of the Renaissance. Empirical evidence rather than a
priori argument governs even the scholastic dispute. And the distance between
the natural and the artificial seems considerably undermined. A new language of
science was about to be forged. The role of fourteenth century scholastic debates
in this process was largely a critical one: to purge the scientific discourse of
excess ontological baggage.
4. Galileo's Idealizations
8 This is the language of the late medieval impetus mechanics, initiated by Olivi and Franciscus de
Marchia, developed by Buridan and Oresme, held still by Dominicus de Soto. It responded to the
most embarrassing question in Aristotle's mechanics: why does a projectile continue to move
cessante movente? Cf. Wolff (1978), and my (1986), pp. 164-71.
9 Galilei (1890-1909); Clavelin (1974), pp. 120ff., esp. 132-3 on the hydrostatic analogy; Drake
(1978), pp. 21-32, esp. p. 28ff. on Pereira as the source from which Galileo learned the Hipparchian
hypothesis.
continue to move with uniform velocity. Now, given that uniform acceleration
can be expressed in terms of uniform motion, v = (v0 + vt)/2, an application of
this mean speed theorem to the distance traveled in the time t renders the
formula for the free fall (see Funkenstein 1986, pp. 171-4 for the literature).
If g be the steady increment, at any time unit, no matter how small, then (in our
notation)

d = g/2 + 3g/2 + 5g/2 + . . . + (2t − 1)g/2 = [g/2 + (2t − 1)g/2] t/2 = gt²/2;

Galileo's notation was geometrical.
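The arithmetic can be checked directly: summing the odd-multiple increments g/2, 3g/2, 5g/2, . . . over t unit intervals reproduces gt²/2. A minimal check of the text's formula, nothing more:

```python
def fall_distance(g, t):
    # Distance after t unit intervals, summing the increments
    # g/2 + 3g/2 + 5g/2 + ... + (2t - 1)g/2; equals g * t**2 / 2.
    return sum((2 * k - 1) * g / 2 for k in range(1, t + 1))
```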
I have schematized Galileo's reasoning to some extent, but not overly so.
Without the free-fall law, he could not have found the parabolic path of a
trajectory; without the latter, Newton could not have proven that the same law
that governs free fall also keeps the earth in an elliptic orbit around the sun. Its
importance has never been exaggerated.
Two new mental habits, it seems to me, made this achievement possible, one
theoretical and one practical. The theoretical habit was the willingness to accept
counterfactual states as asymptotically approachable limiting cases of reality.
The implicitly employed principle of inertia describes such a limiting case, and I
shall return to its discussion presently. The practical bent of mind that changed
since the Middle Ages was the ability to measure accelerations by means of a
mechanical clock. One cannot measure accelerations with sundials or common
types of sand-clocks. Galileo united the two.
5. Idealizations or Capacities
These, then, were the historical circumstances in which the language of capaci-
ties was abandoned and another idiom, that of idealizations, appropriated. We
should not, however, fall into the fallacy of origins: the fact that capacities were
abandoned does not constitute an irrefutable argument against them. The anti-
Aristotelian zealotry of early modern science may have blocked advantageous
strategies for explaining nature. Perhaps nature is not homogeneous after all.
Or perhaps, albeit uniform, it is inhabited with a multitude of free-floating
capacities: not capacities of this or that particular class of objects ("natures"
in the Aristotelian sense) but capacities of one and the same nature (in the
singular). And perhaps (such, I believe, is Nancy Cartwright's argument)
idealizations of the Galilean type presuppose capacities anyway.
The former arguments are a matter of creed; the latter is not. In her book,
Nancy Cartwright quotes my statement that, for Galileo, the limiting case, even
where it did not describe reality, was the constitutive element in its explanation;
and argues that it is
a glaring non sequitur. Funkenstein has been at pains to show that Galileo took the
real and the ideal to be commensurable. Indeed, the section is titled "Galileo:
Idealization as Limiting Cases." So now, post-Galileo, we think that there is some
truth about what happens in ideal cases; we think that that truth is commensurable
with what happens in real cases; and we think that we thereby can find out what
happens in ideal cases. But what has that to do with explanation? We must not be
misled by the ideal. Ideal circumstances are just some circumstances among a great
variety of others, with the peculiarly inconvenient characteristic that it is hard for us
to figure out what happens in them. Why do we think that what happens in those
circumstances will explain what happens elsewhere?
Because, she continues,
[w]hen nothing else is going on, you can see what tendencies a factor has by looking
at what it does . . . but only if you assume that the factor has a fixed capacity that it
carries with it from situation to situation (1989, pp. 189-91).
Now, I would argue that capacities (or, in the case of Galileo, forces) depend,
for their articulation, on ideal cases rather than vice versa. Cartwright would
have been right if ideal cases were only an inductive extrapolation from real
cases, the corroboration, so to say, of an interpretive hunch we gather from
looking at a series of situations in which one or more variables (disturbing
factors) is diminished continuously. If, however, the ideal case is not just
some circumstances among others but a circumstance in which Galileo knows
a priori that such-and-such is the case (say, that a body will continue its
rectilinear, uniform motion), then the ideal case is nothing but an explication or
articulation of what we mean when we say that a body has the capacity or
tendency to so move. The other, more blurred cases, which certainly can be
interpreted many ways, now are measured by the yardstick of the ideal case.
This is what Galileo meant by explaining, and indeed Cartwright admits that
[t]he fundamental idea of the Galilean method is to use what happens in special or
ideal cases to explain very different kinds of things that happen in very non-ideal
cases (1989, p. 191).
It seems as if Cartwright's "capacities" and my "idealizations"10 are locked
into a hermeneutical circle of sorts. To say that a body under ideal conditions
will necessarily act in such-and-such a way amounts, of course, to saying much
more than that it has the possibility of so acting: in looking for a cause of the
behavior of that body, I name it "capacity" or "force." On the other hand, to say
that a body has the capacity to act in such-and-such a way even if this capacity
is never fully actualized can mean several things. If we deal with a capacity that
is specific to a certain class of bodies (or even to most bodies) but is not
actualized in this particular body, no limiting cases are required; it is a more or
less legitimate inductive generalization. If, however, we mean that this capacity
is never actualized under real conditions, then to say that this body has the
capacity in question is to say nothing more than that, under ideal conditions, it
would act in a certain way. Such is the case of Galilean idealizations, in which
we are dealing with a capacity (if one prefers to call it so) which all bodies
share, although they never actually manifest it. The ideal conditions define the
capacity in question.

10 Meaning: ideal conditions as asymptotically approachable limiting cases.
Within a uniform, homogeneous nature there is, then, no further meaning to
capacities but idealizations. Yet the uniformity of nature is, in itself, only an
ideal of science, an ideal that emerged, for better or worse, in the seventeenth
century.
Could it be that the ideal is a mistaken or misguided one? Indeed it could be
the case, and, if so, then our discussion of capacities would take a different
turn. In a nature like Aristotle's, in which different objects (or classes of
objects) have profoundly different properties, the language of "capacities"
seems much more suitable, since it indicates precisely the circumstance that
while, under certain (even ideal) conditions, bodies of a certain kind behave in a
certain way, other bodies do not. Hadrons act on each other in a different way
than leptons.
Now, here we enter the realm of ideals or visions of science; and we are
free to choose among them. Modern physicists seek to unite all four forces of
nature, but the quest for a unified theory may be a hunt for the Snark. Kant,
who endorsed the principle of parsimony (which ultimately guides the vision of
a uniform nature), expressed as entia praeter necessitatem non sunt multiplicanda,
balanced it with a contrary principle or regulative idea, namely not to fear the
variety of nature (1964, B670-96).11
Amos Funkenstein
was the Koret Professor of Jewish History in the Department of History at the
University of California, Berkeley, and University Professor at the University of
California.
11 Kant's account of contradictory regulative ideas which are sustained nonetheless in the interest
of reason reminds one of the principle of complementarity, except that it operates only at the
metatheoretical level, never in the interpretation of a concrete phenomenon.
The Informational Gene and the Substantial Body:
On the Generalization of Evolutionary Theory by Abstraction

James R. Griesemer

1. Introduction

This essay is about the nature of Darwinian evolutionary theory and strategies
for generalizing it. In this section I sketch the main argument.
In: Martin R. Jones and Nancy Cartwright (eds.), Idealization XII: Correcting the Model.
Idealization and Abstraction in the Sciences (Poznań Studies in the Philosophy of the Sciences and
the Humanities, vol. 86), pp. 59-115. Amsterdam/New York, NY: Rodopi, 2005.
structure of scientific objects and suggest that, for purposes of theory
generalization, abstraction is relative to an empirical, rather than a metaphysical,
background theory. Scientific background theories do roughly the same work in
my view that Aristotle's conception of four types of explanatory causes does in
Cartwright and Mendell's: things are more abstract if they have causes specified
in fewer explanatory categories. Because I seek to understand abstraction as a
scientific process rather than as a formal structure, our projects differ on where
to seek explanatory categories: in metaphysical theories such as Aristotle's or in
empirical background theories. The two abstraction strategies for generalizing
evolutionary theory both depend on an empirical background theory of
biological hierarchy for explanatory categories.
In addition, both strategies rely on theories of hierarchy to ground their
abstractions. The first strategy assumes, without any justification, the biological
hierarchy as a structural framework within which to formulate evolutionary
theory as a set of functions. As a result, it is difficult to apply the generalized
theory to biological entities not easily located among its a priori compositional
levels. The background theory of hierarchy ought to guide us through
applications of the generalized evolutionary theory to problematic cases, but
only the fact, not the theory, of hierarchy is specified. This lends the mistaken
impression that the levels of the hierarchy are unchangeable facts of the conceptual
framework rather than empirical products of evolution itself, even though many
major phases of evolutionary history (origins of cells, eukaryotes, sex,
embryogenesis, sociality) involve the creation of new levels of organization.
The second strategy assumes a well-developed but false background theory.
I will call this theory "molecular Weismannism" to distinguish it from views
held by August Weismann. A central aim of the paper, in distinguishing
Weismann from Weismannism, is to isolate the role in the second strategy of the
twentieth-century concept of molecular information from its nineteenth-century
roots in material theories of heredity. Information has been made to do too much
of the explanatory work in philosophical accounts of evolutionary theory, and
too many of the explanatory insights of the old materialism have been lost. The
problem has become acute because evolutionary theory has become
hierarchical. Information provided a useful interpretive frame for working out
aspects of the genetic code in the 1960s (at the dawn of the information age),
but that was when evolution was a single-level theory about the evolution of
organisms. Now, evolution is about all possible levels of biological
organization. Biologists and philosophers tell nice stories about how the levels
fit together, but they are more rhetoric and sketch than flesh-and-blood theory.
The problem is how to square the information analysis at the molecular level
with the belief that evolution may occur at a variety of levels: the more seriously
one takes information, the more likely one is to deny that evolution happens at
any level other than that of the gene. If one takes information too seriously, it
can even come to replace the molecular concept of the gene altogether. The
result is a theory that genes are evolutionary information. Hence, if evolution
occurs at other levels, there must be genes at those levels. As a formal theory,
this reflexive definition of the gene in terms of evolutionary theory works, but it
deprives empirical science of the conceptual resources needed to look for
material evidence of genes at other levels. Thus the problem of squaring
information with a belief in hierarchical evolution is replaced by the problem of
squaring a philosophical exposition of the formal structure of evolutionary
theory with the fuller, richer resources needed for evolutionary science.
Undoubtedly I place too much emphasis on information here, but it is a pressure
point: something is rotten in Denmark, and it is as yet unclear whether the rot is
a small piece or the whole cheese.
Molecular Weismannism is a theory relating genes and organisms as causes
and effects. It appears to be a species of pure functionalism whose theoretical
entities are defined solely in terms of their evolutionary effects, as a result of the
operation of the evolutionary process. It replaces Weismann's nineteenth-century
analysis of the flow of matter (germ- and somato-plasm) with a flow of
molecular information as basic to evolutionary theory. I say "appears" because
contingent, material features of the traditional organizational hierarchy invoked
by the first strategy are tacitly smuggled into the inferences made under the
second strategy. Insofar as claims to ontological simplification due to the
concept of the informational gene rest on the purity of the functionalism,
contamination by material and structural considerations casts doubt on these
claims. In contrast to purists of the second strategy, I find this contamination
useful and instructive. I see a return to a more materialist conception of the gene
as part of an alternative strategy to formulate multi-level evolutionary theory.
I argue further that molecular Weismannism misrepresents important
features of the complex relationship between genes and organisms. One of the
most interesting misrepresentations concerns its analysis of the lower level of the
two-level hierarchy, the level of replicators or informational genes. The copying
process by which DNA transmits genetic information is taken to be the model
replication process, but consideration of the molecular biology reveals important
disanalogies between gene replication and copying processes.
The disanalogies raise doubts about the adequacy of an evolutionary theory
that defines replication and replicators in terms of copying and indeed raise
serious questions about the applicability of the concept of information to the
evolutionary analysis of replication. A more crucial point is that the disanalogies
are revealed by taking into account the very properties of biological matter (the
details of how genes code information) that are supposed to be removed by
abstraction in a successful generalization of evolutionary theory. If correct, this
critique undermines the use of molecular Weismannism as a purely functionalist
basis for generalization-by-abstraction. The criticism rests on the worry that
1 I take it that reference to a mechanism implies reference to the matter out of which the mechanism
is constructed. To take an ordinary example, to speak of a watchwork mechanism for telling time is
to speak of specific (even if not specified), concrete materials arranged in such a way that under
certain specifiable conditions, the mechanism is caused to operate in a specifiable way or to perform
a specifiable function. Thus, mechanistic explanations are more concrete than either purely
functional (no reference to matter) or purely structural (no reference to motion) explanations.
preserved, by the term of Natural Selection, in order to mark its relation to man's
power of selection (Darwin 1859, p. 61).
Although quite central to biology, organisms constitute only one form of
biological organization and manifest only some of the diverse phenomena of interest
to evolutionists. Moreover, evolutionists have not come to grips with all the
sorts of organismal phenomena that biologists have already described. At least
three forms of restriction are evident in the above passage:

1. Darwin's theory is a theory of evolution (primarily) by means of natural
selection. It does not (adequately) cover evolution by other means, such as
genetic drift or biased (so-called "Lamarckian") inheritance.

2. Darwin's is a theory of the evolution of organisms (in populations).2
However, it supplies no criterion for individuating organisms. Rather,
Darwin presented exemplary cases of organismal evolutionary analysis in
On the Origin of Species, leaving vague the intended, let alone the actual,
scope of the theory. Despite the profound faith of some biologists, it is
uncertain whether neo-Darwinian theory, as developed in the twentieth
century to include population genetics and the conceptual apparatus of the
evolutionary synthesis, fully applies to clonal organisms, plants, symbiotic
protists, bacteria, and viruses. As Hull puts it, the bias of evolutionists is to
treat the higher vertebrates as the paradigm organisms, despite their rarity
and atypicality.3
3. Darwin's is a theory of selection operating at the organismal level. But
biology concerns many levels of organization, from molecules to organelles,
to cells, to organisms, to families, groups, populations, species, higher taxa,
and ecological communities. Many evolutionists suspect or believe that
selection may occur at many levels of organization, but it is uncertain
whether Darwin's theory applies without modification to these other levels,
or exactly what modifications might be needed, or even how the levels
should be individuated.4 In addition, many philosophers and some biologists
expect that evolutionary theory in a broad sense applies to cultural and/or
conceptual change. It is controversial what form a theory of cultural or
conceptual evolution should take and whether it would turn out to be a
generalization of Darwinism or of some other, non-Darwinian, theory
2 Lewontin (pers. comm.) argues that Darwin's theory is inherently hierarchical because it explains
the evolution of populations (ensembles) in terms of causes such as natural selection operating on
the members. Nevertheless, this fact about evolution's hierarchical character alone does not suffice
to make the theory general. Rather, it merely divides the elements of the theory (account of causes,
account of effects) into accounts at different descriptive levels.
3 Hull (1988). See also Jackson et al. (1985) and Buss (1987).
4 The fact that Darwin explored a kin selection explanation of the evolution of traits in sterile
castes of social insects establishes neither that the scope of his theory includes all levels of
organization nor that the theory adequately covers kin groups.
(Boyd and Richerson 1985; Hull 1988; Griesemer 1988; Griesemer and
Wimsatt 1989).
3. Lewontin's Generalization-by-Abstraction

One way to generalize neo-Darwinian theory is by abstraction, removing the
reference to organisms to make the theory applicable to any level of
organization. There are two prominent abstraction strategies in the "units
of selection" literature. The first takes the biological hierarchy to define a set of
entity types, names of which may be substituted for occurrences of "organism" to
produce a more general theory.
There are several ways one might construe the project of generalizing evolutio-
nary theory. One strategy involves abstraction as a process of representing
objects of scientific explanation.5 An important Platonic conception is that
abstraction is a mental process by which properties are thought of as entities
distinct from the concrete objects in which they are instantiated. On an
alternative, Aristotelian conception, abstraction is the mental process of
subtracting certain accidental properties from concrete objects so as to regard
objects in a manner closer to their essential natures. Objects so considered still
have properties of actual substances, but they are regarded in isolation from
accidental features, such as the material in which a particular concrete triangle is
inscribed. The generalization strategy in selection theory on which I shall focus
below takes certain functions (replication, interaction) to be essential to the
entities that participate in selection processes and the matter that implements
these functions with concrete mechanisms to be accidental contingencies of the
actual history of life on Earth.
The view of abstraction relevant to science that serves as a framework for
this essay is a modification of the Aristotelian notion due to Cartwright and
Mendell (1984). They point out that the Aristotelian concept does not lead to a
strict ordering of things from concrete to abstract, since there is no obvious way
to count properties to determine which of two entities has more of them, and
hence is more concrete. They modify this conception to consider kinds of
explanatory properties, following an Aristotelian model of four kinds of cause
(material, efficient, formal, and final), mention of which they require to count an
explanation complete. Their revision leads not to an analysis of the abstract, but
to a partial-ordering criterion for degrees of relative abstractness:
5 Duhem (1954); Cartwright and Mendell (1984); Cartwright (1989); Darden and Cain (1988);
Griesemer (1990). The last discusses the basis of abstraction as a process in the material culture of
science rather than in a logic of the abstract.
The four causes of Aristotle constitute a natural division of the sorts of properties that
figure in explanation, and they thus provide a way of measuring the amount of
explanatory information given. We can combine this with the idea of nesting to
formulate a general criterion of abstractness: a) an object with explanatory factors
specified is more concrete than one without explanatory features; and b) if the kinds
of explanatory factors specified for one object are a subset of the kinds specified for a
second, the second is more concrete than the first (Cartwright and Mendell 1984,
p. 147).
In light of this view of degrees of explanatory abstraction, how might
evolutionary theory be generalized? One way is to abstract from organisms by
subtracting the reference to them in the statement of Darwin's principles (or,
more accurately, by subtracting the reference to their form and matter). That
way, the principles may be taken to range over any entities having Darwinian
properties, regardless of whether these are such as to lead us to call them
"organisms" or even whether they are material objects at all. Abstraction is a
central device for removing the reference to organisms in the two most
prominent programs for generalization discussed in the "units of selection"
literature.6
The first strategy stems from the work of Richard Lewontin, who
characterized Darwinian evolutionary theory in terms of three principles which he
claimed are severally necessary and jointly sufficient conditions for the
occurrence of the process of evolution by natural selection.7 Lewontin
suggested, in effect,
6 It should be noted that some prefer to distinguish two problems: the "units of selection" problem
and the "levels of selection" problem (see Hull (1980) and Brandon (1982)). For recent reviews of
the units and levels of selection problems, see Mitchell (1987), Hull (1988), Lloyd (1988), and
Brandon (1990). Hull (1988, ch. 11) gives an especially thorough review of the two programs with
the issue of strategies of theory generalization clearly in mind. Lloyd (1988) gives a detailed
semantic analysis in terms of abstract entity types.
7 "1. Different individuals in a population have different morphologies, physiologies, and behaviors
(phenotypic variation). 2. Different phenotypes have different rates of survival and reproduction in
different environments (differential fitness). 3. There is a correlation between parents and offspring
in the contribution of each to future generations (fitness is heritable)" (Lewontin 1970, p. 1). Much
of the units of selection literature concerns whether these principles are necessary and sufficient, as
Lewontin claimed. Wimsatt (1980, 1981) argues that Lewontin's principles give necessary and
sufficient conditions for evolution by means of natural selection to occur, but only necessary
conditions for entities to act as units of selection. Wimsatt claims an additional constraint on the
conditions must be made in order to distinguish entities which are units of selection from entities
merely composed of units of selection. Lloyd (1988) argues that Wimsatt's criteria require still
further refinement. Sober (1981, 1984) and Brandon (1982, 1990) argue that the Lewontinian,
"additivity" or "units of selection" approach fails to address the central problem of describing
evolution as a causal process, but differ as to the form a causal analysis should take. Others (e.g.,
Sterelny and Kitcher 1988) reject this causal interpretation, arguing that the units of selection
problem rests on matters of convention for representing allelic fitnesses in their appropriate genetic
contexts. Although these are important problems, they are not of direct concern here. Rather, the
topic at hand is the abstraction strategy used in generalization, regardless of which of these views is
right. I concur with the view that Lewontin's principles are at most necessary conditions for
something's being a unit of selection.
8 Lewontin (1970), p. 1. It may seem odd that Lewontin offers a generalization of Darwinism using
the very same term, "individual", that Darwin used, but it is clear from the context that Lewontin
means something much broader than "organism".
9 See Hull (1988), ch. 11, for a full critique of this tactic. It should be noted that Lloyd's (1988,
p. 70) analysis in terms of entity types does not take the traditional hierarchy for granted, but rather
uses that hierarchy as an illustration of a system that satisfies her account. Other instances of
part-whole or member-class relations may also satisfy her definition of units of selection. As such,
her analysis defines units of selection for a family of semantically specified theories, one of which
is, presumably, Lewontin's theory.
which ranges over all of the kinds (levels) in the hierarchy, evolutionary theory
is generalized relative to the theory of the hierarchy. Since the hierarchy
determines the range of the relevant explanatory kinds, an evolutionary theory
which refers essentially to particular levels in the hierarchy is more concrete
than one which does not.

The reason an ordering criterion is desired concerns the strategy for
generalizing evolutionary theory. The aim is to specify abstract entities that are
supposed to play a given role in the evolutionary process. One produces a
general theory by quantifying over the abstract, functional entities rather than
over members of the original class of concrete instances at a given level of
organization. General laws or principles will be formulated in terms of
"individuals" rather than in terms of "organisms", since entities from any level
of the biological hierarchy might play the role of an individual in a population,
whereas the qualities of organisms will include some that are particular to
organisms. Populations might serve as individuals in a meta-population, for
example. Put differently, the idea is to identify those properties of organisms
essential to their role in evolution and, taking those properties in abstraction, to
define abstract entities ("individuals") that satisfy that role. The ordering
criterion is needed to ensure that the quantified theory is sufficiently abstract to
include entities at all relevant hierarchical levels within its scope. A theory
referring to concrete entities such as organisms will not be adequately
generalized merely by quantifying over its elements, because entities higher and
lower in the hierarchy are not sufficiently like organisms to count as instances
of the same evolutionary kind. Populations are not super-organisms, and cell
components are not organs. Only a theory that is more abstract than every member
of the set of theories for single levels in the hierarchy can meet the criterion.
Lewontin noted that one virtue of Darwin's formulation of his principles is
that no mention is made of any specific mechanism of inheritance or specific
cause of differential fitness, even though both heredity and selection must be
understood as causal. Darwin's theory is about a complex causal process, not
merely a description of effects or patterns. Moreover, fitness has been widely
interpreted as a property which supervenes on the physical properties of
organisms, so it would have been futile for Darwin to try to supply causes of
differential fitness that operate in general (Sober 1984; Rosenberg 1985). The
specific means of interaction between organisms and their selective
environments is less important in establishing the nature of natural selection as a
process than the fact that there is interaction of a certain sort.10 Although
Darwin clearly intended his theory of natural selection to describe causal
powers of nature, as suggested by the explicit analogy to man's power in the
10 On the concept of selective environment, see Brandon (1990).
11 Controversy and confusion over four theses provide some evidence of this difficulty: (1) that
biological species are individuals (Ghiselin 1974; Hull 1975, 1980, 1988); (2) that plant stems
individuate plant organisms (Harper 1977); (3) that standard evolutionary models cannot explain the
origin of organismality (Buss 1987); (4) that organisms are the product of symbiogenesis (Margulis
and Sagan 1986).
12 Ultimately, I think Harper's distinction is flawed for the very same reason that the abstraction
strategy described below is flawed: both rely on mechanisms that by their nature operate only at a
single level in virtue of structural and material properties peculiar to entities at that level. Lewontin
(pers. comm.) characterizes the flaw differently. He suggests that the Darwinian concept of a
variational theory for ensembles of organisms may be at fault because it forces us to count
individuals rather than to measure continuous quantities such as amount of matter or energy, as in
certain styles of ecological research.
13 The discussion in Wade (1978) and Wimsatt (1980) of how "migrant pool" models of group
selection turn out to provide an analog of blending rather than Mendelian inheritance at the
organismal level is a revealing example of the systematic biases inherent in the heuristic use of
mathematical models for generalizing evolutionary theory, a subject Wimsatt has treated extensively.
14 When evolutionists say that one has to "know the biology" or "know the organisms" to do the
evolutionary theory, they refer to a tacit body of knowledge that includes a variety of reproductive
and ontogenetic mechanisms, ecological factors relevant to the generation of selective pressures,
facts about population size and distribution, and a wealth of other facts that all play a role in
determining whether something should count as an entity at a level. In a vein similar to my point
here, Elliott Sober (1988) argues persuasively for a role for background theory in the confirmation
of phylogenetic hypotheses. But more than this, he argues that claims that appear ontologically
neutral because they appeal to a purely methodological principle of parsimony turn out to hide
important empirical background theories or knowledge. Nevertheless, Sober's argument concerns
only the epistemological status of parsimony claims and of methods of phylogenetic inference. My
argument is concerned with the metaphysical standing of evolutionary theories resulting from a
certain strategy of theory construction, not only with the epistemology of evolutionary inferences.
15 At least, proponents have not always been clear. Wimsatt (1980), for example, criticizes genic
selection in ways that suggest that chunks of genome larger than single genes may be units of
selection. Because of his favorable discussion of Lewontin's (1974a) continuous chromosome
model, some have interpreted Wimsatt's argument as implying that there might be units of selection
up to the limit of the entire genome. But if the genome is the largest unit, this interpretation goes,
Wimsatt must have had Dawkins's replicator hierarchy in mind rather than an interactor hierarchy,
as others have supposed him to have been discussing. Brandon (1982, 1990) and Hull (1980, 1981)
have argued, for example, that Wimsatt's analysis confounds discussion of replicators and
interactors. In his defense, Wimsatt (1981) gives an alternative interpretation of the "confounding"
in which the two questions are taken to be intertwined in virtue of the dual role of genes as both
autocatalytic and heterocatalytic. Wimsatt urges attention to the latter function of genes and an
analysis of the hierarchy of levels in terms of generative structures as an alternative means to
disentangle the concepts (see Wimsatt 1986). Moreover, Wimsatt (1981, table 3, p. 164) clearly
indicates that he includes entities at levels higher than the genome.
16 This is not an accident: genetics as a discipline made it a virtue in the first half of the twentieth
century to isolate questions of hereditary transmission from those of embryological development.
These now separate domains must be put back together to interpret the relevant background
assumptions. Nevertheless, there is irony here, since Lewontin (1974a, 1974b) has given one of the
most penetrating critiques of acausal approaches, drawing a firm distinction between analysis of
variance and analysis of causes. But the causal role of Mendelian principles which Lewontin
champions, as opposed to the phenomenological, acausal analysis in terms of quantitative genetic
principles, does not address the core problem of causal analysis of evolution: Mendelism is only
part of the required causal story of heredity/development.

4. Dawkins-Hull Generalization-by-Abstraction

A second abstraction strategy for generalizing neo-Darwinism makes a
fundamental ontological shift. It replaces the traditional biological hierarchy as
a framework within which to formulate the theory. This strategy abstracts
certain functional roles of entities (and their components) from a description of
the evolutionary process at the levels of organisms and genes. The abstract
entities thus characterized support an ontology of biological individuals and
classes, leaving open the empirical question of whether entities at the levels of
the traditional biological hierarchy serve as units of selection.
17 Hull (1980, 1981, 1988) argues that a coherent theory of evolution will require that the entities
which are units of selection or of evolution be individuals of a certain historical sort. He points out
that on this view group selection is incoherent, but that at least some of the entities biologists refer
to as "groups" actually function as evolutionary individuals.
18 As I will argue below, there are several versions of Weismannism, only one of which expresses
the views of Weismann himself, which must be distinguished in order to press the objections I make
to the second abstraction strategy (see also Griesemer and Wimsatt (1989)). The version operative
in the second strategy, "molecular Weismannism", rests as much on the "central dogma" of
molecular genetics, with its emphasis on the flow of genetic information (Crick 1958, 1970; cf.
Watson et al. 1987), as it does on anything Weismann said about the material basis of heredity. See
also Maynard Smith (1975).
19
Dawkins (1976, 1978, 1982a, 1982b). For present purposes I will set to one side Dawkins's claim that there can be cultural units of selection ("memes") as well as biological units. Since I am concerned here with the form of the generalization strategy, consideration of non-biological entities would complicate but not add to my analysis.
20
See Wimsatt (1980) for an analysis and criticism of the "bookkeeping" argument.
74 James R. Griesemer
claim Dawkins had softened his position and even that he was becoming pluralistic. It nevertheless seems to me that Dawkins's widely noted sea change concerns only his perspective on bookkeeping conventions; his conclusions about the causal process were as radical in 1982 as they were in 1976.
The main idea is that genes pass on their relevant structure, the nucleotide sequences that code genetic information, to copies in the process of DNA replication. Organisms serve as vehicles, in which the genes ride, whose differential survival and reproduction biases the distribution of gene copies of different structural types in subsequent generations. Genes, in the form of copies, exhibit the properties of longevity, fecundity, and fidelity. According
to Dawkins,
Evolution results from the differential survival of replicators. Genes are replicators;
organisms and groups of organisms are not replicators, they are vehicles in which
replicators travel about. Vehicle selection is the process by which some vehicles are
more successful than other vehicles in ensuring the survival of their replicators
(Dawkins 1982b, p. 162 of 1984 reprint).
In 1978, Dawkins defined "replicator" as follows:
We may define a replicator as any entity in the universe which interacts with its
world, including other replicators, in such a way that copies of itself are made. A
corollary of the definition is that at least some of these copies, in their turn, serve as
replicators, so that a replicator is, at least potentially, an ancestor of an indefinitely
long line of identical descendant replicators (Dawkins 1978, p. 67, as quoted in Hull
1981, p. 33).
Dawkins thus distinguishes replicators and vehicles in terms of a role in
evolution the former play that the latter do not. The evolutionary role of genes is
to bear information that can be passed on in replication. The role of organisms is to act as agents of propagation, but since some organisms propagate better than others, they serve to make the replication of the genes they carry differential. Genes only
indirectly cause the biasing of their representation in future generations through
their role in the construction of organisms. The structure of organisms is only
indirectly transmitted to future generations through its role in perpetuating the
genes that caused the construction of the organism. The propagation of
information by transmission of structure and selective bias are the two main
functions of entities in the evolutionary process.
The phenomena of segregation and recombination during mitosis and
meiosis cause genes occurring together in the same chromosome or cell
sometimes to be separated from one another. Dawkins, like Williams before
him, recognized that the structure of the genetic material of an organism as a
whole, its genome, is unlikely to survive intact even for one generation, let
alone through the many generations required for significant evolutionary
adaptation. Therefore, not only do traits of organisms (components of phenotype) not get transmitted directly from parent to offspring, but the material
complexes of genes that collectively determine such traits do not get transmitted
as intact collectives either. To acknowledge this fact, Williams redefined the gene in functional terms as "that which segregates and recombines with appreciable frequency," i.e., as the maximal chunks of genetic material that survive the segregation and recombination processes (Williams 1966, p. 24).
Segregation entails that the maximal chunk will typically be smaller than the
genome as a whole. Recombination entails that the maximal chunk will
typically be smaller than a single chromosome. Recombination can separate
parts of coding sequences, so the maximal chunks may even be smaller than
(coding) genes. Since the material substance of even single whole genes cannot
be relevant to evolution, Dawkins concludes that it is the informational state,
rather than the substance, of the gene that is relevant.
More significantly, as Hull emphasizes, Williams also provided a more
specialized, functional definition of evolutionary genes in terms of their role
in the selection process directly: "In evolutionary theory, a gene could be defined as any hereditary information for which there is a favorable or unfavorable selection bias equal to several or many times its rate of endogenous change."21 This definition implies that evolutionary genes are information and
that they only exist at those times when the conditions mentioned in the
definition hold. The main significance of this definition is to detach the concept
of gene that is relevant to evolutionary theory from any basis in the material and
structural properties of molecular genes and to attach it to the process of
selection. (There cannot, for example, be such genes undergoing evolution by
drift since by definition they could not meet the conditions.) While it is certainly
the case that such properties are relevant to explaining how genes can function
as hereditary information, Williams's definition allows evolutionary theory to refer to anything which functions in that way without requiring that it have the material or structural properties of molecular genes. Evolutionary genes are an abstraction to pure function; they are information-bearing states. The simplification of the generalized evolutionary theory built by Dawkins and Hull upon the
foundation of evolutionary genes clarifies the ontology of the theory, but it has a
price: instead of selection wandering from level to level in the organizational
hierarchy, evolutionary genes wander in and out of existence as the selective
and genetic environments change. Further implications of this definition will be
explored in the next section.
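Williams's condition is quantitative: a stretch of hereditary information counts as an evolutionary gene only while its selection bias exceeds its rate of endogenous change by some multiple. The following minimal sketch illustrates this threshold reading; the function name, the multiplier k, and the numbers are my own illustrative assumptions, not Williams's formalism:

```python
def is_evolutionary_gene(selection_bias: float,
                         endogenous_change_rate: float,
                         k: float = 10.0) -> bool:
    """Illustrative reading of Williams's condition: the (favorable or
    unfavorable) selection bias must be "several or many times" (here,
    k times) the rate of endogenous change, e.g., mutation."""
    return abs(selection_bias) >= k * endogenous_change_rate

# The same hereditary material can satisfy the condition in one selective
# environment and fail it in another, so the evolutionary gene "exists"
# only while the condition holds:
mutation_rate = 1e-6
print(is_evolutionary_gene(1e-3, mutation_rate))  # strong selection: True
print(is_evolutionary_gene(1e-7, mutation_rate))  # near-neutral drift: False
```

The second call makes the parenthetical point in the text concrete: material under drift fails the condition by definition, so there is no evolutionary gene there to undergo evolution.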
The death of the (multicellular) body each generation implies that neither
organisms nor their genomes persist long enough to be the beneficiaries of
adaptation. Socrates's nose died with Socrates, but the genes which caused his
21
Williams (1966), p. 25. Cf. Hull (1988), ch. 11. Wimsatt (1981) explores this functional definition
through an analysis of segregation analogues for entities at higher levels of organization.
nose to be distinctive may be with us still through their persistence in the form
of copies organized into a genic lineage. And persistence is quite different in
character for acellular organisms (those consisting of a single cell); when a cell
divides, all the parts of the parent organism persist but the parent does not. The
background theory to the strategy is that the specific material nature and
structure of the substances that carry genetic information is inessential to the
analysis of something's functioning as genetic material. Williams, Dawkins, and
Hull abstract from mechanism to produce their various theories of the
evolutionary gene or replicator. Our problem, then, is to analyze the notion of "persistence in the form of copies," which underlies these concepts.
As Hull reminds us, material genes no more persist in populations than do
material organisms (see especially Hull 1980, 1981, 1988). In order to
distinguish that which persists from that which does not, Williams and Dawkins
define genes in abstract, functional terms: the information coded in genes that persists in the form of copies (tokens with similar structure due to descent) is what matters in evolution, not the persistence of any given items of matter. The
structure of a gene persists, has a material embodiment or avatar, just as long as
one or more of its copies exists. Dawkins thus argues that the definitive answer
to the question "What are adaptations for the good of?" is "genes." Adaptations must be for the benefit of genes, not organisms, because genes are the only biological entities with sufficient longevity, fecundity, and fidelity to reap the benefits of evolution.22 Dawkins's challenge is to make this seem more than a hollow victory, since his genes are abstract entities and it is not clear what it means for an abstraction to receive a benefit.
We may speak, Dawkins concludes, of selection as a causal process
operating on organisms, but since they cannot be the beneficiaries of the
evolutionary process, such talk is vicarious when it comes to explaining
adaptations. Wimsatt points out that Dawkins's argument is fallacious, turning on an equivocation between, say, Socrates as an individual and Socrates as a member of the class of instantiations of a phenotype: Socrates's phenotoken was
killed by the hemlock, but his phenotype may well live on. Wimsatt continues,
If evolution had to depend upon the passing on of gene-tokens, it could not have
happened. Genotokens and phenotokens are not inherited, but genotypes and
phenotypes may be! Many of the remarks of modern evolutionists on the relative
significance of genotype and phenotype for evolution are wrong as a result of the
failure to make this distinction. In particular, if the phenotype can itself be inherited
or passed on, it need not be regarded merely as a means of passing on its genes or
genotype (Wimsatt 1981, n. 2; emphasis in original).
To avoid Dawkins's fallacy, Hull focuses on functional rather than structural
differences between the evolutionary roles of genes and organisms.
22
For an excellent critique of Dawkins's analysis of evolutionary benefit, see Lloyd (2001).
23
I shall not dispute Hull's point here because, although I think his insistence poses significant
problems for his process ontology, he considers this to be a terminological issue upon which very
little turns (Hull, pers. comm.).
24
Grene (1989, p. 68) points out that Dawkins chose the terms "replicator" and "vehicle" to differentiate the primary role of genes in evolution from the derivative role of their bearers. Brandon (1985, p. 84) concurs and finds this emphasis sufficient to prefer Hull's terminology. Dawkins's language serves his rhetorical purpose well, which is to persuade biologists that the question of replicator persistence (rather than vehicle survival) is the central question of evolutionary biology. The language reflects the imagery of Dawkins's earlier book, The Selfish Gene (1976), in which
us to search for the physical entities that correspond to them, even though Dawkins's theory is an abstract one of pure function and process. What matters in replication, as Hull glosses Dawkins, is retention of structure through descent. Dawkins's "fidelity" is copying-fidelity and "longevity" is longevity-in-the-form-of-copies: "neither material identity nor extensive material overlap is necessary for copying-fidelity . . ." (Hull 1981, p. 31). While material overlap is
not necessary for an entity to satisfy the conditions of replicatorhood, I claim
that material overlap is a necessary condition for a system to count as biological,
and thus that all cases of biological lineages of replicators include material
overlap as a property. Just how extensive the overlap must be depends on how
the concept of information is analyzed. I think it is more substantial than Hull
and Dawkins suggest because genetic information does not reside in the
material genes but in the relation between genes and cytoplasm, between signals
and coding system.
My complaint with Dawkins's theory is not that it fails to clarify functional roles in the evolutionary process; it does. But it does not provide resources to
identify empirically the physical avatars of his functional entities. We know that
the informational genes are tied to matter and structure, but if evolutionary theory, to be general enough to cover cultural and conceptual change, must be devoid of all reference to concrete mechanism, then it cannot follow from the theory, for example, that genes are inside organisms or are even parts of organisms, as Dawkins's language suggests. Strictly, only the correlations between replicator
and vehicle due to causal connections of a completely unspecified sort can be
implied by such a theory. Striving to get matter and specific structure out of the
theory in order to make it apply to immaterial realms may thus leave it bankrupt
as an account of causal connection for the material, biological cases.
Hull clarified the distinction between replicators and vehicles by noting that
Dawkins mixed together two notions of interaction. Interaction with the world in
such a way that replication of copies of replicators is differential is a causal
power distinct from the interaction involved in copy-making per se. DNA
influences the former mode of interaction, albeit somewhat indirectly, via its
heterocatalytic functions of coding for the sequence of amino acids in proteins
and regulating the temporal and spatial expression of coding genes in the
developing organism. Proteins play vital roles in the metabolism and structure
of the cellular vehicles that in turn interact with the outside world. Since DNA
molecules are inside organisms, when an organism's interactions with the
external world cause its chances of survival and reproduction to be different
from those of other types of organisms, the probability that the DNA molecules
genes are depicted as homunculi riding around in lumbering organismal robots. Hull's term "interactor" is silent on the relative importance of the two kinds of functionally characterized abstract entities, and Hull clearly emphasizes that both functions are logically required for evolution to occur.
25
Bibliographic caution is required in interpreting Hull's view. Hull's 1980 paper was reprinted in Hull (1989), but not verbatim. Hull (1980) and Hull (1981) include "directness" as a defining property of replicators and interactors. This language was dropped from Hull (1988). Hull (1989) preserves the modified 1988 analysis rather than the exact text of the 1980 paper reprinted. Thus, despite the publication dates of the various essays, Hull consistently used "directness" as a criterion in conjunction with passing on structure in the early 1980s and dropped it from the definitions in the late 1980s. I argue below that the reason Hull dropped directness is that it is a contingent property of the particular material mechanisms by which DNA and organisms operate, and therefore he did not consider it to be a proper part of the general analysis of selection processes.
Both replicators and interactors exhibit structure; the difference between them
concerns the directness of transmission. Genes pass on their structure in about as
direct a fashion as can occur in the material world. But entities more inclusive than
genes possess structure and can pass it on largely intact. . . . We part company most
noticeably in cases of sexual reproduction between genetically heterogeneous
organisms. Dawkins would argue that only those segments of the genetic material
that remained undisturbed can count as replicators, while I see no reason not to
consider the organisms themselves replicators if the parents and offspring are
sufficiently similar to each other. Genes tend to be the entities that pass on their
structure most directly, while they interact with ever more global environments with
decreasing directness. Other, more inclusive entities interact directly with their
environments but tend to pass on their structure, if at all, more and more indirectly
(Hull 1981, p. 34).
Hull's abstraction problem differs slightly from Dawkins's because of his emphasis on replicator function rather than on replicator structure, that is, on direct transmission of structure rather than on the nature of copies. Hull's turn away from copying in the early 1980s has remarkable and subtle ramifications. His return to it in the late 1980s is equally remarkable. He writes,
In my definition, a replicator need only pass on its structure largely intact. Thus
entities more inclusive than genomes might be able to function as replicators. As I
argue later, they seldom if ever do. The relevant factor is not retention of structure
but the directness of transmission. Replicators replicate themselves directly but
interact with increasingly inclusive environments only indirectly. Interactors interact
with their effective environments directly but usually replicate themselves only
indirectly (Hull 1980, p. 319. For criticism, see Wimsatt 1981, n. 4).
This seems intuitive enough if we hold to standard idealizations about
typical genes and phenotypes. Replicators seem to replicate directly because there aren't any things intervening between them and the copies they make. DNA strands directly contact others through hydrogen bonds. What could be more direct? Interactors seem to interact directly because there aren't any events intervening between the selection and the change in phenotype distribution.
There is nothing between the teeth of the lion (agent of selection) and the
diseased zebra it brings down, nor between the bringing down and the change in
distribution of zebra characteristics. The difficulty is that Hull never explains what counts as transmission, so the analysis of directness remains incomplete. Indeed, since transmission is an aspect of mechanism, Hull has removed from his theory all the resources needed to deal with it. Dawkins's copying metaphor, though flawed, permitted a simple mechanistic interpretation of transmission: there is
physico-chemical contact between elements of the original (template) and the
copy (daughter strand) via hydrogen bonding and enzyme-mediated ligation
and, as a result, the daughter strand takes on a certain structure.
26
It is important to note that this is not the same as Hull's claim that it is possible for one and the
same entity type in the traditional biological hierarchy to be both replicator and interactor, e.g.,
paramecia that reproduce by dividing their entire body in two. My claim is that because of the
problems in the concept of directness, the view should lead Hull to think that properties of
interactors will be sufficient to count them as replicators too, thus defeating the distinction.
seems even less workable for hierarchical evolutionary theory because we know
even less about causal structure at other levels of organization.
The directness of DNA replication seems to work as well as it does as a
paradigm because the sequence similarity of ancestor and descendant nucleic
acids is produced by a nearly universal chemical mechanism. The features of the
mechanism are supposed to supply the interpretation of directness and
transmission. But since descent, not sequence similarity, is the basis of Hull's view, it is not clear what should serve instead of the chemical mechanism to support an interpretation of direct transmission. Directness of transmission cannot do the work of the copying metaphor without some further analysis: a definition of replication is conspicuously lacking in Hull's analysis.
"Replication" will function as a primitive term in Hull's system until it is given a description that is independent of evolutionary considerations. Hull's theory is either radically incomplete, leaving its generality open to doubt, or else it is thrown upon the mercy of Dawkins's analysis of copying, which has substantial problems of its own.
These problems arise because there is as yet no clear and distinct notion of
the molecular gene which can resolve claims about sequences when they
function as evolutionary genes. We have only the claim that the passing on of
structure is relevant to the role of replicator, not any suggestion about the range
of mechanisms of structure transmissions that count as hereditary information.
Despite the distinction between the several concepts of the gene, the
informational turn is integral to them all, and because of it they are not easily
explicated independently of one another.27 It is harmless enough to talk about
sequence copying as the consequence of replication in the standard molecular
genetic context, but that leaves the structure of the replication process intended
to serve in the general theory unaccounted for. As Hull notes, it is critical that Dawkins's notion of copying identify copying as a process involving descent relations and not mere similarity.28 But the foregoing remarks are designed to
undermine the impression that simple appeals to what happens to DNA can
legitimately serve as the foundation for the causal interpretation of the
evolutionary process of replication in abstraction from any particularly
structured matter. One cannot have the ontological simplification of pure,
abstract functionalism and also appeal to concrete mechanisms whenever it
suits.
Moreover, even if the copying metaphor did account for relationships among
sequences, it would not explicate the relationship between replication and
reproduction that I claim is needed to generalize evolutionary theory. We know,
for example, in a rough way what is involved in claiming that groups of
27
For a philosophical analysis of these different senses of "gene," see Kitcher (1982).
28
See especially the treatment in Hull (1988), ch. 11.
organisms reproduce; we do not know what it means for them to replicate unless
that comes to the same thing as reproduction. Copying does not adequately
specify the material conditions of life processes: DNA replication and
reproduction are flows of matter, not just of information. I think that abstraction
from matter capable of reproduction supplies a faulty base for generalizing
evolutionary theory. Indeed, it is a wonder how transmission of information
could be understood as an explication of replication as an ordinary causal
process when the carriers of information are functionally defined, abstract
objects.
If DNA replication were fully conservative rather than semi-conservative,
then copying would have served as a more adequate analysis of replication,
since no transfer of parts from original to copy would have been involved and
information would be the only thing left to track from generation to
generation.29 But then replication would clearly not be analogous to
reproduction and, if I am right that reproduction is the proper basis for
evolutionary generalization, that would have made replication an even weaker
basis.
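The contrast drawn here can be made concrete by tracking strand identity. In semi-conservative replication each daughter duplex retains one parental strand, so matter (not just information) crosses generations; in a hypothetical fully conservative scheme the copy would share sequence but no material part with the original. A toy sketch, in which the data representation and function names are my own illustrative assumptions:

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count()  # unique identity for each physical strand ("gene-token")

@dataclass(frozen=True)
class Strand:
    seq: str
    uid: int = field(default_factory=lambda: next(_ids))  # numerical identity

def complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))

def semiconservative(duplex):
    """Each daughter duplex contains one numerically identical parental
    strand: material overlap between parent and offspring duplexes."""
    a, b = duplex
    return (a, Strand(complement(a.seq))), (b, Strand(complement(b.seq)))

def conservative(duplex):
    """Hypothetical fully conservative copying: the copy resembles the
    original in sequence (information) but shares no material part."""
    a, b = duplex
    return duplex, (Strand(a.seq), Strand(b.seq))

a = Strand("ACGT"); b = Strand(complement(a.seq))
d1, d2 = semiconservative((a, b))
print(a in d1 or a in d2)      # True: a parental strand persists materially

_, copy = conservative((a, b))
print(a in copy or b in copy)  # False: only informational resemblance
```

Because `Strand` equality includes the `uid`, two strands with the same sequence but distinct identity count as numerically distinct, which is exactly the respect in which conservative copying would leave only information to track from generation to generation.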
Hull's analysis in terms of directness fails to break completely with the
traditional hierarchy in so far as it explicates the concept of directness in terms
of compositional levels. A gene (lower compositional level) inside a cell (higher
compositional level) interacts indirectly with the environment outside the cell,
but relatively directly with the DNA molecules for which it serves as a template
in replication (same compositional level). Likewise, phenotypic traits of
organisms (higher level) are passed to offspring only indirectly through the
activities of the genes inside (lower level). Dawkins's use of the term "vehicle" appears to commit him even more strongly to acceptance of the traditional hierarchy than does Hull's more neutral term "interactor." But the commonalities between their analyses undermine the claims of both Hull and Dawkins to be elaborating an alternative conceptual scheme to Lewontin's. Thus, the directness criterion seems to violate the aim of Hull's approach to generalization by abstraction.
In the late 1980s, Hull modified his analysis of replicators and interactors,
dropping directness as the degree-property distinguishing the two roles and
emphasizing instead the degree of intactness of structure through the respective
29
I haven't yet mentioned viruses with single-stranded RNA genetic material; copying would seem an apt description of them since no transfer of parts occurs. (The production of cDNA, which incorporates into the host genome and then serves as a template for the production of new viral RNAs, acts as a filter for the original RNA material.) However, since all cellular life is DNA-based, I am more inclined to think that RNA viruses should be treated as a special case of copying which is neither replication nor reproduction, rather than as (the only) clear cases which satisfy Dawkins's analysis. This view does not entail mechanism- or level-specific assumptions and so
does not put the project of generalizing evolutionary theory at risk.
function of selective interaction. (One might think they are linked by the definition of "interactor" because of the reference to replication, but this process is undefined in Hull's system.) This leaves open the possibility of some link other than the compositional arrangement of the organizational hierarchy, but at the same time it leaves unclear what counts as relevant transmission of structure. By returning to copying, Hull must face Dawkins's problem that the theory has no resources to specify the relevance relation between replicators and interactors other than bare cause-effect. The particulars of this relation in
biology come from theories of development, including gene expression; they are
not supplied by the theory of the informational gene. The directness criterion
clearly focused attention on those aspects of DNA and organisms which count
as relevant transmission, although it provided no analysis of transmission.
Hull is very clear about his generalization strategy, but steering between
Scylla and Charybdis is tricky. The following passage suggests that he decided
the directness criterion tied his concept of replicator too closely to the material
nature of the DNA mechanism:
In this chapter I adopt a different strategy [than Lewontin's]. I define the entities that
function in the evolutionary process in terms of the process itself, without referring to
any particular level of organization. Any entities that turn out to have the relevant
characteristics belong to the same evolutionary kind. Entities that perform the same
function in the evolutionary process are treated as being the same, regardless of the
level of organization they happen to exhibit. Generalizations about the evolutionary
process are then couched in terms of these kinds. The result is increased simplicity
and coherence.
. . . the benefits [of this strategy] are worth the price [in damage to common sense].
One benefit is that, once properly reformulated, evolutionary theory applies equally
to all sorts of organisms: prokaryotes and eukaryotes, sexual and asexual organisms,
plants and animals alike.
A second benefit is that the analysis of selection processes I present is sufficiently
general so as to apply to sociocultural change as well (Hull 1988, p. 402).
Rather than analyze directness, Hull shifts back to copying to correct a tacit
dependency on the traditional organizational hierarchy, in the sense that
directness is a contingent property of the way molecular genes replicate. While
it is true that directness is contingent, it is nevertheless a very important
property: the whole domain of biological systems (as opposed to merely
physico-chemical systems on the one hand and human social and psychological
systems on the other) can be described as the domain of lineages formed by
descent relations. A central, and universal, fact about biological descent is that it
includes the material overlap of the entities that form biological lineages. By
expunging the directness condition in his later analysis, Hull has left unclear
what distinguishes biological lineages and descent relations from any other sort
of cause-effect change and causal relation. Hull's general analysis of selection
processes leaves us without the tools to analyze the empirical differences among
selective systems along with their similarities. While there is no necessity in
material overlap, nor any necessity in the degree of relative directness with
which one DNA molecule gives rise to another, directness of a qualitative sort
does follow from the material overlap property of biological lineages (and as I
will argue in the next section, it also follows that organismal reproduction,
which also produces lineages with material overlap, is direct). Thus I think Hull
was mistaken to turn away from his earlier analysis. Rather than reintroduce
copying (in the sense of transmission of structure), which is all we have in the
vacuum left by the repeal of directness, Hull should instead have rejected pure
functionalism in favor of some explicit handling of the facts of mechanism. This is precisely what an empirical background theory is intended to do.
Hull's rejection of extensive material overlap as a condition on the
evolutionary process only reveals the conventionality of the biological domain
within the domain of physical systems, not the irrelevance of material overlap to
evolution. The historical traditions and conventions by which some physical
systems have been singled out as biological systems are arbitrary choices with
respect to the broadened domain of generalized evolutionary theory. The
attempt to produce a general analysis of selection processes may therefore
require that the artificial domain of biology be scrapped. It may seem
appropriate to Hull to do so, since his goal is an analysis general enough to
include sociocultural evolution, but it is not necessarily the wisest course of
action to ignore a universal property of descent mechanisms just because current
theory suggests it is contingent.
Unfortunately, Hull's modification in terms of intactness of structure
through replication is not entirely successful in removing other difficulties
either. While he has eliminated a dependency on the traditional organizational
hierarchy, Hull has not removed a dependency on Weismannism, which
functions as the background theory that plays the a priori ontological role in the
Dawkins-Hull strategy that the traditional hierarchy did for the Lewontin
strategy.
sequences. Copies are not, in general, things made out of the things of which
they are copies. Rather, copies are things which resemble originals because a
physical process induced a relevantly similar or common pattern in numerically
distinct materials.
The word "copy" nicely trades on ambiguities between noun and verb, relation and process. If the ambiguity is allowed to stand, it will appear that the theory of replicators avoids reference to matter and mechanism when in fact it does not. To succeed, it must detach the copy relation from the copy process and refer only to the former. Otherwise, a further analysis is required to justify the theory's ontological pretenses. The unsolved logical problem is twofold: is
descent a subrelation of the copy relation and are all descent processes copy
processes? I think the answer to these questions cannot be found unless an
analysis of reproduction is attempted.
In copying, genealogical relationship does not include material overlap: the original is part of neither the material nor the efficient cause of the copy. Thus copying captures a weaker, more abstract notion of resemblance due to cause-effect relationship, rather than the stronger, more concrete notion of resemblance due to descent with material overlap, which obtains in biological cases. Think of
cases of artistic copying: an artist sits before a picture in a museum and paints
another picture which resembles the one on the wall. We may speak as though the picture on the wall "gave rise to" or is an "ancestor" of the one on the easel, but no material part of the one on the wall is a part of the one on the easel and
the artist, not the picture on the wall, is the efficient cause. In contrast, new
double helices of DNA are not made from materials that are entirely
numerically distinct form old double helices, and DNA is (part of) the efficient
cause of DNA.
One might say that the relation of original to copy holds of evolutionary
genes, derivatively, whenever material genes function as evolutionary genes, but
the copying process is not the causal process that causes the copy relation to
hold among evolutionary genes: functioning entails process, and consideration of
formal causes alone cannot explain process. Descent does the work in
replication that intention does for artistic or human acts of copying. To mix up
relation and process in this way is to invite confusion. Because we ourselves act
with intention, it is all too easy to fill in the black box with mechanism while
claiming to consider only functions. In a sense, the last remnant of materialism
infecting Dawkins's and Hull's abstract theories is the notion that DNA
"carries" heritable information. It is this information which is supposedly
"copied" in the process of replication. But I think they have paid insufficient
attention to the concept of information or to the way in which it is carried. A
proper understanding of heritable information requires detailed consideration of
the matter out of which replicators are made as well as of the mechanisms by
which information is transmitted. From this it does not follow that there
90 James R. Griesemer
be considered. Depending on how such facts are described, it may not even turn
out that DNA is replicated relatively more directly than organisms are
reproduced, despite Hull's and Dawkins's strong intuitions that this must be the
case. The problem of description in the case of DNA is that the structure of a
DNA strand is not copied by the process of strand synthesis from the template.
The new DNA strand is highly similar in sequence to the strand complementary
to the template, not to the strand which serves as the template. Indeed, a strand
and its complement never have a single nucleotide in common at any given
location (except in special cases such as palindromes and highly repetitive
sequences). Because the causal interactions which produce the new strand are
not interactions with the complementing strand it comes to resemble, these
interactions do not constitute copying. If we count as the copying process the
entire complex sequence of events, black-boxed by Dawkins, involving dozens
of enzymes, magnesium ions, free nucleotide triphosphates, and so forth, leading
from one strand to a complement-of-a-complement strand having the same
nucleotide sequence, then there will be problems interpreting this causal process
as measurably more "direct" than, and hence as distinguishable from, the
allegedly indirect way in which interactors (such as organisms, on Dawkins's
analysis) might be taken to cause their own replication.
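The complement-of-a-complement point can be put in a few lines of code. This is my own minimal sketch, not anything in the text: it models Watson-Crick pairing as a simple base substitution and checks that a new strand matches no position of its template, while only the complement of the complement recovers the template's sequence.

```python
# A minimal sketch (not the author's) of why strand synthesis does not
# "copy" the template: the product resembles the template's complement.

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}  # Watson-Crick base pairing

def complement(strand: str) -> str:
    """Base-by-base complement of a DNA strand (5'/3' orientation ignored)."""
    return "".join(PAIR[base] for base in strand)

template = "ATGGCGTAC"
new_strand = complement(template)  # what synthesis from the template produces

# The new strand agrees with the template at no position...
assert all(a != b for a, b in zip(template, new_strand))
# ...but a complement-of-a-complement strand has the template's sequence.
assert complement(new_strand) == template
```

The second assertion is the sense in which resemblance to the original is recovered only after two rounds of synthesis, not by any single causal interaction with the strand resembled.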
An objection might be raised in light of the fact that DNA replication is only
contingently semi-conservative. Since it might have been fully conservative,
yielding the parent double helix and an offspring double helix rather than two
hybrids with one parent and one offspring strand each, my objection is to a
mere contingent fact about mechanism which is nicely avoided by the Dawkins-
Hull abstraction to pure function. There would nevertheless be a flow of
information from parent double helix to offspring double helix. Therefore, the
semi-conservative nature of actual DNA replication, with its material overlap, is
irrelevant. Worse still, it can be claimed that processes we commonly call
copying processes do indeed involve material overlap, such as in letterpress
printing or mimeographing, where ink is laid on an original and then a sheet of
paper is pressed against the original and ink is physically transferred from
original to copy. Insofar as the ink counts as first a part of the original and then
a part of the copy, such processes satisfy material overlap.30
I have three points to make in reply to these objections. First, semi-
conservative DNA replication is not the only problem of material overlap to
arise for the copying view; another concerns the more fundamental functionalist
move in interpreting the evolutionary gene as information. I argue that there
must still be material overlap for evolutionary genes to exist, but (below the
level of organisms) the material overlap must be in the chemical machinery of
30 These objections have been raised by Paul Teller and Eörs Szathmáry independently, and I thank
them for prodding me into addressing them.
replication which makes it the case that the DNA (or RNA) molecules
transferred from parent to offspring count as bearing information. The chemical
machine must move from one cell to another in order for genetic material to
count as information-bearing. This involves the direct cytoplasmic transfer in
cell division of the enzymes of translation (the aminoacyl-tRNA synthetases)
which make it the case that a sequence of nucleic acids in mRNA counts as
coded information at all. A simple thought experiment shows this. If an accident
of metabolism were to result in all the molecules of tryptophan aminoacyl-
tRNA synthetase ending up in only one of two daughter cells, then the other cell
could not survive. I think it is plausible to say that the reason is that in such a
cellular context, the genetic material is meaningless: that cell would not have a
rich enough genetic code for the DNA to carry information. This material
overlap is not eliminable even in principle, because without it the whole system
cannot be construed as informational. Without the translation machinery, or with
a changed machine, there is either no genetic code at all or there is a different
one. In biology, the code must reside in the cell, not in a table in a molecular
engineer's handbook. This form of material overlap is not required for the flow
of information in cell fusions, e.g., from sperm nucleus to pre-existing egg cell,
but it is required in all cell fissions, including the flow from primordial germ
cell to mature egg cell.
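The thought experiment can be made vivid with a toy model of my own construction (the three-codon table and the translate function are illustrative inventions, not biology or anything in the text): a sequence "means" something only relative to a decoding table, and deleting one entry, the analogue of losing one synthetase, leaves the same sequence undecodable in that context.

```python
# A toy illustration (mine, not the author's): the same mRNA string is
# "coded information" only relative to a codon table; remove one entry
# (one aminoacyl-tRNA synthetase, by analogy) and decoding fails.

CODON_TABLE = {"AUG": "Met", "UGG": "Trp", "UAA": "STOP"}  # tiny fragment

def translate(mrna: str, table: dict) -> list:
    """Decode codons three bases at a time until a STOP entry is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino = table[mrna[i:i + 3]]  # KeyError if the "machinery" is absent
        if amino == "STOP":
            break
        peptide.append(amino)
    return peptide

message = "AUGUGGUAA"
print(translate(message, CODON_TABLE))  # ['Met', 'Trp']

broken = dict(CODON_TABLE)
del broken["UGG"]  # analogue of losing tryptophan's synthetase
try:
    translate(message, broken)
except KeyError:
    print("the same sequence is meaningless in this context")
```

Nothing about the string changed between the two calls; only the context did, which is the point about where the "information" resides.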
In biology, in contrast to the formal engineering theory of information,
there is only a pre-existing sender, but not a pre-existing receiver or
communication channel, nor a pre-arranged agreement on a set of symbols
that serve as a fixed alphabet. Information does not reside in a given
informational macro-molecule but rather in the relation between such
molecules and the molecular contexts which determine an alphabet and
channel.31 In reproduction (whether of DNA molecules or of organisms), the
mechanism that determines the informational context must be transferred along
with the information bearers or signals. For information to flow, in cases like
these where the material circumstances of communication do not pre-exist,
matter must also flow. The analogy to engineers' information misleads because
the engineer sets up the matter (transmitter, receiver, and channel) before any
information flows.
The second point of reply concerns the possible worlds in which DNA
replication is fully conservative. There are two ways to imagine such a world
that have different implications. One way is to imagine that the new DNA
strands are synthesized in the usual semi-conservative way but that, for
unknown reasons, there exist enzymes which come along, unzip the hybrid
daughter double helices, and reanneal the two newly synthesized strands and the
two old parental ones. It is hard to imagine how the extra apparatus could be
favored in evolution, but if such a system existed, it would still support my
31 See Cherry (1966) for a description of the formal requirements of information theory.
The Informational Gene and the Substantial Body 93
claims that DNA replication is anything but direct and that material overlap of
the apparatus would still be required. There would also still exist an
intermediate stage with hybrid molecules, and these would exhibit material
overlap with both the parental molecule and the offspring molecule, so the
detour doesn't present a counterexample.
The other way to imagine fully conservative replication would be
to imagine a world in which each of the nucleotides stereochemically
hydrogen-bonded with its own kind rather than with a complementing
nucleotide, as in the actual system where A usually bonds to T (or U) and G
usually bonds to C. I can imagine such a counterfactual being true, but I cannot
imagine all the implications of it for the rest of biology. For example, even short
repetitive sequences would fold upon themselves. It is not clear that in such a
world the sort of catalytic activity would occur that is now thought crucial to the
evolution of genetic systems in the primitive RNA world. I do not know how to
determine whether there could even exist cells and organisms of the sorts with
which we are familiar in such a world, even if no laws of chemistry were
changed. Therefore, it is idle to raise this fantasy world in objection to my point.
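The folding worry can be sketched computationally. In this rough model of the counterfactual (my construction, assuming like-with-like bonding can be modeled as sequence identity), any repeated subsequence of a single strand could hydrogen-bond with another copy of itself, letting the strand fold back:

```python
# A rough sketch (my own, under the stated assumption) of the "self-pairing"
# world: if each base bonded to its own kind, two identical windows of one
# strand could pair with each other, i.e., the strand could fold on itself.

def self_pairing_foldbacks(strand: str, k: int = 3):
    """Pairs (i, j) of non-overlapping length-k windows of one strand that
    could bond under like-with-like pairing, i.e., that are identical."""
    hits = []
    for i in range(len(strand) - k + 1):
        for j in range(i + k, len(strand) - k + 1):
            if strand[i:i + k] == strand[j:j + k]:
                hits.append((i, j))
    return hits

print(self_pairing_foldbacks("ATGATGCCC"))  # [(0, 3)]: the repeated ATG self-pairs
```

Even this short repeat yields a foldback site, which is why repetitive sequences in such a world would be expected to collapse on themselves.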
The issue raised in the objection appears serious but misses the point of my
argument. I agree that semi-conservative replication of DNA is contingent and
that it would therefore seem to be a fact about the material mechanism from
which Hull should abstract, given his program. My point, however, is that Hull
requires an analysis of the concept of directness and Dawkins requires an
analysis of the concept of copying in order to use them in their explications of
replication. In such an analysis, they must appeal either to accidental features of
DNA replication, to essential features of DNA replication, or to features of
replication that DNA violates. If the third is Hull's route, then the actual world
falsifies his general analysis. If he appeals to what DNA actually does, then
consideration of the facts of semi-conservative replication fails to support the
concept of directness or justify use of the concept of copying. If he appeals to
some "essence" of DNA replication, then it must involve some properties
of DNA replication other than semi-conservation. What are these features? In his
earlier writings, Hull does not tell us. In his later writings, Hull drops the
requirement of directness in favor of the requirement that replicators must pass
on information largely intact via their structure (Hull 1991, p. 42). But no
analysis of information is given. If we rely on the common wisdom about
genetic information, which is an amalgam of folk linguistics and the engineering
theory of communication, we arrive at new problems about the actual nucleic
acid replication mechanisms, problems such as that of saying where genetic
information resides, which the generic term "copying" was originally
supposed to avoid. Thus, the point is not whether Hull (and Dawkins) have the
facts of molecular biology right, but rather whether their generalization strategy
has succeeded in avoiding reference to any such facts. If the above arguments
are sound, it certainly seems that they depend on such facts to explicate their
core notions.
The third point in reply regards the claim that processes commonly called
copying processes can involve material overlap. I doubt that, when sufficient
attention is paid to the physical processes involved, any alleged copying process
will be a counterexample. But if there is such a process, I contend it will be a freakish
example and not at all common. If such were the case, I would, like Hull,
conclude that some cases of common usage must fall by the wayside if
theoretical progress is to be made. At best, such odd cases would lead to a draw:
their anomaly would incline one to decide the matter by imposing a convention
or not to decide it at all. One shouldn't be distracted by the fact that a particular
object is called a "copy": the problem here is not with the copy relation but with
copying as a process. The analogies of relation distract us from the disanalogies
of process. That every copy of a newspaper, for example, carries the same
information traces to a common material cause: the type in the press that is
repeatedly inked and pressed onto paper. Is it really worthwhile, unless we do it
merely in order to prop up the analogy, to say that the ink is part of the original
so as to allow this as a case of material transfer? Moreover, the common cause
structure relating newspapers as copies to an original press plate is different from
that of a lineage caused by descent. Using the word "copy" to refer merely to
the resemblance among things is no improvement in understanding, as Hull
argued. We needn't retain every common usage if it obscures analysis. In this
case, it seems clearer to say that the ink isn't really part of the original, but
rather is a numerically distinct object which is induced to take on the pattern of
the original via the impression of the type. Similarly, free nucleotides are added
to the replication reaction to make a new DNA strand, and they are linked one to
another to make a strand, whereas the blobs of ink are linked only to the
receiving paper, not to one another. Thus letterpress or mimeograph appears to
satisfy the condition of material overlap only if we accept the convention that
the ink is part of the original rather than a third thing mediating between the
original and the blank paper.
To continue the main argument: that there are more chemical events
involved in the copying of an organism than of a gene, and that organismal
reproduction may always require gene replication, do not suffice to show that
organismal reproduction is less direct than gene replication. Since gene
replication and organismal reproduction are arranged hierarchically, these
processes can occur in parallel and the degree of directness measured at one
level may not have to count all the steps at the other level.32 To quantify
32 I will discuss below the point that Weismannism seems to entail that higher-level reproduction is
necessarily less direct than lower-level replication because all the steps of the latter are included
within the steps of the former. My main point will be that this is true of Weismannism because that
doctrine idealizes the organism in such a way that genes and organisms exist sequentially in time as
well as compositionally in structure.
33 See Watson et al. (1987) and Alberts et al. (1989) for reviews.
34 This point is analogous to Beatty's (1982) argument that Mendel's laws are not reducible to
molecular genetics because what explains Mendel's laws is the cellular process of normal meiosis,
which is itself under genetic control. The explanation of that control will be evolutionary, and
evolutionary explanations are not part of molecular genetics. Likewise, replication is a mechanism
that serves the process of reproduction, which in turn makes biological evolution possible. This
mechanism has evolved, as can be gleaned from arguments to the effect that the first replicators
were catalytic RNAs. But the mechanisms that serve reproduction might have been otherwise, and
at other levels of organization they probably are otherwise, so replication cannot be an adequate
basis for analyzing reproduction (and hence evolution).
truly purge their abstraction to function of all traces of reference to matter and
mechanism, or they must find a way to accommodate the facts about mechanism
that still permits generalization. As I will stress below, standard characterizations
of genes in terms of their information content risk inadequacy by ignoring
features of development important to evolution, but this does not mean that
there is no route to a general theory of evolution open to the materialist.
To what seemingly more general properties of replication might one appeal
that do not depend on mechanisms of DNA replication? As I have argued,
mining Dawkins's analysis of copying as a source would merely throw the
problem back to the worry about the weakness of his analogies. The other
source of properties I favor is conceptual analysis of the process of
reproduction. But I haven't given such an analysis, and it is an interesting fact
that while biologists have devoted much effort to the description and
evolutionary analysis of mechanisms of reproduction, they have not paid much
attention to the concept of reproduction itself.35 One way to begin to understand
what is needed is to examine the source of the analysis of replication in the
Dawkins-Hull program evident in their writings: Weismannism.
The fundamental shift away from the Lewontin strategy is to characterize the
entities that function in the evolutionary process directly, without consideration
of traditional hierarchical levels of organization. In order to focus on process,
Hull and Dawkins rely on a background theory of causal relations between
genic and organismic material that explains their capacity to function in the
evolutionary roles of replicators and interactors. Once the evolutionary roles of
genes and organisms are analyzed in terms of the implied causal structure, they
can be abstracted from the specific material and the concrete mechanisms
through which genes and organisms exercise their capacities. By quantifying
35 Since I do not intend to present an analysis of reproduction here, my sweeping statements are
intended only as a sketch of the difficulty. For an interesting attempt to analyze the concept of
reproduction, see Bell (1982). For theoretical work that shows how important it is to develop an
analysis, see Harper (1977), Jackson et al. (1985), and Buss (1987).
36 Weismann (1892), Wilson (1896). See Churchill (1968, 1985, 1987), Maienschein (1987), Mayr
(1982, 1985), and Griesemer and Wimsatt (1989) for historical details.
37 See Griesemer and Wimsatt (1989). On Weismann's change of views from a germ-cell to a germ-
plasm theory in the 1880s, see Churchill's excellent (1987) article.
Fig. 2. The central dogma of molecular genetics, reproduced from Crick (1970), p. 561, fig. 2.
Arrows represent transfers of information. Solid arrows indicate probable transfers and
broken arrows indicate possible transfers. Crick's diagram indicated how he thought
things stood in 1958.
38 Crick (1958, p. 153) expressed the dogma in words:
"The Central Dogma. This states that once information has passed into protein it cannot get out
again. In more detail, the transfer of information from nucleic acid to nucleic acid, or from
nucleic acid to protein may be possible, but transfer from protein to protein, or from protein to
nucleic acid is impossible. Information means here the precise determination of sequence,
either of bases in the nucleic acid or of amino acid residues in the protein."
Crick (1970) expressed the dogma in a diagram (see fig. 2) which reinforced the original
interpretation in light of the recent discovery of reverse transcriptase. For reviews, see Watson et al.
(1987) and Alberts et al. (1989).
Fig. 3. Comparison of a simplified representation of the central dogma of molecular genetics with
a simplified representation of Weismannism, from Maynard Smith (1975), p. 67, fig. 8.
Reproduced by permission of Penguin Books Ltd.
39 Weismann (1892); cf. Griesemer and Wimsatt (1989). Churchill (1987, p. 346) argues that
Weismann held the germ-cell theory in his 1883 essay "On Heredity" (collected in Weismann
(1889)), a view he rejected in the 1892 book.
was responsible for the somatic differentiation of cells carrying the germ-plasm
of the future germ-cells). Weismann vigorously objected to the Jäger-Nussbaum
theory of germ-cell continuity as a violation of his mature views on how cell
differentiation occurred in a wide variety of organisms, a distinction Wilson
failed to include in his various representations of Weismannism.
Fig. 4. Weismann's own representation of his theory of the continuity of the germ-plasm and
discontinuity of the soma, showing 12 cell generations in the development of Rhabditis
nigrovenosa with time extending from the bottom to the top of the diagram. Germ-cells
(Urkeimzellen) appear in the cell generation marked 9. Braces indicate different germ
layers so as to correspond with preceding diagrams in the text (endoderm, Ent;
mesoderm, Mes; ectoderm, Ekt). Reproduced from Weismann (1893), p. 196, fig. 16.
One plausible reason that Weismann's views are not typically discussed in
their entirety in the twentieth-century literature is that this complementing
mosaic theory of development was discredited. This theory explained
differentiation of the soma as the result of the qualitative division of germ-plasm
in the somatic cells, such that terminally differentiated cells exhibited only a
single remaining genetic determinant. While modern biologists accepted Weismann's
40 The aim here is not to criticize Dawkins or Hull for misreading Weismann, but rather to point out
the nature and role of the assumptions they make and to suggest that evolutionary theory would be
better off if Weismann were heeded rather than Weismannism. While Dawkins clearly identifies
himself as a Weismannian (1982a, p. 164), it is no part of Hull's project to reconstruct or even be
faithful to Weismann's views: historical accuracy does not typically play a role in scientific
progress.
41 I am here following Cartwright's distinction between idealization and abstraction. Simply put,
idealizations involve the mental rearrangement of inconvenient features of an object, such as
treating a plane as frictionless, before formalizing a description of it. Abstraction involves taking
certain features or factors out of context altogether. The latter is not a matter of changing any
particular features or properties, but rather of "subtracting, not only the concrete circumstances but
even the material in which the cause is embedded and all that follows from that" (Cartwright 1989,
sec. 5.2, esp. p. 187). The soma is treated ideally as protein in the following sense. Although organisms
are not made only of protein, the presence and arrangement of the other constituents is by and large
caused (and explained) by the metabolic activity of enzymes, which are proteins. Thus it is an
idealization, though a reasonable one for some purposes, to exclude from consideration the
complications due to taking other molecular constituents into account. Ignoring the other
constituents is analogous to treating the plane as frictionless. By contrast, treating DNA sequences,
i.e., genetic information, as germ-plasm involves taking the information content of a molecule out
of all material context. It is not at all clear that it is reasonable to treat organisms as information.
42 For criticisms of the concept of genetic information, see Hanna (1985) and Wimsatt (1986).
the good of. The causal continuity of the germinal material implies that it is
potentially immortal, hence at least a candidate to receive the evolutionary
benefit of an increasingly adapted soma. The second fact underlies Dawkins's
view that the body serves merely as a vehicle in which the evolutionarily
significant entities, the replicators, ride and through which their replication is
made differential (in virtue of the vehicles' having different fitness). The
discontinuity of the soma implies that it cannot be an evolutionary beneficiary
because it fails to persist. The third fact governs the distinction between active
and passive replicators and determines which replicators are the ones whose
perpetuation is differential in virtue of a given interactor's causal interaction
with the external environment. Germ-line replicators may be active in virtue of
their causal role in producing the soma and thus possibly influence the
probability with which they are copied. Although Wilson's diagram does not
directly imply a part-whole relationship between germ-line and soma, this is the
usual interpretation of the relationship between the germ cells and the soma.
Hull's discomfort with Dawkins's definition of replicators in terms of
copying can be interpreted in the light of what Weismannism does and does not
imply. The arrows connecting elements of Wilson's diagram are all causal, but
nothing is implied about the degree of similarity required for the relevant causal
relations to hold. Rather, the degree of similarity will depend on the specific
mechanisms through which the materials of the germ-plasm and the soma
exhibit their causal capacities by the effects of their action. Thus, Hull's focus
on "directness" or on "intact transmission" reflects only the causal structure
exhibited in Wilson's diagram of Weismannism.
The most celebrated consequence of Weismannism, the non-inheritance of
acquired characteristics, is entailed by the causal asymmetry underlying
similarities among interactors (somata) and among replicators (germ-cells). The
resemblance among successive interactors is indirect and a consequence of
mediation by the germ-line. The resemblance among replicators and their copies
is direct and due to the direct causal relation between them. For characteristics
acquired by somatic interactors to be inherited, there would have to be causal
arrows from the soma back to the germ-line. Thus, bare causal roles interpreted
in terms of the directness of transmission of structure or of interaction with the
external environment as specified by Weismannism replace the specific
similarity requirement in Hull's original modification of Dawkins's analysis.
In Hull's later treatment, directness is dropped as a characterization of the
causal quality of the relations in favor of unvarnished reference to causation:
passing on structure "largely intact" in the case of replicators and "causal
interaction" in the case of interactors. These modifications ensure that nothing
more than is contained in the Weismannian structure is read into the
evolutionary analysis. Directness thus seems to turn on features of qualitative
causal structure rather than on any specific mechanisms limited to the materials
43 Consider, e.g., Hull (1975) in light of his subsequent work.
44 Wimsatt (1981) points out that one must minimally consider the trade-off between rates of
replication and degree of stability to account for the transmission of structure (and hence the
heritability of adaptive characters). Since different combinations of replication rates and degrees of
stability can satisfy the same trade-off, there is no single degree of stability that will satisfy the
condition Hull argues for. Direct causal arrows from soma to soma might change the trade-off in
fundamental ways.
the causal analysis of the former can be made independently of the latter:
development figures prominently in how hereditary transmission works.
To summarize, I claim that Weismannism in general and molecular
Weismannism in particular endorse a picture of certain biological causal
relations which serves as a background theory in the generalization-by-
abstraction strategy employed by Dawkins and Hull. This background theory
facilitates abstraction by representing causal structure without the need for
explicit reference to concrete material entities: "G" and "S" or "DNA" and "P"
practically serve as the variables in a theory that quantifies over many potential
material entity types that may serve evolutionary roles. But molecular
Weismannism facilitates abstraction in a way that gets facts of development and
inheritance wrong that are crucial for evolutionary theory. In particular, it gets
facts about the causal relations among somata and between germ-plasm and
soma wrong. Indeed, it misidentifies them as accidental facts about hereditary
transmission of information rather than as essential facts about development,
which in turn explain heredity. Weismann's view and Weismannism both
entail the non-inheritance of acquired characteristics, but for different reasons.
The former entails it because of specific facts about developmental mechanisms
that happen to hold at the levels of genes and organisms; the latter entails it
because of its abstraction of causal structure from material genes and organisms.
James R. Griesemer
Wissenschaftskolleg zu Berlin
Collegium Budapest, and University of California, Davis
jrgriesemer@ucdavis.edu
REFERENCES
Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K., and Watson, J. (1989). Molecular Biology of
the Cell. 2nd ed. New York: Garland Publishing, Inc.
Antonovics, J., Ellstrand, N., and Brandon, R. (1988). Genetic Variation and Environmental
Variation: Expectations and Experiments. In: L. Gottlieb and S. Jain (eds.), Plant Evolutionary
Biology, pp. 275-303. New York: Chapman and Hall.
Asquith, P. and Giere, R., eds. (1981). PSA 1980. East Lansing: Philosophy of Science Association.
Asquith, P. and Nickles, T., eds. (1982). PSA 1982. East Lansing: Philosophy of Science
Association.
Beatty, J. (1982). The Insights and Oversights of Molecular Genetics: The Place of the Evolutionary
Perspective. In: Asquith and Nickles (1982), vol. 1, pp. 341-355.
Bell, G. (1982). The Masterpiece of Nature. Berkeley: University of California Press.
Boyd, R. and Richerson, P. (1985). Culture and the Evolutionary Process. Chicago: University of
Chicago Press.
Brandon, R. (1982). The Levels of Selection. In: Asquith and Nickles (1982), vol. 1, pp. 315-323.
Brandon, R. (1985). Adaptation Explanations: Are Adaptations for the Good of the Replicators or
Interactors? In: D. Depew and B. Weber (eds.), Evolution at a Crossroads, pp. 81-96.
Cambridge, Mass.: The MIT Press.
Brandon, R. (1990). Adaptation and Environment. Princeton: Princeton University Press.
Brandon, R. and Burian, R., eds. (1984). Genes, Organisms, Populations: Controversies over the
Units of Selection. Cambridge, Mass.: The MIT Press.
Buss, L. (1987). The Evolution of Individuality. Princeton: Princeton University Press.
Cartwright, N. (1989). Natures Capacities and their Measurement. New York: Oxford University
Press.
Cartwright, N. and Mendell, H. (1984). What Makes Physics Objects Abstract? In: J. Cushing, C.
Delaney, and G. Gutting (eds.), Science and Reality: Recent Work in the Philosophy of Science,
pp. 134-152. Notre Dame: University of Notre Dame Press.
Cherry, C. (1966). On Human Communication: A Review, A Survey, and A Criticism. 2nd ed.
Cambridge, Mass.: The MIT Press.
* I wish to thank Robert Brandon, Dick Burian, Leo Buss, Marjorie Grene, Lorraine Heisler, David
Hull, Evelyn Fox Keller, Dick Lewontin, Lisa Lloyd, Sandy Mitchell, Eörs Szathmáry, Bill Wimsatt
and Jeff Workman for helpful comments on previous versions of the manuscript. Lisa Lloyd helped
clarify important features of her formal analysis of evolutionary theory. JaRue Manning, Rob Page,
Bruce Riska, Brad Shaffer and Michael Wedin provided helpful discussion. I also wish to thank
Stuart Kauffman and the Santa Fe Institute for their invitation to participate in the Foundations of
Development and Evolution Conference in 1989 and the Wissenschaftskolleg zu Berlin for a
fellowship in 1992-93. This essay is dedicated to Bruce Riska.
Churchill, F. (1968). August Weismann and a Break from Tradition. Journal of the History of
Biology 1, 91-112.
Churchill, F. (1985). Weismann's Continuity of the Germ-Plasm in Historical Perspective.
Freiburger Universitätsblätter 24, 107-124.
Churchill, F. (1987). From Heredity Theory to Vererbung: The Transmission Problem, 1850-1915.
Isis 78, 337-364.
Crick, F. (1958). On Protein Synthesis. Symposia of the Society for Experimental Biology 12,
138-163.
Crick, F. (1970). Central Dogma of Molecular Biology. Nature 227, 561-563.
Darden, L. and Cain, J. (1988). Selection Type Theories. Philosophy of Science 56, 106-129.
Darwin, C. (1859). On the Origin of Species. Facsimile of the 1st edition, 1964. Cambridge, Mass.:
Harvard University Press.
Darwin, C. (1868). The Variation of Animals and Plants Under Domestication. London: John
Murray Publishers.
Dawkins, R. (1976). The Selfish Gene. New York: Oxford University Press.
Dawkins, R. (1978). Replicator Selection and the Extended Phenotype. Zeitschrift für
Tierpsychologie 47, 61-76.
Dawkins, R. (1982a). The Extended Phenotype. New York: Oxford University Press.
Dawkins, R. (1982b). Replicators and Vehicles. In: King's College Sociobiology Group (eds.),
Current Problems in Sociobiology, pp. 45-64. Cambridge: Cambridge University Press.
Reprinted in: Brandon and Burian (1984), pp. 161-180.
De Vries, H. (1889). Intracellular Pangenesis. Translated by C. Gager, 1910. Chicago: Open Court
Press.
Duhem, P. (1954). The Aim and Structure of Physical Theory. Princeton: Princeton University
Press.
Ghiselin, M. (1974). A Radical Solution to the Species Problem. Systematic Zoology 23, 536-544.
Grene, M. (1989). Interaction and Evolution. In: Ruse (1989), 67-73.
Griesemer, J. (1988). Genes, Memes and Demes. Biology and Philosophy 3, 179-184.
Griesemer, J. (1990). Modeling in the Museum: On the Role of Remnant Models in the Work of
Joseph Grinnell. Biology and Philosophy 5, 3-36.
Griesemer, J. and Wimsatt, W. (1989). Picturing Weismannism: A Case Study of Conceptual
Evolution. In: Ruse (1989), pp. 75-137.
Hanna, J. (1985). Sociobiology and the Information Metaphor. In: J. Fetzer (ed.), Sociobiology and
Epistemology, pp. 31-55. Dordrecht: D. Reidel.
Harper, J. (1977). Population Biology of Plants. New York: Academic Press.
Hull, D. (1975). Central Subjects and Historical Narratives. History and Theory 14, 253-274.
Hull, D. (1980). Individuality and Selection. Annual Reviews of Ecology and Systematics 11, 311-
332.
Hull, D. (1981). The Units of Evolution: A Metaphysical Essay. In: U. Jensen and R. Harré (eds.),
The Philosophy of Evolution, pp. 23-44. Brighton: The Harvester Press.
Hull, D. (1988). Science as a Process. Chicago: University of Chicago Press.
Hull, D. (1989). The Metaphysics of Evolution. Albany: State University of New York Press.
Hull, D. (1991). Science as a Selection Process. Unpublished Manuscript.
Jackson, J., Buss, L. and Cook, R., eds. (1985). Population Biology and Evolution of Clonal
Organisms. New Haven, Conn.: Yale University Press.
Kitcher, P. (1982). Genes. British Journal for the Philosophy of Science 33, 337-359.
Lewontin, R. (1970). The Units of Selection. Annual Review of Ecology and Systematics 1, 1-17.
Lewontin, R. (1974a). The Genetic Basis of Evolutionary Change. New York: Columbia University
Press.
Lewontin, R. (1974b). The Analysis of Variance and the Analysis of Causes. American Journal of
Human Genetics 26, 400-411.
114 James R. Griesemer
Lloyd, E. (1988). The Structure and Confirmation of Evolutionary Theory. New York: Greenwood
Press.
Lloyd, E. (2001). Different Questions: Levels and Units of Selection. In: R. S. Singh, C. B.
Krimbas, D. Paul, and J. Beatty (eds.), Thinking about Evolution: Historical, Philosophical and
Political Perspectives. Cambridge: Cambridge University Press.
Margulis, L. and Sagan, D. (1986). Origins of Sex: Three Billion Years of Genetic Recombination.
New Haven, Conn.: Yale University Press.
Maienschein, J. (1987). Heredity/Development in the United States, circa 1900. History and
Philosophy of the Life Sciences 9, 79-93.
Maynard Smith, J. (1975). The Theory of Evolution. 3rd ed. Middlesex: Penguin.
Mayr, E. (1982). The Growth of Biological Thought. Cambridge, Mass.: Belknap-Harvard Press.
Mayr, E. (1985). Weismann and Evolution. Journal of the History of Biology 18, 295-329.
Mendel, G. (1865). Experiments in Plant Hybridization. Reprinted by P. Mangelsdorf, 1965.
Cambridge, Mass.: Harvard University Press.
Mitchell, S. (1987). Competing Units of Selection? A Case of Symbiosis. Philosophy of Science 54,
351-367.
Rosenberg, A. (1985). The Structure of Biological Science. New York: Cambridge University Press.
Ruse, M., ed. (1989). What the Philosophy of Biology Is: Essays Dedicated to David Hull. Boston:
Kluwer Academic Publishers.
Sapp, J. (1987). Beyond the Gene. New York: Oxford University Press.
Sober, E. (1981). Holism, Individualism, and the Units of Selection. In: Asquith and Giere (1981),
vol. 2, pp. 93-121.
Sober, E. (1984). The Nature of Selection. Cambridge, Mass.: The MIT Press.
Sober, E. (1988). Reconstructing the Past: Parsimony, Evolution, and Inference. Cambridge, Mass.:
The MIT Press.
Stent, G. S. (1977). Explicit and Implicit Semantic Content of the Genetic Information. In: R. Butts
and J. Hintikka (eds.), Foundational Problems in the Special Sciences, pp. 121-149.
Dordrecht: D. Reidel.
Sterelny, K. and Kitcher, P. (1988). The Return of the Gene. Journal of Philosophy 85, 339-361.
Wade, M. (1978). A Critical Review of the Models of Group Selection. Quarterly Review of
Biology 53, 101-114.
Watson, J., Hopkins, N., Roberts, J., Steitz, J., and Weiner, A. (1987). Molecular Biology of the
Gene. 4th ed. Menlo Park: Benjamin/Cummings Publ. Co.
Weismann, A. (1888). On the Supposed Botanical Proofs of the Inheritance of Acquired Characters.
Reprinted in: Weismann (1889), p. 419ff.
Weismann, A. (1889). Essays upon Heredity and Kindred Biological Problems. Authorized
translation edited by E. B. Poulton, S. Schonland, and A. E. Shipley. Oxford: Clarendon Press.
Reprinted and with an introduction by J. Mazzeo, (1977). Oceanside: Dabor Science
Publications.
Weismann, A. (1892). Das Keimplasma: Eine Theorie der Vererbung. Jena: Gustav Fischer.
Translated by W. N. Parker and H. Rönnfeldt (1893), under the title The Germ-Plasm: A
Theory of Heredity. New York: Charles Scribner's Sons.
Williams, G. (1966). Adaptation and Natural Selection. Princeton: Princeton University Press.
Wilson, E. B. (1896). The Cell in Development and Inheritance. 2nd ed.: 1900. London: Macmillan
Co.
Wimsatt, W. (1980). Reductionistic Research Strategies and their Biases in the Units of Selection
Controversy. In T. Nickles (ed.), Scientific Discovery, Volume II: Historical and Scientific Case
Studies, pp. 213-259. Dordrecht: D. Reidel.
Wimsatt, W. (1981). Units of Selection and the Structure of the Multi-Level Genome. In: Asquith
and Giere (1981), vol. 2, 122-183.
1. Introduction
In: Martin R. Jones and Nancy Cartwright (eds.), Idealization XII: Correcting the Model.
Idealization and Abstraction in the Sciences (Poznań Studies in the Philosophy of the Sciences and
the Humanities, vol. 86), pp. 117-143. Amsterdam/New York, NY: Rodopi, 2005.
118 Nancy J. Nersessian
1 Our discussion will follow Maxwell's presentation in the 1861-2 paper. As I have argued (1984a,
b; 1992), the extant historical records support my supposition that Maxwell's construction of the
mathematical representation followed a course comparable to that which he presented to the
scientific community.
Fig. 1 a) Actual pattern of lines of force surrounding a bar magnet (Faraday 1839-55, vol. 3,
plate IV). b) Faraday's schematic representation of the lines of force surrounding a bar magnet
(Faraday 1839-55, vol. 1, plate I).
Figure 1a displays patterns of iron filings that are produced by magnets and
Figure 1b is Faraday's schematic representation of these. Faraday interpreted the
attractions and repulsions of the lines of force as showing the directions of
magnetic force, the "quantity" of magnetic force (summed across the lines, and
so related to the density of lines in a region), and the "intensity" of magnetic
force (tension along the lines). Maxwell's first paper on electromagnetism
(1855-6) provided a mathematical formulation for the notions of quantity and
intensity as "flux" and "flow," respectively. Flux is sometimes a vector
quantity, flow is always a vector. The configuration of the lines in a given
situation is a function of the magnetic force and the permeability of the medium.
2 Throughout the remainder of section 2 all references will be to (Maxwell, 1861-2) unless
otherwise specified. References to that work will involve only page numbers and, where
appropriate, equation numbers.
Abstraction via Generic Modeling in Concept Formation in Science 121
The geometric constraints of the lines of force are satisfied because of the
shape of each vortex. A vortex is wider the farther it is from its origin, which
gives the system the property that lines become farther apart as they approach
their midpoints. Figure 2 is a representation of a vortex segment in motion
drawn from Maxwells description. The vortex-fluid model is consistent with
constraints that derive from Faraday's experiments: (1) electric and magnetic
forces are at right angles to each other, (2) magnetism is dipolar, and (3) the
plane of polarized light passed through a diamagnetic substance is rotated by
magnetic action.
Maxwell first derived the equations for the stresses in the hydrodynamic
model. He derived the equation for pressure difference in a medium filled with
vortices placed side by side with axes parallel for the general case where the
vortices are not circular and the angular velocity and the density are not
uniform: (1) p₁ − p₂ = (1/4π)μv² (p. 457, equation 1); where p₂ is parallel to the
axes and p₁ is perpendicular to the axes in any direction, μ is the average
density of the vortices, and v is the linear velocity at the circumference of each
vortex. Taking (1/4π)μv² from equation (1) to be a simple tension along the axis
of stress and p₁ to be a simple hydrostatic pressure in all directions, he derived
an expression for what we now call the general mechanical stress tensor for the
resultant force on an element of the medium due to variation in internal stress
(p. 458, equation 5):3
F = μv(1/4π div μv) + (1/8π) grad μv² − [μv × (1/4π curl v)] − grad p₁
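As an illustrative aside (my sketch, not part of the original analysis), the pairing of the gradient and curl terms in equation 5 reflects the standard hydrodynamic identity for convective acceleration, (v · grad)v = ½ grad v² − v × curl v, which can be checked symbolically:

```python
# Symbolic check (illustration only) of the vector identity
#   (v . grad) v = (1/2) grad(v^2) - v x (curl v)
# that underlies the gradient and curl terms of Maxwell's stress formula.
import sympy as sp

x, y, z = sp.symbols('x y z')
# An arbitrary smooth vector field v(x, y, z); any choice works.
v = sp.Matrix([x**2 * y, sp.sin(z) + y, x * z**2])

def grad(f):
    """Gradient of a scalar field as a 3x1 column vector."""
    return sp.Matrix([sp.diff(f, c) for c in (x, y, z)])

def curl(u):
    """Curl of a 3-component vector field."""
    return sp.Matrix([
        sp.diff(u[2], y) - sp.diff(u[1], z),
        sp.diff(u[0], z) - sp.diff(u[2], x),
        sp.diff(u[1], x) - sp.diff(u[0], y),
    ])

# Left side: (v . grad) v, computed component-wise.
lhs = sp.Matrix([v.dot(grad(v[i])) for i in range(3)])
# Right side: (1/2) grad(v . v) - v x (curl v).
rhs = sp.Rational(1, 2) * grad(v.dot(v)) - v.cross(curl(v))

assert sp.simplify(lhs - rhs) == sp.zeros(3, 1)
```

The particular field chosen here is arbitrary; the identity holds for any smooth v.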
Given equation 5, he then constructed the analogy with the magnetic relations
by mapping quantitative properties as follows. He stipulated that the quantities
related to the velocity of the vortex (α = vl, β = vm, and γ = vn, with l, m, n the
direction cosines of the axes of the vortices) be mapped to the components of
the force acting on a unit magnetic bar pointing north. So, the magnetic
intensity, which in contemporary notation is designated as H, is here related to
the velocity gradient of the vortex at the surface. The quantity μ is taken to
represent the magnetic permeability, thus relating it to the mass of the medium.
The quantity μH represents the magnetic induction.
Substituting the magnetic quantities, Maxwell next rewrote the first term of
the mechanical stress tensor for the magnetic system as μH(1/4π div μH) (p. 459,
equation 7). He followed the same procedure of constructing a mapping
between the model and the magnetic quantities and relations to rewrite all of the
components of the stress tensor for magnetism. The resulting electromagnetic
stress tensor represents the resultant force on an element of the magnetic
medium due to its internal stress. We will not go through the details of these
additional mappings, except to point out two noteworthy derivations. First, he
derived an expression relating current density to the circulation of the magnetic
field around the current-carrying wire, j = (1/4π) curl H (p. 462, equation 9). This
equation agreed with the differential form of Ampère's law that he had derived
in the earlier paper (1855-6, p. 194). The derivation given here still did not
provide a mechanism connecting current and magnetism. Second, he established
3 I have written this equation, which Maxwell wrote in component form, in modern vector notation
and will do so throughout. The vector calculus was just being developed around the time of
Maxwell's analysis. Note the actual physical meaning of the vector operators here: the gradient is a
slope along a vortex, the curl is the rotation of the fluid vortex, and the divergence is the flowing of
fluid in the medium. In the Treatise of 1873, Maxwell reformulated his electromagnetic theory in
terms of the theory of quaternions, a form of vector calculus developed by Hamilton (Maxwell,
1891). Although the fluid-mechanical model had been discarded by that time, Maxwell saw the
vector calculus itself as "call[ing] upon us at every step to form a mental image of the geometrical
features represented by the symbols" (1873, p. 137).
that in the limiting case of no currents in the medium and a uniform magnetic
permeability, the inverse square law for magnetism could be derived. Thus, the
system agreed in the limit with the action-at-a-distance law for magnetic force
(pp. 464-66, equations 19-26).
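By Stokes's theorem, the differential law relating current density to the circulation of the magnetic field is equivalent to the integral statement that, in the Gaussian units of the period, the circulation of H around a closed path equals 4π times the enclosed current. A numerical illustration (my sketch, using the textbook field H = 2I/r of a long straight wire, not a calculation from the paper):

```python
# Numerical illustration (not from Maxwell's paper): for a long straight
# wire along z through the origin carrying current I, the Gaussian-units
# field has magnitude 2I/r and circulates around the wire.  The law
# j = (1/4 pi) curl H implies that (1/4 pi) times the circulation of H
# around a closed path recovers the enclosed current.
import math

def H(xp, yp, I=1.0):
    """Field of the wire: magnitude 2I/r, tangential direction."""
    r2 = xp * xp + yp * yp
    return (-2 * I * yp / r2, 2 * I * xp / r2)

def circulation(path_points):
    """Midpoint-rule line integral of H . dl around a closed polyline."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path_points, path_points[1:] + path_points[:1]):
        n = 2000
        for k in range(n):
            t = (k + 0.5) / n
            hx, hy = H(x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            total += hx * (x1 - x0) / n + hy * (y1 - y0) / n
    return total

# Square path enclosing the wire: recovered current should be I = 1.
square_in = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
# Square path well away from the wire: circulation should vanish.
square_out = [(3, 3), (5, 3), (5, 5), (3, 5)]

I_in = circulation(square_in) / (4 * math.pi)
I_out = circulation(square_out) / (4 * math.pi)
assert abs(I_in - 1.0) < 1e-3
assert abs(I_out) < 1e-3
```

The second path shows that the circulation depends only on the current threading the loop, which is the content of the differential law.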
Thus far, then, Maxwell had been able to provide a mathematical formulation
and what he called a "mechanical explanation" for magnetic induction,
paramagnetism, and diamagnetism. Although the system of a medium filled
with infinitesimal vortices does not correspond to any known physical system,
Maxwell was able to use mathematical properties of individual vortices to
derive formulas for quantitative relations consistent with the constraints on magnetic
systems discussed above. The four components of the mechanical stress tensor,
as interpreted for the electromagnetic medium, are (p. 463, equations 12-14):
F = μH(1/4π div μH) + (1/8π) grad μH² − [μH × (1/4π curl H)] − grad p₁
By component they are (1) the force acting on magnetic poles, (2) the action of
magnetic induction, (3) the force of magnetic action on currents, and (4) the
effect of simple pressure. The last component is required by the model (it is the
pressure along the axis of a vortex) but had not yet been given an
electromagnetic interpretation. Note that we call the contemporary version of
this equation the "electromagnetic stress tensor" even though it is now devoid of
the physical meaning of stresses and strains in a medium.
The hydrodynamical model developed in Part I of the paper is the starting
point for the remainder of Maxwell's analysis. All subsequent reasoning is
based on modifications of it. Before going on, it will be useful to underscore
salient aspects of how the mathematical representation was constructed.
2.1.1. Summary
In Part I the mathematical representation of various magnetic phenomena
derives from the vortex-fluid model. The model was constructed by first
selecting the domain of continuum mechanics as an analogical source domain
and constructing a preliminary model consistent with the magnetic constraints.
These constraints are the geometrical configurations of the lines of force and
Faraday's interpretation of them as resulting from lateral repulsion and
longitudinal attraction. Maxwell hypothesized that the attractive and repulsive
forces are stresses in a mechanical aether. Given this hypothesis, one can
assume that relationships that hold in the domain of continuum mechanics will
hold in the domain of electromagnetism. The magnetic constraints specify a
configuration of forces in the medium and this configuration, in turn, is readily
explained as resulting from the centrifugal forces of vortices in the medium with
axes parallel to the lines of force. So, the vortex motion supplies a causal
process that is capable of producing the configuration of the lines of force and
the stresses in and among them. We can contrast this result with his analysis of
the magnetic forces on current. In Part I he had not yet specified a causal
process in the aether connecting electricity and magnetism, and so claimed not
to have provided a mechanical explanation for their interaction. Thus, it seems
that Maxwell thought that establishing a mathematical law connecting them
does not, in itself, provide an explanation.
The mathematical expressions for the magnetic phenomena are derived by
substitution from the mathematical formula for the stresses in the vortex-fluid
model. That model is not a system that exists in nature: it is idealized and it is
generic. One way in which it is idealized will become the focus of the next stage
of analysis: the friction between adjacent vortices is ignored. The model is
generic in that it satisfies constraints that apply to the types of entities and
processes that can be considered as constituting either domain. The model
represents the class of phenomena in each domain that are capable of producing
specific configurations of stresses. More will be said about generic models after
developing Maxwell's complete analysis.
opposite to the vortices. This is consistent with the constraint that the lines of
force around a magnetic source can exist for an indefinite period of time, so
there can be no loss of energy in the model. He also stipulated that there should
be no slipping between the interior and exterior layers of the vortices, making
the angular velocity constant. This constraint simplified calculations and would
be altered in Part III.
Fig. 3a. The author's schematic representation of a cross-section of the preliminary analogical
model described by Maxwell.
The model is now a hybrid constructed from two source domains: fluid
dynamics and machine mechanics. In ordinary mechanisms, idle wheels rotate
in place. This allowed representation of the situation in a dielectric, or
insulating, medium. To represent a current, though, they need to be capable of
translational motion in a conducting body. Maxwell noted that mechanisms such
as the Siemens governor for steam-engines have such idle wheels. Throughout
Part II, he provided analogies with machinery as interpretations of the
relationships he had derived between the idle wheel particles and the fluid
vortices. His procedure was to derive the equations using the model, map the
results to the electromagnetic case assuming the established correlations, and
then reinterpret the results in terms of plausible machine mechanisms. The
constraints governing the relationships between electric currents and magnetism
are modeled by the relationships between the vortices and the idle wheels,
conceived as small spherical particles surrounding the vortices. Figure 3b (next
page) is Maxwell's own rendering of the model. The major constraints are that
(1) a steady current produces magnetic lines of force around it, (2)
commencement or cessation of a current produces a current, of opposite
orientation, in a nearby conducting wire, and (3) motion of a conductor across
the magnetic lines of force induces a current in it.
The analysis began by deriving the equations for the translational motion of
the particles in the imaginary system. There is a tangential pressure between the
surfaces of spherical particles and the surfaces of the vortices, treated as
Fig. 3b. Maxwell's representation of the fully constructed physical analogy (Maxwell 1861-2,
plate VIII).
steady current and magnetic lines of force is captured in the following way.
When an electromotive force, such as from a battery, acts on the particles in a
conductor it pushes them and starts them rolling. The tangential pressure
between them and the vortices sets the neighboring vortices in motion in
opposite directions on opposite sides (thus capturing the polarity of magnetism),
and this motion is transmitted throughout the medium. The mathematical
expression (equation 33) connects current with the rotating torque the vortices
exert on the particles. Maxwell went on to show that this equation is consistent
with the equations he had derived in Part I for the distribution and configuration
of the magnetic lines of force around a steady current (p. 464, equations 15-16).
Maxwell derived the laws of electromagnetic induction in two parts because
each case is different mechanically. The first constraint in the case of
electromagnetic induction ((2) above) can be reformulated as: a changing (non-
homogeneous) magnetic field induces a current in a conductor. The analysis
began with finding the electromotive force on a stationary conductor produced
by a changing magnetic field. This first case corresponds, for example, to
induction by switching current off and on in a conducting loop and having a
current produced in a nearby conducting loop. A changing magnetic field would
induce a current in the model as follows. A decrease or increase in the current
will cause a corresponding change in velocity in the adjacent vortices. This row
of vortices will have a different velocity from the next adjacent row. The
difference will cause the particles surrounding those vortices to speed up or
slow down, which motion will in turn be communicated to the next row of
vortices and so on until the second conducting wire is reached. The particles in
that wire will be set in translational motion by the differential electromotive
force between the vortices, thus inducing a current oriented in a direction
opposite to that of the initial current, which agrees with the experimental results.
The neighboring vortices will then be set in motion in the same direction and the
resistance in the medium will ultimately cause the translational motion to stop,
i.e., the particles will only rotate in place and there will be no induced current.
Maxwell's diagram (figure 3b) illustrates this mechanism. The diagram shows a
cross section of the medium. The vortex cross sections are represented by
hexagons rather than circles, presumably to provide a better representation of
how the particles are packed around the vortices. The accompanying text
(p. 477) tells the reader how to animate the drawing for the case of
electromagnetic induction by the action of an electromotive force of the kind he
had been considering. The mechanism for communicating rotational velocity in
the medium accounts for the induction of currents by the starting and stopping
of a primary current.
In deriving the mathematical relationships, Maxwell used considerations
about the energy of the vortices. The mathematics is too complex to include in
detail here. The general procedure he followed was to derive the equations for
2.2.1. Summary
In Part II the mathematical representation of the relationships between current
and magnetism is derived from a model that is a hybrid of machine mechanics
and fluid dynamics. The idle wheel mechanism is introduced from a constraint
deriving from the vortex fluid model: if the vortices are rotating side by side,
there is friction where they come into contact. Specific idealizations about the
relationships among the particles that there is no slipping and they do not
touch when rotating in place are dictated by the constraint that no energy is
lost in the field surrounding a constant magnetic source. On translation,
however, the particles must experience resistance so that energy is lost and heat
played little role in Maxwell's analysis; with one, Duhem (1914), going so far
as to charge Maxwell with falsifying his results and with cooking up the model
after he had derived the equations by formal means. Nersessian (1984a, 1984b)
has argued that, properly understood, the model, taken together with Maxwell's
previous work on elasticity, provides the basis for all the errors except one
that can easily be interpreted as a substitution error.
As in Part II, the analogical model was modified in Part III by considering
its plausibility as a mechanical system. In Part II, the system contains cells of
rotating fluid separated by particles very small in comparison to them. There he
considered the transmission of rotation from one cell to another via the
tangential action between the surface of the vortices and the particles. To
simplify calculations he had assumed the vortices to be rigid. But in order for
the rotation to be transmitted from the exterior to the interior parts of the cells,
the cell material must be elastic. He noted that the light aether (considered at
this point as possibly a different aetherial medium) is assumed to have
elasticity, so the assumption that the electromagnetic medium has elasticity is
equally plausible. Conceiving the molecular vortices as spherical blobs of
elastic material would also give them the right configuration on rotation,
satisfying the geometrical constraints of Part I for magnetism.
He began by noting the constraint that the "electric tension" associated with
a charged body is the same, experimentally, whether produced from static or
current electricity. If there is a difference in tension in a body, it will produce
either current or static charge, depending on whether the substance is a
conductor or insulator. He likened a conductor to "a porous membrane which
opposes more or less resistance to the passage of a fluid" (p. 490) and a
dielectric to "an elastic membrane which does not allow passage of a fluid, but
transmits the pressure of the fluid on one side to that on the other" (p. 491).
Although Maxwell did not immediately link his discussion of the different
manifestations of electric tension to the hybrid model of Part II, it is clear that
the model figures throughout the discussion. This is made explicit in the
calculations immediately following the general discussion. I note this because
the notion of displacement current introduced before these calculations cannot
properly be understood without the model. In the process of electrostatic induc-
tion, electricity can be viewed as displaced within a molecule of a dielectric,
so that one side becomes positive and the other negative, but does not pass from
molecule to molecule. Maxwell likened this displacement to a current in that
change in displacement is similar to the commencement of a current (p. 491).
That is, in the imaginary model the idle wheel particles experience a slight
translational motion in electrostatic induction.
The mathematical expression relating the electromotive force and the
displacement is: E = −4πk²D, where E is the electromotive force (electric
field), k the coefficient for the specific dielectric, and D is the displacement
(p. 491). The amount of current due to displacement is jdisp = ∂D/∂t. The
equation relating the electromotive force and the displacement has the
displacement in the direction opposite from that which is customary now and in
Maxwells later work. The orientation given here can be accounted for if we
keep in mind that an elastic restoring force is opposite in orientation to the
impressed force. The analogy between a dielectric and an elastic membrane is
sufficient to account for the sign error. Maxwell himself stressed that the
relations expressed by the above formula are independent of a specific theory
about the actual internal mechanisms of a dielectric. However, without the
imaginary system, there is no basis on which to call the motion a current. It is
translational motion of the particles that constitutes current. Thus, in its initial
derivation, the displacement current is modeled on a mechanical process. We
can see this in the following way.
Recall the difference Maxwell specified between conductors and dielectrics
when he first introduced the idle wheel particles. In a conductor, they are free to
move from vortex to vortex. In a dielectric, they can only rotate in place. In
electrostatic induction, the particles are urged forward by the elastic distortion
of the vortices, but since they are not free to flow, they react back on the
vortices with a force to restore their position. This motion is similar to the
commencement of a current. But, their motion does not amount to a current,
"because when it has attained a certain value it remains constant" (p. 491). That
is, the particles do not actually move out of place by translational motion as in
conduction. The system reaches a certain level of stress and remains there.
Charge is the excess of tension in the dielectric medium. Without the model,
current loses its physical meaning, which is what bothered so many of the
readers of the Treatise, where the mechanical model is abandoned, having
served its purpose. That situation is comparable to our continued use of stress
tensor when there is no medium in which there could be stresses.
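The claim that the particles' motion does not amount to a current "because when it has attained a certain value it remains constant" can be pictured with a toy relaxation model (my construction, not Maxwell's own equations): a constant impressed force working against an elastic restoring force proportional to the displacement, with damping. The displacement settles at a fixed value and the transient current dD/dt dies away:

```python
# Toy relaxation model (illustration only, not Maxwell's equations):
# a constant impressed electromotive force E drives the bound particles
# against an elastic restoring force 4*pi*k^2 * D, with viscous damping.
# Overdamped dynamics: tau * dD/dt = E - 4*pi*k^2 * D.
import math

E = 1.0      # impressed electromotive force (arbitrary units)
k = 0.5      # dielectric coefficient (illustrative value)
tau = 1.0    # damping time constant
dt = 1e-3    # Euler time step

D = 0.0
history = []
for step in range(20_000):  # integrate out to t = 20 * tau
    j_disp = (E - 4 * math.pi * k**2 * D) / tau  # dD/dt: the transient current
    D += j_disp * dt
    history.append(j_disp)

D_eq = E / (4 * math.pi * k**2)
assert abs(D - D_eq) < 1e-6 * D_eq  # displacement settles at E / (4 pi k^2)
assert abs(history[-1]) < 1e-9      # the "current" dD/dt has died away
```

The initial spike in dD/dt corresponds to the momentary urging-forward of the particles; once the stress balances the impressed force, nothing flows.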
Propositions XII and XIII provide the calculations for the elastic forces. In
these calculations the vortices are treated as spherical for simplicity. Proposition
XII determines the conditions of equilibrium for the vortices subject to normal
and tangential forces. Proposition XIII shows how to derive the relationship
between the electromotive force and the electric displacement, E = −4πk²D
(p. 495, equation 105) using the mapping with the imaginary system. What can
quite readily be interpreted as a substitution error in equation 104 (p. 495)
makes things turn out right. That is, the restoring force and the electromotive
force have opposite orientation as in the opening discussion.
The equation for Ampres law (equation 9) was now in need of correction
for the effect due to elasticity in the medium (p. 496). That is, since the blobs
are elastic and since in a conductor the particles are free to move, the current
produced by the medium (i.e., net flow of particles per unit area) must include a
factor for their motion due to elasticity, j = (1/4π) curl H − (1/4πk²) ∂E/∂t (p. 496,
used this error to reinforce his contention that the model was cooked up after the
derivations of the field equations and had played no role in their genesis. My
explanation is that Maxwell simply appropriated the formula from the earlier
paper in which it is correct. This lack of concern for multiplicative factors is
consistent with thinking about the model in generic terms.
Referring back to Part I, μ is the average density of the vortices (mass) and μ
is the magnetic permeability. Substituting these into equation 132 and noting
that μ = 1 in air or a vacuum, he derived the result that the velocity is equal to
the ratio between electrodynamic and electrostatic units. He then noted that, on
converting this number to English units, it is nearly the same value as the
velocity of light. Maxwell's conclusion here is significant. In his words, "we
can scarcely avoid the inference that light consists in the transverse undulations
of the same medium which is the cause of electric and magnetic phenomena"
(italics in the original, p. 500). That is, Maxwell believed that the close
agreement between the velocities was more than coincidental. However, he did
avoid the inference until his next paper, where he derived the now standard
wave equation for electromagnetic phenomena and concluded that light itself is
an electromagnetic phenomenon. We can interpret Maxwell's reticence at this
point as arising because the value of the transverse velocity in the electro-
magnetic medium was determined from the presuppositions of the idle wheel-
vortex model. There were no grounds to assume vortex motion in the light
aether. Note also that he did not claim that light is an electromagnetic phenome-
non here, only the possible identity of the media of transmission. On the
nineteenth-century view, light is a transverse wave in an elastic aether. This was
not the same kind of mechanism as that provided for propagating electric and
magnetic actions on the model.
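In modern SI terms, the ratio of electrodynamic to electrostatic units that Maxwell computed corresponds to 1/√(μ₀ε₀), and the near-coincidence he noticed is confirmed to high precision by present-day constants. A quick check (my sketch, obviously not Maxwell's own calculation):

```python
# Modern restatement (not Maxwell's calculation): the ratio of units he
# computed corresponds in SI to 1/sqrt(mu0 * eps0), which matches the
# speed of light.
import math

mu0 = 4 * math.pi * 1e-7   # vacuum permeability, H/m (pre-2019 defined value)
eps0 = 8.8541878128e-12    # vacuum permittivity, F/m (CODATA 2018)
c = 299_792_458.0          # speed of light, m/s (defined)

v = 1.0 / math.sqrt(mu0 * eps0)
# Agreement to better than one part in a billion
assert abs(v - c) / c < 1e-9
```

Maxwell, working from the vortex model and the measured ratio of units, had only the rough agreement of two independently measured numbers; the exactness above reflects how the modern unit system was later built around his identification.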
Maxwell ended the analysis of Part III by showing how to calculate the
electric capacity of a Leyden jar using his analysis of electrostatic force. Part IV
applied the whole analysis to the rotation of the plane of polarized light by
magnetism and established that the direction of rotation depends on the angular
momentum of the molecular vortices. It also predicted a relationship between
the refractive index of light and the dielectric constant.
2.3.1. Summary
In Part III the equations for electrostatic induction are derived from the hybrid
mechanical system of Part II, by endowing the vortices with the mechanical
property of elasticity. This modification again arises from considering the
constraints of the model that were ignored in the earlier analyses. In order for
the tangential motion of the particles on the vortices to be transmitted to their
interior, they must be elastic. Elasticity of the vortices is also consistent with the
geometrical constraints of Part I. Maxwell reinforced his endowing the electromagnetic
medium with this property by noting that, on the wave theory of light,
the luminiferous medium also needs to possess elasticity. This relatively
simple enhancement of the model enabled Maxwell to represent electrostatic
induction and open currents mathematically and to calculate the amount of
delay in transmission of electromagnetic actions.
The forms of the equations for the displacement current and for the modified
version of the equation for current both derive from the modification to the
hybrid mechanical model. When a force is applied to an elastic medium, the
reaction force is of opposite orientation. This led Maxwell to make the
electromotive force and the displacement due to elasticity have opposite orienta-
tion. In a dielectric, where the particles are not free to translate, the excess
tension represents charge. In a conductor, a factor needs to be added to the
expression for current to account for the displacement of the particles due to the
elasticity of the vortices.
Once the medium has elasticity, a time delay in transmission of
electromagnetic actions is necessary. Because the forces between the vortices
and the particles are tangential, only the transverse velocity of propagation in
the medium need be calculated. Although its value is nearly that of light,
Maxwell only speculated that the two aethers might in fact be one. He did not
claim that light actually might be an electromagnetic phenomenon. On my
analysis, the most likely explanation for his refusing to make this claim is that
the vortex mechanism and wave propagation do not belong to the same class of
mechanisms. In his next paper he made the identification of light as an
electromagnetic phenomenon by treating the electromagnetic medium even
more generically as simply a connected system. I will say more about this in
what follows.
Although this may be more a matter of emphasis than disagreement with her,
I find the contrast between changing features and eliminating them not the most
salient for capturing the kind of abstraction we have been examining. First,
rather than making a distinction between abstraction and idealization, I prefer to
say that there are various abstraction processes. Idealization is one; eliminating
features, another; and generic modeling, yet another. The main feature of the
process we have been considering can best be characterized as a loss of
specificity.4 That is why I have called it generic modeling. In the simple
example from Polya, the generic triangle has lost the specificity of the
individual angles and sides. However, angles and sides have not been eliminated
as features of triangles. In the Maxwell case, abstraction takes place through
generic modeling of the salient properties, relationships, and processes. One key
feature of Maxwell's generic mechanical models is that they have lost the
specificity of the mechanisms creating the stresses. A concrete mechanism is
supplied, but it is meant to represent generic processes. Thus, in the analysis of
electromagnetic induction, e.g., causal structure is maintained but not specific
causal mechanisms. The idle wheel-vortex mechanism is not the cause of
electromagnetic induction; it represents the causal structure of that process.
When we reason about a generic triangle we often draw a concrete
representation or imagine one in our mind, but from our reasoning context we
understand it as being without specificity of angles and sides. That is, we know
from the context that the interpretation of the concrete figure is as generic. Thus,
the same concrete representation can be generic or specific depending on the
context. While supplying concrete mechanisms, we know from the context that
Maxwell is considering them generically. That is, these mechanisms are treated
in the way that the spring is treated generically when it is taken to represent the
class of simple harmonic oscillators. On Maxwell's analysis, the causal structure
is to be viewed as separated from the specific physical systems by means of
which it has been made concrete. This is what I take to be the essence of what
Maxwell is saying in his own methodological reflections on physical analogy.
In employing the method of physical analogy, generic mechanisms are
represented by concrete mechanisms to assist in the reasoning process. They
present the mind with an embodied form to reason about and with, but from the
context the reasoner knows not to adopt any specific physical hypothesis
belonging to the domain that is the source of the analogy. The goal is to explore
the consequences of a partial isomorphism between the laws of two physical
domains. But, this exploration needs to be carried out at a level of generality
sufficient to encompass both domains. Thus, the models derived from the
physical analogies are to be regarded generically.
4. While I will not discuss the details here, generic modeling has much in common with what Darden
and Cain (1989) call generalizing theories by abstraction. See also Griesemer (this volume).
5. This reinforces Sellars's (1965) criticism of Hesse that to create novelty in analogical reasoning
requires mapping relational structures, not simply predicates. For an analysis of their dispute and of
Sellars's views on conceptual change, see Brown (1986).
In the 1861-2 paper, Maxwell had said that the causal mechanisms he
considered provided mechanical explanations of the phenomena. In the 1864
paper, he said that the mechanical analogies should be viewed as "merely
illustrative," not as explanatory (1864, p. 564), and in the Treatise, that the
problem of determining the mechanism required to establish "a certain species
of connexion . . . admits of an infinite number of mechanisms" (1891, p. 470).
As we have seen, the mechanical explanations of the earlier analysis are
themselves generic in nature. They only specify the kinds of mechanical
processes that could produce the stresses under examination. They provide no
means for picking out which processes actually do produce the stresses. In a
later discussion of generalized dynamics, Maxwell likened the situation to that
in which the bellringers in the belfry can see only the ropes, but not the
mechanism that rings the bell (1890b, pp. 783-4). The first formulation treated
the mechanical models as representing classes of physical systems with
common properties and relationships. After that, Maxwell simply treated the
properties and relationships in the abstract and in all subsequent formulations
replaced the supposition of a continuum-mechanical medium with the mere
supposition of a connected system and proceeded as a bellringer.
This move is at the heart of the disagreement with Thomson, who never did
accept Maxwell's representation. Maxwell's own position was that the situation
was not any worse than that with Newton's law of gravitation, since that, too,
had been formulated without any knowledge of causal mechanisms. What we
know, but Maxwell did not, is that many different kinds of dynamical systems
can be formulated in generalized dynamics. Electrodynamical systems are not
the same kind of dynamical system as Newtonian systems. What he abstracted
through the generic modeling process is a representation of the general
dynamical properties and relationships for electromagnetism. The abstract laws,
when applied to the class of electromagnetic systems, yield the laws of a
dynamical system that is non-mechanical; that is, one that cannot be mapped
back onto the mechanical domains used in their construction.
5. Conclusion
Nancy J. Nersessian
School of Literature, Communication, and Culture and College of Computing
Georgia Institute of Technology
nancyn@cc.gatech.edu
REFERENCES
Brown, H. (1986). Sellars, Concepts, and Conceptual Change. Synthese 68, 275-307.
Cartwright, N. (1989). Nature's Capacities and their Measurement. Oxford: Clarendon Press.
Chi, M. T. H., Feltovich, P. J., and Glaser, R. (1981). Categorization and Representation of Physics
Problems by Experts and Novices. Cognitive Science 5, 121-52.
Clement, J. (1989). Learning via Model Construction and Criticism. In: G. Glover, R. Ronning, and
C. Reynolds (eds.), Handbook of Creativity: Assessment, Theory, and Research, pp. 341-81.
New York: Plenum.
Darden, L. and Cain, J. A. (1989). Selection Type Theories. Philosophy of Science 56, 106-129.
Duhem, P. (1902). Les Théories électriques de J. Clerk Maxwell: Étude historique et critique.
Paris: A. Hermann & Cie.
Duhem, P. (1914). The Aim and Structure of Physical Theory. New York: Atheneum, 1962.
Faraday, M. (1839-55). Experimental Researches in Electricity. Reprinted, New York: Dover, 1965.
Giere, R. N. (1988). Explaining Science: A Cognitive Approach. Chicago: University of Chicago
Press.
Hesse, M. (1974). Maxwell's Logic of Analogy. In: The Structure of Scientific Inference,
pp. 259-282. Berkeley: University of California Press.
Koyré, A. (1978). Galileo Studies. Atlantic Highlands, NJ: Humanities Press.
Maxwell, J. C. (1854). On the Equilibrium of Elastic Solids. In: Maxwell (1890a), vol. 1, pp. 30-74.
Maxwell, J. C. (1855-6). On Faraday's Lines of Force. In: Maxwell (1890a), vol. 1, pp. 155-229.
Maxwell, J. C. (1861-2). On Physical Lines of Force. In: Maxwell (1890a), vol. 1, pp. 451-513.
Maxwell, J. C. (1864). A Dynamical Theory of the Electromagnetic Field. In: Maxwell (1890a),
vol. 1, pp. 526-97.
Maxwell, J. C. (1873). Quaternions. Nature 9, 137-38.
Maxwell, J. C. (1890a). The Scientific Papers of J. C. Maxwell. Edited by W. D. Niven. Cambridge:
Cambridge University Press. Reprinted, New York: Dover, 1952.
Maxwell, J. C. (1890b). Thomson and Tait's Natural Philosophy. In: Maxwell (1890a), vol. 2,
pp. 776-85.
Maxwell, J. C. (1891). A Treatise on Electricity and Magnetism. 3rd ed. (1st ed. 1873). Oxford:
Clarendon Press. Reprinted, New York: Dover, 1954.
Nersessian, N. J. (1984a). Faraday to Einstein: Constructing Meaning in Scientific Theories.
Dordrecht: Martinus Nijhoff.
* My analysis has profited from extensive discussions with James Greeno. I acknowledge and
appreciate the support of NSF Scholars Awards DIR 8821442 and DIR 9111779 in conducting this
research.
Nersessian, N. J. (1984b). Aether/Or: The Creation of Scientific Concepts. Studies in the History
and Philosophy of Science 15, 175-218.
Nersessian, N. J. (1988). Reasoning from Imagery and Analogy in Scientific Concept Formation. In:
A. Fine and J. Leplin (eds.), PSA 1988, vol. 1, pp. 41-48. East Lansing, MI: Philosophy of
Science Association.
Nersessian, N. J. (1992). How Do Scientists Think? In: R. Giere (ed.), Cognitive Models of Science,
Minnesota Studies in the Philosophy of Science XV, pp. 3-44. Minneapolis: University of
Minnesota Press.
Nersessian, N. J. (1995). Should Physicists Preach What They Practice? Constructive Modeling in
Doing and Learning Physics. Science & Education 4, 203-226.
Nersessian, N. J., Griffith, T., and Goel, A. (1996). Constructive Modeling in Scientific Discovery.
Cognitive Science Technical Report, Georgia Institute of Technology.
Polya, G. (1954). Induction and Analogy in Mathematics. Vol. 1. Princeton: Princeton University
Press.
Sellars, W. (1965). Scientific Realism or Irenic Instrumentalism. In: R. Cohen and M. Wartofsky
(eds.), Boston Studies in the Philosophy of Science, vol. 2, pp. 171-204. Dordrecht: D. Reidel.
Approximating the Real: The Role of Idealizations in Physical Theory
Margaret Morrison
1. Introduction
1. See A&S, pp. 69-86, as well as Duhem (1902), pp. 221-225.
In: Martin R. Jones and Nancy Cartwright (eds.), Idealization XII: Correcting the Model.
Idealization and Abstraction in the Sciences (Poznań Studies in the Philosophy of the Sciences and
the Humanities, vol. 86), pp. 145-172. Amsterdam/New York, NY: Rodopi, 2005.
2. For example, Boyle's law, PV = RT, relates the various properties of a gas, including pressure,
volume, temperature, and the universal gas constant.
3. The reason that Duhem reacts so strongly to Maxwell's mathematical representation of
electromagnetism is that he sees the English as using different sorts of symbolic algebras and failing
to differentiate the algebra from the theory itself. In contrast, a Continental physicist's theory is
essentially a logical system uniting hypotheses through rigorous deductions to their consequences,
which are then compared with experimental laws. Algebra plays a purely auxiliary role, facilitating
calculations that lead from hypotheses to consequences. The calculations can always be replaced by
a strict logical progression of hypotheses, and consequently each symbol corresponds to a property
that can be physically measured. This correspondence was not to be found in Maxwell's
electromagnetism. Cf. A&S, ch. 4.
same phenomena and having a realist view of theories that he takes issue with
their methodological practices (A&S, pp. 79-82, 101).
Duhem's other arguments against the truth or falsity of physical laws
concern the case of underdetermination that facilitates the application of a
potentially infinite number of symbolic representations for each concrete fact,
thereby ruling out a unique correspondence between one particular law and a
real system. In addition, constant refinement of instruments leads to a continual
adjustment of laws, a process that continues indefinitely. Hence, by its very
nature, physics and the provisional character of its abstract laws can claim to be
nothing more than an approximation.4 The more complicated the law becomes,
the greater its approximation; yet it is only by retaining their approximative
nature that physical laws can function as laws at all. Generality and abstraction
are built in to the very nature of exact sciences like physics.
The mathematical symbol forged by theory applies to reality as armor to the body of
a knight clad in iron: the more complicated the armor, the more supple will the rigid
metal seem to be; the multiplication of the pieces that are overlaid like shells assures
more perfect contact between the steel and the limbs it protects; but no matter how
numerous the fragments composing it, the armor will never be exactly wedded to the
human body being modeled (A&S, p. 175).
Realist accounts of idealization willingly accept abstraction and approxima-
tion as a fundamental feature of physical theory and simply recast Duhems
argument as one that involves a practical problem of calculation, rather than one
that prevents us from characterizing our models or laws as realistic (or approxi-
mately true) descriptions of reality. In other words, because all scientific laws
are idealizations or abstractions, the problem becomes one of supplementing
laws and theories with parameters necessary for better and consequently more
accurate representations of reality. Constant revision is seen as evidence for
convergence toward a law that will capture more of the essential features of the
phenomena in question.
The challenge of bridging the gap between idealized models, abstract laws
and the reality they represent requires that we be able to say something about
4. Unlike geometry, which progresses by adding new indisputable truths to a fixed body of
knowledge, and unlike the laws of common sense, which are themselves fixed (expressing very
general and unrestricted judgements), the laws of physics face constant revision and the prospect of
being overturned. The symbols that the laws relate are too simple to represent reality completely.
For example, when we describe the sun and its motion using physical laws we replace the
irregularities of its surface with a geometrically perfect sphere. Consequently the precision of the
law cannot be mirrored by reality. The common sense law stating that the sun rises each day in the
east, climbs in the sky, and sets in the west has a degree of certainty that is fixed and immediate and
relatively easy to calculate. The corresponding physical law that provides the formulas furnishing
the coordinates of the sun's center at each instant acquires a minuteness of detail that is achieved
only by sacrificing the fixed and absolute certainty of common sense. Cf. A&S, pp. 165-179.
how it is that these models, laws and theories facilitate the production of
scientific knowledge. To do that one must first determine whether the problem
of abstraction in Duhems sense is simply one of calculation and filling in
appropriate parameters.
5. I should distinguish between Cartwright's (1983, n. 5) version of this thesis and Duhem's.
Cartwright characterizes fundamental laws that do not accurately describe the real world as false,
while Duhem sees them as neither true nor false but approximate.
6. In other words, the idea of a determinate reality is implicit in the distinction. This dichotomy
needn't be seen as supporting scientific realism, since the latter view encompasses the idea that the
structure or theory mirrors, to a greater or lesser extent, that reality. One can be a metaphysical
realist and support the distinction between theory and reality while remaining agnostic about
whether the formal structure is an accurate representation of that reality. This latter view is the one
advocated by van Fraassen. McMullin claims (in correspondence) that the phenomenological
models popular in recent physics are not idealizations in his sense but simply constructions intended
to summarize the phenomena. If this is so, then one needs to provide an account of how it is possible
to differentiate these from the models that supposedly provide a realistic representation of the
phenomena.
following quote from Gell-Mann suggests that this kind of heuristic use of
models may in fact be common in modern theory; but as a methodological
device it does little to bolster the connection between the process of idealization
and an underlying presupposition about the relationship between models and
reality that is characteristic of scientific realism and the debate between
Simplicio and Salviati.
We construct a mathematical theory of the strongly interacting particles, which may
or may not have anything to do with reality, find suitable algebraic expressions that
hold in the model, postulate their validity and then throw away the model (Cf.
Gell-Mann and Ne'eman 1964, p. 198).
Although McMullin's discussion may account for the more straightforward
problem of idealization, a deeper problem arises in the context of modern
theory. In addition to the complex mathematical abstraction, physical concepts
like that of the Higgs field are not only highly idealized but their degree of
departure from real physical systems often cannot be determined due to a lack of
information. In these situations the question of how models relate to reality
takes on an entirely new dimension.
McMullin does discuss complex cases of idealization as instances of what he
refers to as construct idealization, a process that involves a simplification of
the conceptual representation of an object. This is a more specific type of
mathematical idealization and is utilized in the development of models as
opposed to the formulation of laws. The process operates in a way that
resembles Cartwright's use of abstraction; features that are known to be relevant
are simply omitted (or simplified) in order to obtain a result.7 When this is done
by simplifying properties already known to exist, as in the case where Newton
assumed the sun to be at rest in order to derive Keplers laws, we have an
example of formal idealization. When a model leaves much of the material
structure of the phenomena unspecified we have an instance of material
7. Cartwright distinguishes between abstraction and idealization, claiming that modern science works
by abstraction and that the process is multi-dimensional. When we formulate laws we often begin
with a real system or object and either rearrange specific properties in order to facilitate calculation
or ignore small perturbations. For example, we know that the law governing the lever is specified
for cases involving homogeneous and perfectly rigid rods, even though these are not realizable in
practice. Similarly, in the case of Galileo's law for free fall, we typically ignore the perturbations
caused by air resistance or friction. In both cases, all relevant features of the situation are
represented and the deviations from the ideal case can supposedly be accounted for. According to
Cartwright, this differs significantly from the process of abstraction, where we not only subtract
small perturbations from concrete cases but often ignore highly relevant information as well. To use
one of her examples, if we have an idealized model of how a helium-neon laser works, the
information contained in the model will include a description of the lasing material and the specific
laws that govern that particular kind of laser. We can ignore these details and move to a higher level
of abstraction when we formulate a law that applies to all lasers; a law which states that the basic
operating mechanism of the laser involves an inverted population of atoms. See Cartwright (1989).
idealization. The distinguishing feature is the way the properties are added back
in order to make the model more realistic.
Most accounts of idealization, including McMullin's and Laymon's
(Laymon 1982), assume that the process of adding back properties or
calculating the effects of simplification involves a cumulative aspect that results
in the model gradually becoming a more realistic representation of the
phenomena. The difficulty with this view, as a philosophical reconstruction, is
that it greatly simplifies the process of model building, thereby giving us an
idealized model of scientific practice. The two most important concerns that
these accounts overlook are that the growth of models occurs usually by the
proliferation of structures rather than by a cumulative process, and secondly,
changes in the qualitative or conceptual understanding often accompany slight
changes in a model or idealization. Consequently, there is no stable structure to
which parameters can be added to increase the models predictive and
explanatory power. Without this kind of stability no degree of inductive support
can be assigned to the model since the addition of each new parameter results in
either the construction of a new model or the addition of new assumptions that
are incompatible with the previous structure. In either case there is no constancy
or uniqueness of a theoretical picture. Consider the following cases.
McMullin describes the derivation of the ideal gas law as an example of
formal idealization. The law is formulated using the assumption that the
molecular constituents of a gas are perfectly elastic spheres that exert no forces
and have a negligible volume relative to that occupied by the gas. This law
holds for normal ranges of temperatures and pressures and predicts only a
continuous monotonic change in the system's properties. As a result, it is unable
to explain a change from an homogeneous to an heterogeneous system, such as
the appearance of liquid droplets in a gas. Technically, it is inappropriate in
accounting for phase transitions: the changes from a solid to a liquid to a gas.
When this law is amended to take account of the finite size of the molecules and
intermolecular forces, the result is the van der Waals law, which describes
real as opposed to ideal gases and is valid at high temperatures.
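The effect of "adding back" intermolecular forces and finite molecular volume can be made concrete numerically. The sketch below compares the ideal gas law with the van der Waals law for one mole of CO2; the constants a and b are standard textbook values, used here purely as an illustrative assumption:

```python
R = 8.314            # gas constant, J/(mol K)
a = 0.3640           # van der Waals attraction constant for CO2, Pa m^6/mol^2
b = 4.267e-5         # van der Waals co-volume for CO2, m^3/mol

def p_ideal(T, V):
    """Ideal gas law, PV = RT, solved for the pressure of one mole."""
    return R * T / V

def p_vdw(T, V):
    """van der Waals law, (P + a/V^2)(V - b) = RT, solved for P."""
    return R * T / (V - b) - a / V ** 2

T, V = 300.0, 1.0e-3   # 300 K, one litre per mole
print(p_ideal(T, V), p_vdw(T, V))   # the two laws disagree by roughly 10% here
```

At this density the corrected law predicts a noticeably lower pressure, since the attraction term dominates the co-volume correction.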
If we are to assume that the van der Waals law builds on the ideal gas law to
give a better and more realistic approximation then it is important to consider
the qualitative assumptions appropriate to each case. The van der Waals
approach assumes that the gas pressure has its ideal value in the bulk of the
gas and that the molecules suffer a loss of energy and momentum as they escape
from the bulk and collide with the walls of the container.8 This too is an
idealizing assumption which is further changed by yet another law, the Dieterici
8. What this means is that in the bulk of the gas the molecules behave as though they were in a gas
without attractive forces, so that the effective pressure is the same as for an ideal gas. Cf. Tabor
(1985), ch. 5.
equation, which states that the temperature of the gas is everywhere constant,
even near the walls of the container. This law entails a reduction in the mean
density by assuming that the density at the wall is less. As a result the
Boltzmann distribution for total energy applies to the molecules striking the
walls as well as those in the bulk, thereby contradicting the model provided by
the van der Waals equation. The Dieterici equation gives a more accurate result
for heavier complex gases, but like the van der Waals equation it cannot
accommodate data near the critical point.9 In addition to this inconsistency the
van der Waals equation has a fundamental theoretical difficulty; it leads to
negative compressibility for some values of thermodynamic variables. This
result contradicts the well-established van Hove theorem which states that an
accurate statistical calculation can lead only to non-negative compressibility.
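The negative-compressibility defect can be exhibited directly: below the van der Waals critical temperature, Tc = 8a/27Rb, each isotherm contains a stretch where pressure rises with volume (dP/dV > 0), which is exactly what the van Hove theorem rules out for a correct statistical calculation. A sketch, again with illustrative textbook CO2 constants:

```python
R, a, b = 8.314, 0.3640, 4.267e-5   # SI units; textbook CO2 values (illustrative)

def dPdV(T, V):
    """Slope of the van der Waals isotherm P = RT/(V - b) - a/V^2."""
    return -R * T / (V - b) ** 2 + 2 * a / V ** 3

T_c = 8 * a / (27 * R * b)   # van der Waals critical temperature, ~304 K for CO2

def unstable(T, n=5000):
    """True if some part of the isotherm has dP/dV > 0, i.e. negative compressibility."""
    return any(dPdV(T, b * (1.05 + 20 * i / n)) > 0 for i in range(n))

print(T_c, unstable(280.0), unstable(350.0))   # unstable region below T_c, none above
```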
The difficulty that arises when attempting to describe the development of
these gas laws as an adding back of realistic assumptions is that in each case
there are fundamental differences in the way the molecular system is described.
Rather than accumulating properties in one basic structure, we have a number of
mutually inconsistent ways of representing the system in each case. If we
merely changed the properties of the ideal gas in order to accommodate the van
der Waals law, a case could be made that the inconsistency only occurred
between the ideal and the real case; but the problem is more significant. The so-
called real system represented by the van der Waals approach is not a unique
description, since it conflicts with other non-ideal cases like the Dieterici
equation. The two models make different assumptions about how the molecular
system is constituted. Instead of a refinement of one basic molecular model, we
have a number of different models that are suitable for different purposes. The
billiard ball model is used for deriving the ideal gas law, the weakly attracting
rigid sphere model for the van der Waals law, and a model representing
molecules as point centers of inverse power repulsion is used for facilitating
transport equations. The fact that this situation arises in one of our better
developed and highly confirmed theories suggests that the problem is not one
that is peculiar to newly proposed theories and phenomena.
In fact nuclear physics exhibits the same pattern as the kinetic theory. There
exist a number of contradictory nuclear models each of which postulates a
different structure for the atomic nucleus. According to the liquid drop model,
nucleons are expected to move very rapidly within the nucleus and produce
frequent collisions, similar to the molecules in a drop of liquid. Despite its
success there are many experimental results that this model cannot account for
(Cf. Gitterman and Halpern 1981). Moreover, the model ignores the fact that,
9. PV = RT represents the ideal gas law; in the van der Waals case two constants, a and b, are added
to represent intermolecular attraction and the finite volume of the molecules, yielding a new law,
(P + a/V^2)(V - b) = RT. The Dieterici equation takes a different form, P(V - b) = RT exp(-a/VRT).
Again, see Tabor (1985), ch. 5.
like electrons, nucleons in rapid motion have spin 1/2, and therefore obey the
Fermi-Dirac statistics and are subject to the Pauli exclusion principle. Again it
isn't possible to simply reintroduce these conditions (even partially) into the
original model. Instead an alternative, the shell model, was proposed which
would incorporate the spin statistics and account for the anomalous experimen-
tal data. In this model, each nucleon has an independent motion. Although the
new model is superior to the drop model and can accommodate many nuclear
data, it cannot account for the total energy of the nuclei or nuclear fission,
phenomena for which the liquid drop model was initially proposed. Still other
experimental results are explained using what is termed the compound nucleus
model, while another possibility is the optical model which accounts for some
of the observed neutron scattering.
In all of these cases the departure from reality must be addressed from within
the context of each different model, since each presents a different description
of the phenomena. For calculations where the departure is seemingly irrelevant,
a simple model is used, and when a more complex account is required the
process of adding back parameters takes place within the domain of an entirely
different model or set of idealized circumstances. McMullin argues that the
technique of de-idealizing, which serves to ground the model as the basis for a
continuing research program, works only if the original model idealizes the real
structure of the object. But, because we often have many models and the real
structure of the object is often too complicated to manipulate or is simply
unknown, the claim is perhaps better understood as one that characterizes
successful de-idealization as evidence for some degree of approximation to the
real structure of the object. But even this seems an overly optimistic, if not
inaccurate, way of representing the modeling process in modern physics. The
inductive support that might normally be gained through successful predictions,
provided that the properties of the model were added in a cumulative way to a
stable structure, is simply not present. Because each model is different the
inductive base is not strengthened in a systematic way. McMullin does claim
that if processes of self-correction and imaginative extension of a model are
suggested by the structure of the model itself, then the processes are not ad hoc,
and we can have a reasonable belief that the model gives a relatively good fit
with the real system (1985, p. 264). However, the sense in which modifications
to nuclear models were suggested by the original model of the atomic nucleus is
somewhat remote, since the only claim that remained constant was the fact that
the nucleus consisted of protons and neutrons. The linkages tracing the complex
models we use today to their origins in the Rutherford model are certainly less
than perspicuous.10 A similar story can be told in the case of van der Waals's
10. Also, if the process were cumulative, the shell model would surely be able to account for the
phenomena associated with the liquid drop model, which is not the case.
law and its relationship to the kinetic theory. Historical work by Martin Klein11
and others has shown that the van der Waals equation did not even qualify as a
deduction from the kinetic theory. In fact, its explanation of the gas-liquid
transition in terms of intermolecular forces was not really an application of
statistical mechanics. Nevertheless, no one would suggest that the law is
somehow ad hoc. The very nature of theory construction is such that one
expects extensions and refinements to models and idealized laws to be
motivated in part by experimental findings, as well as by theoretical
considerations that are both internal and external to the model. The difficulty of
ad hoc postulations arises when mathematical refinements, intended to establish
a physical result, cannot be physically explained using any available model, as
in the case of renormalization of mass in quantum electrodynamics.
In some cases, like the example of the frictionless plane, correcting the
model presupposes that we know the degree of departure from the real system or
what parameters have been omitted. In this sense all of science involves
idealization, since every model or law contains what Michael Redhead has called "computational gaps" (Redhead 1980). This poses a philosophical
problem only if we fail to recognize that our theories are only correct within a
certain margin of error rather than in some sense of absolute correspondence,
and where the acceptable degree of approximation is determined within the
practice itself. However, what the examples above alluded to was a more
problematic notion of idealization, one which creates difficulties that are
motivated not only by philosophical worries about correspondence, but by
theoretical concerns that arise within the scientific context. These cases involve
computational gaps that are not merely practical problems about calculation but
create more serious problems for theorists, as well as for traditional realist
interpretations of scientific practice. The examples discussed above are such
that it is difficult to determine the degree to which the model represents the real
system, since little can be determined about the actual structure of the system
due to a lack of direct information and the number of conflicting models.
Consequently, the only indicator of the model's success is its predictive power rather than its isomorphism with reality.
Laymon also uses the van der Waals equation as an example of how the
kinetic theory can acquire increased confirmation. He claims that the fact that
added parameters (which make the model of a gas more realistic) result in more
accurate predictions lends credence to the kinetic theory (Laymon 1985). As I
mentioned above, the problem with this view is that the van der Waals law itself
contradicts other features of the kinetic theory and disagrees with experimental
results near the critical point. In addition there are several different ways of
11 See Klein (1974), as well as the appendix of Morrison (1990). My account draws on Klein's work.
Approximating the Real: The Role of Idealizations in Physical Theory 157
12 One would perhaps be able to make a case that the evidence accounted for by the van der Waals law confirmed the kinetic theory if the law could be deduced from the theory, or even if the additional parameters, a and b, were suggested by the theory. But, of course, neither is the case.
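For readers unfamiliar with the law under discussion, the two parameters mentioned in this footnote enter the van der Waals equation (written here for one mole of gas, in standard textbook form rather than any notation of Morrison's) as corrections to the ideal gas law: a for intermolecular attraction and b for the finite volume of the molecules.

```latex
% Ideal gas law for one mole:
%   P V = R T
% Van der Waals equation, with correction parameters a and b:
\left(P + \frac{a}{V^{2}}\right)(V - b) = RT
```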
13 This difference between realistic and concrete models becomes important in Kelvin's work, which I discuss below. Sometimes certain metaphors function as a way of turning abstract theoretical notions into concrete representations, as was the case with the clock metaphor in the seventeenth century. As Larry Laudan has pointed out, it was this metaphor that solidified the notion that the physical world operated as a mechanical system (Laudan 1981). Another example is the case of elliptical orbits, which became concrete for Newton but were considered abstract by Kepler.
14 I would like to thank Carl Hoefer for calling my attention to this point, and for forcing me to rethink some of the difficulties that arise even with computational idealization.
often the case that in describing the role of models a distinction is made between those that are merely heuristic and those which are presumed to have some "truth component" or to represent reality in an accurate way. Implicit in this distinction is usually the view that heuristic models serve as placeholders until a more realistic model can be developed. However, this kind of dichotomy misrepresents, and to some extent undermines, the way heuristic models function in theory construction and confirmation. Instead of valuing models as supposedly accurate representations of reality, we need to draw attention to the heuristic
role played by all models in establishing empirical laws and theories. The
recognition of the heuristic component in model building and theory
construction serves to minimize the division between different kinds of models.
This, of course, does not preclude distinguishing between those models we
know to be false and those about which we are uncertain. The point is simply
that given the nature of modeling and our relative inability in many cases to
independently determine whether models are in fact accurate representations of
reality, the distinction between supposedly realistic models and those that are
merely heuristic can be successfully made only in cases where we knowingly
use fictional representations. Of course these fictional representations can be
either concrete, like Maxwell's aether model, or ideal, like the case of a point
particle. As I stressed above, concrete models represent an object or system that
is physically realizable; it may be possible but in fact not actual, or alternatively,
it may be a candidate for reality in the form of an hypothesis.
In order to illustrate the importance of heuristic models I want to discuss the development of Maxwell's electromagnetic theory, which provides an
interesting account of how it is possible to move from a mathematical analogy
to a fictional model to a more abstract dynamical theory. Not only does it
illustrate the importance of idealized constructs and mathematical
representations as heuristic mechanisms, but it shows how they play a
substantive role in theory construction and development. The criticisms of Maxwell's theory by Duhem and Kelvin, which centered on the way he used
idealization and models, bear a significant similarity to modern debates on the
subject.
In the various stages of development that led to the version of field theory
presented in the Treatise on Electricity and Magnetism (Maxwell 1873;
hereafter TEM), Maxwell relied on a variety of methodological tools that
included a fictional model of the aether in addition to a variety of physical
analogies. Although the model was recognized by Maxwell as fictitious it
nevertheless played an important role in developing both mathematical and
physical ideas that were crucial to the formulation and conceptual understanding
of field theory. The evolution of electromagnetism illustrates a process of theory
construction that has, to a great extent, remained unchanged in modern science.
In 1856 Maxwell attempted a representation of Faraday's electromagnetic theory in what he called a mathematically precise yet visualizable form (Maxwell 1856; hereafter FL). The method involved both mathematical and physical analogies that were based on Kelvin's 1842 analogy between electrostatics and heat flow (FL, p. 156). Maxwell's analogy was between
stationary fields and the motion of an incompressible fluid that flowed through
tubes with the lines of force represented by the tubes. Using the formal
equivalence between the equations of heat flow and action at a distance,
Maxwell substituted the flow of the ideal fluid for the distant action. Although
the pressure in the tubes varied inversely as the distance from the source, the
crucial difference was that the energy of the system was in the tubes rather than
being transmitted at a distance. The direction of the tubes indicated the direction
of the fluid in the way that the lines of force indicated the direction and intensity
of a current. Both the tubes and the lines of force satisfied the same partial
differential equations. Maxwell went on to extend the hydrodynamic analogy to
include electrostatics, current electricity and magnetism. The purpose of the
analogy was to illustrate the mathematical similarity of the laws, and although
the fluid was a purely fictional entity it provided a visual representation of this
new field theoretic approach to electromagnetism.
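The claim that the tubes and the lines of force "satisfied the same partial differential equations" can be made concrete. In modern notation (not Maxwell's own), both a steady, source-free flow of an incompressible fluid and an electrostatic field in a charge-free region derive from potentials obeying Laplace's equation, and it is this formal equivalence that the analogy trades on:

```latex
% Incompressible, irrotational fluid flow (velocity potential \varphi):
\nabla \cdot \mathbf{v} = 0, \qquad \mathbf{v} = -\nabla\varphi
  \;\Longrightarrow\; \nabla^{2}\varphi = 0
% Electrostatics in a charge-free region (potential V):
\nabla \cdot \mathbf{E} = 0, \qquad \mathbf{E} = -\nabla V
  \;\Longrightarrow\; \nabla^{2} V = 0
```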
What Maxwell's analogy did was furnish a physical conception for Faraday's lines of force: a conception that involved a fictional representation, yet provided a mathematical account of electromagnetic phenomena. This method of "physical analogy," as Maxwell referred to it, marked the beginning of
what he saw as progressive stages of development in theory construction.
Physical analogy was intended as a middle ground between a purely
mathematical formula and a physical hypothesis. The former causes us to lose
sight of the phenomena to be explained, while the latter clouds our perception
by imposing theoretical assumptions that restrict our ability to evaluate
alternatives. By contrast, the method of physical analogy allows us to grasp a clear physical conception without full-blown commitment to a particular physical theory, while at the same time preventing us from being drawn away from the subject under investigation by the pursuit of analytical subtleties (FL, p. 156). So, in a physical analogy we have a partial similarity between the
laws of one science and those of another. The hydrodynamic analogy was
specifically intended as a way of gaining some precision in the mathematical
representation of the laws of electromagnetism and assisting others in the
systematization and interpretation of their results. It was important as a visual
representation because it enabled one to see electromagnetic phenomena in a
new way. Until then action at a distance accounts had dominated, and as a
result, the idea that these phenomena could be conceptualized in another way
was indeed novel, but highly suspect.
Although the analogy did provide a model (in some sense), it was merely a
descriptive account of the distribution of the lines in space with no mechanism
for understanding the forces of attraction and repulsion between magnetic poles.
This physical treatment was developed further in a paper written by Maxwell in 1861-2, entitled "On Physical Lines of Force" (Maxwell 1861-2; hereafter PL). The goal was to find an account of the physical behavior of magnetic
lines that could give rise to magnetic forces. Prior to this Kelvin had construed
the Faraday effect (the rotation of the plane of polarized light by magnets) as the
result of the rotation of molecular vortices in a fluid aether. Maxwell used this
idea to develop an account of the magnetic field that involved the rotation of the
aether around lines of force. The paper also offered an account of the forces that
caused the motion of the medium (or aether) and the occurrence of electric
currents. This required an explanation of how the vortices could rotate in the same direction, a problem which led Maxwell to develop his famous mechanical aether model. The model involved a vortex motion that resulted from a layer of rolling particles, called "idle wheels," that were interspersed between the vortices. Electromotive force was then explained in
terms of the forces exerted by the vortices on the particles between them.
Although Maxwell was successful in developing the mathematics required for
his model, he was insistent that the representation be considered provisional and
temporary.
The conception of a particle having its motion connected with that of a vortex by
perfect rolling contact may appear somewhat awkward. I do not bring it forward as a
connection existing in nature, or even as one which I would willingly assent to as an
electrical hypothesis. It is, however, a mode of connexion which is mechanically
conceivable, and easily investigated . . . I would venture to say that anyone who
understands the provisional and temporary character of this hypothesis, will find
himself rather helped than hindered by it in his search after the true interpretation of
the phenomena (PL, p. 486).
In fact, Maxwell called this representation an imitation of electromagnetic phenomena "by an imaginary system of molecular vortices" (PL, p. 486).
The difficulty with the model was that Maxwell was unable to extend it to
electrostatics, a problem that led him to propose a rather different model in part
three of the paper. Instead of the hydrodynamic model consisting of a fluid
cellular structure containing vortices and idle wheels, he developed an elastic
solid model made up of spherical cells endowed with elasticity. The cells were
separated by electrical particles whose action on the cells would cause a kind of
distortion. Hence the effect of an electromotive force was to distort the cells by
a change in position of the electrical particles. This gave rise to an elastic force
which set off a chain reaction throughout the entire structure. Maxwell saw the
15 R = 4πE²h, where h = displacement, R = electromotive force, and E = coefficient of rigidity, which depended on the nature of the dielectric. The amount of displacement depended on the nature of the body and the electromotive force.
16 R was interpreted both as an electromotive force in the direction of displacement and as an elastic restoring force in the opposite direction. E was considered both an electric constant and an elastic coefficient, while h was interpreted both as a charge per unit area and as a linear displacement.
model and attempt to derive the field equations with the aid of experimental
facts and general dynamical principles. Before discussing the later formulations
of electromagnetism it is important to isolate the differences between what
Maxwell considered a theory and the kind of model he put forth in PL.
Although the method Maxwell used was that of "physical analogy," it did not involve a physical identification of properties or phenomena in the two systems. Instead it was simply taken to mean that two branches of science had the same mathematical form, as in the case of heat flow and electrostatics. A subspecies of physical analogy is what Maxwell called "dynamical analogy," where again both analogues have the same mathematical form, but one is
concerned with the motion and configuration of material systems. An example
is Maxwell's analogy between electrostatics and the motion of an incompressible fluid, where the latter is concerned with fluid flow from sources to sinks.
The extension of these dynamical analogies to a more substantive interpretation
constituted a dynamical explanation, where the properties of one system were
literally identified with the properties of the other. When one is able to provide
this kind of identification then the account of the material system can be
understood as a physical hypothesis. The hypothesis must meet other
conditions if it is to be considered legitimate; namely, independent existence for
the entities it postulates and consistency with dynamical principles like
conservation, both of which I will mention below. But first, the most important
issue in distinguishing between Maxwell's early use of models and analogies and his later work is the transition from dynamical analogy to dynamical theory.17
It is obvious from the discussion above that none of the analogies used in FL
or PL would constitute an explanation, since most of the references were to
fictional phenomena or properties introduced for heuristic purposes. However,
when Maxwell moves on to write "A Dynamical Theory of the Electromagnetic Field" (Maxwell 1865; hereafter DT), there is also nothing in this work that would qualify as an explanation. The latter involves a specific characterization of a physical system, whereas Maxwell's theory made no specific assumptions about the aether or electromagnetic medium. Instead the conclusions
reached in DT were supposedly deduced from experimental facts with the aid of
general dynamical principles about matter in motion; principles characterized by
the abstract dynamics of Lagrange.18 This method allowed Maxwell to treat the
field variables as generalized mechanical variables interpreted within the
context of the Lagrangian formalism which contained terms corresponding only
to observable variables. As a result, there were no assumptions about hidden
mechanisms or causes that could be used to explain the behavior of material
17 For a discussion of Maxwell's account of dynamical explanation and theory, see Maxwell (1876).
18 Cf. DT, p. 564.
19 Here we find Maxwell in agreement with the methodological principle enunciated by Duhem, that there is a trade-off between accuracy and certainty.
20 For a discussion of the difference between an explanation and a theory, see Maxwell's example of the belfry in T&T, p. 783.
p. 470). The problem was that the vortex hypothesis failed to satisfy the
criterion of independent existence, a necessary condition for all physical
hypotheses (TEM, vol. II, sec. 831). Vortices were proposed initially as an
explanatory tool, from which one could then derive the desired consequences;
because they lacked direct experimental evidence, Maxwell concluded that there
was no independent way to verify their existence. Consequently, if they were to
play any substantive role in the dynamical theory Maxwell would be guilty of
subscribing to the same methods for which he criticized the French molecularists; a species of what we now refer to as hypothetico-deductivism (Maxwell 1876, p. 309).
Maxwell had a specific justification for introducing mechanical ideas or
concepts into his dynamical field theory. As I mentioned above, he stressed the
importance of providing a physical image or interpretation of the phenomena,
even if that image was nothing more than a fictional illustration that was
consistent with mechanical principles. From these visual images one could
develop mathematical representations that might assist in the formulation of
physical laws. Although the specifics of Maxwell's mechanical model were absent from the later work, aspects of his mechanical analogies remained. In fact, Maxwell's insistence that energy be interpreted literally involved a commitment to the claim that all energy is mechanical energy (DT, p. 564).
This emphasis is also echoed in the Treatise (TEM, vol. II, sec. 550), and in a
lecture to the Chemical Society, where he connects the use of dynamical
analogy and mechanical concepts:
21 As Maxwell remarks at the end of the Treatise (TEM, vol. II, sec. 831): "The attempt which I then made to imagine a working model of this mechanism must be taken for no more than it really is, a demonstration that a mechanism may be imagined that is capable of producing a connexion mechanically equivalent to the actual connexion of the parts of the electromagnetic field. The problem of determining the mechanism required to establish a given species of connexion between the motions of the parts of a system always admits of an infinite number of solutions. Of these, some may be more clumsy or more complex than others, but all must satisfy the conditions of mechanism in general."
When using the method of physical analogy Maxwell was always quick to caution against extending analogies beyond the realm of mathematical laws. The method was intended to yield a knowledge of relations, rather than an identification of physical structures or essences. In other words, Maxwell's use of analogy is significantly different from what is normally referred to as "argument from analogy," where we attempt to infer the existence of particular microphenomena or properties from their apparent analogy with their macro counterparts, or where we attempt to use the analogy as the basis for an inductive inference.22 The resemblance between mathematical laws was useful in the development of further mathematical ideas, and in FL, Maxwell showed how the formal equivalence between the equations of heat flow and action at a distance could be used to present electromagnetic phenomena from two different yet mathematically equivalent points of view. Although physical ideas were prominent, there was no attempt to consider them as hypotheses; nevertheless their heuristic value added to both the qualitative and quantitative aspects of the analogy. Although one could obtain a system of truth founded strictly on observation if physical ideas were deleted from the analogies, the result would be deficient in both "the vividness of its conceptions" and "the fertility of its method" (FL, p. 156).
According to Maxwell, the method for recognizing real analogies rested on experimental identification. If two apparently distinct properties of different systems are interchangeable in appropriately different physical contexts, they can be considered the same property.23 But in order to have this kind of substitutability, there must be independent evidence for both analogues. So, although Maxwell could identify the optical and electromagnetic aethers, given the equal velocities for wave transmission, there was only an inferential basis for the existence of the medium. As a result the analogy could not provide the foundation for a substantive physical theory.
22 For more on Maxwell's distrust of this method see his criticisms of Newton's "analogy of nature" in an article entitled "Atom" (n.d. (b)).
23 This is a species of what Mary Hesse has called "substantial identification," which occurs when the same entity is found to be involved in apparently different systems.
The entire process of theory construction was one that, for Maxwell, involved a variety of methods, each of which contributed in an essential way to the development of electromagnetism. Physical analogies were used in the development of a mechanical model that eventually led to the formulation of the field equations. While the end product, a dynamical field theory, was independent of hidden assumptions, it did rely on physical ideas whose heuristic
value was not considered "mere" by Maxwell. Their value was not measured by their degree of truth, but they nevertheless played a constitutive role in the advancement of science. The process is summed up quite nicely in Maxwell's address to the Mathematical and Physical Sections of the British Association:
The figure of speech or of thought by which we transfer the language and ideas of a
familiar science to one with which we are less acquainted may be called Scientific
Metaphor.
Thus, the words velocity, momentum, force, etc., have acquired certain precise
meanings in elementary Dynamics. They are also employed in the Dynamics of a
Connected System in a sense, which, though perfectly analogous to the elementary
sense, is wider and more general.
These generalized forms of elementary ideas may be called metaphorical terms in
the sense in which every abstract term is metaphorical. The characteristic of a truly
scientific system of metaphors is that each term in its metaphorical use retains all the
formal relations to the other terms of the system which it had in its original use. The method is then truly scientific, that is, not only a legitimate product of science but capable of generating science in its turn.
There are certain electrical phenomena again which are connected together by
relations of the same form as those which connect dynamical phenomena. To apply
to these the phrases of dynamics with proper distinctions and provisional reservations
is an example of a metaphor of a bolder kind; but it is a legitimate metaphor if it
conveys a true idea of the electrical relations to those who have already been trained
in dynamics (Maxwell 1870, p. 227; italics added).
of disparate models. For Duhem, this use of models was undesirable at the level
of qualitative representation and was simply intolerable in the context of
algebraic theories where systematic order was the sine qua non. As we saw
earlier, Duhem himself was not concerned with providing an explanation of
empirical laws, since this would involve an illegitimate appeal to metaphysics.
Instead the goal was to furnish a realistic account of physics that was based on
empirical data and methods that remained faithful to the limitations of
mathematical and experimental science, while providing some degree of unity
and order upon which our theories must be built. Consequently, one could admit
a plurality of laws, provided they were united at some level in a classification
scheme. In contrast, the British failed to distinguish between theory and the
variety of models that they used, giving the appearance of underlying disorder at
every level.
There is an important sense in which Duhem's criticisms of his British counterparts mask some of the similarities that existed between them. Maxwell clearly was concerned with securing an experimental foundation for electromagnetism and paid particular attention to what he saw as the limits of scientific
theorizing. Like Duhem, he emphasized the distinction between the laws of
physical science and an underlying reality that has no place in theory
construction (see Morrison 1992). Kelvin, although somewhat more zealous in
his attitudes toward realism, was similarly concerned with providing an
experimental basis for hypotheses like the vortex aether, and intended the
manipulation of mechanical models as a way of achieving this goal. Although
Duhem's dissatisfaction with the inherent disorder in British physics is certainly well-founded, ironically it has been the proliferation of inconsistent models and laws, together with the lack of strict deduction, that has produced the successful science we
have today.
5. Conclusion
apparent that the successful use of models does not involve refinements to a
unique idealized representation of some phenomenon or group of properties, but
rather a proliferation of structures, each of which is used for different purposes.
Indeed in many cases we do not have the requisite information to determine the
degree of approximation that the model bears to the real system. For example,
most if not all of the work done in high energy physics since the mid-seventies has been based on the quark model. Out of this model grew an
extensive account of the elementary structure of matter, a highly sophisticated
and predictively successful theory called quantum chromodynamics. Although
the models of this theory are highly detailed there has been no experimental
verification that fractionally charged particles (quarks) even exist. As a result
there is no way to determine the accuracy of the model aside from its predictive
power. But this is not simply a problem for cutting-edge research. In other
words, my argument is not that the Galilean account of models fails to take
account of recent work in high energy physics. Rather, it is that the Galilean
picture cannot even accommodate well-entrenched cases in nuclear physics and
the kinetic theory. These relatively simple systems cannot be modeled by simply
adding parameters to a basic theoretical structure. With each addition comes a
change in the fundamental assumptions about the nature of the system. What
this suggests is that in situations other than straightforward cases like
pendulums and rigid rods, we have idealizations that are different in kind from
what McMullin has characterized.
Given that many models cannot be evaluated on their ability to provide
realistic representations, we need to focus less on the distinction between
heuristic and realistic models, and instead, emphasize the way in which
models function in the development of laws and theories. This is especially true
given the character of mathematical physics and its emphasis on approximation, which increasingly eludes successful calculation. Because all models function in an heuristic way, the distinction serves only to separate features of models which cannot always be isolated in practice.
The post-seventeenth-century problem of approximation and idealization is not only a philosophical problem; as one to be countered by sophisticated calculations and the addition of parameters, it is also a difficulty that exists within scientific practice. We must be mindful, however, that the theories we
typically hold up as paradigms of successful science, theories which we claim
are realistic representations of physical systems, are those that require a
proliferation of models at the level of application. What this indicates is that the
question of whether a model corresponds accurately to reality must be recast in
a way that is more appropriate to the way in which models actually function
within the practice that we, as philosophers, are trying to model.*
Margaret Morrison
Department of Philosophy
University of Toronto
mmorris@chass.utoronto.ca
REFERENCES
Cartwright, N. (1983). How the Laws of Physics Lie. Oxford: Clarendon Press.
Cartwright, N. (1989). Nature's Capacities and their Measurement. Oxford: Clarendon Press.
Duhem, P. (1902). Les théories électriques de J. Clerk Maxwell: Étude historique et critique. Paris: Hermann.
Duhem, P. (1977). The Aim and Structure of Physical Theory. New York: Atheneum Press.
Gell-Mann, M. and Ne'eman, Y. (1964). The Eightfold Way. New York: W. A. Benjamin.
Gitterman, M. and Halpern, V. (1981). Qualitative Analysis of Physical Problems. New York:
Academic Press.
Klein, M. (1974). The Historical Origins of the van der Waals Equation. Physica 73, 28-47.
Laudan, L. (1981). Science and Hypothesis: Historical Essays on Scientific Methodology.
Dordrecht: D. Reidel.
Laymon, R. (1982). Scientific Realism and the Hierarchical Counterfactual Path from Data to
Theory. In: P. D. Asquith and T. Nickles (eds.), PSA 1982: Proceedings of the 1982 Biennial
Meeting of the Philosophy of Science Association, pp. 107-121. East Lansing, MI: Philosophy
of Science Association.
Laymon, R. (1985). Idealizations and the Testing of Theories by Experimentation. In: P. Achinstein
and O. Hannaway (eds.), Observation, Experiment and Hypothesis in Modern Physical Science,
pp.147-174. Cambridge, MA: MIT Press.
Maxwell, J. C. (1856). On Faraday's Lines of Force. Reprinted in: Niven (1965), vol. I, pp. 155-229.
Maxwell, J. C. (1861-2). On Physical Lines of Force. Reprinted in: Niven (1965), vol. I,
pp. 451-513.
Maxwell, J. C. (1865). A Dynamical Theory of the Electromagnetic Field. Reprinted in: Niven
(1965), vol. I, pp. 526-597.
Maxwell, J. C. (1870). Address to the Mathematical and Physical Sections of the British
Association. Reprinted in: Niven (1965), vol. II, pp. 215-229.
Maxwell, J. C. (1873). A Treatise on Electricity and Magnetism. Oxford: Clarendon Press. 3rd ed.
of 1891 (Oxford: Clarendon Press) reprinted by Dover, New York, 1954. All references are to
the Dover edition.
Maxwell, J. C. (1876). On the Proof of the Equations of Motion of a Connected System. Reprinted
in: Niven (1965), vol. II, pp. 308-9.
Maxwell, J. C. (1879). Thomson and Tait's Natural Philosophy. Reprinted in: Niven (1965), vol. II, pp. 776-785.
* Thanks for helpful conversations to Mauricio Suárez, Mathias Frisch, Nancy Cartwright, Paul Teller, R.I.G. Hughes, Paddy Blanchette, Paolo Mancosu, Dorit Ganson, and Peter McInerney.
In: Martin R. Jones and Nancy Cartwright (eds.), Idealization XII: Correcting the Model. Idealization and Abstraction in the Sciences (Poznań Studies in the Philosophy of the Sciences and the Humanities, vol. 86), pp. 173-217. Amsterdam/New York, NY: Rodopi, 2005.
174 Martin R. Jones
Theoretical fluid mechanics is an attempt to predict the behavior of real fluid motions
by solving boundary value problems of either appropriate partial differential
equations or integral equations . . . In deriving the well-set boundary value equations,
we postulate certain boundary and inner conditions which inevitably dictate the
final form of the solution. With such a set of equations, we can solve few problems.
Analytic solutions are impossible, numerical solutions are inappropriate, and nothing
appears to work. Only the simplest fluid flow problems can be solved.
Therefore, we introduce idealizations into the problems. We might assume that
the fluid is independent of time, reasoning that the disturbances are of secondary
importance. We could assume that the fluid is ideal [i.e., has zero viscosity], when in
fact no known fluid is ideal. But because the viscosity may be small, much smaller
than, say, for water, the idealization will yield solutions that are acceptable. What
else might we assume? The possibilities are endless. For example, we could assume
the flow is (a) symmetric, (b) incompressible, (c) not rotating, (d) one-dimensional,
(e) continuous, (f) isothermal, (g) isobaric, (h) adiabatic, (i) reversible, (j) homogeneous, etc. The flow, of course, may be none of these, for all are idealizations (Granger 1995, p. 17).
Both Chomsky and Granger are drawing attention to specific ways in which
the real systems in their respective domains of inquiry are knowingly and
systematically misrepresented: no real speaker-listener is unaffected by memory
limitations, and no real fluid has zero viscosity or is, strictly speaking,
incompressible. Representations also omit features of the systems under study
without thereby misrepresenting them, of course: any real speaker-listener has a
specific height and weight, and real fluids are particular colors. Putting these
two points somewhat colorfully, we might say that when, in the various
sciences, we theorize about a certain class of systems, we habitually lie about
some aspects of the systems in question, and entirely neglect to mention others.
I intend to take this distinction between misrepresentation and mere
omission as fundamental, and to suggest that we organize our terminology
around it. On the regimentation of usage I am thus proposing, the term "idealization" applies, first and foremost, to specific respects in which a given representation misrepresents, whereas the term "abstraction" applies to mere omissions.1, 2 One of my two primary aims in this paper is to develop this way
1
The examples of omissions just given may make abstractions in this sense seem relatively
uninteresting; the discussion of abstraction and relevance, in section 2 below, will help to dispel that
impression.
2
Nancy Cartwright carves things up somewhat similarly in Nature's Capacities and their Measurement. One important difference, however, is that Cartwright seems to build into the notion
of abstraction she employs at least two features that I do not wish to build into mine: (i) that it is
causal factors which we are focussing on when we subtract various other features of the situation,
and (ii) that the material in which the cause is embedded is subtracted when we formulate an
abstract law. Cartwright also claims that in the case of laws which are abstract in her sense, it makes no sense "to talk about the departure of the . . . law from truth"; this is not something that will necessarily be true of laws which are abstract in the sense I hope to characterize below. Note also
that in the passages in which Cartwright characterizes her notions of idealization and abstraction,
respectively, she speaks for the most part (although not exclusively) of idealized models and
abstract laws. I should thus emphasize that on the proposal I wish to offer, idealizations and
abstractions each appear plentifully both in models and in laws (Cartwright 1989, pp. 187-8).
3
There is by now a considerable body of work on idealization and abstraction in the philosophical
literature; indeed, a significant part of that literature is represented in the series containing this
volume (including much important work which has been done in continental Europe). It is no part of
my ambition in this paper to provide a comprehensive discussion of the range of approaches which
have been developed by various authors. Rather, my aim is to develop one specific proposal and
show that some useful work can be done with it. (For an introduction to some of the European literature, see, for example, Leszek Nowak's "The Idealizational Approach to Science: A Survey" (1992).)
4
The distinction I have in mind thus loosely parallels the medieval legal distinction between
suggestio falsi and suppressio veri, except that the phrase "suggestion of a falsehood" is rather euphemistic in the case of many idealizations, and that furthermore, at least on a good day, no one is deceived by scientific idealizations and abstractions. (I am indebted to Alan Code for some useful information concerning the medieval terms.) On that note, it is worth emphasizing that idealization need not involve the assertion of a falsehood on our part; it is enough for a model to contain an idealization that it misrepresent the world in some respect. We can use idealized models without believing the untruths they speak.
5
That is not to say, of course, that a given representation cannot idealize some features of a system
and abstract away from others.
6
See, for example, Ernan McMullin's "Galilean Idealization" (1985), an important contribution to the discussion of this topic. Whilst recognizing that "[t]he term, idealization, itself is a rather loose one," McMullin opts for taking it to signify "a deliberate simplifying of something complicated . . . with a view to achieving at least a partial understanding of that thing." He then adds: "[Idealization] may involve a distortion of the original or it can simply mean a leaving aside of some components in a complex in order to focus the better on the remaining ones" (p. 248). In my terms, then, McMullin uses the label "idealization" for both idealization and abstraction. Interestingly, however, on the next page, in characterizing what he calls "mathematical idealization," of which omission is the primary characteristic, McMullin shows a momentary preference for the other term: "Aristotle . . ., of course, separate[d] mathematics quite sharply from physics, partly on the basis of the degree of abstraction (or idealization) characteristic of each . . . [M]athematics abstracts . . . from qualitative accidents and change. A physics that borrows its principles from mathematics is thus inevitably incomplete as physics, because it has left aside the qualitative richness of Nature. But it is not on that account distortive, as far as it goes" (p. 249, my emphasis). McMullin's "mathematical idealization" would by my lights clearly be classed as a form of abstraction, rather than idealization. (One might, on the other hand, read McMullin's distinction between "formal" and "material" idealization (pp. 258-9) as similar to my distinction between idealization and abstraction, in spirit at least. See n. 35.)
predict and those whose effects we cannot, for example. We might also
distinguish amongst RIs with respect to the source of our knowledge concer-
ning them at the most coarse-grained level, for example, with respect to
whether the source is theory or experiment. Alternatively, we might find it
useful to draw distinctions along lines which reflect the causal relevance of the features which we have either misrepresented or omitted. Each of these
distinctions will be important and useful in some philosophical contexts, as will
yet others, and some of them have been given positions of central importance in
other approaches to these issues. Indeed, we will return to some of the
distinctions I have just mentioned at various points later in this essay. My claim,
however, is that tying the terms "idealization" and "abstraction" to the semantic
distinction between misrepresentation and omission provides a good starting
point if one wishes to construct a larger framework which illuminates the
various ways we think about imperfection in scientific representation, and
which enables us to articulate certain ideas in greater detail. Just such a frame-
work will be developed in the remainder of the paper.
We need to begin by thinking about the sorts of things which contain
abstractions and idealizations. Models, laws, and theories perhaps come first to
mind, but we might add explanations, predictions, calculations, graphs, and
diagrams to the list. (No doubt we could go on.) In what follows, I will focus
largely on the first two sorts of item, models and laws. The hope is that if we
can say what it means for models and laws, respectively, to involve
idealizations, and what it means for them to involve abstractions, then much of
the rest will follow. Scientific explanations and predictions will, at least in many
cases, involve idealizations or abstractions just because they employ laws,
models, or theories which do, and the same can be said for calculations
performed in the service of other ends.7 I will take it that graphs and diagrams
are implicitly covered in my discussion of models, for they are either models
themselves, or, perhaps, means of presenting models.
That leaves only theories. The relation between laws, models, and theories
has been a much-debated issue in the philosophy of science for at least the last
thirty years or so. The older, syntactic view typically regarded theories as
deductively closed sets of sentences in a formal language, or at least as ration-
ally reconstructable along such lines; the language itself was often regarded as
only partially interpreted. Some especially important sentences, or (on some
views) all the sentences which make up the theory, are then taken to state its
7
Consider, for example, calculations performed to check the internal consistency of a theory, or to
check the equivalence of what are intended to be two formulations of the same theory. (Of course, I
do not wish to suggest that all explanations or all predictions involve calculation.)
laws.8 The newer semantic view, on the other hand, is usually characterized as
holding that theories are collections of models.9 The only stand I wish to take on
these issues is, in the current climate, quite a minimal one, and it is that theories
tend to involve both laws and models as important components.10 If that is right,
then in characterizing the ways in which both abstractions and idealizations can
occur in laws and models, we can hope to gain a considerable purchase on the
ways in which they occur in scientific theories.
The structure of the rest of this paper is thus as follows: I discuss
idealization and abstraction in models in sections 1-4. I begin by focussing on
models of particular systems, and offer a more precise account of the basic
distinction I have drawn as it applies in that setting (section 1). I then consider
what else we might have in mind when we speak of idealization and of
abstraction, in addition to misrepresentation and mere omission respectively
(section 2). I go on to extend the account to models of kinds of systems (section
3), and to talk of degrees of idealization and abstraction in models, and talk of
idealization and abstraction as processes (section 4). Then, in sections 5-9, I turn
to laws. After a few necessary preliminaries (section 5), I distinguish three
different ways in which idealization can occur in laws and our employment of
them (sections 6-8), and close by saying a few brief words about abstraction in
laws (section 9).
8
This view is still quite often called the Received View, even though the label is, by now, highly
anachronistic. For a statement of the syntactic view, see Carnap (1970); for well-known critiques,
see Suppe (1972) and (1974a), and Putnam (1979).
9
This slogan can be rather misleading, however. See Jones (forthcoming a) for further discussion.
The semantic view is presented and developed in different ways in: Suppes (1957, ch. 12), (1960),
(1967), (1974); van Fraassen (1970), (1972), (1980, ch. 3), (1987), (1989, ch. 9); Suppe (1967),
(1974a), (1989); and Giere (1988).
10
Although proponents of the semantic view typically wish to draw our attention towards models,
and away from such relatively linguistic items as laws (or law statements), they are certainly not
aiming to eliminate the latter notion. Frederick Suppe, one of the earliest and most well-known
proponents of the semantic view, devotes a considerable part of his extended treatise on theory
structure, The Semantic Conception of Theories and Scientific Realism, to providing an account
of various types of law; indeed, the work contains more explicit discussion of the nature of laws
than of the nature of models (Suppe 1989). (Suppe's aim is to give an account of laws which avoids tying them too closely to sentences in any particular language.) Even van Fraassen's extended attack on laws in his Laws and Symmetry (1989) is primarily directed at a number of philosophical theses about laws and the role they play in science and epistemology; he does not deny that Ohm's law, Boyle's law, the Hardy-Weinberg law, or Schrödinger's equation play some sort of important
role in the theoretical practice of their various sciences, and in the theories with which they are
associated.
Before we attempt to say more about what it means to talk about idealization
and abstraction in models, it will be useful, particularly in the current
philosophical climate, to say something about models themselves. The term "model" is used in a wide variety of ways in the philosophy of science, and in
science itself. Distinguishing the notions which go by that name and relating
them to one another, although crucial for some philosophical purposes, is a
lengthy and complex matter. Fortunately, it is not something we need to
accomplish in any detail here; a few broad outlines will suffice.11
On some uses of the term, a model is a model of a set of sentences, in the
sense that it makes the sentences in the set true,12 often by providing them with
an interpretation on which they turn out true. On other uses, a model is a model
of an object, system, event, or process, in the sense that it represents, or is used
to represent that object, system, event, or process as having certain features,
behaving in certain ways, and so on.13 (For the sake of brevity, I will hereafter
speak simply of systems and features.) It is only models in the latter sense
which will concern us here; for our purposes, a model is, first and foremost, a
representation.14 One can go on to distinguish at least three notions of model as
representation in the philosophy of science and in the sciences themselves, the
differences lying in the kind of object which does the representing in question: a
mathematical structure, such as a vector space with a trajectory running through
it (i.e., a function mapping points in some interval on the real line, representing
times, to elements of the vector space, representing states of the modeled
system); a set of propositions; or a physical object, such as an engineer's scale model of a bridge, or an electrical circuit used to represent the behavior of an
acoustical system.15 These differences, however, are at least initially unimpor-
11
For a taxonomy of some of the central notions of model abroad in the philosophy of science, a
discussion of the suitability of the various notions to certain tasks, a critique of the semantic view
(at least in some of its incarnations), and a case for taking a somewhat different view of theory
structure, as well as references to the relevant literature, see Jones (forthcoming a); see also Jones
(forthcoming b).
12
Or true-in-the-model, in certain logical contexts. See Jones (forthcoming a); and thanks to Charles
Chihara for drawing my attention to this point.
13
See Frisch (1998) and Jones (forthcoming a) for further discussion of the distinction between
models as truth-makers and models as representations.
14
In principle, of course, one and the same object might serve as a model in both senses. If and
when this does occur, then we will be focussing on the object's role as representation, rather than its role as truth-maker. (Something like this situation arises in the version of the semantic view van Fraassen presented in "On the Extension of Beth's Semantics of Physical Theories" (1970), in which the state space for a given system plays a role in the formal semantics for the language of the
theory. As I understand that approach, however, it would be an oversimplification to say that one
and the same object functions as both representation and truth-maker.)
15
For more on these three notions of model as representation, see Jones (forthcoming a).
tant from our present point of view. We can make a start on the job of clarifying
the notions of abstraction and idealization simply by thinking of a model as
something which represents a given system as having various features.
In fact, this is insufficiently general, for in addition to models of particular
systems, such as models of the 1989 Loma Prieta earthquake or the Big Bang,
there are also models of kinds of systems, such as Bohr's model of the hydrogen atom or a classical model of electromagnetic radiation in vacuo.16 The strategy I
will adopt, however, is as follows: I will take as basic the notion of a specific
idealization contained in a model of a particular system (i.e., an aspect of the
model which idealizes the system in some specific respect), and the parallel
notion of a specific abstraction present in a model of a particular system. After
spending some time providing an account of these two notions (in this section
and the next), it will then be a relatively quick matter to extend the account into
certain neighboring areas: talk about a model of a kind idealizing and
abstracting in specific respects (in section 3); the classification of a model as
idealized, highly idealized, or an idealization, or as abstract, highly abstract, or
an abstraction; comparative judgements about the extent to which various
models idealize a given system, or kind of system, both in a given respect and
overall, and about the degree of abstractness of various models; and talk about
idealization and abstraction as component processes in scientific theorizing (in
section 4).17
Let us begin, then, by considering a simple example of the use of a model to
represent a particular system.
Suppose that on some particular afternoon a certain cannon has been
wheeled onto an open plain and fired. In the attempt to predict where the
cannonball will land, or perhaps to explain why it lands where it does, we might
construct a model of the system along the following lines.18 We assume that the
16
We also sometimes speak of using one and the same model to represent different particular
systems, or even different kinds of system, on different occasions. It is worth noting that it is easier
to make sense of such talk when the model in question is an abstract mathematical structure or a concrete physical object than when it is a set of propositions.
17
To say that this will be a quick matter should not be taken to suggest that there will be no open questions by the time we are done.
18
The example dates back to Niccolò Tartaglia's Nova Scientia of 1537, the first two books of which are translated in Drake and Drabkin (1969), but the modeling of the situation presented here is, of course, far more modern, and typical of treatments to be found in contemporary introductory-level textbooks in classical mechanics. Tartaglia, incidentally, rather charmingly distanced himself from the choice of example in the later Quesiti (1546): "I . . . have never made any profession of or delighted in shooting of any kind (artillery, arquebus, mortar, or pistol) and never intend to shoot" (Drake and Drabkin 1969, p. 98). The opening of the Nova Scientia, in which Tartaglia describes the history of his work in a letter of dedication to the Duke of Urbino, contains an expression of more emphatic, if somewhat partisan, feelings on the matter: "[O]ne day I fell to thinking it a blameworthy thing, to be condemned, cruel and deserving of no small punishment by God, to study and improve such a damnable exercise, destroyer of the human species, and especially of Christians in their continual wars. For which reasons . . . not only did I wholly put off the study of such matters and turn to other studies, but I also destroyed and burned all my calculations and writings that bore on this subject" (Drake and Drabkin 1969, p. 68).
[Figure: coordinate axes for the cannonball's flight, with origin 0 and horizontal axis x]
After it has been fired from the cannon, we suppose that the cannonball moves under the sole influence of gravity, which exerts a force vertically downwards with a magnitude of mg (the mass of the cannonball, m, multiplied by a certain constant, g = 9.8 m/s²) throughout the motion. Thus we have:

F_y = −mg (1)

for the force in the y direction, and

F_x = 0 (2)

for the force in the x direction. Newton's second law of motion gives us

F_y = m(d²y/dt²) (3)

and

F_x = m(d²x/dt²) (4)

so we get

d²y/dt² = −g (5)

and

d²x/dt² = 0 (6)
19
A result which Tartaglia also achieved, and for which he cites experimental evidence; see Drake and Drabkin (1969), pp. 64-5.
20
If we are thinking of a model as a state space with a trajectory running through it, then no specific model of the cannonball's trajectory has been presented; for that, we would need to choose a state space from amongst the various spaces adequate to the job, and specify values of v, θ, and m. In the set of propositions sense, on the other hand, a particular model has been specified; a more detailed propositional model could simply contain additional propositions concerning the values of the various parameters.
model represents it as not having (and, correlatively, some property the system
does not have which the model represents it as having). On the regulative
proposal I am making, it is correct to talk of idealization with respect to a model
of a particular system only when such a state of affairs obtains. To put the rule
schematically: If the model represents a system as having the properties φ1, φ2, . . ., φn, . . ., and lacking the properties ψ1, ψ2, . . ., ψn, . . .,21 then a given aspect of the picture of the system presented by the model is an idealization only when that aspect of the picture represents the system as having some φi which it does not in fact have, and/or as lacking some ψi which in fact it has.22
There are two things I want to note about this proposal before we go on to
formulate a similar proposal concerning abstraction. First, it is important to bear
in mind that what matters, according to the proposed necessary condition on
being an idealization, is whether, in the relevant respect, the model represents
the system as being the way it is; the issue is not whether the model represents
the system as being the way we take it to be, nor even the way we take it to be
when we are speaking as strictly as we can manage. This makes it acceptable to
speak of discovering that some assumption made by some model is an
idealization, or even of discovering that something we had formerly taken for an
idealization is not one (a less likely turn of events, admittedly), and of doing so
simply by discovering something new about a certain system. I take it that this
comports with much standard usage of the term idealization, and the fact that
it does so is a point in favor of the proposal. If, however, it should be decided
that some standard usage makes agreement with our Sunday best beliefs the
crucial thing, and disregards the question of how the system actually is, or if it
should be deemed useful to employ the term in such a way for some
philosophical purposes, it will be an easy matter to introduce a distinction
between senses of the term, and some device to make it clear which sense is
intended on a given occasion. In this essay, though, calling some aspect of a
model an idealization will imply that that aspect distorts the truth of the matter,
but will not imply any conflict with the way we take things to be, even if such
conflict is often present.
Secondly, it is not clear, and I do not mean to be supposing, that there will
be only one way, or even a best way, to individuate the idealizations present in
some particular case. For example, the model we have just considered represents
the gravitational force of the Earth on the cannonball as constant in both
21
The labeling system used here should not be taken too seriously; in particular, I would not wish to
assume that either the properties the model represents the system as having, or those it represents
the system as lacking, form a countable set.
22
This may require some modification if we wish to allow for the possibility that it might be
indeterminate, in some non-epistemic way or other, whether a given system has a given property, as
perhaps some idealizations might then involve a model's ascription of a property to a system, for example, when in fact it is objectively indeterminate whether that system has that property.
direction and magnitude throughout the region in which the cannonball moves,
whereas in fact there will be variation in both respects. Is that one idealization,
or two? Or an uncountably infinite number, one (or two) for each spatial point at
which the model misrepresents the Earth's gravitational field? There would
seem to be little prospect of settling upon a non-arbitrary answer to such
questions, and that fact will become important later, when we need to account
for talk of degrees of idealization in models.23 Note, however, that we have here
no objection to the coherence of the proposal itself, nothing to prevent us from
saying: "This model's representing the system as being φ (or as not being ψ) counts as an idealization only if the system is not in fact φ (or is ψ)."
Returning now to our model of the flight of the cannonball, we can treat of
abstraction in a manner quite parallel to that in which we have just treated of
idealization. There are innumerable features of the modeled system which the
model omits, without thereby misrepresenting or distorting the system: no
mention is made of the composition of the cannonball, of its internal structure,
of its color or temperature, of the composition of the ground over which the
cannonball flies, of the mechanism by which an initial velocity is imparted to the ball, or of the country in which the firing takes place. The model is simply
silent in all these respects. And according to the regulative recommendation I
wish to put forward, we should say that a model of a particular system involves
an abstraction in a particular respect only when it omits some feature of the
modeled system without representing the system as lacking that feature.24
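The pair of necessary conditions just laid down (misrepresentation for idealization; omission without misrepresentation for abstraction) lends itself to a toy formalization. The following Python sketch is purely illustrative; the property names are invented, and the set-theoretic rendering is mine, not the author's:

```python
def idealizations(asserted, denied, actual):
    """Respects in which the model misrepresents the system: properties it
    asserts that the system in fact lacks, plus properties it denies that
    the system in fact has."""
    return (asserted - actual) | (denied & actual)

def abstractions(asserted, denied, actual):
    """Mere omissions: actual features of the system about which the model
    is simply silent (neither asserted nor denied)."""
    return actual - asserted - denied

# The cannonball model, very schematically:
actual = {"subject to gravity", "subject to air resistance", "made of iron"}
asserted = {"subject to gravity"}       # the model says this, truly
denied = {"subject to air resistance"}  # the model treats drag as absent

idealizations(asserted, denied, actual)  # {"subject to air resistance"}
abstractions(asserted, denied, actual)   # {"made of iron"}
```

On this rendering, discovering a new fact about the system (enlarging the set of actual features) can turn what we took for a faithful assertion into an idealization, which matches the point made earlier: what matters is how the system actually is, not how we take it to be.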
The two points just made about my proposal concerning the use of the term
idealization apply here, too, mutatis mutandis. First, that is, the specified ne-
cessary condition for the presence of an abstraction concerns the relationship
between the model and the actual features of the modeled system, not the
relationship between the model and the features we take the system to have
although, as with idealization, it would be simple enough to recognize another sense of the term "abstraction" in which it is the latter relationship which is
crucial. Second, individuating and counting abstractions would seem necessarily
to be a somewhat arbitrary process. Again, this will be important later, when we
come to deal with degrees of abstractness (in section 4), but it presents no
obstacle to coherently formulating the condition I have just laid out.
23
See section 4.
24
Talk of a models involving an abstraction is somewhat artificial, but it facilitates a more fine-
grained discussion of the various ways in which the model as a whole is abstract, or is an
abstraction; such phrasing also emphasizes the parallels and the contrasts with idealization.
Perhaps a more familiar way of capturing essentially the same phenomenon is to speak of various
respects in which a given model abstracts away from features of the modeled system.
The proposals I have just put forward provide only partial guides, for each
lays down only a single necessary condition on correct application of the relevant term: misrepresentation for "idealization," and omission without misrepresentation for "abstraction." We must thus now turn to the twin questions of
what more it takes for a misrepresentation to count as an idealization, and what
more it takes (if anything) for an omission to count as an abstraction. As we do
so, however, the purpose of my remarks becomes a different one; let me take a
moment to explain how.
So far, my intention has been to delineate two simple proposals for
regimentation of what is currently a confusing tangle of conflicting usages. As I
noted earlier, these proposals come into direct conflict with the way some have
used the relevant terms; nonetheless, they are quite continuous with the usages
of others. We might then hope (and I do) that the proposals in question can
also be read as successfully capturing part of the meanings of the terms "idealization" and "abstraction" on those usages with which they are continuous. With that hope in mind, we may then be tempted to fall into familiar
philosophical habits, and start looking for full accounts of the existing meanings
of the terms on those usages by searching for additional individually necessary
conditions on correct application, never ceasing until we have a set of
conditions which are jointly sufficient. This is not, however, what I intend to
do.25 Instead, my primary aim is simply to identify some factors which are often
present when we speak of idealization and abstraction, and which perhaps have
something to do with our speaking that way on those occasions when they are
present. (Each factor I will mention, incidentally, has also seemed important to
at least one other person in their discussion of idealization and abstraction.) No
claim is being made that the presence of any of these factors is a necessary
condition on the presence of an idealization (or, later, abstraction), nor that their
joint presence is a sufficient condition for the same; I leave open, in other
words, the possibility that the concepts of idealization and abstraction I am
focussing on here are cluster concepts of some sort, or perhaps entirely
irreducible concepts which are nonetheless importantly connected in some way
to the concepts I am about to mention. Whatever the exact semantic structure of
25
The reader should not be misled by the fact that at one point I will consider whether the presence
of a certain factor might constitute a further substantive necessary condition on abstraction, and
will, at another, ask whether the presence of a certain combination of factors is a sufficient
condition for idealization. In both cases I draw a negative conclusion, and my intention, apart from
that of generally illuminating the conceptual landscape, is if anything to cast doubt on the project of
finding a complete set of individually necessary and jointly sufficient conditions, rather than to
engage in it.
things, my hope is that the following discussion (of section 2) will lend greater
clarity to our thinking about idealization and abstraction.26, 27
2. Going Further
It is clear that as far as much standard usage is concerned, not just any
misrepresentation on the part of a model of a particular system counts as an
idealization. One sort of case which makes this clear is that in which the model
as a whole is substantially off-target. Consider some specific episode of
combustion and a model of the process according to which the burning object
releases phlogiston into the air. The model represents the piece of wood, say, as
initially containing phlogiston; this is certainly a misrepresentation, but just as
certainly it would not ordinarily be called an idealization.
This suggests that perhaps a misrepresentation has to approximate the truth,
or appear as part of a model which captures the approximate truth overall, or at
least appear as part of a model which gets the basic ontology of the modeled
system (i.e., its constituents and central features) right in order to count as an
idealization, as opposed to being a useful fiction, say, or an outright mistake.
Whether any of these conditions are or should be made necessary conditions on
being an idealization is, as I have said, a question I will leave open (although I
suspect that the answer is no); but notice that even the conjunction of them,
taken in combination with the condition that misrepresentation must be taking
place, does not make up a sufficient condition for something's being an idealization. To see this, imagine a classical electrodynamical model I might construct of the flight of an electron through a simple, homogeneous magnetic field, one which gets things almost exactly right except that the mass it attributes to the electron is slightly off: 9.108 × 10⁻³¹ kg instead of 9.109 × 10⁻³¹ kg, say. Here we surely have both misrepresentation and approximate truth in all of the above senses, but it would again seem odd, from the point of view of standard usages, to call the model's attribution of a mass of 9.108 × 10⁻³¹ kg to the electron an idealization.28
26
I am also not making further recommendations for regimentation in the next section. Should
further explicit regimentation be deemed philosophically valuable, on the other hand, I would hope
that the following discussion identifies some of the leading candidates for the post of additional
necessary condition.
27
Remember that at this point we are still restricting our focus to talk of idealization and abstraction
in models of particular systems, and in specific respects. Other ways of talking about idealization
and abstraction in models will be covered in sections 3 and 4.
28
That is, odd to call it an idealization in virtue of those features (misrepresentation with
approximate truth) alone. See n. 30.
Idealization and Abstraction: A Framework 187
I will mention just two further features which many idealizations in models
of particular systems seem to have, in addition to that of misrepresenting the
modeled system in some particular respect. First, as is often noted, idealizations
typically make for a simpler model than we would otherwise have on our hands,
and they are often introduced for precisely that reason.29 It is usually presumed
that pragmatic concerns drive such maneuvers: the desire to get started, to make
a decent prediction relatively quickly rather than a perfect prediction in the
distant future or no prediction at all, for example. With simplicity, it is expected,
will come tractability. And in the background lies the hope that, should the
initial results achieved with the simpler model seem promising, a less idealized
and more predictively accurate cousin of the original can be constructed later
on, albeit at the cost of greater complexity.30
As a distinct issue from simplification, many idealizations can also be said to
misrepresent relevant features of the modeled system. But relevant to what, and
how? Here again pragmatic concerns come to the fore.31 We often have some
specific purpose in mind when we construct a model of a particular system: to
explain or make predictions about certain aspects of the behavior of the system,
say. (Of course, there may be an ulterior motive driving the pursuit of those
goals, such as the confirmation or refutation of a theory.) And that specific
purpose might then single out a set of relevant features of the system: those
which are explanatorily or predictively relevant to these aspects of the behavior
of the system. Furthermore, depending on one's views about explanation or
prediction, one might ultimately take causal relevance, say, or statistical
relevance to be the crucial thing.32 In any case, perhaps we are more likely to
call a model's misrepresentation of a system in a given respect an idealization
when the misrepresented feature (or features) of the system is (or are) taken to
be relevant, in the appropriate manner, to those aspects of the behavior of the
system which we were primarily concerned to predict or explain when we
29
As we saw above (n. 6), Ernan McMullin puts simplification at the heart of his initial
characterization of the notion of idealization (which, as he understands it, incorporates both
idealization and abstraction in my senses).
Incidentally, it may well be possible to usefully distinguish different kinds of simplicity (mathematical simplicity versus some sort of conceptual simplicity, say, or versus structural simplicity of the system as modeled), but I will not pause to attempt such a thing here.
30
Thus perhaps the misattribution of a certain mass to an electron, mentioned as an example in the
preceding paragraph, might count as an idealization after all if it makes for a model which is
somehow simpler, or simpler to use, than it would be otherwise, say by making certain calculations
easier (perhaps via some convenient canceling out which it makes possible).
31
By "pragmatic" here, I simply mean having to do with the purposes for which we have constructed the model in question. One such purpose may be explanation, but no commitment to what is known as a "pragmatic" theory of explanation is implied.
32
And of course one might believe those sorts of relevance to be intimately related in one way or
another.
constructed the model. (Certainly we are more likely to want to draw attention
to misrepresentations of features which we take to be relevant to the behavior in
which we are especially interested.) And, independently of whether that is so, it
might prove fruitful in certain contexts to regiment our terminology in such a
way that idealizations necessarily misrepresent relevant features.33, 34
We turn now from idealization to abstraction, and ask what more there is to
abstraction than omission. Having just discussed three important things which some, and perhaps many, idealizations do (approximating the truth, contributing to simplicity, and misrepresenting relevant features of the modeled system), it is natural to wonder whether there is a parallel story to be told about abstractions.
The first and quite trivial point in this regard is that clearly there is no sense
in which an abstraction can approximate the truth, nor any interesting sense in
which it can fail to, simply because when a model contains an abstraction with
respect to some particular property, it is entirely silent on the matter of whether
the system it models has the property or lacks it; nor is it obvious (to me,
anyway) that there is any other closely related job which abstractions can do.
Might a contribution to overall simplicity be a crucial condition on an
omission's being an abstraction, or at any rate on its being an interesting abstraction? Well, simplification would seem to be an automatic consequence of
omission, in that, of two models of a given system, the one which omits mention
of a certain feature of the system will thereby be the simpler model, ceteris
paribus. Thus, although it might be true that abstraction always contributes to
simplification, it would be of no help to say that an omission counts as an abstraction only if it contributes to the simplicity of the model, for all omissions do. Still, it is worth noting that abstractions always contribute to simplicity, as
this is no doubt one of the reasons we employ them.
33
Note that if we do think of idealizations as involving the distortion of relevant features, and if
relevance is elaborated along the lines I have sketched here, then which features of a given model
are correctly called idealizations might vary with the pragmatic context of use of the model: what is and is not an idealization, in other words, may depend in part upon what we are at a given moment hoping to predict, or explain. On the other hand, it may be that certain elements of the
model misrepresent features of the system which count as relevant for most of our typical purposes,
in which case we may classify those elements of the model as idealizations without any particular
context in mind.
34
On the account Nancy Cartwright offers (1989, chapter 5; see esp. p. 187 and pp. 190-1), both
simplification and relevance play a crucial role. According to Cartwright, the point of constructing
an idealized model (that is, a model containing a number of idealizations) is to single out the causal
capacities of one particular factor in a given situation, and such a model does that by representing
all other potentially causally relevant factors as absent. In doing so, the model typically both
simplifies the causal structure of the situation, and misrepresents how things stand with respect to a
number of causally relevant features of the system in question (by representing such features as
absent when they are present).
With regard to the role of relevance, matters are perhaps a little less
straightforward. Both Nancy Cartwright and Ernan McMullin seem committed
to the claim that models will contain abstractions (in my sense) only with
respect to features which we deem irrelevant, for they each suggest that in
constructing a model we will always include some assumption about the
presence or absence of any feature of the modeled system which we deem
relevant to the behavior we seek to explain or predict (Cartwright 1989, p. 187
and McMullin 1985, pp. 258-64, esp. p. 258).35 This seems plausible enough,
especially if we are sufficiently liberal on the question of what assumptions a
given model should be taken to include. For example, in constructing a model of
the cannonball firing discussed above, we might not expressly postulate the
absence of air resistance, but we might nonetheless be said to have constructed a
model in which it is assumed that there is no air resistance, simply because it is
Earth's gravitational force on the cannonball which we feed into Newton's second law, a law which relates mass and acceleration to the total force on the body.
Note, however, that there are still two reasons for saying that models can
abstract away from relevant features of the modeled system. First, even to the
extent that Cartwright and McMullin are correct, there may be relevant features
of the system which we mistakenly deem irrelevant, and abstract away from
when constructing our model. Secondly, models sometimes seem to abstract
away from features which we in fact deem relevant, but to make some distinct
idealizing assumption which "screens off" the presence or absence of the
relevant feature.36
This latter point is nicely illustrated by a case McMullin discusses, in fact.
To derive the ideal gas law for some specific body of gas using the kinetic
theory of gases, we construct a model of the gas according to which the
molecules making it up are perfectly elastic spheres which exert no attractive
forces upon one another (1985, p. 259). As McMullin emphasizes (p. 258), such
35
McMullin distinguishes between two central sorts of idealization (which, the reader will recall, encompasses both idealization and abstraction in my sense of those terms) in models, calling them "formal" and "material" idealization, respectively (1985, pp. 258-64). Formal idealizations deal with relevant features of the modeled system, and material idealizations with irrelevant ones. Although McMullin does sometimes write of formal idealizations which "omit" features of the modeled system, this always seems to mean omit by representing as absent, and so in context to imply misrepresentation, as opposed to meaning omit by not mentioning, which is the sort of omission that abstraction in my sense must involve. It is McMullin's material idealization, dealing as it does only with features which are deemed irrelevant, which involves omission in the sense I have made crucial. (More carefully, we might say "not deemed relevant," as in some cases the feature is one we have not conceived of at the time the model is constructed; consider McMullin's own example of electron spin, discussed on pp. 263-4.)
36
I am using the term "screens off" in a general figurative sense here, and not in its technical probabilistic sense.
a "billiard ball" model of the gas makes no mention of the internal structure of
the molecules. Yet that internal structure is surely relevant to predicting the way
in which the pressure, volume, and temperature of the gas will covary, for it is
the internal arrangement of the parts of the molecules and of the atoms which
make them up that gives rise to the attractive intermolecular van der Waals
forces, and once those forces are taken into consideration, a different equation
relating pressure, volume, and temperature results. In the original model, the
possibility that such relevant features of the modeled system have been ignored
is "covered," so to speak, by the idealization that there are no attractive intermolecular forces; nonetheless, it seems accurate to say that the model omits mention of a feature of the modeled system (namely, the structure of its molecular components) which is relevant to the behavior with which the model is concerned.
Thus models do sometimes abstract away from relevant features of the
systems they model; and when they do so, that is an important fact about them.
If we wished to stress the importance of relevance, we might choose to stipulate
that abstraction should only be spoken of when what is omitted is a relevant
feature of the system. We might then want to consider a related idea suggested
by (although not explicitly contained in) the work of Cartwright and Mendell
(1984) and Griesemer (this volume), that an abstraction properly so-called
should always involve the omission of a feature of the modeled system which is
of one explanatory kind or another.37 Alternatively, we could take the more
generous line that although there is nothing more to abstraction per se than
omission (omission without misrepresentation, that is), the omission must be of
a relevant feature, in the sense indicated above, if it is to be an interesting or
important abstraction. I shall not, however, try to weigh the relative merits of
these two approaches here, nor even try to determine whether there is anything
at stake in the choice between them.
37
See also Cartwright (1989, pp. 212-24). For Cartwright and Mendell the taxonomy of explanatory kinds is derived from Aristotle's four-fold taxonomy of causes; for Griesemer it is provided by
background scientific theories. These authors have a different quarry in mind than I do at this point:
rather than trying to say when a feature of a model counts as an abstraction, or what it means to say
that a model abstracts away from the concrete situation in this or that respect, they are concerned to
provide a way of ordering models, theories, and the like with regard to their overall degree of
abstractness. Note that there is, nonetheless, an internal tension in Cartwright's views here: if a
representation is more abstract when it specifies features of fewer explanatory kinds, then if there
are to be representations of different degrees of abstractness, there will have to be representations
which abstract away from features of some explanatory kind, and so from features which are at least
explanatorily relevant. This would then seem to cast into doubt the claim that a model must say
something, albeit something idealizing, about all the factors which are relevant (p. 187). But
perhaps not all features which are of some explanatory kind or other count as explanatorily relevant
in a given situation, or perhaps Cartwright has a different sort of relevance in mind, one which is not
too tightly connected to explanatory relevance.
3. Models of Kinds
It is not too difficult to see how we might extend the ideas discussed above to
the case of a model which is, or is on some occasion functioning as, a model of
a kind of system, rather than a particular system. The model will represent
things of kind Φ, the Φs (neutrons, carbon atoms, cells, free-market economies, ecosystems . . . ), as each having properties ψ1, ψ2, . . ., ψn, . . ., and as failing to have properties χ1, χ2, . . ., χn, . . .; it will also omit any mention of a very large number of properties ω1, ω2, . . ., ωn, . . .. In the spirit of the regulative proposals described above, then, we can adopt the rule that the model should be said to contain an idealization in representing the Φs as having ψi only if some Φs fail to have ψi, and that the model should be said to contain an idealization in representing the Φs as failing to have χi only if some of the Φs have χi.38 Similarly, we can stipulate that a model of a kind should be said to contain an abstraction with respect to the property ωi (of which it makes no mention) only if some Φs have ωi.39
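The regimentation just proposed can be given a toy formalization. The following Python sketch is purely illustrative and not part of the author's apparatus; the function name, property labels, and data structures are all hypothetical assumptions introduced only to make the classification rules explicit.

```python
# Toy formalization of the proposed regimentation for a model of a kind.
# A model attributes some properties to systems of the kind, denies others,
# and omits the rest; the actual systems each have some set of properties.

def classify_claims(attributed, denied, omitted, actual_systems):
    """Return which of the model's claims count as idealizations,
    and which of its omissions count as abstractions.

    attributed / denied / omitted: sets of property names the model
        ascribes to the systems, denies of them, or fails to mention.
    actual_systems: list of sets, the properties each actual system has.
    """
    idealizations = set()
    # Attributing a property is an idealization only if some system lacks it.
    for p in attributed:
        if any(p not in s for s in actual_systems):
            idealizations.add(("attributes", p))
    # Denying a property is an idealization only if some system has it.
    for p in denied:
        if any(p in s for s in actual_systems):
            idealizations.add(("denies", p))
    # Omitting a property is an abstraction only if some system has it.
    abstractions = {p for p in omitted
                    if any(p in s for s in actual_systems)}
    return idealizations, abstractions

# Hypothetical example loosely modeled on the kinetic-theory case:
systems = [{"mass", "attractive_forces", "internal_structure"},
           {"mass", "internal_structure"}]
ideal, abst = classify_claims(
    attributed={"mass", "perfect_elasticity"},
    denied={"attractive_forces"},
    omitted={"internal_structure"},
    actual_systems=systems)
# Attributing mass is no idealization (every system has mass); denying the
# attractive forces is, and omitting internal structure is an abstraction.
```

The sketch simply encodes the "only if" directions of the rules; it does not pretend to settle the open question of whether these conditions should also be sufficient.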
In addition to imposing this bit of regimentation, we then add that idealizations often approximate the truth about those Φs they misrepresent (or at least
most of them), that idealizations can contribute to the degree of simplicity the
model enjoys, and that idealizations and abstractions are at their most interesting
when they distort the truth about, or (respectively) omit mention of, features
38
Note that "some Φs" can strictly be read as meaning "at least one Φ"; it is just that if the model misrepresents only one Φ in the relevant respect, then we are likely to regard the model as idealizing the Φs only very slightly. See the account of talk of degrees of idealization in section 4.
39
In some cases we speak of a model's abstracting away from a quantity (understood as a determinable), such as the electric dipole moment of the molecules in a gas, or a qualitative determinable, such as the color of a cannonball. A model of a kind should then be said to contain an abstraction with respect to a certain determinable (qualitative or quantitative) when it makes no mention of that determinable (nor any of the corresponding determinate properties), but some of the Φs have one determinate property or other from the set of determinate properties corresponding to the determinable. (Typically, I suppose, if one of the Φs possesses a determinate property from the set corresponding to the determinable, then they all will, but there may be exceptions to this.) A similar point applies to the notion of abstraction in a model of a particular system.
which are relevant, in some specified sense, to the behavior we are concerned to
study. It is perhaps also interesting to note that idealizations in models of kinds
sometimes ascribe a property, ψi, to the Φs which is in one sense or another a limiting case of some range of properties actually instantiated by the various Φs. (For example, ψi might be the property of experiencing zero air resistance,
in a model of pendula.)40
40
One difficulty for this approach to understanding talk of idealization and abstraction in models of
kinds is that it clearly cannot be extended straightforwardly to talk of idealization and abstraction in
models of uninstantiated kinds, relying as it does on there being some Φs to have, or fail to have,
one or another property. Perhaps when a model of an uninstantiated kind (such as a relativistic
model of some particular kind of universe very unlike our own) is said to contain an idealization or
an abstraction, it is implicitly being thought of as a highly idealized model of an actual kind or
particular (such as the actual universe). Or perhaps we might appeal to modal facts about what
properties Φs would have were there any. (This latter approach seems more promising than the
former when dealing, say, with a model of some sort of molecule which does not occur naturally
and which we have never synthesized.) These are topics for further investigation.
idealization, but that it is highly idealized, or when we explicitly say just that;
when we say that M is a more, or less, idealized model than some other model;
when we speak of a process or activity of idealization and mean the process or
activity of constructing a series (two-membered, in the trivial case) of increasingly idealized models; or when we speak of "de-idealization" or "correction," and mean the converse process of constructing a series of increasingly less idealized models. (A term often used to describe the parallel activity of constructing less and less abstract models is "concretization."41 We also speak of one model's being "more realistic" than another, and it seems to me that this can mean that the "more realistic" model is less idealized, or that it is less abstract,
or both.) What these ways of talking have in common, obviously, is that
implicitly or explicitly, they all express judgements (often comparative judge-
ments) about the degrees of idealization various models exhibit. In making such
judgements, I want to suggest, we are typically taking into account a number of
different factors.
Consider first a model, M, of a particular system. The natural approach here
is to regard the question "How idealized a model is M?" as made up of two component questions: (i) "How many idealizations does M contain?" (ii) "How much of an idealization is each of the idealizations which M contains?" The
answer to the original question, about the overall degree of idealization M
enjoys, can then be arrived at by taking a sort of weighted sum over all the
particular idealizations M contains, in which attaching a heavy weighting to a
particular idealization amounts to claiming that that idealization idealizes the
modeled system to a significant degree, and so on.42
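The "weighted sum" picture can be made concrete with a deliberately crude calculation. The sketch below is a hypothetical illustration only; as n. 42 stresses, no precise algorithm for these judgements is intended, and all numbers and names here are invented.

```python
# Toy rendering of the "weighted sum" account of a model's overall degree
# of idealization. Each idealization contributes a degree (how badly it
# distorts) and a weight (how heavily that distortion counts).

def overall_degree(idealizations):
    """idealizations: list of (degree, weight) pairs, degrees in [0, 1].
    Returns a weighted average, so the result also lies in [0, 1]."""
    if not idealizations:
        return 0.0  # a model containing no idealizations is not idealized
    total_weight = sum(w for _, w in idealizations)
    return sum(d * w for d, w in idealizations) / total_weight

# Hypothetical example: a model with one severe, heavily weighted
# idealization (say, ignoring friction) and one mild, lightly weighted one.
degree = overall_degree([(0.9, 2.0), (0.1, 1.0)])
```

Taking a weighted average rather than a raw sum keeps the comparison between models with different numbers of idealizations intelligible, though nothing in the text commits the author to either choice.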
This little account of the overall extent to which a model of a particular
system is idealized begs a question, however, because it assumes that we
already have a well-defined way of talking about the degree to which a
particular idealization, contained within the model, idealizes the system in
question.43 So how is this latter quantity measured? It seems to me that all we
mean to do when we talk about the degree to which model M idealizes system S
by representing it as having property ψi (as when we exclaim, for example,
41
Juan Gris's description of the relationship between his work and Cézanne's provides a nice parallel to the discussion of abstraction and concretization as converse processes of model construction in science: "I try to make concrete that which is abstract . . . Cézanne turns a bottle into a cylinder, but I make a bottle, a particular bottle, out of a cylinder" (quoted in the entry on Gris in Chilvers (1990)).
42
This talk of weighted sums is not to be taken too seriously: I do not mean to suggest that in
making judgements about the degree of idealization a model enjoys we employ some precise
algorithm, nor that we are always, or even usually, able to attach precise strengths to the various
idealizations the model contains, nor even that we are interested in doing so.
43
The account begs another question, too: about how we are to count idealizations. This difficulty
was foreshadowed in section 1 (see the text containing n. 23), and I will return to consider it at the
end of this section.
44
Not that these claims are strictly incompatible with the idea that degrees of simplification and of
relevance are part of what we are evaluating when we evaluate the degree to which a particular
idealization idealizes: we might be taking a weighted sum of the degree of distortion, the degree of
simplification, and the degree of relevance, and simply weighting the degree of distortion
significantly more heavily than either of the other two factors. Nonetheless, it seems to me as a
matter of linguistic intuition that that is not how we do it, and the alternative picture I am sketching
accounts for the facts stated in parentheses in the text far more straightforwardly.
45
This does not necessarily mean that the question of the degree of misrepresentation and that of the
degree of approximate truth involved in an idealization are interchangeable; whether they are
depends on how we understand the notion of approximate truth. Crudely put, the telling question in
this regard is: Does a representation have to be approximately true to have any degree of
approximate truth? To put the point slightly less crudely: If we understand the notion of
approximate truth in such a way that it is possible for partial truths to lack any degree of
approximate truth (sc., partial truths which are sufficiently close to being complete falsehoods), then
we may want to countenance degrees of distortion which distinguish between representations all of
which have the same degree of approximate truth, namely, zero. If, on the other hand, any partial
truth, no matter how partial, is to have some degree of approximate truth, then perhaps degrees of
misrepresentation and degrees of approximate truth may be regarded as interdefinable.
46
See, for example, Niiniluoto (1998). Teller (2001) argues that Giere's version of the semantic
view (1988, ch. 3), with its emphasis on similarity as the crucial relation between representation and
represented, helps us to see clearly a way of resolving some standard worries about approximate
truth. (And note that Giere himself claims that focussing on a relation of similarity allows us to
circumvent various standard philosophical difficulties centering on the notion of truth, difficulties
which have been taken to pose particular problems for the scientific realist (1988, pp. 81-2 and
p. 93).) See Jones (forthcoming b) for criticisms of Giere's approach, and Teller's paper for a response. See also Giere's paper in the present volume for more on models, theory structure, and
realism.
47
I am assuming here that this difference is not compensated for by other differences in the
gravitational forces ignored by the two models, differences which will be due to relatively small
differences in the distances of various heavenly bodies from the Earth and from the moon. In any
case, the judicious placement of a black hole can provide us with a more clear-cut example, if one is
needed.
48
This talk of idealizing in respects and to degrees echoes Ronald Giere's language in his (1988),
ch. 3. According to the view of theory structure he presents there, a theoretical hypothesis makes
claims about the respects in and degrees to which a certain model is similar to a particular system,
or type of system, and, roughly speaking, we might say that there is an inverse relation between
degree of similarity and degree of idealization. Again, for criticisms of Giere's notion of model and
his talk of similarity, see Jones (forthcoming b).
cases where we have no ready way of measuring with any exactness the feature
of the system misrepresented in the model: returning to our early example from
Chomsky, some speakers have more limited memories, and are more easily
distracted than others.49 This is not to say that fine-grained discriminations can always be made, however: often attempts to compare degrees of idealization in specific respects will lead only to a partial ordering, and not due to some limitation in our powers of discernment. And, more importantly, I should
emphasize that in recognizing talk of degrees of misrepresentation, of distortion,
and of truth as legitimate in general, I am not assuming that such talk is always
meaningful. Perhaps in some cases it simply does not make sense to ask how
close to the truth "S is ψ" is, or how great a misrepresentation it is. On the
account given here, this simply implies that it will not be possible to talk about
the degree of idealization involved in this aspect of the model. The account
offered, such as it is, is intended only as an account of the content of such
judgements when they can be made.50
We are now in a position to outline an understanding of talk of degrees of
idealization in models of kinds. At the lowest level of resolution, the natural
approach is exactly parallel to the one we adopted for models of particular
systems: We break the question "How idealized a model is M?" down into the two questions (i) "How many idealizations does M contain?" and (ii) "How much of an idealization is each of the idealizations which M contains?", and
find the answer to the original question by taking a sort of weighted sum over
the various particular idealizations contained in M, bigger weights being
assigned to bigger (particular) idealizations. The difference in this case,
however, lies with the fact that question (ii) can itself then be divided in two.
We begin by classifying M's ascription of ψi to the Φs (systems of the kind Φ) as an idealization just in case there is at least one Φ which lacks ψi. Then for each particular idealization M contains we must ask (iia) "What fraction of the Φs does M idealize in this respect?"; in addition, for each Φ it idealizes, we must then ask (iib) "To what degree does M idealize Φj by representing it as being ψi?" (quite possibly receiving different answers for different Φs). Thus, to put it
another way, when dealing with models of kinds, the answer to question (ii) is
itself arrived at by taking, for each idealization the model contains, a weighted
sum over all the Φs which that idealization idealizes, bigger weights
corresponding to greater degrees of idealization in the sense discussed above
49
This is not to say that precise measures of, say, memory capacity could not be constructed; indeed, psychologists have constructed such measures. The conceptual point is made, however, as
soon as we recognize that those of us who have at our disposal no such precise measures can clearly
make judgements of degree on these scores nonetheless, and thus could judge one given model to be
a more idealized model of a certain speaker than another, at least in one of these particular respects.
50
Note that nothing I have said prevents the picturing of S as ψ from counting as an idealization simpliciter in such a case.
Extending what I have been calling the natural approach to talk of degrees
of abstraction (or abstractness) results in a considerably simpler story than the
one I have just told about degrees of idealization, and for two reasons. First,
there is no question of evaluating the degree to which a model omits a given
feature of a given system, and so no parallel to the talk of degrees of
misrepresentation or distortion we employed in the case of idealization. This
fact then seems to preclude talk of the degree to which a model abstracts away
from a feature of a particular system. Second, in the case of a model of a kind, it
would seem that if a model abstracts away from a given feature possessed by
one system of that kind, then it abstracts away from that feature, or from other
determinates of the same determinable, for every system of the kind. So, taking
color as our example of a determinable, if a model of pendula abstracts away
from the redness of this pendulum, then it will abstract away from the redness,
or blueness, or greenness, as the case may be, of any other pendulum. And
counting the fact that a model abstracts away from the redness of red pendula
and the fact that it abstracts away from the greenness of green pendula as two
separate abstractions would seem to be double counting. The point might be put
by saying that, when counting abstractions, we should count the number of
determinables the determinates of which a given model abstracts away from.
And a model cannot abstract from the determinates of a certain determinable in
the case of some systems but not in the case of others. Thus there is no analogue,
when interpreting talk of degrees of abstraction with respect to a model of a kind,
to counting the number of systems of that kind which the model idealizes in
some given respect. Given these two points, in the case of abstraction the natural approach reduces to the idea that to ask about degrees of abstraction is just to ask how many abstractions the model contains, both in the case of a model of a particular system and in the case of models of kinds.
An intuitively appealing way of making sense of talk about overall degrees
of idealization and abstraction in models readily suggests itself, then. And given
such an account, we seem to be in a position to make equally good sense of the
various sorts of talk about idealization and abstraction listed at the beginning of
51. Again, this mathematical talk should not be taken too literally; see n. 42.
52. There may be a difficulty here with the idea that what is relevant is the make-up of the class of actual Ks. Intuitively, the worry would be that the actual Ks may not be representative of the kind, especially if by "actual Ks" we mean the Ks existing at some particular time. For some preliminary thoughts on a related problem, see n. 40.
53. Again, see also Cartwright (1989), pp. 212-224.
5. Laws: Preliminaries
54. Although not in those words. Cartwright, for example (1989, p. 220), writes of those features which are and are not specified.
55. Or some closely related form, such as "Φs are followed by Ψs", "Φs have Ψ", etc. (where "Φ" and "Ψ" are doing multiple duty as stand-ins for various parts of English). For the purposes of brevity, I will take "Φs are Ψs" to be canonical. "Φ" and "Ψ", of course, can stand for pieces of scientific English which are syntactically far more unwieldy than the Greek letters themselves (and the use of standard examples concerning black ravens and white swans) might suggest.
56. At least for the most part; see the discussions of the contrapositives objection in sections 7 and 8.
57. See Dretske (1977), Tooley (1977), and Armstrong (1983).
58. See Cartwright (1989), esp. chapter 5, for such a view.
Φs tend to be Ψs,
reading this paraphrase in such a way that it entails "Most Φs are Ψs", or "Many Φs are Ψs", but not "All Φs are Ψs". Or one might take a law statement of form (N) to be more accurately paraphrased by a statement of the form
All Φs, in virtue of being Φs, have the capacity to (be) Ψ,
the truth of which is, I take it, quite compatible with there being no Φs which are Ψ, even if there are Φs.59
Attributing form (N) to law statements is thus intended to commit us to very
little; nonetheless, it provides us with a sufficient foothold to allow us to make
some substantive claims about idealization and abstraction in laws, whilst at the
same time allowing us to remain neutral on the standard philosophical issues
just mentioned.60
Before outlining the remainder of what I have to say about laws, it will be helpful if we first fix a piece of terminology: I will say that the law that Φs are Ψs, or a law statement (which may or may not express a genuine law) of the form "Φs are Ψs", applies to a given system just in case the system is a Φ.61
The discussion of idealization and abstraction in laws, then, will proceed as
follows: I will begin by devoting some time to distinguishing and characterizing
three different sorts of law-related idealization. First, there are statements of form (N) which we treat as statements of law for some purposes, and which may apply to a large number of systems, but which are actually false, and false in a way which makes the statements themselves idealizations. For convenience, I will say that such statements express "quasi-laws". Secondly,
there are genuine laws typical employment of which nonetheless involves
idealization; here the idealization is required in order that we may regard the law
59. I emphasize these last two readings with Cartwright (1989) in mind. In her terms (inspired by Mill), the former reading provides us with a "tendency law", the latter with a "law about tendencies" (p. 178). See also chapter 5, section 5 of her book.
Some might say that a statement cannot express a genuine law, or a law in some especially important sense, if it does not entail the corresponding (x)(Φx → Ψx) statement. If that is so, however, then we may need to reckon with the possibility that the statements which we typically classify as expressions of law in actual scientific practice do not express genuine laws, or laws in the especially important sense. Be that as it may, we are concerned here with understanding talk of idealization and abstraction as it applies to actual scientific practice, and the things we classify as laws whilst engaged in that practice.
60. Note that if there are laws which do not take form (N), then what I have to say may not cover them. On the other hand, it seems to me that it would be a relatively straightforward matter to extend the following account beyond the domain of laws to general scientific claims of all sorts, provided only that they have the right sort of form.
61. There is a certain worry one might have here about whether the notion of application thus defined is well-formed in the case of laws (rather than law statements). For a discussion of that point, see the discussion which closes section 7 below.
as applying. I will refer to such laws as "idealized laws". And thirdly, there are genuine laws which only truly apply to systems which are ideal in some sense; these are the "ideal laws".62 The distinction between the second and third sorts
of law-related idealization may be especially unclear at this point, but things
will become clearer below.63 My elaboration of the various distinctions will rely very centrally on that fundamental notion (an account of which provided the starting point for the account of idealization in models given in sections 1 and 2), that is, the notion of idealizing a particular system in some specific respect.
Briefly, the idea is (in part) that when a law statement of the form "Φs are Ψs" expresses a quasi-law, then representing the systems we are dealing with as Ψs is an idealization in at least some cases, whereas when it expresses either an idealized law or an ideal law, our use of the law requires us in many, most, or all cases to indulge the idealization that various systems are Φs.64 Following all of
this, I will then provide a relatively quick account of abstraction in laws, one
which builds in a simple way on the account of abstraction in models presented
above.
6. Quasi-Laws
62. For brevity's sake, I will also apply these labels to the statements which express such laws, whenever doing so will produce no confusion.
63. It may help to bear in mind that the categories of idealized law and of ideal law are (at least) overlapping. Again, see below.
64. Although it will turn out that this is only contingently true of ideal laws.
65. The list of sample necessary conditions is included because it is not quite enough to say "Law statement L must be false" to express a quasi-law; L must be false for the right reasons, so to speak.
and without meaning to prejudice the issue, I will assume hereafter that we have
adopted an account of laws on which a law does entail the corresponding
(x)(Φx → Ψx) statement. I think it will be clear how to modify the following
remarks in order to accommodate views on which this entailment does not hold.
A necessary condition, then, for a statement of form (N) to express a quasi-law, given the assumption I have just made, is that some Φs are not Ψs (regardless of what we believe in that regard).66 Another is that, at least for
some purposes, we treat the statement as a law, citing it in explanations,
employing it to make predictions, using it to support counterfactuals, and so on.
Even together these conditions are clearly not sufficient, however, if being
a quasi-law is to involve idealization in some way: some statements of form
(N) which at one time or another we have treated as expressing laws for various
purposes (and which we have perhaps taken to express actual laws) have
simply been false, without being or involving idealizations, and without it
being appropriate to say that we were idealizing in treating the statements in
question as expressing laws. So what else is involved in something's being a quasi-law?
Recalling the discussion of idealization in models, we can immediately
identify three important features a quasi-law may have. One is approximation: "Φs are Ψs" might be approximately true. Another is simplicity: "Φs are Ψs" might be a simplification of the truth about Φs. And finally there is relevance: saying that Φs are Ψs might misrepresent features of the non-Ψ Φs which are relevant in one or more ways to the purposes we have in mind when employing
If, for example, law statements should be understood as claiming to express relations of necessitation between universals, then it is not enough for no such relation of necessitation to hold between the universals in question; it may still be accidentally true that all the Φs are in fact Ψs, in which case I see no reason to speak of idealization (as opposed to error of another kind). The crucial feature of a quasi-law I am trying to get at might be rather imperfectly put this way: If L is a quasi-law, then the "extensional content" of L must be false. And for an illustration of what I mean by this imprecise use of the phrase "extensional content", I must then refer the reader back to the list of sample necessary conditions provided in the text.
66. A difficulty arises here in the case in which there are no Φs, as intuitively it seems that a statement of the form "Φs are Ψs" might still be treated as a law in some circumstances, and might nonetheless be an idealization in something like the way I am trying to characterize. (Note that this is reminiscent of some problems we encountered in understanding talk of idealization with respect to models of uninstantiated kinds.) Perhaps we might attempt to address this difficulty by invoking counterfactuals, and making it a necessary condition on such a statement's being a quasi-law that if there were Φs, not all of them would be Ψs. It is easy to discover problems with at least the initial statement of this solution, however; whether those problems could be overcome, or whether the original difficulty can be dealt with in some other way, will be left as a question for further exploration.
the quasi-law as though it were a genuine law.67 Given this, we can settle on the
following characterization of the notion of a quasi-law:
L, a statement of form "Φs are Ψs", is (or expresses) a quasi-law if and only if (i) L is treated as a law for some purposes, but (ii) some of the Φs are such that it is an idealization to represent those Φs as Ψs.
It follows, of course, from condition (ii) and my account of particular idealizations in specific respects (in sections 1 and 2) that some of the Φs are not Ψs if L states a quasi-law, and thus that statements of quasi-laws are false, and (so) do not express laws.68
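The definition's two clauses lend themselves to a mechanical rendering. The following toy sketch is mine, not the author's; the system names and the distortion test are hypothetical stand-ins for Φ and Ψ:

```python
def is_quasi_law(treated_as_law, systems, is_phi, idealizes_as_psi):
    """'Phis are Psis' expresses a quasi-law iff (i) it is treated as a law
    for some purposes, and (ii) some Phi is such that representing it as a
    Psi is an idealization."""
    phis = [s for s in systems if is_phi(s)]
    return treated_as_law and any(idealizes_as_psi(s) for s in phis)

# Toy illustration: calling a feather's fall "constant 9.8 m/s^2" distorts it.
systems = ["brick", "hammer", "feather"]
print(is_quasi_law(True, systems,
                   is_phi=lambda s: True,                      # all are falling bodies
                   idealizes_as_psi=lambda s: s == "feather"))  # prints True
```

Dropping either clause changes the verdict: a statement never treated as a law, or one that idealizes no Φ, comes out as not a quasi-law.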
To illustrate the notion of a quasi-law, consider as a simple example the law
of gravitational free fall. A typical statement of this (so-called) law might run as
follows: "Near the surface of the Earth, a falling body accelerates at a constant rate of 9.8 m/s²." In terms of our schema, then, Φ is "system falling near the surface of the Earth", and Ψ is "system accelerating at a constant rate of 9.8 m/s²". Now it is not true that all systems falling near the surface of the Earth accelerate constantly at 9.8 m/s². For one thing, there are feathers, leaves, and scraps of paper (like Neurath's thousand mark note69) whose fall is influenced
by the passing breezes in such a way that their acceleration is often very
different from the value cited in the law. Suppose we find some way to modify
the law so as to exclude such bodies from consideration, and so as to restrict
attention to objects such as bricks, hammers, cannonballs, and people, objects
which, relatively speaking, are only marginally affected by non-gravitational
forces.70 I will suppose that this restriction can be effected in some principled manner, and indicate it by letting Φ stand for "system falling freely near the surface of the Earth".71
67. This, however, seems virtually guaranteed, and to the extent that it is, relevance so elaborated seems ill suited to be an independent criterion of quasi-lawhood. Perhaps instead we might focus on the distinct question of whether representing the non-Ψ Φs as Ψs misrepresents them in a way which is relevant to our purposes; whether, for example, this misrepresentation makes a difference to the predictions we are interested in making. The distinction here is between merely misrepresenting relevant features, and misrepresenting relevant features in a way that makes a (relevant) difference.
68. My intentions here are quite parallel to my intentions in the discussion of section 2, as described at the very end of section 1. Although I do provide a pair of individually necessary and jointly sufficient conditions for correct application of the label "quasi-law", so that there is a surface-level difference, the second of those conditions employs the notion of a particular idealization in a specific respect, a central notion for which I have not tried to give necessary and sufficient conditions.
69. See, e.g., Hempel (1969), p. 173.
70. As will become clear, restricting the law in this way is a step towards the production of an idealized law (as opposed to a quasi-law). See especially the last paragraph of section 8.
71. In the present context, then, the phrase "free fall" is not intended to imply the complete absence of non-gravitational forces such as air resistance, but merely their relative negligibility. (Note, on the
Even with this modification, however, it is not the case that all Φs are Ψs. The problem is not that the number 9.8 is not exactly right, but that there is no right number. The acceleration of a freely falling body varies from place to place, and in two ways: at a given latitude, it decreases with the height of the body above the ground, and for a given height, it increases with latitude.72 Thus the statement that all freely falling bodies near the surface of the Earth accelerate at a constant rate of 9.8 m/s² is straightforwardly false.73
Despite its falsity, we often treat the statement in question as though it expresses a genuine law when providing explanations, making predictions, designing equipment, and so on. In doing so, we are idealizing, and the statement itself can properly be said to be an idealization. This, then, is an example of what I am calling a quasi-law. And it is interesting to note that each of the other three factors mentioned above seems to be present: the statement is a simplification of the truth of the matter, it is an approximation to that truth, and the features of the systems in question which are misrepresented will certainly be relevant features in the contexts in which we treat the statement as expressing a law.
As with models, we can coherently talk about the degree of idealization involved in a given quasi-law, and presumably that is a product of two factors: the degree of idealization (that is, degree of distortion) involved in saying, of each non-Ψ Φ, that it is a Ψ, and the ratio of non-Ψs to Ψs amongst the Φs.74
7. Idealized Laws
other hand, that even if the complete absence of any force other than the Earth's gravitational pull were required for "free fall", everything I say in the next paragraph would still be true.) The hope is also, of course, that we have been able to arrive at a statement which is at least somewhat plausible without having to define the notion of free fall in such a way that we end up with a trivial truth.
72. The first sort of variation is a straightforward consequence of Newton's Law of Universal Gravitational Attraction, which we can treat as a genuine law for the present purposes of illustration (!); the second is due to the fact that the Earth is oblate rather than spherical. (For some data on this latter point, see Cohen 1985, p. 175.)
73. Remember that for expository purposes I am assuming that such "Φs are Ψs" statements should be read as entailing the corresponding (x)(Φx → Ψx) claims.
74. Or, to use the mathematical metaphor introduced earlier, we can say that the degree of idealization enjoyed by the quasi-law can be calculated by taking a weighted sum over the non-Ψ Φs, with the weights representing the degree of idealization involved in representing the corresponding Φs as Ψs, and then (to get a ratio) dividing by the total number of Φs (Ψ and non-Ψ).
75. To see that these are distinct claims, we need only allow that it is possible for the claim that Φs are Ψs to be true whilst the claim that it is a law that Φs are Ψs is false.
76. To put it another way, with an idealized law we pretend that there are more Φs than there are.
77. It might be objected that the law finds footholds aplenty if only we write it in the contrapositive form "All accelerating bodies are subject to a net force", for a multitude of accelerating bodies surrounds us. I will return to consider this potential objection at the end of this section.
78. Given again my assumption, made for purely expository purposes, that "Φs are Ψs" should be taken to entail (x)(Φx → Ψx).
79. The same remarks apply to this definition of the notion of an idealized law as I made with respect to the definition of the notion of a quasi-law; see n. 68.
condition on being an idealized law, and how not to understand it. Specifically,
rarity cannot here be tied to absolute number in the most straightforward way.
Suppose that a certain kind of stellar event occurs once every billion years, on
average, but that we are fortunate enough to avoid a Big Crunch, so that the
universe lives on indefinitely into the future. In that case, there will eventually
be a very large number of events of the kind in question; yet such events will
surely still count as rare occasions. Instead, it seems to me that to understand rarity in this sense, we need to see the law statement in question as embedded in a wider context, as part of a theory, or as a representation employed in the context of a particular sort of theorizing. We can then draw on the notion of a "domain of inquiry", a class containing just those systems which the theory is a theory of, or which the theorizing is about. The Φs then count as rare when a sufficiently small proportion of systems in the relevant domain of inquiry are Φs. To take a relatively extreme example, the bodies subject to a net force of zero are clearly a very small proportion of the class of all bodies, and it is for that reason that the second necessary condition on being an idealized law is satisfied for the law of inertia.80
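The proposal (rarity as a sufficiently small proportion of the domain of inquiry) can be sketched computationally. The 1% threshold below is an arbitrary placeholder of mine, since the notion of "sufficiently small" is deliberately left vague:

```python
def is_rare(domain, is_phi, threshold=0.01):
    """The Phis count as rare iff they make up a sufficiently small
    proportion of the systems in the relevant domain of inquiry."""
    count = sum(1 for s in domain if is_phi(s))
    return count / len(domain) < threshold

# Toy domain of 1000 bodies, exactly one of which is subject to zero net force.
domain = ["forced"] * 999 + ["force-free"]
print(is_rare(domain, lambda s: s == "force-free"))  # True: 0.1% < 1%
```

Note that the judgment is relative to the domain: the same class of Φs may be rare in one domain of inquiry and common in a narrower one.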
Given this way of understanding rarity, let me lay aside a potential objection
which strikes me as misguided. The objection I have in mind is that the notions of "proportion" and of "sufficient smallness" on which the explication depends are unacceptably vague.81 Such a complaint could be handled in a two-pronged way. If all that is sought is an account of the way in which we classify some laws as idealized, then some vagueness in the terms of the account is not obviously a bad thing; after all, there is plausibly some vagueness inherent in
the classificatory practice itself. If, on the other hand, it should seem desirable to
construct a notion of idealized law which is considerably more precise for some
philosophical purpose or other, then understanding the proposal at hand in this
80. Note that whether the Φs are rare will in many cases be a contingent matter: not merely logically or metaphysically contingent, but physically or nomologically contingent. This strikes me as an advantage of the account I am offering, for it seems to me that in at least some cases when we say that a law is idealized, we are (and take ourselves to be) uttering a (physically) contingent truth. (Had the world been full of Φs, as it might have been, such-and-such a law would not have counted as idealized; our use of the law constitutes an idealization only relative to the details of the particular world in which we employ it.) If there are cases, however, of idealized laws which do not seem to possess their status only contingently, then my hope would be that they qualify in virtue of there being sorts of rarity which are not physically contingent.
Incidentally, reflection on the notion of a quasi-law makes it immediately clear that whether L expresses a quasi-law is also contingent, but that, the presumed physical contingency of our theoretical practices being what they are aside, quasi-lawhood is only a metaphysically, and not a nomologically, contingent matter: it is trivially true that whether L expresses a law or not depends on what the laws are.
81. What is more, the domain of inquiry may well have vague boundaries.
way certainly leaves room for such a development; perhaps the relevant notion
of proportion might be cashed out in measure-theoretic terms, for example.82
It is also worth noting that the notion of an idealized law I have outlined might be, and indeed probably should be, elaborated upon by taking into account the extent to which, in treating of non-Φ systems by employing a "Φs are Ψs" law, thinking of the relevant systems as Φs also constitutes an idealization, and thus by taking into account the extent to which it contributes to the classification of the law as idealized, and as less or more so. Given what has come before, elaboration of that sort would be a relatively straightforward matter, and I will not enter into such embroidery here.
Let us close this discussion of the notion of an idealized law by considering an objection which might be raised to the definition laid out above. The objection is this: According to the proposed definition, for "Φs are Ψs" to state an idealized law, the Φs (that is, systems of the type mentioned first in the candidate law statement) must be, at best, few and far between in the relevant domain of inquiry. Another way of stating the very same law, however, is simply to take the contrapositive of the initial formulation, that is, "Non-Ψs are non-Φs"; and it may be that, while the Φs are rare, the systems mentioned first in this new formulation of the law, the non-Ψs, are quite plentiful, or even ubiquitous in the relevant domain. Thus it seems that whether a given law counts as an idealized law may depend on which of two syntactically distinct but semantically equivalent ways of expressing the law we consider. And surely that is wrong; surely we are here trying to classify laws (or putative laws, in the case of quasi-laws) with an eye to types of idealization, not law statements. Indeed, the law of inertia provides a perfect example: it may be that no body has ever truly been subject to no net force, and yet a multitude of accelerating bodies (non-non-accelerating bodies, so to speak) surrounds us. So is the law of inertia an idealized law or not?83
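The logical point the objection trades on can be checked mechanically: a conditional and its contrapositive agree in every case, while the classes of systems the two formulations mention first differ. A sketch of my own, with hypothetical body names:

```python
from itertools import product

# 'Phi x -> Psi x' and 'non-Psi x -> non-Phi x' agree on every truth assignment...
for phi, psi in product([False, True], repeat=2):
    direct = (not phi) or psi
    contrapositive = (not (not psi)) or (not phi)
    assert direct == contrapositive

# ...yet the systems each formulation mentions first differ:
bodies = {"coasting": {"phi": True,  "psi": True},   # force-free, non-accelerating
          "falling":  {"phi": False, "psi": False}}  # accelerating body
mentioned_by_direct = [b for b, v in bodies.items() if v["phi"]]
mentioned_by_contra = [b for b, v in bodies.items() if not v["psi"]]
print(mentioned_by_direct, mentioned_by_contra)  # ['coasting'] ['falling']
```

This is just the familiar observation behind the ravens paradox (see n. 84): logical equivalence of the statements does not make "being a Φ" and "being a non-Ψ" the same property of a system.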
The first point to be made in response to this objection is that it relies on an
assumption about law statements which by no means all accounts of lawhood
82. Comparing the sheer cardinality of the class of Φs with that of the class of non-Φs, however, will clearly not work, for as the example of rare stellar events in a universe of infinite lifetime suggests, both sets might have the cardinality aleph-nought even when the Φs do count as rare. The obvious problem a measure-theoretic approach would face, on the other hand, is that of locating a suitable provenance for the necessary measures.
83. Essentially the same line of thought just as readily gives rise to the claim that the definition I introduced earlier of what it is for a law or law statement to apply to a system is ill-formed with respect to laws (as opposed to law statements): The law that Φs are Ψs is supposed to apply to S iff S is a Φ; yet (this line of thought goes) the law that non-Ψs are non-Φs is the same law, and a given S may be a non-Ψ but not a Φ, or vice versa. (Indeed, if the law in question really is a law, then on this view any given S is bound to be one of the two!) So does the law apply to such an S or not? This is the worry I alluded to in n. 61, and the discussion which follows in the text should make it clear how I would respond to it.
84. This is also a very familiar point, of course, as it is the observation which gives rise to the ravens paradox in confirmation theory.
85. Armstrong springs to mind here as someone who believes that there are universals, but relatively few of them. More to the point, Armstrong argues that when the predicate "is Φ" picks out a universal, the predicate "is non-Φ" typically will not. See Armstrong (1978), chapters 13 and 14.
86. See nn. 57-59 for these competing views of laws.
87. Specifically, although Ayer's (1956) account seems deeply problematic to me (and many others), and although I am in fact inclined towards a capacities account of the sort Cartwright has been developing, I would rather not presuppose the falsity of what Earman calls the "Mill-Ramsey-Lewis" or "M-R-L" view (1986, pp. 87-90), and on that view the objection involving contrapositives would indeed have teeth, just because, as noted, laws on the M-R-L view do have the logical form traditionally ascribed to them.
This new definition retains all the central features of the initial definition, but
avoids the difficulty we have been considering involving contrapositive formu-
lations of laws. In fact, the first and second clauses on their own are sufficient to
defuse the objection involving contrapositives, strictly speaking the law of
inertia would now be declared an idealized law, given this formulation, even
without (iii). But (iii) is included because without it we get the wrong results: if
we only ever used the law of inertia to draw the conclusion about various
accelerating systems that they must be subject to net forces, say, then there
would be little reason to talk of idealization. In other words, it is because we use
the law in a certain way that it counts as an idealized law. But note that this is
quite parallel to the case of quasi-laws: to be a quasi-law, L has to be treated as
a law for some purposes. And this pragmatic dimension to the conceptual
distinctions I am drawing is present even in the case of idealization in models.
The account of idealization in models took as one of its fundamental building
blocks the notion of idealization in a specific respect, and that has partly to do
with relevance, which as characterized is clearly a pragmatic notion, and partly
to do with simplicity, which is quite plausibly pragmatic, too.
Despite all this, it might be argued that the new proposal still has its defects.
As things stand, if the law statement "All accelerating bodies are subject to a net force" does express the same law as "A body subject to no net force undergoes no acceleration", then it expresses an idealized law just because it expresses a
law which, by this definition, counts as idealized (and which, in fact, turns out
to be highly idealized). And that former statement thus counts as expressing a
highly idealized law even though there is an abundance of non-idealizing uses of it available to us; we do, after all, use it to conclude that various accelerating
bodies must be subject to some net force. If that should seem too uncomfortable
a way of talking, then at the end of the day it may be best to refrain from talking
of laws themselves as idealized or otherwise, and to limit ourselves instead to
thinking about idealizing uses of laws. Before leaping into such a rethinking of
the territory, however, it is worth remembering that these latter difficulties arise
only if certain views of the nature of lawhood should turn out to be correct.
8. Ideal Laws
In an attempt to capture one more way in which laws can be tied up with
idealization, let me begin simply by defining the notion of an ideal law, as
follows:
L, of the form "Φs are Ψs", is an ideal law if and only if (i) L is a genuine law, and (ii) Φs are ideal systems.
The first question which comes to mind here is what it means to say that the Φs are ideal systems. The basic idea is that some systems (or possible kinds of system) are ideal in the sense of being perfect, or "just right", and that some laws then count as ideal just because they govern such perfect systems (or would govern such systems if there were any). But what makes perfect systems perfect?
Cartwright proposes an intriguing answer to that question. In essence, she
proposes that certain sorts of system count as perfect or ideal because conditions
are just right for some particular capacity to reveal itself in such systems,
without hindrance from any distinct factor which might otherwise interfere.88
Thus a body subject to no net force is one which will reveal the inherent, if relatively unexciting, capacity every body has to just keep moving with a constant velocity.89 I find Cartwright's proposals attractive, and will implicitly allow them to fix the sense of the phrase "ideal system" in the remainder of the discussion, but it is worth bearing in mind the fact that one might wish to consider some other sense in which a system could be perfect, or ideal; the hope is then that the rest of what I have to say will carry over to other ways of being ideal.
The next issue to be addressed here is the relationship between the category
of ideal laws and that of idealized laws, and we can become clearer on the
nature of that relationship by thinking first about the connections between being
rare and being ideal. Logically speaking, these would seem to be independent
features of a system (or possible kind of system). Certainly some kinds of
system are both rare and ideal: consider the category of untrammeled bodies. It
is just as surely true, however, that there are kinds of system which are rare
without being ideal (in any obvious sense): charged metallic spheres subject to a gravitational force of magnitude 10.378 N and a repulsive electrical force of magnitude 17.58 N pushing in directions which make an angle of 53° with each other, for example. And it also seems possible that the world might have been
overflowing with ideal systems. Thus it is clear that the two features, being rare
and being ideal, are quite separate.
88. Cartwright (1989), chapter 5, esp. pp. 190-1. Ideal laws as I am characterizing them would seem to correspond to the laws which, on Cartwright's account (and in her terms), describe what happens in a situation depicted by an idealized model. Cartwright says that a law of this sort is a kind of ceteris paribus law: it tells what "[a] factor does if circumstances are arranged in a particularly ideal way" (p. 192). I do not wish to presuppose, however (and nor, I suspect, does Cartwright), that no ideal law ever applies to an actual system; perhaps ideal conditions are sometimes realized.
89. Cartwright may prefer to classify this as a (mere) tendency rather than a capacity (1989, p. 226), but that distinction need not concern us here.
On the other hand, it does seem to be true that in the actual world, ideal systems are rare; the two features are contingently correlated.90 Suppose for a moment that it is true to say that all ideal systems are rare. Suppose, furthermore, that all ideal laws, of the form "Φs are Ψs", are typically employed in a way which involves treating various non-Φ systems from the domain of inquiry as Φs, even though we are idealizing those systems in doing so. Then it would follow that all ideal laws are idealized laws.
The truth, of course, may not be so simple. Perhaps there are some kinds of
ideal system which are relatively common. Or (more likely) perhaps there are
ideal laws which we rarely or never use in the idealizing way described, simply
because we never use the laws in question at all, or because the idealization
involved in treating the actual non-φ systems around us as φs is simply too
great for it ever to be useful for us to employ the law in our dealings with
non-φ systems. Even so, it seems clear that there is considerable overlap
between the categories of ideal law and idealized law. We do in fact often use
ideal laws which apply (or would apply) only to some rare sort of ideal system
as though they applied to various non-φ systems in the world, and idealize in so
employing such laws. The law of inertia again provides us with a good example.
With this in mind, we can now see that ideal laws are implicated in practices
of idealization a little more loosely than either quasi-laws or idealized laws.
Even if it is true that all ideal laws are in fact idealized laws, this is so only
contingently. It is no part of the definition that an ideal law need ever be used in
a way which involves idealizations, and this distinguishes ideal laws from both
quasi-laws and idealized laws. Nonetheless, contingent though it may be, the
overlap between the categories of ideal law and idealized law is there, and so de
facto a full account of how idealizations arise in our use of laws must take
account of the category of ideal laws.91
There is one problem to be dealt with here before leaving ideal laws behind:
the worry about contrapositives again. Is the law of inertia an ideal law or not?
Untrammeled bodies are plausibly to be thought of as ideal systems (at least
90
And note that Cartwright's proposals account for this fact quite straightforwardly, given the
evident fact that our world is a rich and complex one in which numerous causal factors are typically
at work in any given situation.
91
One might wonder why I have defined the notion of ideal law in such a way that the connection
between ideal laws and idealizations is so loose. The answer is that it seems to me that we do
pre-philosophically recognize a category of laws which are special just because they deal with
systems which are ideal in some sense; and although it is true that such laws tend to get used in
ways which involve us in idealization, that is not necessarily part of what we have in mind when we
label such laws "ideal laws" or "laws concerning ideal systems." In other words, doing things this
way seems to me to result in a greater degree of continuity with pre-existing usage.
Incidentally, another standard locution which comes to mind when we think about idealization in
laws is that of the φs being a "special case"; this, it seems to me, is ambiguous between the notions
of being rare and being an ideal system, and so between talk of an idealized law and an ideal one.
insofar as they are untrammeled), but an accelerating body does not thereby
strike one as particularly ideal in any obvious sense. So which formulation of
the law of inertia should we look to when asking whether the law is an ideal
one?
Given the discussion of the parallel worry at the end of the last section, it is
easy enough to see what we might say here. For one thing, we might challenge
the claim that the contrapositives of law statements are semantically equivalent
to the law statements we started with.92 More diplomatically, we might modify
the definition of the notion of an ideal law, without losing anything essential, as
follows:
L is an ideal law if and only if (i) L is a genuine law, (ii) there is a formulation of L of
the form φs are ψs such that φs are ideal systems, and (iii) we often employ the
law in a way which involves treating various systems as φs.
Although the strategy here is in many ways parallel to the strategy I outlined for
dealing with the contrapositives problem in the case of the notion of an
idealized law, it is important to note that the third clause of this definition is not
quite the same as the third clause of the amended definition of that notion; in
particular, it is not required that to be an ideal law L must be employed in such a
way as to idealize non-φ systems, for it may be that whenever we employ the
law in a way which involves treating various systems as φs, the systems in
question are φs. As the discussion above makes clear, this is simply in keeping
with the original definition of the notion of an ideal law. However, this new
definition does require that we sometimes use the law, unlike the initial
definition, for it is that usage which now puts an emphasis on the fact that the
law can be thought of as concerning ideal systems.
* * *
This completes my account of the specific ways in which laws and our
employment of them can involve idealization. Before I turn to consider laws and
abstraction very briefly, however, there are two further points worth making
regarding idealization and laws.
First, there is the option of introducing further categories into our scheme for
classifying laws and law statements. Ideal and idealized laws must genuinely be
laws, whereas quasi-laws must not. Yet perhaps there are statements of the φs
are ψs form which purport to govern rare or ideal systems, and which we treat
92
Note that one can deny this without denying that it follows from the law of inertia (here identified
as the law that a body subject to no net force does not accelerate) that all accelerating bodies are
subject to a net force. If, for example, the law of inertia is understood as concerning a relation of
necessitation which holds between two universals, then given the law the contrapositive will surely
be true; it is just that the contrapositive will not state the law of inertia (or, most likely, any law).
as expressions of law in at least some contexts, but which do not in fact express
genuine laws.93 Some such statements might be said to express, or to be, ideal or
idealized quasi-laws ("ideal" for ideal φs, "idealized" for φs which are rare but
non-ideal).94 Employment of either an idealized quasi-law or (assuming that the
ideal systems in question are few and far between) an ideal quasi-law will then
typically involve us in the sorry business of making both the idealization that
the statement in question is true and the idealization that it applies to the system
in hand.95
Second, the classificatory scheme I have laid out corresponds nicely to an
important claim Cartwright makes in her 1983 book, How the Laws of Physics
Lie.96 According to Cartwright, the sorts of law statement which we value most
in our scientific work must generally be taken in one of two ways: as widely
applicable but false, or as true but quite restricted in their scope of application.
The idea is that a typical law statement of the form φs are ψs will have
numerous exceptions if it is read as entailing (x)(φx ⊃ ψx), and so as
applying to all φs, and that in the subsequent attempt to produce a true
statement of the (x)(. . . ⊃ ψx) form we will find ourselves having to add a slew
of restrictive antecedent clauses, and possibly even resorting to the use of
non-specific ceteris paribus clauses (thus effectively replacing φ with some new
φ*); the end result will at best be a true generalization of very restricted
applicability, one which will do very little of the work we expected our original
law statement to do. Cartwright illustrates this dilemma by employing the
example of Snell's law of refraction (1983). In the terms of my account,
Cartwright's claim is that our favorite law statements must generally be taken to
express either quasi-laws, or idealized (and possibly ideal) laws, so that we must
either sacrifice truth or applicability. Cartwright's main point is that if this claim
is correct, it is quite devastating for covering-law models of explanation, but it
is also worth reflecting on the fact that the claim raises serious problems for a
wide range of accounts of confirmation in much the same way.
93
And for reasons having to do with the falsehood of what I earlier called their "extensional
content"; see n. 65.
94
"Some" because, in line with the preceding discussion, the conditions just specified may not be
enough to make something count as an idealized (as opposed to ideal) quasi-law; for that, it needs
to be the case that the quasi-law in question is typically employed in a way which idealizes non-φ
systems by treating them as φs.
95
An example of an idealized quasi-law, in fact, is Galileo's law of free fall, as ordinarily
understood and employed (not, that is, understood in terms of the special sense of "free" I introduced
towards the end of section 6).
96
See especially essay 2, "The Truth Doesn't Explain Much" (Cartwright 1983).
9. Abstraction in Laws
I wish to say relatively little about abstraction and laws. It is surely true of any
law statement of the form φs are ψs that when we employ it in the treatment
of a system S, describing S as being both a φ and a ψ will omit mention of many
features S has, without thereby misrepresenting S (that is, without representing S
as lacking them). To classify L as an abstract law is thus presumably to imply
that it involves, in one way or another, a lot of omission as compared to other
laws. Accordingly, I propose that we understand talk of abstract laws in a way
which derives from the notion of an abstract model (where an abstract model is
one which omits a lot). Corresponding to any law statement of the form φs are
ψs, there are two models of any given real system S, one of which simply
represents S as being φ, and the other of which simply represents it as being ψ. A
law statement (or law, if the statement indeed expresses one) then counts as
abstract if and only if, given an arbitrary S from the relevant domain of inquiry,
one or both of the models in question is an abstract model of S. (Another way of
putting this, perhaps, is that L counts as an abstract law just when one or both of
the concepts φ-ness and ψ-ness is a relatively abstract concept.) And talk of
degrees of abstraction with respect to laws can then be understood in a way
which derives fairly straightforwardly from our understanding of such talk with
respect to models.97
Two aspects of this simple proposal concerning abstraction should be noted.
First, whether a given law counts as abstract or not is determined with reference
to an arbitrary system from the relevant domain of inquiry. This is simply
because if a model the content of which is captured by "S is φ" (say) counts as an
abstract one, where S is any system from the relevant domain of inquiry, then so
too will the model the content of which is captured by "S* is φ," for any other
system from that domain, S*.
Second, note that the proposal in no way precludes the classification of a law
(or law statement) as abstract and ideal, abstract and idealized, or as abstract and
a quasi-law. In particular, although it is indeed built into my own account of
abstraction in models that abstraction with regard to a particular feature of real
systems involves omission without misrepresentation, the misrepresentation
which is prohibited is just misrepresentation of the fact that the real system or
systems in question have the feature in question; nothing in the account prevents
an abstract model from simultaneously misrepresenting how things stand with
97
The only obvious complication being that the degree of abstractness of the law will be a product
(loosely speaking) of the degree of abstractness of the two models to which it gives rise (the
φ model and the ψ model).
respect to other features of systems (features other than the ones the model
abstracts away from).98
The brevity and apparent simplicity of this discussion of abstraction and
laws should not deceive: all the work is done by the notion of an abstract model,
and as the earlier discussion of that notion made plain, providing an account of
the classification of models as more or less abstract is no trivial matter. There
are also a good number of difficult questions about how abstract laws function,
how precisely they should be understood, and what epistemological status they
have.99 The hope is, nonetheless, that the framework laid out in this paper will
help us as we grapple with such questions about idealization, abstraction, and
the implications of their ubiquity for our philosophical understanding of the
sciences.*
Martin R. Jones
Department of Philosophy
Oberlin College
martin.jones@oberlin.edu
REFERENCES
Armstrong, D. M. (1978). A Theory of Universals. Volume II: Universals and Scientific Realism.
Cambridge: Cambridge University Press.
Armstrong, D. M. (1983). What is a Law of Nature? Cambridge: Cambridge University Press.
Ayer, A. J. (1956). What is a Law of Nature? Revue Internationale de Philosophie 10, 144-165.
Carnap, R. (1970). Theories as Partially Interpreted Formal Systems. In: B. A. Brody (ed.),
Readings in the Philosophy of Science, pp. 190-199. Englewood Cliffs, NJ: Prentice-Hall.
Cartwright, N. (1983). How the Laws of Physics Lie. Oxford: Clarendon Press.
Cartwright, N. (1989). Nature's Capacities and their Measurement. Oxford: Clarendon Press.
Cartwright, N., and Mendell, H. (1984). What Makes Physics' Objects Abstract? In: J. T. Cushing,
C. F. Delaney and G. M. Gutting (eds.), Science and Reality, pp. 134-152. Notre Dame:
University of Notre Dame Press.
Chilvers, I., ed. (1990). The Concise Oxford Dictionary of Art and Artists. Oxford: Oxford
University Press.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, Mass.: The M.I.T. Press.
Cohen, I. B. (1985). The Birth of the New Physics. Revised and updated. New York: W. W. Norton
& Co.
98
It is, perhaps, especially important to note that the account allows for abstract quasi-laws in the
context of Cartwright's views, for we might expect a law statement which is abstract in part because
the concept of φ-ness is an abstract concept to have a wide range of application, and thus, according
to Cartwright, expect it to be false.
99
Some of those questions are addressed elsewhere in this volume. See also Cartwright (1989),
chapter 5, for much more on these issues.
*
Thanks for helpful conversations to Mauricio Suárez, Mathias Frisch, Nancy Cartwright, Paul
Teller, R.I.G. Hughes, Paddy Blanchette, Paolo Mancosu, Dorit Ganson, and Peter McInerney.
Drake, S., and Drabkin, I. E., translators and annotators (1969). Mechanics in Sixteenth-Century
Italy: Selections from Tartaglia, Benedetti, Guido Ubaldo, & Galileo. Madison: The University
of Wisconsin Press.
Dretske, F. (1977). Laws of Nature. Philosophy of Science 44, 248-268.
Earman, J. (1986). A Primer on Determinism. Dordrecht: D. Reidel.
Frisch, M. (1998). Theories, Models, and Explanation. Ph.D. dissertation: University of California,
Berkeley.
Giere, R. N. (1988). Explaining Science: A Cognitive Approach. Chicago: The University of
Chicago Press.
Granger, R. A. (1995). Fluid Mechanics. New York: Dover.
Hempel, C. G. (1969). Logical Positivism and the Social Sciences. In: P. Achinstein and S. F.
Barker (eds.), The Legacy of Logical Positivism: Studies in the Philosophy of Science, pp. 163-
194. Baltimore: The Johns Hopkins Press.
Jones, M. R. (forthcoming a). Models and the Semantic View.
Jones, M. R. (forthcoming b). Models and Idealized Systems.
Lewis, D. K. (1983). New Work for a Theory of Universals. Australasian Journal of Philosophy 61,
343-377.
McMullin, E. (1985). Galilean Idealization. Studies in History and Philosophy of Science 16,
247-273.
Niiniluoto, I. (1998). Verisimilitude: The Third Period. British Journal for the Philosophy of
Science 49, 1-29.
Nowak, L. (1992). The Idealizational Approach to Science: A Survey. In: J. Brzeziński and L.
Nowak (eds.), Idealization III: Approximation and Truth, pp. 9-63. Amsterdam: Rodopi.
Putnam, H. (1979). What Theories Are Not. In: Mathematics, Matter and Method: Philosophical
Papers, vol. 1, pp. 215-227. 2nd edition. Cambridge: Cambridge University Press.
Suppe, F. (1967). The Meaning and Use of Models in Mathematics and the Exact Sciences. Ph.D.
dissertation: University of Michigan.
Suppe, F. (1972). What's Wrong with the Received View on the Structure of Scientific Theories?
Philosophy of Science 39, 1-19.
Suppe, F. (1974a). The Search for Philosophic Understanding of Scientific Theories. In: Suppe
(1974b), pp. 3-241.
Suppe, F., ed. (1974b). The Structure of Scientific Theories. Urbana: University of Illinois Press.
Suppe, F. (1989). The Semantic Conception of Theories and Scientific Realism. Urbana: University
of Illinois Press.
Suppes, P. (1957). Introduction to Logic. Princeton: Van Nostrand.
Suppes, P. (1960). A Comparison of the Meaning and Uses of Models in Mathematics and the
Empirical Sciences. Synthese 12, 287-301.
Suppes, P. (1967). What is a Scientific Theory? In: S. Morgenbesser (ed.), Philosophy of Science
Today, pp. 55-67. New York: Basic Books.
Suppes, P. (1974). The Structure of Theories and the Analysis of Data. In: Suppe (1974b),
pp. 266-283.
Teller, P. (2001). Twilight of the Perfect Model Model. Erkenntnis 55, 393-415.
Tooley, M. (1977). The Nature of Law. Canadian Journal of Philosophy 7, 667-698.
van Fraassen, B. C. (1970). On the Extension of Beth's Semantics of Physical Theories. Philosophy
of Science 37, 325-339.
van Fraassen, B. C. (1972). A Formal Approach to the Philosophy of Science. In: R. Colodny (ed.),
Paradigms and Paradoxes, pp. 303-366. Pittsburgh: University of Pittsburgh Press.
van Fraassen, B. C. (1980). The Scientific Image. Oxford: Clarendon Press.
van Fraassen, B. C. (1987). The Semantic Approach to Scientific Theories. In: N. J. Nersessian
(ed.), The Process of Science, pp. 105-124. Dordrecht: Martinus Nijhoff.
van Fraassen, B. C. (1989). Laws and Symmetry. Oxford: Clarendon Press.
David S. Nivison
STANDARD TIME
The clock strikes twelve, and the local siren verifies its accuracy. I say it's noon.
But of course, when I reflect a moment, I know it isn't. I know this, at least,
unless I happen to be located on one of four lines not even marked on the map
of the country, most points of which are probably uninhabited. I know it is not
really noon, if by "noon" I mean "midday," literally, when the sun is on the
meridian. We used to do it that way, until we had trains, and a problem about
schedules. Now we agree to a fiction, which enables the schedules to work, and
each of us wears a small contraption on his wrist that automatically translates
experienced temporal reality into the language of the fiction.
This conventional model for business-day time is recent, deliberately
adopted, and out in the open. Other models used to bring ordered sense into the
world as we deal with it conceptually are not so recent, not deliberately adopted,
and in some cases not recognized as fictions for centuries. In certain of these
cases that especially interest me, the ordering schema is not thought of as a
fiction at all, but continues to be thought of as ideally true, even though it comes
to be seen that a few technical adjustments are needed to enable people to use
the ideally true schema in dealing with the messy world they live in. We,
looking back at these adjustments, sometimes can see them as halting steps in
the development of what we call science. But sometimes they seem actually to
block scientific discovery.
Trying prudently to speak of something I am familiar with, and hoping to
find something manageably simple, I will look at two problems in ancient
Chinese calendar science. Or "science."
The first problem is not specifically Chinese. It must arise for any people,
already in pre-historic times, if they try to organize their activities short-term to
In: Martin R. Jones and Nancy Cartwright (eds.), Idealization XII: Correcting the Model.
Idealization and Abstraction in the Sciences (Poznań Studies in the Philosophy of the Sciences and
the Humanities, vol. 86), pp. 219-231. Amsterdam/New York, NY: Rodopi, 2005.
accord with the observed sequence of lunar months, but longer-term to accord
with the passing of seasons and years, which depend on the sun.
How many seasons in a year? In our (and China's) latitude, four obvious
ones, naturally matched with the solstices and equinoxes. (A tropical civilization
might have noticed zenith crossings instead, getting different calendar
concepts.) The Chinese understood the concepts before they had instruments
that could determine the exact times of these events. They noticed that from
winter solstice to winter solstice was more than 365 days, so they called it 366;
later they saw that it is approximately 365 1/4. The solstices and equinoxes were
taken (as in the ancient West) as the midpoints of the seasons; and since they are
more or less equally spaced, it was assumed that the order of nature must be that
they are exactly equally spaced. It was noticed that the moon moves through a
band of stars, back to its original position and a little more, in a month, and the
sun, which is up there too, is opposite the moon when the moon is full; so the
sun must be doing the same thing, in the course of a year.
A simple crude method was devised for determining the position of the sun
against the (by day invisible) stellar background (one does it by mapping the
zodiac: this could be done by noting constellations near successive full moons
and then noticing what stars are on the meridian before dawn and after sunset).
In this way one can see that there must be four ideal points (wei) that the sun
passes through as it moves around the zodiac, that mark the transitions from one
season to another. Since there are 365 1/4 days in a year, one can divide the
zodiac into 365 1/4 degrees (du, "step"), each being the distance the sun
travels in a day. How many du are there from one wei to the next? The
Huainanzi (ca. 130 B.C.), in its chapter on astronomy, says there are exactly
91 5/16 (chapter 3, at paragraph 12); and the system it describes implies that this
must also be the exact distance between the sun's locations at the solstices and
equinoxes, and therefore (since a du is a solar day's run) that the solstices and
equinoxes divide the year temporally into fourths (approximately, resolved to
whole days).1
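The Huainanzi figure is just the ideal quarter of the circle; in modern notation:

```latex
\frac{365\tfrac{1}{4}\ \text{du}}{4} \;=\; \frac{1461}{16}\ \text{du} \;=\; 91\tfrac{5}{16}\ \text{du}.
```

So the four wei divide the 365 1/4-du zodiac into exactly equal arcs, one season each.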
The system in the Huainanzi divided the year into 15-day intervals, with an
extra day at the beginning of each season. One more day is needed in a normal
year, and it was placed at midsummer. This suggests that the Chinese were
aware that the day-count from spring equinox to summer solstice is longer than
a quarter of the year. (At the time of the Huainanzi it was three days longer.)
They apparently recognized the inequality, but minimized it in their system.
So the system is false. The Chinese, by the time they were capable of this
degree of precision, must have known it was false. They didn't care. It served
their purposes well enough; it caught their idea of the order of the heavens, and
so it remained ideally true.
1
See Major (1993).
What were their purposes? Back to the basic problem: how to relate the
lunar calendar to the solar year. They did this (I think) as follows:
(1) The moon moves around the zodiac in more than 27 days; so call it 28.
Therefore, divide the zodiac (calling it 365 du) into 28 more or less equal
spaces, or "lodges," xiu, one for each night. (And forget that the moon is
not always out at night.) Most will be 13 du (1 du is a solar day's run). One
must be 14 du. Let the xiu in which the sun is located at the winter solstice
be the biggest one, and let the location of the sun at solstice be the midpoint
of this xiu. The xiu had names, and rough asterism equivalents. This 14-du
xiu is the one later called Xu, "Void," in what we call Aquarius. (The
solstice precesses, at about one du in 70 years, and the Chinese didn't figure
this out until about 400 A.D.; so this system is early: it would have been
true around 2000 B.C.)2
(2) The handle of the Big Dipper is observed to point at a particular time of
night, e.g., midnight, in different directions as the year progresses, its
midnight pointing moving around the sky in the course of the year from left
to right (clockwise, please note). So let it be the hand of a clock, created by
imagining the zodiac projected on the horizon, in such a way that at winter
solstice the handle's pointing (obviously interpretable) is taken to be due
north at an ideal dusk, i.e., halfway from noon to midnight, and due north
is made to be the middle of Xu-on-the-horizon. The relation in celestial
geometry between the Dipper and the actual zodiac is always the same, of
course. The point (roughly, Scorpio) to which the handle was seen as
pointing in the actual zodiac was the point the sun reached at the autumn
equinox; therefore that point in the horizon projection was where the handle
(moving clockwise) pointed at ideal dusk at the spring equinox; so spring
was east.
(3) The moon and the sun move around the zodiac in the opposite direction
(i.e., counter-clockwise, in this conception), but the moon moves much
faster, the stroboscopic effect being to divide the zodiac roughly into
twelfths. So let the horizon-zodiac be re-divided into exact twelfths.
These were called chen, and were so drawn that the midpoint of the first
chen was the midpoint of horizon-Xu. Thus the time-span for the handle to
move from one chen to the next is a twelfth of the year, and can be thought
of as a solar "month." (Some ancient Chinese texts that have been taken to
be talking about months are really talking about "months.") A "month" is
longer than a lunar month, so from time to time an extra lunar month had to
be inserted, at the end of a year or earlier, to keep up the fiction that there
are twelve of them.
2
For reconstructions of early Chinese lunar zodiac systems, see Nivison (1989).
(4) If one wished to do it earlier, the rule was this: If at the beginning of a lunar
month the Dipper's handle is pointing exactly at the boundary between two
chen, then that lunar month is intercalary. Why? What does this mean?
(5) Think now of the calendar of a normal 365-day year. It can be thought of as
divided into approximate twelfths (pace whole numbers) of 30 or 31 days,
and each of these again divided in two, into 15-day or 16-day periods
(called (solar) weather-intervals, qi jie), which are given names, of seasonal
weather conditions and natural phenomena or farmers' activities. The first
day of a solar "month" period is called a jie qi; the first day of the next
15-day or 16-day interval is called a zhong qi, or "qi-center." (Or we can
think of a qi-center as the corresponding ideal zodiac point.) The winter
solstice day is the first qi-center, and there are eleven others spaced around
the year, these being the middle days of their correspondingly numbered
solar months. Now, an equivalent rule is that a regular-numbered lunar
month must contain the qi-center day having that number, and if it has no
qi-center day it is intercalary.
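The qi-center test in (5) is mechanical enough to be put in code. The following is my own illustrative sketch, not anything in the ancient sources: days are numbered from the winter solstice (day 0, the first qi-center), the twelve qi-centers are spaced a twelfth of a 365 1/4-day year apart, and a lunar month containing no qi-center day is flagged as intercalary. The alternating 30/29-day month starts are hypothetical values.

```python
# Sketch of the zhong-qi (qi-center) intercalation rule; illustrative only.

YEAR = 365.25  # length of the solar year, in days (the ancient 365 1/4 value)

def qi_centers():
    """The twelve qi-center days, counted from the winter solstice (day 0)."""
    return [round(i * YEAR / 12) for i in range(12)]

def label_months(month_starts):
    """Label successive lunar months 'regular' or 'intercalary'.

    month_starts: day numbers on which the lunar months begin.
    A month containing no qi-center day is intercalary (run yue).
    """
    centers = qi_centers()
    labels = []
    for start, end in zip(month_starts, month_starts[1:]):
        has_center = any(start <= c < end for c in centers)
        labels.append("regular" if has_center else "intercalary")
    return labels

# Thirteen lunar months of alternating 30- and 29-day lengths, starting at
# the solstice; twelve qi-centers cannot cover thirteen months, so exactly
# one month catches none and is the run yue.
starts = [0]
for i in range(13):
    starts.append(starts[-1] + (30 if i % 2 == 0 else 29))

print(label_months(starts))
```

On these toy numbers the extra month falls at the year's end; with real month starts the month that misses its qi-center can fall anywhere in the year, which is just what rule (5) exploits.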
The two rules are stated in the same sentence by Kong Yingda (574-648
A.D.) (Commentary to Zuo zhuan, Xiang Gong 27.6 = 12th month). Putting
them together implies conceiving of the clockwise chen pointing of the Dipper's
handle as leading the sun's corresponding counter-clockwise apparent
movement around the zodiac by half a solar month (so if one still wants to say
that the handle points due north at dusk at the winter solstice, one must suppose
that the solstice had precessed to a point about 15 du short of the middle of Xu;
this would have been true around 900 B.C.).
Using just the qi-center rule, and the explicit concept intercalary month
(run yue), gives this: say that the eighth month counting from the winter solstice
month contains the eighth qi-center, and the next lunar month contains none;
then that month is the intercalary 8th month.
Seventeen centuries before Kong Yingda a slightly different but practically
equivalent rule seems to be used. There are about 80 oracle-bone inscriptions
giving day-by-day movements of the Shang royal army as it marched east to
attack the Ren Fang in the Huai valley. All have day dates in the 60-day
cycle; some have month dates; and a few have year dates, specifying
the 10th-11th years of a king that the inscription style identifies as the last
Shang king, Di Xin, requiring that the dates be in the first half of the 11th
century.
I will select five dates in this inscription sequence that show that an
intercalation has been made:3
3
See Chen (1956), pp. 301-304.
The mean length of a lunar month is about 29 1/2 days, and classical calendar
theory therefore stipulates that long (30-day) and short (29-day) months
alternate, though this may not exactly correspond to astronomical fact. To make
the correspondence work out in the long run, it is necessary to have an
intercalary day, in the form of two 30-day months in succession, about twice in
three years, amounting to less than a quarter of a day in a four-month period.
But here, if there is just one 9th month, there must be five extra days in the
four months 9th through 12th, twenty times the norm; and in addition one or
more months must be anomalous 31-day months. Or else the 9th month must
begin with day 31, instead of day 34, 35, or 36 (since the first month of the next
year must begin with day 32, 33, or 34). And then the 9th month would have to
have 33 days. Therefore one of these month numbers represents two lunar
cycles. The only possibility is the 9th month; and from this and other evidence
the 10th year can be identified as 1077 B.C. (Thus the first year of Di Xin
must be 1086 B.C.)4
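The comparison with the norm can be made explicit. Taking the modern value of about 29.53 days for the synodic month (the alternation of 29- and 30-day months gives a calendar mean of 29 1/2), a four-month span falls short by

```latex
4 \times (29.53 - 29.5) \;\approx\; 0.12\ \text{day} \;<\; \tfrac{1}{4}\ \text{day},
```

whereas the inscriptions would force about five extra days into the four months in question: five days against a quarter-day norm is the "twenty times" of the text.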
Standard tables then show the rest of the story: If the years were beginning,
Shang-style, with the post-winter-solstice month, months 8 through 12 run as
follows; supplied sun locations are right ascension (approximately)5:
4
Nivison (1983), pp. 501-2. (E. L. Shaughnessy contributed to this discovery.)
5
In this table, right ascension is computed using Ahnert (1960). Ecliptic locations for the sun (and
planets) can be obtained from Stahlman and Gingerich (1963). The table assumes that a qi-center
day is a day when the sun reaches an ecliptic location that is a multiple of 30°.
6
Nivison (1983) argues that the conquest date was 1045 B.C. I have found errors in this argument. In
recent conference papers and other unpublished writing I argue for 1040 B.C.
II
between kings, dukes, ministers, etc., probably compiled in the 4th century.
The paragraph in question (Zhou Yu 3.7) describes certain celestial events in
the 11th century B.C., and I would argue that it was based on some calculations
made six centuries later, in the early 5th century, giving data that were
therefore wrong. But Liu took it at face value. This paragraph contains a
run-down of the zodiac positions of Jupiter, the sun and the moon, etc., on the day
when King Wu of Zhou launched his victorious campaign against the Shang
kingdom. The date of this event was, and has remained, controversial. Liu
attempted to settle the matter scientifically, by astronomy (getting a date that
was about eighty years too early). Along with this effort, he worked out a
mathematical system of positional astronomy, one that could be used to
calculate the position of any planet at any time in the past. I am concerned with
his rule for Jupiter.
The rule makes a small improvement in a piece of general knowledge about
Jupiter. The Chinese called Jupiter sui xing, the "year star," because it was
supposed to move, on the average, the equivalent of one zodiac space a year,
completing a circuit of the sky in twelve years. Liu's rule in effect said that
Jupiter actually jumped a space (chao-chen) every twelve such twelve-year
cycles, i.e., that it traversed 145 spaces in 144 years. Liu is to be commended for
seeing that the popular view was wrong, but he was wrong himself, because the
actual jump is about one in every seven cycles.7
I am interested in where he got his rule, and (perhaps the same question)
why this wrong rule satisfied him. All I do (or can do) is to give you a possible
explanation. It is the best one I can think of.
Even to be aware that the popular 12-year rule was wrong, Liu, or someone
before him, must have had records thought to be reliable, mentioning or implying Jupiter's zodiac position in some identifiable year many centuries earlier. For then, simple arithmetic, together with noticing Jupiter's current location, ought to show that the cycle could not be an exact 12 years, and ought to show at once exactly what it really is. And of course, if the record were wrong, albeit believed to be right, then one's arithmetic would give a wrong answer.
I propose that this is exactly what happened. But the record in question was
one that lay buried in the tomb of King Xiang of the state of Wei from 299 B.C.
until 281 A.D. So if I am right, Liu's rule was discovered (shall we say) all of three centuries before Liu's own time. Here things get controversial because the
record is the so-called Bamboo Annals, Zhushu jinian, which many take to be
a Ming Dynasty fake (of perhaps five centuries ago). It purports to be a
chronicle covering about 2000 years, with exact dates of events, down to 299
B.C. Recent work, some of it by Edward Shaughnessy and some of it mine,
7. For Liu Xin's mathematics for Jupiter, see Sivin (1969), p. 16.
proves, I think, that it is not a fake, though most of the dates in it prior to 841
B.C. are at least slightly wrong.8
This chronicle records a conjunction of the planets in the so-called lunar lodge Fang, roughly at Antares, in the 32nd year of the last Shang king, Di Xin, i.e., (according to the Annals) in 1071. Fang is the middle space in the Jupiter
station Dahuo, "Great Fire," station 11 of the twelve; one of the planets has to be Jupiter; so the Annals says that Jupiter was in Dahuo in 1071. But the real conjunction must have been the one in May 1059 B.C., when the planets clustered in Chunshou, "Quail's Head," which was station 7.9
This false datum must have been trusted in Wei in the late 4th century B.C.
The Annals also says that the Jin state, which Wei succeeded, began in the 10th
year of King Cheng of Zhou, a date converting to 1035 B.C. (in the Annals' system, which dates Cheng too early), when (since the 12-year rule works in the short run) Jupiter would also be in Dahuo. When Ying, Duke of Wei,
proclaimed himself King Hui-cheng, he took the year 335, just seven centuries
later, as his first year as king, according to the Annals. Further, King Hui-cheng's successor King Xiang thought so highly of the Annals that it seems he
had it buried with him. We read, moreover, in another part of the Guoyu (Jin
Yu 4.1) that Jupiter was in Dahuo when Jin was first enfeoffed. (Actually
Jupiter was in Dahuo in late 1032 and in 1031, not 1035.)
So I infer that it was believed in high places in Wei in the late 300s B.C., on the basis of a text that probably was thought of as an official Jin-Wei state chronicle, that Jupiter was in Dahuo in 1035 B.C. But anyone watching the sky in the year exactly 12 × 60 years after 1035, i.e., in 315, would see that Jupiter was not in Dahuo, but was five stations farther on. Simple arithmetic would show at once that Jupiter's cycle was not exactly 12 years. 12 × 60 is 144 × 5, and in that many years, it would seem, Jupiter had traveled 5 + (144 × 5) stations, i.e., in each 144 years it had traversed 145 stations. If this ratio,
revealed by text and observation, were accepted, then King Xiang could sleep in
peace: his state chronicle was accurate, and using it had enabled the scientists of
the day to get their latest results. Or so I suppose.
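The arithmetic in this paragraph can be checked mechanically. The sketch below (mine, not Nivison's, with Jupiter's modern sidereal period of about 11.86 years as the only outside datum) confirms both the 144:145 inference and the earlier remark that the real "jump" is about one space in every seven twelve-year cycles.

```python
# From 1035 B.C. to 315 B.C. is 12 * 60 = 720 years.
years = 12 * 60
assert years == 144 * 5

# Naive 12-year rule: one station per year, so back to the same station.
# Observation (on the Annals-based record): 5 stations farther on.
stations = years + 5                     # 725 stations in 720 years
assert stations // 5 == 145 and years // 5 == 144   # the 144:145 ratio

# Reality check with Jupiter's sidereal period (~11.86 years): in one
# 12-year cycle it covers 12 / 11.86 * 12, i.e. about 12.14 stations,
# so roughly 0.14 extra stations per cycle, or one full extra station
# in about every seven cycles, as the text says.
extra_per_cycle = 12 / 11.86 * 12 - 12
cycles_per_jump = 1 / extra_per_cycle
print(round(cycles_per_jump, 1))         # about 7.1
```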
For this is exactly Liu Xin's formula. And this is the best explanation I can think of, of where he might have gotten it. But he must have gotten it indirectly: Liu was the Han court bibliographer, and has left a famous catalog of the imperial library,
which was the library. If there had been a copy of the Bamboo Annals above
ground that he knew about, he would have listed it and would have read it; but
he did not list it. So if this is his ultimate source, the rule must have been passed on among astronomers down to Liu's time. But here there is another puzzle,
8. Nivison (1983); Shaughnessy (1986). For a text and translation of the Bamboo Annals, see Legge (1865), Prolegomena, pp. 105-183. (Legge usually converts year-dates in the Annals to modern calendar dates that are one year late.)
9. The actual conjunction is identified by Pankenier (1983).
because every astronomer or astrologer before Liu whose beliefs are known (including, pointedly, Sima Qian, historian and court astrologer to Emperor Wu Di a century before Liu) says that Jupiter's cycle is 12 years.
I tentatively infer from this a deeper truth about Chinese science, one that must distinguish it both from the artificiality of our modern "standard time" convention and from our modern ceteris paribus understanding of scientific laws as ideal-model-wise true, literally mendacious.
It seems to me that in the ancient view of things (Chinese anyway), there was a set of ideal truths about the world, one being "Jupiter cycle, 12 years," and another being "one year, 12 months." These were not recognized as fictions, consciously adopted to simplify reality into something one could work with. They were naive ordering conceptions of how the world works, that were eventually and gradually recognized to be not exactly how it works, but then kept on being cherished anyway, as somehow really capturing the underlying order of things. They did not cease to be true merely because the specialist had to use his little rules of thumb, like the 144:145 ratio, or the qi-center convention, to adjust the ideal rule to messy empirical reality. Thus one could cling to the ideal even after one saw that it didn't work.
(The simple 12-year concept persisted in a bizarre form: the twelve chen, used to count off months, were also used to count off two-hour segments of the day (the Dipper's handle shifts its pointing clockwise in the course of the night). So a natural extension of use was to use them to count off years. The Dipper's handle doesn't point to the next chen in the next year; but Jupiter was thought to occupy (ideally) the next of twelve stations (ci) in the next year, counterclockwise; and the twelve stations were in one-one correspondence with the twelve chen, numbered in reverse order in the horizon projection of the zodiac. So an imaginary Jupiter was posited, as a chen-system shadow of an ideal real Jupiter, moving backwards chen-wise in an exact 12-year cycle. This concept continued to be used for centuries after it was known that the real Jupiter cycle is not 12 years.)
It would follow that, usually, scientific progress lags not a little behind what
could have been scientific discovery. I have probed what I take to be a case in point in my article "The Origin of the Chinese Lunar Lodge System" (1989). The Chinese appear to have worked out a lunar zodiac in the 3rd millennium, as the basis of their calendar, locating in it the solstice and equinox points, which they thought to be eternal. After centuries, of course, it didn't work. When this happened, instead of simply scrapping it and starting over (perhaps thereby being forced to ask themselves why they had to scrap it), they simply tinkered with it so that they could stop worrying about it, until it became
something usable only by astrologers. Throughout these distortions, however,
the scheme continued to be organized on a location of the winter solstice that
had not been valid since about 2000 B.C. A sufficiently astute astronomer in the
4th century B.C., if he hadn't looked at the problem in this way, would have found that he had data in his hands that would virtually have forced on him the discovery of the precession of the equinoxes, a discovery it took the Chinese another seven centuries to make (1989, pp. 214-16).
* * *
This is an essay in a book addressed to scientists and philosophers of
science, and a very select readership even of that fraternity, one that may not
include a single sinologist. Honesty requires that I admit that I have been
wading in controversial waters. My work on the lunar zodiac is published (in an
astronomical-archaeological, not sinological, symposium), but not generally
accepted yet. My limited analysis of Liu Xin's work is not yet published elsewhere, except as a conference presentation to a possibly uncomprehending
orientalist audience. My views on the relative reliability of the Bamboo Annals
would probably be scoffed at by many scholars in China (and maybe elsewhere)
who consider themselves experts. Other historians of Chinese science may (at
the least) question my claim that intra-year intercalation was practiced in China
long before 500 B.C. You, my readers, should be warned.
David S. Nivison
Department of Philosophy
Stanford University
Appendix
Period V being inscriptions that belong to the last two Shang reigns, Di Yi and Di Xin.
Previous work by E. L. Shaughnessy and myself shows that the Di Xin reign
began in 1086, and the Di Yi reign probably in 1105.
If my "standard time" hypothesis is correct, "night" refers to the twelve hours from 6:00 p.m. to 6:00 a.m., approximately, and in all seasons, even when
it is daylight for some time after 6:00 p.m. The inscription would be satisfied,
then, by a solar eclipse with the following characteristics:
(a) It must be on a guiyou day, no. 10 in the 60-day cycle.
(b) It should be total or maximal in the area of the Shang
capital at Anyang after 6:00 p.m.
(c) Thus it must be dated some time after the spring
equinox and some time before the autumn equinox;
(d) and it must terminate at sunset about 1000 miles or less east of Anyang.
(e) Finally, to be in Period IV, it must occur in some year in the
half-century preceding 1105.
An eclipse of this description (and the only one) would be no. 211, −1123 V 18 (18 May 1124 B.C.) in Oppolzer (1887), starting (1887, Tafel 5) in the Atlantic, west of Africa and just below the equator, at about 14 degrees west longitude, and terminating in eastern Korea, about 129 degrees east longitude, with a noon point just north of the Persian Gulf at 8 hours 32.6 minutes WT. The Julian day number is 1311020, which is a guiyou day.10 The spring equinox in 1124 B.C.
was March 31; the latitude of the Untergang (Oppolzer's term for the end point of the eclipse track) was about 37 degrees north. This
does not give totality in or near Anyang after 6:00 p.m.; I estimate about 5:45
p.m. But this may be close enough; and well after 6:00 p.m. one would probably
still notice the bite of the obscuring moon. Furthermore, perhaps a post-ideal-day's-end eclipse totality would have been reported to the capital from further east.
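Criteria (a) through (e) above amount to a filter over catalog entries. A minimal sketch follows (the field names and data are my own hypothetical framing, and criterion (d), on the terminal longitude, is left out for brevity):

```python
from dataclasses import dataclass

@dataclass
class Eclipse:
    year_bc: int                 # year B.C.
    cycle_day: int               # day number in the 60-day cycle
    local_max_hour: float        # local time of maximum phase at Anyang
    after_spring_equinox: bool   # criterion (c)
    before_autumn_equinox: bool

def satisfies(e: Eclipse) -> bool:
    return (e.cycle_day == 10                 # (a) a guiyou day, no. 10
            and e.local_max_hour >= 18.0      # (b) maximal after 6:00 p.m.
            and e.after_spring_equinox        # (c)
            and e.before_autumn_equinox
            and 1105 <= e.year_bc <= 1155)    # (e) half-century before 1105

# Oppolzer no. 211 as discussed above: right day and season, but maximum
# estimated at about 5:45 p.m., so it narrowly misses criterion (b).
no_211 = Eclipse(1124, 10, 17.75, True, True)
print(satisfies(no_211))  # False
```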
Two cautions are required, however. (1) It is at least imaginable that the
divination was made days after the eclipse; this would invalidate the date used
in my analysis. (2) As noted, the graph here translated "at night" has other readings: not only (ye or xi) "night" but also (yue) "month" or "moon." So the meaning could be "eclipses of the sun and moon have occurred," referring perhaps to an eclipse (at some time in the recent past) of the sun (at whatever hour), followed by an eclipse of the moon at the next syzygy (though I think this reading is unlikely).
So this inscription-pair is hardly a confirmation of my "standard time" hypothesis; but it is difficult to think of any other plausible sense for the words "the sun is eclipsed at night" (ri xi/ye you shi), if that is what the inscription
10. Julian day numbers can be converted to Chinese sixty-day cycle numbers by dividing by 60, and subtracting 10 from the remainder (or from the remainder plus 60). (Julian day numbers are given in Tung (1960) and in Stahlman and Gingerich (1963).)
says. One might try for an eclipse, even only partial, that would be seen in
Anyang, perhaps at some other season, as developing before sunset or
dwindling after sunrise, with an appropriate date; but I have found none. (If the
inscription could be in Period II, then no. 81 (−1175 VIII 19) in Oppolzer (1887) is almost as satisfactory as no. 211.)
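The conversion rule in note 10 is easy to mechanize; here is a direct transcription (mine) of the rule as stated there:

```python
def cycle_day(julian_day: int) -> int:
    """Convert a Julian day number to its number (1-60) in the Chinese
    sixty-day cycle: divide by 60 and subtract 10 from the remainder,
    adding 60 first when the remainder is 10 or less."""
    d = julian_day % 60 - 10
    return d if d > 0 else d + 60

# The eclipse discussed above: Julian day 1311020 is day no. 10 (guiyou).
print(cycle_day(1311020))  # 10
```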
I am offering this problem for further study.
One can see its relevance to the problem of an ideal invariant beginning point of "night," in diurnal time. The dipper-dial logically requires a fixed evaluation time each day: for it is said to move exactly one du a day, and it moves through approximately equal spaces. If that fixed time happened to be before or after sunset, the location and time would have had to be deduced. The question is whether the required fixed time was conceived as an ideal "standard time" beginning of night.
REFERENCES
Evading the IRS

James Bogen and James Woodward

"IRS" is our term for a view about theory testing originated by members and
associates of the Vienna Circle. Its leading idea is that the epistemic bearing of
observational evidence on a scientific theory is best understood in terms of
Inferential Relations between Sentences which represent the evidence and
sentences which represent hypotheses belonging to the theory. The best known
versions of IRS (and the ones we concentrate on in this paper) are Hypothetico-
Deductive and positive instance (including bootstrapping) confirmation theories.
It goes without saying that such accounts, along with the problems they
generate, have exerted a dominant influence on philosophers who study the
epistemology of science.
We maintain that the epistemic import of observational evidence is to be
understood in terms of empirical facts about particular causal connections and
about the error characteristics of detection processes. These connections and
characteristics are neither constituted by nor greatly illuminated by considering
the formal relations between sentential structures which IRS models focus on.
We argue that by taking them seriously, you too can evade the IRS.
We have argued elsewhere1 that theory testing in the natural and social
sciences is typically a two-stage process and that the use of observational
evidence belongs primarily to the first stage. In this stage data are produced and
interpreted in order to draw conclusions about what we call phenomena. This is
usually a matter of considering a number of competing claims about the
phenomenon under investigation and using the data to decide which of those
claims is most likely to be correct. In the second stage, theoretical claims are
confronted with conclusions about phenomena reached in the first stage. Some
examples of data are records of temperature readings used to determine the
melting point of a substance, scores on psychological tests used to investigate
1. Bogen and Woodward (1988); Woodward (1989); Bogen and Woodward (1992).
In: Martin R. Jones and Nancy Cartwright (eds.), Idealization XII: Correcting the Model. Idealization and Abstraction in the Sciences (Poznań Studies in the Philosophy of the Sciences and the Humanities, vol. 86), pp. 233-267. Amsterdam/New York, NY: Rodopi, 2005.
234 James Bogen and James Woodward
2. In this we depart from the IRS literature, in which the term "observation sentence" is used for natural-language observation reports as well as their counterparts in first-order logical languages. We emphasize that we are using the term only for the latter.
3. This is roughly the notion invoked by Alvin Goldman and other reliabilists. See, e.g., Goldman (1986), chs. 4, 5, 9-15 passim. However, unlike Goldman, we think, for reasons that will emerge in section IX, that the project of investigating the reliability characteristics of most human belief-forming methods and mechanisms is unlikely to be illuminating or fruitful. Rather we apply the notion of general reliability to highly specific measurement and detection procedures, or in connection with the use of instruments for particular purposes. Such procedures and uses of instruments often have determinate error characteristics that we know how to investigate empirically, while we suspect that this is not true of many of the methods or psychological processes that underlie belief formation.
4. For example, consider a technique for staining tissue to be viewed under a light microscope which (like Golgi staining) tends to produce a great many artifacts. The staining technique may nevertheless occasionally produce preparations which are free from artifacts, or whose artifacts can be easily distinguished from real cell structures. In such cases a generally unreliable microscopic technique can be locally reliable, and recognizably so.
236 James Bogen and James Woodward
II
5. The relevance of causal factors in assessing evidential significance is also emphasized in Miller (1987). While we find much that is valuable and insightful in Miller's discussion, his account diverges from ours in important respects; in particular, he tends to see inductive inference generally as a species of inference to the best explanation, while we do not. The evidential relevance of data-generating processes and the limitations of formal accounts of evidential support are also emphasized in Humphreys (1989), a discussion we have found very helpful.
6. For an excellent and forceful characterization of the disparity between the literature of science and the literature of IRS-influenced philosophy of science, see Feyerabend (1985), pp. 83-85. Although we disagree with much of what Feyerabend says elsewhere, we heartily endorse his idea in this passage that much of what occupies the IRS philosophers is an artifact of their own picture of science, and in particular, that much (Feyerabend would probably say nearly all) research in the philosophy of science consists in proposing ideas that fit the boundary conditions, i.e., the standards of the simple logic chosen by the logical positivists to represent scientific reasoning (p. 85).
Evading the IRS 237
practice and its IRS depiction derive non-accidentally from its basic goals and
strategies.
Like many of its founders and proponents, Hempel saw the IRS as an alternative to the idea that scientific theories are not or cannot be tested objectively: the idea that the decision as to whether a given hypothesis is acceptable in the light of a given body of evidence rests on nothing more than a subjective "sense of evidence," or a "feeling of plausibility in view of the relevant data."
This, says Hempel, is analogous to the equally noxious idea that the validity of
a mathematical proof or of a logical argument has to be judged ultimately by
reference to a feeling of soundness or convincingness. Hempel thinks both
ideas rest on a confusion of rational, objective, logical factors which can
actually determine whether the available evidence warrants the acceptance or
rejection of a scientific hypothesis with subjective psychological factors which
may influence scientific belief. To disentangle them we need purely formal
criteria for confirmation of the kind deductive logic provides for the validity of
deductive arguments. Such criteria would provide for "rational reconstruction[s]" of the standards of scientific validation, free from the influence of feelings of conviction, senses of evidence, or other subjective factors which vary from person to person, and with the same person in the course of time. And like the
standards by which deductive validity is judged, it seems reasonable to require
that the criteria of empirical confirmation, besides being objective in character,
should contain no reference to the specific subject matter of the hypothesis or of
the evidence in question.7
The application of this approach to a real-life example of scientific reasoning from evidence to a conclusion begins with the construction of a highly idealized
representation of the reasoning under consideration. Reichenbach describes this
as the construction of a logical substitute for the real processes by which the
scientist thinks about the evidence (Reichenbach 1938, p. 5). As he describes it,
this is analogous to replacing an informal deductive argument with a formal
version which omits logically irrelevant features and exhibits logical structure
which was not explicit in the original version. For Hempel, it is analogous to the
construction of an idealized, simplified theoretical model of a real process
(Hempel 1965, p. 44).
7. Hempel (1965), pp. 9-10. This is exactly what Glymour promises for his bootstrap theory. Its confirmation relations are to be entirely structural; they have no connection to the content of the hypothesis tested, or to the meaning of the evidence sentences, or to the meaning of the theories with respect to which the tests are supposed to be carried out (1980, pp. 374-5). The goal shared by Hempel and Glymour is closely related to Popper's goal of providing as formal as possible a demarcation between real and pseudo-science. And it bears an interesting relation to Kuhn, Feyerabend, Hanson, Shapere, Quine, and many other critics of the original positivist program. Different as their views obviously are, all of these people subscribe to some version of the idea that the IRS is the only alternative to the idea that scientific belief is not objectively constrained by evidence.
III
8. Thus Reichenbach requires the construction . . . [to be] bound to actual thinking by the postulate of correspondence (1938, p. 6), and Hempel says the model should conform to actual behavior as far as it can without violating constraints imposed for the sake of attaining simplicity, consistency, and comprehensiveness (1965, p. 44).
9. For positive instance accounts, see "Studies in the Logic of Confirmation" in Hempel (1965), pp. 3-46. For bootstrapping accounts, see Glymour (1980). For a simple HD account, see Braithwaite (1953) and Popper (1959). For more complex HD accounts, see Schlesinger (1976) and Merrill (1979).
IV
10. Different versions of HD and positive instance theories add different conditions on confirmation to meet counterexamples which concern them. For example, o may be required to have a chance of being false, to be consistent with the theory whose claims are to be tested, to be such that its denial would count against the claim it would support, etc. The details of such conditions do not affect our arguments. Thus our discussion frequently assumes these additional conditions are met, so that its being the case that o → h is sufficient for confirmation of the claims represented by h by the evidence represented by o.
11. See the previous note. For examples of this view, see Braithwaite (1953) and Glymour (1980), ch. V.
12. Examples featuring items which sound more theoretical than birds and colors are easily produced.
13. For an illustration of this point in connection with Glymour's treatment of the problem of irrelevant conjunction, see Woodward (1983).
14. This example is also discussed in Bogen and Woodward (1992).
To avoid the problem posed by Priestley's and Lavoisier's data, the IRS must assign true observation sentences to epistemically good evidence and false ones to epistemically bad evidence. What determines whether evidence is good or bad?
The following example illustrates our view that the relevance of evidence to theory and the epistemic value of the evidence depend in large part upon causal factors. If we are right about this, it is to be expected (as we will argue in sections VII and VIII) that decisions about the value of evidence depend in large part upon a sort of causal reasoning concerned with what we are calling reliability.
15. Soldner's is roughly the same as a value predicted from an earlier theory of Einstein's. See Pais (1982), p. 304.
degree to which (E), the value predicted by general relativity, disagrees with
predictions based on the competitor theories under consideration. (1) belongs to
the first of the two stages of theory testing we mentioned in section I:
the production and interpretation of data to answer a question about phenomena.
(2) belongs to the second of these stages the use of a phenomena claim to argue
for or against part of a theory. With regard to the first of these considerations,
evidential relevance depends upon the extent (if any) to which differences
between the positions of star images on eclipse and comparison pictures are due to differences between paths of starlight caused by the gravitational influence of the
sun. Even if the IRS has the resources to analyze the prediction of (E) from Einstein's theory, the relevance of the data to (E) would be another matter.16
Assuming that the sun's gravitational field is causally relevant to differences between positions of eclipse and comparison images, the evidential value of the
data depended upon a great many other factors as well. Some of these had to do
with the instruments and techniques used to measure distances on the photograph. Some had to do with the resources available to the investigator for
deciding whether and to what extent measured displacements of star images
were due to the deflection of starlight rather than extraneous influences. As
Fig. 1 indicates, one such factor was change in camera angle due to the motion
of the earth. Another was parallax resulting from the distance between the
geographical locations from which Eddington's eclipse and comparison pictures were taken.17
16. In the discussion which follows, we ignore the fact that the deflection values calculated from the
best photographic data differed not only from (N) and (NS), but also (albeit to a lesser extent) from
(E). Assuming (as we do) that the data supported general relativity, this might mean that although
(E) is correct, its discrimination does not require it to be identical to the value calculated from the
photographs. Alternatively, it might mean that (E) is false, but that just as inaccurate data can make
it reasonable to believe a phenomenon-claim, some false phenomena claims provide epistemically
good support for theoretical claims in whose testing they are employed. Deciding which, if either, of these alternatives is correct falls beyond the scope of this paper. But since epistemic support by
inaccurate data and confirmation by false claims are major difficulties for IRS, the disparities
between (E) and magnitudes calculated from the best data offer no aid and comfort to the IRS
analysis. Important as they are in connection with other epistemological issues, these disparities will
not affect the arguments of this paper.
17. Eddington and Cottingham took eclipse photographs from Principe, but logistical complications made it necessary for them to have comparison pictures taken from Oxford. In addition to correcting
for parallax, they had to establish scale for photographs taken from two very different locations with
very different equipment (Earman and Glymour 1980, pp. 73-4).
Fig. 1. As the earth moves from its position at one time, t1, to its position at a later time, t2, the
positions of the eclipse and comparison cameras change relative to the stars.
18. We are indebted to Alma Zook of the Pomona College physics department for showing and explaining the use of such measuring equipment to us.
The epistemic defects of Curtis's star image are not due to the failure of inferential connections between an observation sentence and a hypothesis sentence. Nor are they due to the falsity of an observation sentence. By the same
token, the epistemic value of the best photographs was not due to the truth of
observation sentences or to the obtaining of inferential connections between
them and hypothesis sentences. The evidential value of the starlight data
depended upon non-logical, extra-linguistic relations between non-sentential
features of photographs and causes which are not sentential structures.
At this point we need to say a little more about a difficulty we mentioned in
section I. Observation sentences are supposed to represent evidence. But the IRS
tends to associate evidence with sentences reporting observations, and even
though some investigations use data of this sort, the data which supported (E)
were not linguistic items of any sort, let alone sentences. They were photographs. This is not an unusual case. So many investigations depend upon non-
sentential data that it would be fatal for the IRS to maintain that all scientific
evidence consists of observation reports (let alone the expressions in first order
logic we are calling observation sentences). What then do observation sentences
represent? The most charitable answer would be that they represent whatever
data are actually used as evidence, even where the data are not observation
reports. But this does not tell us which observation sentences to use to represent
the photographs. Thus a serious difficulty is that for theory testing which
involves non-sentential evidence, the IRS provides no guidance for the construction of the required observation sentences.
Lacking an account of what observation sentences the IRS would use to
represent the photographs, it is hard to talk about what would decide their truth
values. But we can say this much: whatever the observation sentences may be,
their truth had better not depend upon how well the photographs depicted the
true positions of the stars. The photographs did not purport to show (and were
not used to calculate) their actual positions or the true magnitudes of distances
between them. They could represent true positions of (or distances between)
stars with equal accuracy only if there were no significant20 discrepancies
19. From a letter from Campbell to Curtis, reproduced in Earman and Glymour (1980), p. 67.
20. We mean measurable discrepancies not accounted for by changes in the position of the earth, differences in the location of the eclipse and comparison equipment, etc.
between the positions of star images on the eclipse photographs and star images
on the comparison photographs. But had there been no such discrepancies the
photographs would have argued against (E). Thus to require both the eclipse and
the comparison photographs to meet the same standard of representational
accuracy would be to rule out evidence needed to support (E). Furthermore, the
truth values of the observation sentences had better not be decided solely on the
basis of whether the measurements of distances between their star images meet
some general standard of accuracy specified independently of the particular
investigation in question. In his textbook on error analysis, John Taylor points
out that even though measurements can be too inaccurate to serve their purposes
. . . it is not necessary that the uncertainties [i.e., levels of error] be extremely
small . . . This . . . is typical of many scientific measurements, where uncertainties
have to be reasonably small (perhaps a few percent of the measured value), but where
extreme precision is often quite unnecessary (Taylor 1982, p. 6).
We maintain that what counts as a reasonably small level of error depends
upon the nature of the phenomenon under investigation, the methods used to
investigate it, and the alternative phenomena claims under consideration. Since
these vary from case to case no single level of accuracy can distinguish between
acceptable and unacceptable measurements for every case. Thus Priestley's nitric oxide test tolerates considerably more measurement error than did the
starlight bending investigations. This means that in order to decide whether or
not to treat observation sentences representing Eddington's photographs and
measurements as true, the IRS epistemologist would have to know enough about
local details peculiar to their production and interpretation to find out what
levels of error would be acceptable.
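Taylor's point, and ours, can be put in a line of code: whether an uncertainty is acceptable is judged against a tolerance supplied by the particular investigation, not by any universal standard. The sketch below is our own illustration, and every number in it is hypothetical.

```python
# A toy rendering of the context-dependence of "reasonably small" error:
# the tolerance comes from the investigation at hand, not from a universal
# standard. All figures here are hypothetical.

def acceptable(value, uncertainty, tolerance):
    """True if the fractional uncertainty is within the context-dependent
    tolerance (e.g. 0.05 for 'a few percent of the measured value')."""
    return abs(uncertainty / value) <= tolerance

# The very same measurement can be good enough for one investigation and
# hopeless for another, stricter one:
print(acceptable(100.0, 3.0, 0.05))   # -> True  (3% error, loose context)
print(acceptable(100.0, 3.0, 0.001))  # -> False (same error, strict context)
```

The two tolerances merely stand in for the looser and stricter investigations contrasted above (Priestley's test versus the starlight bending measurements).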
Suppose that one responds to this difficulty by stipulating that whatever
observation sentences are used to represent photographs are to be called true if
the photographs constitute good evidence and false if they do not. This means
that the truth values of the observation sentences will depend, for example, upon
whether the investigators could rule out or correct for the influences of such
factors as mechanical changes in the equipment, parallax, sources of
measurement error, etc., as far as necessary to allow them to discriminate
correctly between (E), (N), and (NS). We submit that this stipulation is
completely unilluminating. The notion of truth as applied to an observation
sentence is now unconnected with any notion of representational correctness or
accuracy (i.e., it is unclear what such sentences are supposed to represent or
correspond to when they are true). Marking an observation sentence as true is
now just a way of saying that the data associated with the sentence possess
various other features that allow them to play a role in reliable discrimination. It
is better to focus directly on the data and the processes that generate them and to
drop the role of an observation sentence as an unnecessary intermediary.
Evading the IRS 247
VI
Recall that an important part of the motivation for the development of IRS was
the question of what objective factors do or should determine a scientist's
decision about whether a given body of evidence warrants the acceptance of a
hypothesis. We have suggested that the evidential value of data depends upon
complex and multifarious causal connections between the data, the phenomenon
of interest, and a host of other factors. But it does not follow from this that
scientists typically do (or even can) know much about the fine details of the
relevant causal mechanisms. Quite the contrary, as we have argued elsewhere,
scientists can seldom if ever give, and are seldom if ever required to give,
detailed, systematic causal accounts of the production of a particular bit of data
or its interaction with the human perceptual system and with devices (like the
measuring equipment used by the starlight investigators) involved in its
interpretation.21 But even though it does not involve systematic causal
explanation, we believe that a kind of causal reasoning is essential to the use of
data to investigate phenomena. This reasoning focuses upon what we have
called general and local reliability. The remainder of this paper discusses some
features of this sort of reasoning, and argues that its objectivity does not depend
upon, and is not well explained in terms of the highly general, content
independent, formal criteria sought by the IRS.
VII
21. Bogen and Woodward (1988). For an excellent illustration of this, see Hacking (1983), p. 209.
production and interpretation (we shall call this a "detection process," for brevity)
and where this process possesses fairly stable, determinate error characteristics
under repetition that are susceptible of empirical investigation. As we shall see
in section VIII, these conditions are met in many, but by no means all the
contexts in which data are used to assess claims about phenomena. Where these
conditions are not met, we must assess evidential support in terms of a distinct
notion of reliability, which we call "local reliability."
Here is an example illustrating what we have in mind by general reliability.22
Traditionally paleoanthropologists have relied on fossil evidence to infer
relationships among human beings and other primates. The 1960s witnessed the
emergence of an entirely distinct biochemical method for making such
inferences, which involved comparing proteins and nucleic acids from living
species. This method rests on the assumption that the rate of mutation in
proteins is regular or clocklike; with this assumption one can infer that the
greater the difference in protein structure among species, the longer the time
they have been separated into distinct species. Molecular phylogeny (as such
techniques came to be called) initially suggested conclusions strikingly at
variance with the more traditional, generally accepted conclusions based on
fossil evidence. For example, while fossil evidence suggested an early
divergence between hominids and other primates, molecular techniques suggested
a much later date of divergence: hominids appeared much later than
previously thought. Thus while paleoanthropologists classified the important
prehistoric primate Ramapithecus as an early hominid on the basis of its fossil
remains, the molecular evidence seemed to suggest that Ramapithecus could not
be a hominid. Similarly, fossil and morphological data seemed to suggest that
chimpanzees and gorillas were more closely related to each other than to
humans, while molecular data suggested that humans and chimpanzees were
more closely related.
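The clock assumption that drives molecular phylogeny amounts to a one-line calculation: the greater the molecular distance, the earlier the split. The sketch below is our own illustration; the rate and the distances are invented numbers, not measured values.

```python
# A minimal sketch of the molecular-clock inference: if protein differences
# accumulate at a roughly constant ("clocklike") rate, the time since two
# lineages diverged is proportional to the molecular distance between them.
# The rate and the distances below are invented, purely for illustration.

def divergence_time(molecular_distance, substitutions_per_myr):
    """Estimated separation time in millions of years, assuming a steady
    clock; the inference fails if the clock slows down or speeds up."""
    return molecular_distance / substitutions_per_myr

rate = 0.5  # hypothetical substitutions per million years
print(divergence_time(4.0, rate))   # -> 8.0  (small distance, recent split)
print(divergence_time(12.0, rate))  # -> 24.0 (larger distance, earlier split)
```

The early objection to the molecular results described below is, in these terms, precisely the claim that `substitutions_per_myr` was not constant for hominids.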
The initial reaction of most paleoanthropologists to these new claims was
that the biochemical methods were unreliable, because they produced results at
variance with what the fossils suggested. It was suggested that because the
apparent dates of separation derived from molecular evidence were more recent
than those derived from the fossil record, this showed that the molecular clock
was not steady and that there had been a slow-down in the rate of change in
protein structure among hominids. This debate was largely resolved in favor of
the superior reliability of molecular methods. The invention of more powerful
molecular techniques based on DNA hybridization, supported by convincing
statistical arguments that the rate of mutation was indeed clocklike, largely
corroborated the results of earlier molecular methods. The discovery of
22. Details of this example are largely taken from Lewin (1987).
23. See especially Lewin (1987), p. 122ff.
24. For additional discussion, see Bogen and Woodward (1992).
25. See Bogen and Woodward (1988) and Woodward (1989). Although, on our view, it is always a
matter of empirical fact whether or not a detection process is generally reliable, we want to
emphasize that there is rarely if ever an algorithm or mechanical procedure for deciding this. Instead
it is typically the case that a variety of heterogeneous considerations are relevant, and building a
case for general reliability or unreliability is a matter of building a consensus that most of these
considerations, or the most compelling among them, support one conclusion rather than another. As
writers like Peter Galison (1987) have emphasized, reaching such a conclusion may involve an
irreducible element of judgement on the part of experimental investigators about which sources of
error need to be taken seriously, about which possibilities are physically realistic or plausible, and
so forth. Similar remarks apply to conclusions about local reliability. (Cf. n. 42.)
26. For a general argument in support of this conclusion see Friedman (1979). One can think of Larry
Laudan's recent naturalizing program in philosophy of science (which advocates the testing of
various philosophical theses about scientific change and theory confirmation against empirical
evidence provided by the history of science) as, among other things, an attempt to carry out an
empirical investigation of the error or reliability characteristics of the various IRS confirmation
schemas (Donovan et al. 1988). We agree with Laudan that vindicating the various IRS models
would require information about long-run error characteristics of the sort for which he is looking.
But for reasons described in the next paragraph in the text, we are much more pessimistic than
Laudan and his collaborators about the possibility of obtaining such information.
27. Typical attempts to argue for particular IRS models appeal instead to (a) alleged paradoxes and
inadequacies associated with alternative IRS approaches, (b) various supposed intuitions about
evidential support, and (c) famous examples of successful science that are alleged to conform to the
model in question. (Cf. Glymour 1980.) But (a) is compatible with and perhaps even supports
VIII
skepticism about all IRS accounts of evidence, and with respect to (b), it is uncontroversial that
intuitions about inductive support frequently lead one astray. Finally, from a reliabilist perspective
(c) is quite unconvincing. Instead, what needs to be shown is that scientists systematically succeed
in a variety of cases because they accept hypotheses in accord with the recommendations of the IRS
account one favors. That is, what we need to know is not just that there are episodes in the history of
science in which hypotheses stand in the relationship to true observation sentences described by,
say, a bootstrap methodology and that these hypotheses turn out to be true or nearly so, but what the
performance of a bootstrap methodology would be, on a wide variety of different kinds of evidence,
in discriminating true hypotheses from false hypotheses: both what this performance is absolutely
and how it compares with alternative methods one might adopt. (As we understand it, this is
Glymour's present view as well.)
terms of general reliability. For example, even if I know that some radioactive
dating technique is generally reliable when applied to fossils, this still leaves
open the question of whether the date assigned to some particular fossil by the
use of the technique is correct: it might be that this particular fossil is
contaminated in a way that gives us mistaken data, or that the equipment I am
using has malfunctioned on this particular occasion of use. That the dating
process is generally reliable doesn't preclude these possibilities.
Some philosophers with a generalist turn of mind will find it tempting to try
to reduce local reliability to general reliability: it will be said that if the data
obtained from a particular fossil are mistaken because of the presence of a
contaminant, then if that very detection process is repeated (with the
contaminant present and so forth) on other occasions, it will have unfavorable
error characteristics, and this is what grounds our judgement of reliability or
evidential import in the particular case. As long as we take care to specify the
relevant detection processes finely enough, all judgements about reliability in
particular cases can be explicated in terms of the idea of repeated error
characteristics. Our response is not that this is necessarily wrong, but that it is
thoroughly unilluminating at least when understood as an account of how
judgements of local reliability are arrived at and justified. As we shall see
below, many judgements of local reliability turn on considerations that are
particular or idiosyncratic to the individual case at hand. Often scientists are
either unable to describe in a non-trivial way what it is to repeat the
measurement or detection process that results in some particular body of data or
lack (and cannot get) information about its long-run error characteristics. It is
not at all clear to us that whenever a detection process is used on some
particular occasion, and a judgement about its local reliability is reached on the
basis of various considerations, there must be some description of the process,
considerations, and judgements involved that exhibits them as repeatable. But
even if this is the case, this description and the relevant error characteristics of
the process when repeated often will be unknown to the individual investigator;
this information is not what the investigator appeals to in reaching his
judgement about local reliability or in defending his judgement.
What then are the considerations which ground judgements of local
reliability and how should we understand what it is that we are trying to do
when we make such judgements? While the relevant considerations are, as we
shall see, highly heterogeneous, we think that they very often have a common
point or pattern, which we will now try to describe. Put baldly, our idea is that
judgements of local reliability are a species of singular causal inference in
which one tries to show that the phenomenon of interest causes the data by
means of an eliminativist strategy: by ruling out other possible causes of the
data.28 When one makes a judgement of local reliability one wants to ascertain
on the basis of some body of data whether some phenomenon of interest is
present or has certain features. One tries to do this by showing that the detection
process and data are such that the data must have been caused by the
phenomenon in question (or by a phenomenon with the features in question),
that is, that all other relevant candidates for causes of the data can be ruled out. Since
something must have caused the data, we settle on the phenomenon of interest
as the only remaining possibility. For example, in the fossil dating example
above, one wants to exclude (among other things) the possibility that one's data,
presumably some measure of radioactive decay rate, such as counts with a
Geiger counter, were caused by (or result in part from a causal contribution
due to) the presence of the contaminant. Similarly, as we have already noted,
showing that some particular bubble chamber photograph was evidence for the
existence of neutral currents in the CERN experiments of 1973 requires ruling
out the possibility that the particular photograph might have been due instead to
some alternative cause, such as a high energy neutron, that can mimic many of
the effects of neutral currents. The underlying idea of this strategy is nicely
described by Allan Franklin in his recent book Experiments, Right or Wrong
(1990). Franklin approvingly quotes Sherlock Holmes's remark to Watson,
"How often have I said to you that when you have eliminated the impossible,
whatever remains, however improbable, must be the truth?" and then adds, "If
we can eliminate all possible sources of error and alternative explanations, then
we are left with a valid experimental result" (1990, p. 109).
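The Holmes-style strategy Franklin describes can be caricatured in a few lines: list the candidate causes, strike out those that have been ruled out, and attribute the data to the phenomenon of interest only if it is the sole survivor. The candidate causes below are our own schematic stand-ins, not a real analysis of the CERN data.

```python
# A schematic rendering of the eliminative strategy: delete the ruled-out
# candidate causes and check whether exactly one remains. The candidates
# listed here are invented stand-ins for illustration.

def eliminate(candidates, ruled_out):
    """Return the candidate causes that survive elimination."""
    return [c for c in candidates if c not in ruled_out]

candidates = ["neutral currents", "high-energy neutron", "cosmic-ray muon"]
survivors = eliminate(candidates, {"high-energy neutron", "cosmic-ray muon"})

# The inference goes through only when exactly one candidate remains; with
# two or more survivors the data do not yet discriminate between causes.
if len(survivors) == 1:
    print("data attributed to:", survivors[0])  # -> neutral currents
```

As the text stresses, all the real work lies outside this trivial scheme: in assembling the candidate list and in justifying each entry in `ruled_out`.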
Here is a more extended example designed to illustrate what is involved in
local reliability and the role of the eliminative strategy described above.29 In
experiments conducted in the late 1960s, Joseph Weber, an experimentalist at
the University of Maryland, claimed to have successfully detected the
phenomenon of gravitational radiation. The production of gravity waves by
massive moving bodies is predicted (and explained) by general relativity.
However, gravitational radiation is so weakly coupled to matter that detection of
such radiation by us is extremely difficult.
Weber's apparatus initially consisted of a large metal bar which was
designed to vibrate at the characteristic frequency of gravitational radiation
emitted by relatively large scale cosmological events. The central problem of
28. As with judgements about general reliability, we do not mean to suggest that there is some single
method or algorithm to be employed in this ruling out of alternatives. For example, ruling out an
alternative may involve establishing an observational claim that is logically inconsistent with the
alternative (Popperian falsification), but might take other forms as well; for example, it may be a
matter of finding evidence that renders the alternative unlikely or implausible, or of finding evidence
that the alternative should, but is not able to, explain.
29. The account that follows draws heavily on Collins (1975) and Collins (1981). Other accessible
discussions of Webers experiment on which we have relied include Davis (1980), esp. pp. 102-117,
and Will (1986).
experimental design was that to detect gravitational radiation one had to be able
to control or correct for other potential disturbances due to electromagnetic,
thermal, and acoustic sources. In part, this was attempted by physical insulation
of the bar, but this could not eliminate all possible sources of disturbance; for
example, as long as the bar is above absolute zero, thermal motion of the atoms
in the bar will induce random vibrations in it. One of the ways Weber attempted
to deal with this difficulty was through the use of a second detector which was
separated from his original detector by a large spatial distance, the idea being
that genuine gravitational radiation, which would be cosmological in origin,
should register simultaneously on both detectors while other sorts of
background events which were local in origin would be less likely to do this.
Nonetheless, it was recognized that some coincident disturbances will occur in
the two detectors just by chance. To deal with this possibility, various complex
statistical arguments and other kinds of checks were used to attempt to show
that it was unlikely that all of the coincident disturbances could arise in this
way.
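The worry about chance coincidences is quantifiable. On the standard assumption that each detector's background pulses arrive independently at random (Poisson), the expected rate of accidental coincidences is roughly the product of the two background rates and the width of the coincidence window. The sketch below is ours; the rates and window are invented and bear no relation to Weber's actual settings.

```python
# A back-of-the-envelope version of the chance-coincidence worry: even two
# detectors firing independently at random will occasionally fire together.
# For Poisson background rates r1 and r2 and a coincidence window w, the
# expected accidental coincidence rate is roughly r1 * r2 * w (to first
# order in w). All numbers below are hypothetical.

def accidental_rate(r1, r2, window):
    """Expected chance coincidences per second between two independent
    detectors with background rates r1, r2 (events/sec)."""
    return r1 * r2 * window

# One background pulse per second in each bar and a 0.5 s window already
# yield about one chance "coincidence" every two seconds:
print(accidental_rate(1.0, 1.0, 0.5))  # -> 0.5
```

Weber's statistical arguments had to show that the observed coincidence rate exceeded what this kind of estimate predicts for chance alone.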
Weber also relied on facts about the causal characteristics of the signal, the
gravitational radiation he was trying to detect. The detectors used by Weber
were most sensitive to gravitational radiation when the direction of propagation
of given radiation was perpendicular to the axes of the detectors. Thus if the
waves were coming from a fixed direction in space (as would be plausible if
they were due to some astronomical event), they should vary regularly in
intensity with the period of revolution of the earth. Moreover, any periodic
variations due to human activity should exhibit the regular twenty-four hour
variation of the solar day. By contrast, the pattern of change due to an
astronomical source would be expected to be in accordance with the sidereal
day, which reflects the revolution of the earth around the sun, as well as its
rotation about its axis, and is slightly shorter than the solar day. When Weber
initially appeared to find a significant correlation with sidereal, but not solar,
time in the vibrations he was detecting, this was taken by many other scientists
to be important evidence that the source of the vibrations was not local or
terrestrial, but instead due to some astronomical event.
Weber claimed to have detected the existence of gravitational radiation from
1969 on, but for a variety of reasons his claims are now almost universally
doubted. In what follows, we concentrate on what is involved in Weber's claim
that his detection procedure was locally reliable and how he attempted to
establish that claim. As we see it, what Weber was interested in establishing was
a singular causal claim: he wanted to show that at least some of the vibrations
and disturbances his data recorded were due to gravitational radiation (the
phenomenon he was trying to detect) and (hence) that such radiation existed.
The problem he faced was that a number of other possible causes or factors
besides gravitational radiation might in principle have caused his data. Unless
Weber could rule out, or render implausible or unlikely, the possibility that
these other factors might have caused the disturbances, he would not be justified
in concluding that the disturbances are due to the presence of gravitational
radiation. The various experimental strategies and arguments described above
(physical isolation of the bar, use of a second detector, and so forth) are an
attempt to do just this: to make it implausible that the vibrations in his detector
could have been caused by anything but gravitational radiation. For example, in
the case of the sidereal correlation the underlying argument is that the presence
of this pattern or signature in the data is so distinctive that it could only have
been produced by gravitational radiation rather than by some other source.
We will not attempt to describe in detail the process by which Weber's
claims of successful detection came to be criticized and eventually disbelieved.
Nonetheless it is worth noting that we can see the underlying point of these
criticisms as showing that Weber's experiment fails to conform to the
eliminative pattern under discussion: what the critics show is that Weber has
not convincingly ruled out the possibility that his data were due to other causes
besides gravitational radiation. Thus, for example, the statistical techniques that
Weber used turned out to be problematic; indeed, an inadvertent natural
experiment appeared to show that the techniques lacked general reliability in the
sense described above. (Weber's statistical techniques detected evidence for
gravitational radiation in data provided by another group which, because of a
misunderstanding on Weber's part about synchronization, should have been
reported as containing pure noise.) Because of this, Weber could no longer
claim to have convincingly eliminated the possibility that all of the disturbances
he was seeing in both detectors were due to the chance coincidence of local
causes.
Secondly, as Weber continued his experiment and did further analysis of his
data, he was forced to retract his claim of sidereal correlation. Finally, and
perhaps most fundamentally, a number of other experiments, using similar and
more sensitive apparatus, failed to replicate Weber's results. Here the argument
is that if in fact gravitational radiation was playing a causal role in the
production of Weber's data, such radiation ought to interact causally with other
similar devices; conversely, failure to detect such radiation with a similar
apparatus, while it does not tell us which alternative cause produced Weber's
data, does undermine the claim that it was due to gravitational radiation.
Much of what we have said about the advantages of the notion of general
reliability vis-à-vis IRS-style accounts holds as well for local reliability. When
we make a judgement of local reliability about certain data (when we conclude,
for example, that some set of vibrations in Weber's apparatus were or were not
evidence for the existence of gravitational radiation), what needs to be
established is not whether there obtains some appropriate formal or logical
relationship of the sort IRS models attempt to capture, but rather whether there
IX
There is a common element to a number of the difficulties with IRS models that
we have discussed that deserves explicit emphasis. It is an immediate
consequence of our notions of general and local reliability that the processes
that produce or generate data are crucial to their evidential status. Moreover, it is
often hard to see how to represent the evidential relevance of such processes in
an illuminating way within IRS-style accounts. And in fact the most prominent
IRS models simply neglect this element of evidential assessment. The tendency
within IRS models is to assume, as a point of departure, that one has a body of
evidence, that it is unproblematic how to represent it sententially, and to then try
to capture its evidential relevance to some hypothesis by focusing on the formal
or structural relationship of its sentential representation to that hypothesis. But if
the processes that generated this evidence make a crucial difference to its
evidential significance, we can't, as IRS approaches assume, simply detach the
evidence from the processes which generated it, and use a sentential
representation of it as a premise in an IRS-style inductive inference.
To make this point vivid, consider (P), a collection of photographs which qua
photographs are indistinguishable from those that in fact constituted evidence
for the existence of neutral current interactions in the CERN experiments of
1973. Are the photographs in (P) also evidence for the existence of neutral
currents? Although many philosophers (influenced by IRS models of
confirmation) will hold that the answer to this question is obviously "yes," our
claim is that on the basis of the above information one simply doesn't know:
one doesn't know whether the photographs are evidence for neutral currents
until one knows something about the processes by which they are generated.
Suppose that the process by which the photographs were produced failed to
adequately control for high energy neutrons. Then our claim is that such
photographs are not reliable evidence for the existence of neutral currents, even
if the photographs themselves look no different from those that were produced
by experiments (like the CERN experiment) in which there was adequate
control for the neutron background. It is thus a consequence of our discussion of
general and local reliability that the evidential significance of the same body of
data will vary, depending upon what it is reasonable to believe about how it was
produced.
We think that the tendency to neglect the relevance of the data-generating
processes explains, at least in large measure, the familiar paradoxes which face
IRS accounts. Consider the raven paradox, briefly introduced in section IV
above. Given our discussion so far it will come as no surprise to learn that we
think the culprit in this case is the positive instance criterion itself. Our view is
that one just cant say whether a positive instance of a hypothesis constitutes
evidence for it, without knowing about the procedure by which the positive
is inconsistent with (h2) on the supposition that there is at least one non-black
thing, but not a serious competitor, since every investigator will have great
confidence that it is false prior to beginning an investigation of (h2). Someone
who is uncertain whether (h2) is true will not take seriously the possibility that
(h3) is true instead, and for this reason evidence that merely discriminates
between (h2) and (h3), but not between (h2) and its more plausible alternatives,
will not be regarded as supporting (h2). Thus while the observation of a white
shoe does indeed discriminate between (h2) and (h3), this fact by itself does not
show that the observation supports (h2). Presumably the best candidates for
serious specific alternatives to (h2) are various hypotheses specifying the
conditions (e.g., snowy regions) under which non-black ravens will occur. But
given any plausible alternative hypothesis about the conditions under which a
non-black raven will occur, the observation of a white shoe or a red pencil does
nothing to effectively discriminate between (h2) and this alternative. For
example, these observations do nothing to discriminate between (h2) and the
alternative hypothesis that there are white ravens in snowy regions. As far as
these alternatives go, then, there is no good reason to think of an observation of
a white shoe as confirming (h2).
There are other possible alternatives to (h2) that one might consider. For
example, there are various hypotheses, (hp), specifying that the proportion of
ravens among non-black things is some (presumably very small) positive
number p for various values of p. There is also the generic, non-specific
alternative to (h2), which is simply its denial, (h4): "Some non-black things are
ravens." For a variety of reasons these alternatives are less likely to be of
scientific interest than the alternatives considered in the previous paragraph. But
even if we put this consideration aside, there is an additional problem with the
suggestion that the observation of a white shoe confirms (h2) because it
discriminates between (h2) and one or more of these alternatives.
This has to do with the characteristics of the processes involved in the
production of such observations. In the case of (h1), "All ravens are black," we
have some sense of what it would mean to sample randomly from the class of
ravens, or at least to sample a representative range of ravens (e.g., from
different geographical locations or ecological niches) from this class. That is,
we have in this case some sense of what is required for the process that
generates relevant observations to be unbiased or to have good reliability
characteristics. If we observe enough ravens that are produced by such a process
and all turn out to be black, we may regard this evidence as undercutting not just
those competitors to (h1) that claim that all ravens are some uniform non-black
color but also those alternative hypotheses that claim that various proportions of
ravens are non-black, or the generic alternative hypothesis that some ravens are
non-black. Relatedly, observations of non-black ravens produced by such a
process might confirm some alternative hypothesis to (h1) about the proportion
of ravens that are non-black or the conditions under which we may expect to
find them.
By contrast, nothing like this is true of (h2). It is hard to understand even
what it might mean to sample in a random or representative way from the class
of non-black things and harder still to envision a physical process that would
implement such a sampling procedure. It is also hard to see on what basis one
might argue that a particular sample of non-black things was representative of
the entire range of such things. As a result, when we are presented with even a
very generous collection of objects consisting of white shoes, red pencils and so
on, it is hard to see on what sort of basis one might determine whether the
procedure by which this evidence was produced had the right sort of
characteristics to enable us to reliably discriminate between (h2) and either the
alternatives (hp) or (h4), and hence hard to assess what its evidential significance
is for (h2). It is thus unsurprising that we intuitively judge the import of such
evidence for (h2) to be at best unclear and equivocal.
On our analysis, then, an important part of what generates the paradox is the
mistaken assumption, characteristic of IRS approaches, that evidential support
for a claim is just a matter of observation sentences standing in some appro-
priate structural or formal relationship to a hypothesis sentence (in this case the
relationship captured by the positive instance criterion) independently of the
processes which generate the evidence and independently of whether the evi-
dence can be used to discriminate between the hypothesis and alternatives to it.
It might be thought that while extant IRS accounts have in fact neglected the
relevance of those features of data-generating processes that we have sought to
capture with our notions of general and local reliability, there is nothing in the
logic of such accounts that requires this omission. Many IRS accounts assign an
important role to auxiliary or background assumptions. Why cant partisans of
IRS represent the evidential significance of processes of data generation by
means of these assumptions?
We don't see how to do this in a way that respects the underlying aspirations
of the IRS approach and avoids trivialization. The neglect of data generating
processes in standard IRS accounts is not an accidental or easily correctable
feature of such accounts. Consider those features of data generation captured by
our notion of general reliability. What would the background assumptions
designed to capture this notion within an IRS account look like? We have
already argued that in order to know that an instrument or detection process is
generally reliable, it is not necessary to possess a general theory that explains
the operation of the instrument or the detection process. The background
assumptions that are designed to capture the role of general reliability in
inferences from data to phenomena thus cannot be provided by general theories
that explain the operation of instruments or detection processes. The informa-
tion that grounds judgements of general reliability is, as we have seen, typically
30. Although we lack the space for a detailed discussion, we think that a similar conclusion holds in
connection with judgements of local reliability. If one wished to represent formally the eliminative
reasoning involved in establishing local reliability, then it is often most natural to represent it by
means of the deductively valid argument pattern known as disjunctive syllogism: one begins with
the premise that some disjunction is true, shows that all of the disjuncts save one are false, and
concludes that the remaining disjunct is true. But, as in the case of the representation of the
argument appealing to general reliability considered above, this formal representation of eliminative
reasoning is obvious and trivial; the really interesting and difficult work that must be done in
connection with assessing such arguments has to do with writing down and establishing the truth of
their premises: has one really considered all the alternatives, does one really have good grounds for
considering all but one to be false? Answering such questions typically requires a great deal of
subject-matter specific causal knowledge. Just as in the case of general reliability, the original IRS
Evading the IRS 265
Here is another way of putting this matter: someone who accepts (1) and (2)
will find his beliefs about the truth of P significantly constrained, and
constrained by empirical facts about evidence. Nonetheless, the kind of constraint
provided by (1) and (2) is very different from the kinds of non-deductive
constraints on hypothesis choice sought by proponents of IRS models. Consider
again the passage quoted from Hempel in section II. As that passage suggests,
the aim of the IRS approach is to exhibit the grounds for belief in hypotheses
like (3) or (4) in a way that avoids reference to personal or subjective
factors and to subject-matter specific considerations. Instead the aim of the IRS
approach is to exhibit the grounds for belief in (3) or (4) as resulting from the
operation of some small number of general patterns of non-deductive argument
or evidential support which recur across many different areas of inquiry. If (2) is
a highly subject-matter specific claim about, say, the reliability of a carbon-14
dating procedure when applied to a certain kind of fossil or (even worse) a claim
that asserts the reliability of a particular pathologist in correctly discriminating
benign from malignant lung tumors when she looks at x-ray photographs,
reference to subject-matter specific or personal considerations will not have
been avoided. A satisfactory IRS analysis would begin instead with some
sentential characterization of the data produced by the radioactive dating
procedure or the data looked at by the pathologist, and then show us how this
data characterization supports (3) or (4) by standing in some formally
characterizable relationship to it that can be instantiated in many different areas
of inquiry. That is, the evidential relevance of the data to (3) or (4) should be
established or represented by the instantiation of some appropriate IRS pattern,
not by a highly subject-matter specific hypothesis like (2). If our critique of IRS
is correct, this is just what cannot be done.
As the passage quoted from Hempel makes clear, IRS accounts are driven in
large measure by a desire to exhibit science as an objective, evidentially
constrained enterprise. We fully agree with this picture of science. We think that
in many scientific contexts, evidence has accumulated in such a way that only
one hypothesis from some large class of competitors is a plausible candidate for
belief or acceptance. Our disagreement with IRS accounts has to do with the
nature or character of the evidential constraints that are operative in science, not
with whether such constraints exist. According to IRS accounts these constraints
derive from highly general, domain-independent, formally characterizable
patterns of evidential support that appear in many different areas of scientific
investigation. We reject this claim, as well as Hempels implied suggestion that
either the way in which evidence constrains belief must be capturable within an
IRS-style framework or else we must agree that there are no such constraints at
all. On the contrasting picture we have sought to provide, the way in which
evidence constrains belief should be understood instead in terms of non-formal,
subject-matter specific kinds of empirical considerations that we have sought to
capture with our notions of general and local reliability. On our account, many
well-known difficulties for IRS approaches (the various paradoxes of
confirmation, and the problem of explaining the connection between a
hypothesis's standing in the formal relationships to an observation sentence
emphasized in IRS accounts and its being true) are avoided. And many features
of actual scientific practice that look opaque on IRS approaches (the evidential
significance of data-generating processes, or the use of data that lacks a natural
sentential representation, or that is noisy, inaccurate, or subject to error) fall
naturally into place.31
[Note 30, continued:] aspiration of finding a subject-matter independent pattern of inductive
argument in which the formal features of the pattern do interesting, non-trivial work of a sort that
might be studied by philosophers has not been met.
James Bogen
Department of Philosophy
Pitzer College (Emeritus) and University of Pittsburgh
rtjbog@aol.com
James Woodward
Division of Humanities and Social Sciences
California Institute of Technology
jfw@hss.caltech.edu
31. We have ignored Bayesian accounts of confirmation. We believe that in principle such accounts
have the resources to deal with some, although perhaps not all, of the difficulties for IRS approaches
described above. However, in practice the Bayesian treatments provided by philosophers often fall
prey to these difficulties, perhaps because those who construct them commonly retain the sorts of
expectations about evidence that characterize IRS-style approaches. Thus while there seems no
barrier in principle to incorporating information about the process by which data has been generated
into a Bayesian analysis, in practice many Bayesians neglect or overlook the evidential relevance of
such information; Bayesian criticisms of randomization in experimental design are one
conspicuous expression of this neglect. For a recent illustration of how Bayesians can capture the
evidential relevance of data-generating processes in connection with the ravens paradox, see Earman
(1992); for a rather more typical illustration of a recent Bayesian analysis that fails to recognize the
relevance of such considerations, see the discussion of this paradox in Howson and Urbach (1989).
As another illustration of the relevance of the discussion in this paper to Bayesian approaches,
consider that most Bayesian accounts require that all evidence have a natural representation by
means of true sentences. These accounts thus must be modified or extended to deal with the fact that
such a representation will not always exist. For a very interesting attempt to do just this, see Jeffrey
(1989).
REFERENCES
Bogen, J. and Woodward, J. (1988). Saving the Phenomena. The Philosophical Review 97, 303-52.
Bogen, J. and Woodward, J. (1992). Observations, Theories, and the Evolution of the Human Spirit.
Philosophy of Science 59, 590-611.
Braithwaite, R. (1953). Scientific Explanation. Cambridge: Cambridge University Press.
Collins, H. M. (1975). The Seven Sexes: A Study in the Sociology of a Phenomenon, or the
Replication of Experiments in Physics. Sociology 9, 205-24.
Collins, H. M. (1981). Son of Seven Sexes: The Social Deconstruction of a Physical Phenomenon.
Social Studies of Science 11, 33-62.
Conant, J. B. (1957). The Overthrow of the Phlogiston Theory: The Chemical Revolution of 1775-
1789. In: J. B. Conant and L. K. Nash (eds.), Harvard Case Histories in Experimental Science,
vol. 1. Cambridge, Mass.: Harvard University Press.
Davis, P. (1980). The Search for Gravity Waves. Cambridge: Cambridge University Press.
Donovan, A., Laudan, L., and Laudan, R. (1988). Scrutinizing Science. Dordrecht: Reidel.
Earman, J. (1992). Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory.
Cambridge, Mass.: The MIT Press.
Earman, J. and Glymour, C. (1980). Relativity and Eclipses. In: J.L. Heilbron (ed.), Historical
Studies in the Physical Sciences, vol. 11, Part I.
Feyerabend, P. K. (1985). Problems of Empiricism. Cambridge: Cambridge University Press.
Franklin, A. (1990). Experiment, Right or Wrong. Cambridge: Cambridge University Press.
Friedman, M. (1979). Truth and Confirmation. The Journal of Philosophy 76, 361-382.
Galison, P. (1987). How Experiments End. Chicago: University of Chicago Press.
Glymour, C. (1980). Theory and Evidence. Princeton: Princeton University Press.
Goldman, A. (1986). Epistemology and Cognition. Cambridge, Mass: Harvard University Press.
Hacking, I. (1983). Representing and Intervening. Cambridge: Cambridge University Press.
Hempel, C. G. (1965). Aspects of Scientific Explanation. New York: The Free Press.
Howson, C. and Urbach, P. (1989). Scientific Reasoning: The Bayesian Approach. La Salle, Ill.:
Open Court.
Humphreys, P. (1989). The Chances of Explanation. Princeton: Princeton University Press.
Jeffrey, R. (1989). Probabilizing Pathology. Proceedings of the Aristotelian Society 89, 211-226.
Lavoisier, A. (1965). Elements of Chemistry. Translated by W. Creech. New York: Dover.
Lewin, R. (1987). Bones of Contention. New York: Simon and Schuster.
Mackie, J. L. (1963). The Paradox of Confirmation. The British Journal for the Philosophy of
Science 13, 265-277.
Merrill, G. H. (1979). Confirmation and Prediction. Philosophy of Science 46, 98-117.
Miller, R. (1987). Fact and Method. Princeton: Princeton University Press.
Pais, A. (1982). Subtle is the Lord. . .: The Science and Life of Albert Einstein. Oxford: Oxford
University Press.
Popper, K. R. (1959). The Logic of Scientific Discovery. New York: Harper & Row.
Priestley, J. (1970). Experiments and Observations on Different Kinds of Air, and Other Branches
of Natural Philosophy Connected with the Subject. Vol. 1. Reprinted from the edition of 1790
(Birmingham: Thomas Pearson). New York: Kraus Reprint Co.
Reichenbach, H. (1938). Experience and Prediction: An Analysis of the Foundations and the
Structure of Knowledge. Chicago: University of Chicago Press.
Schlesinger, G. (1976). Confirmation and Confirmability. Oxford: Clarendon Press.
Taylor, J. (1982). An Introduction to Error Analysis. Oxford: Oxford University Press.
Will, C. (1986). Was Einstein Right? New York: Basic Books.
Woodward, J. (1983). Glymour on Theory Confirmation. Philosophical Studies 43, 147-157.
Woodward, J. (1989). Data and Phenomena. Synthese 79, 393-472.
M. Norton Wise
REALISM IS DEAD
1. Introduction
2. A New Enlightenment
In: Martin R. Jones and Nancy Cartwright (eds.), Idealization XII: Correcting the Model.
Idealization and Abstraction in the Sciences (Poznań Studies in the Philosophy of the Sciences and
the Humanities, vol. 86), pp. 269-285. Amsterdam/New York, NY: Rodopi, 2005.
The heart of the new scheme is representation, which refers to our everyday
ordinary ability to make mental maps or mental models of real systems and to
use those models to negotiate the world. With models replacing ideas,
representation replaces reason, especially deductive reason, which seems to play
a very limited role in the problem solving activity of practical life, whether of
cooks, carpenters, chess players, or physicists. Given the stress on everyday life,
practice is the focus of analysis. Explaining Science: A Cognitive Approach
explains scientific practice in terms of mental practice. It is therefore critical for
Giere to change our conception of what scientific practice is, to show us, in
particular, that the understanding and use of theories is based on representation,
not reason, or not deductive reason. Similarly, there can be no question of
universal rational criteria for accepting or rejecting a theory. There is only
judgement, and judgement is practical. It cannot, however, according to Giere,
be understood in terms of the rational choice models so prominent today in the
social sciences. Instead, he develops a "satisficing" model. I am going to leave
the critique of this model of judgement to others and content myself with
representation. I would observe, however, that in Giere's model of science the
traditional categories of theory and theory choice get translated into [...]
Because Giere sets up his scheme with respect to classical mechanics, I will
discuss in some detail his rendering of it. He has examined a number of
textbooks at the intermediate and advanced level and contends that the
presentation conforms to his scheme, that physicists understand and use
classical mechanics in the way they ought to if representation via models is the
real goal of the theory. I will attempt to show that these claims are seriously
distorting, both historically and in the present.
The problems begin at the beginning, when he throws into one bag
intermediate and advanced textbooks, which are designed for undergraduate and
graduate courses, respectively. The intermediate ones are intermediate rather
than elementary only because they use the vector calculus to develop a wide
range of standard problems, which I agree function as models in the above
sense. But these texts base their treatment on Newton's laws of motion and on
the concept of force, that is, on the principles of elementary mechanics. What
most physicists would call advanced texts, which are based on extremum
conditions like the principle of least action and Hamilton's principle, are said to
differ from the sort based on Newton's laws primarily in "the sophistication of
the mathematical framework employed" (Giere 1988, p. 64). Nothing could be
farther from the truth. The entire conceptual apparatus is different, including the
causal structure. And it includes large areas of experience to which Newton's
laws do not apply. Thus Giere's phrase "the Hamiltonian version of Newton's
laws," rather than the usual "Hamiltonian formulation of mechanics," betrays a
serious distortion in the meaning that advanced mechanics has had for most
physicists since around 1870 (Giere 1988, p. 99).
This is significant in the first instance because Giere wants us to believe that
the reason it doesn't seem to matter much in mechanics textbooks, and in the
learning and doing of physics, whether or which of Newton's laws are
definitions, postulates, or empirical generalizations is that the theory is to be
located not so much in these laws themselves as in the model systems which
realize them, like the harmonic oscillator and the particle subject to an inverse-
square central force. But a more direct explanation would be that these laws are
not actually considered foundational by the physicists who write the textbooks.
These writers are teaching the practical side of a simplified theory which has
widespread utility. Foundations are discussed in advanced texts, where
extremum conditions, symmetry principles, and invariance properties are at
issue. Debate within a given context, I assume, normally focuses on what is
important in that context. We should not look to intermediate textbooks for a
discussion of foundations. If we do we will be in danger of reducing theory to
practice, and elementary practice at that.
I would like to reiterate this first point through a brief glance at the content
of mechanics texts in historical terms. If we look at the great French treatises
and textbooks of Lagrange, Laplace, Poisson, Duhamel, Delaunay and others
through the nineteenth century we will not find Newtons laws at all. The
foundations of French mechanics, including the standard textbooks of the École
Polytechnique, were d'Alembert's principle and the principle of virtual
velocities, a generalized form of the balance principle which Lagrange used to
reduce dynamics to statics. In Britain, of course, Newton's laws were
foundational, and judging from the amount of ink spilt, their status mattered considerably:
from their meaning, to how many were necessary, to their justification. William
Whewell's Cambridge texts are instructive here (1824, 1832).
After mid-century Newton's laws did take on a standard form, at least in
Britain, but only when they had been superseded. In the new physics, which for
simplicity may be dated from Thomson and Tait's Treatise on Natural
Philosophy of 1867, energy functions replace force functions and extremum
principles replace Newton's laws (Kelvin and Tait 1879-83).
meaningful concept, but a secondary one, literally derivative, being defined as
the derivative of an energy function (Smith and Wise 1989).
Now this is all rather important because the new theory promised to
penetrate thermodynamics and electromagnetic fields. In these areas Newton's
laws had little purchase. My point is that the value of the new theory lay not so
much in supplying a more powerful way to solve old problems, as in suggesting
a different conceptual base which might encompass entirely new realms of
experience. The value of theory lay not so much in its power to solve problems
as in its power to unify experience. The monolithic attempt to reduce theory to
practice misses this central point. Cartwright and I may fully agree with Giere
that theories consist in the general propositions and idealized model systems
together, indeed we may agree that so far as use of the theory to solve problems
is concerned, the models are what count. But one does not thereby agree that the
general propositions ought to be thought of as what is implicit in the models.
The propositions are what enable one to recognize the diversity of models as of
one kind, or as having the same form. The power of theory lies in its unifying
function. In this sense, theoretical strategy is not the same as practical strategy.
Inventing a chess game is not the same as playing chess. Similarly, theoretical
physicists are not the same sort of bird as mathematical physicists and neither is
of the same sort as an experimentalist. Although they all interbreed, the
evolution of physics has produced distinct varieties. I suspect that at the level of
professional physicists Giere's naturalism reduces the theoretical variety to the
mathematical one.
To capture the essential difference between theoretical and practical strate-
gies the diagram below may be helpful. The top wedge represents the strategy of
a theorist in attempting to encompass as many different natural systems as
possible under one set of propositions. If the propositions are taken as the
Hamiltonian formulation of mechanics, then the theorist hopes to include not
only classical mechanics itself, but thermodynamics, geometrical optics, electro-
magnetic theory, and quantum mechanics. The ideal is unity under deduction,
although the deduction must be constructed in each case by factoring in a great
deal of specialized information not contained in the propositions.
[Diagram: two wedges. The upper wedge fans out from Hamilton's principle to classical
mechanics, thermodynamics, electromagnetism, geometrical optics, and quantum mechanics;
the lower wedge fans out from mental function to logic, psychology, anthropology, artificial
intelligence, and neurology.]
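Hamilton's principle, at the apex of the upper wedge, can be stated compactly. The following is a standard textbook formulation, not drawn from Wise's text; the symbols (S for the action, L for the Lagrangian, q for a generalized coordinate) are the conventional ones:

```latex
% Hamilton's principle: the actual motion makes the action stationary.
\delta S \;=\; \delta \int_{t_1}^{t_2} L\bigl(q, \dot{q}, t\bigr)\, dt \;=\; 0
% Carrying out the variation yields the Euler--Lagrange equation, from
% which the equations of motion of each field in the wedge follow once
% a suitable L is supplied:
\frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}} \;-\; \frac{\partial L}{\partial q} \;=\; 0
```

The unifying power Wise describes lies in the fact that the same variational schema covers each field in the wedge once an appropriate Lagrangian (or Hamiltonian) is specified for it, together with the "great deal of specialized information" the text mentions.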
4. Similarity Realism
The historical picture takes on a somewhat different tint with respect to realism.
Giere argues that theoretical models represent real systems in some respects and
to some degree and that in these respects and degrees they are realistic. Nearly
all of the formulators of the new mechanics, however, rejected precisely this
version of the realism of theories. Thomson and Tait labeled their theoretical
development "abstract dynamics" to differentiate it from the realistic theory
they lacked, namely "physical dynamics." Abstract dynamics worked with rigid
rods, frictionless surfaces, perfectly elastic collisions, point particles and the
like, not with the properties of real materials. They were singularly unimpressed
with the fact that abstract dynamics yielded approximately correct results in
certain idealized situations, because the theory actually violated all experience.
Most simply, its laws were reversible in time, which meant that it contradicted
the second law of thermodynamics and therefore could not be anything like
correct physically. They suspected that its fundamental flaw lay in the fact that
it dealt with only a finite number of variables. It could be formulated, therefore,
only for a finite system of discrete particles and would not apply to a
continuum, which Thomson especially believed to be the underlying reality
(Smith and Wise 1989, chs. 11, 13, 18). But the argument does not depend on
this continuum belief. As I interpret Thomson, he would say that the realist
position is vitiated not by the fact that the theory fails to reproduce natural
phenomena in some respect or to some degree but by the fact that it
straightforwardly contradicts all empirical processes in its most fundamental
principles. He would not understand the point of Giere's contention with respect
to the ether that its non-existence is "not [a good basis] for denying all
realistically understood claims about similarities between ether models and the
world" (Giere 1988, p. 107; emphasis in original). Why not, pray tell? Why not
simply call the similarities analogies and forget the realism?
Thomson was a realist, most especially about the ether. But to get at reality
he started at the far remove from abstract theory and abstract models, namely at
the phenomenological end, with the directly observable properties of known
materials. For example, he argued for the reality of his elastic-solid model of
ether partly on the grounds that the ether behaved like Scotch shoemakers' wax
and calves-foot jelly. He attempted always to construct realistic models which
relied on the practical reality of familiar mechanical systems rather than on the
mathematical structure of idealized hypothetical models. From the perspective
of this contrast between abstract dynamics and practical reality, the point of
reformulating mechanics in terms of energy functions and extremum conditions
was not to obtain a more realistic theory but to obtain a more general one, and
one that was thus more powerful in the sense of organizing a greater range of
experience. To make abstract mechanics subsume thermodynamics one could
represent matter as composed of a finite number of hard atoms interacting via
forces of attraction and repulsion, but then one would have to add on a
randomizing assumption in order to get rid of the effects of finiteness and time-
reversibility in the equations of mechanics. The resulting theory, doubly false,
certainly did not count as realistic among its British analysts: Thomson, Tait,
Maxwell, and others.
Maxwell put this point in its strongest form, to the effect that the goal of
theoretical explanation in general, and especially of the new mechanics, was not
to discover a particular concrete model which reproduced the observed behavior
of the system in question but to discover the most general formulation possible
consistent with this observed behavior. Thus one sought the most general energy
function for the system, a Lagrangian or a Hamiltonian, which would yield
empirically correct equations for its motion, this energy function being specified
in terms of observable coordinates alone, like the input and output coordinates
of a black box, or more famously like the bell ropes in a belfry, independent of
any particular model of the interior workings of the belfry. For every such
Lagrangian or Hamiltonian function, Maxwell observed, an infinite variety of
concrete mechanical models might be imagined to realize it. He himself
exhibited uncommon genius in inventing such models, but unlike his friend
Thomson, he was much more sanguine about the value of unrealistic ones,
regarding them as guarantors of the mechanical realizability of the Lagrangian
in principle. He did not suppose that a similarity between the workings of a
given model and observations on a real system indicated that the system was
really like the model, but only analogous to it. Being analogous and being like
were two different things.
Similar remarks could be made for the perspective on generalized dynamics
and on mechanical models of Kirchhoff, Mach, Hertz, and Planck. Even
Boltzmann, the most infamous atomistic-mechanist of the late nineteenth
century, expressed himself as in agreement with Maxwell on the relation of
models to real systems. Since Boltzmann, in his 1902 article on "Models" for
the Encyclopaedia Britannica, cites the others to support his view, I will let him
stand for them all. Boltzmann remarks that "On this view our thoughts stand to
things in the same relation as models to the objects they represent . . . but
without implying complete similarity between thing and thought; for naturally
we can know but little of the resemblance of our thoughts to the things to which
we attach them." So far he does not diverge strikingly from Giere on either the
nature or the limitations of similarity. But while Giere concludes realism from
limited similarity, Boltzmann concludes that "the true nature and form of the
real system must be regarded as absolutely unknown and the workings of the
model looked upon as simply a process having more or less resemblance to the
workings of nature, and representing more or less exactly certain aspects
incidental to them." Citing Maxwell on mechanical models, he observes that
Maxwell did not believe in the existence in nature of mechanical agents so
constituted, and that he regarded them merely as means by which phenomena
could be reproduced, bearing a certain similarity to those actually existing . . .
The question no longer being one of ascertaining the actual internal structure of

[...]

period examined the similarity relations that Giere considers an argument for
realism and drew the opposite conclusion. They opted for nihilism. Given the
prominent role of these people in the evolution of physics, it seems that an
evolutionary naturalism, in particular, ought not to make the successful pursuit
of physics depend on realism about theories. I therefore advocate following their
nihilistic example with respect to the realism-antirealism game. But this
conclusion does not follow only from history. A reading of many contemporary
theorists supports it. Stephen Hawking, for example, in his popular little book, A
Brief History of Time (1988), contends that theoreticians are merely engaged in
making up more or less adequate stories about the world. The same view
appears at length in a book called Inventing Reality: Physics as Language
(1988), by Bruce Gregory, associate director of the Harvard-Smithsonian Center
for Astrophysics. "A physicist is no more engaged in painting a realistic
picture of the world than a realistic painter is," Gregory opines, and again,
"[p]hysical theories do not tell physicists how the world is; they tell physicists
what they can predict reliably about the behavior of the world."
The preceding two sections suggest that Giere's reduction of theory to
practice and his similarity realism are linked. Actually I think that if we reject
the former we automatically reject the latter. I will illustrate this linkage with a
final example from mechanics, again emphasizing the more sophisticated
versions which rely on extremum conditions and the variational calculus. The
most important theoretical goal in using such formulations is to be able to
encompass within a single formalism a wide variety of quite different fields of
physics which employ different mathematical relations, such as Newton's laws,
Maxwell's equations, and the Schrödinger equation. Goldstein, the author of
one of Giere's advanced textbooks and the one my entire generation of
physicists was brought up on, remarks that "[c]onsequently, when a variational
principle is used as the basis of the formulation, all such fields will exhibit, at
least to some degree, a structural analogy" (Goldstein 1950, p. 47). This is the
same view that Maxwell promoted about formal analogy. It means, to give
Goldstein's simple example, that an electrical circuit containing inductance,
capacitance, and resistance can be represented by a mechanical system
containing masses, springs, and friction. The similarity, however, has never induced
physicists to call the representation realistic. The same argument applies to the
relation between theoretical models and real systems.
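Goldstein's simple example can be made explicit. The two equations below are the standard textbook forms, not taken from Wise's text; m, b, and k are mass, damping, and stiffness, while L, R, and C are inductance, resistance, and capacitance:

```latex
% Driven mass--spring--damper system (x = displacement):
m\,\ddot{x} + b\,\dot{x} + k\,x = F(t)
% Driven series RLC circuit (q = charge on the capacitor):
L\,\ddot{q} + R\,\dot{q} + \tfrac{1}{C}\,q = V(t)
% The formal correspondence:
% m \leftrightarrow L, \quad b \leftrightarrow R, \quad
% k \leftrightarrow 1/C, \quad x \leftrightarrow q.
```

The two equations are identical term by term in form, which is exactly the structural analogy Goldstein describes; yet, as the text observes, no physicist on that account calls the mechanical system a realistic representation of the circuit.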
[...] shifts from similarity to manipulation and control. Here comparisons with Ian
Hacking's Representing and Intervening are unavoidable, as Giere
acknowledges. One thinks particularly of Hacking's phrase "If you can spray it, it's real,"
for which Giere's alternative is, "Whatever can be physically manipulated and
controlled is real" (Hacking 1983; Giere 1988). He has as little time for
philosophers and sociologists who don't believe in protons as he would have for
soldiers who don't believe in bullets, and for much the same reasons. Both can
be produced and used at will with predictable effect. Protons, in fact, have the
status in many experiments, not of hypothetical theoretical entities, but of
research tools.
The part of this discussion I like best is the three pages on technology in the
laboratory. The subject has become a popular one in the last five years, at least
in the history of science. In Giere's naturalistic scheme, technology provides the
main connector between our cognitive capacity for representation and
knowledge of, for example, nuclear structure. He suggests that the sort of
knowledge bound up in technology is an extension of our sensorimotor and
preverbal representational systems. It is both different in kind from, and more reliable,
than knowledge requiring symbolic and verbal manipulation. It is knowledge of
the everyday furniture of the laboratory, which allows one to act in the world of
the nucleus. Here we seem finally to be talking about experimental practice and
the life of an experimental laboratory. Theory testing is certainly involved, but
is by no means the essence. I can only complain that three pages hardly count in
a subject so large as this one.
I do have two reservations, however. First, there seems to be a tendency to
slide from the reality of a thing called a proton to the reality of the model of the
proton, including its properties and characteristics. The model, even for the
simplest purposes, involves a quantum mechanical description of elusive
properties like spin, which is not "spin" at all in our everyday sense of a spinning
ball. Does this assigned property have the same reality status as the proton
itself? One need not get into the status of quantum mechanical models to raise
this problem. Historically, many entities have been known to exist, and have
been manipulated and controlled to an extraordinary degree, on the basis of
false models. Electric current is an outstanding example. Our ability to
manipulate and control a proton may well guarantee the reality of the proton
without guaranteeing the reality of any model of it. This point goes back
immediately to the previous remarks about similarity realism and theoretical
models.
My second reservation has to do with tool use. How are we to differentiate
between material tools like magnets, detectors, and scattering chambers, on the
one hand, and intellectual tools like mathematical techniques, on the other
hand? Both are used as agents of manipulation and control of protons and both
are normally used without conscious examination of principles or foundations.
Both are the stuff of practice. Once again then, any arguments about laboratory
practice are going to have to tie back into those about theoretical practice. A
more thoroughgoing analysis will be required to show their similarities and
differences.
If it is true, as I have argued, that Gieres realism about theories is just a game
with words, why does he pursue the game? The only answer I can see is that he
is anxious to defeat the radical relativism which he associates with sociologists
or social constructivists. That is, Giere's realism serves primarily as a guard
against relativism in the no-constraints version, which maintains that nature puts
no constraints on the models we make up for explaining it. All models are in
principle possible ones and which one is chosen is merely a matter of social
negotiation. The problem of explaining the content of science, therefore, is
exclusively that of explaining how one scheme rather than another comes to
have currency among a particular social group. To move all the way to a realist
position to defeat this form of relativism, however, seems to be bringing up
cannons where peashooters would do. A simpler argument would be to reject
no-constraints relativism on pragmatic grounds, on the grounds that there is no
evidence for it and a great deal of evidence against it. It is so difficult to make
up empirically adequate theories of even three or five different kinds, that we
have no reason to believe we could make up ten, let alone an infinite number,
and much less every conceivable kind. Constructing castles in the air has not
proved a very successful enterprise. The construction of magnetic monopoles
has fared little better. Arguing from all experience with attempts to construct
adequate models, therefore, particularly from the empirical failure of so many of
the ones that have been constructed, we must suppose that nature does put
severe constraints on our capacity to invent adequate ones. Radical relativism,
for a naturalist, ought to be rejected simply because it does not conform to
experience.
This argument leaves a social constructivist perfectly free to claim that,
within the limits of empirical consistency, even though these limits are severe,
all explanations are socially constructed. Among the reformed school, Andrew
Pickering, Steven Shapin, Simon Schaffer, and Bruno Latour all hold this view.
As Pickering puts it, nature exhibits a great deal of resistance to our constructions. Almost no social constructivists, to my knowledge, presently subscribe to
the no-constraints relativism of the strong program except a few devotees
remaining in Edinburgh and Bath.1 For those interested in the new constellation
of social construction I would recommend a recent issue of Science in Context,
with contributions from Tim Lenoir, Steve Shapin, Simon Schaffer, Peter
Galison, and myself, among others. Bruno Latour's Science in Action (1987)
also presents an exceedingly interesting position. He extends the social
negotiations of the old strong program to negotiations with nature. For all of
these investigators, nature exhibits such strong resistance to manipulation that
no-constraints relativism has become irrelevant. And the constrained relativism
which they do adopt does not differ significantly from what Giere calls realism.
There are, however, significant reasons for not calling it realism. Constraints
act negatively. They tell us which of the options we have been able to invent are
possibly valid ones, but they do not invent the options and they do not tell us
which of the possible options we ought to pursue. This suggests immediately
that in order to understand how knowledge gets generated we must analyze the
social and cultural phenomena which, over and above the constraints which
nature is able to exert, are productive of scientific knowledge. Actually this
position seems to be Giere's own. Noting that his constructive realism was invented to counter van Fraassen's constructive empiricism, he adds: "The term emphasizes the fact that models are deliberately created, 'socially constructed' if one wishes, by scientists. Nature does not reveal to us directly how best to represent her. I see no reason why realists should not also enjoy this insight" (Giere 1988, p. 93). They should, but having savored its delights, they should
give up realism.
The position I am advocating has strong instrumentalist aspects. People use
what works, or what has utility. But the instrumentalism of philosophers does
not normally take into account the social relativity of utility. What works is
relative to what purposes one has, and purposes are generally social. Thus it is
no good explaining the growth of Maxwellian electromagnetic theory in Britain
simply in terms of the fact that it gave a satisfactory account of the behavior of
light. Satisfactory to whom? Certainly not to all mathematical physicists in
Britain and certainly not to any mathematical physicists on the Continent
between 1863, when Maxwell first published his theory, and the mid-1890s
when it was superseded by Lorentz's electron theory. Social historians contend
that differences like these require a social interpretation of how scientific
explanations get constituted. Philosophers of science, including instrumentalists
and pragmatists, have generally had nothing to offer. They do not incorporate
the social into the essence of science, but leave it on the borders, perpetuating
the internal-external dichotomy.
1. Shapin, one of the original Edinburgh group, never accepted the social determinist position. He has removed to San Diego. Pickering has removed to Illinois, but not before feuding with Bloor on the subject. Latour has been similarly sparring with Collins.
7. Realism Is Dead
Of course we know that models pick out certain aspects for emphasis, subordinate
others, and ignore or even suppress the remainder, but if the model is our means
for interacting with the world then the aspects that it picks out define the reality
of the world for us.
This attitude gains its strongest support from technology. Whenever we
embody models in material systems and then use those systems to shape the
world, we are shaping reality. Prior to the twentieth century, it might be argued,
technology merely enhanced or modified already existing materials and energy
sources. But that is certainly no longer the case. We regularly produce
substances and whole systems that have no natural existence except as artifacts
of our creation. This is most startling in the case of genetic engineering, where
life itself is presently being shaped, but it applies equally to Teflon and
television. I would stress that we typically accomplish these creations by
manipulating models which we attempt to realize in the world. Often nature is
recalcitrant and we have to return to our models. But far from representing a
preexisting nature, the models create the reality that we then learn to recognize
as natural.
The process of creation is most obvious with respect to computer
simulations. They have become so sophisticated, and so essential to basic
research in all of the sciences, that they often substitute for experimental
research. A computer-simulated wind tunnel can have great advantages over a
real one as a result of increased control over relevant variables and
elimination of irrelevant ones. More profoundly, the entire field of chaos theory
has emerged as a result of computer simulations of the behavior of non-linear
systems. The computer-generated pictures make visible distinct patterns of
behavior where previously only chaotic motion appeared. They show how such
patterns can be generated from simple codes iterated over and over again. Of
course, it is possible to hold that the computer simulations merely discover a
reality that was actually there all along. But this way of talking is precisely the
target of the assertion that realism is dead. I prefer to say that the simulations
create one of the constructions of reality possible under the constraints of
nature. They show that it is possible coherently to represent chaotic systems in
terms of iterated codes. But this representation is an artifact of the computer, or
an artifact of an artifact. Herein lies the lesson of the simulacrum scheme. We
are the creators.
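Wise's observation that chaotic patterns "can be generated from simple codes iterated over and over again" can be made concrete with a minimal sketch. The following is my illustration, not part of Wise's text: the logistic map, the standard textbook example of a one-line rule whose repeated iteration produces steady, periodic, or chaotic behavior depending on a single parameter.

```python
# Illustrative sketch (not from Wise's text): the logistic map,
# x_{n+1} = r * x * (1 - x), is a "simple code" whose iteration
# generates qualitatively distinct patterns of behavior.

def iterate_logistic(r, x0=0.2, skip=500, keep=8):
    """Iterate the map, discard the transient, return the settled orbit."""
    x = x0
    for _ in range(skip):           # let transients die out
        x = r * x * (1 - x)
    orbit = []
    for _ in range(keep):           # record the long-run behavior
        x = r * x * (1 - x)
        orbit.append(round(x, 4))
    return orbit

# r = 2.8: the orbit settles onto a single fixed point.
# r = 3.2: a period-2 cycle, alternating between two values.
# r = 3.9: no repeating pattern -- deterministic chaos.
for r in (2.8, 3.2, 3.9):
    print(r, iterate_logistic(r))
```

The same rule, iterated the same way, yields order or chaos as the parameter varies; the visible "pattern" is a product of running the code, just as Wise says of computer simulations.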
M. Norton Wise
Department of History
University of California, Los Angeles
nortonw@history.ucla.edu
REFERENCES
Boltzmann, L. (1974). Theoretical Physics and Philosophical Problems: Selected Writings. Edited
by B. McGuinness, with a foreword by S. R. de Groot; translated by P. Foulkes. Dordrecht:
Reidel.
Cartwright, N. (1983). How the Laws of Physics Lie. Oxford: Clarendon Press.
Giere, R. N. (1988). Explaining Science: A Cognitive Approach. Chicago: University of Chicago
Press.
Goldstein, H. (1950). Classical Mechanics. Cambridge, Mass.: Addison-Wesley.
Gregory, B. (1988). Inventing Reality: Physics as Language. New York: Wiley.
Hacking, I. (1983). Representing and Intervening: Introductory Topics in the Philosophy of Natural
Science. Cambridge: Cambridge University Press.
Hawking, S. W. (1988). A Brief History of Time: From the Big Bang to Black Holes. Toronto, New
York: Bantam.
Latour, B. (1987). Science in Action: How to Follow Scientists and Engineers through Society.
Cambridge, Mass.: Harvard University Press.
Lavoisier, A.-L. (1965). Elements of Chemistry: In a New Systematic Order, Containing all the
Modern Discoveries. Translated by R. Kerr. New York: Dover.
Smith, C. and Wise, M. N. (1989). Energy and Empire: A Biographical Study of Lord Kelvin. Cambridge: Cambridge University Press.
Thomson, W. (Lord Kelvin) and Tait, P. G. (1879-83). Treatise on Natural Philosophy. New edition. 2 vols. Cambridge: Cambridge University Press.
Van Fraassen, B. C. (1980). The Scientific Image. Oxford: Clarendon Press.
Whewell, W. (1824). An Elementary Treatise on Mechanics: Designed for the Use of Students in
the University. 2nd edition. Cambridge: Printed by J. Smith for J. Deighton.
Whewell, W. (1832). On the Free Motion of Points, and on Universal Gravitation, including the
Principal Propositions of Books I. and III. of the Principia; the First Part of a New Edition of a
Treatise on Dynamics. Cambridge: For J. and J. J. Deighton.
IS REALISM DEAD?
1. A New Enlightenment?
2. Mechanics
In: Martin R. Jones and Nancy Cartwright (eds.), Idealization XII: Correcting the Model.
Idealization and Abstraction in the Sciences (Poznań Studies in the Philosophy of the Sciences and
the Humanities, vol. 86), pp. 287-293. Amsterdam/New York, NY: Rodopi, 2005.
3. Realism
science. For Lakatos, progressive research programs are those that generate new
empirical (not theoretical) content. Laudan remained closer to Kuhn in
maintaining that more progressive programs are those with greater problem
solving effectiveness. Both Lakatos and Laudan identified progress with
rationality so as to recover the philosophical position that science is rational, in
opposition to Kuhn who denied any special rationality for science.
There is one more distinction to be made before I can conclude my defense
of realism. Laudan, for example, claims that his account of science is
representational in the sense that scientific hypotheses are statements that are in
fact true or false. He calls this semantic realism. But he goes on to argue that
there are no, and perhaps can be no, rational grounds for any claims one way or
the other. In short, the basis of Laudan's anti-realism is not semantic, but epistemological. The same is true of van Fraassen's (1980) anti-realism.
My realism has two parts. First, it rejects notions of truth and falsity as
being too crude for an adequate theory of science. Taken literally, most
scientific claims would have to be judged false, which shows that something is
drastically wrong with the analysis. Rather, I regard scientific hypotheses as
typically representational in the sense of asserting a structural similarity
between an abstract model and some part of the real world. (I say typically
because I want to allow for the possibility of cases where this is not so. Parts of
microphysics may be such a case.)
The second part is the theory of scientific judgment, and the theory of
experimentation, which Wise, for reasons of exposition, put to one side. As
elaborated in Explaining Science, I think there are judgmental strategies for
deciding which of several models possesses the greater structural similarity with
the world. Typically these strategies involve experimentation. And they are at
least sometimes effective in the sense that they provide a substantial probability
for leading one to make the right choice (1988, ch. 6).
cannot be taken as definitive. These accounts provide just one sort of evidence
to be used in the investigation of what the actors are in fact doing.
Scientists are no different. The theories scientists propound about their
scientific activities do not have a privileged role in the study of science as a
human activity. What scientists will say about the nature of their work depends
heavily on the context, their interests and scientific opponents, the supposed
audience, even their sources of funding. Newton's claim not to "feign hypotheses"
may be the most famous case in point. The claim is obviously false of his actual
scientific practice. It may make more sense when considered in the context of
his disputes with Leibniz and the Cartesians.
But I am no historian. Let me take a more mundane example from my own
experience studying work at a large nuclear physics laboratory (1988, ch. 5).
One of my informants claimed that "physics is like poetry" and that "physicists are like poets." I don't know if he ever propounded this theory to his physicist
friends. But he told me, most likely because he thought that this is the kind of
thing that interests philosophers of science. Well, maybe there is poetry, as well
as music, in the hum of a well-tuned cyclotron. But the truth is that this man
began his academic life as an English major with strong interests in poetry. I am
sure that this fact had more to do with sustaining his theory about the nature of
physics than anything going on in that laboratory. I can only speculate about the
details of the psychological connections.
7. Is Realism Dead?
Since I finished writing Explaining Science, there has been some softening in
the sociological position, as Wise notes. Perhaps there is no longer a significant
substantive difference between my position and the consensus position among
sociologists and social historians of science.
Since I am not yet sure that convergence has in fact occurred, let me
conclude by stating what I would regard as the minimal consensus on the
starting point for developing an adequate theory of science. It is this: We now
know much more about the world than we did three hundred, one hundred, fifty,
or even twenty-five years ago. More specifically, many of the models we have
today capture more of the structure of various parts of the world, and in more
detail, than models available fifty or a hundred years ago. For example, current
models of the structure of genetic materials capture more of their real structure
than models available in 1950. The primary task of a theory of science is to
explain the processes that produced these results. To deny this minimal position,
or even to be agnostic about it, is to misconceive the task. It is to retreat into
scholasticism and academic irrelevance.
If this is the consensus, it marks not the death, but the affirmation of a realist
perspective. The good news would be that we could at least temporarily put to
rest arguments about realism and get on with the primary task. That would be all
to the good because the primary task is more exciting, and more important.*
Ronald N. Giere
Department of Philosophy and Center for Philosophy of Science
University of Minnesota
giere@maroon.tc.umn.edu
* The author gratefully acknowledges the support of the National Science Foundation and the hospitality of the Wissenschaftskolleg zu Berlin.

REFERENCES
VOLUME 1 (1975)
VOLUME 2 (1976)
VOLUME 17 (1990)
IDEALIZATION II: FORMS AND APPLICATIONS
(Edited by Jerzy Brzeziński, Francesco Coniglione, Theo A.F. Kuipers and
Leszek Nowak)
VOLUME 25 (1992)
IDEALIZATION III: APPROXIMATION AND TRUTH
(Edited by Jerzy Brzeziński and Leszek Nowak)
VOLUME 26 (1992)
IDEALIZATION IV: INTELLIGIBILITY IN SCIENCE
(Edited by Craig Dilworth)
VOLUME 34 (1994)
Izabella Nowakowa
VOLUME 38 (1994)
IDEALIZATION VI: IDEALIZATION IN ECONOMICS
(Edited by Bert Hamminga and Neil B. De Marchi)
VOLUME 42 (1995)
IDEALIZATION VII: IDEALIZATION, STRUCTURALISM,
AND APPROXIMATION
(Edited by Martti Kuokkanen)
VOLUME 63 (1998)
IDEALIZATION IX: IDEALIZATION IN CONTEMPORARY PHYSICS
(Edited by Niall Shanks)
VOLUME 69 (2000)
Izabella Nowakowa, Leszek Nowak
VOLUME 82 (2004)
This book presents the work of Polish and American philosophers about
Poland's transition from Communist domination to democracy. Among their
topics are nationalism, liberalism, law and justice, academic freedom, religion,
fascism, and anti-Semitism. Beyond their insights into the ongoing situation in
Poland, these essays have broader implications, inspiring reflection on dealing
with needed social changes.
This book examines the role and limits of policies in shaping attitudes and
actions toward war, violence, and peace. Authors examine militaristic
language and metaphor, effects of media violence on children, humanitarian
intervention, sanctions, peacemaking, sex offender treatment programs,
nationalism, cosmopolitanism, community, and political forgiveness to
identify problem policies and develop better ones.
André Mineau
This book argues that, given Operation Barbarossa's concept and scope, it would have been impossible without Nazi ideology, and that we cannot understand it without reference to the Holocaust. It asks and attempts to answer
whether we can describe ideology without reference to ethics and speak about
genocide while ignoring philosophy.