
The Cybernetic Bayesian Brain

From Interoceptive Inference to Sensorimotor Contingencies

Anil K. Seth

Is there a single principle by which neural operations can account for perception, cognition, action, and even consciousness? A strong candidate is now taking shape in the form of “predictive processing”. On this theory, brains engage in predictive inference on the causes of sensory inputs by continuous minimization of prediction errors or informational “free energy”. Predictive processing can account, supposedly, not only for perception, but also for action and for the essential contribution of the body and environment in structuring sensorimotor interactions. In this paper I draw together some recent developments within predictive processing that involve predictive modelling of internal physiological states (interoceptive inference), and integration with “enactive” and “embodied” approaches to cognitive science (predictive perception of sensorimotor contingencies). The upshot is a development of predictive processing that originates, not in Helmholtzian perception-as-inference, but rather in 20th-century cybernetic principles that emphasized homeostasis and predictive control. This way of thinking leads to (i) a new view of emotion as active interoceptive inference; (ii) a common predictive framework linking experiences of body ownership, emotion, and exteroceptive perception; (iii) distinct interpretations of active inference as involving disruptive and disambiguatory (not just confirmatory) actions to test perceptual hypotheses; (iv) a neurocognitive operationalization of the “mastery of sensorimotor contingencies” (where sensorimotor contingencies reflect the rules governing sensory changes produced by various actions); and (v) an account of the sense of subjective reality of perceptual contents (“perceptual presence”) in terms of the extent to which predictive models encode potential sensorimotor relations (this being “counterfactual richness”). This is rich and varied territory, and surveying its landmarks emphasizes the need for experimental tests of its key contributions.

Keywords
Active inference | Counterfactually-equipped predictive model | Evolutionary robotics | Free energy principle | Interoception | Perceptual presence | Predictive processing | Sensorimotor contingencies | Somatic marker hypothesis | Synaesthesia

Author
Anil K. Seth
a.k.seth@sussex.ac.uk
University of Sussex
Brighton, United Kingdom

Commentator
Wanja Wiese
wawiese@uni-mainz.de
Johannes Gutenberg-Universität
Mainz, Germany

Editors
Thomas Metzinger
metzinger@uni-mainz.de
Johannes Gutenberg-Universität
Mainz, Germany

Jennifer M. Windt
jennifer.windt@monash.edu
Monash University
Melbourne, Australia

1 Introduction

An increasingly popular theory in cognitive science claims that brains are essentially prediction machines (Hohwy 2013). The theory is variously known as the Bayesian brain (Knill & Pouget 2004; Pouget et al. 2013), predictive processing (Clark 2013; Clark this collection), and the predictive mind (Hohwy 2013; Hohwy this collection), among others; here we use the term PP (predictive processing). (See Table 1 for a glossary of technical terms.) At its most fundamental, PP says that perception is the result of the brain inferring the most likely causes of its sensory inputs by minimizing the difference between actual sensory signals and the signals expected on the basis of continuously updated predictive models. Arguably, PP provides the most complete framework to date for explaining perception, cognition, and action in terms of fundamental theoretical principles and neurocognitive architectures. In this paper I describe a version of PP that is distinguished by (i) an emphasis on predictive modelling of in-
Seth, A. K. (2015). The Cybernetic Bayesian Brain - From Interoceptive Inference to Sensorimotor Contingencies.
In T. Metzinger & J. M. Windt (Eds). Open MIND: 35(T). Frankfurt am Main: MIND Group. doi: 10.15502/9783958570108 1 | 24
www.open-mind.net

Table 1: A glossary of technical terms.


ternal physiological states and (ii) engagement with alternative frameworks under the banner of “enactive” and “embodied” cognitive science (Varela et al. 1993).

I first identify an unusual starting point for PP, not in Helmholtzian perception-as-inference, but in the mid 20th-century cybernetic theories associated with W. Ross Ashby (1952, 1956; Conant & Ashby 1970). Linking these origins to their modern expression in Karl Friston’s “free energy principle” (2010), perception emerges as a consequence of a more fundamental imperative towards homeostasis and control, and not as a process designed to furnish a detailed inner “world model” suitable for cognition and action planning. The ensuing view of PP, while still fluently accounting for (exteroceptive) perception, turns out to be more naturally applicable to the predictive perception of internal bodily states, instantiating a process of interoceptive inference (Seth 2013; Seth et al. 2011). This concept provides a natural way of thinking of the neural substrates of emotional and mood experiences, and also describes a common mechanism by which interoceptive and exteroceptive signals can be integrated to provide a unified experience of body ownership and conscious selfhood (Blanke & Metzinger 2009; Limanowski & Blankenburg 2013).

The focus on embodiment leads to distinct interpretations of active inference, which in general refers to the selective sampling of sensory signals so as to improve perceptual predictions. The simplest interpretation of active inference is the changing of sensory data (via selective sampling) to conform to current predictions (Friston et al. 2010). However, by analogy with hypothesis testing in science, active inference can also involve seeking evidence that goes against current predictions, or that disambiguates multiple competing hypotheses. A nice example of the latter comes from self-modelling in evolutionary robotics, where multiple competing self-models are used to specify actions that are most likely to provide disambiguatory sensory evidence (Bongard et al. 2006). I will spend more time on this example later. Crucially, these different senses of active inference rest on the capacity of predictive models to encode counterfactual relations linking potential (but not necessarily executed) actions to their expected sensory consequences (Friston et al. 2012; Seth 2014b). It also implies the involvement of model comparison and selection, not just the optimization of parameters assuming a single model. These points represent significant developments in the basic infrastructure of PP.

The notion of counterfactual predictions connects PP with what at first glance seems to be its natural opponent: “enactive” theories of perception and cognition that explicitly reject internal models or representations (Clark this collection; Hutto & Myin 2013; Thompson & Varela 2001). Central to the enactive approach are notions of “sensorimotor contingencies” and their “mastery” (O’Regan & Noë 2001), where a sensorimotor contingency refers to a rule governing how sensory signals change in response to action. On this approach, the perceptual experience of (for example) redness is given by an implicit knowledge (mastery) of the way red things behave given certain patterns of sensorimotor activity. This mastery of sensorimotor contingencies is also said to underpin perceptual presence: the sense of subjective reality of the contents of perception (Noë 2006). From the perspective of PP, mastery of a sensorimotor contingency corresponds to the learning of a counterfactually-equipped predictive model connecting potential actions to expected sensory consequences. The resulting theory of PPSMC (Predictive Perception of SensoriMotor Contingencies; Seth 2014b) provides a much needed reconciliation of enactive and predictive theories of perception and action. It also provides a solution to the challenge of perceptual presence within the setting of PP: perceptual presence obtains when the underlying predictive models are counterfactually rich, in the sense of encoding a rich repertoire of potential (but not necessarily executed) sensorimotor relations. This approach also helps explain instances where perceptual presence seems to be lacking, such as in synaesthesia.

This is both a conceptual and theoretical paper. Space limitations preclude any significant treatment of the relevant experimental lit-

Figure 1: A. Schemas of hierarchical predictive coding across three cortical regions; the lowest on the left (R1) and the highest on the right (R3). Bottom-up projections (red) originate from “error units” (orange) in superficial cortical layers and terminate on “state units” (light blue) in the deep (infragranular) layers of their targets; while top-down projections (dark blue) convey predictions originating in deep layers and project to the superficial layers of their targets. Prediction errors are associated with precisions, which determine the relative influence of bottom-up and top-down signal flow via precision weighting (dashed lines). B. The influence of precisions on Bayesian inference and predictive coding. The curves show probability distributions over the value of a sensory signal (x-axis). On the left, high precision-weighting of sensory signals (red) enhances their influence on the posterior (green) and expectation (dotted line) as compared to the prior (blue). On the right, low sensory precision weighting has the opposite effect. Figure adapted from Seth (2013).
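
As a rough illustration of the effect described for panel B, the following Python sketch combines a Gaussian prior with a Gaussian sensory signal. It assumes the textbook conjugate-Gaussian update, in which the posterior mean is a precision-weighted average of the two means; the function name `fuse_gaussians` and all numerical values are illustrative assumptions, not quantities taken from the figure.

```python
def fuse_gaussians(prior_mean, prior_precision, sensory_mean, sensory_precision):
    """Combine a Gaussian prior with a Gaussian sensory signal.

    Precision is inverse variance. Returns (posterior_mean, posterior_precision).
    """
    # Precisions add; the mean is the precision-weighted average of both means.
    posterior_precision = prior_precision + sensory_precision
    posterior_mean = (
        prior_precision * prior_mean + sensory_precision * sensory_mean
    ) / posterior_precision
    return posterior_mean, posterior_precision


# Illustrative numbers (assumed, not read off the figure).
prior_mean, prior_precision = 0.0, 1.0
sensory_mean = 10.0

# High sensory precision: the posterior is pulled towards the sensory signal.
high_mean, high_precision = fuse_gaussians(
    prior_mean, prior_precision, sensory_mean, sensory_precision=4.0
)

# Low sensory precision: the posterior stays close to the prior.
low_mean, low_precision = fuse_gaussians(
    prior_mean, prior_precision, sensory_mean, sensory_precision=0.25
)

print(high_mean, low_mean)
```

With the prior held fixed, raising the sensory precision from 0.25 to 4 moves the posterior mean from 2 to 8 on the way from the prior (0) to the sensory signal (10).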

erature. However, even an exhaustive treatment would reveal that this literature so far provides only circumstantial support for the basics of PP, let alone for the extensions described here. Yet an advantage of PP theories is that they are grounded in concrete computational processes and neurocognitive architectures, giving us confidence that informative experimental tests can be devised. Implementing such an experimental agenda stands as a critical challenge for the future.

2 The predictive brain and its cybernetic origins

2.1 Predictive processing: The basics

PP starts with the assumption that in order to support adaptive responses, the brain must discover information about the external “hidden” causes of sensory signals. It lacks any direct access to these causes, and can only use information found in the flux of sensory signals them-

selves. According to PP, brains meet this challenge by attempting to predict sensory inputs on the basis of their own emerging models of the causes of these inputs, with prediction errors being used to update these models so as to minimize discrepancies. The idea is that a brain operating this way will come to encode (in the form of predictive or generative models) a rich body of information about the sources of signals by which it is regularly perturbed (Clark 2013).

Applied to cortical hierarchies, PP overturns classical notions of perception that describe a largely “bottom-up” process of evidence accumulation or feature detection. Instead, PP proposes that perceptual content is determined by top-down predictive signals emerging from multi-layered and hierarchically-organized generative models of the causes of sensory signals (Lee & Mumford 2003). These models are continually refined by mismatches (prediction errors) between predicted signals and actual signals across hierarchical levels, which iteratively update predictive models via approximations to Bayesian inference (see Figure 1). This means that the brain can induce accurate generative models of environmental hidden causes by operating only on signals to which it has direct access: predictions and prediction errors. It also means that even low-level perceptual content is determined via cascades of predictions flowing from very general abstract expectations, which constrain successively more fine-grained predictions.

Two further aspects of PP need to be emphasized from the outset. First, sensory prediction errors can be minimized either “passively”, by changing predictive models to fit incoming data (perceptual inference), or “actively”, by performing actions to confirm or test sensory predictions (active inference). In most cases these processes are assumed to unfold continuously and simultaneously, underlining a deep continuity between perception and action (Friston et al. 2010; Verschure et al. 2003). This process of active inference will play a key role in much of what follows. Second, predictions and prediction errors in a Bayesian framework have associated precisions (inverse variances, Figure 1). The precision of a prediction error is an indicator of its reliability, and hence can be used to determine its influence in updating top-down predictive models. Precisions, like mean values, are not given but must be inferred on the basis of top-down models and incoming data; so PP requires that agents have expectations about precisions that are themselves updated as new data arrive (and new precisions can be estimated). Precision expectations can therefore balance the influence of different prediction-error sources on the updating of predictive models. And if prediction errors have low (expected) precision, predictive models may overwhelm error signals (hallucination) or elicit actions that confirm sensory predictions (active inference).

A picture emerges in which cortical networks engage in recurrent interactions whereby bottom-up prediction errors are continuously reconciled with top-down predictions at multiple hierarchical levels, a process modulated at all times by precision weighting. The result is a brain that not only encodes information about the sources of signals that impinge upon its sensory surfaces, but that also encodes information about how its own actions interact with these sources in specifying sensory signals. Perception involves updating the parameters of the model to fit the data; action involves changing sensory data to fit (or test) the model; and attention corresponds to optimizing model updating by giving preference to sensory data that are expected to carry more information, which is called precision weighting (Hohwy 2013). This view of the brain is shamelessly model-based and representational (though with a finessed notion of representation), yet it also deeply embeds the close coupling of perception and action and, as we will see, the importance of the body in the mediation of this interaction.

2.2 Predictive processing and the free energy principle

PP can be considered a special case of the free energy principle, according to which perceptual inference and action emerge as a consequence of a more fundamental imperative towards the avoidance of “surprising” events (Friston 2005, 2009, 2010). On the free energy principle, or-

ganisms, by dint of their continued survival, must minimize the long-run average surprise of sensory states, since surprising sensory states are likely to reflect conditions incompatible with continued existence (think of a fish out of water). “Surprise” is not used here in the psychological sense, but in an information-theoretic sense, as the negative log probability of an event’s occurrence (roughly, the unlikeliness of the occurrence of an event).

The connection with PP arises because agents cannot directly evaluate the (information-theoretic) surprise associated with an event, since this would require, impossibly, the agent to average over all possible occurrences of the event in all possible situations. Instead, the agent can only maintain an upper limit on surprise by minimizing the difference between actual sensory signals and those signals predicted according to a generative or predictive model. This difference is free energy, which, under fairly general assumptions, is the long-run sum of prediction error.

An attractive feature of the free energy principle is that it brings to the table a rich mathematical framework that shows how PP can work in practice. Formally, PP depends on established principles of Bayesian inference and model specification, whereby the most likely causes of observed data (posterior) are estimated based on optimally combining prior expectations of these causes with observed data, by using a (generative, predictive) model of the data that would be observed given a particular set of causes (likelihood). (See Figure 1 for an example of priors and posteriors.) In practice, because optimal Bayesian inference is usually intractable, a variety of approximate methods can be applied (Hinton & Dayan 1996; Neal & Hinton 1998). Friston’s framework appeals to previously worked-out “variational” methods, which take advantage of certain approximations (e.g., Gaussianity, independence of temporal scales), thus allowing a potentially neat mapping onto neurobiological quantities (Friston et al. 2006).1

The free energy principle also emphasizes action as a means of prediction error minimization, this being active inference. In general, active inference involves the selective sampling of sensory signals so as to minimize uncertainty in perceptual hypotheses (minimizing the entropy of the posterior). In one sense this means that actions are selected to provide evidence compatible with current perceptual predictions. This is the most standard interpretation of the concept, since it corresponds most directly to minimization of prediction error (Friston 2009). However, as we will see, actions can also be selected on the basis of an attempt to find evidence going against current hypotheses, and/or to efficiently disambiguate between competing hypotheses. These finessed senses of active inference represent developments of the free energy framework. Importantly, action itself can be thought of as being brought about by the minimization of proprioceptive prediction errors via the engagement of classical reflex arcs (Adams et al. 2013; Friston et al. 2010). This requires transiently low precision-weighting of these errors (or else predictions would simply be updated instead), which is compatible with evidence showing sensory attenuation during self-generated movements (Brown et al. 2013).

A more controversial aspect of the free energy principle is its claimed generality (Hohwy this collection). At least as described by Friston, it claims to account for adaptation at almost any granularity of time and space, from macroscopic trends in evolution, through development and maturation, to signalling in neuronal hierarchies (Friston 2010). However, in some of these interpretations reliance on predictive modelling is only implicit; for example the body of a fish can be considered to be an implicit model of the fluid dynamics and other affordances of its watery environment (see section 2.3). I am not concerned here with these broader interpretations, but will focus on those cases in which biological (neural) mechanisms plausibly implement explicit predictive inference via approximations to Bayesian computations, namely, the Bayesian brain (Knill & Pouget 2004; Pouget et al. 2013). Here, the free energy principle has potentially the greatest explanat-

1 Some challenging questions surface here as to whether prediction errors are used to update priors, which corresponds to standard Bayesian inference, or whether they are used to update the underlying generative/predictive model, which corresponds to learning.
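
To make the relation between surprise and free energy in section 2.2 concrete, here is a minimal Python sketch for a toy model with two hidden causes of one observed signal. It assumes the standard variational definition of free energy, F(q) = Σ q(c) [ln q(c) − ln p(s, c)], which this section does not spell out; the probabilities and the names `surprise`, `free_energy`, and `exact_posterior` are hypothetical.

```python
import math

# Hypothetical generative model: two hidden causes of one observed signal s.
prior = {"cause_a": 0.8, "cause_b": 0.2}        # p(cause)
likelihood = {"cause_a": 0.1, "cause_b": 0.9}   # p(s | cause)


def surprise(prior, likelihood):
    """Surprise of the signal: -ln p(s), with p(s) summed over all causes."""
    evidence = sum(prior[c] * likelihood[c] for c in prior)
    return -math.log(evidence)


def free_energy(belief, prior, likelihood):
    """Variational free energy of a belief q(cause) about the hidden cause."""
    return sum(
        q * (math.log(q) - math.log(prior[c] * likelihood[c]))
        for c, q in belief.items()
        if q > 0.0
    )


def exact_posterior(prior, likelihood):
    """Bayes' rule: p(cause | s)."""
    evidence = sum(prior[c] * likelihood[c] for c in prior)
    return {c: prior[c] * likelihood[c] / evidence for c in prior}


s_surprise = surprise(prior, likelihood)
# Belief equal to the prior, i.e. before the signal has been taken into account.
f_initial = free_energy(prior, prior, likelihood)
# Belief equal to the exact posterior, i.e. after ideal Bayesian updating.
f_optimal = free_energy(exact_posterior(prior, likelihood), prior, likelihood)

print(f"surprise            = {s_surprise:.3f}")
print(f"free energy (prior) = {f_initial:.3f}")
print(f"free energy (post.) = {f_optimal:.3f}")
```

In this toy case, free energy never falls below surprise: it starts above it and reaches it only when the belief equals the exact posterior. An agent that can evaluate free energy but not surprise can therefore keep surprise in check by revising its beliefs.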

ory power, especially given the convergence of empirical evidence (see Clark 2013 and Hohwy 2013 for reviews) and computational modelling showing how cortical microcircuits might implement approximate Bayesian inference (Bastos et al. 2012).

2.3 Predictive processing, free energy, and cybernetics

Typically, the origins of PP are traced to the work of the 19th-century physiologist Hermann von Helmholtz, who first formalized the idea of perception as inference. However, the Helmholtzian view is rather passive, inasmuch as there is little discussion of active inference or behaviour. The close coupling of perception and action emphasized in the free energy principle points instead to a deep connection between PP and mid-twentieth-century cybernetics. This is most obvious in the works of W. Ross Ashby (Ashby 1952; 1956; Conant & Ashby 1970) but is also evident more generally (Dupuy 2009; Pickering 2010). Importantly, cybernetics adopted as its central focus the prediction and control of behaviour in so-called teleological or purposeful machines.2 More precisely, cybernetic theorists were (are) interested in systems that appear to have goals (i.e., teleological) and that participate in circular causal chains (i.e., involving feedback) coupling goal-directed sensation and action.

Two key insights from the first wave of cybernetics usefully anticipate the core developments of PP within cognitive science. These are both associated with Ashby, a key figure in the movement and often considered its leader, at least outside the USA (Figure 2).

The first insight consists in an emphasis on the homeostasis of internal essential variables, which, in physiological settings, correspond to quantities like blood pressure, heart rate, blood sugar levels, and the like. In Ashby’s framework, when essential variables move beyond specific viability limits, adaptive processes are triggered that re-parameterize the system until it reaches a new equilibrium in which homeostasis is restored (Ashby 1952). Such systems are, in Ashby’s terminology, ultrastable, since they embody (at least) two levels of feedback: a first-order feedback that homeostatically regulates essential variables (like a thermostat) and a second-order feedback that allostatically3 re-organises a system’s input–output relations when first-order feedback fails, until a new homeostatic regime is attained. In the most basic case, as implemented in Ashby’s famous “homeostat” (Figure 2), this second-order feedback simply involves random changes to system

2 This underlines the close links between cybernetics and behaviourism. Perhaps this explains why cybernetics was so reluctant to bring phenomenology into its remit, an exclusion which, looking back, seems like a missed opportunity.

3 Allostasis: the process of achieving homeostasis.

Figure 2: A. W. Ross Ashby, British psychiatrist and pioneer of cybernetics (1903–1972). B. A schematic of ultrastability, based on Ashby’s notebooks. The system R homeostatically maintains its essential variables (EVs) within viability limits via first-order feedback with the environment E. When first-order feedback fails, so that EVs run out-of-bounds, second order “ultrastable” feedback is triggered so that S (an internal controller, potentially model-based) changes the parameters of R governing the first-order feedback. S continually changes R until homeostatic relations are regained, leaving the EVs again within bounds. C. Ashby’s “homeostat”, consisting of four interconnected ultrastable systems, forming a so-called “multistable” system. D. One ultrastable unit from the homeostat. Each unit had a trough of water with an electric field gradient and a metal needle. Instability was represented by the non-central needle positions, which on occurring would alter the resistances connecting the units via discharge through capacitors. For more details see Ashby (1952) and Pickering (2010).
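
The two levels of feedback that make up ultrastability (first-order regulation of an essential variable, plus second-order random re-parameterization when that regulation fails) can be sketched in a few lines of Python. This is a toy under stated assumptions, not a reconstruction of Ashby's electromechanical homeostat: the linear update rule, the `gain` parameter, the clamping of the essential variable at the viability limit, and all numbers are hypothetical.

```python
import random


def run_ultrastable(steps=200, limit=1.0, seed=0):
    """Toy ultrastable unit with one essential variable x.

    First-order feedback: x <- x - gain * x, which regulates x towards 0
    only if the current gain happens to be stabilizing (0 < gain < 2).
    Second-order feedback: whenever |x| leaves the viability limit, the
    gain is redrawn at random, as in the basic homeostat.
    """
    rng = random.Random(seed)
    gain = -0.5          # assumed starting parameter, deliberately destabilizing
    x = 0.5              # essential variable, initially within bounds
    resets = 0
    history = []
    for _ in range(steps):
        # First-order feedback loop with the current parameterization.
        x = x - gain * x
        if abs(x) > limit:
            # Second-order feedback: random re-parameterization.
            gain = rng.uniform(-1.0, 1.0)
            resets += 1
            # Modelling simplification: the variable is held at the limit,
            # much as a needle would rest against an end stop.
            x = max(-limit, min(limit, x))
        history.append(x)
    return gain, resets, history


final_gain, resets, history = run_ultrastable()
print(f"final gain = {final_gain:.3f} after {resets} re-parameterization(s)")
```

The search is blind: nothing in the second-order loop models why the previous gain failed. It stops changing the system only once a parameterization is found under which the first-order loop keeps the essential variable within bounds.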


parameters until a new stable regime is reached. The importance of this insight for PP is that it locates the function of biological and cognitive processes in generalizing homeostasis to ensure that internal essential variables remain within expected ranges.

Another way to summarize the fundamental cybernetic principle is to say that adaptive systems ensure their continued existence by successfully responding to environmental perturbations so as to maintain their internal organization. This leads to the second insight, evident in Ashby’s law of requisite variety. This states that a successful control system must be capable of entering at least as many states as the system being controlled: “only variety can force down variety” (Ashby 1956). This induces a functional boundary between controller and environment and implies a minimum level of complexity for a successful controller, which is determined by the causal complexity of the environmental states that constitute potential perturbations to a system’s essential variables. This view was refined some years later, in a 1970 paper written with Roger Conant entitled “Every good regulator of a system must be a model of that system” (Conant & Ashby 1970). This paper builds on the law of requisite variety by arguing (and attempting to formally show) that any controller capable of suppressing perturbations imposed by an external system (e.g., the world) must instantiate a model of that system. This provides a clear connection with the free energy principle, which proposes that adaptive systems minimize free energy (a limit on long-run average surprise) by inducing and refining a generative model of the causes of sensory signals. It also moves beyond Ashby’s homeostat by implying that model-based controllers can engage in more successful multi-level feedback than is possible by random variation of higher-order parameters.

Putting these insights together provides a distinctive way of seeing the relevance of PP to cognition and biological adaptation. It can be summarized as follows. The purpose of cognition (including perception and action) is to maintain the homeostasis of essential variables and of internal organization (ultrastability). This implies the existence of a control mechanism with sufficient complexity to respond to (i.e., suppress) the variety of perturbations it encounters (law of requisite variety). Further, this structure must instantiate a model of the system to be controlled (good regulator theorem), where the system includes both the body and the environment (and their interactions). As Ashby himself tells us, “[t]he whole function of the brain can be summed up in: error correction” (quoted in Clark 2013, p. 1). Put this way, perception emerges as a consequence of a more fundamental imperative towards organizational homeostasis, and not as a stage in some process of internal world-model construction. This view, while highlighting different origins, closely parallels the assumptions of the free energy principle in proposing a primary imperative towards the continued survival of the organism (Friston 2010).

It may be surprising to consider the legacy of cybernetics in this light. This is because many previous discussions of this legacy focus on examples which show that complex, apparently goal-directed behaviour can emerge from simple mechanisms interacting with structured bodies and environments (Beer 2003; Braitenberg 1984). On this more standard development, cybernetics challenges rather than asserts the need for internal models and representations: it is often taken to justify slogans of the sort “the world is its own best model” (Brooks 1991). In fact, cybernetics is agnostic with respect to the need for deployment of explicit internally-specified predictive models. If environmental circumstances are reasonably stable, and mappings between perturbations and (homeostatic) responses reasonably straightforward, then the good regulator theorem can be satisfied by controllers that only implicitly model their environments. This is the case, for instance, in the Watt governor: a device that is able exquisitely to control the output of (for instance) a steam engine, in virtue of its mechanism, and not through the deployment of explicit predictive models or representations (see Figure 3 and Van Gelder 1995; note that the governor can

be described as an implicit model since it has variables – e.g., eccentricity of the metal balls from the central column – which map onto environmental variables that affect the homeostatic target – engine output). However, where there exist many-to-many mappings between sensory states and their probable causes, as may be the case more often than not, it will pay to engage explicit inferential processes in order to extract the most probable causes of sensory states, insofar as these causes threaten the homeostasis of essential variables.

In summary, rather than seeing PP as originating solely in the Helmholtzian notion of “perception as inference”, it is fruitful to see it also as part of a process of model-based predictive control entailed by a fundamental imperative towards internal homeostasis. This shift in perspective reveals a distinctive agenda for PP in cognitive science, to which I shall now turn.

3 Interoceptive inference, emotion, and predictive selfhood

3.1 Interoceptive inference and emotion

Considering the cybernetic roots of PP, together with the free energy principle, leads to a potentially counterintuitive idea. This is that PP may apply more naturally to interoception (the sense of the internal physiological condition of the body) than to exteroception (the classic senses, which carry signals that originate in the external environment). This is because for an organism it is more important to avoid encountering unexpected interoceptive states than to avoid encountering unexpected exteroceptive states. A level of blood oxygenation or blood sugar that is unexpected is likely to be bad news for an organism, whereas unexpected exteroceptive sensations (like novel visual inputs) are less likely to be harmful and may in some cases be desirable, as organisms navigate a delicate balance between exploration and exploitation (Seth 2014a), testing current perceptual hypotheses through active inference (see section 5, below), all ultimately in the service of maintaining organismic homeostasis.

Perhaps because of its roots in Helmholtz, PP has largely been developed in the setting of visual neuroscience (Rao & Ballard 1999), with a related but somewhat independent line in motor control (Wolpert & Ghahramani 2000). Recently, an explicit application of PP to interoception has been developed (Seth 2013; Seth & Critchley 2013; Seth et al. 2011; see also Gu et al. 2013). On this theory of interoceptive inference (or equivalently interoceptive predictive coding), emotional states (i.e., subjective feeling states) arise from top-down predictive inference of the causes of interoceptive sensory signals (see Figure 4). In direct analogy to exteroceptive PP, emotional content is constitutively specified by the content of top-down interoceptive predictions at a given time, marking a distinction with the well-studied impact of expectations on subsequent emotional states (see e.g., Ploghaus et al. 1999; Ueda et al. 2003). Furthermore, interoceptive prediction errors can

Figure 3: The Watt governor. This system, a central contributor to the industrial revolution, enabled precise control over the output of (for example) steam engines. As the speed of the engine increases, power is supplied to the governor (A) by a belt or chain, causing it to rotate more rapidly so that the metal balls have more kinetic energy. This causes the balls to rise (B), which closes the throttle valve (C), thereby reducing the steam flow, which in turn reduces engine speed (D). The opposite happens when the engine speed decreases, so that the governor maintains engine speed at a precise equilibrium.
Seth, A. K. (2015). The Cybernetic Bayesian Brain - From Interoceptive Inference to Sensorimotor Contingencies.
In T. Metzinger & J. M. Windt (Eds). Open MIND: 35(T). Frankfurt am Main: MIND Group. doi: 10.15502/9783958570108 9 | 24
www.open-mind.net

be minimized by (i) updating predictive models (perception, corresponding to new emotional contents); (ii) changing interoceptive signals through engaging autonomic reflexes (autonomic control or active inference); or (iii) performing behaviour so as to alter external conditions that impact on internal homeostasis (allostasis; Gu & Fitzgerald 2014; Seth et al. 2011).

Figure 4: Inference and perception. Green arrows represent exteroceptive predictions and prediction errors underpinning perceptual content, such as the visual experience of a tomato. Orange arrows represent proprioceptive predictions (and prediction errors) underlying action and the experience of body ownership. Blue arrows represent interoceptive predictions (and prediction errors) underlying emotion, mood, and autonomic regulation. Hierarchically higher levels will deploy multimodal and even amodal predictive models spanning these domains, which are capable of generating multimodal predictions of afferent signals.

Consider an example in which blood sugar levels (an essential variable) fall towards or beyond viability thresholds, reaching unexpected and undesirable values (Gu & Fitzgerald 2014; Seth et al. 2011). Under interoceptive inference, the following responses ensue. First, interoceptive prediction error signals update top-down expectations, leading to subjective experiences of hunger or thirst (for sugary things). Because these feeling states are themselves surprising (and non-viable) in the long run, they signal prediction errors at hierarchically-higher levels, where predictive models integrate multimodal interoceptive and exteroceptive signals. These models instantiate predictions of temporal sequences of matched exteroceptive and interoceptive inputs, which flow down through the hierarchy. The resulting cascade of prediction errors can then be resolved either through autonomic control, in order to metabolize bodily fat stores (active inference), or through allostatic actions involving the external environment (i.e., finding and eating sugary things).

The sequencing and balance of these events is governed by relative precisions and their expectations. Initially, interoceptive prediction errors have high precision (weighting) given a higher-level expectation of stable homeostasis. Whether the resulting high-level prediction error engages autonomic control or allostatic behaviour (or both) depends on the precision weighting of the corresponding prediction errors. If food is readily available, consummatory actions lead to food intake (as described earlier, these actions are generated by the resolution of proprioceptive prediction errors). If not, autonomic reflexes initiate the metabolization of bodily fat stores, perhaps alongside appetitive behaviours that are predicted to lead to the availability of food, conditioned on performing these behaviours.4

4 It is interesting to consider possible dysfunctions in this process. For example, if high-level predictions about the persistence of low blood sugar become abnormally strong (i.e., low blood sugar becomes chronically expected), allostatic food-seeking behaviours may not occur. This process, akin to the transition from hallucination to delusion in perceptual inference (Fletcher & Frith 2009), may help understand eating disorders in terms of dysfunctional signalling of satiety.

3.2 Implications of interoceptive inference

Several interesting implications arise when considering emotion as resulting from interoceptive inference (Seth 2013). First, the theory generalizes previous “two factor” theories of emotion that see emotional content as resulting from an interaction between the perception of physiological changes (James 1894) and “higher-level” cognitive appraisal of the context within which these changes occur (Schachter & Singer 1962). Instead of distinguishing “physiological” and “cognitive” levels of description, interoceptive inference sees emotional content as resulting from the multi-layered prediction of interoceptive input spanning many levels of abstraction. Thus, interoceptive inference integrates cognition and emotion within the powerful setting of PP.

The theory also connects with influential frameworks that link interoception with decision making, notably the “somatic marker hypothesis” proposed by Antonio Damasio (1994). According to the somatic marker hypothesis, intuitive decisions are shaped by interoceptive responses (somatic markers) to potential outcomes. This idea, when placed in the context of interoceptive inference, corresponds to the guidance of behavioural (allostatic) responses towards the resolution of interoceptive prediction error (Gu & Fitzgerald 2014; Seth 2014a). It follows that intuitive decisions should be affected by the degree to which an individual maintains accurate predictive models of his or her own interoceptive states; see Dunn et al. 2010, Sokol-Hessner et al. 2014 for evidence along these lines.

There are also important implications for disorders of emotion, selfhood, and decision-making. For example, anxiety may result from the chronic persistence of interoceptive prediction errors that resist top-down suppression (Paulus & Stein 2006). Dissociative disorders like alexithymia (the inability to describe one’s own emotions), and depersonalization and derealisation (the loss of sense of reality of the self and world) may also result from dysfunctional interoceptive inference, perhaps manifest in abnormally low interoceptive precision expectations (Seth 2013; Seth et al. 2011). In terms of decision-making, it may be productive to think of addiction as resulting from dysfunctional active inference, whereby strong interoceptive priors are confirmed through action, overriding higher-order or hyper-priors relating to homeostasis and organismic integrity. It has even been suggested that autism spectrum disorders may originate in aberrant encoding of the salience or precision of interoceptive prediction errors (Quattrocki & Friston 2014). The reasoning here is that aberrant salience during development could disrupt the assimilation of interoceptive and exteroceptive cues within generative models of the “self”, which would impair a child’s ability to properly assign salience to socially relevant signals.

3.3 The predictive embodied self

The maintenance of physiological homeostasis solely through direct autonomic regulation is obviously limited: behavioural (allostatic) interactions with the world are necessary if the organism is to avoid surprising physiological states in the long run. The ability to deploy adaptive behavioural responses mandates the original Helmholtzian view of perception-as-inference, which has been the primary setting for the development of PP so far. A critical but arguably overlooked middle ground, which mediates between physiological state variables and the external environment, is the body. On one hand, the body is the material vehicle through which behaviour is expressed, permitting allostatic interactions to take place. On the other, the body is itself an essential part of the organismic system, the homeostatic integrity of which must be maintained. In addition, the experience of owning and identifying with a particular body is a key component of being a conscious self (Apps & Tsakiris 2014; Blanke & Metzinger 2009; Craig 2009; Limanowski & Blankenburg 2013; Seth 2013).

It is tempting to ask whether common predictive mechanisms could underlie not only classical exteroceptive perception (like vision) and interoception (see above), but also their integration in supporting conscious and unconscious representations of the body and self (Seth 2013). The significance of this question is underlined by realising that just as the brain has no direct access to causal structures in the external environment, it also lacks direct access to its own body. That is, given that the brain is in the business of inferring the causal sources of
sensory signals, a key challenge emerges when distinguishing those signals that pertain to the body from those that originate from the external environment. A clue to how this challenge is met is that the physical body, unlike the external environment, constantly generates and receives internal input via its interoceptive and proprioceptive systems (Limanowski & Blankenburg 2013; Metzinger 2003). This suggests that the experienced body (and self) depends on the brain’s best guess of the causes of those sensory signals most likely to be “me” (Apps & Tsakiris 2014), across interoceptive, proprioceptive, and exteroceptive domains (Figure 4).

Figure 5: The interaction of interoceptive and exteroceptive signals in shaping the experience of body ownership. A. Set-up for applying cardio-visual feedback in the rubber hand illusion. A Microsoft Kinect obtains a real-time 3D model of a subject’s left hand. This is re-projected into the subject’s visual field using a head-mounted display and augmented reality (AR) software. B. The colour of the virtual hand is modulated by the subject’s heart-beat. C. A similar set-up for the full-body illusion whereby a visual image of a subject’s body is surrounded by a halo pulsing either in time or out of time with the heartbeat. Panels A and B are adapted from Suzuki et al. (2013); panel C is adapted from Aspell et al. (2013).

There is now considerable evidence that the experience of body ownership is highly plastic and depends on the multisensory integration of body-related signals (Apps & Tsakiris 2014; Blanke & Metzinger 2009). One classic example is the rubber hand illusion, where the stroking of an artificial hand synchronously with a participant’s real hand, while visual attention is focused on the artificial hand, leads to the experience that the artificial hand is somehow part of the body (Botvinick & Cohen 1998). According to current multisensory integration models, this change in the experience of body ownership is due to correlation between vision and touch overriding conflicting proprioceptive inputs (Makin et al. 2008). Through the lens of PP, this implies that prediction errors induced by multisensory conflicts will over time update self-related priors (Apps & Tsakiris 2014), with different signal sources (vision, touch, proprioception) each precision-weighted according to their expected reliability, and all in
the setting of strong prior expectations for correlated input.5

5 Interestingly the expectation of perceptual correlations seems to be sufficient for inducing the rubber hand illusion (Ferri et al. 2013).

While the potential for exteroceptive multisensory integration to modulate the experience of body ownership has been extensively explored both for the ownership of body parts and for the experience of ownership of the body as a whole (for reviews, see Apps & Tsakiris 2014; Blanke & Metzinger 2009), only recently has attention been paid to interactions between interoceptive and exteroceptive signals. Initial evidence in this line of investigation was indirect, for example showing correlation between susceptibility to the rubber hand illusion and individual differences in the ability to perceive interoceptive signals (“interoceptive sensitivity”, typically indexed by heartbeat detection tasks; Tsakiris et al. 2011). Other relevant studies have shown that body ownership illusions lead to temperature reductions in the corresponding body parts, perhaps reflecting altered active autonomic inference (Moseley et al. 2008; Salomon et al. 2013).

Emerging evidence now points more directly towards the predictive multisensory integration of interoceptive and exteroceptive signals in shaping the experience of body ownership. Two recent studies have taken advantage of so-called “cardio-visual synchrony” where virtual-reality representations of body parts (Suzuki et al. 2013) or the whole body (Aspell et al. 2013) are modulated by simultaneously recorded heartbeat signals, with the modulation either in-time or out-of-time with the actual heartbeat (Figure 5). These data suggest that statistical correlations between interoceptive (e.g., cardiac) and exteroceptive (e.g., visual) signals can lead to the updating of predictive models of self-related signals through (hierarchical) minimization of prediction error, just as happens for purely exteroceptive multisensory conflicts in the classic rubber hand illusion.

While these studies underline the plausibility of common predictive mechanisms underlying emotion, selfhood, and perception, many open questions nevertheless remain. A key challenge is to detail the underlying neural operations. Though a detailed analysis is beyond the scope of the present paper, it is worth noting that attention is increasingly focused on the insular cortex (especially its anterior parts) as a potential source of interoceptive predictions, and also as a comparator registering interoceptive prediction errors. The anterior insula has long been considered a major cortical locus for the integration of interoceptive and exteroceptive signals (Craig 2003; Singer et al. 2009); it is strongly implicated in interoceptive sensitivity (Critchley et al. 2004); it is sensitive to interoceptive prediction errors, at least in some contexts (Paulus & Stein 2006); and it has a high density of so-called “von Economo” neurons,6 which have been frequently though circumstantially associated with consciousness and selfhood (Critchley & Seth 2012; Evrard et al. 2012).

6 These are long-range projection neurons found selectively in hominid primates and certain other species.

3.4 Active inference, self-modeling, and evolutionary robotics

What role might active inference play in predictive self-modelling? Autonomic changes during illusions of body ownership (see above) are consistent with active inference; however they do not speak directly to its function. In the classic rubber hand illusion, hand or finger movements can be considered active inferential tests of self-related hypotheses. If these movements are not reflected in the “rubber hand”, the illusion is destroyed, presumably because predicted visual signals are not confirmed (Apps & Tsakiris 2014). However, if hand movements are mapped to a virtual “rubber hand” (through clever use of virtual and augmented reality), the illusion is in fact strengthened, presumably because the multisensory correlation of peri-hand visual and proprioceptive signals constitutes a more stringent test of the perceptual hypothesis of ownership of the virtual hand (Suzuki et al. 2013). This introduces the idea that active inference is not simply about confirming sensory predictions but also involves seeking “disruptive” actions that are most informative with respect to testing current predictions,
and/or at disambiguating competing predictions (Gregory 1980). A nice example of how this happens in practice comes from evolutionary robotics,7 which is obviously a very different literature, though one that inherits directly from the cybernetic tradition.

7 Evolutionary robotics involves the use of population-based search procedures (genetic algorithms) to automatically specify control architectures (and/or morphologies) of mobile robots. For an excellent introduction see Bongard (2013).

In a seminal 2006 study, Josh Bongard and colleagues described a four-legged “starfish” robot that engaged in a process much like active inference in order to model its own morphology so as to be able to control its movement and attain simple behavioural goals (Bongard et al. 2006). While there are important differences between evolutionary robotics and (active) Bayesian inference, there are also broad similarities; importantly, both can be cast in terms of model selection and optimization.

The basic cycle of events is shown in Figure 6. The robot itself is shown in the centre (A). The goal is to develop a controller capable of generating forward movement. The challenge is that the robot’s morphology is unknown to the robot itself. The system starts with a range of (generic prior) potential self-models (B), here specified by various configurations of three-dimensional physics engines. The robot performs a series of initially random actions and evaluates its candidate self-models on their ability to predict the resulting proprioceptive afferent signals. Even though all initial models will be wrong, some may be better than others. The key step comes next. The robot evaluates new candidate actions on the extent to which the current best self-models make different predictions as to their (proprioceptive) consequences. These disambiguating actions are then performed, leading to a new ranking of self-models based on their success at proprioceptive prediction. This ranking, via the evolutionary robotics methods of mutation and replication, gives rise to a new population of candidate self-models. The upshot is that the system swiftly develops accurate self-models that can be used to generate controllers enabling movement (D). An interesting feature of this process is that it is highly resilient to unexpected perturbations. For instance, if a leg is removed then proprioceptive prediction errors will immediately ensue. As a result, the system will engage in another round of self-model evolution (including the co-specification of competing self-models and disambiguating actions) until a new, accurate, self-model is regained. This revised self-model can then be used to develop a new gait, allowing movement, even given the disrupted body (E, F).8

8 Videos showing the evolution of both gait and self-model are available from http://creativemachines.cornell.edu/emergent_self_models

This study emphasizes that the operational criterion for a successful self-model is not so much its fidelity to the physical robot, but rather its ability to predict sensory inputs under a repertoire of actions. This underlines that predictive models are recruited for the control of behaviour (as cybernetics assumes) and not to furnish general-purpose representations of the world or the body.

The study also provides a concrete example of how actions can be performed, not to achieve some externally specified goal, but to permit inference about the system’s own physical instantiation. Bayesian or not, this implies active inference. Indeed, perhaps its most important contribution is that it highlights how active inference can prescribe disruptive or disambiguating actions that generate sensory prediction errors under competing hypotheses, and not just actions that seek to confirm sensory predictions. This recalls models of attention based on maximisation of Bayesian surprise (Itti & Baldi 2009), and is equivalent to hypothesis testing in science, where the best experiments are those concocted on the basis of being most likely to falsify a given hypothesis (disruptive) or distinguish between competing hypotheses (disambiguating). It also implies that agents encode predictions about the likely sensory consequences of a range of potential actions, allowing the selection of those actions likely to be the most disruptive or disambiguating. This concept of a counterfactually-equipped predictive model brings us nicely to our next topic: so-called enactive cognitive science and its relation to PP.

Figure 6: An evolutionary-robotics experiment demonstrating continuous self-modelling (Bongard et al. 2006). See text for details. Reproduced with permission.

4 Predictive processing and enactive cognitive science

4.1 Enactive theories, weak and strong

The idea that the brain relies on internal representations or models of extra-cranial states of affairs has been treated with suspicion ever since the limitations of “good old fashioned artificial intelligence” became apparent (Brooks 1991). Many researchers of artificial intelligence have indeed returned to cybernetics as an alternative framework in which closely coupled feedback loops, leveraging invariants in brain-body-world interactions, obviate the need for detailed internal representations of external properties (Pfeifer & Scheier 1999). The evolutionary robotics methodology just described is
often coupled with simple dynamical neural networks in order to realize controllers that are tightly embodied and embedded in just this way (Beer 2003). Within cognitive science, such anti-representationalism is most vociferously defended by the movement variously known as “enactive” (Noë 2004), “embodied” (Gallese & Sinigaglia 2011), or “extended” (Clark & Chalmers 1998) cognitive science. Among these approaches, it is enactivism that is most explicitly anti-representationalist. While enactive theorists might agree that adaptive behaviour requires organisms and control structures that are systematically sensitive to statistical structures in their environment, most will deny that this sensitivity implies the existence and deployment of any “inner description” or model of these probabilistic patterns (Chemero 2009; Hutto & Myin 2013).

This tradition has weak and strong expressions. At the weak extreme is the truism that perception, cognition, and behaviour (and their underlying mechanisms) cannot be understood without a rich appreciation of the roles of the body, the environment, and the structured interactions that they support (Clark 1997; Varela et al. 1993). Weak enactivism is eminently compatible with PP, as seen especially with emerging versions of PP that stress embodiment through self-modelling and interoception, and which emphasize the importance of agent-environment coupling (embeddedness) through active inference. At the other extreme lie claims that explanations based on internal representations or models of any sort are fundamentally misguided, and that a new explicitly non-representational vocabulary is needed in order to make sense of the relations between brains, bodies, and the world (O’Regan et al. 2005). Strong enactivism is by definition incompatible with PP since it rejects the core concept of the internal model.

4.2 Sensorimotor contingency theory

A landmark in the strongly enactive approach is SMC (sensorimotor contingency) theory, which says that perception depends on the “practical mastery” of sensorimotor dependencies relevant to behaviour (O’Regan & Noë 2001). In brief, SMC theory claims that experience and perception are not things that are “generated” by the brain (or by anything else for that matter) but are, rather, “skills” consisting of fluid patterns of on-going interaction with the environment (O’Regan & Noë 2001). For instance, on SMC theory the conscious visual experience of redness is given by the exercise of practical mastery of the laws governing how interactions with red things unfold (these laws being the “SMC”s). The theory is not, however, limited to vision: the experiential quality of the softness of a sponge would be given by (practical mastery of) the laws governing its squishiness upon being pressed.

Two aspects of SMC theory deserve emphasis here. The first is that the concept of an SMC rightly underlines the close coupling of perception and action and the critical importance of ongoing agent-environment interaction in structuring perception, action, and behaviour. This is inherited from Gibsonian notions of perceptual affordance (Gibson 1979) and has certainly advanced our understanding of why different kinds of perceptual experience (vision, smell, touch, etc.) have different qualitative characters.

The second is that mastery of an SMC requires an essentially counterfactual knowledge of relations between particular actions and the resulting sensations. In vision, for instance, mastery entails an implicit knowledge of the ways in which moving our eyes and bodies would reveal additional sensory information about perceptual objects (O’Regan & Noë 2001). Here SMC theory has made an important contribution to our understanding of perceptual presence. Perceptual presence refers to the property whereby (in normal circumstances) perceptual contents appear as subjectively real, that is, as existing. For example, when viewing a tomato, we see it as real inasmuch as we seem to be perceptually aware of some of its parts (e.g., its back) that are not currently causally impacting our sensory surfaces. Looking at a picture of a tomato does not give rise to the same subjective impression of realness. But how can we be aware of parts of the tomato that, strictly speaking, we do not
see? SMC theory says the answer lies in our (implicit) mastery of SMCs, which relate potential actions to their likely sensory effects; and it is in this sense that we can be perceptually aware of parts of the tomato that we cannot actually see (Noë 2006).

SMC theory has often been set against naïve representationalist theories in cognitive science that propose such things as “pictures in the head” or that (like good-old-fashioned-AI) treat accurate representations of external properties as general-purpose goal states for cognition. This is all to the good. Yet by dispensing with implementation-level concepts such as predictive inference, it struggles with the important question of what exactly is going on in our heads during the exercise of mastery of a sensorimotor contingency.9

9 At a recent symposium of the AISB society that focused on SMC theory, it was stated that “the main question is how to get the brain into view from an enactive/sensorimotor perspective. […] Addressing this question is urgently needed, for there seem to be no accepted alternatives to representational interpretations of the inner processes” (O’Regan & Dagenaar 2014).

4.3 Predictive perception of sensorimotor contingencies

A powerful response is given by integrating SMC theory with PP, in the guise of PPSMC (Predictive Perception of SensoriMotor Contingencies; Seth 2014b). An extensive development of PPSMC is given elsewhere (see Seth 2014b plus commentaries and response). Here I summarize the main points. First, recall that under PP prediction errors can be minimized either by updating perceptual predictions or by performing actions, where actions are generated through the resolution of proprioceptive prediction errors. Also recall that PP is inherently hierarchical, so that at some hierarchical level predictive models will encode multimodal and even amodal expectations linking exteroceptive (sensory) and proprioceptive (motor) sensations. These models generate predictions about linked sequences of sensory and proprioceptive (and possibly interoceptive) inputs corresponding to specific actions, with predictions becoming increasingly modality-specific at lower hierarchical levels. These multi-level predictive models can therefore be understood as instantiating the implicit sub-personal knowledge of sensorimotor constructs underlying SMCs and their acquisition. Put simply, hierarchical active inference implies the existence of predictive models encoding information very much like that required by SMC theory.

The next step is to incorporate the notion of mastery of SMCs, which, as mentioned, implies an essentially counterfactual kind of implicit knowledge. The simple solution is to augment the predictive models that animate PP with counterfactual probability densities.10 As introduced earlier (section 3.4), counterfactually-equipped predictive models encode not only the likely causes of current sensory input, but also the likely causes of fictive sensory inputs conditioned on possible but not executed actions. That is, they encode how sensory inputs (and their expected precisions) would change on the basis of a repertoire of possible actions (expressed as proprioceptive predictions), even if those actions are not performed. The counterfactual encoding of expected precision is important here, since it is on this basis that actions can be selected for their likelihood of minimizing the conditional uncertainty associated with a perceptual prediction. There is a mathematical basis for manipulating counterfactual beliefs of this kind, as shown in a recent model where counterfactual PP drives oculomotor control during visual search (Friston 2014; Friston et al. 2012).11 Here the main point is that counterfactually-rich predictive models supply just what is needed by SMC theory: an answer to the question of what is going on inside our heads during the exercise of mastery of SMCs.

10 See Beaton (2013) for a distinct approach to incorporating counterfactual ideas in SMC theory. Beaton’s approach remains squarely within the strongly enactivist tradition.

11 There are also some challenges lying in wait here. For instance, it is not immediately clear how important assumptions like the Laplace approximation can generalize to the multimodal probability distributions entailed by counterfactual PP (Otworowska et al. 2014).

Counterfactual PP makes sense from several perspectives (Seth 2014b). As mentioned above, it provides a neurocognitive operationalisation of the notion of mastery of SMCs that is central to enactive cognitive science. In doing so it dissolves apparent tensions between enactive
Seth, A. K. (2015). The Cybernetic Bayesian Brain - From Interoceptive Inference to Sensorimotor Contingencies.
In T. Metzinger & J. M. Windt (Eds). Open MIND: 35(T). Frankfurt am Main: MIND Group. doi: 10.15502/9783958570108 17 | 24
www.open-mind.net

cognitive science and approaches grounded in the Bayesian brain, but only at the price of rejecting the strong enactivist’s insistence that internal models or representations, of any sort, are unacceptable.12 PPSMC also provides a solution to the challenge of accounting for perceptual presence within PP. The idea here is that perceptual presence corresponds to the counterfactual richness of predictive models. That is, perceptual contents enjoy presence to the extent that the corresponding predictive models encode a rich repertoire of counterfactual relations linking potential actions to their likely sensory consequences.13 In other words, we experience normal perception as world-revealing precisely because the predictive models underlying perceptual content specify a rich repertoire of counterfactually explicit probability densities encoding the mastery of SMCs.

12 There is a more dramatic conflict with “radical” versions of enactivism, in which mental processes, and in some cases even their material substrates, are allowed to extend beyond the confines of the skull (Hutto & Myin 2013).
13 Presence may also depend on the hierarchical depth of predictive models inasmuch as this reflects object-related invariances in perception. For further discussion see commentaries and response to Seth (2014b).

A good test of PPSMC is whether it can account for cases where normal perceptual presence is lacking. An important example is synaesthesia, where it is widely reported that synaesthetic “concurrents” (e.g., the inexistent colours sometimes perceived along with achromatic grapheme inducers) are not experienced as being part of the world (i.e., synaesthetes generally retain intact reality testing with respect to their concurrent experiences). PPSMC explains this by noticing that predictive models related to synaesthetic concurrents are counterfactually poor. The hidden (environmental) causes giving rise to concurrent-related sensory signals do not embed a rich and deep statistical structure for the brain to learn. In particular, there is very little sense in which synaesthetic concurrents depend on active sampling of their hidden causes. According to PPSMC, it is this comparative counterfactual poverty that explains why synaesthetic concurrents lack perceptual presence. SMC theory itself struggles to account for this phenomenon, not least because it struggles to account for synaesthesia in the first place (Gray 2003).

There are some challenges to thinking that perceptual presence uniquely depends on counterfactual richness. One might think that the more familiar one is with an object, the richer the repertoire of counterfactual relations that will be encoded. If so, the more familiar one is with an object, the more it should appear to be real. But prima facie it is not clear that familiarity and perceptual presence go hand-in-hand like this.14 Also, some perceptual experiences (like the experience of a blue sky) can seem highly perceptually present despite engaging an apparently poor repertoire of counterfactual relations linking sensory signals to possible actions. An initial response is to consider that presence might depend not on counterfactual richness per se, but on a “normalized” richness based on higher-order expectations of counterfactual richness (which would be low for the blue sky, for instance). These considerations also point to potentially important distinctions between perceived objecthood and perceived presence, a proper treatment of which moves beyond the scope of the present paper.

14 Thanks to my reviewers for raising this provocative point.

5 Active inference

5.1 Counterfactual PP and active inference

Active inference has appeared repeatedly as an important concept throughout this paper. Yet it is more difficult to grasp than the basics of PP, which involve passive predictive inference. This is partly because several senses of active inference can be distinguished, which have not previously been fully elaborated.

In general, active inference can be harnessed to drive action, or to improve perceptual predictions. In the former case, actions emerge from the minimization of proprioceptive prediction errors through engaging classical reflex arcs (Friston et al. 2010). This implies the existence of generative models that predict time-varying flows of proprioceptive inputs (rather than just end-points), and also the transient reduction of expected precision of proprioceptive prediction errors, corresponding to sensory attenuation (Brown et al. 2013).

In the latter case, actions are engaged in order to generate new sensory samples, with the aim of minimizing uncertainty in perceptual predictions. This can be achieved in several different ways, as is apparent by analogy with experimental design in scientific hypothesis testing. Actions can be selected that (i) are expected to confirm current perceptual hypotheses (Friston et al. 2012); (ii) are expected to disconfirm such hypotheses; or (iii) are expected to disambiguate between competing hypotheses (Bongard et al. 2006). A scientist may perform different experiments when attempting to find evidence against a current hypothesis than when trying to decide between different hypotheses. In just the same way, active inference may prescribe different sampling actions for these different objectives.

These distinctions underline that active inference implies counterfactual PP. In order for a brain to select those actions most likely to confirm, disconfirm, or decide between current predictive model(s), it is necessary to encode expected sensory inputs and precisions related to potential (but not executed) actions. This is evident in the example of oculomotor control described earlier (Friston et al. 2012). Here, saccades are guided on the basis of the expected precision of sensory prediction errors so as to minimize the uncertainty in current perceptual predictions. Note that this study retained the higher-order prior that only a single perceptual prediction exists at any one time, precluding active inference in its disambiguatory sense.

Several related ideas arise in connection with these new readings of active inference. Seeking disconfirmatory or disruptive evidence is closely related to maximizing Bayesian surprise (Itti & Baldi 2009). This also reminds us that the best statistical models are usually those that successfully account for the most variance with the fewest degrees of freedom (model parameters), not just those that result in low residual error per se. In addition, disambiguating competing hypotheses moves from Bayesian model selection and optimization to model comparison, where arbitration among competing models is mediated by trade-offs between accuracy and model complexity (Rosa et al. 2012).

The information-seeking (or “infotropic”15) role of active inference puts a different gloss on the free energy principle, which had been interpreted simply as minimization of prediction error. Rather, now the idea is that systems best ensure their long-run survival by inducing the most predictive model of the causes of sensory signals, and this requires disruptive and/or disambiguating active inference, in order to always put the current-best model to the test. This view helps dissolve worries about the so-called “dark room problem” (Friston et al. 2012), in which prediction error is minimized by predicting something simple (e.g., the absence of visual input) and then trivially confirming this prediction (e.g., by closing one’s eyes).16 Previous responses to this challenge have appealed to the idea of higher-order priors that are incompatible with trivial minimization of lower-level prediction errors: closing one’s eyes (or staying put in a dark room) is not expected to lead to homeostatic integrity on average and over time (Friston et al. 2012; Hohwy 2013). It is perhaps more elegant to consider that disruptive and disambiguatory active inferences imply exploratory sampling actions, independent of any higher-order priors about the dynamics of sensory signals per se. Further work is needed to see how cost functions reflecting infotropic active inference can be explicitly incorporated into PP and the free energy principle.

15 Chris Thornton came up with this term (personal communication).
16 The term “dark room problem” comes from the idea that a free-energy-minimizing (or surprise-avoiding) agent could minimize prediction error just by finding an environment that lacks sensory stimulation (a “dark room”) and staying there.

5.2 Active interoceptive inference and counterfactual PP

What can be said about counterfactual PP and active inference when applied to interoception? Is there a sense in which predictive models underlying emotion and mood encode counterfactual associations linking fictive interoceptive signals (and their likely causes) to autonomic or allostatic controls? And if so, what phenomenological dimensions of affective experience depend on these associations? While these remain open questions, we can at least sketch the territory.

We have seen that active inference in exteroception implies counterfactual processing, so that actions can be chosen according to their predicted effects in terms of (dis)confirming or disambiguating sensory predictions. The same argument applies to interoception. For active interoceptive inference to effectively disambiguate predictive models, or (dis)confirm interoceptive predictions, predictive models must be equipped with counterfactual associations relating to the likely effects of autonomic or (at higher hierarchical levels) allostatic controls. At least in this sense, interoceptive inference then also involves counterfactual expectations.

That said, there are likely to be substantial differences in how counterfactual active inference plays out in interoceptive settings. For instance, it may not be adaptive (in the long run) for organisms to continually attempt to disconfirm current interoceptive predictions, assuming these are compatible with homeostatic integrity. To put it colloquially, we do not want to drive our essential variables continually close to viability limits, just to check whether they are always capable of returning. This recalls our earlier point (section 4.1) that predictive control is more naturally applicable to interoception than exteroception, given the imperative of maintaining the homeostasis of essential variables. In addition, the causal structure of counterfactual associations encoded by interoceptive predictive models is undoubtedly very different from that in cases like vision. These differences may speak to the substantial phenomenological differences in the kind of perceptual presence associated with these distinct conscious contents (Seth et al. 2011).

6 Conclusion

This paper has surveyed predictive processing (PP) from the unusual viewpoint of cybernetic origins in active homeostatic control (Ashby 1952; Conant & Ashby 1970). This shifts the perspective from perceptual inference as furnishing representations of the external world for the consumption of general-purpose cognitive mechanisms, towards model-based predictive control as a primary survival imperative from which perception, action, and cognition ensue. This view is aligned with the free energy principle (Friston 2010); however, it attempts to account for specific cognitive and phenomenological properties, rather than for adaptive systems in general. Several implications follow from these considerations. Emotion becomes a process of active interoceptive inference (Seth 2013), a process that also recruits autonomic regulation and influences intuitive decision-making through behavioural allostasis. A common predictive principle underlying interoception and exteroception also provides an integrative view of the neurocognitive mechanisms underlying embodied selfhood, in particular the experience of body ownership (Apps & Tsakiris 2014; Limanowski & Blankenburg 2013; Suzuki et al. 2013). In this view, the experience of embodied selfhood is specified by the brain’s “best guess” of those signals most likely to be “me” across exteroceptive and interoceptive domains. From the perspective of cybernetics the embodied self is both that which needs to be homeostatically maintained and also the medium through which allostatic interactions are expressed.

A second influential line deriving from cybernetics sets PP within the broader context of model-based versus enactivist perspectives on cognitive science. On one hand, cybernetics has been cited in support of non-representational cognitive science in virtue of its showing how simple mechanisms can give rise to complex and apparently goal-directed behaviour by capitalizing on agent-environment interactions, mediated by the body (Pfeifer & Scheier 1999). On the other, the cybernetic legacy shows how PP can put mechanistic flesh on the philosophical bones of enactivism, but only by embracing a finessed form of representationalism (Seth 2014b). A key concept within enactive cognitive science is that of mastery of sensorimotor contingencies (SMCs). This concept is useful for understanding the qualitative character of distinct perceptual modalities, yet as expressed within enactivism it lacks a firm implementation basis. “Predictive Perception of SensoriMotor Contingencies” (PPSMC) addresses this challenge by proposing that SMCs are implemented by predictive models of sensorimotor relations, underpinned by the continuity between perception and action entailed by active inference. Mastery of sensorimotor contingencies depends on predictive models of counterfactual probability densities that specify the likely causes of sensory signals that would occur were specific actions taken. By relating PP to key concepts in enactivism, this theory is able to account for phenomenological features well treated by the latter, such as the experience of perceptual presence (and its absence in cases like synaesthesia).

Considering these issues leads to distinct readings of active inference, which at its most general implies the selective sampling of sensory signals to minimize uncertainty about perceptual predictions. At a finer grain, active inference can involve performing actions to confirm current predictions, to disconfirm current predictions, or to disambiguate competing predictions. These different senses rest on the concept of counterfactually-equipped predictive models; and they generalize the free energy principle to include Bayesian model comparison as well as optimization and inference.

In summary, the ideas outlined in this paper provide a distinctive integration of predictive processing, cybernetics, and enactivism. This rich blend dissolves apparent tensions between internalist and enactivist (model-based and model-free) views on the neural mechanisms underlying perception, cognition, and action; it elaborates common predictive mechanisms underlying perception and control of self and world; it provides a new view of emotion as active interoceptive inference, and it shows how “counterfactual” predictive processing can account for the phenomenology of conscious presence and its absence in specific situations. It also finesses the concept of active inference to engage distinct forms of hypothesis testing that prescribe different sampling actions (one bonus is that the “dark room problem” is elegantly solved). At the same time, new and difficult challenges arise in validating these ideas experimentally and in distinguishing them from alternative explanations that do not rely on internally-realised inferential mechanisms.

Acknowledgements

I am grateful to the Dr. Mortimer and Theresa Sackler Foundation, which supports the work of the Sackler Centre for Consciousness Science. This work was also supported by ERC FP7 grant CEEDs (FP7-ICT-2009-5, 258749). Many thanks to Thomas Metzinger and Jennifer Windt for inviting me to make this contribution, and for the insightful and helpful reviewer comments they solicited. I’m also grateful to Kevin O’Regan and Jan Dagenaar for inviting me to speak at a symposium entitled “Consciousness without inner models?” (London, April 2014), which provided a feisty forum for debating some of the ideas presented here.

References

Adams, R. A., Shipp, S. & Friston, K. J. (2013). Predictions not commands: Active inference in the motor system. Brain Structure and Function, 218 (3), 611-643. 10.1007/s00429-012-0475-5

Apps, M. A. & Tsakiris, M. (2014). The free-energy self: A predictive coding account of self-recognition. Neuroscience and Biobehavioral Reviews, 41, 85-97. 10.1016/j.neubiorev.2013.01.029

Ashby, W. R. (1952). Design for a brain. London, UK: Chapman and Hall.

Ashby, W. R. (1956). An introduction to cybernetics. London, UK: Chapman and Hall.

Aspell, J. E., Heydrich, L., Marillier, G., Lavanchy, T., Herbelin, B. & Blanke, O. (2013). Turning the body and self inside out: Visualized heartbeats alter bodily self-consciousness and tactile perception. Psychological Science, 24 (12), 2445-2453. 10.1177/0956797613498395

Bastos, A. M., Usrey, W. M., Adams, R. A., Mangun, G. R., Fries, P. & Friston, K. J. (2012). Canonical microcircuits for predictive coding. Neuron, 76 (4), 695-711. 10.1016/j.neuron.2012.10.038

Beaton, M. (2013). Phenomenology and embodied action. Constructivist Foundations, 8 (3), 298-313.

Beer, R. D. (2003). The dynamics of active categorical perception in an evolved model agent. Adaptive Behavior, 11 (4), 209-243. 10.1177/1059712303114001


Blanke, O. & Metzinger, T. (2009). Full-body illusions and minimal phenomenal selfhood. Trends in Cognitive Sciences, 13 (1), 7-13. 10.1016/j.tics.2008.10.003

Bongard, J. (2013). Evolutionary robotics. Communications of the ACM, 56 (8), 74-85. 10.1145/2493883

Bongard, J., Zykov, V. & Lipson, H. (2006). Resilient machines through continuous self-modeling. Science, 314 (5802), 1118-1121. 10.1126/science.1133687

Botvinick, M. & Cohen, J. (1998). Rubber hands ‘feel’ touch that eyes see. Nature, 391 (6669), 756-756. 10.1038/35784

Braitenberg, V. (1984). Vehicles: Experiments in synthetic psychology. Cambridge, MA: MIT Press.

Brooks, R. A. (1991). Intelligence without reason. In J. Mylopoulos & R. Reiter (Eds.) Proceedings of the 12th international joint conference on artificial intelligence - volume 1 (pp. 569-595). San Francisco, CA: Morgan Kaufmann Publishers.

Brown, H., Adams, R. A., Parees, I., Edwards, M. & Friston, K. J. (2013). Active inference, sensory attenuation and illusions. Cognitive Processing, 14 (4), 411-427. 10.1007/s10339-013-0571-3

Chemero, A. (2009). Radical embodied cognitive science. Cambridge, MA: MIT Press.

Clark, A. (1997). Being there. Putting brain, body, and world together again. Cambridge, MA: MIT Press.

Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36 (3), 181-204. 10.1017/S0140525X12000477

Clark, A. (2015). Embodied prediction. In T. Metzinger & J. M. Windt (Eds.) Open MIND (pp. 1-21). Frankfurt a. M., GER: MIND Group.

Clark, A. & Chalmers, D. J. (1998). The extended mind. Analysis, 58 (1), 7-19. 10.1093/analys/58.1.7

Conant, R. & Ashby, W. R. (1970). Every good regulator of a system must be a model of that system. International Journal of Systems Science, 1 (2), 89-97.

Craig, A. D. (2003). Interoception: The sense of the physiological condition of the body. Current Opinion in Neurobiology, 13 (4), 500-505. 10.1016/S0959

Craig, A. D. (2009). How do you feel now? The anterior insula and human awareness. Nature Reviews Neuroscience, 10 (1), 59-70. 10.1038/nrn2555

Critchley, H. D., Wiens, S., Rotshtein, P., Ohman, A. & Dolan, R. J. (2004). Neural systems supporting interoceptive awareness. Nature Neuroscience, 7 (2), 189-195. 10.1038/nn1176

Critchley, H. D. & Seth, A. K. (2012). Will studies of macaque insula reveal the neural mechanisms of self-awareness? Neuron, 74 (3), 423-426. 10.1016/j.neuron.2012.04.012

Damasio, A. (1994). Descartes’ error. London, UK: Macmillan.

Dunn, B. D., Galton, H. C., Morgan, R., Evans, D., Oliver, C., Meyer, M. & Dalgleish, T. (2010). Listening to your heart. How interoception shapes emotion experience and intuitive decision making. Psychological Science, 21 (12), 1835-1844. 10.1177/0956797610389191

Dupuy, J.-P. (2009). On the origins of cognitive science: The mechanization of mind. Cambridge, MA: MIT Press.

Evrard, H. C., Forro, T. & Logothetis, N. K. (2012). Von Economo neurons in the anterior insula of the macaque monkey. Neuron, 74 (3), 482-489. 10.1016/j.neuron.2012.03.003

Ferri, F., Chiarelli, A. M., Merla, A., Gallese, V. & Costantini, M. (2013). The body beyond the body: Expectation of a sensory event is enough to induce ownership over a fake hand. Proceedings of the Royal Society B: Biological Sciences, 280 (1765), 20131140-20131140. 10.1098/rspb.2013.1140

Fletcher, P. C. & Frith, C. D. (2009). Perceiving is believing: A Bayesian approach to explaining the positive symptoms of schizophrenia. Nature Reviews Neuroscience, 10 (1), 48-58. 10.1038/nrn2536

Friston, K. J. (2005). A theory of cortical responses. Philosophical Transactions of the Royal Society B: Biological Sciences, 360 (1456), 815-836. 10.1098/rstb.2005.1622

Friston, K. J. (2009). The free-energy principle: A rough guide to the brain? Trends in Cognitive Sciences, 13 (7), 293-301. 10.1016/j.tics.2009.04.005

Friston, K. J. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11 (2), 127-138. 10.1038/nrn2787

Friston, K. J. (2014). Active inference and agency. Cognitive Neuroscience, 5 (2), 119-121. 10.1080/17588928.2014.905517

Friston, K. J., Kilner, J. & Harrison, L. (2006). A free energy principle for the brain. Journal of Physiology - Paris, 100 (1-3), 70-87. 10.1016/j.jphysparis.2006.10.001

Friston, K. J., Daunizeau, J., Kilner, J. & Kiebel, S. J. (2010). Action and behavior: A free-energy formulation. Biological Cybernetics, 102 (3), 227-260. 10.1007/s00422-010-0364-z

Friston, K. J., Adams, R. A., Perrinet, L. & Breakspear, M. (2012). Perceptions as hypotheses: Saccades as experiments. Frontiers in Psychology, 3 (151), 1-20. 10.3389/fpsyg.2012.00151

Friston, K. J., Thornton, C. & Clark, A. (2012). Free-energy minimization and the dark-room problem. Frontiers in Psychology, 3 (130), 1-7. 10.3389/fpsyg.2012.00130

Gallese, V. & Sinigaglia, C. (2011). What is so special about embodied simulation? Trends in Cognitive Sciences, 15 (11), 512-519. 10.1016/j.tics.2011.09.003

Gibson, J. J. (1979). The ecological approach to visual perception. Hillsdale, NJ: Lawrence Erlbaum.

Gray, J. A. (2003). How are qualia coupled to functions? Trends in Cognitive Sciences, 7 (5), 192-194. 10.1016/S1364-6613(03)00077-9

Gregory, R. L. (1980). Perceptions as hypotheses. Philosophical Transactions of the Royal Society B: Biological Sciences, 290 (1038), 181-197. 10.1098/rstb.1980.0090

Gu, X., Hof, P. R., Friston, K. J. & Fan, J. (2013). Anterior insular cortex and emotional awareness. Journal of Comparative Neurology, 521 (15), 3371-3388. 10.1002/cne.23368

Gu, X. & Fitzgerald, T. H. (2014). Interoceptive inference: Homeostasis and decision-making. Trends in Cognitive Sciences, 18 (6), 269-270. 10.1016/j.tics.2014.02.001

Hinton, G. E. & Dayan, P. (1996). Varieties of Helmholtz Machine. Neural Networks, 9 (8), 1385-1403. 10.1016/S0893

Hohwy, J. (2013). The predictive mind. Oxford, UK: Oxford University Press.

Hohwy, J. (2015). The neural organ explains the mind. In T. Metzinger & J. M. Windt (Eds.) Open MIND (pp. 1-22). Frankfurt a. M., GER: MIND Group.

Hutto, D. & Myin, E. (2013). Radicalizing enactivism: Basic minds without content. Cambridge, MA: MIT Press.

Itti, L. & Baldi, P. (2009). Bayesian surprise attracts human attention. Vision Research, 49 (10), 1295-1306. 10.1016/j.visres.2008.09.007

James, W. (1894). The physical basis of emotion. Psychological Review, 1, 516-529.

Knill, D. C. & Pouget, A. (2004). The Bayesian brain: The role of uncertainty in neural coding and computation. Trends in Neurosciences, 27 (12), 712-719. 10.1016/j.tins.2004.10.007

Lee, T. S. & Mumford, D. (2003). Hierarchical Bayesian inference in the visual cortex. Journal of the Optical Society of America A, Optics, image science and vision, 20 (7), 1434-1448. 10.1364/JOSAA.20.001434

Limanowski, J. & Blankenburg, F. (2013). Minimal self-models and the free energy principle. Frontiers in Human Neuroscience, 7 (547), 1-20. 10.3389/fnhum.2013.00547

Makin, T. R., Holmes, N. P. & Ehrsson, H. H. (2008). On the other hand: Dummy hands and peripersonal space. Behavioural Brain Research, 191 (1), 1-10. 10.1016/j.bbr.2008.02.041

Metzinger, T. (2003). Being no one. Cambridge, MA: MIT Press.

Moseley, G. L., Olthof, N., Venema, A., Don, S., Wijers, M., Gallace, A. & Spence, C. (2008). Psychologically induced cooling of a specific body part caused by the illusory ownership of an artificial counterpart. Proceedings of the National Academy of Sciences of the United States of America, 105 (35), 13169-13173. 10.1073/pnas.0803768105

Neal, R. M. & Hinton, G. (1998). A view of the EM algorithm that justifies incremental, sparse, and other variants. In M. I. Jordan (Ed.) Learning in Graphical Models (pp. 355-368). Dordrecht, NL: Kluwer Academic Publishers.

Noë, A. (2004). Action in perception. Cambridge, MA: MIT Press.

Noë, A. (2006). Experience without the head. Clarendon, NY: Oxford University Press.

O’Regan, J. K. & Dagenaar, J. (2014). Consciousness without inner models: A sensorimotor account of what is going on in our heads. Proceedings of the AISB. http://doc.gold.ac.uk/aisb50/

O’Regan, J. K., Myin, E. & Noë, A. (2005). Skill, corporality and alerting capacity in an account of sensory consciousness. Progress in Brain Research, 150, 55-68. 10.1016/S0079-6123(05)50005-0

O’Regan, J. K. & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24 (5), 939-1031.

Otworowska, M., Kwisthout, J. & van Rooij, I. (2014). Counterfactual mathematics of counterfactual predictive models. Frontiers in Psychology: Consciousness Research, 5 (801), 1-2. 10.3389/fpsyg.2014.00801

Paulus, M. P. & Stein, M. B. (2006). An insular view of anxiety. Biological Psychiatry, 60 (4), 383-387. 10.1016/j.biopsych.2006.03.042

Pfeifer, R. & Scheier, C. (1999). Understanding intelligence. Cambridge, MA: MIT Press.

Pickering, A. (2010). The cybernetic brain: Sketches of another future. Chicago, IL: University of Chicago Press.

Ploghaus, A., Tracey, I., Gati, J. S., Clare, S., Menon, R. S., Matthews, P. M. & Rawlins, J. N. (1999). Dissociating pain from its anticipation in the human brain. Science, 284 (5422), 1979-1981. 10.1126/science.284.5422.1979

Pouget, A., Beck, J. M., Ma, W. J. & Latham, P. E. (2013). Probabilistic brains: Knowns and unknowns. Nature Neuroscience, 16 (9), 1170-1178. 10.1038/nn.3495

Quattrocki, E. & Friston, K. (2014). Autism, oxytocin and interoception. Neuroscience and Biobehavioral Reviews, 47C, 410-430. 10.1016/j.neubiorev.2014.09.012

Rao, R. P. & Ballard, D. H. (1999). Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2 (1), 79-87. 10.1038/4580

Rosa, M. J., Friston, K. J. & Penny, W. (2012). Post-hoc selection of dynamic causal models. Journal of Neuroscience Methods, 208 (1), 66-78. 10.1016/j.jneumeth.2012.04.013

Salomon, R., Lim, M., Pfeiffer, C., Gassert, R. & Blanke, O. (2013). Full body illusion is associated with widespread skin temperature reduction. Frontiers in Behavioral Neuroscience, 7 (65), 1-11. 10.3389/fnbeh.2013.00065

Schachter, S. & Singer, J. E. (1962). Cognitive, social, and physiological determinants of emotional state. Psychological Review, 69, 379-399. 10.1037/h0046234

Seth, A. K. (2013). Interoceptive inference, emotion, and the embodied self. Trends in Cognitive Sciences, 17 (11), 565-573. 10.1016/j.tics.2013.09.007

Seth, A. K. (2014a). Interoceptive inference: From decision-making to organism integrity. Trends in Cognitive Sciences, 18 (6), 270-271. 10.1016/j.tics.2014.03.006

Seth, A. K. (2014b). A predictive processing theory of sensorimotor contingencies: Explaining the puzzle of perceptual presence and its absence in synaesthesia. Cognitive Neuroscience, 5 (2), 97-118. 10.1080/17588928.2013.877880

Seth, A. K. & Critchley, H. D. (2013). Interoceptive predictive coding: A new view of emotion? Behavioral and Brain Sciences, 36 (3), 227-228.

Seth, A. K., Suzuki, K. & Critchley, H. D. (2011). An interoceptive predictive coding model of conscious presence. Frontiers in Psychology, 2 (395), 1-16. 10.3389/fpsyg.2011.00395

Singer, T., Critchley, H. D. & Preuschoff, K. (2009). A common role of insula in feelings, empathy and uncertainty. Trends in Cognitive Sciences, 13 (8), 334-340. 10.1016/j.tics.2009.05.001

Sokol-Hessner, P., Hartley, C. A., Hamilton, J. R. & Phelps, E. A. (2014). Interoceptive ability predicts aversion to losses. Cognition and Emotion, 1-7. 10.1080/02699931.2014.925426

Suzuki, K., Garfinkel, S. N., Critchley, H. D. & Seth, A. K. (2013). Multisensory integration across exteroceptive and interoceptive domains modulates self-experience in the rubber-hand illusion. Neuropsychologia, 51 (13), 2909-2917. 10.1016/j.neuropsychologia.2013.08.014

Thompson, E. & Varela, F. J. (2001). Radical embodiment: Neural dynamics and consciousness. Trends in Cognitive Sciences, 5 (10), 418-425. 10.1016/S1364-6613(00)01750-2

Tsakiris, M., Tajadura-Jimenez, A. & Costantini, M. (2011). Just a heartbeat away from one’s body: Interoceptive sensitivity predicts malleability of body-representations. Proceedings of the Royal Society B: Biological Sciences, 278 (1717), 2470-2476. 10.1098/rspb.2010.2547

Ueda, K., Okamoto, Y., Okada, G., Yamashita, H., Hori, T. & Yamawaki, S. (2003). Brain activity during expectancy of emotional stimuli: An fMRI study. NeuroReport, 14 (1), 51-55. 10.1097/01.wnr.0000050712.17082.1c

Van Gelder, T. (1995). What might cognition be if not computation? Journal of Philosophy, 92 (7), 345-381. 10.2307/2941061

Varela, F., Thompson, E. & Rosch, E. (1993). The embodied mind: Cognitive science and human experience. Cambridge, MA: MIT Press.

Verschure, P. F., Voegtlin, T. & Douglas, R. J. (2003). Environmentally mediated synergy between perception and behaviour in mobile robots. Nature, 425 (6958), 620-624. 10.1038/nature02024

Wolpert, D. M. & Ghahramani, Z. (2000). Computational principles of movement neuroscience. Nature Neuroscience, 3 Suppl, 1212-1217. 10.1038/81497

Seth, A. K. (2015). The Cybernetic Bayesian Brain - From Interoceptive Inference to Sensorimotor Contingencies.
In T. Metzinger & J. M. Windt (Eds). Open MIND: 35(T). Frankfurt am Main: MIND Group. doi: 10.15502/9783958570108 24 | 24
Perceptual Presence in the Kuhnian-
Popperian Bayesian Brain
A Commentary on Anil K. Seth

Wanja Wiese

Anil Seth’s target paper connects the framework of PP (predictive processing) and the FEP (free-energy principle) to cybernetic principles. Exploiting an analogy to theory of science, Seth draws a distinction between three types of active inference. The first type involves confirmatory hypothesis-testing. The other types involve seeking disconfirming and disambiguating evidence, respectively. Furthermore, Seth applies PP to various fascinating phenomena, including perceptual presence. In this commentary, I explore how far we can take the analogy between explanation in perception and explanation in science.

In the first part, I draw a slightly broader analogy between PP and concepts in theory of science, by asking whether the Bayesian brain is Kuhnian or Popperian. While many aspects of PP are in line with Karl Popper’s falsificationism, other aspects of PP conform to how Thomas Kuhn described scientific revolutions. Thus, there is both a sense in which the Bayesian brain is Kuhnian, and a sense in which it is Popperian. The upshot of these considerations is that falsification in PP can take many different forms. In particular, active inference can be used to falsify a model in more ways than identified by Seth.

In the second part of this commentary, I focus on Seth’s PPSMCT (predictive processing account of sensorimotor contingency theory) and its application to perceptual presence, which assigns a crucial role to counterfactual richness. In my discussion, I question the significance of counterfactual richness for perceptual presence. First, I highlight an ambiguity inherent in Seth’s descriptions of the target phenomenon (perceptual presence vs. objecthood). Then I suggest that counterfactual richness may not be the crucial underlying feature (of either perceptual presence or objecthood). Giving a series of examples, I argue that the degree of represented causal integration is an equally good candidate for accounting for perceptual presence (or objecthood), although more work needs to be done.

Keywords
Active inference | Binocular rivalry | Counterfactual richness | Cybernetics | Demarcation problem | Falsification | Free-energy principle | Naïve falsificationism | Objecthood | Paradigm change | Perceptual presence | Predictive processing | Rubber hand illusion | Scientific progress | Sensorimotor contingencies | Sophisticated falsificationism

Commentator
Wanja Wiese
wawiese@uni-mainz.de
Johannes Gutenberg-Universität
Mainz, Germany

Target Author
Anil K. Seth
a.k.seth@sussex.ac.uk
University of Sussex
Brighton, United Kingdom

Editors
Thomas Metzinger
metzinger@uni-mainz.de
Johannes Gutenberg-Universität
Mainz, Germany

Jennifer M. Windt
jennifer.windt@monash.edu
Monash University
Melbourne, Australia

1 Introduction

One of the relevant aspects of Seth’s discussion is the way in which it highlights interesting links to theoretical precursors of PP. In doing so, he broadens the historical context in which the framework is usually situated. However, these considerations are not just relevant for the history of science, they also constitute a theoretical underpinning of several ways in which Seth has recently developed PP accounts of various phenomena. Due to limited space, I can only address some of these here. In particular, I will focus on his three interpretations of active
Wiese, W. (2015). Perceptual Presence in the Kuhnian-Popperian Bayesian Brain - A Commentary on Anil K. Seth.
In T. Metzinger & J. M. Windt (Eds). Open MIND: 35(C). Frankfurt am Main: MIND Group. doi: 10.15502/9783958570207 1 | 19
www.open-mind.net

inference, and on his PP account of perceptual presence. In so doing, I will also try to take the analogy between explanation in perception and explanation in science a little further than it has previously been taken.

In section 2, I will briefly summarize Seth’s view on the connection between cybernetics and the free-energy principle. One of the results of his considerations is that a distinction can be drawn between three types of active inference. The first type involves confirmatory hypothesis-testing. The other types involve seeking disconfirming and disambiguating evidence, respectively. Seth does not say much about what it takes to disconfirm or falsify a hypothesis or model. Furthermore, he seems to suggest that not all types of active inference he distinguishes are currently part of PP (at least in the version described by Karl Friston’s FEP): “[t]hese points represent significant developments of the basic infrastructure of PP” (Seth 2014, p. 3). [1] In section 3, I will provide clarification of the notion of falsification by referring to the works of Karl Popper, Imre Lakatos, and Thomas Kuhn. I will also provide examples to show that different types of falsification are part and parcel of PP, not extensions of the basic infrastructure. In section 4, I point out an ambiguity in Seth’s account of perceptual presence (perceptual presence vs. objecthood). After this, I suggest that counterfactual richness may not be the crucial underlying feature (of either perceptual presence or objecthood). Giving a series of examples, I argue that the degree of represented causal integration is an equally good candidate for accounting for perceptual presence (or objecthood), although more work needs to be done.

[1] Unless stated otherwise, all page numbers refer to the target paper by Anil Seth.

2 Cybernetics and the free-energy principle

In his very rich target paper, Anil Seth calls attention to one of the less well-considered precursors of PP: cybernetics. A central concept of cybernetics is the notion of homeostasis, which denotes an equilibrium of the system’s parameters. This equilibrium is maintained by keeping the system’s essential variables, like levels of blood oxygenation or blood sugar (cf. Seth this collection, p. 7), within a certain range (cf. ibid. pp. 7-8). The process of achieving homeostasis is called allostasis (cf. ibid. p. 8). Cybernetic systems are teleological, i.e., goal-directed, because they are always trying to reach and preserve homeostasis. This suggests that control is more important than perception (cf. ibid. p. 9), and, as Seth emphasizes, it prioritizes interoceptive control over exteroceptive control: the main goal is to control the system’s essential variables; interaction with the world is only necessary to the extent that it affects these variables (ibid. pp. 9-10).

The principles of cybernetics fit astonishingly well with the ideas motivating Karl Friston’s FEP (which can, in some respects, be seen as a generalization of predictive processing). [2] The fundamental assumption behind this principle is that biological systems seek to “maintain their states and form in the face of a constantly changing environment” (Friston 2010, p. 127). This is obviously similar to the goal of achieving homeostasis. [3] Another focus of FEP is active inference, because action can reduce the surprisal of the agent’s states (which is necessary to “resist a tendency to disorder”, Friston 2009, p. 293); perceptual inference can only reduce the free-energy bound on surprise (Friston 2009, p. 294). This is in stark contrast with the Helmholtzian roots of PP, according to which action is primarily in the service of perception:

[...] wir beobachten unter fortdauernder eigener Thätigkeit, und gelangen dadurch zur Kenntniss des Bestehens eines gesetzlichen Verhältnisses zwischen unseren Innervationen und dem Präsentwerden der verschiedenen Eindrücke aus dem Kreise der zeitweiligen Präsentabilien. Jede unserer willkührlichen Bewegungen, durch die wir die Erscheinungsweise der Objecte abändern, ist als ein Experiment zu betrachten, durch welches wir prüfen, ob wir das gesetzliche Verhalten der vorliegenden Erscheinung, d.h. ihr vorausgesetztes Bestehen in bestimmter Raumordnung, richtig aufgefasst haben. [4] (Helmholtz 1959, p. 39)

[2] It is more general, because predictive processing only plays a role in it if combined with the Laplace approximation (which entails, roughly, that probability distributions are approximated by Gaussian distributions). This approximation, however, also turns FEP into a more specific version, by assuming that the brain codes probability distributions as Gaussian distributions (which is not entailed by the general predictive processing framework discussed in Clark 2013, for instance).

[3] In fact, the free-energy principle seems to be partly inspired by cybernetic ideas. Friston (2010, p. 127), for instance, cites Ashby (1947) when explaining the motivation for FEP.

[4] “[...] we observe under constant own activity, and thereby achieve knowledge of the existence of a lawful relation between our innervations and the presence of different impressions of temporary presentations [Präsentabilien]. All of our willful movements through which we change the appearance of things should be considered an experiment, through which we test whether we have grasped correctly the lawful behavior of the appearance at hand, i.e. its supposed existence in determinate spatial structures.” (My translation)

According to this view, the main target of action is to find confirmatory evidence for internally-generated hypotheses. In short, the contrast between these two views can be described as “action as hypothesis-testing” versus “action as predictive control”. Whereas the first seems to fit best to the Helmholtzian roots of PP (and puts action in the service of perception), the second seems to fit better to its cybernetic origins. Most notably, the free-energy principle combines both aspects, but assigns a pivotal role to action (perceptual inference only makes the free-energy bound on surprise tight; active inference leads to a further reduction of free energy, reducing surprise implicitly).

Seth compares model selection and optimization in evolutionary robotics to how these processes are implemented in active inference (pp. 14-15). He cites the famous starfish robot developed by Josh Bongard, Victor Zykov, & Hod Lipson (2006) as an example. In a first phase, the robot generates multiple competing models of its own morphology and performs actions for which these models predict different sensory feedback. By comparing these predictions to the actual feedback, the starfish can thus exclude some of the possible models. When the robot has eliminated all but one model, a second phase starts and it uses this model to control its body and generate walking behavior (action as predictive control). Crucially, when the robot’s morphology changes (when an experimenter removes one of its limbs), it can switch back to the first phase, re-creating competing models and using action to eliminate most of them (action as hypothesis-testing).

Seth points out that the second phase, in which the robot walks around, suggests that the main purpose of predictive models is to control behavior effectively, regardless of how accurately they represent the world or the body (p. 15). In the first phase, by contrast, exploratory actions are conducted in order to learn something about the body, not to reach a goal involving its environment (ibid.). As noted above, such instances of action conform more to Helmholtzian than to cybernetic roots (action as hypothesis-testing).

What this shows is that action can fulfill different purposes, not just theoretically but also in real applications. The robot starfish uses action in at least two ways. Drawing on the often-noted analogy between PP and scientific practice (cf. Gregory 1980), Seth explores further purposes of action. This leads to a distinction between three types of active inference (pp. 18f.). The first involves active sampling to confirm predictions derived from currently active models; the second is employed to seek evidence that would disconfirm currently held hypotheses; the third involves sampling in order to disambiguate between alternative hypotheses (p. 19).

Crucially, Seth does not elaborate much on the notion of falsification or disconfirmation. He relates disconfirmation to Bayesian surprise (which formalizes the extent to which new evidence leads to a revision of prior representations, cf. Baldi & Itti 2010). Accordingly, he characterizes seeking falsifying evidence in terms of maximizing Bayesian surprise. However, the paper quoted in this context, Itti & Baldi (2009), only investigates the hypothesis that surprising information attracts attention, not that subjects act to maximize surprise. Friston et al. (2012, p. 6) clarify the relation between FEP and maximization of Bayesian surprise:

The term Bayesian surprise can be a bit confusing because minimizing surprise per se (or maximizing model evidence) involves keeping Bayesian surprise (complexity) as small as possible. This paradox can be resolved here by noting that agents expect Bayesian surprise to be maximized and then acting to minimize their surprise, given what they expect.

In the following section, I will clarify the notion of falsification, and discuss the ways in which it is used in PP. More specifically, I will illustrate various types of active inference by drawing a slightly broader analogy with theory of science. In particular, I will consider views put forward by Karl Popper and Thomas Kuhn, respectively. This will serve to help us get a handle on the general merits of confirmation and disconfirmation. Furthermore, both Popper’s falsificationism and Kuhn’s paradigm change can be related to aspects of predictive processing, which will hopefully lead to a better understanding of hypothesis-testing in PP. As a consequence, I invite Seth to provide a refined treatment of the relation between falsification and active inference.

3 Is the Bayesian brain Kuhnian or Popperian? [5]

[5] It should be noted that Popper rejected interpretations of confirmation (or corroboration) in terms of probabilities (cf. Popper 2005[1934], ch. X), as well as Bayesian interpretations of probability theory (cf. Popper 2005[1934], ch. *XVII). Here, I only suggest that a useful analogy between Popper’s theory of science and the Bayesian brain can be drawn.

The free-energy principle subsumes the Bayesian brain hypothesis [6] (cf. Friston 2009, p. 294). According to this view, processing in the brain can usefully be described as Bayesian inference. This means that the brain implements a probabilistic model that is updated in light of sensory signals using Bayes’ theorem. More specifically, the brain combines prior knowledge about hidden causes in the world with a measurement of likelihood describing how probable the observed (sensory) evidence is, given various possible hidden causes. The result is a distribution (posterior) that describes how probable various possible causes are, given the obtained evidence. The process of determining the posterior is often called model inversion. In FEP, this type of inference is approximated using variational Bayes, which establishes the connection to predictive processing (cf. footnote 2 above). FEP can thus either be seen as a particular instance of the Bayesian brain hypothesis, or as a generalization.

[6] Seth identifies PP and the Bayesian brain (cf. p. 1). I follow suit in this commentary.

As mentioned above, it is often pointed out that perceptions in PP are analogous to scientific hypotheses. The Bayesian brain is thus a hypothesis-testing brain (this analogy is also referred to in titles of papers by Jakob Hohwy, see Hohwy 2010, 2012). Thanks to active inference, the Bayesian brain performs an active kind of hypothesis testing. The three types of active inference distinguished by Seth assign a role to both confirmation and disconfirmation (falsification). This dual role of active inference is also emphasized by Friston et al. (2012, p. 19):

The resulting active or embodied inference means that not only can we regard perception as hypotheses, but we could regard action as performing experiments that confirm or disconfirm those hypotheses.

Further exploration of the analogy to theory of science reveals a puzzle: as we will see, doubts can be raised regarding the idea that a theory gains merit when it is confirmed (or even regarding the very notion of theory confirmation). Does this mean that the Bayesian brain generates hypotheses in an unscientific way?

3.1 The Popperian Bayesian brain

3.1.1 Conceptual clarification: From naïve to sophisticated falsificationism

According to Popper, science advances mainly by seeking falsifying evidence. In fact, falsifiability is Popper’s proposed solution to the demarcation problem, i.e., the problem of specifying the difference between science and pseudo-science. Scientific theories posit universal propositions (scientific laws) that can never be proven in a strict sense, because only finite observations can be made. The next observation could, in principle, always disconfirm a universal empirical hypothesis. Hence, being verifiable cannot be a criterion for being scientific, because theories cannot be empirically verified (cf. Popper 2005[1934], pp. 16-17). Conversely, it is possible to falsify a universal statement using a single empirical proposition:

Diese Überlegungen legen den Gedanken nahe, als Abgrenzungskriterium nicht die Verifizierbarkeit, sondern die Falsifizierbarkeit des Systems vorzuschlagen; […] Ein empirisch-wissenschaftliches System muß an der Erfahrung scheitern können. (Popper 2005[1934], p. 17) [7]

[7] “These considerations suggest proposing not verifiability, but falsifiability as a demarcation criterion; […] An empirical-scientific system must be able to break down in the light of empirical evidence.” (My translation)

Scientific theories thus cannot, according to Popper, be verified, but only falsified. However, when attempts to falsify a hypothesis have failed, we can say that the theory has been corroborated, which still means that the theory could be falsified in the future (cf. Popper 2005[1934], ch. X).

How can we apply these ideas to predictive processing? First, we have to find an analogon to scientific theories. I suggest that models can be treated analogously to theories, because in PP, predictions or hypotheses are derived from models and then compared to bottom-up signals. This also fits the way in which Seth describes the starfish example (namely in terms of model selection). What does it mean that a model is falsified in PP?

The question is not a trivial one, as there seems to be a crucial disanalogy between hypothesis-testing in Popper’s sense and hypothesis-testing in the Bayesian brain. The reason why scientific theories are falsifiable is that they allow deriving hypotheses deductively. This means that if a hypothesis is falsified, the theory is falsified as well. By contrast, hypotheses in the Bayesian brain are not deductively entailed by the models from which they are derived: the relation between model and hypothesis is probabilistic (the hypothesis is more or less probable, given the model). Hence, when a hypothesis or prediction elicits a large prediction error, this does not falsify the model; rather, it calls for an update to the effect that the model becomes less likely. Furthermore, according to Popper, it does not make sense to say that such hypotheses are corroborated to a greater or lesser extent. For being corroborated means that attempts at falsification have failed. But if it is in principle impossible to falsify a hypothesis, then saying that it has been corroborated becomes empty; worse, such hypotheses are not even scientific hypotheses (cf. Popper 2005[1934], pp. 248-249). This, then, constitutes the puzzle mentioned above: if hypotheses in PP are not falsifiable, does this mean the Bayesian brain is unscientific?

This conclusion (that no useful analogy to Popper’s theory of science can be drawn) rests on a naïve understanding of falsification (as emphasized by Imre Lakatos, cf. Lakatos 1970). [8] A closer look at the notion of falsification reveals that the analogy can be upheld. Furthermore, it helps us gain a better grasp of the notion of falsification in the context of PP.

[8] I am grateful to Thomas Metzinger for pointing me to Lakatos’ work on falsificationism.

First of all, we can note that in actual scientific practice, it is not the case that scientists attempt to falsify an isolated, single hypothesis and then try to come up with a new theory when the hypothesis has been falsified. Rather, scientists often operate with different versions of a theory at the same time, or seek to find the best parameters for a model. The outcomes of an empirical study are then used to eliminate some of the different theories or parameter ranges. This has already been acknowledged by Popper (cf. 2005[1934], p. 63, fn. 10). As Thomas Nickles puts it:

According to Popper, at any time there may be several competing theories being proposed and subsequently refuted by failed empirical tests, rather like balloons being launched and then shot down, one by one. (2014)

The result of this falsification procedure is that some of the competing theories are eliminated. This can already be seen as a slight departure
from what Imre Lakatos calls naïve falsificationism: for the elimination may be based on a comparison, not on an isolated falsification procedure. If some of the theories are in some sense better than the others (for instance, by making more empirical predictions, or by being less complex), then they can be preferred without having independent reasons to reject the eliminated theories. However, Popper’s falsificationism is even more sophisticated.

Popper noted that there were no theory-neutral empirical propositions. Descriptions of empirical facts are not immediately given; they are based on observations and involve interpretations (cf. Popper 2005[1934], p. 84, fn. 32). This means it is always possible to add auxiliary hypotheses to a theory, and thereby make the theory compatible with seemingly falsifying evidence. As a consequence, when it comes to determining whether a theory is scientific or not, we cannot consider an isolated theory, but must assume a diachronic stance, in which we consider how a theory is modified in the light of new evidence. Such modifications (e.g., auxiliary hypotheses) increase the empirical content of the theory (cf. Lakatos 1970, p. 183). As Popper puts it:

Bezüglich der Hilfshypothesen setzen wir fest, nur solche als befriedigend zuzulassen, durch deren Einführung der ‘Falsifizierungsgrad’ des Systems […] nicht herabgesetzt, sondern gesteigert wird; in diesem Fall bedeutet die Einführung der Hypothese eine Verbesserung: Das System verbietet mehr als vorher. [9] (Popper 2005[1934], p. 58)

[9] “Regarding such auxiliary hypotheses we stipulate that we allow only those hypotheses for which the ‘degree of falsifiability’ of the system is not decreased, but increased; in this case the introduction of auxiliary hypotheses means an improvement: The system prohibits more than before.” (My translation)

When confronted with evidence that contradicts predictions, we are thus never forced to reject the theory from which the prediction has been derived. We may always modify the theory. But this modification must not be ad hoc. Auxiliary hypotheses that only make the theory compatible with the evidence, without having any additional value (without allowing new predictions), are not scientific.

Lakatos (1970) emphasizes that this entails a refined notion of falsificationism. He calls this sophisticated falsificationism (or sophisticated methodological falsificationism). A theory can only be falsified in this “sophisticated” manner when it has been replaced by a theory that:

1. has more empirical content (makes new predictions), and
2. makes at least one prediction that is empirically corroborated (cf. Lakatos 1970, pp. 183-184).

3.1.2 Sophisticated falsification in the Bayesian brain

Popper’s sophisticated falsificationism [10] can more easily be applied to predictive processing, because it does not require that we reject a model whenever its predictions yield large prediction errors. Instead, the model can be updated to achieve a better fit with the data. Furthermore, we find a counterpart for the insight that there are no theory-neutral observations: bottom-up signals are never treated as raw data, but as being (more or less) noisy. Hence, prediction errors are weighted by expected precisions. When the expected precision is extremely low, prediction errors will be attenuated. A low expected precision can thus be seen as analogous to an auxiliary hypothesis that makes the model compatible with otherwise contradicting evidence. What is more, it is not an ad hoc move, because the precision estimate itself is also constantly being updated in light of the evidence. Similarly, when a model generates a significant amount of prediction error, but is strongly supported by a higher-level model with high prior probability, a relatively high amount of prediction error may not lead to a major revision of the model.

[10] Lakatos (1970) points out that Popper himself never made a sharp distinction between naïve and sophisticated falsificationism, but that he accepted the assumptions underlying sophisticated falsificationism, at least in parts of his work, whereas the person Karl Popper may have been more of a naïve than a sophisticated falsificationist.
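The attenuating effect of a low expected precision can be made concrete with a minimal sketch. It assumes a single-level Gaussian model in which a prior belief is combined with one noisy observation; this simplification, the function name `precision_weighted_update`, and all numerical values are illustrative assumptions and are not taken from the target paper or from Friston's formulation of the FEP.

```python
def precision_weighted_update(prior_mean, prior_precision,
                              observation, sensory_precision):
    """Combine a Gaussian prior with one Gaussian observation.

    Precision is inverse variance. The prediction error is scaled by the
    share of the total precision that is attributed to the sensory signal.
    """
    prediction_error = observation - prior_mean
    posterior_precision = prior_precision + sensory_precision
    gain = sensory_precision / posterior_precision  # weight on the error
    posterior_mean = prior_mean + gain * prediction_error
    return posterior_mean, posterior_precision


# Hypothetical numbers: the same large prediction error (10.0) is
# evaluated under a very low and a very high expected sensory precision.
prior_mean, prior_precision = 0.0, 1.0
observation = 10.0

low_mean, _ = precision_weighted_update(
    prior_mean, prior_precision, observation, sensory_precision=0.01)
high_mean, _ = precision_weighted_update(
    prior_mean, prior_precision, observation, sensory_precision=100.0)

print(f"posterior mean with low expected precision:  {low_mean:.3f}")
print(f"posterior mean with high expected precision: {high_mean:.3f}")
```

With the same prediction error, the low-precision signal barely shifts the estimate, whereas the high-precision signal almost replaces the prior. The sketch holds the expected precision fixed; in PP the precision estimate is itself updated in light of the evidence, which is what keeps the move from being ad hoc.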


Model competition in PP can also be seen model is corroborated, because it makes the
as an instance of sophisticated falsificationism. strong prediction that the neighbor’s lawn is
Competition need not be resolved by eliminat- wet (i.e., the conditional probability that the
ing those models that yield the largest predic- neighbor’s lawn is wet, given that it has rained,
tion errors (as in the starfish robot). Instead, it is high). The other model is not incompatible
may be that some models make more specific with this evidence, but it is not supported by it
counterfactual predictions. Indeed, this seems to as much (because the conditional probability
be the main rationale behind active inference in that the neighbor’s lawn is wet, given that your
FEP. sprinkler has been on, is not as high). In other
According to the formalization provided in words, it has been explained away. As Jakob
Friston et al. (2012, p. 4), active inference in- Hohwy puts it:
volves minimizing the entropy of a counterfac-
tual density. This density links future internal The Rain model accounts for all the evid-
states and hidden controls to hidden states, ence leaving no evidence behind for the
which cause sensory states; hidden controls are Sprinkler model to explain. Even though
hidden states that can be changed by action the Sprinkler model did increase its prob-
(Friston et al. 2012, p. 3). A density has low en- ability in the light of the first observation,
tropy, roughly, if it assigns high values to a relatively small subset of states, and low values to most other sets of states. Predictions based on a probability density with very low entropy can thus be made with a high level of confidence, because most other possibilities are more or less ruled out (due to the low values assigned to them by the density). Formally, this is reflected in the proposition that the negative entropy of the counterfactual density is a monotonic function of the precision of counterfactual beliefs (Friston et al. 2012, p. 4).

The entropy of the counterfactual density is minimized with respect to hidden controls. In effect, this is a selection process, in which a model (here: a counterfactual density) is selected that has minimal entropy. The other models are eliminated, because they have higher entropies. We can say they are falsified in the sense of sophisticated falsificationism (but not in the sense of naïve falsificationism).

Another way in which model competition can be resolved without naïve falsification can be illustrated by the famous “wet lawn” example (cf. Pearl 1988). Suppose you enter your garden and find that the lawn is wet. There are at least two models that can explain this: either your sprinkler has been on during the night or it has rained. Let us assume that both models are initially equally likely (i.e., they have the same prior probability). When you now observe that your neighbor’s garden is also wet, the rain

    it seems intuitive right to say that its probability is now returned to near its prior value. The model has been explained away. (2010, p. 137)

Explaining away is another example of sophisticated falsification. Even when two or more models are compatible with the evidence (and with each other), there can be reason to prefer one of them and reject the others.

The clarification in this section should have shown that there is more to falsification than just “disconfirming” a hypothesis, and that competition between models can be resolved in different ways, not only in the way exemplified by the starfish robot. Furthermore, different types of sophisticated falsificationism are part and parcel of predictive processing.

Does this mean that the Bayesian brain is Popperian? This conclusion would be premature. The above can at best show that there are many situations in which the Bayesian brain is a sophisticated falsificationist. But there may be situations in which not even sophisticated falsification is possible or necessary. In the following section, I will argue that predictive processing also has Kuhnian aspects.

3.2 The Kuhnian Bayesian brain

According to Kuhn, scientific research develops in different recurring phases. Most of the time,

scientists work within an established paradigm, in which implications of theories are explored and puzzles are solved (cf. Kuhn 1962, ch. IV). In this phase, neither falsification nor confirmation plays a role:

    Normal science does and must continually strive to bring theory and fact into closer agreement, and that activity can easily be seen as testing or as a search for confirmation or falsification. Instead, its object is to solve a puzzle for whose very existence the validity of the paradigm must be assumed. Failure to achieve a solution discredits only the scientist and not the theory. (cf. Kuhn 1962, p. 80)

At some stage, however, there will be anomalies, i.e., empirical observations that cannot be explained within the current paradigm. When these anomalies accumulate, scientists will try to explore new concepts and methods. If, using new concepts and methods, previously unexplainable anomalies can be accounted for, a scientific revolution can result, through which a new paradigm is established. Kuhn shares the sophisticated falsificationist’s insight that theories are never rejected in isolation:

    […] the act of judgment that leads scientists to reject a previously accepted theory is always based upon more than a comparison of that theory with the world. The decision to reject one paradigm is always simultaneously the decision to accept another, and the judgment leading to that decision involves the comparison of both paradigms with nature and with each other. (cf. Kuhn 1962, p. 77)

This shows that Kuhn’s theory is in some respects in line with sophisticated falsificationism, but he goes beyond it, in that he doubts that a paradigm that has been adopted instead of another is always better or closer to the truth. The reason for this is that he claims competing paradigms to be incommensurable (cf. also Feyerabend 1962), which means that they typically use radically different concepts and methods (cf. Oberheim & Hoyningen-Huene 2013, §1). A new paradigm that becomes dominant is thus not marked by being closer to the truth, but mainly by constituting a departure from the old paradigm (cf. Kuhn 1962, pp. 170-171). This seems to entail that scientific progress need not be a process in which theories approximate the truth to an ever higher degree.

Can we find an analogue for such a transition from one paradigm to the other in predictive processing? Above, we saw that the sophisticated falsificationist assumes that scientific progress happens only when a theory makes new predictions, and thereby leads to the discovery of new states of affairs. This need not always be the case in the Bayesian brain. When a model is changed to minimize free energy, this does not mean that the empirical content or predictive power has been increased. A particularly clear example of this can be found in perceptual phenomena like binocular rivalry.

In binocular rivalry (cf. Blake & Logothetis 2002), subjects are presented with two different images, one to the left eye, the other to the right eye, e.g., a face and a house. According to a predictive coding account put forward by Jakob Hohwy, Andreas Roepstorff & Karl Friston (2008), the brain generates two main competing models of what the stimuli depict, one corresponding to the face, the other corresponding to the house. However, only one of these models is consciously experienced at any given time (although there can be intermittent phases in which subjects report seeing a mixture of both stimuli, i.e., parts of the house and parts of the face at the same time, but usually non-overlapping). This means that the brain will tend to settle into one of two classes of states (one corresponding to perceiving the house, the other to perceiving the face). Since each of the models can only account for part of the visual input, both cause a significant amount of prediction error (cf. Hohwy et al. 2008, p. 691). Over time, the prior probability of the currently assumed model (house or face, respectively) will decrease, leading to a revision of the hypothesis, until the brain settles into a state corresponding to the other percept, at least temporarily (cf. Hohwy et al. 2008, pp.

692–694).[11] The crucial difference between this and cases like the wet lawn example or model selection in the starfish robot is that neither of the two competing models is in any sense better than the other (in terms of empirical content, simplicity, predictive power, etc.).

[11] Two possible reasons why the probability of the currently assumed model decreases are offered by the authors: either there is a hyper-prior to the effect that the world changes (which is why a static hypothesis becomes less likely over time), or there are random effects that lead to multistability, such that neural dynamics switch from one basin of attraction to another (cf. Hohwy et al. 2008, p. 692).

We can recast binocular rivalry in terms of Kuhnian paradigm changes. If we liken each of the two models (house/face) to a paradigm, we can say that perceiving a single object in binocular rivalry corresponds to the phase of normal science, in which many phenomena (inputs) can be explained. After some time, however, there are anomalies (increasing prediction error), which leads to a scientific crisis in which new directions are explored (intermittent phase in which no unified percept is generated), until a new form of scientific practice becomes dominant (scientific revolution), and a new phase of normal science (temporarily stable perception) is reached. The transition from one percept to the other does not go along with increased veridicality: neither of the two percepts is closer to the truth than the other.[12] This may also support the cybernetic idea that internal models are used in the pursuit of homeostasis, not to approximate the truth (as also noted by Seth this collection, p. 15).

[12] In fact, it seems that the notion of incommensurability has been inspired by Gestalt switches (as in the perception of a Necker cube), which are very similar to phenomena like binocular rivalry. However, Kuhn explicitly pointed out that there is a crucial difference between a Gestalt switch and a paradigm change: “[…] the scientist does not preserve the gestalt subject’s freedom to switch back and forth between ways of seeing. Nevertheless, the switch of gestalt, particularly because it is today so familiar, is a useful elementary prototype for what occurs in full-scale paradigm shift” (1962, p. 85). I am grateful to Sascha Fink for drawing my attention to this statement.

There is another analogy between the Bayesian brain and Kuhn’s theory of science. According to Kuhn, it is indeterminate whether an anomaly (an unexpected experimental result, for instance) is something that should be regarded as just another puzzle or as a reason to reject the whole paradigm:

    Excepting those that are exclusively instrumental, every problem that normal science sees as a puzzle can be seen, from another viewpoint, as a counterinstance and thus as a source of crisis. (Kuhn 1962, p. 79)

If it is treated as a puzzle, it yields questions like: how can we account for this phenomenon within our established framework? If it is treated as a counterinstance, a more fundamental solution is needed. This is analogous to the fact that whether two models in predictive processing are compatible or not depends on (hyper)priors (cf. FitzGerald et al. 2014, p. 2). When a hyper-prior has it that two models are incompatible, this can either lead to a competition, in which one of the models is eliminated, or it can lead to a revision of the hyper-prior. (Which of the two possibilities corresponds more to puzzle solving, and which to something more fundamental, will depend on whether the lower-level models or the high-level prior initially have a higher probability.) This is illustrated by the RHI (rubber hand illusion).

In the RHI (Botvinick & Cohen 1998), the brain harbors two contradictory sensory models. According to the visual model, tactile stimulation occurs on the surface of the rubber hand. According to the proprioceptive model, the felt strokes occur at a different location (i.e., where the real hand is located). While there is, in and of itself, no contradiction between these models, it is likely that the brain has a prior that favors common-cause explanations of sensory signals. Relative to this prior, there is a tension between the models: they seem to indicate that the seen stroking and the felt touch occur at distinct locations, which is odd, because they occur synchronously (and the prior has it that synchronous effects have a common cause, which speaks against two distinct locations). As Jakob Hohwy puts it:

    [...] we have a strong expectation that there is a common cause when inputs co-occur in time. This makes the binding hypothesis of the rubber hand scenario a better explainer, and its higher likelihood

    promotes it to determine perceptual inference and thereby resolve the ambiguity. (2013, p. 105)

Notice that the common-cause hypothesis (that the touch is felt where it is seen) only becomes the dominating hypothesis because the design of the study prevents subjects from confirming the distinct-causes hypothesis (e.g., by looking at their real hands). Because of the common-cause hypothesis, there is an ambiguity in the percepts. This ambiguity can be resolved in at least two ways: either by adjusting the lower-level (perceptual) models (to the effect that the felt touch occurs at the same location as the seen stroking); or by active inference (which in this case would lead to a rejection of the higher-level model corresponding to the common-cause hypothesis). The first way corresponds to puzzle solving, the second more closely to a paradigm change. Note that the analogy will be the stronger the more remote the hyper-prior is from the perceptual models.

I hope to have shown that the Bayesian brain has aspects that make it Popperian, as well as aspects that make it Kuhnian. At the very least, it should have become clear that falsification is a more complex concept than depicted in Seth’s target paper (which seems to tend towards a more naïve form of falsificationism).

4 Perceptual presence

We have seen how fruitful analogies between PP and theory of science can be. As mentioned above, an early formulation of the analogy between perception and hypothesis-testing can be found in Richard Gregory’s seminal paper “Perceptions as Hypotheses”. There, we also find the suggestion that percepts explain sensory signals (cf. Gregory 1980, p. 13).[13]

[13] It should be noted that Gregory ascribes “far less explanatory power” (1980, p. 196) to perceptions than to scientific hypotheses.

How far can we take the analogy between explanation in perception and explanation in science? If we know what a good explanation is in science, does this give us a clue to the conditions under which percepts are experienced as real? Interestingly, there are accounts of scientific explanation that assign an essential role to counterfactual knowledge (cf. Waskan 2008). If someone purports to know why a certain event happened or why a phenomenon was observed, we expect her to also be able to tell us what would have happened if some of the initial conditions had been different. Similarly, when the Bayesian brain explains sensory signals by inferring their hidden causes, we would expect the brain’s generative model to also have the resources to infer in what ways sensory signals would be different, had there been a change to their hidden causes.

This highlights the relevance of counterfactual models. Seth points out that counterfactuals play a crucial role in active inference; the consideration above offers another way to show their relevance. Furthermore, it also highlights the usefulness of counterfactual richness. The richer a counterfactual model of hidden causes, the better the brain’s explanation of sensory signals (all other things being equal). In general, we may also be inclined to say that the richer the counterfactual model, the higher the confidence that it helps track the real explanation of sensory signals. But does this mean it goes along with experienced realness (or perceptual presence)?

This is, basically, what Seth proposes in his PP account of perceptual presence (cf. Seth 2014). But what is perceptual presence in the first place? On the one hand, Seth characterizes the notion by contrasting examples. For instance, objects like a tomato possess perceptual presence, whereas afterimages do not. On the other hand, Seth provides the following characterization:

    In normal circumstances perceptual content is characterized by subjective veridicality; that is, the objects of perception are experienced as real, as belonging to the world. When we perceive the tomato we perceive it as an externally existing object with a back and sides, not simply as a specific view […]. (2014, p. 98)

The tomato is not perceived as a flat, red disc. Although you do not see the back and sides of the tomato in the same way that you see the front, there is still a sense in which both are perceptually present (cf. Noë 2006, p. 414). I shall now point to two ambiguities in Seth’s description of the explanandum. This calls for a conceptual clarification, regarding which I shall make a tentative suggestion. After that, I shall argue that there may be possible counterexamples to Seth’s hypothesis that perceptual presence correlates with the counterfactual richness of generative models.

4.1 Ambiguities in Seth’s description of the explanandum

The tomato is not only experienced as perceptually present, it is also perceived as an object in the external world. In a commentary on Seth, Tom Froese (2014, p. 126) has therefore suggested that Seth conflates perceptual presence with experienced objecthood. This proposal has some plausibility, because the tomato is perceived as a real object, whereas afterimages are not experienced as objects (they are more like unstable colored shades). After all, even Seth admits, in his target paper, that it may be important to distinguish presence from objecthood (p. 18). This is one way in which Seth’s definition of the explanatory target is ambiguous: is it about experienced presence or experienced objecthood (cf. also Seth 2014, pp. 105f.)? (This question becomes more pressing still when we consider the etymology of “realness” or “reality”: the Latin origin of the word is res (thing), which makes it a little confusing that Seth seems to identify perceptual presence with the sense of subjective reality, cf. Seth this collection, p. 2.)

Another ambiguity is related to the notion of a counterfactual model. In his target paper Seth defines a counterfactual model as a model encoding “how sensory inputs (and their expected precisions) would change on the basis of a repertoire of possible actions” (Seth this collection, p. 17). On the one hand, one may ask if counterfactual models in the brain necessarily encode SMCs (sensorimotor contingencies). For the perception of a ripe tomato on a bush, it might be equally relevant to encode how sensory signals pertaining to the tomato would change if the wind were to blow the bush or if the tomato were to fall down. On the other hand, it is unclear how explicit a counterfactual representation has to be. Jakob Hohwy (2014) suggests that a rich causal structure could be modeled by extracting higher-order invariants (features that do not change if the tomato is dangling in the wind or has fallen down, for instance). Higher-order invariants are relatively perspective-independent.[14] The degree of perceptual presence would then correspond to the “depth of the inverted model”[15] (Hohwy 2014, p. 128). In his target paper, Seth notes that the depth of the model may indeed play a role (see footnote 13).

[14] As I am using the term here, the depth of a model can be measured by its location in the predictive processing hierarchy (that is, whether it is high or low in the hierarchy). Estimates at higher levels track features that change more slowly (i.e., features that remain invariant when things change, for instance, when the subject changes her perspective on a perceptual object like a tomato by walking around the tomato or by turning it; hence the term “perspective-(in)dependence”). A model of a perceived object is deep when it represents features that change relatively slowly. Alternatively, one could stipulate that a model is deep when it represents features that change slowly and features that change more quickly. In fact, this may come closer to what Hohwy has in mind, but it blurs the distinction between perspective-dependence and causal integration. Hohwy writes: “[c]oncurrents are causes that do not interact on their own with other causes (presumably a fence won’t occlude a concurrent)” (2014, p. 128). But encapsulated causes can be represented both at lower parts of the hierarchy (possible example: afterimages) and at higher parts of the hierarchy (possible example: certain conscious thoughts). This suggests that at least causal encapsulation can be dissociated from perspective-dependence and -independence.

[15] The inverted model is the posterior distribution, the computation of which is based on the likelihood and the prior (see above).

Two ambiguities are thus to be found in Seth’s account. One concerns the characterization of the target phenomenon (experienced realness versus experienced objecthood). The other lies in the description of the represented causal structure: counterfactual richness versus perspective-independence of hidden causes. Counterfactual richness and causal “depth” are not completely independent. Below, I will give some examples that may be useful to explore the relationship between these two features. Furthermore, I will suggest that it could be helpful to consider another feature with respect to which the represented causal structure of objects may vary. This feature is the degree of


causal encapsulation. For representations not only differ with respect to their counterfactual richness or their degree of perspective-dependence, but also with respect to the extent to which the represented causal structure is encapsulated or integrated. (In what follows, I will use the notion of a counterfactual model mainly in the sense in which Seth uses it: counterfactual models in this sense involve representations of possible bodily actions by the subject of experience.)

A phenomenal representation of a tomato on a plate is not only counterfactually rich and relatively perspective-dependent, the represented causal structure is also causally integrated.[16] It is, for instance, represented as being causally related to the plate, because it is experienced as lying on the plate (that is, it is not hovering above it). Furthermore, it is in possible causal contact with virtually all other objects in its vicinity (e.g., the subject’s hands).

[16] Another possible term for this would be causally open, in the sense that it is represented as being in potential causal exchange with other objects in its surroundings. By integration, I thus do not mean integration within (or internal integration), but integration with other objects.

Contrast this with the experience of what is happening in a classical video game, say, a racing game. The player influences how the images on the two-dimensional screen change, because she has control over the vehicle. Hence, we can assume that representations of gaming sequences are (usually) counterfactually rich. At the same time, they are also perspective-dependent (although they mainly depend on the virtual perspective from which objects are represented in the game). However, virtual objects in the game are experienced as causally encapsulated: although objects can interact with each other in the virtual world, they do not interact with most other parts of the player’s environment. For instance, they will never break out of the screen and fly around in the room in which the player is sitting. Furthermore, they can only be influenced vicariously through a controller or keyboard. Thus there is not causal encapsulation in every respect (the virtual world is not experienced as completely disambiguated from the rest of the experienced world), but in some respects the encapsulation is rather strong (the virtual world is spatially bounded, e.g., with the screen as the limit). Note that many modern video games are less causally encapsulated, for instance when they are played on a touchscreen (or on devices with a three-dimensional screen, or in an immersive virtual reality).[17]

[17] Thanks to Jennifer Windt for suggesting immersive video games as a further example.

As mentioned above, causal integration and counterfactual richness are not completely independent. High counterfactual richness implies a certain degree of causal integration (at least in some respects), for it means that at least the subject can interact with the experienced object in some way, regardless of how separate the represented causal structure is from the rest of the subject’s surroundings.

Similarly, highly perspective-invariant representations typically also involve the representation of an encapsulated causal structure. Abstract conscious thoughts, for instance, cannot be touched with the hand or other concrete objects. However, the implied encapsulation only holds in some respects. Sometimes thoughts can evoke strong emotions or a sequence of mental imagery. In certain obsessive-compulsive disorders, for instance, subjects will first have a thought (“My hands are dirty”), presumably followed by a feeling of disgust and the urge to wash the hands, which then leads to motor behavior (washing the hands); this, in turn, may be followed by the thought that the hands are still dirty. The content of the conscious thought is relatively perspective-invariant, and yet it involves, presumably, representations of causal structure that link it to concrete objects in the world.

As long as we interpret counterfactuals only as representations of sensorimotor contingencies, it may also seem that perspective-invariant[18] representations are counterfactually poor. However, if we include representations of possible mental actions and their effects, we can also conceive of counterfactually rich perspective-invariant representations. A possible example is a philosophical argument or a theory, which someone can contemplate in their mind, being aware that there are several possible ways

[18] Perspective-invariant representations are maximally perspective-independent.


Figure 1: The figure illustrates how classes of experiences can be located in a cube, according to the extent to which
they display counterfactual richness, perspective-independence, and causal integration (see main text for explanations).
The cube (without the labels) is adapted from cube figures in Godfrey-Smith (2009); talks by Daniel Dennett brought
this style of illustration to my attention.
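
As a rough aid to reading the cube, the sketch below records, for a few classes of experience described in the main text, whether each sits nearer the high or the low end of the three axes. It is only an illustration under stated assumptions: the names `Experience`, `corner`, and `EXAMPLES`, the ordering of the axes, and the reduction of each axis to a yes/no value are introduced here and are not part of the commentary or of Figure 1. The placements merely paraphrase the qualitative descriptions in the text, which treats the corners as idealized endpoints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experience:
    """Qualitative placement of a class of experiences on the three axes.

    Each axis is reduced to a boolean (an assumption made for illustration);
    the text itself describes graded, not binary, dimensions.
    """
    name: str
    counterfactually_rich: bool    # True = rich, False = poor
    perspective_independent: bool  # True = perspective-invariant, False = perspective-dependent
    causally_integrated: bool      # True = integrated, False = encapsulated


def corner(e: Experience) -> tuple[int, int, int]:
    """Return the nearest cube corner as (richness, independence, integration)."""
    return (
        int(e.counterfactually_rich),
        int(e.perspective_independent),
        int(e.causally_integrated),
    )


# Placements paraphrase the main text; they are not measurements.
EXAMPLES = [
    Experience("tomato on a plate", True, False, True),
    Experience("classical racing game", True, False, False),
    Experience("contemplating a philosophical argument", True, True, False),
]

for e in EXAMPLES:
    print(f"{e.name}: corner {corner(e)}")
```

Keeping the three axes as separate fields makes it easy to see that two experiences can agree on counterfactual richness while differing on the other two dimensions, which is the kind of dissociation the text goes on to discuss.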

in which the argument could be probed and attacked, or several important cases to which the theory could be applied.

Bearing in mind that the degree of causal encapsulation is not completely independent from the other two dimensions (counterfactual richness and perspective-invariance), we can depict different types of conscious experiences in a cube, where the three axes stand for the three dimensions described (see Figure 1). The most interesting locations in this cube are, of course, its eight corners, because they depict classes of experiences for which each of the three features is either completely absent or maximally pronounced. Finding examples of these “extremal experiences” is no easy task.[19] Even neural representations of synesthetic concurrents, Seth’s prime example of counterfactually poor models, may, at first sight, seem to be located somewhere in the middle of the perspective-dependence axis.

[19] In fact, it may be that the corners only constitute hypothetical endpoints. Thanks to Jennifer Windt for pointing this out.

Grapheme-color concurrents, for instance, are not simply triggered by graphic representations of glyphs, but by representations of abstract objects, i.e., graphemes, associated with certain glyphs (cf. Mroczko et al. 2009). Hence, it may seem that the hidden cause of the concurrent is not simply an object in the world, but also involves an abstract object, i.e., a grapheme, the representation of which is perspective-invariant. This would suggest that synesthetic concurrents cannot conclusively be placed in one of the cube’s corners, because their represented hidden causes involve very high-level invariants.

On the other hand, one could object that the concurrent itself is represented in a rather perspective-dependent way. It may be part of a

causal network involving hidden causes that are represented in perspective-invariant ways, but the synesthetic percept itself is not a representation of an abstract hidden cause.[20] Hence, on second thought, it seems that concurrents, as in grapheme-color synesthesia, are in fact located close to the origin of our coordinate system: the representations involved are relatively perspective-dependent, and they are counterfactually poor. At the same time, they are causally encapsulated, because they do not interact with physical objects (they cannot be touched, etc.).

[20] This may point to an aspect regarding which Hohwy’s characterization of causal depth is ambiguous.

4.2 Does counterfactual richness correlate with perceptual presence (or objecthood)?

What does this tell us about experienced “presence” or “objecthood”? Are all examples of counterfactually rich representations in the cube perceptually present, or are they associated with a high degree of objecthood? If so, this would support Seth’s hypothesis that counterfactual richness correlates with perceptual presence (or objecthood). I believe that counterfactual richness can be dissociated both from perceptual presence and from objecthood. Olfactory experiences are, as argued by Michael Madary (2014), both counterfactually poor and perceptually present. This suggests that counterfactual richness does not correlate with perceptual presence. Similarly, experiences of classical video game sequences are counterfactually rich, but involve a low degree of perceptual presence; objects in the game are only experienced as virtual objects, not as real objects. Counterfactual richness and perceptual presence may therefore be doubly dissociable.

Trying to evaluate whether counterfactual richness correlates with phenomenal objecthood would presuppose that we know what phenomenal objecthood means. As I only have an intuitive grasp of what it means, I can only give a preliminary statement. To me, it seems that virtual objects in two-dimensional video games do not possess a high degree of phenomenal objecthood. But then again, even if a virtual tomato could be manipulated in various ways with a controller, the corresponding representation would probably not be as counterfactually rich as a representation corresponding to the experience of a real tomato. Hence, it is difficult to arrive at a definitive verdict.

A more promising path may involve the experience of objects in asomatic OBEs (out-of-body experiences) or asomatic dream experiences (Windt 2010; Metzinger 2013). Counterfactuals, as conceived of by Seth, always involve action on the part of a subject. Most, if not all, (non-mental) actions involve the body, so representing counterfactuals involves representing (parts of) the body. In asomatic OBEs and asomatic dream experiences, subjects do not identify with a body, but with an unextended point in space. I speculate that in such cases, representations of objects are less counterfactually rich.[21] This, however, does not necessarily mean that they are experienced as less present or as possessing less objecthood. There are still a lot of causal regularities involving external objects that may be tracked by models in the brain, even in the absence of an ordinary body representation. External objects can interact with each other, and counterfactual representations of possible causal processes may contribute to the experience of objecthood or perceptual presence. In particular, this is to be expected if none of the external objects are represented as causally encapsulated. If this is borne out, it provides another reason to believe that counterfactual richness of generative models does not correlate with experienced objecthood. Let us now consider possible examples of other extremal experiences (in the corners of the cube) to investigate whether it is plausible to hypothesize that represented causal depth or causal encapsulation correlates with perceptual presence or objecthood.

[21] In fact, asomatic OBEs may be a better example than asomatic dream experiences, since such dreams typically lack concrete objects (cf. LaBerge & DeGracia 2000). I am grateful to Jennifer Windt for pointing this out.

The more perspective-invariant a representation, the more abstract it is. This also means that perspective-invariant representations typically involve an encapsulated causal structure. Thinking about a simple equation like

Wiese, W. (2015). Perceptual Presence in the Kuhnian-Popperian Bayesian Brain - A Commentary on Anil K. Seth.
In T. Metzinger & J. M. Windt (Eds). Open MIND: 35(C). Frankfurt am Main: MIND Group. doi: 10.15502/9783958570207 14 | 19
www.open-mind.net

“1+1=2” may be an example of this. There is no way in which the target of this representation can causally interact with the window behind my desk or the red bottle in front of the window. Furthermore, most (or all) bodily movements will not influence the way I experience the thought that one plus one equals two. Hence, it is arguably also a counterfactually poor representation.

When we move up, in the direction of counterfactually rich phenomenal representations, we arrive at representations that are counterfactually rich, perspective-invariant, and still causally encapsulated. Above, I mentioned conscious thoughts about philosophical arguments or theories as possible examples. Such thoughts may involve mental imagery and inner speech, and perhaps even complex phenomenal simulations involving counterfactual situations. It is not obvious whether it makes sense to say that such thoughts involve counterfactual representations linking possible mental actions to their effects. This is even harder without presupposing a developed theory of mental action (for recent proposals, cf. Proust 2013; Wu 2013).

Mental actions are goal-directed. Performing a mental action may therefore, at least in some cases, be followed by a representation of a situation in which the goal is realized (one possible example might be: remembering a name; represented situation: telling someone the name). In the case of a theory, a mental action could be considering whether a certain claim is true or not (or whether it is plausible). This may trigger thoughts like: “Assuming this is the case, what implications would this have? Are these implications plausible, or likely to be true? Are there possible counterexamples?” It might also involve trying to formulate something more clearly.

Furthermore, thinking about a theory or problem may involve conscious counterfactual thoughts of the form “If I gave up this assumption, there would not be a contradiction among the remaining hypotheses anymore”, or “If the theory could account for this special case, it would be strengthened”. One difference to conscious perception of concrete objects is, presumably, that such counterfactuals are phenomenally represented, whereas representations of SMCs are usually unconscious (and may impact on consciousness only indirectly).

Similar things apply to conscious thoughts about non-trivial mathematical expressions. For instance, if a mathematician sees the expression (1 + x/n)^n she will probably think “If n tends to infinity, this expression will converge to e^x.” Now, suppose the mathematician is investigating the asymptotic behavior of some complicated expression (e.g., she wants to find out what happens to a certain expression when n tends to infinity). While manipulating the terms on paper, she suddenly realizes that one factor contained in the expression is (1 + x/n)^n. As she is using pen and paper while thinking this, her brain will not only activate an abstract (but conscious) counterfactual thought, but probably also a representation of SMCs. These SMCs will involve taking the limit of the expression with which she started (i.e., lim n→∞), and this is now not only a mental action, but also a possible bodily action. She could write this down, and know that (if the limit exists) part of it would be e^x. Her mathematical investigation therefore involves:

• phenomenal representations regarding counterfactual mental actions;
• representations of SMCs (embodied versions of the above mentioned counterfactuals);
• a close coupling between writing, perceiving, and thinking.

The third point is especially important, because it suggests that for a mathematician working with pen and paper (or chalk and blackboard) the objects of her conscious thoughts are not causally encapsulated anymore. The causal structure represented while thinking about abstract concepts is intertwined with the causal structure represented while looking at written mathematical expressions. These causal relations are still relatively limited, but if the mathematician is completely absorbed in her work, the paper (or blackboard) may be all she is attending to in her environment at the moment, perhaps to the extent that she does not experi-

ence abstract relations represented by her notes as causally encapsulated anymore. It is conceivable that this aspect can be enhanced in virtual environments in which mathematical objects are not represented by writing on paper or blackboard, but by three-dimensional virtual objects that can be manipulated by touch or manual movements, for instance.22 Contrary to what one might at first think, there may thus be cases in which high degrees of perspective-invariance go along with both counterfactual richness and high degrees of causal integration.

Another class of abstract thoughts that may be experienced as causally integrated could be obsessive thoughts, like the thought that one’s hands are contaminated with germs. Such thoughts may be triggered by specific events (like touching a door knob) and may go along with a fear of getting sick (because of the contamination). Subjects may also try to avoid touching objects that they fear might be contaminated. The reason for this is that the hidden cause represented by the obsessive thought, i.e., potential germ contamination, is not causally encapsulated. It is causally connected to concrete objects in the subjects’ environment: things that are perceived as contaminated can cause a contamination of the hands; on the other hand, contaminated hands can infect other objects with germs. Furthermore, the inferred hidden cause (germ contamination) is relatively perspective-invariant. Subjects arguably do not imagine bacteria crawling on their hands, although the obsessive thought may go along with an altered perception of the hands. Finally, the model involved is probably counterfactually poor, as most actions do not change the alleged contamination (with the possible exception of washing the hands or touching allegedly contaminated objects; but here, the counterfactual effect is probably just an increase or decrease in the acuteness of the felt contamination). Therefore, I list obsessive thoughts as candidate examples of counterfactually poor, perspective-invariant representations the contents of which are represented as causally integrated.

22 This could be a case in which there is a particularly strong demand for the general ability of PP to combine “fast and frugal solutions” with “more structured, knowledge-intensive strategies” (Clark this collection).

4.3 Do perspective-invariance or represented causal integration correlate with perceptual presence (or objecthood)?

The examples given are certainly not uncontroversial and perhaps not all of them can be sustained in the light of further research. But hopefully the cube can still fulfill heuristic purposes, and can illustrate the need to clarify the relations between counterfactual richness, perspective-dependence, and causal integration. But assuming that the examples given are located in roughly the right places within the cube, what does this tell us about perceptual presence or experienced objecthood? Above, I dismissed Seth’s hypothesis that counterfactual richness correlates with either presence or objecthood. Let us now briefly consider perspective-invariance and causal integration. If conscious thoughts involve causally-deep models (that represent perspective-invariant features), then it seems that the depth of the represented causal structure does not correlate with presence or objecthood. The thought that one plus one equals two does not possess a high degree of objecthood or perceptual presence. Hence, it seems that Hohwy’s hypothesis that the depth of the generative model (the degree of perspective-independence) correlates with objecthood or presence should be dismissed as well. But the remaining candidate, causal integration, does not unequivocally correlate with either presence or objecthood (if the examples I gave make sense). The represented causal structure in obsessive thoughts need not be encapsulated, and still they are probably not accompanied by experienced objecthood or perceptual presence. Perhaps this shows that one ought first to clarify whether it even makes sense to talk about the phenomenology of objecthood or presence with respect to conscious thoughts.

4.4 How does perception change when new sensorimotor contingencies are learnt?

Another relevant question is whether increasing the degree of counterfactual richness, causal integ-

ration, or causal depth of a model just modifies (or enriches) the inferred hidden causes, or whether it leads to the perception of a new, possibly more abstract object. This relates to the question raised in the target paper, namely whether a person who is highly familiar with an object perceives it as more real (because she has mastery of more SMCs) than other persons (Seth this collection, p. 18). Interestingly, research on learning new SMCs tentatively suggests that it leads to the perception of new (more abstract) objects.

Under the lead of Peter König, cognitive scientists from Osnabrück have, in recent years, developed a compass belt that indicates to the person wearing it (while moving) changes in directions (cf. Kaspar et al. 2014). The aim of this project (called feelspace) is to study how perception in new sensory modalities can be enabled by sensory augmentation.23 The belt (see Figure 2) contains several vibrators, which always signal the direction of magnetic north. Subjects who wear the belt for a couple of weeks learn new SMCs, e.g., related to how the vibrating signals change when they turn around. A straightforward application of Seth’s PPSMCT suggests that the increased counterfactual richness simply goes along with an increased perceptual presence (for the belt, or the vibrations, or the hip / waist, etc). But the authors of the study cited report that perception changes in different ways:

    Initially the signal was predominantly perceived as tactile evolving to being perceived as location and direction information. Over time, the perception of tactile stimulation receded more and more into the background. Instead the subjects’ reports focused more on changes in spatial perception. Furthermore, two months after the end of belt wearing the effects subjects reported – at least in the FRS questionnaire – diminished. (Kaspar et al. 2014, p. 59)

What changes is not just that SMCs for tactile stimulation on the skin where the belt is worn are learnt, but that these are connected to more abstract information (regarding location and direction). This also makes sense in comparison with other sensory modalities. Knowledge of auditory SMCs, for instance, does not increase the perception of the inner ear. When the brain learns the relevant SMCs, it thereby learns about the hidden causes of signals in the inner ear. In fact, this may be another reason to believe that counterfactual richness goes along with phenomenal objecthood.

Figure 2: The figure shows two versions of the feelspace belt. (a) The original version used in Nagel et al. (2005). (b) The current version used in Kärcher et al. (2012) and Kaspar et al. (2014). Images used with kind permission of Peter König.

This also suggests that when someone is more familiar with an object, the object itself need not become more real, but its connections to other objects might. The causal network in which it is embedded becomes more real. Perhaps the subject also experiences more abstract objects (corresponding to higher-level invariants).

All in all, I hope the examples given illustrate the need to provide a conceptually clearer account of counterfactual richness, causal depth, and causal integration. For at the moment it seems that they are too entangled to allow us to assess their potential relevance for experienced objecthood or presence in a rigorous way. Furthermore, it will be crucial to investigate how phenomenal properties are affected when there are changes in these three features (e.g., when counterfactual richness or causal integration is increased or decreased in a controlled way in a study).

23 For more information on the project, see: http://feelspace.cogsci.uni-osnabrueck.de/

5 Conclusion

I have tried to show that useful analogies between PP accounts and classical ideas in the-

ory of science run deeper than portrayed in Seth’s target paper. Based on such analogies, I have argued that a proper treatment of active inference needs to be more sophisticated than Seth’s threefold distinction. In particular, Seth blurs a whole range of ways in which models can be falsified.

Furthermore, I have suggested that Seth’s predictive processing account of perceptual presence may profit from taking not just the counterfactual richness of generative models, but also their degree of perspective-dependence and their causal encapsulation into account (as mentioned above, this suggestion is inspired by Jakob Hohwy’s work). I have proposed a way in which examples of possible combinations of these features can be explored, which may serve as a useful tool for future research.

Thomas Kuhn (1962, p. 88) writes that “normal science usually holds creative philosophy at arm’s length, and probably for good reasons”. I thus hope that research on predictive processing and consciousness has not yet reached the phase of normal science, so that this commentary can still make a humble contribution.

Acknowledgments

I am grateful to two anonymous reviewers, and to Jennifer Windt and Thomas Metzinger especially for providing a vast number of comments and remarks, which helped tremendously in revising the first draft of this paper. This comment was written with support by a scholarship from the Barbara Wengeler foundation.

References

Ashby, W. R. (1947). Principles of the self-organizing dynamic system. The Journal of General Psychology, 37 (2), 125-128. 10.1080/00221309.1947.9918144
Baldi, P., & Itti, L. (2010). Of bits and wows: A Bayesian theory of surprise with applications to attention. Neural Networks, 23 (5), 649-666. 10.1016/j.neunet.2009.12.007
Blake, R. & Logothetis, N. K. (2002). Visual competition. Nature Reviews Neuroscience, 3 (1), 13-21.
Bongard, J., Zykov, V., & Lipson, H. (2006). Resilient machines through continuous self-modeling. Science, 314 (5802), 1118-1121. 10.1126/science.1133687
Botvinick, M. & Cohen, J. (1998). Rubber hands ‘feel’ touch that eyes see. Nature, 391 (6669), 756-756. 10.1038/35784
Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36 (3), 181-204. 10.1017/S0140525X12000477
Clark, A. (2015). Embodied Prediction. In T. Metzinger & J. M. Windt (Eds.) Open MIND (pp. 1-21). Frankfurt a. M., GER: MIND Group.
Feyerabend, P. (1962). Explanation, reduction and empiricism. In H. Feigl & G. Maxwell (Eds.) Scientific explanation, space, and time (pp. 28-97). Minneapolis, MN: University of Minnesota Press.
FitzGerald, T. H., Dolan, R. J., & Friston, K. J. (2014). Model averaging, optimal inference and habit formation. Frontiers in Human Neuroscience, 8 (457), 1-11. 10.3389/fnhum.2014.00457
Friston, K. J. (2009). The free-energy principle: a rough guide to the brain? Trends in Cognitive Sciences, 13 (7), 293-301. 10.1016/j.tics.2009.04.005
Friston, K. J. (2010). The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 11 (2), 127-138. 10.1038/nrn2787
Friston, K. J., Adams, R., Perrinet, L., & Breakspear, M. (2012). Perceptions as hypotheses: saccades as experiments. Frontiers in Psychology, 3 (151), 1-20. 10.3389/fpsyg.2012.00151
Froese, T. (2014). Steps toward an enactive account of synesthesia. Cognitive Neuroscience, 5 (2), 126-127. 10.1080/17588928.2014.905521
Godfrey-Smith, P. (2009). Darwinian populations and natural selection. Oxford, UK: Oxford University Press.
Gregory, R. L. (1980). Perceptions as hypotheses. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 290 (1038), 181-197. 10.1098/rstb.1980.0090


Hohwy, J. (2010). The hypothesis testing brain: some philosophical applications. In W. Christensen, E. Schier & J. Sutton (Eds.) Proceedings of the 9th Conference of the Australasian Society for Cognitive Science (pp. 135-144). Macquarie Centre for Cognitive Science. 10.5096/ASCS200922
Hohwy, J. (2012). Attention and conscious perception in the hypothesis testing brain. Frontiers in Psychology, 3 (96), 1-14. 10.3389/fpsyg.2012.00096
Hohwy, J. (2013). The predictive mind. Oxford, UK: Oxford University Press.
Hohwy, J. (2014). Elusive phenomenology, counterfactual awareness, and presence without mastery. Cognitive Neuroscience, 5 (2), 127-128. 10.1080/17588928.2014.906399
Hohwy, J., Roepstorff, A. & Friston, K. (2008). Predictive coding explains binocular rivalry: An epistemological review. Cognition, 108 (3), 687-701. 10.1016/j.cognition.2008.05.010
Itti, L. & Baldi, P. (2009). Bayesian surprise attracts human attention. Vision Research, 49 (10), 1295-1306. 10.1016/j.visres.2008.09.007
Kärcher, S. M., Fenzlaff, S., Hartmann, D., Nagel, S. K., & König, P. (2012). Sensory augmentation for the blind. Frontiers in Human Neuroscience, 6 (37), 1-15. Frontiers Media SA. 10.3389/fnhum.2012.00037
Kaspar, K., König, S., Schwandt, J., & König, P. (2014). The experience of new sensorimotor contingencies by sensory augmentation. Consciousness and Cognition, 28, 47-63. 10.1016/j.concog.2014.06.006
Kuhn, T. S. (1974). The structure of scientific revolutions. Chicago, IL: The University of Chicago Press.
LaBerge, S. & DeGracia, D. J. (2000). Varieties of lucid dreaming experience. In R. G. Kunzendorf & B. Wallace (Eds.) Individual differences in conscious experience (pp. 269-307). Amsterdam, NL: John Benjamins.
Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. In I. Lakatos & A. Musgrave (Eds.) Criticism and the growth of knowledge (pp. 91-196). Cambridge, UK: Cambridge University Press.
Madary, M. (2014). Perceptual presence without counterfactual richness. Cognitive Neuroscience, 5 (2), 131-132. 10.1080/17588928.2014.907257
Metzinger, T. K. (2013). Why are dreams interesting for philosophers? The example of minimal phenomenal selfhood, plus an agenda for future research. Frontiers in Psychology, 4 (746). 10.3389/fpsyg.2013.00746
Mroczko, A., Metzinger, T., Singer, W., & Nikolić, D. (2009). Immediate transfer of synesthesia to a novel inducer. Journal of Vision, 9 (12), 1-8. 10.1167/9.12.25
Nagel, S. K., Carl, C., Kringe, T., Märtin, R., & König, P. (2005). Beyond sensory substitution--learning the sixth sense. Journal of Neural Engineering, 2 (4), 13-26. 10.1088/1741-2560/2/4/R02
Nickles, T. (2014). Scientific revolutions. In E. N. Zalta (Ed.) The Stanford Encyclopedia of Philosophy. http://plato.stanford.edu/entries/scientific-revolutions/
Noë, A. (2006). Experience without the head. In T. S. Gendler & J. Hawthorne (Eds.) Perceptual experience (pp. 411-434). Oxford, UK: Oxford University Press.
Oberheim, E. & Hoyningen-Huene, P. (2013). The incommensurability of scientific theories. In E. N. Zalta (Ed.) The Stanford Encyclopedia of Philosophy. http://plato.stanford.edu/entries/incommensurability/
Pearl, J. (1988). Embracing causality in default reasoning. Artificial Intelligence, 35 (2), 259-271. 10.1016/0004-3702(88)90015-X
Popper, K. R. (2005[1934]). Logik der Forschung. Tübingen, GER: Mohr Siebeck.
Proust, J. (2013). Mental acts as natural kinds. In A. Clark, J. Kiverstein & T. Vierkant (Eds.) Decomposing the will (pp. 262-280). Oxford, UK: Oxford University Press.
Seth, A. K. (2014). A predictive processing theory of sensorimotor contingencies: Explaining the puzzle of perceptual presence and its absence in synesthesia. Cognitive Neuroscience, 5 (2), 97-118. 10.1080/17588928.2013.877880
Seth, A. K. (2015). The Cybernetic Bayesian Brain. In T. Metzinger & J. M. Windt (Eds.) Open MIND (pp. 1-25). Frankfurt a. M., GER: MIND Group.
Von Helmholtz, H. (1959). Die Tatsachen in der Wahrnehmung. Zählen und Messen. Darmstadt, GER: Wissenschaftliche Buchgesellschaft.
Waskan, J. (2008). Knowledge of counterfactual interventions through cognitive models of mechanisms. International Studies in Philosophy of Science, 22 (3), 259-275. 10.1080/02698590802567308
Windt, J. M. (2010). The immersive spatiotemporal hallucination model of dreaming. Phenomenology and the Cognitive Sciences, 9 (2), 295-316. 10.1007/s11097-010-9163-1
Wu, W. (2013). Mental action and the threat of automaticity. In A. Clark, J. Kiverstein & T. Vierkant (Eds.) Decomposing the will (pp. 244-261). Oxford, UK: Oxford University Press.

Inference to the Best Prediction
A Reply to Wanja Wiese

Anil K. Seth

Responding to Wanja Wiese’s incisive commentary, I first develop the analogy between predictive processing and scientific discovery. Active inference in the Bayesian brain turns out to be well characterized by abduction (inference to the best explanation), rather than by deduction or induction. Furthermore, the emphasis on control highlighted by cybernetics suggests that active inference can be a process of “inference to the best prediction”, leading to a distinction between “epistemic” and “instrumental” active inference. Secondly, on the relationship between perceptual presence and objecthood, I recognize a distinction between the “world revealing” presence of phenomenological objecthood, and the experience of “absence of presence” or “phenomenal unreality”. Here I propose that world-revealing presence (objecthood) depends on counterfactually rich predictive models that are necessarily hierarchically deep, whereas phenomenal unreality arises when active inference fails to unmix causes “in the world” from those that depend on the perceiver. Finally, I return to control-oriented active inference in the setting of interoception, where cybernetics and predictive processing are most closely connected.

Keywords
Abduction | Control-oriented active inference | Falsification | Objecthood | Presence

Author
Anil K. Seth
a.k.seth@sussex.ac.uk
University of Sussex
Brighton, United Kingdom

Commentator
Wanja Wiese
wawiese@uni-mainz.de
Johannes Gutenberg-Universität
Mainz, Germany

Editors
Thomas Metzinger
metzinger@uni-mainz.de
Johannes Gutenberg-Universität
Mainz, Germany

Jennifer M. Windt
jennifer.windt@monash.edu
Monash University
Melbourne, Australia

1 Introduction

It is a pleasure to respond to Wanja Wiese’s stimulating commentary (this collection), from which I learned a great deal. Much of what he says stands easily by itself, so here I select just a few key points which warrant further development in light of his analysis.

2 Active inference and hypothesis testing

A central claim in my target paper is that active inference, typically considered as the resolution of sensory prediction errors through action, should also (perhaps primarily) be considered as furnishing disruptive and/or disambiguatory evidence for perceptual hypotheses. This claim transparently calls on analogies with hypothesis testing in science (as well as on counterfactually-equipped generative models), and so invites comparisons with theoretical frameworks for scientific discovery, as Wiese nicely develops. In particular, Wiese notes that I do not “say much about what it takes to disconfirm or falsify a given hypothesis or model”, inviting me to “provide a refined treatment of the relation between falsification and active inference” (this collection, p. 2). This is what I shall attempt in this first section.
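
The notion of a disambiguatory action can be made concrete with a minimal sketch in Python. It is offered purely as an illustration under stated assumptions, not as a model taken from the target article or the commentary: two competing perceptual hypotheses, two candidate actions, and binary sensory outcomes, with each action scored by the expected reduction in uncertainty over the hypotheses. All names and numbers (`actions`, `expected_information_gain`, the likelihood tables) are hypothetical.

```python
import math

def posterior(prior, likelihoods, outcome):
    """Bayes' rule over hypotheses, given one observed outcome.
    likelihoods[h][o] is P(outcome o | hypothesis h) under a fixed action."""
    joint = [p * lik[outcome] for p, lik in zip(prior, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def expected_information_gain(prior, likelihoods):
    """Expected drop in uncertainty about the hypotheses after acting."""
    gain = 0.0
    for o in range(len(likelihoods[0])):
        # Probability of this outcome, averaged over the current hypotheses.
        p_o = sum(p * lik[o] for p, lik in zip(prior, likelihoods))
        if p_o > 0:
            gain += p_o * (entropy(prior) - entropy(posterior(prior, likelihoods, o)))
    return gain

# Hypothetical setup: two equally likely hypotheses, two candidate actions.
prior = [0.5, 0.5]
actions = {
    # Both hypotheses predict almost the same outcome: little to learn.
    "confirmatory": [[0.9, 0.1], [0.8, 0.2]],
    # The hypotheses predict opposite outcomes: the action tells them apart.
    "disambiguatory": [[0.9, 0.1], [0.1, 0.9]],
}

# Counterfactual predictions are used to pick the most informative action.
best = max(actions, key=lambda a: expected_information_gain(prior, actions[a]))
print(best)
```

On these assumed numbers the action whose predicted outcomes differ most between the hypotheses is the one selected, which is the sense in which counterfactual predictions can guide the choice of experiment.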
Seth, A. K. (2015). Inference to the Best Prediction - A Reply to Wanja Wiese.
In T. Metzinger & J. M. Windt (Eds). Open MIND: 35(R). Frankfurt am Main: MIND Group. doi: 10.15502/9783958570986 1|8
www.open-mind.net

2.1 The abductive brain

Wiese rightly says that a strict Popperian analogy for active inference is inappropriate since Popperian falsification relies on hypotheses that are derived deductively. Deductive inferences are necessary inferences, meaning that their falsification in turn falsifies the premises (theories) from which they derive. Active inference in the Bayesian brain is not deductive for two important reasons. First, as Wiese notes, Bayesian inference is inherently probabilistic so that competing hypotheses become more or less likely, rather than corroborated or falsified. Probabilistic weighting of hypotheses suggests a process of induction rather than deduction. Inductive inferences are non-necessary (i.e., they are not inevitable consequences of their premises) and are assessed by observation of outcome statistics, by analogy with classical statistical inference. Second, Bayesian reasoning pays attention not just to outcome frequencies but to properties of the explanation (hypothesis) itself, as captured by the slogan that (Bayesian) perception is the brain’s “best guess” of the causes of its sensory inputs. This indicates that the Bayesian brain is neither deductive nor inductive but abductive (Hohwy 2014), where abduction is typically understood as “inference to the best explanation”. In Bayesian inference, what makes a “best” explanation rests not only on outcome frequencies, but also on quantification of model complexity (models with fewer parameters are preferred), and by priors, likelihoods, as well as hyper-priors which may make some prior-likelihood combinations more preferable than others. Importantly, abductive (and inductive) processes are ampliative, meaning that they are capable of going beyond that which is logically entailed by their premises. This is important for the Bayesian brain, because the fecundity and complexity of the world (and body) requires a flexible and open-ended means of adaptive response.

So, the Bayesian brain is an abductive brain. But I would like to go further, recalling that active inference enables predictive control in addition to perception. This emphasis is particularly clear in the parallels with cybernetics and applications to interoception developed in the target article, where allostatic1 control of ‘essential variables’ is paramount, and where predictive models are recruited towards this goal (Conant & Ashby 1970; Seth 2013). In this light, active inference in the cybernetic Bayesian brain becomes a process of “inference to the best prediction”, where the “best” predictions are those which enable control and homeostasis under a broad repertoire of perturbations.2 It will be interesting to fully develop criteria for “best-making” in this control-oriented form of abductive inference.

1 Allostasis: the process of achieving homeostasis.
2 There is an interesting analogy here to the overlooked “perceptual control theory” of William T. Powers, which says that living things control their perceived environment by means of their behavior, so that perceptual variables are the targets of control (1973).

2.2 Sophisticated falsificationism, active inference, and model disambiguation

Where does this leave us with respect to theories of scientific discovery? Strict Popperian falsification was already discounted as an analogy for active inference. At the other extreme, parallels with Kuhnian paradigm shifts also seem inappropriate since these are not based on inference whether deductive, inductive, or abductive. Also, such shifts are typically unidirectional: having dispensed with the Copernican worldview once, we are unlikely to return to it in the future. These two points challenge Wiese’s analogy between paradigm shifts and perceptual transitions in bistable perception (see Wiese’s footnote 12, this collection, p. 9). What best survives in this analogy is an appeal to hierarchical inference, where changes in “paradigm” correspond to alternations between hierarchically deep predictions, each of which recruits more fine-grained predictions which themselves each explain only part of the ongoing sensorimotor flux, under the hyper-prior that perceptual scenes must be self-consistent (Hohwy et al. 2008).

Wiese himself seems to favour Lakatos’ interpretation of Popper, a “sophisticated falsificationism” where theories (perceptual hypotheses) can be modified rather than rejected outright, when predictions are not confirmed,


and where hypotheses are not tested in isolation (more on this later). As Wiese shows, sophisticated falsification fits well with some aspects of Bayesian inference, like model updating. According to Lakatos, core theoretical commitments can be protected from immediate falsification by introducing “auxiliary hypotheses” which account for otherwise incompatible data (1970). The key criterion, in the philosophy of science sense, is that these auxiliary hypotheses are progressive in virtue of making additional testable predictions, as opposed to degenerate, which is when the core commitments become less testable.3 This maps neatly to counterfactually-equipped active inference, where hierarchically deep predictive models spawn testable counterfactual sensorimotor predictions which are selected on the basis of precision expectations, and which lead to effective updating (rather than “falsification”) of perceptual hypotheses. As Wiese notes, a good example of this is given by Friston and colleagues’ model of saccadic eye movements (Friston et al. 2012). When it comes to model comparison, sophisticated falsification may even approximate some aspects of abductive inference: “Explaining away is another example of sophisticated falsification. Even when two or more models are compatible with the evidence … there can be reason to prefer one of them and reject the other” (Wiese this collection, p. 7). This strongly recalls Bayesian model comparison and “inference to the best explanation”, if not its control-oriented “inference to the best prediction” form.

One important clarification is needed about Wiese’s interpretation of model comparison, highlighting the critical roles of action and counterfactual processing. Wiese rightly emphasizes the important insight of Popper and Lakatos that hypotheses are never tested in isolation, mandating a process of comparison among competing models or hypotheses. However, he implies a sequential testing of each hypothesis: “balloons being launched and then shot down, one by one” (see Wiese this collection, p. 6). This is quite different from the interpretation of model comparison pursued in my target article, where multiple models are considered in parallel, and where counterfactual predictions are leveraged to select the action (or experiment) most likely to disambiguate competing models. In Bayesian terms this is reflected in a shift towards model comparison and averaging (FitzGerald et al. 2014; Rosa et al. 2012), as compared to inference and learning on a single model. Bongard and colleagues’ evolutionary robotics example was selected precisely because it illustrates this point so well (Bongard et al. 2006). Here, repeated cycles of model selection and refinement lead to the prescription of novel actions that best disambiguate the current best models (note the plural). Indeed, it is the repeated refinement of disambiguatory actions that gives Bongard’s starfish robot its compelling “motor babbling” appearance. To reiterate: different actions may be specified when the objective is to disambiguate multiple models in parallel, as compared to testing models one at a time. In the setting of the cybernetic Bayesian brain this example is important for two reasons: it underlines the importance of counterfactual processing (to drive the selection of disambiguatory actions) and it emphasizes that predictive modelling can be seen as a means of control in addition to discovery, explanation, or representation. In this sense it doesn’t matter how accurate the starfish self-model is – what matters is whether it works.

2.3 Science as control or science as discovery?
3 An important application of this idea is to the Bayesian brain itself The distinction between explanation and control
as a scientific hypothesis. A concern about the Bayesian brain hypo-
thesis is that it can be insulated from falsification by postulating returns us to the philosophy of science. Put
convenient (typically unobservable) priors, much like adaptationist simply, the views of Popper, Lakatos, and (less
explanations in evolutionary biology can be critiqued as “just so”
stories. The key question, not answered here, is whether neural so) Kuhn, are concerned with how science re-
mechanisms implement (approximations to) Bayesian inference, or veals truths about the world, and how falsifica-
whether Bayesian concepts merely provide a useful interpretative
framework. In the former case one would require the Bayesian brain
tion of testable predictions participates in this
hypothesis to be progressive not degenerate. process. Picking up the threads of abduction,
control-oriented active inference, and “inference to the best prediction”, we encounter the possibility that theories of scientific discovery might themselves appear differently when considered from the perspective of control. Historically, it is easy to see the narrative of science as a struggle to gain increasing control over the environment (and over people), rather than a process guided by the lights of increasing knowledge and understanding.⁴ A proper exploration of this territory moves well beyond the present scope (see e.g., Glazebrook 2013). In any case, whether or not this perspective helps elucidate scientific practice, it certainly suggests important limits in how far analogies can be taken between philosophies of scientific discovery and the cybernetic Bayesian brain.

4 The continually increasing pressure to justify research in terms of “impact” – especially when seeking funding – highlights one way in which an emphasis on control (rather than discovery) is realized in scientific practice.

3 Perceptual presence and counterfactual richness

The second part of Wiese’s commentary picks up on the issue of perceptual presence, which in my target article was associated with the “richness” of counterfactual sensorimotor predictions (see also Seth 2014, 2015b). Wiese makes a number of connected points. First, he rightly notes an ambiguity between objecthood and presence in perceptual phenomenology, as presented in my target article (Seth this collection) and in Seth (2014). Second, he introduces the notion of causal encapsulation as a third phenomenological dimension, complementing counterfactual richness and perspective dependence. He spends some time developing examples based on cognitive phenomenology and mental action to illustrate how these dimensions might relate. Here, I will focus on the relationship between presence and objecthood from the perspective of counterfactual predictive processing – or more specifically the theory of “Predictive Processing of SensoriMotor Contingencies” (PPSMC; Seth 2014, 2015b).⁵

5 See also my response (Seth 2015b) to commentaries on Seth (2014), which focuses on this issue.

3.1 Presence and objecthood together

As Wiese notes, when visually perceiving a real tomato (Figure 1A) there is both a sense of presence (the subjective sense of reality of the tomato) and of objecthood (the perception that a (real) object is the cause of sensations). Importantly, while distinct, these properties are not independent. There is a “world-revealing” dimension to perceptual presence which is closely aligned with the experience of an externally-existing object: “How can it be true … that we are perceptually aware, when we look at a tomato, of the parts of the tomato which, strictly speaking, we do not perceive. This is the puzzle of perceptual presence” (Noë 2006, p. 414).

Figure 1: A. An image of a tomato. B. An image of a clear blue sky.

How does this object-related world-revealing presence come about? In predictive processing (and by extension PPSMC), objecthood depends on predictive models encoding hierarchically deep invariances that accommodate complex nonlinear mappings from (object-related, world-revealing) hidden causes to sensory signals (Clark 2013; Hohwy 2013). There is a reciprocal dependency here between hierarchical depth and counterfactual richness, because (i) hierarchically deep invariances in generative models enable precise predictions about rich repertoires of counterfactual sensorimotor mappings, and (ii) counterfactual richness can scaffold the acquisition of hierarchically deep invariant predictions. One might even say that hierarchically deep invariances are partly constituted by (possibly latent) predictions of counterfactually rich sensorimotor mappings (Seth 2015b). These dependencies indicate that objecthood and world-revealing presence depend on expectations about counterfactual richness, rather than counterfactual richness per se. Altogether, counterfactually-informed active inference enables the extraction and encoding of hierarchically deep hidden causes of sensory signals. In virtue of hierarchical depth, these inferred causes will also be perspective invariant, in the sense that they will have been separated from those causes that depend on actions (or other properties) of the perceiver (see Wiese this collection, p. 11). In short, to the extent that objecthood and perceptual presence go together, so do hierarchical depth (encoding world-revealing invariances) and (expected) counterfactual richness.

3.2 Presence and objecthood apart

So far so good, but it is evident that presence and objecthood do not always go together (Di Paolo 2014; Froese 2014; Madary 2014), a phenomenological fact which requires further analysis (Seth 2015b). Presence without objecthood is exemplified in vision by the experience of a uniform deep blue sky (Figure 1B), and is also characteristic of non-visual modalities like olfaction (Madary 2014). The visual impression of a blue sky, or the tang of briny sea air, both seem perceptually present but without eliciting any specific phenomenology of objecthood. At the same time, the corresponding predictive models are likely to be hierarchically shallow and counterfactually poor: there is not much I can do (besides closing my eyes or looking away) to alter the sensory input evoking a blue-sky experience, and the inferred hidden causes are unlikely to lie behind multiple inferential layers. Hierarchical shallowness may explain the lack of phenomenal objecthood, but why isn’t there also a lack of perceptual presence?

Blue-sky experiences (and olfactory scenes) actually do lack the world-revealing presence associated with objecthood. But they do not appear phenomenally unreal in the sense that perceptual afterimages and synaesthetic concurrents are experienced as unreal. In PPSMC, phenomenal unreality can arise from an inferential failure to separate hidden causes in the world from those that depend on actions (or other properties) of the perceiver (Seth 2015b). This in turn emerges from violations of counterfactual predictions. For example, consider how saccadic eye movements engage counterfactual predictions. Perceptual afterimages track eye movements, violating counterfactual predictions associated with world-revealing hidden causes that rest on active inference. In contrast, counterfactual predictions associated with blue skies are less amenable to disconfirmation by eye movements, so (non-object-related) perceptual presence remains.⁶

6 Phenomenal unreality on this story corresponds to a loss of “transparency” as described by Metzinger (2003). For Metzinger, transparency is lost – and phenomenal unrealness results – when the “construction process” underlying perception becomes available for attentional processing. This maps neatly onto a failure to inferentially unmix world-related from perceiver-related hidden causes – see Seth (2015b) for more on this.

Summarizing, perceptual presence, as an explanatory target, can be refined into (i) a world-revealing presence associated with objecthood and hierarchical depth, and (ii) a phenomenal unreality arising from a failure to inferentially separate hidden causes in the world from those associated with the perceiver. Both rely on counterfactual processing, and so both call on active inference. Perspective invariance is also implicated in objecthood (through hierarchical depth) and phenomenal unreality (through isolating worldly causes), suggesting that this dimension may not be as separable from counterfactual richness as proposed by Wiese (this collection, p. 13). But is that all there is to presence?

3.3 Causal encapsulation and embodiment

Wiese distinguishes three dimensions to perceptual presence: counterfactual richness (vs. poverty), perspective invariance (vs. dependence), and causal encapsulation (vs. integration). The third of these, causal encapsulation, is perhaps the hardest to pin down. The idea, as I understand it, is that a representation (predictive model) is causally encapsulated if it is inferentially isolated from other hidden causes; by contrast it is causally open or integrated if it expresses a rich set of relations to other inferred
causes. So, a predictive model underlying the experience of a tomato may be causally integrated with that underlying the experience of the table on which it lies, and the hand (maybe my hand), which is poised to reach out and pick it up. Here, there may be a relation between causal encapsulation/integration and the inferential unmixing of perceiver-related and world-related hidden causes: a failure to separate these causes would presumably prevent rich causal integration with other hidden causes in the world.

The concept of causal encapsulation highlights another interesting aspect of Wiese’s commentary: the idea that counterfactual predictions may not always encode sensorimotor contingencies: “it might be equally relevant to encode how sensory signals pertaining to the tomato would change if the wind were to blow … or if the tomato were to fall down” (Wiese this collection, p. 11). While such extra-personal causal contingencies may be salient in many cases, I see them as secondary to sensorimotor body-related counterfactual predictions. By definition they do not involve active inference: I have to wait for the wind to change direction (though perhaps I might move to get a better view). This means that many central features of active inference discussed here – its relation to predictive control, homeostasis, and counterfactually-informed model disambiguation – do not apply.

The body re-emerges here as central, this time as a ground for the generation of counterfactual predictions. Specifically, bodily constraints shape counterfactual predictions since they place limits on how actions can be deployed in intervening upon the (inferred) causes of sensory input. This suggests that changing action repertoires would alter experiences of presence. Wiese raises out-of-body experiences and dream experiences as a relevant context (this collection, p. 15), where subjects sometimes identify their first-person perspective, not with a body, but with an unextended point in space. I agree with him that examining world-revealing presence in these situations would be fascinating, if extremely difficult in practice.

The body is of course not only a source of counterfactual predictions, but also the target of counterfactually-informed active inference, both for representation (exemplified by the rubber-hand illusion, as mentioned by Wiese) and for control.⁷ As emphasized in the target article, control-oriented active inference is particularly significant for interoception, where predictive modelling is geared towards allostasis and homeostasis rather than accurate representation (see also Seth 2013). Returning the focus to interoceptive inference raises a host of intriguing questions, which can only be gestured at here. One may straightaway wonder how counterfactual aspects of interoceptive inference shape the “presence” of emotional and body-related experiences. Is it possible to have an emotional experience lacking in “affective presence” – and what is the phenomenological correlate of “objecthood” for interoceptive experience? Other interesting questions are how precision weighting sets the balance between representation versus control in active interoceptive inference, and what it means to isolate “worldly” causes when both the means and the targets of active inference are realized in the body. These are not just theoretical questions: advances in virtual reality (Suzuki et al. 2013) and in methods for measuring interoceptive signals (Hallin & Wu 1998) promise real empirical progress on these issues.

7 Wiese, when discussing König’s FeelSpace project (Kaspar et al. 2014), interprets PPSMC as saying that increased practice with the FeelSpace compass belt – and hence increased counterfactual richness – would lead to “increased perceptual presence (for the belt, or the vibrations, or the hip/waist, etc.)” (Wiese this collection, p. 17). I see things differently. The counterfactual predictions, while mediated by the belt, relate to hidden causes in the world (e.g., magnetic north). In fact, PPSMC says that FeelSpace practice would lead to hierarchically deep and counterfactually rich models of how “magnetic north” impacts on belt vibrations and the like, leading to increased world-revealing presence for these worldly causes but diminished perceptual presence of the tactile stimulation itself. Still, the FeelSpace project certainly provides a fertile empirical testbed for the ideas raised here.

4 Conclusions

This response has been shaped by Wiese’s perspicuous focus on the philosophy of science and on the phenomenology of perceptual presence. My response to the first topic was to frame the Bayesian brain in terms of control-oriented abduction, where falsification is replaced by “inference to the best prediction” as a criterion for progress. I also reinforced the dependency between active inference and counterfactual processing, which underpins the important case of disambiguatory active inference in Bayesian model comparison. With respect to perceptual presence I proposed a distinction between world-revealing presence and phenomenal unreality (Seth 2015b). World-revealing presence corresponds to objecthood and is associated with hierarchical depth, expected counterfactual richness, and perspective invariance of perceptual hypotheses. Phenomenal unreality transpires when perceptual inference fails to unmix world-related from perceiver-related causes; this corresponds to a loss of “phenomenal transparency” (Metzinger 2003) and depends on violation of counterfactual sensorimotor predictions.

Space constraints prevented me from considering Wiese’s discussion of the “presence” of cognitive phenomenology, like abstract mathematical and philosophical thinking, in these terms. There is of course a rich literature linking such phenomena to the body (Lakoff & Nunez 2001), and hence perhaps to active inference where the concept of a “mental action” becomes critical (O’Brien & Soteriou 2009). Space constraints also prevented Wiese from elaborating on interoception, which I consider the most interesting setting for control-oriented active inference, in virtue of the cybernetics-inspired emphasis on homeostasis and allostasis. Interesting questions emerge here about how counterfactual processing plays into the phenomenology of interoceptive experience.

Cognitive scientists have long argued for a continuity between perception and action (Dewey 1896). To close, I suggest thinking instead of a continuum between epistemic and instrumental active inference. This is simply the idea that active inference – a continuous process involving both perception and action – can be deployed with an emphasis on predictive control (instrumental), or on revealing the causes of sensory signals (epistemic). This process intertwines interoception, proprioception, and exteroception, and autonomic and motoric action, with the balance always delicately orchestrated by precision optimisation and counterfactual processing. Putting things this way provides a new way to link “life” and “mind” (Godfrey-Smith 1996) and may help reveal the biological imperatives underlying perception, emotion, and selfhood.

Acknowledgements

I am grateful to the Dr. Mortimer and Theresa Sackler Foundation, which supports the work of the Sackler Centre for Consciousness Science. Many thanks to Thomas Metzinger, Jennifer Windt and the MIND group for inviting me to participate in this project, to Jakob Hohwy and Karl Friston for correspondence about abductive inference, and to Wanja Wiese for his excellent commentary.
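
Illustrative sketch: disambiguatory action selection

The contrast drawn in this reply between testing hypotheses one at a time and choosing the action that best disambiguates several models held in parallel can be made concrete with a toy calculation. What follows is a minimal sketch under stated assumptions, not the procedure of Bongard et al. (2006) nor that of the target article. Expected information gain over models is used here as one possible reading of "most likely to disambiguate", and the three models, the two actions, and every probability are hypothetical values chosen only for illustration.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def posterior(prior, likelihoods, outcome):
    """Bayes' rule over models after observing one outcome.

    likelihoods[m][outcome] is P(outcome | model m) for the action taken.
    """
    joint = [prior[m] * likelihoods[m][outcome] for m in range(len(prior))]
    evidence = sum(joint)
    return [j / evidence for j in joint]


def expected_information_gain(prior, likelihoods):
    """Expected reduction in uncertainty about the models for one action."""
    gain = 0.0
    for outcome in range(len(likelihoods[0])):
        # Probability of this outcome, averaged over all models in parallel.
        p_outcome = sum(prior[m] * likelihoods[m][outcome]
                        for m in range(len(prior)))
        if p_outcome > 0:
            post = posterior(prior, likelihoods, outcome)
            gain += p_outcome * (entropy(prior) - entropy(post))
    return gain


# Hypothetical setup: three competing models, equally plausible a priori.
prior = [1 / 3, 1 / 3, 1 / 3]

# For each action, one row per model giving P(outcome | model, action).
actions = {
    # All models predict nearly the same outcome: little to learn.
    "confirmatory": [[0.9, 0.1], [0.9, 0.1], [0.8, 0.2]],
    # The models disagree strongly about the outcome: much to learn.
    "disambiguatory": [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]],
}

gains = {name: expected_information_gain(prior, lik)
         for name, lik in actions.items()}
best_action = max(gains, key=gains.get)
print(best_action, gains)
```

In this toy setting the preferred action is the one on which the candidate models disagree most, which is not necessarily the action that would best confirm any single model taken alone.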

References

Bongard, J., Zykov, V. & Lipson, H. (2006). Resilient machines through continuous self-modeling. Science, 314 (5802), 1118-1121. doi: 10.1126/science.1133687
Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioural and Brain Sciences, 36 (3), 181-204. doi: 10.1017/S0140525X12000477
Conant, R. & Ashby, W. R. (1970). Every good regulator of a system must be a model of that system. International Journal of Systems Science, 1 (2), 89-97.
Dewey, J. (1896). The reflex arc concept in psychology. Psychological Review, 3, 357-370.
Di Paolo, E. A. (2014). The worldly constituents of perceptual presence. Frontiers in Psychology, 5. doi: 10.3389/fpsyg.2014.00450
FitzGerald, T. H., Dolan, R. J. & Friston, K. J. (2014). Model averaging, optimal inference, and habit formation. Frontiers in Human Neuroscience, 8. doi: 10.3389/fnhum.2014.00457
Friston, K. J., Adams, R. A., Perrinet, L. & Breakspear, M. (2012). Perceptions as hypotheses: Saccades as experiments. Frontiers in Psychology, 3. doi: 10.3389/fpsyg.2012.00151
Froese, T. (2014). Steps toward an enactive account of synesthesia. Cognitive Neuroscience, 5 (2), 126-127. doi: 10.1080/17588928.2014.905521
Glazebrook, T. (Ed.) (2013). Heidegger on science. New York, NY: State University of New York Press.
Godfrey-Smith, P. G. (1996). Spencer and Dewey on life and mind. In M. Boden (Ed.) The philosophy of artificial life (pp. 314-331). Oxford, UK: Oxford University Press.
Hallin, R. G. & Wu, G. (1998). Protocol for microneurography with concentric needle electrodes. Brain Research Protocols, 2 (2), 120-132.
Hohwy, J. (2013). The predictive mind. Oxford, UK: Oxford University Press.
Hohwy, J. (2014). The self-evidencing brain. Noûs. doi: 10.1111/nous.12062
Hohwy, J., Roepstorff, A. & Friston, K. (2008). Predictive coding explains binocular rivalry: An epistemological review. Cognition, 108 (3), 687-701. doi: 10.1016/j.cognition.2008.05.010
Kaspar, K., König, S., Schwandt, J. & König, P. (2014). The experience of new sensorimotor contingencies by sensory augmentation. Consciousness and Cognition, 28. doi: 10.1016/j.concog.2014.06.006
Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. In I. Lakatos & A. Musgrave (Eds.) Criticism and the growth of knowledge (pp. 91-196). Cambridge, UK: Cambridge University Press.
Lakoff, G. & Nunez, R. (2001). Where mathematics comes from: How the embodied mind brings mathematics into being. New York, NY: Basic Books.
Madary, M. (2014). Perceptual presence without counterfactual richness. Cognitive Neuroscience, 5 (2), 131-133. doi: 10.1080/17588928.2014.907257
Metzinger, T. (2003). Phenomenal transparency and cognitive self-reference. Phenomenology and the Cognitive Sciences, 2, 353-393.
Noë, A. (2006). Experience without the head. In T. Gendler & A. Hawthorne (Eds.) Perceptual experience (pp. 411-434). New York, NY: Clarendon / Oxford University Press.
O’Brien, L. & Soteriou, M. (Eds.) (2009). Mental actions. Oxford, UK: Oxford University Press.
Powers, W. T. (1973). Behavior: The control of perception. Hawthorne, NY: Aldine de Gruyter.
Rosa, M. J., Friston, K. J. & Penny, W. (2012). Post-hoc selection of dynamic causal models. Journal of Neuroscience Methods, 208 (1), 66-78. doi: 10.1016/j.jneumeth.2012.04.013
Seth, A. K. (2013). Interoceptive inference, emotion, and the embodied self. Trends in Cognitive Sciences, 17 (11), 565-573. doi: 10.1016/j.tics.2013.09.007
Seth, A. K. (2014). A predictive processing theory of sensorimotor contingencies: Explaining the puzzle of perceptual presence and its absence in synesthesia. Cognitive Neuroscience, 5 (2), 97-118. doi: 10.1080/17588928.2013.877880
Seth, A. K. (2015). The cybernetic Bayesian brain. In T. Metzinger & J. M. Windt (Eds.) Open MIND. Frankfurt a. M., GER: MIND Group.
Seth, A. K. (2015b). Presence, objecthood, and the phenomenology of predictive perception. Cognitive Neuroscience.
Suzuki, K., Garfinkel, S. N., Critchley, H. D. & Seth, A. K. (2013). Multisensory integration across exteroceptive and interoceptive domains modulates self-experience in the rubber-hand illusion. Neuropsychologia, 51 (13), 2909-2917. doi: 10.1016/j.neuropsychologia.2013.08.014
Wiese, W. (2015). Perceptual presence in the Kuhnian-Popperian Bayesian brain. In T. Metzinger & J. M. Windt (Eds.) Open MIND. Frankfurt a. M., GER: MIND Group.
