Intelligent Machines
Introduction
[Margin note: In The Imaginary Signifier (1982), Christian Metz coined the term 'Scopic Regime'.]
[Image: A visualisation of Kevin Durant's shots created by SportVU, a company whose player tracking technology originates in an IDF rocket programme.]
tation from pure imagination, the photo demands a physical object so it might
re-present it. Barthes went so far as to claim that analogue photography is akin to
'a message without a code', that is, raw, or purely legible, information.5
The digital image, on the other hand, is all code. It is often described as
a numeric representation (normally binary) of a two-dimensional image, but
behind such a definition lies a much more ambivalent reality. In fact, the raw data
collected by the electro-optic sensor isn't a representation of an image. On the
contrary, the image is just one possible output converted from the raw data.
An electro-optic sensor (such as a standard CCD) consists of a thin silicon wafer
divided into millions of light-sensitive squares (photosites). Each photosite corre-
sponds to an individual pixel in the final image. The sensor turns light (photons)
into electrons (charge). Once exposed, the electrons at each photosite are passed
to a charge-sensing node, amplified, and relayed to readout electronics to be digi-
tised and sent to the computer. The output voltage is consequently converted to
a digital signal rendered into binary code. This code is then, finally, converted
into an image.
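As a schematic illustration only, here is a minimal Python/NumPy sketch of that pipeline; the parameters (full_well, gain, bit_depth) and the Poisson photon model are invented for the example and describe no particular sensor.

```python
import numpy as np

def sensor_readout(photons, full_well=50_000, gain=2.0, bit_depth=12):
    """Toy model of the pipeline described above: photons -> electrons
    (charge) -> amplified voltage -> digitised binary code. What comes
    out is raw numeric data, not yet an image."""
    electrons = np.clip(photons, 0, full_well)   # photosites collect charge, then saturate
    voltage = electrons * gain                   # charge-sensing node and amplifier
    max_code = 2 ** bit_depth - 1                # range of the analogue-to-digital converter
    codes = np.round(voltage / (full_well * gain) * max_code)
    return codes.astype(np.uint16)

# Simulated photon counts for a 480 x 640 grid of photosites.
photons = np.random.poisson(lam=10_000, size=(480, 640))
codes = sensor_readout(photons)

# Only at this last step is the code converted into an 8-bit greyscale image.
image = (codes / codes.max() * 255).astype(np.uint8)
```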
By the same token, one could have programmed a device that would render
optical input (light) into sonic or textual output via code. In other words, if we must
decide upon the primacy of a medium, the principal mode which lies at the heart
of the electro-optic logic, it would be code, or numeric data, which would gain
prominence. The digital image is a two-dimensional representation of numeric
data, rather than the other way around. In this sense the digital image is symp-
tomatic of the current Scopic Regime, for within this entire regime the image is
subjected to the logic of code and computation.
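A minimal sketch of this point, with mappings chosen purely for illustration: the same array of readout values can be rendered as an image, as audio amplitudes, or as text, and nothing in the numbers themselves privileges one output over another.

```python
import numpy as np

# Stand-in for raw 12-bit readout codes from a sensor.
codes = np.random.randint(0, 4096, size=48_000)

# Rendered as an image: reshaped into a 2-D grid of grey values.
image = (codes[:300 * 160].reshape(300, 160) / 4095 * 255).astype(np.uint8)

# Rendered as sound: the same numbers mapped to audio amplitudes in [-1, 1].
audio = codes / 2048.0 - 1.0

# Rendered as text: the same numbers mapped onto printable ASCII characters.
text = "".join(chr(32 + int(c) % 95) for c in codes[:80])
```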
The task of seeing and targeting the bomber was transferred to an automated
algorithmic process, one which was based on statistical patterns extracted from
a database, a history. The Debomber was never completed; it was premature:
digital memory of sufficient capacity was not yet invented, and computers too
were barely more than an idea. But its historical implications have been extensive
both in the field of cybernetics, a science founded by Wiener, and in the development
of the first computers at Princeton University, where Bigelow joined the
Electronic Computer Project. Vision would never be the same.
Several key concepts first developed by Wiener and Bigelow laid the foun-
dation for computer vision, artificial intelligence and more generally for the
field of cybernetics. Above all else, Wiener and Bigelow's identification of noise
with entropy, and of information with the negation of entropy, or negentropy, would
be key to developing a method for separating or filtering signal from noise by way
of a statistical analysis of a large database.7 Furthermore, in parallel to his futile
work on the anti-aircraft gunner, Wiener did succeed in developing a noise-reduc-
ing filter. This filter, aptly named the Wiener filter, is used to this day to reduce
sonic or visual noise. Ever since, numerous filters and formats have been invented,
such as the compression codecs JPEG and MP3, which are all based upon the same
principle: the statistical separation of signal from noise.
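As an illustration, a minimal sketch using SciPy's implementation of the Wiener filter; the sine-wave signal and the noise level are arbitrary stand-ins.

```python
import numpy as np
from scipy.signal import wiener

# A clean signal buried in Gaussian noise.
t = np.linspace(0, 1, 2_000)
clean = np.sin(2 * np.pi * 5 * t)
noisy = clean + np.random.normal(scale=0.5, size=t.shape)

# The Wiener filter estimates the local mean and variance in a sliding
# window and suppresses what is statistically likely to be noise.
denoised = wiener(noisy, mysize=29)
```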
Here we first encounter the principle which is to be so essential to the current
Scopic Regime: it is a regime that transcends the visual realm, for in the beginning,
before the image, was code. In other words, what distinguishes our current
dominant mode of vision is the way by which the image is subjected to an exterior
logic; a logic which one could translate into a myriad of modes of expression.
As Peter Galison has pointed out, the development of the Debomber intro-
duced an 'ontology of the enemy'. Wiener argued that human behaviour could
be mathematically modelled and predicted, particularly under stress, thereby
articulating a new belief that both machines and humans could speak the same
language of mathematics.8 According to Galison, the servomechanical enemy
would become for cybernetic vision the prototype for human physiology and,
ultimately, for all of human nature.9 Thus, Wiener would go so far as to claim that
as objects of scientific inquiry, humans do not differ from machines.10 But this
moment also engendered a new understanding of vision itself, becoming, as Orit
Halpern suggests, 'a material artefact, an algorithm capable of actions and
decisions, such as identifying a prey or an enemy'.11
lations, one could simply churn through a large set of random simulations one
after the other and observe the results (wins, in the case of Solitaire). The larger
the set of simulations, the more accurate the inducible probabilistic distribution
becomes. Of course crunching numbers in magnitudes of this order was impos-
sible without the invention of the computer. Ulam was one of the first people to
realise the dramatic implications of exponentially accelerating computational
power. But the Monte Carlo method foregrounds another important aspect of
computer vision, namely, its indifference towards the distinction between simu-
lated and sensory data, that is, between data that is collected and simulations
that are generated.
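The pattern itself is simple to sketch. The toy game below (a 'win' when a shuffled deck leaves no card in its original position) is only a stand-in for Ulam's far more involved solitaire; the point is the Monte Carlo logic of churning through random trials and observing the frequency of wins.

```python
import random

def one_trial(deck_size=52):
    """Shuffle a deck; call it a win if no card lands in its original slot."""
    deck = list(range(deck_size))
    random.shuffle(deck)
    return all(card != pos for pos, card in enumerate(deck))

def estimate_win_probability(trials):
    """The larger the set of simulations, the more accurate the estimate."""
    wins = sum(one_trial() for _ in range(trials))
    return wins / trials

for n in (100, 10_000, 1_000_000):
    print(n, estimate_win_probability(n))
```

For this particular toy game the estimate converges on 1/e, roughly 0.37, as the number of trials grows; what matters here is only that the accuracy improves with the size of the simulated set.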
Benoit Mandelbrot, an important pioneer in the use of computer generated
imagery (CGI), has written about his contribution to the development of mathe-
matical thinking through image analysis:
In other words, computational vision is, to a great extent, a vision that constantly
spills from virtual to the real; both are perpetually re-touched or re-modeled to
fit each other, creating a constant feedback loop. For instance, a fashion model's [...]
the current Scopic Regime content is rendered nearly superfluous; topological
analysis of metadata is more than enough. In the case of phone calls, the people you called,
[Margin note: whether they would be able to visit works by Mark Lombardi then on
display. The artist, who committed suicide in 2000, for years created intricate diagrams
mapping power relations between public and private interests.]
Shibboleth
The most common test used today to distinguish between a bot (a software
application that runs automated tasks over the Internet) and a human user is the
Completely Automated Public Turing test to tell Computers and Humans Apart
(CAPTCHA). Unlike the original Turing test, invented by Alan Turing with the
aim of discerning whether a machine has reached human-like levels of artificial
intelligence, a CAPTCHA, rather ironically, relies upon a machine to judge
whether the user is a human. The way this is done is rather simple, and one which
we encounter on a daily basis: the computer generates an image which contains a
set of blurry, smudgy characters (usually a combination of letters and numbers)
and we, as users, are set the task of recognising or deciphering the vague message.
For humans, the inverted Turing test, as inconvenient as it may be, is still quite a
simple task. For your average bot on the other hand, solving such a riddle would
amount to a great intellectual feat. Human vision and recognition are here used as
a shibboleth of sorts, a means of differentiation between man and machine. In
an ironic turn of events, the machine thus becomes the measure of all things. The
images that appear in CAPTCHA tests are produced in such a way as to sabotage
the possibility of automated decryption, or Optical Character Recognition (OCR).
To do so, one need only blur or colour the background, skew the text so that it
isn't horizontal, or distort the characters and their spacing randomly. Apparently,
even though computers are capable of executing billions of calculations in a
matter of seconds, they're quite limited when it comes to seeing.
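A minimal sketch of such sabotage, assuming the Pillow imaging library (the essay names no specific tools): a speckled background, per-character rotation and jitter so the text is no longer horizontal, irregular spacing, and a final blur.

```python
import random
import string
from PIL import Image, ImageDraw, ImageFilter, ImageFont

def make_captcha(text=None, size=(200, 70)):
    """Render a short string so it stays legible to humans but resists
    off-the-shelf Optical Character Recognition."""
    text = text or "".join(random.choices(string.ascii_uppercase + string.digits, k=5))
    img = Image.new("RGB", size, "white")
    draw = ImageDraw.Draw(img)

    # Speckled background to defeat simple thresholding.
    for _ in range(400):
        xy = (random.randrange(size[0]), random.randrange(size[1]))
        draw.point(xy, fill=(random.randrange(256),) * 3)

    font = ImageFont.load_default()
    x = 10
    for ch in text:
        # Draw each character on its own transparent tile, rotate it and
        # paste it with vertical jitter so the text is not horizontal.
        tile = Image.new("RGBA", (30, 40), (0, 0, 0, 0))
        ImageDraw.Draw(tile).text((5, 5), ch, font=font, fill="black")
        tile = tile.rotate(random.uniform(-30, 30), expand=True)
        img.paste(tile, (x, random.randint(5, 25)), tile)
        x += random.randint(20, 35)  # irregular spacing

    # A light blur smudges the edges OCR relies on.
    return img.filter(ImageFilter.GaussianBlur(radius=1)), text

image, answer = make_captcha()
image.save("captcha.png")
```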
And yet, this gap between visual understanding and data analysis is slowly
closing, not because supercomputers have suddenly reached human levels of
intelligence and are able to recognise images and semantically categorise them,
but rather because during the past five decades we have consistently made an
effort to approach the world of computers, to make our world more comprehensi-
ble to them. Computers are essentially blind to images; they see only thanks to
external devices, like data about data. Consequently, a computer is able to visual-
ise information in the same way that a blind person navigates the city with a white
cane. In other words, we might say that the computer is also able to see blindly.
Social networks enable computers to translate and map human intersub-
jective relationships, to classify facial characteristics and quantify our wants and
opinions. So-called smart devices render our everyday life into troves of data:
medical, social and geographic. In the coming decades we can expect more and
more sensors to mine sources of data from archives which are currently growing
exponentially. This is the logic behind the vision of an internet of things, which
advertising companies across the globe are fiercely promoting. The more parts
of our lives we make intelligible and quantifiable to computational machines, the
more we are prone to subject ourselves to a logic and epistemology designed for
their benefit.
It's essential we keep in mind that computer vision, too, is just a perspective,
one which stems out of a concrete historical context, with its own set of biases,
preconceptions and blind spots. Over the course of this essay several of these were
identified: first was the preference for quantity over quality, which is based upon the
probabilistic nature of data analysis. This, as we have seen, is manifested in many
different fields, from the underlying logic of noise-reducing algorithms, to the
architecture of surveillance programmes. The primacy of code over all other modes
of expression, and consequently the negation of indexicality, is another such bias.
The rise of post-production, computer-generated imagery and physical model-
ling has economic as well as aesthetic and epistemological implications.
Finally, the pervasiveness of metadata informs the way knowledge is mapped
and extracted, and the way people are surveilled. As these different examples all demonstrate,
what is at stake is essentially a new mode of governance and control, as well as a
novel way of seeing, and consequently, representing.