The Big Dangers of 'Big Data' (Opinion)

The White House report presents big data as an analytically powerful set of techniques. It says the
social and economic value created by big data should be balanced against "privacy and other core
values of fairness, equity and autonomy."
But the White House effort to balance the costs and benefits of big data misses the bigger picture.
There are limits to the analytic power of big data and quantification that circumscribe big data's
capacity to drive progress.
"Mayer also favored a system of quarterly performance reviews, or Q.P.R.s, that required every
Yahoo employee, on every team, be ranked from 1 to 5. The system was meant to encourage hard
work and weed out underperformers, but it soon produced the exact opposite. Because only so many
4s and 5s could be allotted, talented people no longer wanted to work together; strategic goals were
sacrificed, as employees did not want to change projects and leave themselves open to a lower
score."
As the Yahoo example shows, the presumption that quantitative techniques objectively assess "what
works" is deeply flawed. Many attempts to collect and interpret data not only miss key factors, but
transform for the worse the systems they claim only to be measuring.
A legal test
Sheri Lederman, a fourth grade teacher on Long Island, sued the New York State Education
Department in October 2014 in what is perhaps the clearest legal test case of the dangers of big
data. Lederman is highly regarded by her
peers and superiors, an "exceptional educator" in the words of her school district's superintendent.
Yet a statistical technique called "value-added modeling" that purports to evaluate teachers based on
students' standardized test scores said Lederman was ineffective. The American Statistical
Association has criticized value-added modeling as an ineffective measure. "Ranking teachers by
their VAM scores can have unintended consequences that reduce quality," the statisticians said.
Despite the skepticism of statisticians -- the experts best aware of the weaknesses of the tools they
created -- bureaucrats at the state Department of Education have embraced the use of value-added
modeling. Lederman appears to be just one individual among the many who are being hurt by the
vogue for data.
The impulse to overuse data is not unique to educational bureaucrats. Quantification centralizes
bureaucratic power and gives outsize importance to short-term effects because they are easier to
measure. It is not a question of balancing the power of big data against its dangers, but of
recognizing the nonobvious limitations of that power.
The central claim of data proponents is that data always has some positive value. This premise is
false. Data-gathering that seems innocuous enough to the managerial class often brings with it
undue burden on the subjects of the data gathering.
Monitoring workers
Take reports attributed to Amazon customer service representatives about how each moment of
their workday is monitored and measured, or similar practices recalled by people who said they had
been Target employees. In both instances, the decentralized, human processes in which supervisors
evaluate their subordinates have been replaced by centralized, quantitative metrics. This shift has
been taking place across retail, customer service and food preparation sectors, which together
account for over 20% of America's workforce.
As a result, in the words of a person reported by Gawker to have been a manager at Target, "Of
course we cheated, as the saying went, if you weren't cheating you weren't trying." According to the
former manager's statement, Target's corporate management was trying to increase customer
satisfaction by measuring customer satisfaction scores, which employees falsified. If those compiling
the data cheat, the data won't be useful to the central office.
The burden of compiling data causes retail employees who previously had some professional
autonomy to feel constantly under centralized surveillance.
Data is far more subject to manipulation than its proponents realize. Even the 2011 McKinsey Global
Institute report that popularized the term "big data" acknowledged that its central claim that "we
are on the cusp of a tremendous wave of innovation, productivity and growth ... all driven by big
data," was supposition. "As of now," McKinsey admitted, "there is no empirical evidence of a link
between data intensity ... and productivity in specific sectors." In the intervening years, such
evidence has remained scant, even as the quantification bandwagon has gathered steam.
College rankings and federal sentencing guidelines, for example, are
both quantifications of complex social systems that are broadly agreed to have harmed the systems
they set out to standardize and order.
What data won't tell us
Many important questions are simply not amenable to quantitative analysis, and never will be.
Where should my child go to college, or when? How should we punish criminals? Are charter schools
a good idea? Should we fund the human genome project, or basic science in general? Should we
have preschools? Taking quantitative answers to these questions seriously not only risks getting the
answer wrong, but shapes the underlying reality in ways that are detrimental to our collective
well-being.
Such questions call for informed judgment that balances values, incentives, context and other
factors. It is often difficult to find disinterested individuals who can balance these factors and be
trusted. That difficulty is inherent in all social systems.
Settling vital questions on the basis of informed judgment only appears to be more subjective than
using quantitative techniques. By laundering their biases and preconceptions into the methodology
they use to devise quantitative metrics, policymakers and social scientists can fool themselves and
others into believing they are impartial and unbiased.
To take another example, ascertaining the worth of the human genome project ought to depend on
one's view of the value of the knowledge derived from it within the domain of biology and medicine.
In his 2013 State of the Union address, Barack Obama claimed, "Every dollar we invested to map the
human genome returned $140 to our economy." Such claims are as irrelevant as debating whether
the Parthenon was cost-effective.
There is simply no useful way to assess the long-term economic impact of either the human genome
project or of the Parthenon, and to do so is to miss the point. Pericles didn't build the Parthenon to
draw tourists to downtown Athens 2,500 years later. The fact that it is now a tourist attraction does
little to explain its value.
Similarly, the returns on investments in understanding the human genetic code will be realized over
time, and those investments cannot be justified in terms of their short-term economic impact.
To focus on the many methodological flaws in the return-on-investment techniques used by the
Battelle Memorial Institute in the study Obama was referring to is to miss the point. (The Battelle
study counts money spent on the human genome project as both a cost and a benefit, for example.)
Measuring knowledge?
Obama proclaimed the precise numerical return as a totem, which legitimized the money spent. But
the strong case for the human genome project rests on the knowledge it created, rather than the
economic benefit, which cannot be meaningfully measured.
The effect of new basic scientific knowledge on the structure of the economy is too diffuse and
complex for economists to measure. We have no access to a counterfactual world in which the
human genome project did not exist.
Of course the fact that quantitative studies of social and economic systems were systematically
flawed in the recent past is no proof that future investigations will suffer from the same
shortcomings. However, it is reasonable to believe that if the same basic methodology is used -- even
if more data is gathered -- these flaws will persist.
The only way to settle whether butter is good or bad for you is to
actually understand what happens when you eat butter, not to keep trying to tease out more
intricate statistical regressions between health indicators and butter consumption. (Large effects --
like the link between smoking and lung cancer -- do show up using such techniques, but for more
subtle effects, the answers depend very much on how statistics are compiled.)
Chaos theory
In the late 1980s, spurred by the publication of James Gleick's best-selling book, "Chaos: Making A
New Science," there was a wave of popular attention paid to the then-nascent discipline of chaos
theory. Gleick introduced the public to the idea that many real-world systems exhibit "sensitive
dependence on initial conditions." Change the inputs slightly, and radically different outputs will
emerge. It is impossible to pinpoint with certainty just what causes, say, a hurricane to form.
Human social systems -- public primary and secondary schools, universities or the criminal justice
system -- are complex systems, just like the weather. The vogue of attention to chaos theory passed
before policymakers came to take it seriously. Understanding the complexity of social systems
means understanding that conclusive answers to causal questions in social systems will always
remain elusive. Gathering more data -- twice as much, 10 times as much, a hundred times as much --
won't change this.
To effectively debate public policy or corporate strategy, we will have to continue to have debates
over principles. In such debates, disagreement among individuals with different ideological
presuppositions will continue.
To believe that disinterested, "rigorous" quantitative judgment can be systematically substituted for
such debate imperils programs and practices whose costs are direct, but whose benefits are indirect
and thus more difficult to measure. Ease of measurement does not correspond with importance. The
administrative apparatus of evidence generation does not, as it claims to, merely pursue "good
policy" but is itself a self-interested actor pursuing particular political ends.
A December 2014 book published by the Brookings Institution, "Show Me the Evidence: Obama's
Fight for Rigor and Results in Social Policy," sums up this belief: "The vision of the evidence-based
movement is that the nation will have thousands of evidence-based social programs that address
each of the nation's most important social problems and that under the onslaught of these
increasingly effective programs, the nation's social problems will at last recede."
This grandiose vision of evidence as panacea is dangerous and damaging. Unless the evangelists of
evidence are resisted, they will steamroll over what they cannot measure, leaving us poorer as
individuals and as a society, buried in a bureaucracy of numbers untethered from reality.
http://www.cnn.com/2015/02/02/opinion/kakaes-big-data/index.html
