
Big Data Wonderland:

Two Views on the Big Data Revolution


Mark Madsen
Third Nature, Inc.
mark@thirdnature.net
@markmadsen
Marc Demarest
Noumenal, Inc.
marc@noumenal.com
Strata Santa Clara
February 2013
2 Third Nature, Inc. || Noumenal, Inc.
Preamble
Twenty Years On
We came up together in this industry in the early 1990s, as pointy-headed advocates of star schema design, trained by the deity himself, Ralph Kimball.
Back then, it was a simpler world... big iron, big DBMS, hand-coded ETL, star schema, a thousand rinky-dink query tools.
Mostly, conversation was dominated by ETL and schema design.
"There will never be a decisional database larger than 10 GB..." (St. Ralph)
Our Alma Mater
Twenty years on, we find ourselves with opposing views on what is either the biggest con, or the biggest sea-change, in our data warehousing odyssey.
Question: Is the big data revolution big, or a revolution?
Question: Do we have to change? And if so, how?
Not a round table. A slugfest...
Demarest as Shana Alexander?
Madsen as Jack Kilpatrick?
Regular Programming Is Suspended
Demarest Madsen
Compromise
Demarest Madsen
You take the blue pill. The story ends, you wake up in your bed and believe whatever you want to believe.
You take the red pill, you stay in Wonderland, and I show you how deep the rabbit hole goes.
Remember, all I am offering is the truth: nothing more.
The Issues
1. Data As A Factor of Production
RED: Amen. This change has been in process for more than a decade. Social media leads the way, but we're all affected.
BLUE: Hype. For most companies, data remains an asset, but not a factor in the production of their products or services.
2. The Reality of Big Data
RED: No company escapes. Text, social, sensors, streaming -- the instrumentation of the real world transforms company decision-making processes.
BLUE: Few companies transformed. No quantification of benefits, right now. Leverage? Maybe.
3. Merchant DBMSs
RED: Increasingly irrelevant. We've been over-structured and under-resourced for 20 years. CSV is still the international standard.
BLUE: Will rise to the challenge. Any worthwhile innovation will be absorbed by the merchant DBMS players.
4. Query, Reporting & Dashboarding Tools
RED: Ineffective, now and in the future. Can't do real-time, can't visualize large data sets, can't support discovery and exploration.
BLUE: Will rise to the challenge. We have two generations of analysts trained to feed using these tools.
5. The Commodity Hardware Revolution & Radical Scale-Out
RED: The new topology. Cheap compute, unintelligent direct-attach storage and free comms make large scale-out grids the future.
BLUE: The current topology is alive and well. These commodity building blocks are, after all, just SMP platforms.
6. Structured Query Language
RED: Toast. Too complex, too hard to code, too hard to debug. A way of ensuring dependency on merchant DBMSs.
BLUE: Tried-and-True. Powerful, expressive language for complex analytical problems. That's why the NoSQL vendors reinvent it all the time.
7. New Programming Models
RED: Say hello to Pig. New analytical problems (decisioning, discovery, exploration) require new languages, new tools and new programming models.
BLUE: Say hello to Java. Open source doesn't mean free. Or easy. The skills gap here is huge. And there are few truly new analytical problems.
8. Conventional DW Architecture
RED: A relic. Overly complex. Difficult to implement. Controlled by the supply side of the market, anyway.
BLUE: Perfectly viable. No need to change anything. Some new technologies may play roles in the existing architecture, but we're good to go, generally.
9. The Cloud
RED: We all go there. Most of the interesting data is there; it's more effective to move our data, and our analyses, to where the data is, already.
BLUE: Don't go there. Public cloud security is an oxymoron. Your inside-the-firewall apps remain the core information asset.
10. New Technologies
RED: Save Us. Best-of-breed integration led by in-house designers is back, with a vengeance.
BLUE: Distract Us. We've already seen what best-of-breed gives us: a circus.
What We Really Think
1. Data As A Factor of Production
2. The Reality of Big Data
3. Merchant DBMSs
4. Query, Reporting & Dashboarding Tools
5. The Commodity Hardware Revolution
6. Structured Query Language
7. New Programming Models
8. Conventional DW Architecture
9. The Cloud
10. New Technologies