HTTP://WWW.ANALYTICS-MAGAZINE.ORG
JULY/AUGUST 2014 DRIVING BETTER BUSINESS DECISIONS
BROUGHT TO YOU BY:
WHY ANALYTICS
PROJECTS
FAIL
ALSO INSIDE:
Dark side of digital world
Real-time text analytics
Data scientists' time to shine
The future of forecasting
Key considerations
for deep analytics
on big data,
learning and
insights
Executive Edge
Hewlett-Packard
V. P. Rohit Tandon:
Six ways of
value creation via
E-commerce analytics
WWW.INFORMS.ORG | ANALYTICS-MAGAZINE.ORG
INSIDE STORY
What I learned today
One of the advantages of editing
Analytics (as well as OR/MS Today, the
membership magazine of INFORMS) is that I
learn something new every day, thanks to
the wide array of contributed articles we
receive. For example, just in preparing
this issue, I learned:
Nearly 20 years ago, Amazon founder
Jeff Bezos said that Amazon intended
to sell books at or near cost as a way
of gathering data on affluent, educated
shoppers, as reported by George Packer
in The New Yorker. The implication: The
data, once analyzed, had more value
than the loss-leader books, which proved
absolutely correct when Amazon began
selling everything under the sun to
well-targeted consumers.
Drawing on Packer's article, as well
as a couple of books ("Who Owns the
Future?" and "The Ethics of Big Data"),
Vijay Mehrotra explores the dark side
of technology, big data and analytics,
and the perceived and/or potential threat
it poses, in his Analyze This! column.
Don't miss it.
A Formula 1 pit crew, working in an
optimized, well-coordinated fashion, can
change a set of four tires in less than two
seconds. That means that unless you're
Evelyn Wood, that crew can change
12 tires in the time it takes you to read
this sentence. For the story behind the
motorsports magic, check out Andy
Boyd's Forum column. Seeing is believing,
so don't miss the amazing videos
referenced at the end of the article.
We all know the digital/technical
world will come to a wordy end without
acronyms, but do you know what MOOC
stands for? I do (massive open online
course), thanks to an interview I did with
executive search honcho Linda Burtch
regarding the red-hot analytics job market.
Finally, I also learned from Linda
that in today's dynamic world, young
people should plan on three or four careers
during their lifetime. "It's not good
to specialize in one thing and try to stick
with one company or one industry or one
vertical application for your entire career,"
she says in the Q&A. "It's incredibly
dangerous, and it likely won't carry you
through a 35-year career. You need to be
continuously learning something new."
I got that last part going for me,
every day.
PETER HORNER, EDITOR
peter.horner@mail.informs.org
CONTENTS
FEATURES
34 REAL-TIME TEXT ANALYTICS
By Aveek Mukhopadhyay and Roger Barga
How a cloud-based analytical engine yields instant insight using
unstructured social media data.
44 WHY DO ANALYTICS PROJECTS FAIL?
By Haluk Demirkan and Bulent Dal
Not just another IT project: Key considerations for deep analytics
on big data, learning and insights.
54 IT'S THEIR TIME TO SHINE
By Peter Horner
Job prospects for data scientists and elite analytics professionals
have never been better, and the future is even brighter.
62 ANALYTICS TRANSFORMS A DINOSAUR
By Brenda Dietrich, Emily Plachy and Maureen Norton
The story of how industry giant IBM not only survived but
thrived by realizing business value from big data.
70 THE FUTURE OF FORECASTING
By Jack Yurkiewicz
Making predictions from hard and fast data: Biennial survey
of popular software for analytics professionals.
REGISTER FOR A FREE SUBSCRIPTION:
http://analytics.informs.org
INFORMS BOARD OF DIRECTORS
President Stephen M. Robinson, University of Wisconsin-Madison
President-Elect L. Robin Keller, University of California, Irvine
Past President Anne G. Robinson, Verizon Wireless
Secretary Brian Denton, University of Michigan
Treasurer Nicholas G. Hall, Ohio State University
Vice President-Meetings William "Bill" Klimack, Chevron
Vice President-Publications Eric Johnson, Dartmouth College
Vice President-Sections and Societies Paul Messinger, CAP, University of Alberta
Vice President-Information Technology Bjarni Kristjansson, Maximal Software
Vice President-Practice Activities Jonathan Owen, CAP, General Motors
Vice President-International Activities Grace Lin, Institute for Information Industry
Vice President-Membership and Professional Recognition Ozlem Ergun, Georgia Tech
Vice President-Education Joel Sokol, Georgia Tech
Vice President-Marketing, Communications and Outreach E. Andrew "Andy" Boyd, University of Houston
Vice President-Chapters/Fora David Hunt, Oliver Wyman
INFORMS OFFICES
www.informs.org Tel: 1-800-4INFORMS

Executive Director Melissa Moore
Meetings Director Laura Payne
Marketing Director Gary Bennett
Communications Director Barry List

Headquarters INFORMS (Maryland)
5521 Research Park Drive, Suite 200
Catonsville, MD 21228
Tel.: 443.757.3500
E-mail: informs@informs.org
ANALYTICS EDITORIAL AND ADVERTISING
Lionheart Publishing Inc., 506 Roswell Street, Suite 220, Marietta, GA 30060 USA
Tel.: 770.431.0867 Fax: 770.432.6969
President & Advertising Sales John Llewellyn
john.llewellyn@mail.informs.org
Tel.: 770.431.0867, ext. 209
Editor Peter R. Horner
peter.horner@mail.informs.org
Tel.: 770.587.3172
Assistant Editor Donna Brooks
donna.brooks@mail.informs.org
Art Director Jim McDonald
jim.mcdonald@mail.informs.org
Tel.: 770.431.0867, ext. 223
Advertising Sales Sharon Baker
sharon.baker@mail.informs.org
Tel.: 813.852.9942
Analytics (ISSN 1938-1697) is published six times a year by the
Institute for Operations Research and the Management Sciences
(INFORMS), the largest membership society in the world dedicated
to the analytics profession. For a free subscription, register at
http://analytics.informs.org. Address other correspondence to
the editor, Peter Horner, peter.horner@mail.informs.org. The
opinions expressed in Analytics are those of the authors, and
do not necessarily reflect the opinions of INFORMS, its officers,
Lionheart Publishing Inc. or the editorial staff of Analytics.
Analytics copyright 2014 by the Institute for Operations
Research and the Management Sciences. All rights reserved.
DEPARTMENTS
2 Inside Story
8 Executive Edge
14 Analyze This!
24 Healthcare Analytics
28 INFORMS Initiatives
32 Forum
82 Conference Preview
84 Five-Minute Analyst
90 Thinking Analytically
EXECUTIVE EDGE
Six ways of value-creation through analytics in E-commerce
BY ROHIT TANDON AND SHRUTI UPADHYAY
The increasing popularity of, and access to, the Internet
has changed the way marketers interact with
customers. These customers are smart, well informed
and empowered, as Internet connectivity is available
to them at their fingertips and on the go. It has therefore
become imperative for organizations to be on the
customer's online radar with respect to new products or
services and to be able to influence their choices.
Not surprisingly, according to one study, 34 percent
of marketers are generating leads through Twitter. India's
online retail market grew at a staggering 88 percent
in 2013 to $16 billion and continues to grow. These
examples are a testimony to the growth of e-commerce.
The Internet deluge has opened an assortment of opportunities.
Customers are able to buy high-end fashion
and designer shoes, book hotels, buy movie tickets and
you-name-it.
Therefore, an opportunity exists for business research
to capture, compile, churn and store colossal
amounts of information about customers, suppliers
and operations. This is what we call the age of "big
data." We believe that this age is a natural progression
in online business and is here to stay. We are already
seeing a surge in adoption of digital channels
such as social media, e-mail marketing and display
ads in e-commerce. Imagine the amount of data this
has created for marketers to lay their hands on for
analysis. Despite that, in the race to utilize the on-
line space, marketers may be focusing more on ad-
vertising and less on analysis of the data that could
potentially increase sales.
In our opinion, understanding the customer
behavior becomes more complex in business-to-
consumer companies and more so in a 24/7 e-com-
merce business that sells technology products in an
increasingly commoditized industry. A strong analyt-
ics foundation may make e-commerce a thriving and
successful channel of sales. Businesses, therefore,
are increasingly creating customizable campaigns
for their installed base customers and improving
sales effectiveness through e-commerce.
For example, pricing and merchandising deci-
sions need to be taken in real time, and the need to
have real-time insights is ever-increasing. To make
these decisions faster and better, marketers would
need to quickly analyze their digital marketing strate-
gies by mining data exhaustively and cost effectively
through advanced analytics.
KEY DRIVERS OF INCREASED REVENUES
An organization's ability to achieve its goal of
increased revenues and margins would depend
heavily on its ability to improve three key drivers: 1)
volume of customer traffic to the online store (number
of visits); 2) customer conversion (percentage of
visits that convert); and 3) basket size (average
revenue per order). Analytics has a very important role
to play in this value chain. So while organizations
may have the best talent with an analytical mindset
and eagerness to apply it, we need to equip data
scientists in organizations with the right
tools and insights.
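As a rough illustration of how the three drivers combine, revenue can be treated as the product of visits, conversion rate and average order value. The sketch below assumes that simple multiplicative model; the function name and all figures are hypothetical and are not HP data.

```python
def online_revenue(visits: int, conversion_rate: float, avg_order_value: float) -> float:
    """Assumed model: revenue = visits x conversion rate x average order value."""
    return visits * conversion_rate * avg_order_value


# Hypothetical baseline: 100,000 visits, 2% conversion, $150 average order.
baseline = online_revenue(100_000, 0.02, 150.0)

# A 10% lift in any single driver lifts revenue by the same 10%.
more_traffic = online_revenue(110_000, 0.02, 150.0)
better_conversion = online_revenue(100_000, 0.022, 150.0)
bigger_basket = online_revenue(100_000, 0.02, 165.0)

# Lifting all three drivers at once compounds: 1.1 * 1.1 * 1.1 = 1.331.
all_three = online_revenue(110_000, 0.022, 165.0)

scenarios = [
    ("baseline", baseline),
    ("more traffic", more_traffic),
    ("better conversion", better_conversion),
    ("bigger basket", bigger_basket),
    ("all three", all_three),
]
for label, value in scenarios:
    print(f"{label:>18}: ${value:,.0f}")
```

Under this assumed model the drivers are interchangeable in percentage terms, so the practical question becomes which one is cheapest to move.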
Conversations with analytics professionals
reinforce our belief in some of the
following must-haves that will elevate an
organization's e-commerce agenda to the
next level:
1. Development of best-in-class
tools and techniques is a must to
build scalable solutions and tackle the
optimization of key drivers.
Over the years various products such
as SAS have provided excellent development
environments, but every data
scientist had to start from scratch and
depend on their personal techniques to
tackle new problems. In recent
years, however, data scientists and organizations
have been moving toward using templates
and building packaged models and solutions
to reuse and replicate technologies
with ease.
One of the first such pilot solutions within
HP was developed for HPDirect.com's
demand generation function, where global
analytics developed V.1 of a series of demand
generation models. These models
also paved the way for the development of
customer targeting models. In most organizations,
such initiatives, if implemented, have the potential to
lay the foundation for similar opportunities with other
business functions such as planning, store operations
and category management. When an organization
reaches such a stage of maturity, that's when
true return on data (ROD) is possible.
2. The three Ws: whom, what, when. Traditionally,
marketers have used a uni-dimensional approach
to target customers. However, results show
that such approaches can be sub-optimal and might have an
adverse effect on customer loyalty and brand image.
Answering questions such as whom to target, what
to offer and when to offer brings a paradigm shift in
garnering customer interest and loyalty. The answers help
rank customers on their propensity to re-purchase,
lead to preferential treatment of the right customers
with the right product portfolio, and allow marketers
to understand when to offer discounts.
Effective tools and modeling will also surface clues
about the probability of customers picking one product over
another or about repeat customer behaviors. This brings
us back to the importance of using effective, proven
analytics tools and techniques.
3. Automate and innovate. Creating and
applying big data algorithms will help organizations
take appropriate actions. Many of these algorithms
can run automatically, saving time and allowing
better, faster decisions. Creating a robust tool-based
ecosystem that allows creation of funnels that track
visitors, bounce rates, conversions, etc., is vital to
a successful Web analytics initiative.
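A minimal sketch of what such a funnel report might compute is shown here. The stage names (product views, cart adds, orders), the `FunnelCounts` class and the counts are illustrative assumptions, not a description of any particular Web analytics product.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class FunnelCounts:
    """Hypothetical session counts at each stage of an online-store funnel."""
    visits: int              # sessions that reached the store
    single_page_visits: int  # sessions that left after one page (bounces)
    product_views: int       # sessions that viewed at least one product
    cart_adds: int           # sessions that added something to the cart
    orders: int              # sessions that ended in an order


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def funnel_report(c: FunnelCounts) -> Dict[str, float]:
    """Compute the bounce rate and step-by-step conversion rates."""
    return {
        "bounce_rate": ratio(c.single_page_visits, c.visits),
        "visit_to_view": ratio(c.product_views, c.visits),
        "view_to_cart": ratio(c.cart_adds, c.product_views),
        "cart_to_order": ratio(c.orders, c.cart_adds),
        "overall_conversion": ratio(c.orders, c.visits),
    }


counts = FunnelCounts(visits=10_000, single_page_visits=4_000,
                      product_views=5_000, cart_adds=1_000, orders=250)
report = funnel_report(counts)
for name, value in report.items():
    print(f"{name:>20}: {value:.1%}")
```

Reporting each step separately, rather than only the overall conversion, shows where in the funnel visitors are being lost.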
4. Site search analytics. Tracking
site search is a very useful way to
learn what your visitors are
looking for on your website. Is the search
engine directing the customer to your website
or redirecting them to the next best option
in the absence of the product? Keeping
tabs on this will help companies increase
customer loyalty and sales.
Another application of site search analytics
allows you to understand what is
being searched for on your website. By understanding
this, marketers can influence the
site layout and design so that visitors are
able to easily locate answers to common
queries or the most searched products.
5. Marketing spend optimization.
HP's online store uses a mix of marketing
vehicles to reach different customer segments
with different communication and
buying preferences. Optimizing spend on
various marketing vehicles is critical to
optimizing demand generation efforts as
well. However, determining which marketing
mix is most beneficial to the business
is not an easy process, requiring not only
a scientific approach to analyzing spend
and revenue, but also a test-learn-optimize
culture. For example, ongoing analysis
of the response to different types of
marketing vehicles helps in identifying the
best fit for a particular type of message.
Based on such analysis, one can decide
whether a banner would work best vis-à-vis a
customized landing page, or whether an
e-mail campaign would be the best option.
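One simple way to start such a comparison is to rank vehicles by revenue generated per unit of spend. The sketch below assumes that spend and attributed revenue are already known for each vehicle; the vehicle names and figures are hypothetical, and a descriptive ranking like this is not a substitute for a controlled test.

```python
from typing import Dict, List, Tuple


def return_on_spend(results: Dict[str, Tuple[float, float]]) -> List[Tuple[str, float]]:
    """Rank marketing vehicles by attributed revenue per unit of spend.

    `results` maps a vehicle name to (spend, attributed_revenue).
    Vehicles with zero spend are skipped to avoid dividing by zero.
    """
    ratios = [
        (name, revenue / spend)
        for name, (spend, revenue) in results.items()
        if spend > 0
    ]
    return sorted(ratios, key=lambda item: item[1], reverse=True)


# Hypothetical results for one message tested across three vehicles.
campaign_results = {
    "banner": (5_000.0, 12_000.0),
    "landing_page": (3_000.0, 10_500.0),
    "email": (1_000.0, 6_000.0),
}

ranking = return_on_spend(campaign_results)
for name, ratio in ranking:
    print(f"{name:>12}: {ratio:.2f} revenue per unit of spend")
```

In a test-learn-optimize loop, a ranking like this would typically feed the next round of budget allocation, after which the measurement is repeated.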
6. Connect marketing with warehousing.
In large supply chain environments,
the number of orders shipped out of
the warehouse each day can be forecast
accurately using predictive analytics
methodologies, enabling accurate
warehouse space and staffing
allocation in order to meet aggressive
shipping timelines.
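The article does not say which forecasting method is used. As a stand-in, the sketch below applies simple exponential smoothing to a short history of daily shipped orders and converts the forecast into a staffing estimate. The smoothing constant, the orders-per-picker figure and the order history are all hypothetical.

```python
import math
from typing import Sequence


def smoothed_forecast(daily_orders: Sequence[float], alpha: float = 0.3) -> float:
    """One-step-ahead forecast by simple exponential smoothing.

    Each new observation gets weight `alpha`; the running level keeps
    weight (1 - alpha). The final level is the forecast for the next day.
    """
    if not daily_orders:
        raise ValueError("daily_orders must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = daily_orders[0]
    for observed in daily_orders[1:]:
        level = alpha * observed + (1 - alpha) * level
    return level


def pickers_needed(forecast_orders: float, orders_per_picker: float) -> int:
    """Round the staffing requirement up to a whole number of people."""
    return math.ceil(forecast_orders / orders_per_picker)


# Hypothetical history of orders shipped per day over one week.
history = [120, 135, 128, 150, 142, 160, 155]
forecast = smoothed_forecast(history, alpha=0.3)
staff = pickers_needed(forecast, orders_per_picker=40)
print(f"Forecast for tomorrow: {forecast:.0f} orders, {staff} pickers")
```

A production model would likely also account for day-of-week seasonality and promotions, which this sketch ignores.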
In conclusion, marketers can apply
data mining and advanced analytical skills
to derive key insights, better understand
drivers of Web traffic and produce reasonably
accurate traffic forecasts for use in business
planning. We sense that if companies use
data accurately, they can easily achieve
three to five times growth of the online
business and make analytics easily
replicable across different functions of the
organization.
Rohit Tandon is vice president of corporate strategy
and worldwide head of Global Analytics at Hewlett-Packard.
As part of HP's corporate strategy team, he
helps drive the analytics ecosystem to support HP's
vision and priorities through delivery of cutting-edge
analytical capabilities across sales, marketing, supply
chain, finance and HR domains. He was recently
named one of the top-10 most influential analytics
leaders in India for 2014 by Analytics India Magazine.
Shruti Upadhyay is a manager with HP Global
Analytics.
ANALYZE THIS!
Dark side of the digital world
Big data, unintended consequences: What Amazon's domination of the
book publishing industry could portend.
BY VIJAY MEHROTRA
Given my love of books, it is perhaps not surprising
that I deeply love Amazon.com, where, thanks to the digital
technologies of today, a plethora of books can immediately
be found about nearly any idea that pops into
my head and be delivered (free with Amazon Prime
membership!) to my doorstep with remarkable speed.
Like many avid readers, I purport to do my best to
support my local independent booksellers, but too often
there is simply no
denying the powerful pull of the super convenient,
instantly gratifying, highly personalized Amazon.com
experience.
Thanks to my bi-monthly book club, I recently read
"Who Owns the Future?" by Jaron Lanier, a celebrated
technologist and MacArthur "genius" award winner
best known for his contributions to the field of virtual
reality. Lanier is known as a big thinker, and in this
book (at once rambling, provocative and thoughtful)
he once again shows why.
"WOTF" begins with a bleak assessment of where
digital technology is leading us all. The main thrust of
Lanier's argument is as follows:
• Technology makes it very easy to give away for
free a lot of things that people find valuable; just
think about the search engine. Being
human, we are conditioned to love the
chance to get something for nothing,
and we have gratefully grabbed at it with
both hands.
• However, the value that technology
grants us is not actually free. In
exchange, we tacitly give up information
about ourselves, which is then stored
as data.
• Thanks largely to analytics
professionals, this data is then pooled
and analyzed to create a variety of
commercial opportunities that would not
otherwise exist.
• This commercial wealth confers
extraordinary power upon those who
own the technologies that capture and
analyze this data (Lanier calls them
"Siren Servers").
• This power in turn enables the
owners of the Siren Servers to have a
huge impact on the society that we live
in, including employment, government,
culture and ideas.
• Taken to their logical conclusions,
all of this ultimately dooms the human
species to a very sad and cataclysmic
ending.
Along the way, Lanier also wanders
off into pleasantly intense digressions on
a broad variety of somewhat related top-
ics, including Aristotle, the tenure system,
biodiversity and the concept of local op-
tima. He too clearly loves to read.
IMPACT ON PUBLISHING
While still digesting this thought-provoking
book, I came across George
Packer's recent article entitled "Is
Amazon good for books?" Taking a long
hard look at Amazon.com, the website
that perhaps most fully embodies Lanier's
concept of a Siren Server, Packer finds
that many of Lanier's more dire predictions
are already playing out there.
Packer's particular focus is Amazon's
impact on the publishing industry, and he
believes that the stakes here are incredibly
high: "In the book business the prospect
of a single owner of both the means
of production and the modes of distribution
is especially worrisome; it would give
Amazon more control over the exchange
of ideas than any company in U.S. history.
Even in the iPhone age, books remain
central to American intellectual life, and
perhaps to democracy."
I wholeheartedly agree.
Just as Lanier predicts, suppliers
and consumers alike originally
rushed to embrace Amazon, for like so
many technologies it seemed to magically
(that is, without cost) provide all parties
with something for which they hungered.
As Packer writes, "When Amazon
emerged, publishers in New York suddenly
had a new buyer that paid quickly,
sold their backlist as well as new titles,
and, unlike traditional bookstores, made
very few returns," generating fresh revenues
for publishers with little incremental
investment. Meanwhile, we readers
flocked to Amazon in droves for its convenience,
its variety, and its low prices.
Amazon.com today accounts for
more than 40 percent of all printed books
purchased as well as 65 percent of all
eBooks, so it is probably fair to say that
book buyers by and large still love Amazon.
For us as readers, this is fortuitous,
since the number of independent bookstores
in business has declined by more
than 50 percent since Amazon's founding.
However, as its share of overall book
sales has ballooned, Amazon has taken
advantage of its market power to aggressively
push the terms of its agreements
with book publishers dramatically in its
own favor, often through tactics reflecting
Amazon's famously secretive and
opaque corporate culture. Meanwhile,
Packer reports, the many publishers large
and small whose businesses are now
dependent on Amazon for much of their distribution
and revenues are learning firsthand
that, as Lanier sharply points out, "information
supremacy for one company becomes, as a
matter of course, a form of behavior modification
for the rest of the world."
Packer's article also describes an Amazon
culture that places a very low value on the human
beings who are involved with the development, promotion
and distribution of books, placing its faith
in algorithms rather than editors and relying on
volunteer (that is, free) reviewers to take the
place of staff writers. All of this serves as a real
illustration of Lanier's premise that as more and
more aspects of the enterprise are mediated by
software, those in the business of carefully creating
content (rather than digitally distributing it)
will be increasingly devalued and many forms
of employment that have long-term value to our
culture will subsequently perish.
ELIMINATING THE GATEKEEPERS
While Amazon's efforts at actually serving
as a publisher have so far failed, it is clear
that we can expect it to continue to pursue
the holy grail of eliminating the gatekeepers
from the world of publishing by
producing its own original content. Indeed,
one comes away from Packer's article with
the feeling that if Amazon's founder and CEO
Jeff Bezos could eliminate the need for authors
and publishers by replacing them with
automated content-generating software, he
would not hesitate for an instant.
In fact, book distribution has from the
outset been only a small part of Bezos'
vision. The real prize for Bezos has been
the access to reams of consumer data
and the ability to analyze this data for fun
and profit. According to Packer, as early
as 1995, Bezos had publicly stated that
Amazon intended to sell books "as a way
of gathering data on affluent, educated
shoppers." Indeed, today the $5.25 billion
in book sales makes up only 7 percent
of Amazon's total revenues. This too is
just as Lanier predicts in "WOTF," which
may be why it was somehow not available
directly from Amazon.com when I looked
for it the other day (it has since been
restored somehow).
One book that I was able to find on
Amazon.com was "Ethics of Big Data,"
in which author Kord Davis asks a number
of more fundamental questions
about data and its place in the business
world. As a longtime software/IT professional
with a deep grounding in philosophy
and the history of technology,
Davis is equally comfortable discussing
topics as diverse as digital strategy, supply
chain optimization, application development
and values-based management. As such, he
has a unique perspective that motivates him
to take these important (and very thorny)
questions seriously. As he writes in the book's
preface, "nobody in history has ever had the
opportunity to innovate, or been faced with
the risks of unintended consequences, that
big data now provides."
In particular, Davis identifies four
major aspects of any serious data ethics
discussion:
Identity: In the digital world, who we
are is tacitly defined by the data we leave
behind, and indeed our own sense of self
is often tightly intertwined with our online
activities. Davis points out that capturing
and analyzing our digital trail provides
others the ability to quite easily summarize,
aggregate or correlate various aspects of
our identity without our participation or
consent.
Privacy: Does your decision to
engage in a digital interaction confer
upon other entities the right to utilize data
captured in the course of that specific
interaction, and to link it to other sources
of data that may correspond to you? As
Davis asks, "Does privacy mean the same
thing in both online and offline worlds?
Should individuals have a legitimate ability
to control data about themselves, and to
what degree?"
Ownership: Digital technology,
data and analytics have given some
companies the ability to turn individual
users' data into saleable assets and
many others the capacity for improved
decision-making and increased
profitability. Intelligently utilizing
data is something that we typically
celebrate in our profession, but
Davis again challenges this view by
asking some very fundamental and
thought-provoking questions: Does
our existence itself constitute a
creative act, over which we have
copyrights or other rights associated
with creation? If it does, then how
do those offline rights and privileges,
sanctified by everything from the
Constitution to local, state and federal
laws, apply to the online presence of
that same information?
Reputation: Davis hits the nail
on the head when he points out
that, thanks to the ability of data to
be combined and analyzed to drive
inferential and predictive judgments,
the number of people who can form
an opinion about what kind of person
you are is exponentially larger and
farther removed. And while these
online reputations are stubbornly
persistent, the accuracy of this
reputational assessment is too
often an afterthought.
CALL FOR ACTION
Unsatisfied with merely admiring the
problem, both Lanier and Davis also call
for action. Lanier proposes a technological
and marketplace solution to the otherwise
inevitable destiny that he believes
digital technology, user data, and business
analytics are rapidly leading us into,
problems that are so vividly illustrated
by the case of Amazon. He suggests an
elaborate (though high-level) framework
in which all personal data and creative
works are tagged so as to enable their
owner/creators to capture micropayments
whenever and however their data/works
are utilized. While his proposed remedy
is at this stage sketchy at best, from my
perspective he is to be commended for
engaging us all in a conversation about a
technology-enabled solution to a complex
set of problems that few others are even
willing to acknowledge.
Davis, like Lanier, is a technologist
rather than a Luddite (as he quite rightly
points out, whereas big data is ethically
neutral, the use of big data is not). In
"Ethics of Big Data," he strongly encourages
organizations that use data extensively
(as well as the policy-makers who
attempt to make judgments in support of
social good) to have meaningful discussions
about how and why we use data
and what the ethical implications are
of those actions. In his call for serious
ethical inquiry, Davis asserts that "Organizations
realize that information has
value that can be extracted and turned
into new products - the ethical impact is
highly context-dependent. But to ignore
that there is an ethical impact is to court
an imbalance between the benefits of
innovation and the detriment of risk."
Especially, as Lanier would be quick
to add, with technology itself enabling
the risk to be pushed off onto many, while
the benefits are captured by an ever
smaller few.
As Packer reports, Amazon has given
very little thought to the near-term
ethics or the long-term implications of
the way in which it has used its customers'
data to obtain its current level of
market power. But as Amazon's current
battle [1] with publisher Hachette rages
on, with publishers, governments and
erstwhile business partners sure to follow,
it is clear that this particular story is
far from over.
As analytics professionals, neither is
ours. We have a significant stake in the
outcomes of these conversations about
ethics and the future. As such, we would
be wise to actively participate in those
conversations. At this particular moment,
we have considerable leverage to advocate
for a digital future that reflects our
own values.
The world of digital business, our
own personalized Siren Server, has
provided us with a massive, lucrative,
and free channel for our products and
services. Today's digital enterprise depends
so much on our ever-expanding
ability to capture, transmit, store, integrate
and organize data, and our deep
capacity to use this data to summarize,
analyze, correlate, predict and optimize.
Through no fault of our own, we have
been bestowed with "The Sexiest Job
of the 21st Century" [2], and it is indeed
tempting to believe that we are an integral
and indispensable part of the world
in which we live and work, and that we
always will be.
Turns out this is exactly what the publishers
thought when Amazon first appeared
on the scene too. Beware: There
is no free lunch.
Vijay Mehrotra (vmehrotra@usfca.edu) is a
professor in the Department of Business Analytics
and Information Systems at the University of San
Francisco's School of Management. He is also a
longtime member of INFORMS.
REFERENCES
1. For more on this, see http://www.nytimes.com/2014/06/21/business/booksellers-score-some-points-in-amazons-standoff-with-hachette.html
and http://www.latimes.com/books/jacketcopy/la-et-jc-amazon-and-hachette-explained-20140602-story.html#page=1.
2. http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/ar/1
HEALTHCARE ANALYTICS
Will Apple, Google usher in new era in healthcare analytics?
BY RAJIB GHOSH
2014 is turning out to be an interesting year for
the healthcare industry. On the healthcare technology
front, this year has spurred 16 acquisitions since Jan.
1. State and federal government health insurance
exchanges finally started to operate at scale, offering
affordable health insurance coverage to millions.
Twenty-six states and Washington, D.C., expanded
their Medicaid programs as of May 2014, making a
large number of patients eligible for the safety net.
These are all good things that add to the success
of the Affordable Care Act (ACA), also known as
Obamacare.
At the same time, we are just beginning to
see the impact of the new patient inflow on our
health system in the form of emergency room
overcrowding [1]. Opponents of the ACA argue that the
expansion of coverage without expanding the
primary care physician network across the nation
will lead to disaster. It remains to be seen which
way the pendulum will swing.
APPLE'S BIG SPLASH WITH HEALTHKIT
Meanwhile, Apple has released its HealthKit product,
which connects multiple devices and apps. It has
shown promise to become the health data repository
for consumers. In essence this was the promise
of the personal health record, or PHR, a promise
that rose to the peak of inflated expectations a few
years back and then fell to the trough of disillusionment
quite quickly [2]. But with Apple's foray into
the space, this time it could be different.
The key promise, however, is the fusion of
data from multiple sources and use of analytics to
generate user-facing insights. The latter, however,
is not there yet. In my last column I argued that
the true empowerment of the patient consumer
is waiting on the data fusion and analytics to
become mainstream. Consumers do not want
just a data repository like a PHR. They want
actionable information that a PHR does not provide.
Apple's announcement and subsequent action
may expedite the health data movement in
the right direction, but I am somewhat skeptical
regarding data liquidity in Apple's walled garden
approach. Now that Apple has taken the lead,
how far behind can Google be? Recently, Forbes
reported that Google is planning its own version
of a health platform. By the time this column goes
live we will know what Google is concealing up
its sleeve. These two giants have all the technology,
talent and financial firepower needed to
drive analytics into the consumer health space
by enabling a platform play for various
data-generating devices and apps.
Insights for the consumer, however, will come
at a price. As the insights with actionable consumer
guidance increase, so too will the level of FDA
scrutiny, including a requirement for mandatory FDA
approval. It is unclear how quickly Apple or Google
will go for that since it is unknown territory for
both companies. Having spent a decade in the
medical device industry, I know firsthand the pain
points of the manufacturers when their products
come under the FDA's purview.
APPLE-EPIC PARTNERSHIP
Apple is also partnering with Epic Systems,
the giant electronic medical record (EMR)
company that controls close to 20 percent of the
enterprise EMR market and covers 51 percent
of the patients in the United States. This is a
smart move by Apple. The ability to send
user-generated data to a healthcare professional's
EMR system has always been a key requirement
for providers. This end-to-end data channel
establishes a continuum of care, which acts as
the building block for analytics-driven population
health management (PHM) initiatives.
Since the introduction of the iPhone, Apple
products have enjoyed widespread adoption
among healthcare professionals. A 2013 study by
the Black Book Rankings found that among physicians
who use medical apps on their smartphones,
68 percent used iPhones while 31 percent used
Android devices. Also, 59 percent of physicians
accessed apps from their tablet, and most of those
users prefer the iPad. Among U.S. consumers, Apple
has lost some ground recently to its key competitor,
Google Android, but still commands a large
consumer following.
When a system enjoys large market share
among both patients and providers and the system
connects with the largest EMR company in
the country, we can expect seamless
bi-directional data flow to reach critical
mass. This is a prerequisite to building
a cloud-based analytics solution that
can leverage data hubs at both ends of
the flow.
This is the reason why Apple's HealthKit
introduction is a key phenomenon,
although it does not do much in its early
incarnation. If Google wants to become
a serious player in the healthcare field
beyond fitness lovers, it has to think
in the same direction as well. Once that
happens, imagine what sort of revolution
the rivalry of these technology companies
can usher in!
The health data acquisition market is
still fragmented, and as a result EMR companies
have not shown much interest in
opening up their data repository to those
players. If Apple and Google can now turn
the tables and make this a true platform
play using their controlling stakes in the
mobile device market, then it becomes
meaningful for the EMR companies to
forge powerful partnerships with one or
both of them. In turn that will create the
unification of episodic data and continuous
user-generated data: the Holy Grail!
Interoperability standards will be
firmed up and data security solutions will
emerge. Most importantly, patients and
providers will both benefit from the analytics
solutions that will get a shot in the
arm from a data-rich holistic picture of
the patient.
So far IBM is the lone warrior creating
an ecosystem around its "Watson in
the cloud" analytics solution. It still lacks
the health data source. So what can
Apple, Google, IBM and Epic do together
to shake up healthcare? I'm getting goose
bumps just thinking about the possibilities.
Rajib Ghosh (rghosh@hotmail.com) is an
independent consultant and business advisor
with 20 years of technology experience in various
industry verticals where he had senior level
management roles in software engineering, program
management, product management and business
and strategy development. Ghosh spent a decade
in the U.S. healthcare industry as part of a global
ecosystem of medical device manufacturers, medical
software companies and telehealth and telemedicine
solution providers. He's held senior positions at
Hill-Rom, Solta Medical and Bosch Healthcare. His
recent work interest includes public health and the
field of IT-enabled sustainable healthcare delivery
in the United States as well as emerging nations.
Follow Ghosh on Twitter @ghosh_r.
REFERENCES
1. Laura Ungar, "More patients flocking to ERs under Obamacare,"
http://www.courier-journal.com/story/news/2014/06/07/patients-flocking-emergency-rooms-obamacare/10181349/
2. "Hype Cycle for Healthcare Provider Applications, Analytics and Systems, 2013,"
Gartner, http://www.healthcatalyst.com/health-data-analytics-hype-cycle
INFORMS INITIATIVES
CAP exam, continuing education, analytics conference cluster
The Institute for Operations Research and the
Management Sciences (INFORMS), the largest
professional society in the world for professionals
in the fields of analytics, operations research (O.R.)
and management science and the publisher of
Analytics magazine, announced that its Certified
Analytics Professional (CAP) exam will now be
given at hundreds of computer-based testing centers
worldwide through an agreement with Kryterion,
the full-service provider of customizable assessment
and certification products and services.
Candidates for the CAP certification exam can
choose from Kryterion's global network of online
secured testing locations to schedule their exam at a
convenient time and place. INFORMS' online testing
center partner Kryterion, through strategic partnerships
with colleges and universities, as well as
testing and training companies, provides over 700
testing locations in more than 100 countries. In the
United States alone, more than 400 testing centers
are available. CAP exams can now be scheduled almost
any day of the week and at a time and location
that best suits the candidate.
Candidates can apply at
www.informs.org/applyforcertification. Upon
acceptance into the program, candidates
receive an online voucher to present on
the Kryterion site.
Exam locations can be found at
http://www.kryteriononline.com/host_locations/.
Introduced in the spring of 2013, the
CAP program was created by subject
matter experts, many of whom are INFORMS
members. The CAP credential
is designed for general analytics professionals
in early- to mid-career. It
is based on a rigorous job task analysis
and is vendor- and software-neutral.
Benefits of analytics certification include
gaining the ability to advance one's career
by setting a professional with CAP
apart from the competition and obtaining
the structure to make continuing professional
development an integral part
of one's job performance. The CAP program
assists hiring managers in finding
competent analytics talent and shows
that an organization hiring CAP professionals
follows best analytics practice.
NEW INFORMS CONTINUING
EDUCATION COURSES
The INFORMS Continuing Education
program is offering two new courses this
fall: "Introduction to Monte Carlo and
Discrete-Event Simulation" and "Foundations
of Modern Predictive Analytics."
The intensive, two-day, in-person
courses, like the program's popular
current courses "Essential Practice
Skills for Analytics Professionals" and
"Data Exploration & Visualization," provide
real take-away value to implement
immediately at work. Once you leave
the classroom, you will be able to apply
the real skills, tools and methods
of analytics. The courses will give participants
hands-on practice in handling
real data types, real business problems
and practical methods for delivering
business-useful results.
In the course "Introduction to
Monte Carlo and Discrete-Event
Simulation," taught by Barry Lawson,
University of Richmond, and Lawrence
Leemis, College of William and
Mary, participants will learn the
basics of Monte Carlo and discrete-event
simulation and how to identify
real-world problem types appropriate
for simulation. They'll also develop
skills and intuition for applying
Monte Carlo and discrete-event
simulation techniques.
Topic areas covered include Monte
Carlo modeling, sensitivity analysis,
input modeling and output analysis.
The course will be held at the
INFORMS office, Catonsville (Baltimore
area), Md., Sept. 12-13, and Chicago,
Oct. 16-17.
The second new course, "Foundations
of Modern Predictive Analytics," will
be taught by James Drew, Worcester
Polytechnic Institute, Verizon (ret.).
Modern predictive analytics, the
science of discovering and exploiting
complex data relationships, has rapidly
changed in recent years, especially in
today's businesses. This course will
give participants hands-on practice in
handling real data types, real business
problems and practical methods for
delivering business-useful results.
Some of the topic areas to be covered
in this course are: linear regression,
regression trees, logistic regression and
CART (classification and regression
trees).
The course will be held in Washington,
D.C., Sept. 15-16, and San Francisco,
Nov. 7-8.
Learn more about these courses,
including course outlines, instructor
biographies, program objectives and
how to register, at:
www.informs.org/continuinged.
ANALYTICS CLUSTER SET FOR
INFORMS ANNUAL MEETING IN S.F.
The Analytics Section of INFORMS
will present the analytics cluster of sessions
and presentations at the INFORMS
Annual Meeting in San Francisco
Nov. 9-12. The cluster encompasses
20 sessions featuring renowned
analytics practitioners and leaders. Nine
additional sessions will be jointly organized
in collaboration with the Health
Applications Society (HAS), CPMS
(the Practice Section of INFORMS)
and the Section on O.R. in Sports
(SpORts).
The sessions/presentations within
the cluster cover such topics as:
• Successful application of analytics in
multiple industries such as healthcare,
transportation, defense and sports
• Analytics focus areas such as big data,
spreadsheets and predictive analytics
• Panel discussions on understanding
the connection between O.R. and
analytics, building analytics programs to
support organizations' needs and business
analytics in the healthcare industry
• Winners of the Innovative Applications
in Analytics Award and the SAS Student
Paper Competition
• Whys, hows and whats of analytics
certification
More information about the conference
can be found at
http://meetings2.informs.org/sanfrancisco2014/.
FORUM
Pit stop analytics
BY E. ANDREW BOYD
Photo caption: Quick stop: Optimized F1 pit teams can change four tires in two seconds.
Magic shows are fun because we get to experience
the impossible. Still, we know there's trickery
afoot. But what about those times when the magic
isn't magic? When we witness something that's seemingly
impossible but proves all too real? Not only real,
but the result of optimization?
Such is the case in the Formula 1 race car pit. If
you follow F1 racing, it comes as no surprise that pit
stops have been reduced to two seconds. But if you
aren't an F1 devotee, the idea of lifting a car, changing
four tires and sending it on its way in a mere two
seconds stretches the imagination.
The role of the pit has changed dramatically over
the years. For much of racing history it was assumed
cars would only stop in the event of problems. Scheduled
tire changes or fuel stops weren't part of the
equation. This orthodoxy was challenged
in 1982 when an analytically minded race
team from the United Kingdom focused
on two important facts. First, softer tires
stuck to the track better during turns than
their harder cousins, though they wore
out more quickly. Second, less gas in the
tank translated into a lighter, and therefore
faster, car. Calculations showed
that time spent changing tires and refilling
the tank was more than offset by
the improved performance of the car on
the track. It's a calculation any analytics
practitioner would be proud of.
The idea quickly caught on, making
pit stops and their efficient execution
an integral part of racing. Refueling was
banned in 1984 out of safety concerns,
but reinstated in 1994. During that 10-year
period pit crews refined their tire-changing
skills to the point where the fastest pit
stops took a little over four seconds. When
refueling was again instituted, the impetus
for faster tire changes disappeared since
refueling was the bottleneck. That changed
in 2010 when F1 racing again reverted to
a no-refueling policy, setting the stage for
lightning-fast tire changes.
Achieving a two-second tire change
required optimizing the entire process.
Engineers took a look at everything from
the design of the wheel nuts (one per
wheel on F1 cars) to the special, self-
positioning pneumatic guns that remove
and tighten each nut. They then turned
their attention to the pit crews.
Teams of three work on each wheel:
one to remove the old tire, one to position
the new tire and one to operate the gun.
Their moves aren't left to chance, but are
choreographed down to the position of
their hands and feet from start to finish.
It's not hard to imagine Frank and Lillian
Gilbreth, progenitors of industrial engineering
and pioneers of time and motion
studies, standing nearby, stopwatches
in hand. They'd certainly be smiling in approval.
With two jack operators and scattered
observers, as many as 20 people
crowd around a car during a pit stop for
two seconds of work.
Optimization brings to mind models
and mathematical programs. But some-
times optimization is smart without being
sophisticated. And in the F1 pit, it works
like magic.
Andrew Boyd, INFORMS Fellow and INFORMS
VP of Marketing, Communications and Outreach,
served as executive and chief scientist at an
analytics firm for many years. He can be reached
at e.a.boyd@earthlink.net.
NOTES & REFERENCES
1. Gray, W., "Tech Talk: Can F1 Pit Stops Get Even
Quicker?" Eurosport, April 9, 2013. See also:
https://uk.eurosport.yahoo.com/blogs/will-gray/gray-matter-f1-stops-even-quicker-101951154.html.
Accessed May 24, 2014.
2. Examples of fast pit stops can be found at:
https://www.youtube.com/watch?v=aHSUp7msCIE
https://www.youtube.com/watch?v=Xvu0GlMa3xQ
CUSTOMER RELATIONSHIPS
Real-Time Text Analytics
Cloud-based analytical engine yields instant insight
using unstructured social media data.
BY AVEEK MUKHOPADHYAY AND ROGER BARGA
Information is generated in today's world more rapidly
than ever before, and it will keep growing at an
exponential rate. The rise of social media
combined with increased Internet penetration
has led to a significant increase
in user-generated content in the form
of product reviews and feedback, blogs,
independent news articles, and Twitter
and Facebook updates. The crux of
leveraging such data lies in identifying
patterns in it and using the data to
generate actionable insights in real time.
This article proposes a cloud-based
analytical engine that analyzes comments,
reviews and opinions generated
by customers to understand the main
underlying themes and the general sentiment
so that actionable insights can
be generated in real time. Algorithms
such as latent Dirichlet allocation for
topic modeling and the holistic lexicon-based
approach for sentiment mining
have been operationalized using a multi-agent
framework deployed in a cloud
environment. This approach meets computational
demands as it allows users
to run virtual machines within managed
data centers, freeing them from worrying
about acquisition of new hardware
and networks.
UNSTRUCTURED SOCIAL MEDIA DATA
According to a study by International
Data Corporation (IDC), mankind created
an estimated 150 exabytes (an exabyte is
1 billion gigabytes) of data in 2005, a number
that jumped to 1,200 exabytes in 2010. A
more recent study by IDC and EMC put
the amount of data created in 2011 at 1.8
zettabytes (a zettabyte is 1 followed by
21 zeroes, in bytes), a
number the study researchers expected
to double every two years.
Only 5 percent of this data is structured
(comes in a standard format that
can be read by computers). The remaining
95 percent is unstructured (photos,
phone calls and free-flow texts). A large
chunk of such unstructured data is in
text format. Although such data poses
challenges owing to its sheer volume, depth
and complexity, it holds immense
potential for organizations. The key lies
in identifying patterns from the data and
gaining relevant insights.
REAL-TIME ANALYTICS
Not long ago, analyzing data and
generating business intelligence reports
depended on the time-intensive ETL process
(extract, transform, load). Depending
upon the system and data complexity,
analytics could be delayed by hours, days
or even weeks while data management
put it all together.
In today's business landscape, minimizing
the lag between acquiring data
and generating actionable insight has become
the key differentiator. Acting in real
time to respond to an event can result in
huge profits and improved customer relationships
for a firm.
Real-time analytics can be beneficial in
multiple business scenarios, including:
• High-frequency trading (sophisticated
algorithms to rapidly trade securities)
• Real-time detection of fraudulent
transactions
• Real-time price adjustment based on
competitor information
• Real-time feedback from social
media for a product firm about its
new launch
• Real-time recommendations by retail
stores based on customers' location
• Real-time traffic routing based on
information about vehicle frequency,
direction, etc.
Social media content comes from
users without any vested interest; thus
their opinions beget more trust. Organizations
whose products and services
are mentioned in such media need to
remain current on relevant discussions
and be able to track the sentiment of every
employee, customer and investor. To
address this challenge, a cloud-based
real-time ecosystem was created for analyzing
comments, reviews and opinions
mined from Twitter. In addition, tracking
trending themes in the customer space
and the evolution of these trends over
time was incorporated.
TEXT MINING ALGORITHMS
Topic modeling. Topic models are
statistical techniques that analyze words/
phrases in textual data to understand
the main themes running through them.
The topic-modeling algorithm used here is based on LDA
(latent Dirichlet allocation) and uses the
observed words in tweets (extracted from
Twitter) to infer the hidden topic structure.
LDA is more easily understood by its
generative process. This generative process
defines a joint probability distribution
over the observed (the words) and hidden
(the topics) random variables. This joint
distribution is used to compute the conditional
distribution of the hidden variables
given the observed variables. This conditional
distribution is called the posterior
distribution.
A topic is assumed to be a collection
of words with different probabilities
of occurrence. An individual tweet can
be assumed to be generated from multiple
topics in different proportions. Every
word generated in a tweet can then be randomly
chosen in a two-step process:
First, a topic is randomly selected
from the tweet's distribution of topics.
Second, the word is randomly
selected from the distribution of
words over that topic.
So, the joint probability of word W and topic T is:
Probability (W, T) = Probability (T) * Probability (W | T)
Now, when a word W has been observed in the tweet,
so that its overall probability of occurrence
Probability (W) is known, the posterior
distribution is calculated as follows:
Probability (T | W) = Probability (W, T) / Probability (W)
Given the probabilities of observed
words, latent information like the vocabulary
distribution of a topic and the distribution
of topics over the tweet is thus
inferred.
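
As a rough illustration of the posterior calculation, the following Python sketch applies Probability (T | W) = Probability (W, T) / Probability (W) to a single observed word. The topic names, the probability values and the function name topic_posterior are all hypothetical, chosen only for this example. A full LDA implementation would also have to estimate these distributions from the tweets rather than take them as given.

```python
# Toy illustration of Probability(T | W) = Probability(W, T) / Probability(W).
# All topic names and probabilities below are made-up assumptions.

def topic_posterior(word, topic_prior, word_given_topic):
    """Return {topic: P(topic | word)} using Bayes' rule."""
    # Joint: P(W, T) = P(T) * P(W | T) for every topic T
    joint = {
        topic: topic_prior[topic] * word_given_topic[topic].get(word, 0.0)
        for topic in topic_prior
    }
    # Marginal: P(W) is the sum of the joint over all topics
    marginal = sum(joint.values())
    if marginal == 0.0:
        raise ValueError(f"word {word!r} has zero probability under every topic")
    # Posterior: P(T | W) = P(W, T) / P(W)
    return {topic: p / marginal for topic, p in joint.items()}


# Hypothetical two-topic model for tweets about a phone
topic_prior = {"battery": 0.6, "camera": 0.4}
word_given_topic = {
    "battery": {"charge": 0.30, "photo": 0.05},
    "camera": {"charge": 0.05, "photo": 0.40},
}

posterior = topic_posterior("charge", topic_prior, word_given_topic)
print(posterior)
```

With these assumed numbers, the joint probabilities for the word "charge" are 0.18 (battery) and 0.02 (camera), so the posterior assigns about 0.9 to the battery topic and about 0.1 to the camera topic.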
Sentiment analysis. A holistic lexi-
con-based algorithm is used to analyze
individual feature-level sentiments as well
as cumulative sentiments over tweets.
Aggregating opinions for a feature:
The algorithm parses one tweet at a time,
identifying the features present. A set of
opinion words for each feature is identified
using a lexicon. An orientation score
for each feature in the sentence is then calculated by summing the feature-opinion scores for that sentence. (Each feature-opinion score is the sentiment polarity of the opinion word multiplied by the reciprocal of the distance between the feature and the opinion word. Opinion words farther from the feature are assumed to be less strongly associated with it than nearer words.)
For example: "The phone is useful and a great work of art."
Let the feature here be "phone" and the opinion words be "useful" and "great."
Semantic orientation of "useful" = 1
Semantic orientation of "great" = 1
Distance between the words "useful" and "phone" = 2
Distance between the words "great" and "phone" = 5
score(f) = 1/2 + 1/5 = 0.7
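The feature-level calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `LEXICON` dictionary and the `feature_score` function are hypothetical names, and distance is assumed to be the difference in word positions.

```python
# Hypothetical mini-lexicon: +1 for positive words, -1 for negative words.
LEXICON = {"useful": 1, "great": 1, "good": 1, "bad": -1}


def feature_score(tokens, feature):
    """Sum polarity / distance over every opinion word in the sentence."""
    feature_pos = tokens.index(feature)
    score = 0.0
    for pos, token in enumerate(tokens):
        polarity = LEXICON.get(token)
        if polarity is None or pos == feature_pos:
            continue  # not an opinion word
        distance = abs(pos - feature_pos)  # distance in word positions
        score += polarity / distance
    return score


tokens = "the phone is useful and a great work of art".split()
print(round(feature_score(tokens, "phone"), 2))  # 0.7
```

Dividing by the distance is what makes "useful" (two words away) count more toward "phone" than "great" (five words away).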
Aggregating opinions for tweets: The
sentiment score for a tweet is the sum-
mation of the scores for all opinion words
present in the tweet.
For example: "The phone is useful and a great work of art."
The opinion words in the sentence are "useful" and "great."
Semantic orientation of "useful" = 1
Semantic orientation of "great" = 1
score(t) = 1 + 1 = 2
Negation rule: This identifies the negation word (which can be one or two places before the opinion word) and reverses the opinion expressed in a sentence.
For example: "The phone is not good."
Here "phone" gets a negative orientation.
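A tweet-level score with the negation rule applied might look like the sketch below. The lexicon, the list of negation words and the function name are illustrative assumptions; the only behavior taken from the text is that opinion-word polarities are summed and that a negation one or two places before an opinion word reverses it.

```python
LEXICON = {"useful": 1, "great": 1, "good": 1, "bad": -1}  # hypothetical lexicon
NEGATIONS = {"not", "no", "never"}  # hypothetical negation words


def tweet_score(tokens):
    """Sum opinion-word polarities, flipping any that follow a negation."""
    score = 0
    for pos, token in enumerate(tokens):
        polarity = LEXICON.get(token)
        if polarity is None:
            continue  # not an opinion word
        # Look one or two places before the opinion word for a negation.
        window = tokens[max(0, pos - 2):pos]
        if any(word in NEGATIONS for word in window):
            polarity = -polarity
        score += polarity
    return score


print(tweet_score("the phone is useful and a great work of art".split()))  # 2
print(tweet_score("the phone is not good".split()))  # -1
```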
Context-dependent rules: For features for which we find no opinion words, context-dependent constructs are used to identify the orientation score.
For example: "The phone is good but battery-life is short."
The only opinion word in the sentence is "good" ("short" is a context-dependent word).
"Phone" gets a positive orientation because of "good."
"Battery-life" gets a negative orientation because the word "but" is present between "good" and "battery-life."
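The "but" construct in the example can be approximated as follows. This is a deliberately simplified sketch that handles only a two-clause sentence with a single "but"; the lexicon and the function name are hypothetical, and the published approach covers more constructs than this.

```python
LEXICON = {"good": 1, "great": 1, "bad": -1}  # hypothetical lexicon


def orientations_around_but(tokens, first_feature, second_feature):
    """Apply a simplified 'but' rule to a two-clause sentence.

    The clause before 'but' is scored from the lexicon. If the clause after
    'but' contains no lexicon word, its feature gets the opposite orientation.
    """
    split_at = tokens.index("but")
    before, after = tokens[:split_at], tokens[split_at + 1:]
    first = sum(LEXICON.get(token, 0) for token in before)
    if any(token in LEXICON for token in after):
        second = sum(LEXICON.get(token, 0) for token in after)
    else:
        second = -first  # no known opinion word: 'but' reverses the orientation
    return {first_feature: first, second_feature: second}


sentence = "the phone is good but battery-life is short".split()
print(orientations_around_but(sentence, "phone", "battery-life"))
# {'phone': 1, 'battery-life': -1}
```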
Topic Evolution. The next step after topic modeling is to understand how topics and trends develop, evolve and go viral over time.
The algorithm maintains a fixed number of topic streams and their statistics. Each tweet is processed as it comes in and is assigned to the closest topic stream (the topic stream most similar to it). If no topic stream is close enough, then a new stream is created and a stale stream is killed to maintain a fixed number
of topic streams. Streams are constantly
monitored for the rate of arrival of tweets.
Whenever there is a burst of tweets in a
particular topic stream, an alert for the
trending topic is generated.
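The stream-maintenance loop described above can be sketched as follows. The article does not say how similarity, staleness or bursts are measured, so everything specific here is an assumption: Jaccard word overlap as the similarity measure, "least recently updated" as the definition of stale, a simple count of arrivals in a recent window as the burst test, and all four constants.

```python
from collections import Counter

MAX_STREAMS = 3        # assumed fixed number of topic streams
MIN_SIMILARITY = 0.2   # assumed "close enough" threshold
BURST_COUNT = 3        # assumed arrivals per window that trigger an alert
WINDOW = 10            # assumed window length, in seconds


class TopicStream:
    def __init__(self, words, timestamp):
        self.words = Counter(words)    # running word statistics
        self.arrivals = [timestamp]    # arrival times of assigned tweets

    def similarity(self, words):
        """Jaccard overlap between a tweet's words and the stream's words."""
        tweet, stream = set(words), set(self.words)
        union = tweet | stream
        return len(tweet & stream) / len(union) if union else 0.0

    def add(self, words, timestamp):
        self.words.update(words)
        self.arrivals.append(timestamp)

    def is_bursting(self, now):
        recent = [t for t in self.arrivals if now - t < WINDOW]
        return len(recent) >= BURST_COUNT


def process_tweet(streams, words, timestamp):
    """Assign a tweet to its closest stream; return True if it is trending."""
    best = max(streams, key=lambda s: s.similarity(words), default=None)
    if best is None or best.similarity(words) < MIN_SIMILARITY:
        if len(streams) >= MAX_STREAMS:
            # Kill the stalest stream: the one updated longest ago.
            streams.remove(min(streams, key=lambda s: s.arrivals[-1]))
        best = TopicStream(words, timestamp)
        streams.append(best)
    else:
        best.add(words, timestamp)
    return best.is_bursting(timestamp)


streams = []
tweets = [
    (0, ["new", "phone", "launch"]),
    (1, ["world", "cup", "final"]),
    (2, ["phone", "launch", "event"]),
    (3, ["phone", "launch", "price"]),
]
alerts = [process_tweet(streams, words, ts) for ts, words in tweets]
print(alerts)  # [False, False, False, True]
```

In this toy run the third phone-related tweet within the window raises the alert, while the unrelated tweet opens a second stream of its own.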
THE REAL-TIME EDGE
A multi-agent distributed framework enables the processing of real-time data and facilitates decision-making by allowing for easy deployment of analytical tasks in the form of process flows. In this multi-agent paradigm, an agent is a software program designed to carry out one or more tasks and can communicate with other agents in the system using an agent communication language. Thus, an analytical task can be written as an agent, and the analytical process flow can be established by wiring together a set of communicating agents (an agency) that can run in sequence or in parallel.
These agents were written using R to offer the analyst the benefits of a powerful and flexible statistical modeling language.
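The idea of wiring agents into a process flow can be illustrated with a very small sketch. The article's agents were written in R and exchanged messages through an agent communication language; the version below uses Python and plain dictionaries as messages purely for illustration, covers only the sequential case, and every name in it is hypothetical.

```python
def tokenizer_agent(message):
    """Hypothetical agent: add lowercase tokens to the message."""
    return {**message, "tokens": message["text"].lower().split()}


def sentiment_agent(message):
    """Hypothetical agent: score tokens against a tiny made-up lexicon."""
    lexicon = {"good": 1, "great": 1, "bad": -1}
    score = sum(lexicon.get(token, 0) for token in message["tokens"])
    return {**message, "sentiment": score}


def run_agency(agents, message):
    """Wire agents in sequence: each agent's output is the next one's input."""
    for agent in agents:
        message = agent(message)
    return message


result = run_agency([tokenizer_agent, sentiment_agent], {"text": "The phone is great"})
print(result["sentiment"])  # 1
```

Because each agent only reads and extends the message, the same agents can be rearranged or reused in other flows without modification.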
OPERATIONALIZATION IN THE
CLOUD
The entire real-time platform was then deployed on a cloud ecosystem to provide the following benefits:
Efficient resource management: The cloud platform provides the necessary virtual machine, network bandwidth and other infrastructure resources. Even when a machine goes down because of an unexpected failure, a new virtual machine is allocated for the application automatically.
Figure 1: Real-time text mining agency.
Dynamic scaling and load balanc-
ing: The cloud solution allows scaling
out as well as scaling back an appli-
cation depending on resource require-
ments. Multiple services running in
tandem make the whole system com-
putationally resource intensive. As re-
source demands increase, new role
instances can be provisioned to handle
the load. When demand decreases,
these instances can be removed so that
payment for unnecessary computing
power is not required.
Availability & durability: The cloud
storage services replicate data on three
different servers, guaranteeing it can be
accessed at all times, even if a server
shuts down unexpectedly.
Better mobility: The application can
be accessed from any place, as long as
there is an Internet connection. There is
no tight coupling with any physical server
or machine.
RESULTS
Figure 2 shows a snapshot of the topic
treemap generated in one run of the topic
modeling algorithm (different topics are
represented by different colors, with the
areas representing occurrence frequency).
Figure 2: Topic modeling treemap.
Incoming tweets over a time period
were captured in a stream graph visual-
ization as shown in the Figure 3 screen-
shot. Each topic is represented by a
stream in the visualization and is charac-
terized by the top words in that topic. At
any point of time, the top words in each
topic are displayed in a topic treemap
below the stream graph. The keyword treemap can also be
retrieved for any past point in time.
Successive runs of the sentiment
analysis algorithm for batches of tweets
are represented by the visual in Figure 4.
Each bar captures the sentiment
for that feature in a particular batch
of tweets. The height of the bar rep-
resents the number of opinion words
for the feature in that batch. The col-
or of each bar represents the overall
sentiment level expressed in a batch of
data, ranging from extremely negative
(dark red) to extremely positive (dark
green). The change in color of the bars
across various batches can be used
to identify stimuli that are driving the
change.
Selection of a particular bar provides
a deeper analysis of that batch. The size
of a bubble indicates the number of ref-
erences of a particular opinion word, and
the color shows the overall sentiment
score for the particular opinion word.
Both the size and color are indicators of
which opinion words drive the sentiment
for a feature in a batch.
Figure 3: Trends stream graph.
CLOSING THOUGHTS
Trending topics represent the popular
topics of conversation, and when de-
tected in real time, these hot topics are
the social pulses that are usually ahead
of any standard news media. Data ana-
lyzed via managed data centers can pro-
vide key insights into the evolving nature
and patterns of social information and
opinion and the general sentiment pre-
vailing over such subjects.
Aveek Mukhopadhyay is an associate manager
at Mu Sigma where he works with the Innovation
& Development Team with a core focus on driving
the adoption of advanced analytical platforms
and techniques both internally and externally. He
has interests in the fields of text mining, machine
learning and analytics automation.
Roger Barga, Ph.D., is group program manager
for the CloudML team at Microsoft Corporation
where his team is building machine learning as
a service in the cloud. Barga is also a lecturer
in the Data Science program at the University
of Washington. He joined Microsoft in 1997 as a
researcher in the Database Group of Microsoft
Research (MSR), where he was involved in a
number of systems research projects and product
incubation efforts, before joining the Cloud and
Enterprise Division of Microsoft in 2011.
Figure 4: Sentiment analysis.
NOTES & REFERENCES
1. The Economist (Feb. 25, 2010), "The Data Deluge" (http://www.economist.com/node/15579717).
2. David M. Blei, "Probabilistic Topic Models," Communications of the ACM, April 2012, Vol. 55, No. 4 (http://www.cs.princeton.edu/~blei/papers/Blei2012.pdf).
3. Xiaowen Ding, Bing Liu and Philip S. Yu, "A Holistic Lexicon-Based Approach to Opinion Mining" (http://www.cs.uic.edu/~liub/FBS/opinion-mining-final-WSDM.pdf).
THE DATA ECONOMY
Why do so many analytics projects fail?
Key considerations for deep analytics on big data, learning and insights.
BY HALUK DEMIRKAN AND BULENT DAL
What is big data? Big data, which means many things to many people, is not a new technological fad. In addition to providing innovative solutions and operational insights to enduring challenges and opportunities, big data with deep analytics instigates new ways to transform processes, organizations, entire industries and even society. Pushing the boundaries of deep data analytics uncovers new insights and opportunities, and "big" depends on where you start and how you proceed.
Big data is not just "big." The exponentially growing volume of data is only one of many characteristics that are often associated with big data, such as variety, velocity, veracity and others (the six Vs; see box).
According to Gartner Research, the worldwide market for analytics will remain the top focus for CIOs through 2017 [1]. According to Gartner,
more than half of all analytics projects
fail because they aren't completed
within budget or on schedule, or be-
cause they fail to deliver the features
and benefits that are optimistically
agreed on at their outset.
Today, an abundance of knowledge and experience exists for building successful data- and analytics-enabled decision support systems. So why do so many of these projects fail, and why are so many executives and users still so unhappy? While there are many reasons for the high failure rate, the biggest reason is that companies still treat these projects as just another IT project. Big data analytics is neither a product nor a computer system. Instead, it should be considered a constantly evolving strategy, vision and architecture that continuously seeks to align an organization's operations and direction with its strategic business goals and tactical and operational decisions. Table 1 includes a list of common mistakes that can doom analytics projects.
The six Vs of big data
• Volume (data at rest): terabytes, petabytes, exabytes and zettabytes of data
• Velocity (data in motion): streaming data, milliseconds to seconds; how fast data is being produced and how fast the data must be processed to meet the need or demand
• Variety (data in many forms): structured, unstructured, text, multimedia, video, audio, sensor data, meter data, HTML, e-mails, etc.
• Veracity (data in doubt): uncertainty due to data inconsistency and incompleteness, ambiguities, latency, deception, model approximations, accuracy, quality, truthfulness or trustworthiness
• Variability (data in change): the differing ways in which the data may be interpreted; different questions require different interpretations
• Value (data for co-creation and deep learning): the relative importance of different complex data from distributed locations. Big data with deep analytics means greater insight and better decisions, something that every organization needs.
WHY PROJECTS FAIL
KEY CONSIDERATIONS FOR DEEP
ANALYTICS
We live in an era of big data. Whether
you work in financial services, consumer
goods, travel, transportation, health-
care, education, supply chain, logistics
or industrial products and professional
services, analytics are becoming a com-
petitive necessity for your organization.
But having big data and even people
who can manipulate it successfully is
not enough. Companies need managers
who can partner effectively with analysts
to ensure that their work yields better
strategic and tactical decisions.
Big data with deep analytics is a jour-
ney that helps organizations solve key
business issues and opportunities by
converting data into insights to influence
business actions and drive critical busi-
ness outcomes. As organizations try to
take advantage of the big data opportuni-
ty, they need not be overwhelmed by the
various challenges that might await them.
Going Deep & Wide on big data with deep analytics for deep learning
Table 1: Common mistakes for analytics projects.
• Failing to build the need for big data within the organization
• Islands of analytics with an "Excel culture"
• Data quality and reliability issues
• Not investigating vendor products enough and instead blindly taking the path of least resistance
• Departmental thinking rather than looking at the big picture
• Considering this a one-time implementation rather than a living ecosystem
• Developing silo dashboards to answer a few questions rather than strategic, tactical and operational dashboards
• Not establishing a company ontology and definitions for a "single version of the truth" culture
• Lack of vision and strategy; not having a clear organizational communications plan
• Lack of upfront planning; overlooking the development of governance and program oversight
• Failure to reorganize for big data
• Not establishing a formal training program
• Ignoring the need to sell success and market the big data program
• Not having an adequate architecture for data integration
• Forgetting the rapidly increasing complexities that come with volume, velocity, variety, veracity and more
Managers will need to start their journey by [2]:
Identifying clear business need and value. Almost everything needs to be a business solution rather than a technology solution. Before companies start collecting big data, they should have a clear idea of what they want to do with it from a business standpoint. Here's what you need to consider:
Turn over part or all of big data
solution delivery to business leaders.
Project management and ownership
from business (not IT) in big data solu-
tions is the key for success. In the mean-
time, make sure to have clear alignment
between business and IT.
Partner with business peers to
identify opportunities and solutions.
If we talk about big data, the impact of
these projects should also be big. Cre-
ate a cross-organization team and in-
volve all stakeholders early in the game.
Co-creation of value with customers. The overall business objective should always be about customers. If one of the initiatives is about a big marketing outcome, then it should be about how to set up customer-centric marketing, how to provide targeted dynamic advertisement, how to engage customers and how to manage personalized shopping.
Start small with an eye to scale quickly. While big data solutions may be quite advanced, everything else surrounding them (best practices, methodologies, org structures, etc.) is nascent. No one has all the answers, at least not yet. Understand why traditional business intelligence and data warehousing projects can't solve a problem.
Small, simple and scalable. When
launching big data initiatives, avoid 1) get-
ting too complicated too fast, and 2) not
being prepared to scale once a solution
catches on. Big data solutions can quickly
grow out of control since discovering val-
ue from data prompts wanting more data.
Identify what part of the business
would benefit from quick wins. Look
for opportunities that will show quick
wins within no more than three months.
Success brings more people to the table.
This is not a one-time implementa-
tion. Understand that this is a living and
evolving organism that will grow expo-
nentially very fast. It is a culture change
in the company with the way that you
collect and use data, and the way you
make outcome-based decisions.
Develop a minimal set of big data governance directives upfront. Big data governance is a chicken-and-egg problem: you can't govern or secure what you haven't explored. However, exploring vast data sets without governance and security introduces risk.
New processes to manage open
source risks. Most big data solutions
are being built on open source software,
but open source has both legal and skill
implications as firms are: 1) exposed to
risk due to intellectual property issues
and complex licensing agreements; 2)
concerned about liability if systems built
on open source fail; and 3) required to
use technology that is often early re-
lease and not enterprise-class.
New agile processes for solution
delivery. Successful firms will embrace
agile practices that allow end users of
big data solutions to provide highly in-
teractive inputs throughout the imple-
mentation process.
Integrate structured and unstruc-
tured data from multiple sources. Inte-
gration of data is one of the most important
and also complex processes to serve efficient
and effective decision-making. In
terms of data, it includes machine data,
sensor data, videos, audio, documents,
enterprise content in call centers, e-mail
messages, wikis and, indeed, larger vol-
umes of transactional and application data.
Data sharing is key. In order for a
company to build a big data ecosystem
that drives business action, organiza-
tions have to share data.
Build a strong data infrastructure
to host and manage data. Make sure
to have secured and reliable in-house
and/or hosted data (e.g., cloud) and in-
formation management infrastructure.
Think about "What information do I collect today, and what analytics should I perform that can benefit me and others?"
New security and compliance
procedures to protect extreme-scale
data. In order to succeed with big data,
new processes must be developed that
recognize and protect the special nature
of extreme-scale data that may be large-
ly unexplored.
Be ready to support rapid growth. Big data solutions can grow fast and exponentially. They can start as a pilot with a few terabytes of data, then grow to a petabyte very quickly. Since the same data can be used in different ways and reanalyzed for new insights easily, nothing ever gets deleted.
Funding must move out of IT for
big data success. Funding for these
projects should come from outside of the
CIO organization and move to a market-
ing or sales organization, for instance,
so that the business has a vested stake
in the game.
Create a road map that gradually
builds the skills of your organization.
It's important to create a road map that
allows you to gradually build the required
skills within your staff, minimize risk and
capitalize on previous successes to gain
more support. In the organization, there
will be new roles and responsibilities such
as the data scientist, who possesses a
blend of skills that includes statistics, ap-
plied mathematics and computer science.
This is different from any current decision support solution. With big data, organizations should look for new capabilities, such as: using advanced analytics to uncover previously hidden patterns; visualization and exploration to help the business find more complete answers, with new types and greater volumes of data, and to best represent the data to the user and highlight important patterns to the human eye; enabling operational decision-making with on-demand stream data by making floor employees into analytic consumers; and turning insight into action to drive a decision, either with a manual step or an automated process. And most important, be ready for rapidly increasing benefits and complexities from the six Vs.
WHAT IS NEXT IN THE DATA
ECONOMY?
Organizations have access to a wealth of information, but they can't get value out of it because it is sitting in its most raw form or in a semi-structured or unstructured format [3]. As a result, they don't even know whether it's worth keeping.
So where is deep analytics for
deep learning headed in the next few
years? The exciting news is that many

organizations are already realizing the
value of big data analytics today. Insight-
driven, information-centric initiatives will
be deployed where the ability to capital-
ize on the six Vs of information will cre-
ate new opportunities for organizations
to exploit. By combining and integrating
deep analytics, local rules, scoring, opti-
mization techniques and machine learn-
ing with cognitive science into business
processes and systems, decision man-
agement helps deliver decisions that are
consistently optimized and aligned with
the organization's desired outcomes.
Social analytics will ensure busi-
nesses know how, when and where to
creatively engage with individual con-
sumers and social communities to fos-
ter trusted, one-to-one relationships and
better understand and manage the way
their companies are perceived. Integrat-
ing demographic and transactional data
with what can be learned about attitudes
and opinions allows organizations to
truly understand the motivations and intents
of their constituents to better serve
them at the right time and place.
Deep analytics will help organiza-
tions uncover previously hidden patterns,
identify classifications, associations and
segmentations, and make highly accu-
rate predictions from structured and un-
structured information. Organizations will
use real-time analysis of current activity
to anticipate what will happen and iden-
tify drivers of various business outcomes
so they can address the issues and chal-
lenges before they occur. Many decisions
will be made automatically by computers
that also have deep-learning capabilities.
When you are starting a big data journey, consider this question: "What should our big data with deep analytics roadmap look like to achieve our objectives?"
Haluk Demirkan (haluk@uw.edu) is a professor
of Service Innovation and Business Analytics, and
the founder and executive director of Center for
Information Based Management at the Milgard
School of Business, University of Washington-
Tacoma. He has a Ph.D. in information systems
and operations management from the University of
Florida. He is a longtime member of INFORMS.
Bulent Dal (bulent.dal@obase.com) is a co-founder
and general manager of Obase Analytical Solutions
(http://www.obase.com/index.php/en/obase),
Istanbul, Turkey. His expertise is in scientific retail
analytical solutions. He has a Ph.D. in computer
sciences engineering from Istanbul University.
Acknowledgement
Part of this article is excerpted with permission of the publisher, HBR Turkey, from Demirkan, H. and Dal, B., "Big Data, Big Opportunities, Big Decisions," Harvard Business Review Turkish Edition (published in Turkish), March 2014.
REFERENCES
1. Gartner, Inc., 2013, "Gartner Predicts Business Intelligence and Analytics Will Remain Top Focus for CIOs Through 2017," Dec. 16, 2013, http://www.gartner.com/newsroom/id/2637615.
2. Demirkan, H. and Dal, B., "Big Data, Big Opportunities, Big Decisions," Harvard Business Review Turkish Edition (published in Turkish), March 2014, pp. 28-30.
3. Davenport, T., 2013, "Analytics 3.0," Harvard Business Review, December.
DATA SCIENTISTS IN DEMAND
It's their time to shine
According to executive search firm head Linda Burtch, the job prospects for data scientists and other elite analytics professionals have never been better, and the future is even brighter.
BY PETER HORNER
In April, the executive search firm Burtch Works released the results of its first-of-its-kind salary and demographics survey of data scientists, a follow-up to a survey of big data professionals conducted a year earlier. Among other findings, the 2014 survey quantified that data scientists are well paid, relatively young, overwhelmingly male and that almost half (43 percent) are employed on the West Coast.
Linda Burtch, managing partner of Burtch Works, has been involved in the recruitment and placement of high-end analytics talent for 30 years. She started her career with Smith-Hanley before founding her own company five years ago. Analytics magazine editor Peter Horner interviewed Burtch in April, not long after the survey of data scientists was released. Following are excerpts from the interview.
Linda Burtch, founder and managing partner of Burtch Works.
Q&A WITH LINDA BURTCH
What did you find that surprised you the most from the salary and demographics survey of data scientists?
First of all, I find it funny that everyone is interested in salaries and what data scientists and big data professionals make, but it's such a taboo subject to actually talk about. Not to me. I talk about salaries all the time. That's my business.
What surprised me? That's an interesting question. It actually turned out the way I thought it would: a lot of the candidates living out on the West Coast and a higher predominance of Ph.D.s among data scientists than the general analytics population or the big data professionals, as I call them. It all pretty much made sense to me. It was interesting because it was actually quantified.
Weren't you a little surprised by the extent of the concentration of data scientists (nearly 50 percent) on the West Coast?
That's for the moment, for now, but watch and see what happens. Analytics has been around for a long time, yet some people still ask me, "Are you sure this isn't a fad?" It's not.
Analytics has become a hugely profitable specialty area within organizations as they try to optimize their operations, or target their marketing or look at return on investment issues, and that has been around for years and years.
I would argue that those issues are sort of the humdrum stuff of analytics. Data-driven decision-making is really going to explode, and that's what we are seeing with this whole area going toward data scientists. Data storage has become so much cheaper, computing power has become much faster, nanotechnology and sensors are now becoming ubiquitous. Self-driving cars, traffic sensors, the energy grid. The list goes on and on and on.
Right now the obvious stuff is happening with understanding digital streams of data in applications related to social media. That's pretty straightforward stuff, but wait until it hits the healthcare industry, for example. Self-driving cars are going to be a huge, huge deal. While a lot of it is being done out in California now, over the next five years we are going to see it scattered all over the United States.
When it comes to recruiting candidates and job placement, who are you talking to?
I recruit in analytics: people who have master's degrees in statistics, operations research, econometrics, people who are out there working in business applications, solving problems related to marketing spend or credit worthiness or target marketing. More recently I've gotten into data science. That's a huge umbrella description.
You mentioned operations research,
the heart and soul of INFORMS.
It is. When I started out in recruiting more than 30 years ago, I focused on operations research candidates. It's grown pretty dramatically since then. They have a very fond place in my heart because that's how I got started. It's one of those things: I've really been involved with the INFORMS group back in New York when I was living there, and I'm really excited now because the INFORMS group in Chicago is getting re-energized. It's really exciting to watch.
When looking at the job market-
place, do you distinguish between,
say, a data scientist and other analyt-
ics professionals?
Let me back up a little bit. Last summer, when I was putting together the big data salary study, I saw that data scientists were a breed apart, and that they had higher compensation levels. So I made the decision to take them out of the general big data study and hold them for later because it's such an emerging field that's so different. They are working with what I would call unstructured data. You could get into a lot more detail over how a data scientist is different from a big data professional, but the primary distinguishing feature, in my opinion, is that data scientists are working with data that's unstructured. It's something that's going to grow as sensors become more and more prevalent and data streams become continuous in so many application areas.
How would you describe the current job market for "quants," for lack of a better word?
It's hot. A couple of months ago we did a flash survey in which we simply asked how often you are contacted about a new job opportunity through LinkedIn. We had 400 responses; 89 percent of the respondents said they were contacted at least monthly, and 25 percent said that they were contacted at least weekly. I'm working with elite data scientists, and they're telling me that they get calls once or twice a day from recruiters, so it's just crazy.
Our candidates are seeing a 14 per-
cent increase in salary when they change
jobs, so there's a lot of churn out there.
If they stay with their existing company,
they might see an annual increase of be-
tween 2 percent and 3 percent, so the
14 percent is a nice bounce if they de-
cide to make a change. One of my data
scientists in Boston said he received 30
calls in one week after he left a job and
went on the job hunt.
It's amazing. Competing offers are
another sign that the market is really
hot. Sign-on bonuses are another thing
that has become very commonplace in
the analytics job market. Another sign
that is important to note is that aca-
demic institutions have really stepped
up, with many of them developing mas-
ter's programs in analytics, predictive
analytics and the like, so that's some-
thing that is very new in the last two or
three years.
In an interview with the New York
Times, you said in reference to MBAs,
and I quote, "In 15 years, if you don't
have a solid quant background, you
might have a permanent pink slip."
That's a little rough, isn't it?
I know, I've become the harbinger of
the permanent pink slip. Seriously, I have
seen many MBAs, your general MBA,
look around and say, whoa, this is a little
bit scary, because they are seeing this
trend toward analytical decision-making
becoming so predominant in business.
Personally, I think within 10 or 15 years
if MBAs don't have a quantitative foun-
dation, they will be prevented from pro-
motion. We'll see. I always said back
when I was working with the operations
research people that my guys are so
smart, they are the ones who should be
running these companies. Now I'm see-
ing it come true.
In an episode of the TV show Mad
Men, the ad agency employees, cir-
ca late 1960s, were concerned that a
new computer the size of a confer-
ence room would make them expend-
able. Your quote reminded me of that.
Right. A lot of people ask me about
that. There is going to be a disruption.
There already has been. Just yesterday,
the Times had a visual display of analyt-
ics and quants and how it was disrupting
things and what jobs were going to be
eliminated, including truck drivers and
airplane pilots.
Self-driving cars, robots, analytics,
algorithms and all this stuff are here to
stay, and it's only going to get bigger,
but it's not going to replace the ability to
read, write and think critically. While I'm
a big proponent of analytics, communi-
cation will continue to be really impor-
tant; human-to-human contact can't be
replaced, ever.
Just how important are commu-
nication skills to a data scientist?
INFORMS, for example, now routinely
holds soft skills workshops aimed
at helping analysts explain their work
to non-technical audiences in order
to garner corporate buy-in.
Yes. That's absolutely critical. The
other piece that goes hand in hand with
that is having the ability to understand
the business at hand. Business acumen
is really important. You have to have
that gut check: does it make sense, and
how can I best monetize the situation
to benefit a client or employer? It's re-
ally important for people to understand
not only what's interesting (what a lot
of quantitative people tend to gravitate
toward) but also what's important.
If a company is just starting out on
the analytics journey and has no in-
house expertise in this area, how can
they judge a candidate's technical
abilities?
That's an interesting problem. When
I'm talking to a client, especially in this
data science area that is so new, they
will call me and sometimes they will have
it down. They are talking the right lan-
guage, they are thinking about the right
things, they are asking the right ques-
tions. Other clients are floundering; they
are still exploring.
I think it's very important that they
make sure they understand where their
needs are before they actually bring in
somebody because it's not inexpensive
to apply analytics in an organization.
You really need to think very carefully
what the goals are, what the road map
is going to look like and so on. I can cer-
tainly help with that, and I can give the
names of consultants who can help a
company really understand what their
plan should be before they jump in and
make hires.
On the other side of that coin,
what's the best advice you can give
an analytics candidate who is testing
the job market?
Another flash survey we did focused
on understanding what motivates peo-
ple to make a job change. The number
one motivation is money, but it's quickly
followed by challenging work and the op-
portunity to grow within an organization.
Money is important to everyone,
but candidates shouldn't make deci-
sions regarding changing jobs based on
salary alone because money isn't going to be the
factor that's going to change their life. Rather, it's
the kind of work you will do and how engaged
you will be. It's really important to understand
the challenge and the growth opportunity within
whatever it is you are looking to jump into.
The third thing I think is important to analyze
for any quantitative person when they're talking
to a potential new employer is to understand if
analytics has a seat at the corporate table. You
have to make sure that there is buy-in within the
organization and the stakeholders are really ac-
tively involved and engaged in conversations
about how analytics can and should be used or
embedded within any organization. That's a huge
factor in understanding how happy you will be
in your job and how successful you can be as a
quantitative professional.
Getting back to the plight of the quant-
poor MBA, how can a candidate boost ana-
lytical skills mid-career? Many colleges and
universities are now offering analytics pro-
grams, often online, through their business
schools, and INFORMS, for example, holds
continuing education courses in the analyt-
ics area, as well as a certification program.
I get that question a lot: "I'm really interested
in beefing up my analytical skills, so what should
I do?" As you noted, there are more opportunities
than ever to do that. In addition to the formal edu-
cation programs, there are plenty of good books
on the topic. I get the question all the time: "What
books should I be looking at?"
Another way that you can jump into
this is through Kaggle competitions,
which I recommend to people if they are
interested in understanding data science
and who else is out there doing this kind
of work and what they are doing. There
are many tools out there. Certainly what
INFORMS is doing is terrific.
It's important to keep your skills fresh
and make sure you continue to learn.
When it comes to giving general career
advice, especially to younger candidates,
my advice is this: prepare for three or four
careers during your lifetime. In today's
world, it's not good to specialize in one
thing and try to stick with one company
or one industry or one vertical applica-
tion for your entire career. It's incredibly
dangerous, and it likely won't carry you
through a 35-year career. You need to
be continuously learning something new.
People should keep that in mind.
INFORMS offers an analytics cer-
tification program (CAP). Is that a dif-
ferentiator in the job marketplace?
No two candidates are ever equal,
but it can certainly help once there are
enough employers out there who under-
stand what it means to be CAP certified.
I'm seeing people put various MOOCs
(massive open online courses) on their
resumes now, along with Kaggle com-
petition results. I have a candidate who
actually got his job because of a Kaggle
competition. The first couple of times
he submitted his solution it was totally
rejected, but as he continued to study
the problem and resubmitted, he
climbed up the leaderboard. Then he
started getting calls and job opportuni-
ties because of his Kaggle rank.
From your perspective, what does
the future hold for data scientists and
other analytics professionals?
In my 30 years of experience, I have
never seen anything like this. The oppor-
tunities for elite analytics candidates have
never been better, and I think what we're
seeing now is just the tip of the iceberg.
As I said earlier, I really think that my
quantitative candidates are going to be
running companies one day. Certainly
the CMO (chief marketing officer) is go-
ing to be coming up through the analyt-
ics ranks. Now there's all this talk about
CAOs (chief analytics officers). I think the
candidates I'm working with have a very
strong chance if they have leadership
ability and the ambition to advance up
the ranks and continue to climb and run
organizations at some point. Their quan-
titative skills are going to be unique and
absolutely required to be a successful
businessperson. It's their time to shine.
Peter Horner (peter.horner@mail.informs.org) is the
editor of Analytics and OR/MS Today magazines.
ANALYTICS ACROSS THE ENTERPRISE
Analytics transforms
a dinosaur
The story of how IBM not only survived but thrived
by realizing business value from big data.
BY BRENDA DIETRICH,
EMILY PLACHY AND MAUREEN
NORTON
This is the story of how an
iconic company founded
more than a century ago,
and once deemed a dino-
saur that would not be able to survive
the 1990s, has learned lesson after les-
son about survival and transformation.
The use of analytics to bring more sci-
ence into the business decision process
is a key underpinning of this survival and
transformation. Now for the first time, the
inside story of how analytics is being used
across the IBM enterprise is being told.
According to Ginni Rometty, chairman,
president and chief executive officer, IBM
Corporation, "Analytics is forming the
silver thread through the future of every-
thing we do."
What is analytics? In simple terms,
analytics is any mathematical or scientific
method that augments data with the intent
of providing new insight. With the nearly
1 trillion connected objects and devices
generating an estimated 2.5 billion giga-
bytes of new data each day, analytics can
help discover insights in the data. That in-
sight creates competitive advantage when
used to inform actions and decisions.
Data is becoming the
world's new natural re-
source, and learning how
to use that resource is a
game changer.
Analytics is not just a
technology; it is a way of
doing business. Through
the use of analytics, in-
sights from data can be
created to augment the
gut feelings and intuition
that many decisions are
based on today. Analytics
does not replace human
judgment or diminish the
creative, innovative spirit
but rather informs it with new insights to
be weighed in the decision process.
Analytics for the sake of analytics will
not get you far. To drive the most value,
analytics should be applied to solving your
most important business challenges and
deployed widely. Analytics is a means, not
an end. It is a way of thinking that leads to
fact-based decision-making.
BIG DATA AND ANALYTICS
DEMYSTIFIED
If analytics is any mathematical or sci-
entific method that augments data with
the intent of providing new insight, aren't
all data queries analytics? No. Analytics is
often thought of as answering questions
using data, but it involves
more than simple data
(or database) queries.
Analytics involves the use
of mathematical or scien-
tific methods to generate
insight from the data.
Analytics should be
thought of as a progres-
sion of capabilities, start-
ing with the well-known
methods of business in-
telligence, and extending
through more complex
methods involving sig-
nificant amounts of both
mathematical modeling
and computation.
Reporting is the most widely used
analytic capability. Reporting gathers
data from multiple sources, such as busi-
ness automation, and creates standard
summarizations of the data. Visualiza-
tions are created to bring the data to life
and make it easy to interpret.
As a generic example, consider store
sales data from a retail chain. The data
is generated through the point of sale
system by reading the product bar codes
at checkout. Daily reports might include
total store revenue for each store, rev-
enue by department for each region, and
national revenue for each stock-keeping
unit (SKU). Weekly reports might include
the same metrics, as well as comparisons
to the previous week and comparisons
to the same week in the previous calen-
dar year. Many reporting systems also
allow for expanding the summarized data
into its component parts. This is particu-
larly useful in understanding changes in
the sums.
For example, a regional store man-
ager might want to examine the store-
level detail that resulted in an increase
in revenue from the home entertainment
department. She would be interested in
knowing whether sales increased at most
of the stores in the region, or whether the
increase in total sales resulted from a sig-
nificant sales jump in just a few stores.
She might also look at whether the in-
crease could be traced back to just a
few SKUs, such as an unusually popular
movie or video game. If a likely cause of
the sales increase can be identified, she
might alert the store managers to moni-
tor inventory of the popular products, re-
position the products within a store, or
even reallocate inventory of the products
across stores in her region.
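The roll-up and drill-down described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, store names, SKUs and revenue figures are hypothetical, and a real reporting system would read from the point-of-sale database rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical point-of-sale records: (week, store, department, sku, revenue).
SALES = [
    (1, "S1", "home_ent", "movie_a", 100.0),
    (1, "S2", "home_ent", "movie_a", 120.0),
    (1, "S3", "home_ent", "game_b", 80.0),
    (2, "S1", "home_ent", "movie_a", 105.0),
    (2, "S2", "home_ent", "movie_a", 310.0),
    (2, "S3", "home_ent", "game_b", 85.0),
]


def revenue_by(records, key_index, week):
    """Sum revenue for one week, grouped by the field at key_index."""
    totals = defaultdict(float)
    for rec in records:
        if rec[0] == week:
            totals[rec[key_index]] += rec[4]
    return dict(totals)


def week_over_week(records, key_index, prev_week, this_week):
    """Change in revenue per group between two weeks."""
    prev = revenue_by(records, key_index, prev_week)
    curr = revenue_by(records, key_index, this_week)
    return {k: curr.get(k, 0.0) - prev.get(k, 0.0) for k in set(prev) | set(curr)}


# Summary level: how much did each department change?
dept_change = week_over_week(SALES, 2, 1, 2)
# Drill-down: which stores account for the change?
store_change = week_over_week(SALES, 1, 1, 2)
top_store = max(store_change, key=store_change.get)
share = store_change[top_store] / dept_change["home_ent"]
print(dept_change, top_store, round(share, 2))
```

With these made-up figures, the department-level report shows an increase, and the store-level view shows that nearly all of it came from a single store, which is the question the regional manager in the example is asking.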
WHY ANALYTICS MATTER
Quite simply, analytics matters be-
cause it works. You can be overwhelmed
with data and the value of it may be unat-
tainable until you apply analytics to create
the insights. Human brains were not built
to process the amounts of data that are
today being generated through social me-
dia, sensors, and more. While gut instinct
is often the basis for decisions, analyti-
cally informed intuition is what wins going
forward.
Several studies have highlighted the
value of analytics. Companies that use
predictive analytics are outperforming
those that do not by a factor of five. In
a 2012 joint survey by the IBM Institute
for Business Value and the Saïd Busi-
ness School at the University of Oxford of
more than 1,000 professionals around the
world, 63 percent of respondents reported
that the use of information (including big
data and analytics) is creating a competi-
tive advantage for their organizations. IBM
depends on analytics to meet its business
objectives and provide shareholder value.
The bottom line is that analytics helps the
bottom line. Your competition will not be
waiting to take advantage of the new in-
sights from big data. Should you?
IBM has approached the use of ana-
lytics with a spirit of innovation and a be-
lief that analytics will illuminate insights
in data that can help improve outcomes.
The company hasn't been afraid to make
mistakes or redesign programs that
haven't worked as planned. Unlike tra-
ditional IT projects, most analytics proj-
ects are exploratory. For example, the
Development Expense Baseline Project
explored innovative ways to determine
development expense at a detailed level,
thereby addressing a problem that many
thought was impossible to solve. IBM
analytic teams haven't waited for perfect
data to get started; rather, they have re-
fined and improved their data along the
way.
The key is to put a stake in the ground
with a commitment that analytics will be
woven into your strategy. That's how IBM
does it. This approach is also effective
with big data. Rather than postpone the
leveraging of big data, you should em-
brace it, establish a link between your
business priorities and your information
agenda, and apply analytics to become a
smarter enterprise.
PROVEN APPROACHES
Staying focused on solving business
problems was the pragmatic start, and
the other crucial element was having very
high-level executive support from the be-
ginning. From a governance perspective,
those are two key levers to drive value:
focus on actions and decisions that will
generate value and have high-level ex-
ecutive sponsorship.
The ideal team to do analytics is a
collaboration between an experienced
data scientist, a person steeped in the
area of the business where the challenge
needs to be solved, and an IT person
with expertise in the data in that particu-
lar area of the business.
A joint study by MIT Sloan and the
IBM Institute for Business Value devel-
oped several recommendations. The first
is that you start with your biggest and
highest-value business challenge. The
next recommendation is to ask a lot of
questions about that challenge in order to
understand what's going on or what could
be going on. Then you go out and look for
what data you might have that's relevant
to that challenge. Finally, you determine
which analytic technique can be used to
analyze the data and solve the problem.
Because most companies have con-
straints on the amount of money and
skills available for projects, estimating
the ROI can provide a better differentiator
for selecting the project with the highest
potential impact than relying on instincts.
Estimating an analytics project's ROI in-
volves both capturing the project costs
and measuring the value.
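As a minimal sketch of that comparison, the snippet below ranks candidate projects by estimated ROI. The article does not give a formula, so the common definition (value minus cost, divided by cost) is assumed here, and the project names and dollar figures are hypothetical.

```python
def roi(value, cost):
    """Return on investment as a fraction: (value - cost) / cost (assumed definition)."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (value - cost) / cost


# Hypothetical candidate projects: (name, estimated cost, estimated value).
projects = [
    ("churn_model", 200_000, 500_000),
    ("pricing_test", 50_000, 90_000),
    ("inventory_opt", 400_000, 700_000),
]

# Rank projects from highest to lowest estimated ROI.
ranked = sorted(projects, key=lambda p: roi(p[2], p[1]), reverse=True)
for name, cost, value in ranked:
    print(f"{name}: ROI = {roi(value, cost):.0%}")
```

Note that the largest project in absolute value is not the one with the highest ROI in this made-up list, which is why an explicit estimate can rank projects differently than instinct would.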
EMERGING THEMES
Relationships inferred from data
today may not be present in data col-
lected tomorrow. The relationships that
you infer from data about the past do
not necessarily hold in data that you col-
lect tomorrow. You cannot analyze data
once and then make decisions forever
based on old analysis. It's important to
continually analyze data to verify that pre-
viously detected relationships are still val-
id and to discover new ones. Fortunately,
major discontinuities with data do not hap-
pen very often, so change generally hap-
pens gradually. Social media sentiment,
however, has a much shorter half-life than
most data.
Using relationships derived from past
data has been repeatedly demonstrated
to work better than assuming that no re-
lationships exist. The relationships that
have been detected are likely correlation
rather than causality. However, these re-
lationships, if detected and acted upon
quickly, may provide at least a temporary
business advantage.
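One simple way to check whether a previously detected relationship still holds is to recompute a correlation on successive windows of data. The sketch below does that with a plain Pearson correlation; the two series are hypothetical and constructed so that the relationship reverses halfway through. It measures correlation only and says nothing about causality.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def windowed_correlations(xs, ys, window):
    """Correlation in consecutive, non-overlapping windows of the data."""
    return [
        pearson(xs[i:i + window], ys[i:i + window])
        for i in range(0, len(xs) - window + 1, window)
    ]


# Hypothetical series: y tracks x in the first half, then moves against it.
x = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10, 10, 8, 6, 4, 2]
corrs = windowed_correlations(x, y, window=5)
print(corrs)
```

A decision rule built on the first window alone would be wrong in the second, which is the reason for re-running the analysis as new data arrives.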
You don't have to understand ana-
lytics technology to derive value from it.
For a long time, many business leaders
expressed the opinion that mathemat-
ics should be used by only those who
understood the details of the computa-
tions. However, in recent years this view
has been changing, and analytics is be-
ing treated like other technologies. You
must learn how to use it effectively, but
it is not necessary to understand the in-
ner workings in order to apply analytics
to business decisions. You have to apply
analytics methods in the context of the
problem that is being solved and make
the results accessible to the end user. But
just as the user of a car navigation system
does not need to understand the details
of the routing algorithm, the end user of
analytics does not have to understand the
details of the math.
Typically, making the results accessi-
ble to the end user involves wrapping the
math in the language and the process of
the end user. Also, the analytics can be
embedded deep inside things so that the
user does not see it, like in supply chain
operations. Analytics should be usable by
anyone, not just those with Ph.D.s in sta-
tistics or operations research. Some us-
ers will want to understand the algorithms
and inner workings of an analytics model
in order to trust the results prior to adop-
tion, but they are the exception.
Fast, cheap processors and cheap
storage make analysis on big data pos-
sible. Moore's law has resulted in vast
increases in computing power and vast
decreases in the cost of storing and ac-
cessing data. With readily available and
inexpensive computing, we can do what-
if calculations often and test a number of
variables in big data for correlation.
Doing things fast is almost always
better than doing things perfectly.
Often inexact but fast approaches pro-
duce enormous gains because they re-
sult in better choices than humans would
have made without the use of analytics.
Over time, the approximate analytics
methods can be refined and improved to
achieve additional gains. However, for
many business processes, there is even-
tually a point of diminishing returns: The
calculations may become more detailed
and precise, but the end results are no
more accurate or valuable.
Using analytics leads to better
auditability and accountability. With
the use of analytics, the decision-making
process becomes more structured and
repeatable, and a decision becomes less
dependent on the individual making the
decision. When you change which peo-
ple are in various positions, things still
happen in the same way. You can often
go back and find out what analysis was
used and why a decision was made.
Dr. Brenda L. Dietrich is an IBM Fellow and
vice president. She joined IBM in 1984, and
during her career she has worked with almost
every IBM business unit and applied analytics to
numerous IBM decision processes. She currently
leads the emerging technologies team in the IBM
Watson group. For more than a decade, she led
the Mathematical Sciences function in the IBM
Research division, where she was responsible for
both basic research on computational mathematics
and for the development of novel applications of
mathematics for both IBM and its clients.
In addition to her work within IBM, she has been
the president of INFORMS, the world's largest
professional society for operations research and
management sciences. An INFORMS Fellow,
she has received multiple service awards from
INFORMS.
Dr. Emily C. Plachy is a distinguished engineer
in Business Analytics Transformation at IBM,
where she is responsible for leading an increased
use of analytics across IBM. Since joining IBM
in 1982, she has integrated data analysis into
her work and has held a number of technical
leadership roles including CTO, process,
methods, and tools in IBM Global Business
Services.
In 1992, Emily was elected to the IBM
Academy of Technology, a body of approximately
1,000 of IBM's top technical leaders, and she
served as its president from 2009 to 2011. She is
a member of INFORMS.
Maureen Fitzgerald Norton, MBA, JD, is a
distinguished market intelligence professional
and executive program manager in Business
Analytics Transformation, responsible for driving
the widespread use of analytics across IBM. In
her previous role, she led project teams applying
analytics to IBM Smarter Planet initiatives in
public safety, global social services, commerce
and merchandising.
Norton became the first woman in IBM to
earn the designation of Distinguished Market
Intelligence Professional for developing
innovative approaches to solving business issues
and knowledge gaps through analysis.
Note: This article is adapted from the book,
Analytics Across the Enterprise: How IBM
Realizes Business Value from Big Data and
Analytics, authored by Brenda L. Dietrich, Emily
C. Plachy and Maureen F. Norton, published by
Pearson/IBM Press, May 2014, ISBN 978-0-13-383303-4,
© 2014 by International Business
Machines Corporation. For more information,
visit: ibmpressbooks.com.
SOFTWARE SURVEY
The future of
forecasting
Making predictions from hard and fast data.
BY JACK YURKIEWICZ
Here is an easy forecast to
make: Forecasting will be
part of our information flow
for the foreseeable future.
Forecasting is also a key topic in my
Decision Modeling for Management
course. In preparing the midterm exam
for this past spring term, I wanted the stu-
dents to analyze the enrollment figures for
the Affordable Care Act and make some
forecasts. The media has been talking
about these enrollment figures since the
rollout, and politicians have been making
projections about them as well. In the
course we covered various forecasting
methodologies, including trend analysis.
Thus, my plan for a midterm problem was
to give the students the enrollment data
and have them make a forecast for the
May 1 enrollment deadline. Getting those
enrollment numbers became obstacle
number one.
Figure 1: http://www.cnn.com/interactive/2013/09/health/map-obamacare/.
Figure 2: http://www.whitehouse.gov/the-press-office/2014/04/17/fact-sheet-affordable-care-act-numbers.
Figures 1 and 2 show some typical
results of an Internet search. I found
graphs, some better, most worse (look
at the markers on the x-axis of the graph
in Figure 1), lots of opinion articles with
forecasts, but no data. I punted and de-
cided to present the class a similar but
far less pressing problem. On March 31,
the day of the midterm exam, I asked
students to make forecasts for the
cumulative domestic box-office gross for
the recently released movie Non-Stop.
The action film starring Liam Neeson
had opened on Feb. 28, and I gave the
students the daily domestic box-office
gross values from opening day through
March 16, or 17 days of data. The stu-
dents were asked to make a time plot
of these box-office figures (see Figure
3) and, after examining various trend
models, get a forecast for the cumu-
lative domestic box-office gross for a
target date, midterm day, March 31.
I knew that two days later (after I had
graded their exams and returned them),
Universal Studios would give the actual
cumulative domestic gross of the film
as of March 31. It was $85.39 million.
Of the various trend models we cov-
ered, the Weibull curve yielded the most
accurate forecast, $86.11 million; anoth-
er model was reasonably close, and the
others we discussed and they tried were
way off.
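For readers who want to try a Weibull-type trend themselves, here is a rough sketch. The article gives neither the formula nor the daily figures, so everything specific below is an assumption: cumulative gross is modeled as ceiling * (1 - exp(-(t / scale) ** shape)), the fit uses a grid over candidate ceilings with a linearized least-squares step, and the 17 "observations" are synthetic values generated from a known curve. They are not the film's actual box-office numbers.

```python
from math import exp, log


def weibull_cum(t, ceiling, scale, shape):
    """Assumed Weibull-type cumulative trend, t in days since opening (day 1)."""
    return ceiling * (1.0 - exp(-((t / scale) ** shape)))


def fit_weibull(days, cum, ceilings):
    """Grid over candidate ceilings; for each, fit shape and scale by
    linear regression on ln(-ln(1 - y/ceiling)) = shape*ln(t) - shape*ln(scale)."""
    best = None
    xs = [log(t) for t in days]
    n = len(xs)
    mx = sum(xs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    for ceiling in ceilings:
        if ceiling <= max(cum):
            continue  # the ceiling must exceed every observation
        zs = [log(-log(1.0 - y / ceiling)) for y in cum]
        mz = sum(zs) / n
        shape = sum((x - mx) * (z - mz) for x, z in zip(xs, zs)) / sxx
        scale = exp(mx - mz / shape)
        sse = sum((weibull_cum(t, ceiling, scale, shape) - y) ** 2
                  for t, y in zip(days, cum))
        if best is None or sse < best[0]:
            best = (sse, ceiling, scale, shape)
    return best[1:]


# Synthetic 17-day cumulative series from a known curve (illustration only).
days = list(range(1, 18))
observed = [weibull_cum(t, 100.0, 10.0, 0.9) for t in days]

ceiling, scale, shape = fit_weibull(days, observed, [float(c) for c in range(85, 121)])
forecast_day_32 = weibull_cum(32, ceiling, scale, shape)  # Feb. 28 = day 1, March 31 = day 32
print(ceiling, round(scale, 3), round(shape, 3), round(forecast_day_32, 2))
```

Because the synthetic data contain no noise, the fit recovers the generating parameters; with real daily grosses the fit would be approximate, and a nonlinear least-squares routine would be the more usual tool.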
CATEGORIZING THE FORECAST
SOFTWARE
Commercial forecasting software
is available in two broad categories.
Using the nomenclature from previous
OR/MS Today forecasting surveys, the frst
category is called dedicated software. A
dedicated product implies that the software
only has various forecasting capabilities,
such as Box-Jenkins, exponential smooth-
ing, trend analysis, regression and other
procedures. The second category is called
general statistical software. This implies the
product does have forecasting techniques
as a subset of the many statistical proce-
dures it can do. Thus, a product that can
do ANOVA, factor analysis, etc., as well
as Box-Jenkins techniques would fall into
Figure 3: Initial daily domestic box-offce gross of the motion picture (Non-Stop).
this group. In recent years, the number of products in the second
category has been growing, as statistical software firms have been adding
more, and more sophisticated, forecasting methodologies to their lists of
features and capabilities. However, some dedicated software manufacturers
offer specific capabilities and features (e.g., transfer functions,
econometric models, etc.) that general statistical programs may not have.
In both software categories, products vary in the degree to which they
can find the appropriate model and the optimal parameters of that model.
For example, Winters' method requires values for three smoothing
constants, and Box-Jenkins models have to be specified with various
parameters, such as ARIMA(1,0,1)x(0,1,2). For the purposes of this and
previous surveys, software is characterized by its ability to find the
optimal model and parameters for the data. Software is labeled as
automatic if it both recommends the appropriate model to use on a
particular data set and finds the optimal parameters for that model.
Automatic software typically asks the user to specify some criterion to
minimize (e.g., Akaike Information Criterion (AIC), Schwarz Bayesian
Information Criterion (SBIC), RMSE, etc.), then recommends a forecast
model for the data, gives the model's optimal parameters, calculates
forecasts for a user-specified number of future periods, and gives
various summary statistics and graphs. The user can manually overrule the
recommended model and choose another, and the software finds the optimal
parameters, forecasts, etc., for that one.
The second category is called semiautomatic. Such software asks the user
to pick a forecasting model from a menu and some statistic to minimize,
and the program then finds the optimal parameters for that model, the
forecasts, and various graphs and statistics.
The third category is called manual software. Here the user must specify
both the model that should be used and the corresponding parameters. The
software then finds the forecasts, summary statistics and charts. If you
frequently need to make forecasts of different types of time series,
using manual software could be a tedious choice. Unfortunately, these
broad labels may not fit some software neatly, because some products fall
into two categories. For example, if you choose a Box-Jenkins model, the
software may find the optimal parameters for that model, but if you
specify that Winters' method be used, the product may require that you
manually enter the three smoothing constants.
When it comes to analyzing trends, most of the products I tried fall into
the semiautomatic group. That is, I need to choose a trend curve, and the
software finds the appropriate parameters for that model and gives
forecasts, summary statistics and graphs.
WORKING WITH A SAMPLE OF
PRODUCTS
In my class, students use StatTools, part of the Palisade software suite
that comes with their textbook. Its forecasting capabilities are
regression, exponential smoothing (Brown, Holt and Winters) and moving
averages. If the data follow some nonlinear function, the students can
apply a mathematical transformation to make the data linear, use ordinary
linear regression on the result, and then apply the inverse
transformation to get the forecast. They also have several Excel
templates I developed (Gompertz, Pearl-Reed, Weibull, etc.) for the
course. For this article, I tried a small sample of professional
products from different categories, specifically Minitab, IBM SPSS and
NCSS, on the Non-Stop movie data. IBM SPSS falls into the automatic
forecasting category; Minitab and NCSS are semiautomatic products. A
caveat: This is not meant to be a critical review of any product
mentioned.
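
The transform, regress and invert approach that the students use in class can be sketched as follows. The example assumes an exponential growth curve y = a * exp(b * t), which becomes linear after taking logarithms; the curve choice, the function names and the sample data are illustrative assumptions, not taken from StatTools or the course templates.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_exponential(t, y):
    """Fit y = a * exp(b * t) by regressing ln(y) on t (requires y > 0)."""
    intercept, slope = fit_line(t, [math.log(v) for v in y])
    return math.exp(intercept), slope  # inverse transform recovers a


def forecast_exponential(a, b, t):
    """Forecast on the original scale of the data."""
    return a * math.exp(b * t)


# Made-up data that follow y = 2 * exp(0.3 * t) exactly.
times = [0, 1, 2, 3, 4, 5]
values = [2 * math.exp(0.3 * t) for t in times]
a, b = fit_exponential(times, values)
print(a, b, forecast_exponential(a, b, 6))
```

Because the regression minimizes squared error on the log scale, the fitted curve is generally not the least-squares fit on the original scale, which is one reason a direct nonlinear fit can give different parameters.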
I let IBM SPSS first do the analysis of the movie data via its automatic
mode, called Expert Modeler (i.e., choose the model and its parameters
and get the forecasts). Figure 4 shows superimposed screen shots of the
IBM SPSS worksheet, showing the Non-Stop daily domestic box-office gross
and the menu system to start the automatic forecasting procedure. The
program then gave its recommended model, Brown's method for data with
linear trend, which uses one smoothing constant to estimate the intercept
and slope of the fitted line (as compared to Holt's method, which uses
two independent smoothing constants) [1]. IBM SPSS's accompanying
statistics, forecast plot and additional output are shown in Figure 5.
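
For readers who want to see how a single smoothing constant can produce both an intercept and a slope, here is a minimal sketch of the textbook form of Brown's linear (double) exponential smoothing. It is not IBM SPSS's implementation; the initialization of both smoothed series at the first observation, the function name and the test data are assumptions.

```python
def brown_forecast(series, alpha, horizon=1):
    """Forecast `horizon` periods ahead with Brown's linear smoothing.

    The same constant alpha smooths the data (s1) and then smooths the
    smoothed series again (s2); level and slope are derived from the two.
    Assumption: s1 and s2 both start at the first observation.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    s1 = s2 = series[0]
    for y in series[1:]:
        s1 = alpha * y + (1 - alpha) * s1
        s2 = alpha * s1 + (1 - alpha) * s2
    level = 2 * s1 - s2                       # intercept at the last period
    slope = alpha / (1 - alpha) * (s1 - s2)   # trend per period
    return level + horizon * slope


# On a perfectly linear series the forecast should continue the line.
line = [5 + 2 * t for t in range(200)]
print(brown_forecast(line, alpha=0.5, horizon=1))
```

Holt's method would instead keep separate update equations for level and slope, each with its own constant, which is the distinction the paragraph above draws.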
Figure 4: IBM SPSS input worksheet (showing the Non-Stop movie daily box-office returns).
Figure 5: IBM SPSS results of automatic forecasting of the Non-Stop data.
IBM SPSS does have a curve fitting feature, so I utilized it and
specified three possible models to be examined: the linear, growth and
logistic curves. Figures 6 and 7 give the resulting output and plots for
these choices.
NCSS has, in addition to the standard forecasting procedures (Box-Jenkins
and exponential smoothing models), an extensive list of more than 20
nonlinear curve models under its menu label "Growth and Other Models."
The user chooses a model, and NCSS finds the appropriate parameters for
the particular data set. I chose, for the Non-Stop data, the Logistic(4)
model [i.e., a logistic curve with four parameters; there is a
Logistic(3) model available as well], and Figure 8 shows the NCSS output.
Minitab is a hybrid of a semiautomatic and manual forecasting product. If
you specify that a Box-Jenkins model be used, the software finds the
appropriate parameters for the model. However, if you choose Winters'
method, Minitab requires
Figure 6: IBM SPSS fitted models for three specified growth curves.
Figure 7: IBM SPSS plot of the data and growth curves.
that you manually enter values for the three smoothing constants. Minitab
also has, under the Time Series choice on the main menu, a Trend Analysis
option. Choosing that gives the user four possible curves (linear,
quadratic, exponential and Pearl-Reed logistic). Figure 9 gives the
results of my choice for the Non-Stop data, the Pearl-Reed curve (Minitab
calls it the S-Curve Trend Model).
Figure 9: Minitab's output for the Pearl-Reed logistic growth model for the Non-Stop data.
Figure 8: NCSS output. I chose the Logistic(4) from NCSS's list of "Growth and Other Models."
Finally, Figure 10 shows the results of one of my Excel templates, which
uses the four-parameter Weibull trend curve and Solver's nonlinear
programming capability to find the parameters that minimize the root mean
square error for the entered data.
THE SURVEY
We e-mailed the vendors and asked them to respond to our online
questionnaire so readers could see the features and capabilities of the
software. The purpose of the survey is to inform the reader of a
program's forecasting capabilities and features. We tried to identify as
many forecasting vendors and products as possible and contacted all the
vendors that we identified and/or that responded to the last survey in
2012. For those who did not respond, we tried gentle reminders (several
e-mails and some phone calls). In addition to the features and
capabilities of the software, we wanted to know what techniques or
enhancements have been added to the software since our previous survey.
The information comes from the vendors, and we made no attempt to verify
what they gave us.
Figure 10: The four-parameter Weibull curve fit for the Non-Stop data.
If you use data to make forecasts, what should you look for in a vendor
and the product? First, find out the capabilities of the software.
Specifically, what forecasting methodologies can the product do? Does it
find the optimal parameters of the procedure for your particular data set
or must you manually enter those values? How extensive, useful and clear
is the output?
Most, but not all, vendors allow you to download a time-trial version of
the software that typically expires in anywhere from a week to a month.
Ideally, the trial version should allow you to work with your own data
and not just canned data that the vendor bundles with the trial software.
Check whether the trial version limits the size of the data set and, if
so, whether those limits are overly restrictive.
Ask about technical support, updating to a newer version when it is
released and differences (if any) depending on the operating system you
are using. Contact the vendor with your specific questions. Users tell
me, and I have independently found, that most vendors have good and
helpful technical support before and after you buy.
Jack Yurkiewicz (yurk@optonline.net) is a
professor of management science in the MBA
program at the Lubin School of Business, Pace
University, New York. He teaches data analysis,
management science and operations management.
His current interests include developing and
assessing the effectiveness of distance-learning
courses for these topics. He is a longtime member
of INFORMS.
SURVEY DATA & DIRECTORY
To view the survey results as well as a directory
of vendors who participated in the survey,
click here.
CONFERENCE PREVIEW
BY CANDACE "CANDI" YANO
Tony Bennett sang that he left his heart in San Francisco, and at the
2014 INFORMS Annual Meeting in San Francisco you will begin to understand
why as you take advantage of the opportunity to fill both your heart and
your mind. To fill your mind, you can attend special presentations:
Alvin Roth, professor of economics at Stanford University and professor
of economics and business administration at Harvard University, who was
awarded the 2012 Nobel Prize in Economics for his work in the area of
Game Theory, will talk about his work.
Richard Cottle, emeritus professor at Stanford University, will offer a
commemorative and historical perspective on George Dantzig in honor of
Dantzig's 100th birthday.
Jonathan Caulkins, professor at the Heinz School of Public Policy at
Carnegie Mellon University, will discuss his work on health and
drug-related policy issues.
S.F. conference set to
capture hearts & minds
Some of San Francisco's many landmarks are mobile.
Anthony Levandowski of Google will talk about the Google Driverless Car
project, offering his perspective as both a developer and a user of the
technology.
A panel of experts from within the INFORMS community will discuss their
experience with, and offer advice on, massively open online courses
(MOOCs).
If this is not enough, there will be more than 4,000 technical
presentations by experts from industry, academia and government. Topics
will be wide-ranging, covering the full breadth of the field, from
leading-edge advancements in operations research methodologies and
analytics, to applications in healthcare, energy, critical infrastructure
management, environmental management and supply chain management.
If you are not already overwhelmed while filling your mind, you will have
ample opportunity to fill your heart and stomach. San Francisco is
regarded as one of the most beautiful cities in the world and offers
world-class cuisine from almost every ethnic heritage. The meeting will
take place in two adjacent hotels, the Hilton San Francisco Union Square
and the Parc 55 Wyndham. The location is in close proximity to the city's
prime shopping district and near the boarding point for cable cars to
Fisherman's Wharf (famous for fresh seafood) and Pier 39 (where you can
see dozens of sea lions and walk to ferries that offer everything from
simple rides across San Francisco Bay to amazingly scenic tours), as well
as Ghirardelli Square, known for Ghirardelli chocolate.
Venturing into other parts of San Francisco, you can visit world-class
museums, including the Palace of the Legion of Honor, de Young Museum,
Asian Art Museum and California Academy of Sciences. The performing arts,
including the symphony, ballet, opera, jazz, theater and concerts, are
all within easy reach. If you prefer the outdoors, you can take a trip to
the former prison on Alcatraz (a limited number of tickets will be
available to conferees for purchase), see the redwoods in Muir Woods,
hike in the Marin Headlands with an unobstructed view of the Golden Gate
Bridge, sign up to play a round of golf with other conferees at TPC
Harding Golf Course the day before the conference, or simply wander
through the haunts of the hippies in Haight-Ashbury or the Beat poets in
North Beach. Just a bit further from the city are the wine regions of
Napa and Sonoma, only an hour's drive away.
Both the meeting and the venue will
have much to offer in many dimensions.
We look forward to seeing you there.
Candace "Candi" Yano is general chair of the
2014 INFORMS Annual Meeting in San Francisco.
She is a longtime member of INFORMS.
FIVE-MINUTE ANALYST
Few things make me more conflicted than parking lots. On a personal
level, I loathe the whole parking activity. It brings out what I think
are the worst behaviors of humankind: hoarding, brinksmanship, scarcity
mentality, irrational objective functions. And now you see why, as an
O.R. professional, I love parking lots: because they are so interesting
to study.
At the corner of Hades Street and Styx Ave. is (at least to me) the
world's worst parking lot. Here's the set-up: There is an upper level
with metered parking. The meter has a two-hour limit at a rate of
$1.25/hour, but pressing a silver button on the meter sets the time to 60
minutes if the meter currently shows less than 60 (see Figure 1). This
makes parking here free to most visitors. The lower level is
Probabilistic parking
problems
BY HARRISON
SCHRAMM, CAP
Figure 1: A "smart" meter in a parking lot. This meter has a button
next to the coin slot that may be pressed for a free hour of parking.
Coins may be added for additional time, up to two hours.
a standard parking garage, which has a flat $2 per hour fee that can be
validated by the two anchor stores, making it essentially free for most
patrons as well. While this column is light and exploratory, there is
serious work going on with parking problems [1].
In the sterile world of figures and mathematics, this sounds like a
reasonable way to run a parking lot, and patrons who miss the upstairs
free parking will simply renege and take the lower level free parking. In
reality, people mob the upstairs portion in search of free parking. My
assistant and I had observed this behavior over a number of weeks, and we
were interested in learning about the time parked cars spent in the lot,
with an eye for simple metrics such as expected wait time for a parking
spot or the expected number of cars trolling for a slot. This interest
became action (the key for any analysis), and we chose 6:30 p.m. on a
Thursday evening (a time that we knew the parking lot would be full) to
collect data
from the meters, which is displayed for anyone who wishes to see.
What we found was surprising. We expected to see uncorrelated parking lot
data. We did not expect to find many over-time parking spots. I hoped
that the data would be exponential, which would lead to nice, clean
analysis. What we discovered was, well, a mess.
Of the 100 parking spots surveyed, 25 percent were flashing or over-time
(violation). Of the parking spots that were not over-time, six showed
times over one hour, implying that the persons parked there had in fact
put money in the meter. We are completely discarding the possibility that
someone would park in a spot whose previous occupant left time on the
meter, i.e., showing up with 30 minutes remaining on the meter and not
pressing the button or inserting coins. I had hoped that the sojourn
times would be exponentially distributed, but that is a case that is
pretty difficult to make with this dataset (see Figure 2).
Now, we don't actually know how many patrons have paid, or how many have
simply run over. However, there are 100 parking spots considered, and of
these, six currently have clocks over one hour. We can (crudely) estimate
[2] the true number of paid parking spots by realizing that we are
observing the last hour of what may be a two-hour process. Therefore, we
think approximately
Figure 2: Histogram of raw parking meter data. Note the tri-modal nature of the data. Overtime, i.e.,
flashing, parking meters are represented by -1 in the red-shaded oval and constitute the large bar at the
origin of the graph. Known paid parking meters are at the right and have a blue oval.
12 parking spots have been paid for at
any given time.
YES, BUT WHAT DOES IT ALL MEAN?
So in one sense, the distributions of
the data are irrelevant; there are 100
parking spots on average, and the aver-
age time that a parking spot is occupied
is some time greater than 27 minutes. If
we make the (not bad!) assumption that
the parking spots that run over are oc-
cupied for 90 minutes, then the average
occupancy is 43 minutes. In a lot with
100 spots, this means that on average,
Figure 3: Histogram of parking time remaining, less than 60 minutes. Approximately six of these data
points are actually spill over from paying customers.
one spot comes open every 30 seconds. This doesn't sound so bad. If we
treat the system as a queue, and use the (observed) steady-state number
of cars waiting, three, we can place a rough lower estimate [3] that a
new car arrives every 30 seconds looking for a parking spot, and that it
has between a 15 percent and 25 percent chance of finding an open spot.
These crude estimates, however, do not agree very well with observation,
because they neglect the "blocking" effect of other cars waiting for
spots to open up. A better analysis of
this parking lot would involve simulation,
which would go beyond our intent.
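
The back-of-the-envelope numbers above can be reproduced with a short sketch. It rests on assumptions that the column does not spell out: that the 27-minute figure applies to the 75 percent of spots that were not over-time, that spot openings are spread evenly in time, and that the three waiting cars can be read either as the number in an M/M/1 system or as the number in its queue. It is one plausible reading, not the author's worksheet.

```python
import math

SPOTS = 100
OVERTIME_SHARE = 0.25     # 25 percent of meters were flashing
OVERTIME_MINUTES = 90.0   # the "not bad" assumption from the text
OTHER_MINUTES = 27.0      # assumption: applies to the remaining 75 percent


def average_occupancy_minutes():
    """Weighted average stay across over-time and other spots."""
    return (OVERTIME_SHARE * OVERTIME_MINUTES
            + (1 - OVERTIME_SHARE) * OTHER_MINUTES)


def seconds_between_openings(avg_minutes, spots=SPOTS):
    """Average gap between departures if all spots stay in use."""
    return avg_minutes * 60.0 / spots


def mm1_empty_probability(cars_in_system):
    """M/M/1: L = rho / (1 - rho), so rho = L / (1 + L); P(empty) = 1 - rho."""
    rho = cars_in_system / (1.0 + cars_in_system)
    return 1.0 - rho


def mm1_empty_probability_from_queue(cars_waiting):
    """M/M/1: Lq = rho**2 / (1 - rho), solved for rho; P(empty) = 1 - rho."""
    rho = (-cars_waiting + math.sqrt(cars_waiting ** 2 + 4 * cars_waiting)) / 2
    return 1.0 - rho


avg = average_occupancy_minutes()
print(round(avg, 2), round(seconds_between_openings(avg), 2))
print(mm1_empty_probability(3), round(mm1_empty_probability_from_queue(3), 3))
```

Under these assumptions the average stay is 42.75 minutes (the 43 minutes quoted), a spot opens roughly every 26 seconds (the same order as the 30 seconds quoted), and both M/M/1 readings put the chance of an idle system inside the 15 to 25 percent range.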
THE WORLD'S WORST PARKING LOT?
Because of the behavior of the drivers while trolling for a parking spot,
it might be considered the world's worst parking lot. Enforcement of the
parking policy might help because it would decrease the sojourn times of
the cars parked in the lot, but there is no guarantee, and more
importantly no direct incentive for the parking lot owners to do so. This
is because the number of free parking spots is fixed, and once they are
filled, they are filled, regardless of by whom. From the lot manager's
point of view, it doesn't matter if they are long or short parkers. In
fact, the rate structure is such that short parkers are slightly more
lucrative for the parking lot owner than parking above ground.
In conclusion, it's probably a bit of literary hyperbole to imply that
this is the world's worst parking lot; I'm sure there are others that are
much worse. It earns the title from me because I like to make short trips
to this area and visit the locations that don't validate parking, and I
really don't like the risky behaviors aggressive parkers participate in.
On the upside, there's time to write 12 articles in a single push of the
button!
I'd be interested in hearing real contenders for the "World's Worst
Parking Lot."
Update: Between the original draft of
this article and its publication, the park-
ing lot in question began installing an
electronic system to help customers de-
termine how many spots were available
before entering the parking queue. It
has yet to be determined if it will change
the behaviors of the parking lot. Look
forward to an update in a future column!
Harrison Schramm (harrison.schramm@gmail.com) is an operations research
professional in the Washington, D.C., area. He is a member of INFORMS
and a Certified Analytics Professional (CAP).
NOTES & REFERENCES
1. Fabusuyi, Hampshire, Hill and Sasauma, 2014, "Decision Analytics for
Parking Availability in Downtown Pittsburgh," Interfaces, INFORMS,
Hanover, Md.
2. This is just an estimate. More delicate techniques may be applied.
3. Using the M/M/1 queuing model to find the lower or "optimistic"
estimates, and the M/G/1 queuing model to find the upper estimate.
THINKING ANALYTICALLY
Frog and fly
BY JOHN TOCZEK
A frog is looking to catch his next meal just as a fly wanders into his
pond. The frog jumps randomly from one lily pad to the next in hopes of
catching the fly. The fly is unaware of the frog and is moving randomly
from one red flower to another.
The frog can only move on the lily pads and the fly can only move on the
flowers. The interval at which both the frog and the fly move to a new
space is one second. They never sit still and always move away from the
space they are currently on. Both the frog and the fly have an equal
chance of moving to any nearby space including diagonals. For example, if
the frog were on space A1, he would have a one-in-three chance each of
moving to A2, B2 and B1.
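
The movement rule can be sketched in code. The layout of lily pads and flowers is given only in the puzzle's figure, so the four-space grid below is a hypothetical stand-in chosen to reproduce the A1 example; the function name and the (column, row) representation are likewise assumptions.

```python
from fractions import Fraction


def move_probabilities(cell, allowed):
    """Equal-chance moves to every adjacent allowed space, diagonals included.

    `cell` and the members of `allowed` are (column_letter, row_number)
    pairs. Assumes the cell has at least one allowed neighbor.
    """
    col, row = cell
    neighbors = [
        (c, r)
        for (c, r) in allowed
        if (c, r) != cell
        and abs(ord(c) - ord(col)) <= 1
        and abs(r - row) <= 1
    ]
    share = Fraction(1, len(neighbors))
    return {n: share for n in sorted(neighbors)}


# Hypothetical layout: lily pads on A1, A2, B1 and B2 only.
pads = {("A", 1), ("A", 2), ("B", 1), ("B", 2)}
print(move_probabilities(("A", 1), pads))
```

With the real layout from the figure, the same function gives the one-second transition probabilities for the frog (over lily pads) and for the fly (over flowers).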
The frog will capture the fly when he lands on the
same space as the fly.
QUESTION: On which space is the frog most likely
to catch the fly?
Send your answer to puzzlor@gmail.com by
Aug. 15. The winner, chosen randomly from correct
answers, will receive a $25 Amazon Gift Card. Past
questions can be found at puzzlor.com.
Figure 1: Where will the frog dine on the fly?
John Toczek is the senior director
of Decision Support and Analytics for
ARAMARK Corporation in the Global
Operational Excellence group. He
earned a bachelor of science degree
in chemical engineering at Drexel
University (1996) and a master's
degree in operations research from
Virginia Commonwealth University
(2005). He is a member of INFORMS.