
McKinsey Healthcare Analytics

Using big data to drive U.S. healthcare reform
Speaker Bio
Jack Schwenderman – Technology Expert (Jack_Schwenderman@mckinsey.com)
Jack joined McKinsey in 2008, where he designed and developed applications using Java, .NET, C#, and other
technologies. He joined MHA in 2013 and has led its technology innovation, including what you are
going to see today.

Nicolás Herrera – Technology Specialist (Nicolas_Herrera@mckinsey.com)
Since joining in 2012, Nico has worked on various McKinsey Solutions supporting retail, B2B, B2C, and
other areas. He is the chief architect and primary developer for the MHA episode engine, enabling the
solution to support multiple client engagements.

Pedro Huguet – Senior Technology Specialist (Pedro_Huguet@mckinsey.com)
Pedro holds an MBA from the Universidad de Costa Rica and a Bachelor of Science in Electronics Engineering from
the Instituto Tecnológico de Costa Rica. Since joining MHA in 2013, he has become the chief architect and
primary developer of the MHA population diagnostics framework.
Sponsors

Diamond

Silver

Bronze

Raffle
Contents
• What is McKinsey Healthcare Analytics
• Analytical Configurable Episodes (ACE)

• Next steps in Big Data

• Further reading
• Q&A
What is McKinsey & Company?

▪ Global firm with more than 8,000 management consultants
▪ Innovating the consulting industry with new models in management consulting
▪ More than 100 offices in 50 countries
▪ Our clients are part of the social, public and private sectors
▪ Serving leading global institutions for over 80 years
▪ We are a non-hierarchical company, which makes our work model special
▪ Experience in all industries: energy, pharmaceutical, telecommunications, sales, insurance, banking, etc.
US healthcare reform
▪ The U.S. spends $650 billion more on healthcare than expected.
▪ But are they really getting $650 billion of extra value?
▪ The challenge is to retain the current system’s strengths while addressing its deficiencies and curbing costs.
▪ A big-data revolution is underway in US healthcare.
▪ The digitalization of patient records has created a huge amount of data that can be processed for insights.

[Chart labels: US: 16.4%, 2nd: 11.1%; US: $8.713, 2nd: $6.325]

SOURCE: healthcare.mckinsey.com, OECD


McKinsey Healthcare Analytics
Who we are
▪ Dedicated team of 80+ healthcare, analytics, big data, and software experts
▪ End-to-end answers – from diagnostics to design to implementation and performance tracking
▪ Proprietary analytics capabilities and tools supported by big data platform (~50bn claims, ~50mn lives)
What we do
▪ Fully integrated component of the HSS Practice designed and managed to advance the Practice mission:
  ▪ Deliver sustained improvement in client performance
  ▪ Create an unrivaled environment for people
  ▪ Improve healthcare

SOURCE: Image from emcplus.emc.com


MHA geographical footprint
United States
▪ New York: 34
▪ DC: 13
▪ Boston: 3
▪ West Coast: 8
▪ Other: 5

India
▪ Gurgaon: 6

Costa Rica
▪ San Jose: 12

SOURCE: Image from Dreamstime.com


MHA – Data chain

[Diagram labels: Client files, DeIdentify, Input files, DW (0.5/2.3T), Operation, Advance, Datamart, Data Storage, Episode Engine, 3rd party; further size labels: 3.2T/0.8T, 2.3/0.5T]

SOURCE: Internal
ACE contains the algorithm and configuration MHA uses to convert episode inputs to outputs
In the user interface of the configuration file, rules are entered into rows
The user configures the inputs that ACE needs for each step
ACE uses dynamic SQL
ACE Engine uses SQL-based controllers
How ACE builds a query
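The idea of turning configuration rows into dynamic SQL can be pictured with a minimal Python sketch. It is not ACE's actual code: the `Rule` fields, the step names, the `claims` table and the sample values are all hypothetical, chosen only to show how rules entered as rows might be assembled into a query string for one step.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """One configuration row (hypothetical fields, not ACE's real schema)."""
    step: str      # processing step the rule belongs to
    column: str    # input column the rule tests
    operator: str  # comparison operator, e.g. "=" or "IN"
    value: str     # literal to compare against, already SQL-formatted


# Only operators on this list are ever spliced into the generated SQL.
ALLOWED_OPERATORS = {"=", "<>", ">", "<", ">=", "<=", "IN"}


def build_query(table: str, rules: list[Rule], step: str) -> str:
    """Assemble a SELECT statement from the rules configured for one step."""
    predicates = []
    for rule in rules:
        if rule.step != step:
            continue  # rule belongs to a different step
        if rule.operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {rule.operator}")
        predicates.append(f"{rule.column} {rule.operator} {rule.value}")
    # With no rules for the step, fall back to a predicate that is always true.
    where = " AND ".join(predicates) if predicates else "1 = 1"
    return f"SELECT * FROM {table} WHERE {where};"


# Made-up configuration rows for illustration.
rules = [
    Rule("trigger", "diagnosis_code", "IN", "('J45', 'J46')"),
    Rule("trigger", "claim_type", "=", "'inpatient'"),
    Rule("exclusion", "age", "<", "18"),
]
print(build_query("claims", rules, "trigger"))
```

Because the SQL text is built from configuration, a sketch like this has to trust whoever edits the rows; a production engine would need stricter validation of column names and values than the operator whitelist shown here.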
Code Review
Not all Big Data is created equal
• Big data + simple analysis
• Big data + complex analysis

SOURCE: Internet aggregation


New Hadoop Infrastructure
• Why are we doing this?
• What is Hadoop?
• Hive vs. SQL Server
• Link to language manual
• ACE on Hadoop
• Python generates the dynamic SQL
• Big effort, lots to do
In Q4 of 2015, our goal is to reach the “run level” thanks to the migration to our new Advanced Analytics platform

Crawl (Pre-2013): infra-less option
▪ Few clients
▪ Data on laptops
▪ Major security limitations

Walk (2013 to 2015): relational DB server in McK datacenter
▪ Many clients
▪ Relational DB (SQL Server)
▪ Data Warehouse
▪ Standardized procedures

Run (2015): cloud-based computing leveraging Hadoop
▪ Unlimited clients
▪ Hadoop on AWS
▪ Full HDM conversion in Q4

Fly (2016 onwards): cloud-based computing leveraging Hadoop
▪ Migrate analytics engines (Q1 and Q2 2016)
▪ Expand data type incl. unstructured data
▪ Develop new usages, e.g., machine learning
What changes do I need to make to run my T-SQL query on Hive?
MS SQL Server Query

SELECT TOP 15
    [year]
  , CASE sex
      WHEN 1 THEN 'Male'
      WHEN 2 THEN 'Female'
      ELSE 'Other' END as Gender
  , CASE WHEN age > 55 THEN 'Over 55'
         WHEN age > 34 THEN 'Between 35 and 55'
         ELSE 'Less than 35' END as Age_Band
  , count(*) as Num_Records
FROM dbo.AnnualEnrollment
WHERE ENRIND12 = 1
GROUP BY [year]
  , sex
  , CASE WHEN age > 55 THEN 'Over 55'
         WHEN age > 34 THEN 'Between 35 and 55'
         ELSE 'Less than 35' END
ORDER BY [year]
  , Gender
  , Age_Band;
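To see what this query computes, here is a small sketch that runs an equivalent statement against an in-memory SQLite database. SQLite is only a stand-in for SQL Server here: it has no `TOP`, so the sketch uses `LIMIT 15`, and the bracket quoting around `year` is dropped. The table contents are invented for illustration.

```python
import sqlite3

# Same grouping logic as the slide's query, written in SQLite's dialect.
QUERY = """
SELECT year
     , CASE sex WHEN 1 THEN 'Male' WHEN 2 THEN 'Female' ELSE 'Other' END AS Gender
     , CASE WHEN age > 55 THEN 'Over 55'
            WHEN age > 34 THEN 'Between 35 and 55'
            ELSE 'Less than 35' END AS Age_Band
     , count(*) AS Num_Records
FROM AnnualEnrollment
WHERE ENRIND12 = 1
GROUP BY year
       , sex
       , CASE WHEN age > 55 THEN 'Over 55'
              WHEN age > 34 THEN 'Between 35 and 55'
              ELSE 'Less than 35' END
ORDER BY year, Gender, Age_Band
LIMIT 15;
"""

# Invented sample rows: (year, sex, age, ENRIND12).
SAMPLE_ROWS = [
    (2014, 1, 60, 1),
    (2014, 1, 40, 1),
    (2014, 2, 30, 1),
    (2014, 2, 30, 1),
    (2014, 3, 70, 0),  # filtered out because ENRIND12 = 0
    (2015, 1, 20, 1),
]


def run_demo():
    """Load the sample rows and return the aggregated result as a list of tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE AnnualEnrollment "
        "(year INTEGER, sex INTEGER, age INTEGER, ENRIND12 INTEGER)"
    )
    conn.executemany("INSERT INTO AnnualEnrollment VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in run_demo():
    print(row)
```

Each output row is one (year, gender, age band) combination with its record count, which is the shape the slide's query returns.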
Hive Query

SELECT `year`
  , CASE sex
      WHEN 1 THEN 'Male'
      WHEN 2 THEN 'Female'
      ELSE 'Other' END as Gender
  , if(age > 55, 'Over 55',
      if(age > 34, 'Between 35 and 55',
        'Less than 35')) as Age_Band
  , count(*) as Num_Records
FROM 3pd.tc_annual_enrollment
WHERE ENRIND12 = 1
GROUP BY `year`
  , sex
  , if(age > 55, 'Over 55',
      if(age > 34, 'Between 35 and 55',
        'Less than 35'))
ORDER BY `year`
  , Gender
  , Age_Band
LIMIT 15;
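Two of the differences visible in the comparison are mechanical: bracket-quoted identifiers become backtick-quoted, and `SELECT TOP n` becomes a trailing `LIMIT n`. The following Python sketch automates just those two rewrites. The function name `tsql_to_hive` is made up for this example, and the regex approach is a toy: it does not parse SQL, map table names, or convert `CASE` to `if()`, and it would be fooled by brackets inside string literals.

```python
import re


def tsql_to_hive(query: str) -> str:
    """Toy rewrite of two T-SQL idioms into HiveQL (hypothetical helper)."""
    sql = query.strip().rstrip(";")

    # [identifier] -> `identifier`
    sql = re.sub(r"\[([^\]]+)\]", r"`\1`", sql)

    # SELECT TOP n ... -> SELECT ... LIMIT n
    match = re.match(r"^select\s+top\s+(\d+)\s+", sql, flags=re.IGNORECASE)
    if match:
        sql = "SELECT " + sql[match.end():] + "\nLIMIT " + match.group(1)

    return sql + ";"


example = (
    "SELECT TOP 15 [year], count(*) AS Num_Records "
    "FROM dbo.AnnualEnrollment GROUP BY [year] ORDER BY [year];"
)
print(tsql_to_hive(example))
```

A string rewrite like this is only a starting point; anything beyond the simplest queries still needs a manual check against the Hive language manual.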
How do I run my new HIVE query?
Useful links
• McKinsey & Company
  • http://www.mckinsey.com/
  • McKinsey Healthcare Systems and Services practice
  • McKinsey Healthcare Analytics
• Reform and big data
  • Article - Why Americans pay more for health care
  • Article - The big-data revolution in US health care: Accelerating value and innovation
• Big Data
  • What is Big Data? The Basics – Meaning and Usage
  • Scaling the Facebook data warehouse to 300 PB
• Hadoop/Hive
  • Cheat Sheet Hive for SQL Users
  • Hive Language Manual
  • Hortonworks Sandbox
  • Cloudera HUE Demo
  • Hive Function Cheat Sheet
