Data Mining Applications In Healthcare

TEPR 2004
May 21, 2004
V. Juggy Jagannathan
VP of Research
juggy@medquist.com

Introduction

Goals of today's presentation:

Provide an overview of the technologies that are relevant to the development and deployment of data mining solutions in healthcare

Allow participants to evaluate where the technology is useful

What is Data mining?

Divining knowledge from data

Topic Outline

Data mining

Uses
Algorithms
Technology
Applications in healthcare

Data Mining Uses

Descriptive
Understand and characterize
Clustering
Summarization
Association Rules
Sequence Discovery

Predictive
Extrapolate and forecast
Classification
Regression
Time-Series

Data Mining Algorithms

Classification
> Statistical
> K-nearest neighbors
> Decision trees (ID3, C4.5)
> Neural Networks (Self Organizing Maps)

Clustering
> Hierarchical
> Partitioned
> Genetic

Association
> Apriori Algorithm
> If-Then rules
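
As an illustration of one classification algorithm from the list above, here is a minimal sketch of K-nearest neighbors: a new case gets the majority label of the k closest training cases. The function name `knn_predict`, the two features (age, systolic blood pressure), and the toy data are assumptions made up for this example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_tuple, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (age, systolic blood pressure) -> risk label
train = [
    ((35, 118), "low"),
    ((42, 125), "low"),
    ((50, 130), "low"),
    ((63, 155), "high"),
    ((70, 160), "high"),
    ((68, 150), "high"),
]

print(knn_predict(train, (66, 152)))
```

In real use the features would normally be scaled first, since Euclidean distance otherwise favors whichever feature has the largest numeric range.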

Technology solutions

Data Mining Infrastructure Technologies

Database Technologies
On-Line Analytical Processing (OLAP)
Visualization Technologies
Data scrubbing technologies
Natural Language Processing (NLP)

Database Technologies

Data warehouse vs. Data mart

Relational technologies
> Oracle
> Microsoft

XML-databases
> Raining Data

On-Line Analytical Processing

Analyze multi-dimensional data

N-dimensional data cubes

Operations
> Roll-up
> Drill-down
> Slice and dice
> Pivot
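
The cube operations listed above can be sketched on a tiny in-memory fact table. This is only an illustration in plain Python, not how an OLAP server works internally; the dimensions (year, quarter, department), the admissions measure, and the helpers `aggregate` and `slice_rows` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact table: (year, quarter, department, admissions)
FACTS = [
    (2003, "Q1", "Cardiology", 120),
    (2003, "Q2", "Cardiology", 135),
    (2003, "Q1", "Oncology", 80),
    (2003, "Q2", "Oncology", 95),
    (2004, "Q1", "Cardiology", 140),
    (2004, "Q1", "Oncology", 90),
]
DIMS = ("year", "quarter", "department")


def aggregate(rows, keep):
    """Sum the measure over every dimension that is not listed in keep."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_rows(rows, dim, value):
    """Slice: keep only the rows where one dimension has a fixed value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


by_year = aggregate(FACTS, ("year",))                    # roll-up to year
by_year_quarter = aggregate(FACTS, ("year", "quarter"))  # drill-down to quarter
cardiology = slice_rows(FACTS, "department", "Cardiology")
pivoted = aggregate(FACTS, ("department", "year"))       # pivot: reorder the axes

print(by_year)
```

Roll-up and drill-down are the same aggregation at coarser or finer granularity, which is why one helper covers both here.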

Visualization

2D/3D Charts
Topographic displays
Cluster displays
Histograms
Scatter plots
Advanced visualization (genomic data patterns)
http://www.ncbi.nlm.nih.gov/Tools/

Data Scrubbing

Data cleansing
Filling in missing data
In healthcare, there is a strong need for de-identification to protect privacy
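
One simple way to fill in missing data is mean imputation: replace each missing value with the average of the observed ones. The sketch below assumes records are dictionaries and that missing values are stored as `None`; the field name `weight_kg` and the helper `fill_missing` are hypothetical.

```python
from statistics import mean


def fill_missing(records, field):
    """Return copies of records with None in field replaced by the observed mean."""
    observed = [r[field] for r in records if r[field] is not None]
    if not observed:
        # Nothing to estimate from, so leave the values as they are.
        return [dict(r) for r in records]
    fill = mean(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]


# Hypothetical records with one missing weight
records = [
    {"id": 1, "weight_kg": 70.0},
    {"id": 2, "weight_kg": None},
    {"id": 3, "weight_kg": 80.0},
]
cleaned = fill_missing(records, "weight_kg")
print(cleaned[1])
```

Mean imputation is only one option and can hide real variation, so it is worth recording which values were filled in.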

De-Identification of Medical Records *

Identifiers that must be removed:

Names;
social security numbers;
all elements of a street address, city, county, precinct, zip code, and their equivalent geocodes, except for the initial three digits of a zip code for areas that contain over 20,000 people;
medical record numbers;
health plan beneficiary numbers;
account numbers;
certificate/license numbers;
all elements of dates (except year) for dates directly related to the individual (e.g., birth date, admission/discharge dates, date of death); and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older;
license plate numbers, vehicle identifiers and serial numbers;
device identifiers and serial numbers;
URL addresses;
Internet Protocol (IP) address numbers;
biometric identifiers, including finger and voice prints;
telephone numbers;
fax numbers;
full face photographic images and comparable images;
e-mail addresses;
any other unique identifying number except as created by IHS to re-identify information.

* Source: Policy and Procedures for De-Identification of Protected Health Information and Subsequent Re-Identification, 45 CFR 164.514(a)-(c), posted by IHS (Indian Health Service)
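
A few of the rules above lend themselves to a short sketch: keep only the first three zip digits for populous areas, keep only the year of dates, and collapse ages over 89 into one category. This is an illustration under stated assumptions, not a complete or compliant de-identification routine. The field names, the `POPULOUS_ZIP3` set (which in practice would come from census data), and the "000" placeholder for small areas are assumptions for this example.

```python
from datetime import date

# Assumption: three-digit zip prefixes covering more than 20,000 people.
# These values are made up; a real list would be derived from census data.
POPULOUS_ZIP3 = {"152", "100"}


def deidentify(record):
    """Return a reduced copy of record that applies a few of the listed rules.

    Only fields copied explicitly are kept, so names, social security
    numbers and other direct identifiers are dropped by omission.
    """
    age = record["age"]
    zip3 = record["zip"][:3]
    return {
        # Keep the zip prefix only for populous areas ("000" is an assumed placeholder).
        "zip3": zip3 if zip3 in POPULOUS_ZIP3 else "000",
        # Ages over 89 are aggregated into a single category.
        "age": "90+" if age >= 90 else age,
        # Dates are reduced to the year.
        "admission_year": record["admission_date"].year,
        # The birth year would reveal an age over 89, so drop it in that case.
        "birth_year": None if age >= 90 else record["birth_date"].year,
        "diagnosis": record["diagnosis"],
    }


# Hypothetical record
record = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "15213",
    "age": 93,
    "birth_date": date(1911, 2, 3),
    "admission_date": date(2004, 5, 21),
    "diagnosis": "pneumonia",
}
clean = deidentify(record)
print(clean)
```

Copying an explicit set of allowed fields, rather than deleting known identifiers, means an unexpected new field cannot leak through by default.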

Natural Language Processing

NLP Uses
> translation, summarization, information extraction, document retrieval or categorization

NLP Companies in health care
> A-Life
> Language and Computing

NLP Approaches
> Clustering, Classification, Linguistic analysis, knowledge-based analysis

Applications in Healthcare

Safety and quality

Clinical Research
Financial
Public Health

Safety and Quality: "To Err Is Human" IOM Report

Characterization
> JCAHO Core Measures
> CMS Quality measures starter set
> Improves patient care: reactive response
Prediction
> Identifying cases that can result in bad clinical outcomes and raising appropriate alarms
> Impacts patient care: proactive response

Quality Measures Initial Set*


Starter Set of 10 Hospital Quality Measures

Condition: Acute Myocardial Infarction (AMI)/Heart attack
> Aspirin at arrival
> Aspirin at discharge
> Beta-Blocker at arrival
> Beta-Blocker at discharge
> ACE Inhibitor for left ventricular systolic dysfunction

Condition: Heart Failure
> Left ventricular function assessment
> ACE inhibitor for left ventricular systolic dysfunction

Condition: Pneumonia
> Initial antibiotic timing
> Pneumococcal vaccination
> Oxygenation assessment

*Source: http://www.cms.hhs.gov/quality/hospital/overview.pdf

Safety and Quality

University of Mississippi Medical Center
> Data Warehouse Technologies to understand Medication Errors, funded by AHRQ
> Anonymous report data collection
> Data mining technologies
> Use of Neural networks and associative rule inference
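
To give a feel for association rule inference, the sketch below computes the two standard measures behind an If-Then rule: support (how often the items occur together) and confidence (how often the consequent holds when the antecedent does). It does not reproduce the project described above; the report attributes and the data are invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent); antecedent must occur at least once."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical anonymous error reports, each a set of attributes
reports = [
    {"night_shift", "look_alike_name", "wrong_drug"},
    {"night_shift", "wrong_dose"},
    {"look_alike_name", "wrong_drug"},
    {"day_shift", "wrong_dose"},
    {"night_shift", "look_alike_name", "wrong_drug"},
]

# Rule: IF look_alike_name THEN wrong_drug
print(support(reports, {"look_alike_name", "wrong_drug"}))
print(confidence(reports, {"look_alike_name"}, {"wrong_drug"}))
```

An Apriori-style miner would search for all itemsets above a minimum support and then keep the rules whose confidence passes a threshold.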

Clinical Research & Clinical Trials

Pharmacy and medical claims data
Drug efficacy and clinical trials, for example: how effective is a particular drug regimen?
Protein structure analysis
Genomic data mining
Diagnostic Imaging data research

Financial: The bottom line on cost

General Utilization review: does the care provided meet accepted clinical and cost guidelines?
Drug Utilization review
Outlier analysis: exceptions to treatment, analyzing treatments which cost more or less than normal.
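
Outlier analysis on treatment cost can be sketched with the interquartile range (IQR) rule, which flags values far above or far below the middle half of the data. The IQR rule is one common choice, not necessarily the one used in any particular utilization review product; the cost figures and the helper `cost_outliers` are hypothetical.

```python
from statistics import quantiles


def cost_outliers(costs, k=1.5):
    """Return the costs lying more than k IQRs below Q1 or above Q3."""
    q1, _, q3 = quantiles(costs, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [c for c in costs if c < low or c > high]


# Hypothetical per-episode treatment costs in dollars
costs = [4200, 4500, 4700, 4800, 5000, 5100, 5300, 5400, 19000, 900]
print(cost_outliers(costs))
```

The check is two-sided on purpose, since the slide treats unusually cheap treatments as exceptions too.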

Data mining in public health

Syndromic surveillance
Bio-terrorism detection
Communicable disease reporting (Centers for Disease Control (CDC))
Example effort: AEGIS
DAWN (Drug Abuse Warning Network)
Food and Drug Administration (FDA) reporting of adverse drug events.
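
The basic idea behind surveillance of this kind can be sketched as a moving-baseline alarm: compare each day's case count with the recent average and flag large jumps. This is a deliberately simple illustration and is not the method used by AEGIS or any other system named above; the counts, the 7-day window, the threshold, and the helper `flag_spikes` are assumptions.

```python
from statistics import mean, stdev


def flag_spikes(daily_counts, window=7, threshold=3.0):
    """Return indexes of days whose count exceeds the trailing baseline.

    A day is flagged when its count is more than threshold standard
    deviations above the mean of the previous window days. The deviation
    is floored at 1.0 so a perfectly flat baseline does not trigger on noise.
    """
    alerts = []
    for i in range(window, len(daily_counts)):
        baseline = daily_counts[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if daily_counts[i] > mu + threshold * max(sigma, 1.0):
            alerts.append(i)
    return alerts


# Hypothetical daily counts of emergency visits with respiratory complaints
counts = [12, 15, 11, 14, 13, 12, 16, 14, 13, 41, 15]
print(flag_spikes(counts))
```

Real systems also have to account for day-of-week and seasonal effects, which this sketch ignores.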

Conclusion

Data mining

Uses
> Descriptive
> Predictive

Algorithms
> Classification
> Clustering
> Association rules

Technology
> Database
> OLAP
> Visualization
> Scrubbing
> NLP

Applications in healthcare
> Safety and Quality
> Clinical Research
> Financial
> Public Health

Questions?

juggy@medquist.com
