PDF

Sigma Xi, The Scientific Research Society
When Averages Hide Individual Differences in Clinical Trials: Analyzing the results of clinical
trials to expose individual patient's risks might help doctors make better treatment decisions
Author(s): David Kent Rodney Hayward
Source: American Scientist, Vol. 95, No. 1 (JANUARY-FEBRUARY 2007), pp. 60-68
Published by: Sigma Xi, The Scientific Research Society
Stable URL: http://www.jstor.org/stable/27858901
Accessed: 08-09-2015 11:04 UTC
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at http://www.jstor.org/page/
info/about/policies/terms.jsp
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content
in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship.
For more information about JSTOR, please contact support@jstor.org.
Sigma Xi, The Scientific Research Society is collaborating with JSTOR to digitize, preserve and extend access to American
Scientist.
http://www.jstor.org
This content downloaded from 144.173.232.36 on Tue, 08 Sep 2015 11:04:52 UTC
All use subject to JSTOR Terms and Conditions
When
Hide
Averages
in Clinical
Differences
Individual
Trials
Analyzing theresultsofclinical trials toexpose individualpatients' risks

decisions
might help doctorsmake bettertreatment
David Kent and Rodney Hayward

everyday treatmentdecisions be guided by the
results of systematic reviews of clinical trials.
What could be more reasonable? And yet
precisely because evidence-based medicine
statistical data greater
gives impersonal
weight than clinical experience, it has met
strong and at times emotional resistance from
a combination
of
"Ten-and-Fifteen"
purge,
practicing physicians. Some see this resis
com
tance as a self-interested reaction, but we be
10 grains of calomel, a mercury-based
15
and
of
lieve that it arises in part from a fundamental
grains
jalap (the poisonous
pound,
root of a Mexican plant related to themorning
mismatch between the evidence provided
by clinical trials and the needs of practicing
glory). After administering this toxic brew,
doctors then bled patients so profusely they doctors treating individual patients. Because
often passed out. Miraculously, a few hearty many factors other than the treatment affect a
patients survived both the sickness and its patient's outcome, determining the best treat
ment for a particular patient is fundamentally
cure, reinforcing the doctors' belief in the val
ue of their treatment.
different from determining which treatment
is best on average.
The confusion of Rush and his contempo
We believe that changes in theway clinical
raries is understandable; given thenatural vari
trials are analyzed could offer at least a partial
ation inpatient outcomes, itcan be surprisingly
difficult to tellwhether a treatment ishelping or
solution to this dilemma and yield themore
harming patients. Things are no different today: detailed information doctors need tomake
Therapies that are useless or worse can still better treatment decisions.
inspire great enthusiasm among physicians.
But unlike Rush, modern physicians are at least The Modern Clinical Trial
somewhat protected from thehuman tendency The clinical trial is a surprisingly recent inven
tion.The firstmodern trial,conducted in 1947
to draw unwarranted conclusions by a statisti
cal instrument called the clinical trial.
48, showed that thenewly discovered antibiotic
The most powerful form of trial, the ran
streptomycinwas more effective than the con
ventional treatment for tuberculosis. Itwas to
domized controlled clinical trial,was devised
as a means of determining a treatment's effect be 15 years, however, before drugs routinely
underwent clinical trials prior to being sold in
when many other factors, including unknown
theUnited States. In the late 1950s, severe birth
ones, might affectpatient outcomes. Over the
defects were reported after the tranquilizer tha
past 50 years, large-scale human trials have
in lives saved and im
Udomide was given to pregnant women. This
paid rich dividends
a
In
of
Scottish
life.
1972,
epide
tragedy spurred Congress topass theKefauver
proved quality
Harris
named
Cochrane
Archie
published
Drug Amendments of 1962,which finally
miologist
a book urging physicians to follow the evi
forcedmanufacturers to prove that a new drug
dence of clinical trials in theirpractices. By the was both safe and effective.
The randomized controlled clinical trialbe
1990s a group of doctors led by the Canadian
came the standard means of providing this
term
the
David
had
coined
Sackett
physician
a movement
and
"evidence-based
medicine,"
proof. The patients in this type of trial are as
was born. These physicians advocated
that signed to one of two groups?the experimental
1793,when yellow fever reached Philadel

In phia, killing hundreds of people, nobody
knew its cause. But there was no shortage
of theories. Based on these, a few desperate
physicians devised increasingly radical treat
ments. One of the most famous, concocted
was the
by the renowned Dr. Benjamin Rush,
David Kent and Rodney

are both research
Hayward
ers and
practicing physicians
specializing ingeneral internal

medicine. Kent is an assistant
professor ofmedicine at Tufts

New England Medical Center,
as
Tufts University and also
sociate director of the clinical

research program and assistant
professor of clinical research at

the Sadder Graduate School of
Biomedicai Sciences at Tufts.
is the director of the

Hayward
Veterans Affairs Ann Arbor
Health Services Research
& Development Center of
Excellence and is a professor
of public health and internal
medicine at theUniversity of
He was the 2005
Michigan.
recipient of theUnder Sec
retary
forHealthAwardfor
career
accomplishments in
health-services research. Ad
dress for Kent: Institute for

Clinical Research and Health
Policy Studies, 750 Washing
ton Street, Tufts-NEMC #63,
Boston, MA 02111. Internet:
dkentl@tufts-nemc.org
60
American
Scientist, Volume
95
in Suffolk, England,
took 2,000 digital photos of people
living in the small town of Haverhill
Figure 1. In the year 2000 artist Chris Dorley-Brown
one image. This illustration shows 12
blended
into
all
had
been
until
these photographs,
and then used software tomerge
2,000
step by step,
has left out the original
a blend made up of all of the double portraits. (Dorley-Brown
double portraits (two photographs morphed
together) and
are
of thousands of individuals
to
treatment
the
In
the
modern
clinical
in
to
the
of
the
order
trial,
responses
protect
participants.)
privacy
portraits
are averaged,
in the same way the center photograph
in a single number
represents all the other individuals. As the data
typically summarized
to guide medical
practice.
important individual differences are lost. The authors propose ways to better examine clinical-trials results
assessment
group or the control group?and
of the outcome measure is typically blinded or
masked (the assessing physician does not know
whether patients received treatment).Random
ization is the feature that gives trials the power
to find the treatment's effect in the clutter of
differentpatient riskprofiles. Ifpatients are ran
domly assigned to the experimental and control
groups, risk factors should be equally distrib
uted between the groups. Thus any difference
in the aggregated outcomes of the two groups
can be attributed to the effectsof treatment.
The treatment-effect,as it is called, is typical
ly a single number that summarizes the overall
result of the trial. The treatment-effectcan be

expressed as the absolute risk reduction (the
difference between the outcome rate in the ex
perimental group and the outcome rate in the
control group) or the relative risk reduction (the
decrease in bad outcomes in the experimental
group relative t? the outcome rate in the control
group). The absolute risk reduction is always
a much smaller number than the relative risk
reduction. For example, if a trial shows that a
statin drug decreases the risk of heart attacks
from 6 percent (the outcome rate in the control
ex
group) to 4 percent (the outcome rate in the
risk
the
reduction
absolute
perimental group),
2007
www.americanscientist.org
January-February
61
trialwith conventional
typical clinical trial
trialwith riskprofile-based
subgroup analysis
subgroup analysis
???4A
Si
A? &4
4?|
4*
4
benefitedfromtreatment
4?*
harmed
by treatment
4?
Figure 2. The patients in a large clinical trial inevitably have different health histories and risk factors. Many medical
investigators have assumed
more like the real
that this kind of heterogeneity
is an advantage
because
itmakes
the trial population
But summarizing
treatment
population.
results for a population
in a single number may gloss over subgroups
of patients whose
is quite different from the statistical average.
response
Conventional
"one variable at a time" subgroup analysis is unlikely to find subgroups with large differences
in response to therapy. Newer
types
of analysis, using risk profiles, may be more powerful
in grouping patients according
to their likelihood
of benefiting
from treatment.
is 2 percent and the relative risk reduction is 33

several drugs were found that could dissolve
a clot and restore blood flow to heart muscle
percent (the absolute risk reduction divided by
the outcome rate in the control group).
before itwas irretrievably damaged. One of
Doctors aremore likely to adopt a treatment
these was Streptokinase. But in 1978 a Belgian
when the treatment-effect is expressed using a
scientist discovered that the cells lining blood
vessels made an enzyme, tissue-type Plas
larger,more impressive number, even though
the information underlying calculations of ab
minogen activator, or t-PA, that also dissolved
solute and relative risk is identical. Thus, trial clots. In the early 1990s the biotechnology
sponsors (frequently pharmaceutical compa
company Genentech, which had succeeded in
nies) typically emphasize the larger relative risk genetically engineering this enzyme, and the
reduction. But whichever way treatment-effect National Institutes ofHealth sponsored a huge
is expressed, reporting a single number gives
clinical trial of Streptokinase and t-PA.
themisleading
The trial showed that t-PAwas considerably
impression that the treatment
effect is a property of the drug rather than of more effective than Streptokinase, reducing the
the interaction between the drug and the com
relative risk of death by about 15 percent. The
a
newer
risk-benefit
of
plex
particular group
profile
drug was also much more expensive
of patients. Consider what happens when
than Streptokinase, but analysis showed that
sicker patients are enrolled and the rate of the
itsbenefits justified the additional expense. Fol
problematic outcome in the trialgoes up. If the
lowing the gusto study, use of Streptokinase
relative risk reduction stays the same, the ab
declined dramatically, and it isnow very rarely
solute benefit must get proportionally larger. used forheart attacks in theU.S.
This reflects our intuition that sicker patients
Of course, t-PAdoes not reduce every patient's
have potentially more to gain from therapy
risk by the same amount. Consider, forexample,
But when treatments have even a small risk
two patients who both qualify for thrombolytics.
of serious harm, the differences in treatment
Estragon is 72 and diabetic. When he arrives at
effectmay not just be a matter of degree. In
the emergency room by ambulance, he is expe
some
benefit
deed,
patients may
substantially
riencing severe chest pain and has a rapid pulse
from a treatment even when the overall results
and low blood pressure. An electrocardiogram
from a trial are negative. Or a treatmentwith
indicates a heart attack affectinga large and vital
area of theheartmuscle. Vladimir, 52, has stable
benefit on average may be extremely unlikely
to help most patients, while being more likely vital signs and no chronic illnesses.He has come
to harm than help some others. But unless
to the emergency room complaining of chest
the trial investigators analyze their data look
pressure. His electrocardiogram indicates that
he has had a heart attack affectingonly a small
ing for these subgroups, the physician cannot
area of theheartmuscle.
know whether they exist.
Given his condition, Estragon's mortality
Hidden Risks
risk without thrombolyticswould be about 25
Harm to a fewwas the problem lurking in the percent, whereas Vladimir's would be close
statistics of the landmark gusto study,which
to 2 percent. Estragon is at such high risk of
two
compared
thrombolytic (clot-busting)
dying that the potential benefits of t-PA clearly
drugs for heart-attack victims. In the 1970s
outweigh any risks or costs associated with this
62
American
Scientist, Volume
95
agent. But it isnot clear that t-PAwould benefit

Vladimir, who is highly likely to survive no
matter which thrombolytic agent he receives.
In fact, ifX^adirnir has high blood pressure or a
history of stroke,both ofwhich would increase
his risk of intracranial bleeding, giving him the
more potent t-PAmight actually increasehis risk
ofdying(albeitonlyslightly).
In thegusto trial lower-riskpatients likeVlad

imirwere much more common than higher-risk
patients like Estragon. When we re-analyzed
the gusto results using mathematical models
thatestimated the risk of death based on patient
characteristics,we discovered that t-PAprimar
ily benefited a subgroup of high-risk patients.
The highest-risk quartile of patients accounted
formost of theoutcomes thatgave t-PA the edge
over Streptokinase. Paradoxically, even though
the overall results of the trial suggest that t-PA is
better and clearlyworth the extra risks and costs,
thebenefits for the typical patient in the trial are
less, and the trade-offsless clear.
Hidden
Benefits
Summarizing trial results may exaggerate the
benefit of treatment for some patients, but the
reverse is also possible: A negative overall result
can hide significant benefit to some patients.
Consider, forexample, the Atlantis ? trial,un
dertaken in the late 1990s. This trial tested the
efficacy of t-PA in treating strokes instead of
heart attacks. Strokes are trickier than heart at
tacks because the thrombolyticsmust be given
much sooner (within 3 rather than 12 hours)
and therisk of thrombolytic-related intracranial
hemorrhage ismuch greater (probably 6 or 7
percent instead of 1 percent).
Earlier clinical trials had shown that t-PA
did not yield any overall benefit if itwas ad
rrtinisteredmore than three hours after the pa
tient firsthad symptoms of a stroke. This short
window of opportunity meant t-PAwas given
to fewer than 5 percent of stroke patients. To
fectmight be different.When we used a risk

model derived from independent data to di
? patients into thirds,we
vide the Atlantis
found that the third of the patient population
at the lowest risk of thrombolytic-related hem
orrhage actually did better with t-PA?even
though theywere treated outside the approved
time window.
The paradoxical
results of the gusto and
Atlantis ? trials arise from underlying varia
tion in the baseline risks of these populations.
For gusto, variation in the degree of benefit
was due mostly to large variation in the risk
of the outcome (death). For Atlantis b, itwas
attributable to variation in the risk of treat
ment-related
harm.
Estragon
72 years old
Vladimir
52 years old
mm
ewegency
diabetic
historyof stroke
stable vitalsigns but
highblood pressure
rapidpulse
low blood
pressure
electrocardiogram
indicates a massive
anterior-wall
no co-morbid
chronic
illnesses
heart
electrocardiogram
indicates a small
attack
inferior-wall heart
attack
mortality risk25%
more likelytobenefit
from treatment
mortalityrisk2%
less likely
to benefit;might
Ibe harmedby treatment
Figure 3. The common practice of treating everyone just tomake sure patients who will
if a treatment carries a risk of harm for some patients.
benefit get treated is dangerous
Two hypothetical
heart-attack patients illustrate this situation. Estragon is very ill and
would
benefit from t-PA, a clot-busting
probably
drug that performed better than
a patient at low risk of
in a randomized
clinical trial. Vladimir,
Streptokinase
dying
from his heart attack, might actually be harmed by themore potent agent, particularly
ifhe has risks for bleeding,
such as high blood pressure or a prior stroke.
John Ioannidis and Joseph Lau, our col

leagues at theUniversity of Ioannina inGreece
Atlantis
? en
re-test
the treatment
and Tufts-New England Medical Center respec
window,
rolled patients arriving for treatment between
tively,have advocated measuring the degree of
variation in outcome risk in a trialby compar
three and five hours after the onset of symp
toms of a stroke. The trial demonstrated no
ing the outcome rate in the quarter of patients
with the lowest risk score to the outcome rate
overall benefit for t-PA (treated patients were
no more likely than those who received a pla
in the quarter with thehighest risk score. In the
gusto trial, themortality rate in thehighest-risk
or near-normal
to recover
cebo
normal
func
as mentioned
treat
above,
tion). Moreover,
quartile is nearly 10 times higher than that in
ment with t-PA substantially increased their
the lowest risk quartile.
risk of intracranial hemorrhage.
This degree of variation may seem high, but
at
not extreme by anymeans. Looking at trials
it
is
the
result
average
Physicians looking only
of this trialwould be understandably discour
of treatments forHIV infection, Ioannidis and
Lau found examples where the outcome rate
aged by the lack of benefit and the increased
in thehigh-risk group was more than 50 times
risk of harm. But the trial showed that t-PA and
were
The
fact
placebo
essentially equivalent.
higher than that of the low-risk group. And
that some patients given t-PAwere harmed by when we looked at trials testingblood-pressure
it implies others must have benefited from it. medicine for chronic kidney disease, we found
We hypothesized that ifpatients at lower risk similar ratios: Outcome rates were less than 1
of intracranial hemorrhage could be identified percent in the low-risk patients and more than
30 percent inhigh-risk patients.
and t-PA given only to them, the treatment-ef
2007
January-February
63
Ail
Bal
overalloutcomes inATLANTIS
trial
s
2i
co
o "3 50
e o
Si
modified
Rankin
outcomes in low-risk
group inATLANTIS
I."... i control
experimental
trial
low-risk subgroup
benefitedfromtreatment
?
harmed
?
by treatment
trial looked at the outcome of stroke patients treated three to five hours after the onset of symptoms, slightly longer than
Figure 4. The atlantis
the recommended
cut-off. The collective results (top graph) indicated patients in the experimental
group did no better than those in the control
the authors looked at the outcomes among the one-third of patients at least risk of hemorrhage,
group, who were given a placebo. But when
they
found that they were helped by t-PA (bottom graph). In the initial analysis the benefit reaped by some was masked
by the harm others suffered.
that identify those patients least likely to suffer a treatment-related hemorrhage
based on pretreatment characteristics
(right) could
for t-PA to be extended in appropriate
situations. (Graph adapted from Kent, Ruthazer and Selker 2003.)
potentially allow the treatment window
Risk models
It should be obvious thatwhen there is this

degree of variation, one should not expect simi
lar risk-benefit trade-offs in high and low risk
groups. The high degree of variation arises be
cause trials frequentlyenrollmany patients with
a negligible risk for the outcome even in the ab
sence of treatment.For such patients, therapies
associated with even amodest riskof treatment
related adverse effectswill be causing net harm.
Not only is there considerable variation in
risk,but italso appears that the baseline risk is
not always distributed normally, in bell-curve
fashion. Ifrisk were distributed normally, then
the overall trial resultwould at least reflect the
outcome risk and treatment-effect in the typi
cal patient. But in fact the gusto distribution,
where many patients are at low risk and a few
patients at high risk,might be more typical.
There are several reasons why risksmight be
highly skewed. One of them is something called
thefloor effect.
Since there is no such thing as a
negative risk, people with high outcome risks
cannot be balanced out by people with risks less
than zero. Another is that risk factorsare not ran
domly distributed but instead clump: The pres
ence of one risk factoroften increases the likeli
hood of having others.When risks are skewed,
the typical outcome riskmay be different from
theaverage outcome risk,and thereforetheover
all treatment-effect
might not reflectthebenefit to
even the typicalpatient in the trial (Figure5).
One Variable orMany?
Many clinical trials include some attempt to
explore differences in treatment-effectamong
64
American
Scientist, Volume
the enrolled patients. But these analyses almost

always focus on one attribute or risk factor at
a time. For example, theymight compare out
comes inmen and women or in patients with
and without hypertension. But one-variable-at
a-time subgroup analyses are not likely to yield
meaningful information.For one thing,somany
differentvariables can potentially influence the
response to therapy and the likelihood of an
outcome that ifseparate subgroup comparisons
are made, chance alone will ensure that some
subgroups show differences in treatment-effect.
The hazards of "false positive" subgroup analy
sis frommultiple comparisons were amusingly
demonstrated using data from the isis-2 trial,
which looked at the effectof aspirin inpatients
with heart attacks. Post-hoc subgroup analysis
showed that aspirin did not lower mortality in
heart-attack patients born under the signs of
Libra and Gemini, but did in those born under
other signs.
An equally importantand less-well-appreciated
reason
that
one-variable-at-a-time
analysis
is
not effective is that a patient's outcome can be

affected by many factors simultaneously. Since
risks affect outcomes cumulatively, outcome
differences between groups that differ by just
a single risk factor tend to be relatively small.
On the other hand, large outcome differences
are found in analyses that compare subjects
with many risk factors to subjects with none or
few.Even if the experimenters pick a relatively
strong risk factor,a single factor is unlikely to
reliably discriminate between those who are
at greatly different risks for the outcome and
95
who thereforehave widely diverging risk-ben

can improve our
understanding of a clinical
efit trade-offs from treatment.
trial. The trial, the European Carotid
Surgery
We believe that tobe trulyuseful clinical trials Trial (ecst), was
designed to testwhether pa
must routinely include analyses that combine
tients who had recentwarning signs or symp
risk factors into risk scores or indices. Risk mod
toms of a stroke benefited from carotid endar
els forawide variety of diseases can be found in terectomy,a
surgical procedure to clear plaque
the literature,although they have not yet been
from one of themajor arteries that carry blood
to thehead and neck.
exploited to analyze clinical-trial results.
To demonstrate thatour gusto and Atlantis
This trialwas a good candidate for reanalysis
? examples aren't anomalous and thatmultifac
because treatment could be harmful as well as
tor analyses are by nature more
powerful than beneficial; debris from the artery could break
off during surgery and migrate to the brain,
single-variable analyses, we ran computer simu
lations of hypothetical clinical trials.We then
causing a stroke. Patients had varying baseline
or
to
the
risks
forhaving a stroke if they did not have
grouped patients according
presence
absence of one risk factor or according to a risk surgery, and
they also had varying baseline
score based on their count of risk factors.When
risks of sufferingharm during surgery.What's
we looked at the outcomes of these simulated more, the factors
predicting a patient's baseline
were
risk
different
from those predicting his or
patients, single-factor subgroup analyses proved
her risk of stroke during surgery.
statisticallyweak; itwas unlikely such analyses
would reveal real and often large differences in
The ecst trial showed that ifpatients had se
the treatment-effect.Subgroup analyses
vere or "tight" stenoses
using
(narrowings of the carot
id artery that reduced itsdiameter by 70 percent
multiple factors, on the other hand, were ex
tremelypowerful; under typical circumstances,
these analyses would reveal important differ
harm benefit
ences in the treatment-effect in different risk
groups. (See "Finding Answers forVladimir and
Estragon," next page.)
The risk scores in our simulation examined
only one dimension of risk: therisk of having the
outcome of interest. Important variation in this
baseline risk is so common that analysis across
this dimension should be routine. But the im
portance of other dimensions of risk should be
explored in certain cases.
For treatmentswith a particularly high rate
of serious adverse events, such as
thrombolytics
forstroke, scores for therisk of treatment-related
harm may be helpful in
discrirninating patients
likely or unlikely to benefit (as in our Atlantis
? analysis). In some cases, there
might be reason
to examine characteristics that affect the rela
tive responsiveness to therapy, such as time-to
treatment for thrombolytics or other emergency
therapies?although, in thisdimension, itmight
be hard to combine such characteristics into a
score. Lastly,
particularly for chronic diseases
being treated over time in older, sicker popula
tions, competing risks (the riskof succumbing to
an illness not related to the treatment) could also
0
5
10
15
give rise todifferences in treatment-effect.
baseline outcome risk(percent)
In any case, what ismost important is that
themyriad individual risk factors can be sum
Figure 5.Wide variation of patients' baseline risk (their risk of suffering a bad outcome
marized into just a few risk dimensions, which
in the absence of treatment) is one reason trial results don't
apply equally to all patients.
are much more
When
the baseline
risks are skewed, the average trial results
than
the
individual
might not even apply to
powerful
risk distribution with many
variables in sorting patients into those likely
typical patients. The top panel shows a typical skewed
risk in this hypothetical
patients at low risk and a few at high risk. The median baseline
and unlikely to benefit.
A Landmark Study
In 1999 Peter Rothwell of the Radcliffe Infir
mary inOxford, England, and Charles Warlow
ofWestern General Hospital
in Edinburgh,
Scotland, published a landmark reanalysis that
vividly shows how risk-benefit stratification
is less than 4 percent, but the average risk is

the
population
roughly 8 percent because
trial includes a few patients with very
risks
that
high
pull up the average. In the inset
the purple line shows the expected result for placebo
treatment or no treatment If treat
ment reduces the risk of the outcome
by 25 percent but also carries a risk of harm of 1
risk
percent (second line), the sickest patients would
benefit, but those with a baseline
below 4 percent would
actually run a slightly greater risk of being harmed than of being
risk is skewed in this way, a trialmay have an overall
helped. When baseline
positive
outcome even
though most patients are unlikely to benefit
2007
January-February
65
FindingAnswers forVladimir and Estragon
scores reported for clinical trials can be niisleading, and trials are
physicians are aware that the single-number
one
or
to
the
of
exarrtine
another attribute or risk factor on the effectof an intervention.
Most
usually analyzed
impact
But these single-factor analyses are much less likely to yield useful insights than ismultifactor risk stratification.
The difference in the power of these two types of analyses is demonstrated by a clinical-trial simulation. The virtual
patients in this simulation could have any of six differentrisk factors, each of which increased the risk of a bad outcome
by a factor of two. The prevalence of the risk factors varied from 10 percent to 40 percent. Treatment decreased the rela
tive risk of the outcome (a negative event used tomeasure treatment efficacy) at the five-yearmark by 50 percent. But it
also led to three bad outcomes per 1,000 patients per year.
On thewhole the treatment benefited the virtual population: The treatment-effect,expressed as the relative risk reduc
tion, was 23 percent. The statistical power of the simulated trial (A) was also roughly similar to those ofmany real trials.
Given the degree of benefit and the number of patients in the trial, there is a 74 percent chance a single run of the trial
would have a statistically significant result.
true
control
event rate
(outcome
risk)
risk-factor
distributor
randomizer
overall
results
5.6%
true
relative risk
reduction
(treatment
statistical
power
effect)
23%
0.74
When the six risk factorswere used one by one to divide the patients into two groups (B), the treatment-effectdidn't
vary substantially among the groups. A single risk factordidn't much change the patients' risk of suffering the outcome,
and the presence of other risk factors helped obscure what impact itdid have.
are relativelysmall, itwas unlikely thatany one of these single-factoranalyses
Because thedifferences in the treatment-effect
would have statisticallysignificantresults. Indeed, even ifa risk factorwas present in40 percent of the trialsubjects (bottomrow),
therewas only a 19 percent chance that therewould be a significantdifferencebetween the two groups in the treatment-effect.
risk factor
#1 present
absent
#3 present
absent
in 25%
in 75%
#6 present
single-risk-factor
sorter
in 10% of subjects
in 90%
in40%
in60%
absent
of subjects
of subjects
of subjects
of subjects
of subjects
9.5%
5.1%
8.5%
4.6%
7.8%
4.2%
33%
20%
0.11
30%
18%
0.16
29%
14%
0.19
In contrast,when a risk index or scorewas used todivide patients into subgroups (C),we saw tremendous variation both
in the outcome risk and the treatment-effect.
The lowest-riskgroup had an outcome risk of 1.5 percent; thehighest-risk group
The
had a risk of 23.1 percent Because of thevariation inoutcome risk, therewas also largevariation in the treatment-effect.
treatment-effectreversed sign forthe lowest-riskgroup. Their likelihood of sufferinga bad outcome increased49 percent (al
a
a
though this corresponds to just small absolute increase in risk).The highest-risk group had 40 percent relative risk reduction
with treatment.Given the large variation in treatment-effect,theanalysis isquite powerful. The chance of finding a statistically
as finding a treatment-effectin the overall trial.
significantdifference between risk strata is almost as high
4-6
risk factors
3 risk factors
2 risk factors
1 risk factor
riskmodel
0 risk factors
If the six factors are used to assign each patient a risk score and the patients are then divided into two groups of
roughly equal size based on their scores (D), the variation in the outcome risk and the treatment-effectremains large
and the statistical power relatively high. Together
c
these simulations demonstrate thatmore realistic risk
9.0%
2 or more risk factors
0.58
to
treatment
is
the
decisions
analysis
likely
improve
2.6%
0-1 risk factors
physicians and patients must make.
66
American
Scientist,
Volume
95
or more), they benefited from endarterectomy,

which reduced their five-year absolute risk of
suffering a stroke by an average 7 percent. Ac
cording to the overall trial results, all symptom
atic patients with this risk factor should undergo
surgery. Rothwell and Warlow reanalyzed the
results for thisgroup of patients.
The scientists derived twomodels forpatient
risk (risk of future stroke ifuntreated and risk
of stroke during surgery) fromother data. They
then used these models to divide the patients
with tight stenoses into subgroups and looked
at the outcomes in these subgroups. It turned
out that among patients with tight stenoses,
only 16 percent benefited from surgery. Those
who benefited were at relatively high risk of
stroke ifnot treated but at relatively low risk
of stroke during surgery. The other 84 percent
of the patients had nearly identical outcomes
with or without surgery.Again, although the
average outcome suggested patients benefited,
the typicalpatient did not. Reanalysis showed
that only one in five patients with tight steno
seswas helped by surgery.
Consider, for example, Cesario and Viola.
is 76 years old?
Four days ago, Cesario?who
suddenly, though temporarily, lost control of
his righthand and the ability to talk.A cerebral
angiogram (an x-ray image of the arteries that
supply thebrain) showed thathis carotid artery
was 90 percent blocked, and the plaque had a
highly irregular border. Based on this, accord
ing to the ECSTmodel, his risk of a stroke over
thenext five years is over 40 percent.
Viola, on the other hand, is 59 years old.
More than threemonths ago, she experienced
transient loss of vision in one eye (suggesting a
clot in the vessels supplying her eye rather than
her brain). She has had no symptoms since.
Her carotid blockage was 70 percent, and the
plaque was smooth. Based on Viola's charac
teristics,her risk of stroke in the next five years
is less than 5 percent. Moreover, because Viola
is female and has very high blood pressure, her
risk of stroke from the surgery is higher than
Cesario's. For Cesario the benefits are clear; for
Viola, therisks of stroke from the surgery itself
would outweigh the benefits.
What Stands in theWay?
Once the benefits of risk-stratified analysis are
obvious one
explained, they seem obvious?so
would think this type of analysis would already
be commonplace. Yet risk-stratified analyses are
rarely done. In 2001, we reviewed 108 clinical
trials reported in fourmajor journals. At that
timewe found only one trial that used statisti
cal methods similar to those we suggest, and
we haven't noticed many since.
Admittedly there are still methodological
and practical problems with risk stratification
to be ironed out. In a situation where there are
factors that affect the outcome risk, factors that
Viola
Cesario
more
than three
months ago:
four days ago:

temporarily lost control
of his righthand
transient loss of
vision inone eye
temporarily lost the

ability to talk
no symptoms since
cerebral angiogram
showed a 90%
blockage, with a
smooth plaque
causing a 70%
stenosis
highly irregular
border
female with a history
ofhighblood
no history of high
blood pressure and
peripheral vascular
disease
pressure and
peripheral vascular
disease
should
should be treated
not be treated
and Viola, a hypothetical

pair of patients who have suffered strokes,
Figure 6. Cesario
a surgical treatment option.
illustrate the array of risks facing a physician
considering
A re-analysis of the ECST
clinical trial of carotid endarterectomy
(a surgical procedure
one for the
risk models,
to clear blocked
carotid arteries) relied on two multifactor
treatment and another for the risk that the patient
risk of stroke without
the original trial resulted in
suffer a stroke during the surgery itself. Although
of an artery
that everyone with more
than 70 percent blockage
the recommendation
should be sent to surgery, risk-benefit profiling revealed a more complex picture: In
baseline
would
fact only one
men
in five of those patients
stood
actually
50-69% stenosis
smooth
stenosis
to benefit
70-99%
smooth
stenosis
ulcerated/
irregular
from the surgery.
stenosis
ulcerated/
irregular
5-year
riskof
stroke
(percent)
stroke
<10
TIA
10-15
ocular
15-20
-r V
timesince lastevent (weeks)
women
r \/
20-25
50-69% stenosis
smooth
stenosis
?> -7N
\/ni ?>
70-99%
smooth
stenosis
ulcerated/
irregular
25-30
30-35
stenosis
ulcerated/
]35-40
irregular
140-^5
I45-50
|>50
stroke
TIA
ocular
V t>
-7N\/ ni >

in Figure 6, investigators
the results of this kind of
7. Following
the reanalysis
of the trial described
that a chart like that shown above might make
suggested
to physicians.
The chart presents
analysis more accessible
Figure
on five patient
of the patient's most
stroke based
nature
age, amount
of stenosis
characteristics
severe
in addition
recent event
and nature of stenosis.
(a TIA
the five-year risk of a new

to gender, time since last stroke,
(Adapted
is a transient
from Rothwell
2007
ischemie
attack),
et al. 2005.)
January-February
67
For relevant Web

consult
links,
this issue of
American
Scientist
Online:
http://www.american
scientist.org/
lssueTQ?/issue/921
68
American
affect treatment-related harm and other fac

tors that affect the responsiveness of patients
to therapy, it can be difficult to know how best
to combine these different dimensions to ap
nev
propriately stratifypatients. Also, there is
er just a single way to describe risk.Different
riskmodels using different variables may be
equally valid but place individual patients into
different strata,yielding different treatment rec
ommendations depending on which model or
score is applied. It should be noted, however,
that the ambiguity about what to do for some
individual patients currently exists; it ismerely
obscured by our insistence that the average
benefit applies to all.
Additionally, moving risk stratification into
the clinic will mean developing a usable set
of tools for doctors to predict and communi
cate risk?and
overcoming barriers to their
Such
tools might come in the form
adoption.
of charts, such as that shown in Figure 7, or in
the form of informatics tools, such as the elec
trocardiograph-based predictive instruments
developed by our coworker Harry Selker and
other colleagues at Tufts-New England Medical
Center (and used inour gusto reanalysis).
Finally, itmust be recognized that there are
considerable disincentives to taking this ap
we call "individualized therapy"
proach. What
the pharmaceutical
companies that sponsor
most drug trialsmight call "market segmen
tation." If a trial results in a recommendation
that all patients be treated,why look further
and perhaps discover that only a subgroup of
the patients is really benefiting? Indeed, the
only robust risk-stratifiedanalysis we found in
our literature survey was done for a trial that
showed no benefit overall but demonstrated
beneficial effects inhigher-risk patients.
Given these impediments, risk stratification,
like the clinical trial itself,might not be widely
adopted until regulatory agencies require it
as part of the drug-approval process. To our
knowledge the FDA has linked drug approval
to a risk score only once. The prowess trial,
a new drug,
published in 2001, showed that
reduced
mortality by 6.1 percent
drotrecogin,
in patients with sepsis, organ failure caused
by blood infection.Drotrecogin is a genetically
a
engineered version of protein normally found
in thebody that reduces clotting and inflamma
tion. Because the drug is extremely expensive
(about $7,000 per patient), the FDA advisory
committee required that a risk-stratifiedanaly
sis be performed on the prowess results.
When the patients were stratified accord
apache
model (awell-known risk
ing to the
model), it turned out that the half of patients
with lower apache scores fared no betterwith
the agent,whereas thehigher-risk patients ben
efitedmuch more than the overall result sug
gested theywould. The FDA approved drotre
cogin only for these high-risk patients. Perhaps
Scientist, Volume
because theybelieved theiroverall resultsmore

than the risk-stratified results, themakers of
drotrecogin then ran a second clinical trial lim
ited to the lower-risk patients. This trial (the
address trial) confirmed thatdrotrecogin does
not benefit low-risk patients and may instead
cause serious
complications.
Because risk-based subgroup analyses are
so rare, it is impossible to know how often this
kind of clinically important variation in ben
efit goes undetected and leads harmfully to
over- or under-treatment. One might say that
the conventional approach to reporting overall
results of clinical trials consigns us to an impov
erished perspective similar to that described
in Edwin Abbott's
19th-century science
fictionnovella Flatland. in Flatland, characters
inhabit a two-dimensional plane and perceive
objects only if they intersect this plane; the
world of three dimensions is unfathomable. In
our medical Flatland, all the rich data from a
trial is flattened into a single effect; a therapy
eitherworks or itdoesn't. This binary outcome
seems useful, since it conforms well to the
binary decisions doctors must make: to treat
or not treat.But the treatment decision is easy
only because it's fitted to the average patient,
not to real individuals. Analyzing and present
ing clinical trial results across dimensions of
risk can provide us with amore flexible,multi
dimensional evidence base for treating actual,
not average, patients.
Bibliography
Hayward, R. A., D. M. Kent, S. V?jan and X P. Hofer. 2005.

Reporting clinical trial results to inform providers, pay
ers and consumers:
the need to assess benefits and
harms for lower vs. higher risk patients. Health Affairs
24:1571-1581.
Hayward, R. A., D. M. Kent, S. Vijan and X P. Hofer. 2006.
Multivariable
risk prediction can greatly enhance the
statistical power of clinical trial subgroup
analysis.
BMC Medical Research Methodology. 6:18.
J.P., and J.Lau. 1997. The impact of high-risk
patients on the results of clinical trials. Journal ofClinical
Ioannidis,
50(10):1089-1098.
Epidemiology
of the base
Ioannidis, J.R, and J.Lau. 1998. Heterogeneity
line risk within patient populations
of clinical trials:
a
evaluation
algorithm. American Journal of
proposed
148(11):1117-1126.
Epidemiology
Kent, D. M., R. A. Hayward,

J.L. Griffith et al. 2002. An in
derived and validated predictive model
dependently
infarction who
for selecting patients with myocardial
are
activa
likely to benefit from tissue plasminogen
tor compared with Streptokinase. American Journal of
Medicine
113:104-111.
Kent, D. M., R. Ruthazer and H. R Selker. 2003. Are some

to benefit from recombinant
tissue
patients
likely
activator for acute ischemie stroke
type plasminogen
even
3 hours from symptom onset? Stroke
beyond
34:464-467.
Rothwell, P.M. (editor). Formcorning. Treating Individuals:

From Randomised Trials and Systematic Reviews toPerson
alisedMedicine
inRoutine Practice. London:
The Lancet.
P.
S. A. Gutnikov
.,Z. Mehta, S. C. Howard,
P. Warlow.
2005. Treating individuals 3: from
ex
to individuals: general
subgroups
principles and the
ample of carotid endarterectomy. The Lancet Qan 15-21)
Rothwell,
and C
365(9455):256-265.
95

PDF

Hochgeladen von

Dokumentinformationen

Originalbeschreibung:

Originaltitel

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

PDF

Hochgeladen von

Copyright:

Verfügbare Formate

Sigma Xi, The Scientific Research Society

Analyzing theresultsofclinical trials toexpose individualpatients' risks

David Kent and Rodney Hayward

1793,when yellow fever reached Philadel

David Kent and Rodney

specializing ingeneral internal

professor ofmedicine at Tufts

sociate director of the clinical

professor of clinical research at

is the director of the

recipient of theUnder Sec

dress for Kent: Institute for

result of the trial. The treatment-effectcan be

typical clinical trial

is 2 percent and the relative risk reduction is 33

agent. But it isnot clear that t-PAwould benefit

In thegusto trial lower-riskpatients likeVlad

fectmight be different.When we used a risk

John Ioannidis and Joseph Lau, our col

It should be obvious thatwhen there is this

the enrolled patients. But these analyses almost

not effective is that a patient's outcome can be

who thereforehave widely diverging risk-ben

is less than 4 percent, but the average risk is

FindingAnswers forVladimir and Estragon

or more), they benefited from endarterectomy,

four days ago:

temporarily lost the

female with a history

and Viola, a hypothetical

fact only one

in five of those patients

from the surgery.

timesince lastevent (weeks)

timesince lastevent (weeks)

and nature of stenosis.

the five-year risk of a new

For relevant Web

affect treatment-related harm and other fac

because theybelieved theiroverall resultsmore

Hayward, R. A., D. M. Kent, S. V?jan and X P. Hofer. 2005.

Kent, D. M., R. A. Hayward,

Kent, D. M., R. Ruthazer and H. R Selker. 2003. Are some

Rothwell, P.M. (editor). Formcorning. Treating Individuals:

inRoutine Practice. London:

Das könnte Ihnen auch gefallen