Table 8-1 Understanding Study Results

Table 8-2 Examples of Summary Rates from the Women's Health Initiative (WHI) Study

Typical summary rates from randomized, controlled trials:
Incidence rate = (Number of new cases of disease over a defined period) / (Number of persons at risk during the period)
Relative risk (RR) = (Incidence rate among the treated group) / (Incidence rate among the placebo group)
Summary measures that may be more meaningful for clinicians:
Attributable risk (AR), or risk difference = (Incidence rate among treated group) − (Incidence rate among placebo group)
Number needed to treat (NNT) or number needed to harm (NNH) = Reciprocal of AR, or 1/AR
The decreased likelihood of a healthy user bias in an RCT may explain why HRT appeared to be
protective in cohort studies but later proved to be harmful. Because RCTs have this inherent ability to
remove many important potential forms of bias (but are not immune to biases themselves), a physician
can have more confidence that they reflect the true association between the HRT treatment and CHD
outcomes. Despite decades of work, dozens of observational studies, and structured reviews that
strongly suggested a protective effect of HRT for CHD, a single, large RCT trumped them all and caused
a sudden reversal in physicians' prescribing behavior. The results of the WHI study, released in 2002,
sent a shock wave through the medical community. For the first time, a large, randomized trial showed
that HRT, given to otherwise fairly healthy postmenopausal women, caused a statistically significant
increase in CHD events. Within days of the release of the WHI primary results, many women called their
physicians to decide whether they should continue with HRT. Many physicians drastically changed their
prescription of HRT based on the WHI; within 9 months, prescriptions of the most popular formulation of
HRT decreased by as much as 61% (Majumdar et al., 2004). Perhaps more than any other single study in
modern medical history, the WHI report dramatically changed a widespread, common medical practice.
Understanding the Statistical Significance of Study Results
Reports from RCTs such as the WHI study frequently include relative risk as a summary measure of
differences between the treatment and placebo groups (Table 8-1). To arrive at the relative risk, the
researcher first measures the incidence rate of an outcome in each of the two study groups (i.e.,
treatment and placebo). The incidence rate for each group is the number of new outcome
events, such as CHD events, divided by the number of patients at risk for the outcome in that group
over a specific period. In multiyear studies, the average annual incidence rate is often reported as a
summary measure. In a placebo-controlled RCT, the relative risk is then calculated as the
incidence rate for the treatment group divided by the incidence rate for the placebo group (Table 8-2).
The following equations show how to take a summary rate commonly reported in published studies (i.e.,
relative risk) and calculate a summary measure (e.g., number needed to treat, number needed to harm)
that may be more useful in describing the results to clinicians and patients. The example considers the
average annual incidence rates and relative risk for coronary heart disease (CHD) events in the WHI study
on the effects of hormone replacement therapy (HRT):
Average annual incidence among HRT-treated women = 37 CHD events/year/10,000 women
Average annual incidence among placebo-treated women = 30 CHD events/year/10,000 women
Relative risk of CHD = (37 CHD events/10,000 women) / (30 CHD events/10,000 women) = 1.29 (adjusted)
The relative risk describes a relative 29% increase in CHD events. It may be more useful to consider the
absolute difference in incidence rates between the two groups to understand the magnitude of the
potential risk for a given patient:
Attributable risk (AR) = (37 CHD events/10,000 women) − (30 CHD events/10,000 women) = 7 CHD events/10,000 women
The number needed to harm (NNH) can be calculated to describe, on average, how many women must
be treated for 1 year to cause one additional CHD event attributable to HRT:
NNH = 1 / (7 CHD events/10,000 women) = 10,000/7 ≈ 1430 women
Data from Ebell MH, Messmer SR, Barry HC. Putting computer-based evidence in the hands of clinicians.
JAMA 1999;281:1171-1172.
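The arithmetic in this example can be checked with a short script. The following is only a minimal sketch: the function names are illustrative and not from the source, and the rates are the rounded annual figures quoted above (37 and 30 CHD events per 10,000 women). The crude ratio of these rounded rates comes out near 1.23, slightly below the 29% relative increase quoted in the text, which appears to be an adjusted figure.

```python
def relative_risk(treated_rate, placebo_rate):
    """Ratio of the treated-group incidence rate to the placebo-group rate."""
    return treated_rate / placebo_rate


def attributable_risk(treated_rate, placebo_rate):
    """Absolute difference in incidence rates (risk difference)."""
    return treated_rate - placebo_rate


def number_needed(ar):
    """Reciprocal of the attributable risk: NNT if AR reflects benefit, NNH if harm."""
    if ar == 0:
        raise ValueError("attributable risk is zero; NNT/NNH is undefined")
    return 1 / abs(ar)


# Rounded average annual incidence rates quoted in the example above.
treated = 37 / 10_000   # HRT-treated women
placebo = 30 / 10_000   # placebo-treated women

crude_rr = relative_risk(treated, placebo)   # crude (unadjusted) ratio
ar = attributable_risk(treated, placebo)     # events per woman per year
nnh = number_needed(ar)                      # women treated for 1 year

print(f"Crude relative risk: {crude_rr:.2f}")
print(f"Attributable risk:   {ar * 10_000:.0f} CHD events per 10,000 women per year")
print(f"NNH:                 about {round(nnh, -1):.0f} women treated for 1 year")
```

Rounding the NNH to the nearest 10 reproduces the figure of 1430 given in the example; the unrounded value is 10,000/7.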
How can a physician determine whether the reported relative risk from a study is significant enough to
influence clinical decisions? Typically, the statistical significance of the summary measure is reported,
which in this case is relative risk. Statistical significance is usually summarized in published studies by a p
value for a given summary measure. The p value describes the statistical probability that the observed
difference between the groups could have happened simply by chance alone. A p value of less than 0.05
is the arbitrary cutoff most often used for "statistical significance." A "p < 0.05" means that there is less
than a 1 in 20 (5%) probability that a difference as large as that observed would have occurred by chance
alone; a p = 0.04 means a 1 in 25 (4%) probability; a p = 0.06 means a 1 in 16 (6%) probability.
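The "1 in N" phrasing is simply the reciprocal of the p value. A minimal sketch follows, in which the helper name `one_in_n` is hypothetical and used only for illustration. Note that the reciprocal of 0.06 is about 16.7, so the "1 in 16" quoted above is a rounded-down approximation.

```python
def one_in_n(p):
    """Express a probability p as the N in '1 in N' (the reciprocal of p)."""
    if not 0 < p <= 1:
        raise ValueError("p must be greater than 0 and at most 1")
    return 1 / p


# The three p values discussed in the text.
for p in (0.05, 0.04, 0.06):
    print(f"p = {p:.2f} corresponds to about 1 in {one_in_n(p):.1f}")
```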
Although frequently used, p values provide only limited information: the chance that any difference
found is caused by chance, or random error. A p value alone gives no indication of the clinical
significance of a finding and provides no information regarding the likelihood that a finding of "no
difference" is caused by chance, or random error.
Confidence intervals are much more informative than p values. When relative risk is reported as the
summary result of a study, the 95% confidence interval (CI) is often used to give an