Statistics are used in clinical trials to make inferences about new treatments based on the evidence from the patients in the trial
E.g. a new drug for the treatment of lung cancer: does it work or not?
Ideally, we would design a trial that includes all patients with lung cancer
In practice, we can only test the new treatment on a representative sample of the population
Statistics allow us to draw conclusions about the likely effect on the population using data from the sample
USING STATISTICS
Assess the weight of evidence that a treatment works (or doesn't)
Give an estimate (and likely range) of the treatment effect
Test to see how likely it is that this effect would have been seen by chance
BUT
Statistics can never PROVE anything beyond any doubt, just beyond reasonable doubt!!
WHO TO INCLUDE?
As a general rule, all patients randomised should be analysed by treatment allocated (regardless of whether they actually received this treatment): INTENTION TO TREAT (ITT) ANALYSIS
Reasons for ITT:
- Avoids, or at least minimises, the risk of bias
- Is more pragmatic: reflects real life
HYPOTHESIS TESTING
We want to compare the outcomes in different treatment arms (A and B)
Testing two hypotheses:
- H0 (null hypothesis): there is no real difference between A and B
- H1 (alternative hypothesis): there is a difference between A and B
Calculate a test statistic based on the assumption that H0 is true (i.e. there is no real difference)
The test will give us a p-value: how likely are the collected data if H0 is true?
If this is unlikely (small p-value), we reject H0
The p-value is the probability of observing data at least as extreme as ours when the null hypothesis is true
Typically, if the p-value is less than 0.05, people say that the trial gives statistically significant evidence that there is a difference
Results with a p-value greater than 0.05 tend to be ignored
However, 0.05 is a purely arbitrary value, and not really that small: when H0 is true, one time in twenty we will reject it wrongly!
Don't become wedded to the p-value: there is not much difference between 0.051 and 0.049
Better still, use the data collected in the trial to give an estimate of the treatment effect size, together with a measure of how certain we are of our estimate
To estimate the true treatment effect, we calculate a confidence interval (CI) for our point estimate
A CI is a range of values within which the true treatment effect is believed to lie, with a given level of confidence
A 95% CI is constructed so that, over many repeated trials, 95% of such intervals would contain the true treatment effect
Generally, the 95% CI is calculated as: Sample Estimate ± 1.96 × Standard Error
Use the confidence interval to assess the true treatment effect, and not just p-values
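As a minimal sketch of the "estimate ± 1.96 × standard error" rule, the snippet below computes a 95% CI for a difference in proportions between two arms. The event rates and sample sizes are hypothetical and chosen only for illustration; the function name `confidence_interval` is likewise an assumption, not something from the slides.

```python
import math

def confidence_interval(estimate, standard_error, z=1.96):
    """Return (lower, upper) for estimate +/- z * standard_error."""
    margin = z * standard_error
    return estimate - margin, estimate + margin

# Hypothetical trial: proportion with the outcome in each arm.
p_a, n_a = 0.30, 200   # arm A (made-up numbers)
p_b, n_b = 0.20, 200   # arm B (made-up numbers)

# Point estimate: difference in proportions.
difference = p_a - p_b

# Standard error of a difference between two independent proportions.
standard_error = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)

lower, upper = confidence_interval(difference, standard_error)
print(f"difference = {difference:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

The multiplier 1.96 comes from the Normal distribution; for small samples a t-distribution multiplier would usually be used instead.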
DATA ANALYSIS
How do we do this? What type of analysis should be performed? Depending on the sort of outcome measure, different types of analysis are appropriate Because the actual analyses are now done mainly by computer, the skill is now:
CATEGORICAL DATA
Outcomes like good/bad, yes/no or present/absent
In testing categorical data, we are looking to see if there is any relationship between the outcome category and the treatment given
For categorical data, the chi-squared test is appropriate if the categories aren't ordered
For ordered categories, use a trend test
            Aspirin   No Aspirin    Total
Dead            804         1016     1820
Alive          7783         7584   15,367
Total          8587         8600   17,187

- Use chi-squared test of association to determine whether to reject the null hypothesis of no association between aspirin and death
- χ² (1 df) = (804 − 909.3)² / 909.3 + … + (7584 − 7689.3)² / 7689.3 = 27.26 (P<0.0001)
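The chi-squared calculation can be sketched in a few lines. This assumes the 2×2 counts implied by the table margins and the terms of the formula (804 deaths among 8587 aspirin patients, 1016 deaths among 8600 patients without aspirin) and uses the Pearson statistic without continuity correction. The function names are illustrative.

```python
import math

def chi_squared_2x2(table):
    """Pearson chi-squared statistic (no continuity correction) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if outcome and treatment were unrelated (H0).
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

def p_value_1df(statistic):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

# Rows: dead, alive. Columns: aspirin, no aspirin.
observed = [[804, 1016],
            [7783, 7584]]

chi2 = chi_squared_2x2(observed)
p = p_value_1df(chi2)
print(f"chi-squared = {chi2:.2f}, p = {p:.1e}")
```

The `erfc` shortcut for the p-value only holds for 1 degree of freedom, which is the case for any 2×2 table.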
Tested hypothesis and found strong evidence of an association between aspirin use and mortality
Not very informative: is aspirin harmful or beneficial?
Various measures of treatment effect:
- Absolute Risk Reduction
- Number Needed to Treat
- Relative Risk
- Relative Risk Reduction
- Odds Ratio
- Odds Reduction
Odds ratio <1, so the odds of dying are smaller with aspirin
95% CI for the odds ratio = 0.70 to 0.85
Based on the CI, the true treatment effect ranges from a 15% reduction to a 30% reduction in the odds of death with aspirin
Moderate treatment effect, narrow-ish CI and P<0.0001
Good evidence that aspirin reduces the risk of death following MI
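A sketch of how an odds ratio and its 95% CI might be obtained from the 2×2 counts. The slides do not say which method produced the interval; the log-odds (Woolf) approximation used here is an assumption, as are the function name and the counts, which are those implied by the aspirin table (804 dead and 7783 alive with aspirin, 1016 dead and 7584 alive without).

```python
import math

def odds_ratio_with_ci(dead_treated, alive_treated, dead_control, alive_control, z=1.96):
    """Odds ratio (treated vs control) with a CI from the log-odds approximation."""
    odds_ratio = (dead_treated / alive_treated) / (dead_control / alive_control)
    # Standard error of log(odds ratio): square root of the summed reciprocal counts.
    se_log = math.sqrt(1 / dead_treated + 1 / alive_treated
                       + 1 / dead_control + 1 / alive_control)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log)
    upper = math.exp(log_or + z * se_log)
    return odds_ratio, lower, upper

odds_ratio, lower, upper = odds_ratio_with_ci(804, 7783, 1016, 7584)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Under these assumptions the interval agrees with the 0.70 to 0.85 quoted above; an upper limit of 0.85 corresponds to a 15% reduction in odds and a lower limit of 0.70 to a 30% reduction.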
Number needed to treat/harm (NNT/NNH) = 1 / |P1 − P2|
Relative Risk (RR) = P2 / P1
Relative Risk Reduction (RRR) = (P1 − P2) / P1 = (0.118 − 0.094) / 0.118 = 0.20 (i.e. aspirin reduces the risk of death by 20%)
Here P1 is the risk of death without aspirin and P2 is the risk of death with aspirin
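The risk-based measures can be computed together. This sketch assumes P1 is the risk of death without aspirin (1016 of 8600) and P2 the risk with aspirin (804 of 8587), counts implied by the aspirin table, and takes the absolute risk reduction as P1 − P2. The function name is illustrative.

```python
def risk_measures(events_control, n_control, events_treated, n_treated):
    """Return the common risk-based measures of treatment effect."""
    p1 = events_control / n_control    # risk without treatment
    p2 = events_treated / n_treated    # risk with treatment
    arr = p1 - p2                      # absolute risk reduction
    return {
        "P1": p1,
        "P2": p2,
        "ARR": arr,
        "NNT": 1 / abs(arr),           # number needed to treat (or harm)
        "RR": p2 / p1,                 # relative risk
        "RRR": arr / p1,               # relative risk reduction
    }

measures = risk_measures(1016, 8600, 804, 8587)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With unrounded risks the RRR comes out at about 0.21; the slide's 0.20 uses the risks rounded to three decimal places. The NNT of roughly 41 means about 41 patients would need aspirin to prevent one death.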
CONTINUOUS DATA
Outcomes like blood pressure, weight or scores, summarised using measures of the centre and spread of the distribution
- Mean: what we think of as an average; add up all data and divide by the number of items
- Median: midpoint of the data; half the data lie below the median and the other half above
- Mode: most frequent observation
Measures of spread
Variance and standard deviation
Standard deviation is, roughly, the typical distance of individual observations from the mean
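A short illustration of these summary measures using Python's standard `statistics` module. The seven diastolic blood pressure readings are made up for the example.

```python
import statistics

# Hypothetical diastolic blood pressure readings (mmHg).
readings = [78, 82, 85, 85, 88, 90, 94]

mean = statistics.mean(readings)          # sum divided by number of items
median = statistics.median(readings)      # middle value of the sorted data
mode = statistics.mode(readings)          # most frequent value
variance = statistics.variance(readings)  # sample variance (divides by n - 1)
sd = statistics.stdev(readings)           # square root of the sample variance

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance:.2f}, sd={sd:.2f}")
```

Note that `variance` and `stdev` are the sample versions (dividing by n − 1), which is the usual choice when the data are a sample from a larger population.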
CONTINUOUS DATA
In continuous data, we are comparing the means in the two groups and assessing whether the two groups come from the same population
H0: Mean A = Mean B
H1: Mean A ≠ Mean B
NORMAL DISTRIBUTION
T-test and ANOVA assume the data are Normally distributed
However, if the data are very skewed or have multiple peaks, we use a non-parametric test, which doesn't assume any particular shape for the data, e.g. Wilcoxon, Mann-Whitney
        Treatment A   Treatment B
N                41            43
SD              5.5           5.5
- Use Student's t-test to assess whether the means are from the same population (i.e. Mean with Treatment A = Mean with Treatment B)
Tested hypothesis and found evidence that mean diastolic BP in the two groups is different
Not very informative: which of treatment A or B is better?
Point estimate of the treatment effect - calculate the difference between the two means and the confidence interval
So the difference in mean diastolic BP between groups is statistically significant (P=0.0013), with treatment A being more effective in reducing diastolic BP
However, the observed difference of 4 mmHg in favour of treatment A could be as small as 1.6 mmHg or as large as 6.4 mmHg
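The figures in this example can be reproduced from the summary statistics alone. The sketch below assumes a pooled-variance two-sample t-test with group sizes 41 and 43, SD 5.5 in each group and an observed mean difference of 4 mmHg. Because the standard library has no t-distribution, the p-value is obtained by numerically integrating the t density; the helper names are illustrative.

```python
import math

def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the density from 0 to |t|."""
    b = abs(t)
    h = b / steps
    total = t_density(0.0, df) + t_density(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 1 - 2 * (total * h / 3)

def t_critical(df, alpha=0.05):
    """Find c such that the two-sided p-value at c equals alpha (bisection)."""
    low, high = 0.0, 20.0
    for _ in range(60):
        mid = (low + high) / 2
        if two_sided_p(mid, df) > alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2

n_a, n_b = 41, 43
sd_a, sd_b = 5.5, 5.5
mean_difference = 4.0  # mmHg, in favour of treatment A

df = n_a + n_b - 2
pooled_variance = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))

t_statistic = mean_difference / standard_error
p_value = two_sided_p(t_statistic, df)
margin = t_critical(df) * standard_error
ci_lower, ci_upper = mean_difference - margin, mean_difference + margin

print(f"t = {t_statistic:.2f} on {df} df, p = {p_value:.4f}")
print(f"95% CI {ci_lower:.1f} to {ci_upper:.1f} mmHg")
```

In practice a statistics package would supply the t-distribution directly; the numerical integration here is only to keep the sketch self-contained.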
SURVIVAL DATA
Interested in studying the time between randomisation and a subsequent event (say death)
These times are unlikely to be normally distributed
Cannot afford to wait until events have happened to all subjects, for example until all are dead
Some people may have left the study early and become lost to follow-up: the only information we have about some patients is that they were still alive at last follow-up
Use survival analysis methods to analyse time to event data, not just the number of events
Take into account that not all patients may have had an event
Basic idea: we split the trial up into distinct time intervals
In each time interval, a certain number of patients, N, enter that time period alive and still on follow-up, and some of these, D, have an event
Then the probability of surviving that time interval (assuming you live that long) is (1 − D/N)
Multiply all these probabilities together to give the probability of survival up to a given time point
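The multiplication of interval survival probabilities described above can be sketched as follows. The interval counts are hypothetical; patients lost to follow-up simply do not appear in the N of later intervals.

```python
def survival_curve(intervals):
    """Cumulative survival after each interval.

    intervals: list of (n_at_risk, n_events) pairs in time order, where
    n_at_risk is N (alive and on follow-up at the start of the interval)
    and n_events is D (events during the interval).
    """
    survival = 1.0
    curve = []
    for n_at_risk, n_events in intervals:
        survival *= 1 - n_events / n_at_risk  # probability of surviving this interval
        curve.append(survival)
    return curve

# Hypothetical data: 100 patients start; 10 die and 5 are lost to follow-up
# in the first interval, so 85 enter the second interval, and so on.
intervals = [(100, 10), (85, 5), (70, 7)]

curve = survival_curve(intervals)
for index, probability in enumerate(curve, start=1):
    print(f"after interval {index}: survival = {probability:.3f}")
```

Censored patients contribute to N for every interval they complete, which is how the method uses their partial information rather than discarding it.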
[Figure: survival curve; x-axis: time in weeks]
          IFN   No IFN   Total
Dead      151      156     307
Alive     187      180     367
Total     338      336     674
We want to assess whether the time to death is the same for the two treatments
We will have two graphs: how do we say whether one group survives longer than the other?
Could do one test at, say, 1 year and compare proportions (as before)
Could keep testing at small intervals
What are the drawbacks to these methods?
Use the logrank test to determine whether the survival function is the same for the two treatment groups
H0: Survival function/curve same for both groups H1: Survival function/curve different across groups
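A compact sketch of the logrank test for two groups. At each distinct event time it compares the observed number of events in one group with the number expected if both groups shared the same survival curve. The follow-up times below are hypothetical (an event flag of 1 means death, 0 means censored), and the function name is illustrative.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group logrank test. Returns (chi-squared statistic, p-value on 1 df)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, by group.
        at_risk_a = sum(1 for u, _, g in data if u >= t and g == 0)
        at_risk_b = sum(1 for u, _, g in data if u >= t and g == 1)
        # Events occurring exactly at time t.
        deaths_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        deaths_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n = at_risk_a + at_risk_b
        d = deaths_a + deaths_b
        observed_a += deaths_a
        expected_a += d * at_risk_a / n   # expected under H0 (same curve)
        if n > 1:
            variance += at_risk_a * at_risk_b * d * (n - d) / (n * n * (n - 1))

    statistic = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(statistic / 2))  # chi-squared, 1 df
    return statistic, p_value

# Hypothetical follow-up times in weeks.
times_a = [5, 8, 12, 20, 30, 41]
events_a = [1, 1, 0, 1, 0, 1]
times_b = [3, 6, 9, 15, 18, 25]
events_b = [1, 1, 1, 1, 0, 1]

statistic, p_value = logrank_test(times_a, events_a, times_b, events_b)
print(f"logrank chi-squared = {statistic:.2f}, p = {p_value:.3f}")
```

Unlike a comparison of proportions at a single time point, the test uses the whole follow-up period and needs only one significance test.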
Assessed the evidence and found no evidence that time to death differs between the treatment groups
Despite the lack of difference, we should still calculate a point estimate and confidence interval for the treatment effect
Use Cox regression to calculate the hazard ratio and confidence interval
IFN non-significantly reduces the risk of death by 6%, with the true treatment effect based on the confidence interval ranging from a 25% reduction in mortality to an adverse 18% increase in mortality with IFN.
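The percentages in this statement follow directly from the hazard ratio and its confidence limits. The slide does not show those numbers; the values below (0.94, with limits 0.75 and 1.18) are inferred from the quoted percentages and should be read as an assumption.

```python
def percent_change(ratio):
    """Convert a ratio (hazard ratio, odds ratio, ...) into a percentage change in risk."""
    return (ratio - 1) * 100

# Assumed values, inferred from the quoted percentages; not shown on the slide.
hazard_ratio, ci_lower, ci_upper = 0.94, 0.75, 1.18

for label, value in [("estimate", hazard_ratio),
                     ("lower limit", ci_lower),
                     ("upper limit", ci_upper)]:
    change = percent_change(value)
    direction = "reduction" if change < 0 else "increase"
    print(f"{label}: HR {value:.2f} -> {abs(change):.0f}% {direction}")

# The interval includes 1 (no effect), which is why the result is non-significant.
includes_no_effect = ci_lower <= 1 <= ci_upper
print("CI includes no effect:", includes_no_effect)
```

A confidence interval that spans 1 is consistent with either benefit or harm, so the data cannot distinguish between them.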
Do not explore all endpoints until you find one that is significant (data dredging)
Looking at multiple outcomes increases the chance of finding something significant
In 20 outcomes, on average 1 will be significant at the 0.05 level just by chance
Is this real, or the play of chance?
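The arithmetic behind this warning can be made explicit. The sketch assumes the outcomes are independent and that no treatment has any real effect; the function name is illustrative.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n independent tests is 'significant' when H0 is true."""
    return 1 - (1 - alpha) ** n_tests

n_outcomes = 20
alpha = 0.05

expected_false_positives = n_outcomes * alpha
family_wise_risk = prob_at_least_one_false_positive(n_outcomes, alpha)

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"chance of at least one: {family_wise_risk:.2f}")
```

So with 20 outcomes there is roughly a 64% chance of at least one spurious "significant" result, even when nothing is going on.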
Give confidence intervals where possible, and not just p-values
Keep subgroup analyses to a minimum
Subgroup analyses should be pre-specified
When interpreting subgroups, assess the whole picture
Do not focus upon one subgroup and individual p-values
FINAL WORDS
The idea of statistics is to look at the strength of the evidence for a given hypothesis and determine the reliability of the treatment effect observed in the trial
Calculations are based on formulas, but the application of the formulas and the interpretation of the results is an art rather than a science
Significance is not black and white
A little common sense can go a long way in medical statistics
If in doubt, ask a statistician!
To call in the statistician after the experiment is done may be no more than asking him to perform a post mortem examination: he may be able to say what the experiment died of.
Sir R.A. Fisher
Indian Statistical Congress, Sankhya, c. 1938
BOOK LIST
Swinscow TDV and Campbell MJ. Statistics at Square One (10th edition). BMJ Books, 2002
Campbell MJ. Statistics at Square Two. BMJ Books, 2001
Altman D, Machin D, Bryant T and Gardner M. Statistics with Confidence. BMJ Books, 2000
Pereira-Maxwell F. A-Z of Medical Statistics. Arnold, 1998