I need 50 participants in my study. About 5 individuals per year will be enrolled. Therefore, it will take 10 years to finish the study.

I will follow a cohort of 500 individuals. A lab test that costs US$100 will be conducted for each person. Therefore, I will need US$50,000 just for lab tests.

I have 2 years to finish my thesis, of which one year is for data collection. I think I can get data on 50 people in that year. Is 50 a sufficient number of people to test my hypothesis with the significance level I want?

To show that, under certain conditions, the hypothesis test has a good chance of showing a desired difference (if it exists). To show the funding agency that the study has a reasonable chance of obtaining a conclusive result. To show that the necessary resources (human, monetary, time) will be minimized and well utilized.

Most important: sample size calculation is an educated guess. It is appropriate for studies involving hypothesis testing. There is no magic involved; only statistical and mathematical logic and some algebra. Researchers need to know something about what they are measuring and how it varies in the population of interest.

Quantities related to the research question (defined by the researcher)

Previous published studies; pilot studies. If information is lacking, there is no good way to calculate the sample size!

Population factor

Study Design

Type of response variable or outcome; number of groups to be compared; specific study design; type of statistical analysis.

In conjunction with the research question, the type of outcome and study design will determine the statistical method of analysis

Difference between two or more means; odds ratio; change in $R^2$, etc.

The magnitude of these values depends on the research question and the objective of the study (for example, clinical relevance).

Smaller error → greater precision → need more information → need larger sample size

Prob(type II error) = Prob(do not reject $H_0$ when $H_0$ is false) = $\beta$; Power = $1 - \beta$

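The equation for sample size is derived from the equation for the statistical test. In a t-test with $n$ participants per group, the equation for the test is

$$t = \frac{m_1 - m_2}{\sqrt{(s_1^2 + s_2^2)/n}}$$

The derived equation for sample size (per group) is

$$n = \frac{(z_{1-\alpha/2} + z_{1-\beta})^2 \, (s_1^2 + s_2^2)}{(m_1 - m_2)^2}$$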

Question: does exercise help to decrease body weight? Study design: participants will be randomized into two groups (exercise and control). Outcome: change in weight. Want to detect: a change of at least 15 pounds. Known: from past studies, the standard deviation varies between 10 and 15 pounds.
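As a rough illustration, the exercise example can be worked through with the normal-approximation sample size equation for two means. The slide does not state a significance level or power, so the sketch below assumes a two-sided $\alpha$ = 0.05 and power = 0.80, and assumes both groups share the same standard deviation; the function name `n_per_group` is made up for this example.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(diff, sd1, sd2, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided comparison of two means.

    Uses n = (z_{1-alpha/2} + z_{1-beta})^2 (s1^2 + s2^2) / (m1 - m2)^2.
    alpha and power defaults are assumptions, not values from the slides.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # quantile for the type I error
    z_beta = z.inv_cdf(power)             # quantile for power = 1 - beta
    n = (z_alpha + z_beta) ** 2 * (sd1 ** 2 + sd2 ** 2) / diff ** 2
    return ceil(n)                        # round up to whole participants


# Detect a 15-pound difference at both ends of the plausible SD range.
for sd in (10, 15):
    print(f"SD = {sd}: {n_per_group(15, sd, sd)} participants per group")
```

Under these assumptions the calculation gives 7 participants per group for a standard deviation of 10 and 16 per group for a standard deviation of 15. Because the formula uses normal rather than t quantiles, it tends to be slightly optimistic for groups this small, so planning with the larger standard deviation is the more cautious choice.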

Number of groups: 4. Hypothesized means: 35, 20, 25, 18 (possibly from a pilot study). Sample size pattern: same number in each group. SD of subjects: 18 (from a previous study). $\alpha$ = 0.01 and 0.05. Find power for sample sizes from 5 to 30 per group (increments of 5).
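The inputs above can be turned into the two quantities that power software typically works from. The sketch below assumes a one-way ANOVA with equal group sizes (the slide does not name the analysis) and computes Cohen's effect size f and the noncentrality parameter for each candidate sample size; the function name `anova_effect` is made up for this example. The power itself is then the probability that a noncentral F variable with k - 1 and k(n - 1) degrees of freedom exceeds the critical value at the chosen $\alpha$, a step that needs a statistics package and is not shown here.

```python
from math import sqrt


def anova_effect(means, sd, n_per_group):
    """Return (Cohen's f, noncentrality parameter) for a balanced one-way ANOVA."""
    k = len(means)
    grand_mean = sum(means) / k
    # Sum of squared deviations of the group means from the grand mean.
    ss_means = sum((m - grand_mean) ** 2 for m in means)
    f = sqrt(ss_means / k) / sd               # SD of the means over within-group SD
    lam = n_per_group * ss_means / sd ** 2    # noncentrality of the F distribution
    return f, lam


means = [35, 20, 25, 18]   # hypothesized group means from the slide
sd = 18                    # within-group SD from the slide
for n in range(5, 31, 5):
    f, lam = anova_effect(means, sd, n)
    print(f"n per group = {n:2d}: f = {f:.3f}, noncentrality = {lam:.2f}")
```

The effect size f does not depend on n; only the noncentrality parameter grows with the sample size, which is why power increases as more participants are added per group.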

Research question: is depression score an important factor in explaining pain ratings, after adjusting for age and sex? Statistical question: does adding depression score increase the explained variation of pain ratings, in a linear regression model that already has age and sex in it and has $R^2$ = 0.2? Suppose I may have sample sizes of 20, 30, 50, 70, and 100. What is the minimum $R^2$ change I can detect with power 0.8?
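One common way to express the size of an $R^2$ change is Cohen's $f^2$. The slide does not say which effect-size measure is used, so this is an assumption; $\Delta R^2$ denotes the increase from adding depression score, and 0.2 is the $R^2$ of the model with age and sex only.

```latex
f^2 = \frac{\Delta R^2}{1 - R^2_{\text{full}}}
    = \frac{\Delta R^2}{1 - (0.2 + \Delta R^2)}
```

For each candidate sample size, the minimum detectable $\Delta R^2$ is the smallest value whose $f^2$ gives power 0.8 for the F-test of the added variable.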

Different methods of data analysis require different input for sample size calculations

Logistic Regression

Repeated measures

Read chapter 2 of Statistical Rules of Thumb, by Gerald van Belle (2002, John Wiley and Sons). Using specialized software is useful if many calculations will be performed.

Important to remember

Pilot studies do not need sample size calculation!!! There is no point in doing power analysis after the study is done. Sample size is an educated guess, and it works only if:

The study sample comes from the same population as the pilot study, or a similar one; the population of interest is not changing over time; the difference or association being studied exists.

$$E = \frac{m_1 - m_2}{s_{\text{pooled}}}$$

Question: How many more people do I need to enroll in the study (already in progress) to show statistical significance? Answer: It depends. If the two populations have the same mean, increasing the sample size will not help! Since when is the objective of a study to find a statistically significant result??

A researcher is interested in outcome A, which differs very little between two treatments. The sample size needed is around 3000!! The researcher changes the outcome to B, for which the required sample size is smaller. B does not answer the researcher's question, and he needs to accept that his new treatment is not really different (clinically speaking) from the existing treatment.

A researcher is interested in comparing two groups regarding prediction of outcome A by using a regression analysis (with several variables). He uses the only available formula from his statistics book (for a t-test). Wrong! He should find software that can calculate the sample size appropriately.

Summary

Define the research question well. Consider the study design, type of response variable, and type of data analysis. Decide on the type of difference or change you want to detect (make sure it answers your research question). Choose and use the appropriate equation for the sample size calculation.

