
Statistics and Experimental Design for Animal Research: A Gentle Introduction
Robert J. Tempelman
Department of Animal Science
Michigan State University
CANR Statistical Consulting Center
http://www.fw.msu.edu/orgs/canr_big/SCC.htm

What is statistics???

Ott and Longnecker (2001): "science of learning from data"

Biometry/Biostatistics: Statistics applied to biology
Double meaning of biometrics

Biological research involves data!!

1) Collecting Data
   Experimental Design
2) Summarizing Data
   Simple numerical and graphical descriptions
3) Analyzing Data
   Formal statistical methods for hypothesis testing and estimation
4) Communicating Results
   Discussion and Interpretation

Biological data will always be noisy

Why?

In a study, data is collected from a finite sample randomly drawn from a conceptually large population.
Every subject responds to the same treatment differently (experimental error).
Even the same subject might not respond the same way to the same treatment from one day to the next (measurement error).

Therefore, ALWAYS an element of uncertainty in drawing conclusions from the statistical analysis of data from finite samples.

Experimental Design
Definition: Plan for assigning experimental units to treatments.

Simplest experimental design: Completely Randomized Design (CRD)

In a CRD, experimental units are:
1) Randomly chosen from a representative population, then
2) Randomly assigned to one of several treatments

- Experimental units should be as homogeneous as possible; otherwise consider blocking (see later)
- Randomization is essential to remove systematic bias!
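
The random assignment step of a CRD can be sketched in a few lines of Python. This is only an illustration: the function name `assign_crd`, the rat labels, and the seed are assumptions made for the example and do not come from the slides.

```python
import random

def assign_crd(units, treatments, seed=None):
    """Randomly split units into equal-sized treatment groups (a CRD)."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    rng = random.Random(seed)      # seeded so the allocation can be reproduced
    shuffled = list(units)
    rng.shuffle(shuffled)          # every ordering is equally likely
    per_trt = len(units) // len(treatments)
    return {
        trt: sorted(shuffled[i * per_trt:(i + 1) * per_trt])
        for i, trt in enumerate(treatments)
    }

# Hypothetical labels for six rats and two treatments
rats = [f"rat{i}" for i in range(1, 7)]
allocation = assign_crd(rats, ["A", "B"], seed=2024)
for trt, group in allocation.items():
    print(trt, group)
```

Recording the seed alongside the allocation makes the randomization auditable later.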

Blood cholesterol example

Experimental study with randomization:
6 rats assigned at random to one of 2 treatments (n = 3 rats per treatment)

Blood cholesterol data collected (mg/dl):

New Treatment (A)    Control Treatment (B)
yA1 = 144            yB1 = 149
yA2 = 148            yB2 = 148
yA3 = 146            yB3 = 159
Ave yA = 146         Ave yB = 152

Would you conclude the treatments lead to different mean blood cholesterol levels?

A MEAN TREATMENT DIFFERENCE IS FOUND!

Statistical inference: but is it
- due to mere chance (biological noise)???
or
- the real thing (beyond reasonable doubt)?

Statistical analysis for previous slide: consider a regular t-test
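
A minimal sketch of that t-test on the six cholesterol values, assuming the equal-variance (pooled) two-sample form. The helper name `pooled_t` is made up for the example, and the critical value 2.776 is the two-sided 5% point of the t distribution with 4 degrees of freedom from standard tables.

```python
from math import sqrt
from statistics import mean, variance

new_trt = [144, 148, 146]   # Treatment A (mg/dl)
control = [149, 148, 159]   # Treatment B (mg/dl)

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance, plus its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = sqrt(pooled_var * (1 / na + 1 / nb))   # standard error of the mean difference
    return (mean(a) - mean(b)) / se, df

t_stat, df = pooled_t(new_trt, control)
T_CRIT_DF4 = 2.776   # two-sided 5% critical value for 4 df (standard t tables)
print(f"t = {t_stat:.2f} on {df} df")
print("significant at 5%" if abs(t_stat) > T_CRIT_DF4 else "not significant at 5%")
```

With only 3 rats per treatment, the observed 6 mg/dl difference gives a t statistic of about -1.62, well inside the critical value, so chance cannot be ruled out.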

Significance and Power

Practical significance vs. statistical significance
- Statistically significant results may not be practically important.

Statistical Power issues!
- Was the study large enough to allow a reasonable chance of definitively concluding a treatment difference if one truly existed?

Scientific Method
a) Review and Research the problem
b) Formulate Hypothesis
c) Design experiment that will allow test of hypothesis
d) Evaluate the hypothesis
e) Draw Conclusions

How does statistical inference work?

Infer upon the characteristics of a large population based on data from a finite random sample.
Mechanics (Part of the Scientific Method):
- Design and collect data from an experiment (e.g. blood pressure).
- Assess the probability of getting the experimental results assuming a true null hypothesis (status quo knowledge, e.g. no treatment difference).
- Common investigator objective: disprove the status quo in favor of an alternative hypothesis (there is a treatment effect).

Conclusions are never made with absolute certainty; must establish proof beyond a reasonable doubt.
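
One way to make "the probability of getting the experimental results assuming a true null hypothesis" concrete is an exact randomization (permutation) test on the six blood cholesterol values from the earlier slide. Note that this is a swapped-in technique for illustration, not the t-test the slides use: under the null hypothesis the treatment labels are arbitrary, so every way of splitting the six rats into two groups of three is equally likely.

```python
from itertools import combinations

values = [144, 148, 146, 149, 148, 159]   # all six rats (mg/dl)
observed_a = (0, 1, 2)                    # positions of the Treatment A rats

def abs_gap(a_idx):
    """|sum(A) - sum(B)|; proportional to |mean difference| for equal group sizes."""
    sum_a = sum(values[i] for i in a_idx)
    return abs(2 * sum_a - sum(values))

# Enumerate every possible assignment of 3 of the 6 rats to Treatment A
splits = list(combinations(range(len(values)), 3))
as_extreme = sum(abs_gap(s) >= abs_gap(observed_a) for s in splits)
p_value = as_extreme / len(splits)
print(f"{as_extreme} of {len(splits)} assignments are as extreme: p = {p_value}")
```

Here 4 of the 20 possible assignments give a mean difference at least as large as the observed one, so p = 0.2: far from proof beyond a reasonable doubt.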

Terminology: Sample versus specimen

Biologist draws blood from 20 people.
- A biologist might state that he/she has 20 samples of blood.
- A statistician would state that the biologist has one sample of 20 glucose measurements.
- 20 specimens or 20 experimental units rather than 20 samples.

Sample vs. Population

[Diagram: a Sample is drawn at random from a Population; results are carried over to a Target pop'n by judicious inference. Diagram labels: (All rats), (Humans), (Humans).]

Actual versus target population differences could be far more subtle.

Variables, Variables!!!

Quantitative variables:
- Due to a true numerical measurement.
- Ratio scale (e.g. weight) versus Interval scale (e.g. temperature)
- Discrete (countable) versus Continuous

Qualitative variables:
- Nominal scale (classification or group): GENOTYPE, SEX
- Ordinal scale (ranked variables: small, medium, large)

Dependent (response variables, e.g. weight) as a function of Independent variables (GENOTYPE, SEX)

Discrete (quantitative) versus ordinal (qualitative) variables

1) Litter size (discrete quantitative)

SOW ID   BREED       LITTER SIZE
         Duroc       10
         Hampshire   12
         Duroc
         Duroc       11
         Hampshire

Discrete (quantitative) versus ordinal (qualitative) variables (contd)

2) Calving ease scores (ordinal 1-5 scale in Holsteins)
   1 = unassisted
   5 = Caesarean

COW ID    Calving ease score
Madonna
Cher
Oprah
Hillary

Other variable issues (contd)

Don't confuse discrete variables with truly continuous variables
i.e. some variables appear to be discrete (integers) because of data recording roundoff.

- e.g. age of cattle recorded to the nearest month is a continuous variable.
- Number of mastitis cases within a lactation is a discrete variable.

Parameters versus statistics

Population (Size = N): characterized by parameters, e.g. mean: $\mu$
   -> Random selection
Sample (Size = n): characterized by statistics, e.g. mean: $\bar{y}$

Statistical inference: a sample statistic is an estimator of a population parameter.

Usual distributional assumption for continuous responses: Normality

[Histogram: distribution of weight gains of 100 baby chicks over specified time period. x-axis: GAIN MIDPOINT (3.7 to 4.9); y-axis ticks: 0 to 30.]

Data does not have to be perfectly normally distributed to use common statistical procedures (t-tests, ANOVA)

Pseudo replication (an obvious example from Gill, 1978)

Suppose 6 rats per treatment, each measured twice (stimulus response to drugs on two different occasions).
e.g. Rat A1 had response 33 the 1st time, then 35.

Rat Number
Within Treatment   Treatment A   Treatment B
1                  33, 35        37, 33
2                  39, 38        31, 30
3                  29, 31        43, 45
4                  41, 41        36, 38
5                  34, 36        30, 39
6                  26, 23        38, 39

How much replication? n = 6, not 12
- Biological versus technical replication (subsampling)
- One rat per treatment: no replication

Remedy? Average each experimental unit's responses

Rat Number
Within Treatment   Treatment A   Treatment B
1                  34            35
2                  38.5          30.5
3                  30            44
4                  41            37
5                  35            34.5
6                  24.5          38.5

Treat each experimental unit's average as the response, then do a regular t-test.

Note: there are still benefits to subsampling, since it controls measurement error, but you are always better off increasing the number of rats per treatment than the number of measurements per rat.
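
The averaging remedy can be sketched as follows, using the duplicate measurements from the Gill (1978) example. The pooled equal-variance t statistic is an assumption about what "regular t-test" means here, and 2.228 is the two-sided 5% critical value for 10 degrees of freedom from standard tables.

```python
from math import sqrt
from statistics import mean, variance

# Two measurements per rat, six rats per treatment
trt_a = [(33, 35), (39, 38), (29, 31), (41, 41), (34, 36), (26, 23)]
trt_b = [(37, 33), (31, 30), (43, 45), (36, 38), (30, 39), (38, 39)]

# Step 1: collapse the subsamples so that each rat contributes one value
rat_means_a = [mean(pair) for pair in trt_a]
rat_means_b = [mean(pair) for pair in trt_b]

# Step 2: pooled t-test on the rat means (n = 6 per treatment, not 12)
n = len(rat_means_a)
df = 2 * (n - 1)
pooled_var = (variance(rat_means_a) + variance(rat_means_b)) / 2
t_stat = (mean(rat_means_a) - mean(rat_means_b)) / sqrt(pooled_var * 2 / n)

T_CRIT_DF10 = 2.228   # two-sided 5% critical value for 10 df (standard t tables)
print(f"t = {t_stat:.2f} on {df} df")
print("significant at 5%" if abs(t_stat) > T_CRIT_DF10 else "not significant at 5%")
```

Running the test on all 24 raw measurements instead would claim 22 degrees of freedom that the design does not have.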

Animals in pens

Suppose you have two pens/litters of pigs:
- Each pen has four pigs
- All pigs in Pen # 1 receive Diet A
- All pigs in Pen # 2 receive Diet B

Do you have replication?

Pen # 1 -> Diet A
Pen # 2 -> Diet B

Animals in pens (contd)

Answer to question on previous slide: NO! n = 1
- Pens/Litters are the experimental units for diets
- Pigs within pens are merely pseudoreplicates!
- Need several pens per diet in order to have a valid study.

See also: Wainwright, Patricia E. 1998. Issues of Design and Analysis Relating to the Use of Multiparous Species in Developmental Nutrition Studies. Journal of Nutrition 128:661-663.

Instructions to the Authors (Journal of Dairy Science, 2007)

The experimental unit is the smallest unit to which an individual treatment is imposed. For group-fed animals, the group of animals in the pen or the paddock is the experimental unit; therefore, groups must be replicated.

i.e. must have more than 1 pen per treatment (and 2 might not be nearly enough!)

Basic design concepts

Randomization
- Experimental units need to be randomly assigned to treatments!

Replication
- Several experimental units per treatment needed to assess experimental error
- Power: Having a sufficiently large sample size (experimental units) to detect a mean difference if one truly exists

Blocking
- Similar experimental units could be blocked together and randomization of units to treatments conducted within each block

BLOCKING

e.g. you wish to test a new diet supplement (Treatment B) versus a control diet (Treatment A) for growth in piglets
- It is established that there are known litter/pen effects on growth.
  -> then consider blocking on litters!
- Suppose the size of each litter is standardized to two pigs.
- We randomly assign one piglet within each litter to Treatment A and the remaining piglet to Treatment B.

This is an example of a randomized complete block design (RCBD)!
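
A minimal sketch of the within-litter randomization described above. The function name `assign_rcbd`, the litter and piglet labels, and the seed are hypothetical and only serve the illustration.

```python
import random

def assign_rcbd(litters, treatments=("A", "B"), seed=None):
    """Randomize treatments to piglets separately within each litter (block)."""
    rng = random.Random(seed)
    plan = {}
    for litter, piglets in litters.items():
        if len(piglets) != len(treatments):
            raise ValueError("complete block: one piglet per treatment per litter")
        order = list(treatments)
        rng.shuffle(order)                     # fresh randomization for every litter
        plan[litter] = dict(zip(piglets, order))
    return plan

# Hypothetical study with four litters standardized to two piglets each
litters = {f"litter{i}": (f"pig{i}a", f"pig{i}b") for i in range(1, 5)}
plan = assign_rcbd(litters, seed=7)
for litter, assignment in plan.items():
    print(litter, assignment)
```

Because every litter receives both treatments, each treatment comparison is made between littermates.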

The Randomized Complete Block Design (RCBD)

Population of litters of size 2
-> Draw random sample of litters
-> Randomly assign treatments to piglets within litters

Litter 1:  Trt A, Trt B
Litter 2:  Trt B, Trt A
...
Litter n:  Trt A, Trt B

Why block?

Remove block (e.g. litter) as a source of variability
- Greater statistical power since treatment comparisons conducted WITHIN each litter
- Basis of paired t-test when block size = 2.

Remove litter as a potential source of bias

Other examples of blocking?
- Identical twins
- "Before and after" treatment on same subject

Crossover designs

Where animals are blocks for treatments
2 period crossover:
- Design is balanced with respect to diets and periods.
- e.g. not a good idea to always feed Diet A in Period 1 and Diet B in Period 2; otherwise Diets and Periods are confounded with each other

etc.

Typical experimental designs exploiting blocking and used for animal science

Variants of crossover designs exploiting power of within-animal comparisons have been chosen for comparing two treatments
Constructed to be balanced with respect to periods
- 4-period crossover (double reversal): half of animals -> A B A B & other half -> B A B A
- 3-period crossover (switchback): half of animals -> A B A & other half -> B A B
- 2-period crossover (simple crossover/Latin square): half of animals -> A B & other half -> B A
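
The allocation of animals to treatment sequences can be sketched as below. The three sequence pairs are taken from the list above; the function name `allocate_sequences`, the cow labels, and the seed are assumptions for the example.

```python
import random

# Sequence pairs for the three crossover variants
SEQUENCES = {
    "simple crossover": ("AB", "BA"),
    "switchback": ("ABA", "BAB"),
    "double reversal": ("ABAB", "BABA"),
}

def allocate_sequences(animals, design, seed=None):
    """Randomly give half of the animals each of the two sequences of a design."""
    if len(animals) % 2 != 0:
        raise ValueError("need an even number of animals for a balanced design")
    rng = random.Random(seed)
    shuffled = list(animals)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    first, second = SEQUENCES[design]
    plan = {animal: first for animal in shuffled[:half]}
    plan.update({animal: second for animal in shuffled[half:]})
    return plan

cows = [f"cow{i}" for i in range(1, 9)]   # hypothetical herd of 8
plan = allocate_sequences(cows, "switchback", seed=11)
for period in range(3):
    diets = [seq[period] for seq in plan.values()]
    print(f"period {period + 1}: A={diets.count('A')}, B={diets.count('B')}")
```

The printed counts confirm the balance requirement: in every period, half of the animals are on each treatment.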

Table 1 from Cox (1980)

Cox, D.R. 1980. Design and analysis in nutritional and physiological experimentation. Journal of Dairy Science 63:313-321

Table 1. Classification of 24 weight gains measured on 12 cows fed in two pens (2 period crossover)

Period        Pen 1 containing Cows 1 to 6        Pen 2 containing Cows 7 to 12
One (4 wks)   Diet A used and 6 gains recorded    Diet B used and 6 gains recorded
Two (4 wks)   Diet B used and 6 gains recorded    Diet A used and 6 gains recorded

The experimental units in this situation were pens of animals in a given period: no way to separate effects of diet from all other possible factors.

Cox's internal torment

One merely could assume that effects of pens were negligible. This is equivalent to assuming that, when feeding a single diet, the gains of two animals in the same pen are no more alike than the gains of two animals each in different pens
BUT
Pen-to-pen variation has been important too often to make such an assumption credible.

How many animals do I need for a study?

It depends on:
- Your design (blocking versus not blocking, size of experimental unit -> pen vs. animal)
- True mean difference that you hope to detect! ($\mu_A - \mu_B$)
- Relative amount of variability ($\sigma$)
    Between responses within same animal ($\sigma_e$: measurement error)
    Between animals ($\sigma_a$: innate or biological variability)
    Between litters/pens ($\sigma_p$: where applicable)
- Type I error rate: Probability of concluding a treatment effect when one doesn't exist (typically set < 5%)
    Why we choose P-values < 0.05
- Desirable power: Probability of concluding a treatment effect when one truly exists (typically set > 80%)
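
These ingredients combine in the textbook normal-approximation formula for a two-treatment CRD, n per treatment = 2 * sigma^2 * (z_{1-alpha/2} + z_{power})^2 / delta^2, where delta is the true mean difference and sigma^2 the total variance among experimental units. The sketch below is only a rough planning aid under that approximation (exact t-based calculations give slightly larger n); the function name and the example inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_treatment(delta, sigma2, alpha=0.05, power=0.80):
    """Approximate experimental units per treatment for a two-treatment CRD."""
    z = NormalDist()                          # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)        # two-sided Type I error rate
    z_power = z.inv_cdf(power)                # desired power
    return ceil(2 * sigma2 * (z_alpha + z_power) ** 2 / delta ** 2)

# Hypothetical inputs: detect a 3-unit mean difference at two variance levels
for sigma2 in (20.0, 4.0):
    n = n_per_treatment(delta=3.0, sigma2=sigma2)
    print(f"variance {sigma2}: about {n} units per treatment")
```

Halving the detectable difference quadruples the required n, which is why the choice of a practically important difference matters so much.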

Well, how do we specify some of this stuff?

- Literature and educated guessing
- Uniformity trials or existing data on subjects in current facility under regular management conditions.
- Range approximation: $\sigma \approx$ 1/4 x range of responses
    $\sigma_e \approx$ 1/4 x range of responses within same subject and treatment
    $\sigma_a \approx$ 1/4 x range of responses between subjects within same treatment
    $\sigma_p \approx$ 1/4 x range of average responses between pens within same treatment
- What would be a practically important specification for $\mu_A - \mu_B$?

Dairy Example: Relationship between within-cow variance $\sigma^2_e$ (kg^2) and DIM (by parity)

[Figure: within-cow variance plotted against DIM by parity, with early lactation and late lactation marked.]

Reasonable assumption: $\sigma^2_e = \sigma^2_a$

From Jensen, J. 2001. Genetic evaluation of dairy cattle using test-day models. Journal of Dairy
CRD power for individual animal study as function of n and the true mean difference

[Figure: two power curves (y-axis: Power)]
Early lactation: $\sigma^2_e + \sigma^2_a = 20$ kg^2
Late lactation: $\sigma^2_e + \sigma^2_a = 4$ kg^2

Two period crossover trial for individual animal trials

[Figure: two power curves]
Early lactation: $\sigma^2_e = 10$ kg^2
Late lactation: $\sigma^2_e = 2$ kg^2
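
The gain from the crossover can be sketched with a normal approximation to power. In a CRD the difference of two treatment means has variance 2(sigma^2_e + sigma^2_a)/n, while in a two-period crossover the between-animal variance cancels and the mean within-animal difference has variance 2 sigma^2_e/n. The variance components below are the ones on the slides; the mean difference `DELTA` and the values of n are hypothetical. The sketch ignores period and carryover effects and the small-sample t correction, so it will not reproduce the plotted curves exactly.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()

def approx_power(delta, var_of_difference, alpha=0.05):
    """Normal-approximation power of a two-sided test of a mean difference."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    return Z.cdf(abs(delta) / sqrt(var_of_difference) - z_crit)

def power_crd(n_per_trt, delta, sigma2_e, sigma2_a):
    """CRD: both within- and between-animal variance enter the comparison."""
    return approx_power(delta, 2 * (sigma2_e + sigma2_a) / n_per_trt)

def power_crossover(n_animals, delta, sigma2_e):
    """Two-period crossover: only within-animal variance remains."""
    return approx_power(delta, 2 * sigma2_e / n_animals)

DELTA = 2.0   # hypothetical true mean difference (kg)
for stage, s2e in (("early", 10.0), ("late", 2.0)):
    for n in (5, 10, 20):
        crd = power_crd(n, DELTA, s2e, s2e)        # assumes sigma2_a = sigma2_e
        cross = power_crossover(n, DELTA, s2e)
        print(f"{stage} lactation, n={n}: CRD {crd:.2f}, crossover {cross:.2f}")
```

Note that n means animals per treatment in the CRD (2n animals in total) but total animals in the crossover, so the crossover gains power while using fewer animals.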

Binary data?

Yes or no responses, e.g. mastitis, conception rate
Consider comparison of Trt A versus Trt B
- Incidence rate for Trt A = 5%
- Incidence rate for Trt B = 7.5%, 10%, 12.5%, ..., 25%

CRD Power to conclude a difference in incidence rates between Trt A (0.05) and Trt B (0.075 to 0.250)
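
A rough version of such a power calculation, using the common normal approximation for comparing two proportions. The incidence rates are the ones listed above; the number of animals per treatment (`N_PER_TRT`) is a hypothetical choice, and the approximation is not necessarily the method behind the plotted curves.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()

def power_two_proportions(p_a, p_b, n_per_trt, alpha=0.05):
    """Normal-approximation power of a two-sided test of p_a versus p_b."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    p_bar = (p_a + p_b) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_trt)              # SE if no difference
    se_alt = sqrt((p_a * (1 - p_a) + p_b * (1 - p_b)) / n_per_trt)   # SE under the alternative
    return Z.cdf((abs(p_b - p_a) - z_crit * se_null) / se_alt)

P_A = 0.05
N_PER_TRT = 200   # hypothetical number of animals per treatment
rates_b = [round(0.05 + 0.025 * k, 3) for k in range(1, 9)]   # 0.075, 0.100, ..., 0.250
powers = [power_two_proportions(P_A, p_b, N_PER_TRT) for p_b in rates_b]
for p_b, pw in zip(rates_b, powers):
    print(f"Trt B incidence {p_b:.3f}: power {pw:.2f}")
```

Binary responses carry little information per animal, so small differences in incidence need far larger studies than comparable differences in a continuous response.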

Power calculators

Questions?
