
Biostatistics

Statistical analysis is needed to draw valid and unbiased conclusions from data. Five statistical tests are mainly used in biology: three for analysing differences and two for analysing associations.

Differences

Student's t test
Mann-Whitney U test
Wilcoxon matched pairs test (W test)

Associations

Chi Squared Analysis (χ²)
Spearman's Rank Order Correlation Coefficient (rs)

Summary
Chi-squared test: used to determine whether the difference between the observed numbers and the expected numbers is statistically significant. If the calculated chi-squared value is greater than the critical value at the 5% significance level, reject the null hypothesis.

Spearman's rank correlation coefficient: used to determine whether there is a correlation between two variables (for example, a dependent and an independent variable). If the calculated value is greater than the critical value at the 5% significance level, reject the null hypothesis.

t test: used to determine whether the difference in the mean values of 2 sets of data measuring the same variable is significant. If the calculated t value is greater than the critical value at the 5% significance level, reject the null hypothesis: the difference between the two means is statistically significant.

U test: used to determine whether the difference in the median values of 2 sets of data measuring the same variable is significant. If the smaller of the two U values is less than the critical value at the 5% significance level, reject the null hypothesis: there is a significant difference between the median values.

W test: used to determine whether the difference between 2 matched sets of data measuring the same variable is significant. If the smaller of the two W values is less than the critical value at the 5% significance level, reject the null hypothesis: there is a significant difference.
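The decision rules in this summary differ only in the direction of the comparison, which is easy to mix up. A minimal sketch of that rule follows; the function name `reject_null` and the test labels are illustrative, and the U test numbers in the usage lines are hypothetical, not taken from these notes.

```python
# Direction of the comparison for each test, as stated in the summary:
# chi-squared, Spearman and t reject when the calculated value is GREATER
# than the critical value; U and W reject when the smaller calculated
# value is LESS than the critical value.
REJECT_WHEN_GREATER = {"chi_squared", "spearman", "t"}
REJECT_WHEN_SMALLER = {"u", "w"}


def reject_null(test: str, calculated: float, critical: float) -> bool:
    """Return True if the null hypothesis is rejected at the 5% level."""
    if test in REJECT_WHEN_GREATER:
        return calculated > critical
    if test in REJECT_WHEN_SMALLER:
        return calculated < critical
    raise ValueError(f"unknown test: {test}")


# Spearman values from the GNP example in these notes.
print(reject_null("spearman", 0.733, 0.60))
# Hypothetical U test values, for illustration only.
print(reject_null("u", 10, 23))
```

For the U and W tests, `calculated` is assumed to be the smaller of the two values already.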

Chi Squared Analysis (χ²)


A statistical method used to determine goodness of fit. Goodness of fit refers to how close the observed data are to those predicted from a hypothesis. The test evaluates to what extent the data and the hypothesis fit each other.

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where
O = observed data in each category
E = expected data in each category, based on the experimenter's hypothesis
Σ = sum of the calculations for each category

Consider the following example in Drosophila melanogaster


The Cross: A cross is made between two true-breeding flies (c+c+e+e+ and ccee). The flies of the F1 generation are then allowed to mate with each other to produce an F2 generation.

The outcome
F1 generation
All offspring have straight wings and gray bodies

F2 generation
193 straight wings, gray bodies
69 straight wings, ebony bodies
64 curved wings, gray bodies
26 curved wings, ebony bodies
352 total flies

Applying the chi square test


Step 1: Propose a null hypothesis (H0) that allows us to calculate the expected values based on Mendel's laws.
The two traits are independently assorting.

Step 2: Calculate the expected values of the four phenotypes, based on the hypothesis.
According to our hypothesis, there should be a 9:3:3:1 ratio in the F2 generation.

Phenotype                      Expected probability   Expected number     Observed number
straight wings, gray bodies    9/16                   9/16 × 352 = 198    193
straight wings, ebony bodies   3/16                   3/16 × 352 = 66     69
curved wings, gray bodies      3/16                   3/16 × 352 = 66     64
curved wings, ebony bodies     1/16                   1/16 × 352 = 22     26
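The expected numbers in Step 2 can be checked with a short calculation. This is a minimal sketch using only the 9:3:3:1 ratio and the total of 352 flies from the example; the variable names are illustrative.

```python
from fractions import Fraction

total_flies = 352

# 9:3:3:1 ratio predicted by independent assortment (the null hypothesis).
ratio = {
    "straight wings, gray bodies": 9,
    "straight wings, ebony bodies": 3,
    "curved wings, gray bodies": 3,
    "curved wings, ebony bodies": 1,
}
parts = sum(ratio.values())  # 16

# Expected number = expected probability x total number of flies.
expected = {
    phenotype: Fraction(share, parts) * total_flies
    for phenotype, share in ratio.items()
}

for phenotype, count in expected.items():
    print(f"{phenotype}: {count}")
```

Using `Fraction` keeps the arithmetic exact, so the expected numbers come out as whole flies (198, 66, 66, 22) without rounding.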

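Step 3: Apply the chi square formula

Observed number: 193, 69, 64, 26
Expected number: 198, 66, 66, 22

$$\chi^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \frac{(O_3 - E_3)^2}{E_3} + \frac{(O_4 - E_4)^2}{E_4}$$

$$\chi^2 = \frac{(193 - 198)^2}{198} + \frac{(69 - 66)^2}{66} + \frac{(64 - 66)^2}{66} + \frac{(26 - 22)^2}{22}$$

$$\chi^2 = 0.13 + 0.14 + 0.06 + 0.73$$

$$\chi^2 = 1.06$$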
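The arithmetic of Step 3 can be reproduced in a few lines. This is a minimal sketch using the observed and expected numbers of the Drosophila example; the function name `chi_squared` is illustrative. Summing the unrounded terms gives about 1.05, while summing the terms after rounding each to two decimals gives the 1.06 shown in the worked example.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [193, 69, 64, 26]
expected = [198, 66, 66, 22]

# Individual terms, one per phenotype.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = chi_squared(observed, expected)

print([round(t, 2) for t in terms])
print(round(chi2, 2))
```

The small difference between 1.05 and 1.06 comes only from rounding and does not affect the conclusion.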

Step 4: Interpret the chi square value
The calculated chi square value can be used to obtain probabilities, or P values, from a chi square table. These probabilities allow us to determine the likelihood that the observed deviations are due to random chance alone.

If the chi square value corresponds to a probability of less than 0.05 (i.e. less than 5%), the deviation is considered statistically significant and the hypothesis is rejected.

Before we can use the chi square table, we have to determine the degrees of freedom (df).
The df is a measure of the number of categories that are independent of each other.
If you know the counts in 3 of the 4 categories, you can deduce the 4th (total number of progeny minus the counts in categories 1 to 3).
df = n − 1, where n = total number of categories.
In our experiment there are four phenotypes/categories, therefore df = 4 − 1 = 3.

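Drawing a valid and reliable conclusion


The calculated value must be compared with the critical value at the 5% significance level. If the calculated chi square value is greater than the critical value at 5% significance, reject the null hypothesis. In biology the 5% significance level (p = 0.05) is conventionally used, so conclusions are stated with 95% confidence.

In the Drosophila example, the critical value for df = 3 at the 5% level is 7.815. The calculated value of 1.06 is smaller, so the null hypothesis of independent assortment is not rejected.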
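The comparison with the critical value can be sketched as a table lookup. The critical values below are the standard chi-squared values at p = 0.05 for 1 to 5 degrees of freedom; the table name and the function `chi_squared_decision` are illustrative, not part of these notes.

```python
# Standard chi-squared critical values at the 5% significance level,
# indexed by degrees of freedom (df).
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_squared_decision(chi2: float, n_categories: int):
    """Return (df, critical value, reject null hypothesis?)."""
    df = n_categories - 1  # df = n - 1
    critical = CRITICAL_5_PERCENT[df]
    return df, critical, chi2 > critical


# Drosophila example: four phenotypes, calculated chi-squared of 1.06.
df, critical, reject = chi_squared_decision(1.06, 4)
print(df, critical, reject)
```

Here the calculated value is well below the critical value, so the sketch reports that the null hypothesis is not rejected.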

Graph is a line of best fit

Spearman's Rank Correlation Coefficient


It can be used with ordinal data.

It can be used to determine whether there is significant association between two measured variables.

Calculation of Rank Correlation Coefficient
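
For reference, the coefficient is commonly calculated with the standard textbook formula below, assumed here to be the one intended. In it, d is the difference between the two ranks of each pair and n is the number of pairs; this form applies when there are no tied ranks.

```latex
r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}
```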

Find the critical value and the degrees of freedom

Drawing a valid and unbiased conclusion

Compare the calculated value with the critical value at the 5% significance level.
If the calculated correlation coefficient is greater than the critical value at 5%, reject the null hypothesis.

Example: GNP and adult literacy


Let us test whether there is a relationship between GNP per capita and educational provision.

Country        GNP per capita   % adult literacy
Nepal          210              39.7
Sudan          290              55.5
Gambia         340              34.5
Peru           2460             89
Turkey         3160             81.4
Brazil         4570             84
Argentina      8970             97
Israel         15940            96
U.A.E.         18220            74.3
Netherlands    24760            100

First, construct a null hypothesis (H0) that there is no relationship between GNP per capita and % adult literacy. Remember, Spearman's Rank works on ranks, that is, on ordinal data.

It is necessary, therefore, to rank-order the data first.

Rs = 0.733

The null hypothesis (H0) was that there is no relationship between GNP per capita and % adult literacy. The degrees of freedom are (n − 1), so df = 9. The Spearman's Rank correlation coefficient (Rs) of 0.733 exceeds the critical value of 0.60 at the 95% probability level for 9 degrees of freedom. Therefore H0 must be rejected and replaced by the alternative hypothesis (H1) that there is a relationship between GNP per capita and % adult literacy.
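
The result of 0.733 can be reproduced from the ten pairs of values in the example. This is a minimal sketch that ranks each variable and applies the standard formula rs = 1 − 6Σd² / (n(n² − 1)); the helper names `ranks` and `spearman_rs` are illustrative, and the ranking helper assumes there are no tied values, which holds for this data set.

```python
# Data from the example, in the order Nepal ... Netherlands.
gnp = [210, 290, 340, 2460, 3160, 4570, 8970, 15940, 18220, 24760]
literacy = [39.7, 55.5, 34.5, 89, 81.4, 84, 97, 96, 74.3, 100]


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rs(x, y):
    """Spearman's rank correlation coefficient for untied data."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


rs = spearman_rs(gnp, literacy)
critical_value = 0.60  # critical value quoted in the text

print(round(rs, 3))
print(rs > critical_value)
```

For this data the sum of squared rank differences is 44, which gives rs = 1 − 264/990 ≈ 0.733, above the quoted critical value.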
