
Biostatistics

First part
• History of biostatistics
• Definition of biostatistics
• Basics of research methodology
• Measures of central tendency
• Measures of dispersion
• Methods of Data presentation
Second part

• Sampling variability

• Significance

• Tests of significance
Hypothesis testing

What is a Hypothesis?
• A hypothesis is an assumption about a population parameter.
  (e.g. "I assume the mean GPA of this class is 3.5!")
  – A parameter is a characteristic of the population, such as its mean or variance.
  – The parameter must be identified before the analysis.
The Null Hypothesis, H0
• States the assumption (numerical) to be tested
  e.g. The grade point average of juniors is at least 3.0 (H0: μ ≥ 3.0)
• Begin with the assumption that the null hypothesis is TRUE
  (similar to the notion of "innocent until proven guilty")
• Refers to the status quo
• Always contains the '=' sign (=, ≤, or ≥)
• The null hypothesis may or may not be rejected
The Alternative Hypothesis, H1
• Is the opposite of the null hypothesis
  e.g. The grade point average of juniors is less than 3.0 (H1: μ < 3.0)
• Challenges the status quo
• Never contains the '=' sign
• The alternative hypothesis may or may not be accepted
• Is generally the hypothesis the researcher believes to be true
Identify the Problem

• Steps:
  – State the null hypothesis (H0: μ ≥ 3.0)
  – State its opposite, the alternative hypothesis (H1: μ < 3.0)
• The hypotheses are mutually exclusive and exhaustive
• Sometimes it is easier to form the alternative hypothesis first
Hypothesis Testing Process

Assume the population mean age is 50 (null hypothesis). A sample is drawn and its mean
turns out to be 20. Is a sample mean of 20 likely if μ = 50? No, not likely! So we
REJECT the null hypothesis.
Our hypothesis testing procedure

Reject H0 if the test statistic falls in the rejection (red) area; do not reject if it falls in the non-rejection (green) area.


Level of Significance, α
• Defines unlikely values of the sample statistic if the null hypothesis is true
  – Called the rejection region of the sampling distribution
• Designated α (alpha)
  – Typical values are 0.01, 0.05, 0.10
• Selected by the researcher at the start
• Provides the critical value(s) of the test
Errors in Making Decisions
• Type I Error
  – Rejecting a true null hypothesis ("false positive")
  – Has serious consequences
  – The probability of a type I error is α
    • Called the level of significance
    • Set by the researcher
• Type II Error
  – Failing to reject a false null hypothesis ("false negative")
  – The probability of a type II error is β (beta)
Level of Significance, and
the Rejection Region

H0: 3 Critical
Value(s)
H1:  < 3
Rejection 0
Regions 
H0:   3
H1:  > 3
0
/2
H0:  3
H1:   3
0
Type I error
• We fix the rejection region so that, when the null hypothesis is true, we have a 5%
  chance of incorrectly rejecting it.
• This is called the type I error rate, or the "size" of the test.
• This also means that, when the null hypothesis is true, we have a 95% chance of
  making the correct decision.
• Decision rule: reject H0 when P < 0.05.
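A minimal simulation sketch (not from the slides; assumes numpy and scipy are available) of this idea: when H0 is in fact true, a test run at α = 0.05 should reject in roughly 5% of repeated samples.

```python
# Simulate the type I error rate of a one-sample t test at alpha = 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
rejections = 0
n_trials = 10_000

for _ in range(n_trials):
    # Draw a sample from a population where H0 (mu = 50) is actually true.
    sample = rng.normal(loc=50, scale=10, size=30)
    _, p = stats.ttest_1samp(sample, popmean=50)
    if p < alpha:
        rejections += 1

print(rejections / n_trials)  # close to 0.05, the type I error rate
```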
Type II error
• Again, when the null distribution is the right one, we have a 5% chance of making a
  mistake and a 95% chance of not making a mistake.
Type II error, continued
• When the alternative hypothesis is true, we want to reject the null hypothesis. The
  probability of doing so is called the "power" of the test.
• When the alternative hypothesis is true, failing to reject the null hypothesis is a
  type II error.
• A good test has low probabilities of both type I and type II error.
α and β Have an Inverse Relationship
Reduce the probability of one error and the other one goes up.


Factors Affecting Type II Error, β
• True value of the population parameter
  – β increases when the difference between the hypothesized parameter and the true value decreases
• Significance level, α
  – β increases when α decreases
• Population standard deviation, σ
  – β increases when σ increases
• Sample size, n
  – β increases when n decreases
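A minimal simulation sketch (hypothetical numbers, assuming numpy and scipy) of the last point: estimate β, the type II error rate, when the true mean differs from the hypothesized one, and show that β shrinks as the sample size n grows.

```python
# Estimate beta = P(fail to reject H0 | H0 is false) for two sample sizes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

def estimate_beta(n, true_mean=3.1, hypothesized_mean=3.0, sigma=0.5,
                  alpha=0.05, n_trials=5_000):
    """Fraction of samples in which we fail to reject a false H0 (mu = hypothesized_mean)."""
    failures = 0
    for _ in range(n_trials):
        sample = rng.normal(true_mean, sigma, size=n)
        _, p = stats.ttest_1samp(sample, popmean=hypothesized_mean)
        if p >= alpha:          # fail to reject H0 even though it is false
            failures += 1
    return failures / n_trials

print(estimate_beta(n=25))      # larger beta with a small sample
print(estimate_beta(n=100))     # smaller beta (higher power) with a large sample
```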
Hypothesis Testing: Steps

Test the assumption that the true mean grade point average of juniors is at least 3.0.
• 1. State H0:        H0: μ ≥ 3.0
• 2. State H1:        H1: μ < 3.0
• 3. Choose α:        α = 0.05
• 4. Choose n:        n = 100
• 5. Choose the test: t test (or p value)
Hypothesis Testing: Steps (continued)

Test the assumption that the grade point average of juniors is at least 3.0.
• 6. Set up the critical value(s):  t = -1.7
• 7. Collect data:                  100 students sampled
• 8. Compute the test statistic:    computed test statistic = -2
     (computed p value = 0.04, two-tailed test)
• 9. Make the statistical decision: reject the null hypothesis
• 10. Express the decision:         the true mean grade point average is less than 3.0
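A minimal sketch of steps 5 to 10 in Python (hypothetical GPA data, not the slides' actual sample; assumes numpy and scipy): a one-sample, lower-tailed t test of H0: μ ≥ 3.0 against H1: μ < 3.0.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
gpa = rng.normal(loc=2.9, scale=0.5, size=100)     # hypothetical sample of 100 juniors

alpha = 0.05
t_stat, p_two_sided = stats.ttest_1samp(gpa, popmean=3.0)
p_one_sided = p_two_sided / 2 if t_stat < 0 else 1 - p_two_sided / 2

t_critical = stats.t.ppf(alpha, df=len(gpa) - 1)   # about -1.66 for df = 99

if t_stat < t_critical:            # equivalently: p_one_sided < alpha
    print("Reject H0: the mean GPA is significantly less than 3.0")
else:
    print("Fail to reject H0")
```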
Hypothesis Testing Procedures

• Parametric procedures: Z test, t test, one-way ANOVA
• Nonparametric procedures: Wilcoxon rank sum test, Kruskal-Wallis H test
• Many more tests exist!
o Means and standard deviations are called parameters; all theoretical distributions
  have parameters.
o Statistical tests that assume a distribution and use parameters are called
  parametric tests.
o Statistical tests that do not assume a distribution or use parameters are called
  nonparametric tests.
When to use nonparametric tests?
• While many things in nature and science are normally distributed, some are not. In
  such cases using a t test, for example, could be inappropriate and misleading.
• Nonparametric tests have fewer assumptions or restrictions on the data.
• Examples:
  – Nominal data: race, sex
  – Ordered categorical data: mild, moderate, severe
  – Likert scales: strongly disagree, disagree, no opinion, agree, strongly agree
How do nonparametric tests work?
• Most nonparametric tests use ranks instead of the raw data for their hypothesis
  testing.
• Example: comparing test scores between girls and boys
• Null hypothesis: the medians are equal
How Nonparametric Tests Work
Step 1: rank the data without regard to group

Test scores:
  Boys:  70, 90, 85, 55, 65
  Girls: 60, 50, 95, 80, 75

  Rank   Score   Sex
  1      50      Girl
  2      55      Boy
  3      60      Girl
  4      65      Boy
  5      70      Boy
  6      75      Girl
  7      80      Girl
  8      85      Boy
  9      90      Boy
  10     95      Girl

(Note: the direction of the ranking doesn't matter.)
How Nonparametric Tests Work (continued)
Step 2: compute the sum of the ranks per group
Step 3: use the sum (or some function) of the ranks to do the statistics

  Rank   Score   Sex    Boys' rank   Girls' rank
  1      50      Girl                1
  2      55      Boy    2
  3      60      Girl                3
  4      65      Boy    4
  5      70      Boy    5
  6      75      Girl                6
  7      80      Girl                7
  8      85      Boy    8
  9      90      Boy    9
  10     95      Girl                10

  Sum of ranks: Boys = 28, Girls = 27
What about ties? Use the average of the ranks of the tied scores.

Test scores:
  Boys:  70, 90, 85, 55, 65
  Girls: 60, 50, 95, 85, 75

  Rank   Score   Sex    Boys' rank   Girls' rank
  1      50      Girl                1
  2      55      Boy    2
  3      60      Girl                3
  4      65      Boy    4
  5      70      Boy    5
  6      75      Girl                6
  7.5    85      Girl                7.5
  7.5    85      Boy    7.5
  9      90      Boy    9
  10     95      Girl                10
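A minimal sketch of this ranking step in Python (assumes scipy is available), using the tied scores above. scipy's rankdata assigns tied observations their average rank, matching the 7.5 shared by the two scores of 85.

```python
from scipy.stats import rankdata

boys  = [70, 90, 85, 55, 65]
girls = [60, 50, 95, 85, 75]

ranks = rankdata(boys + girls)       # average ranks for ties by default

boys_ranks, girls_ranks = ranks[:5], ranks[5:]
print(boys_ranks, sum(boys_ranks))    # ranks 5, 9, 7.5, 2, 4 -> sum 27.5
print(girls_ranks, sum(girls_ranks))  # ranks 3, 1, 10, 7.5, 6 -> sum 27.5
```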
Commonly Used Nonparametric Tests
Wilcoxon Rank Sum Test
• Also called the Mann-Whitney test
• Used to compare two independent groups
• Similar to a two-sample t test but doesn't require the data to be normally distributed
• A nonparametric test to compare the central tendencies of two groups
• What does it assume? Random samples
• Test statistic: U
• Distribution under H0: the U distribution, with sample sizes n1 and n2
Formulae

  U1 = n1·n2 + n1(n1 + 1)/2 − R1
  U2 = n1·n2 − U1

  n1 = sample size of group 1
  n2 = sample size of group 2
  R1 = sum of the ranks of group 1
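A minimal sketch (using the earlier hypothetical scores; assumes scipy) applying these formulae by hand and cross-checking the p-value with scipy's implementation of the same test. Note that scipy reports the complementary definition of U, but the resulting p-value is the same.

```python
from scipy import stats
from scipy.stats import rankdata

group1 = [70, 90, 85, 55, 65]      # boys' scores from the earlier example
group2 = [60, 50, 95, 80, 75]      # girls' scores

n1, n2 = len(group1), len(group2)
ranks = rankdata(group1 + group2)
R1 = sum(ranks[:n1])                        # sum of ranks of group 1 (= 28 here)

U1 = n1 * n2 + n1 * (n1 + 1) / 2 - R1       # slide formula: U1 = 12
U2 = n1 * n2 - U1                           # U2 = 13; compare the larger with the U table

res = stats.mannwhitneyu(group1, group2, alternative='two-sided')
print(U1, U2, res.pvalue)
```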
Mann-Whitney U test
• Null hypothesis: the two groups have the same median.
• Test statistic: U1 or U2 (use the larger).
• Compare U with the null distribution (the U distribution with sample sizes n1 and n2):
  how unusual is this test statistic?
  – If P < 0.05: reject H0
  – If P > 0.05: fail to reject H0
Chi-square test
What is it?
• A test of proportions
• A nonparametric test
• Dichotomous variables are used
• Tests the association between two factors,
  e.g. treatment and disease, or gender and mortality
• It is the only test that can be used both as a parametric and as a nonparametric test.
• The test we use to measure the differences between what is observed and what is
  expected according to an assumed hypothesis is called the chi-square test.
Important
• The chi-square test can only be used on data that have the following characteristics:
  – The data must be in the form of frequencies.
  – The frequency data must have a precise numerical value and must be organised into
    categories or groups.
  – The expected frequency in any one cell of the table must be greater than 5.
  – The total number of observations must be greater than 20.
Formula

  χ² = ∑ (O − E)² / E

  χ² = the value of chi-square
  O  = the observed value
  E  = the expected value
  ∑ (O − E)² / E = the values of (O − E)², each divided by E, then added together
Observed Frequencies (O)

  Post codes      LE1   LE2   LE3   LE4   LE5 & LE6   Row total
  Old industry     9    13    10    10       8           50
  Food industry    4     3     5     9      21           42
  Column total    13    16    15    19      29           92

(Note: although there are 3 cells in the table that are not greater than 5, these are
observed frequencies. It is only the expected frequencies that have to be greater than 5.)
Expected frequency = (row total × column total) / grand total

E.g. the expected frequency for Old industry in LE1 = (50 × 13) / 92 = 7.07.
Filling in every cell gives the table of expected frequencies below.
Expected Frequencies (E)

  Post codes      LE1    LE2    LE3    LE4    LE5 & LE6   Row total
  Old industry    7.07   8.70   8.15   10.33    15.76        50
  Food industry   5.93   7.30   6.85    8.67    13.24        42
  Column total   13     16     15      19       29           92
For each cell, compute (O − E)² / E.

E.g. for Old industry in LE1: (9 − 7.07)² / 7.07 = 0.53.
Doing this for every cell gives the table below.
(O − E)² / E values

  Post codes      LE1    LE2    LE3    LE4    LE5 & LE6
  Old industry    0.53   2.13   0.42   0.01     3.82
  Food industry   0.63   2.54   0.50   0.01     4.55

Add up all of the above numbers to obtain the value of chi-square: χ² = 15.14.
• Look up the significance tables, with degrees of freedom = (rows − 1) × (columns − 1).
  These will tell you whether to accept the null hypothesis or reject it.
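A minimal sketch reproducing this worked example with scipy (assumes scipy is available). chi2_contingency computes the expected frequencies and the statistic in the same way as the hand calculation above.

```python
from scipy.stats import chi2_contingency

observed = [[9, 13, 10, 10, 8],     # Old industry, LE1 .. LE5 & LE6
            [4,  3,  5,  9, 21]]    # Food industry

chi2, p, dof, expected = chi2_contingency(observed)
print(chi2, dof, p)        # chi2 about 15.1 with dof = 4; reject H0 if p < 0.05
print(expected[0][0])      # about 7.07, matching the hand-computed cell
```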
• Wilcoxon rank sum test ~ t test
  – (more commonly called the Mann-Whitney test)
• Wilcoxon signed rank test ~ paired t test
• Kruskal-Wallis test ~ ANOVA (like a t test or rank sum test with more than 2 groups)
Parametric tests
• The most commonly used tests include:
  • Z test
  • t test
  • F test
  • ANOVA
t test: origin
• Founder: W. S. Gosset
• Wrote under the pseudonym "Student"; hence known as Student's t test
• (Reportedly worked mostly at tea (t) time)
• Preferable when n < 60
• Certainly if n < 30
Is there a difference?
Between the means: which one is "meaner"?

Statistical Analysis
• We compare the control group mean with the treatment group mean: is there a difference?
• What does "difference" mean? The mean difference can be the same in cases of high,
  medium, or low variability, but the lower the variability within the groups, the more
  convincing the same difference in means becomes.
So we estimate

  signal / noise = (difference between group means) / (variability of the groups)

  t = (X̄T − X̄C) / SE(X̄T − X̄C)

i.e. the t value is a signal-to-noise ratio.
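A minimal sketch of this signal-to-noise calculation (hypothetical data; assumes numpy and scipy): the hand-computed ratio matches Welch's two-sample t statistic, which uses the same standard error of the difference.

```python
import numpy as np
from scipy import stats

treatment = np.array([5.1, 4.8, 6.0, 5.5, 5.9, 6.2])
control   = np.array([4.2, 4.9, 4.6, 5.0, 4.4, 4.7])

signal = treatment.mean() - control.mean()
noise = np.sqrt(treatment.var(ddof=1) / len(treatment)
                + control.var(ddof=1) / len(control))   # SE of the difference

t_manual = signal / noise
t_scipy, p = stats.ttest_ind(treatment, control, equal_var=False)  # Welch's t
print(t_manual, t_scipy, p)
```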
Probability, p
• From the t value we obtain the probability (p value)
• Reject or do not reject the null hypothesis accordingly
• Reject H0 if p < 0.05 (or smaller)
Types
• One sample: compare a sample mean with a population value
• Unpaired: compare a treatment group with a control group (independent samples)
• Paired: same subjects measured pre and post
• Z test: for large samples (n > 60)
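A minimal sketch of the three t test variants with scipy (hypothetical blood-pressure data; assumes numpy and scipy are available).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
sample  = rng.normal(120, 15, size=40)                  # e.g. systolic BP readings
control = rng.normal(118, 15, size=40)
pre, post = sample, sample - rng.normal(3, 2, size=40)  # same subjects, before/after

print(stats.ttest_1samp(sample, popmean=120))   # one sample vs. a population value
print(stats.ttest_ind(sample, control))         # unpaired: two independent groups
print(stats.ttest_rel(pre, post))               # paired: pre vs. post on the same subjects
```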
Test direction
• One-tailed t test
  e.g. "Mean systolic BP in nephritis is significantly higher than that of normal
  persons": all of α (0.05) is placed in one tail of the distribution.
• Two-tailed t test
  e.g. "Mean systolic BP in nephritis is significantly different from that of normal
  persons": α is split, with 0.025 in each tail.
Limitations (general)
• Fails to gauge the magnitude of the difference between the two means
  (solution: report a confidence interval)
• Only compares 2 groups
  (solution: if there are more than 2 groups, use ANOVA)
Normal curve test (Z test)
• It is used to test differences between mean values based on large samples.
• It can be used for comparing:
  1) two sample means
  2) a sample mean with a population mean
  3) two sample proportions
  4) a sample proportion with a population proportion
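A minimal sketch of case 1 (hypothetical data; assumes numpy and scipy): comparing two sample means with large samples, using the critical ratio Z = difference / SE of the difference.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
group1 = rng.normal(130, 12, size=120)      # large samples (n > 60 in each group)
group2 = rng.normal(126, 12, size=150)

diff = group1.mean() - group2.mean()
se = np.sqrt(group1.var(ddof=1) / len(group1) + group2.var(ddof=1) / len(group2))

z = diff / se                               # the critical ratio
p = 2 * (1 - norm.cdf(abs(z)))              # two-tailed p value from the normal curve
print(z, p)
```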
Requisite conditions for the application of the normal curve test
• The samples should be randomly selected.
• The data must be quantitative.
• The variable under study is assumed to follow a normal distribution in the population.
• The sample size must be large.
Steps involved in this test
• 1) Statement of the null hypothesis and the alternative hypothesis.
• 2) Calculation of the standard error and the critical ratio.
• 3) Fixation of the level of significance, or reporting the exact level of significance.
• 4) Comparison of the calculated value with the theoretical (table) value.
• 5) Drawing the inference.


Variance Ratio Test (F test)
• This test was developed by Fisher and Snedecor.
• It is used to compare the variances (SD²) of two groups or samples.
• Variance ratio: F = variance 2 / variance 1 (when variance 2 > variance 1), i.e. the
  larger variance is divided by the smaller.
• The calculated F is compared with the table (critical) value of F at the chosen level
  of significance.
• If the calculated F is higher than the table value, the variances are significantly
  different from each other.
• If the calculated F is lower than the table value, the variances of the two samples
  are much the same and the difference is not significant.
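A minimal sketch of the variance ratio test (hypothetical data; assumes numpy and scipy): form F as the larger sample variance over the smaller and compare it with the F table value at the 5% level.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(4)
sample1 = rng.normal(50, 8, size=25)
sample2 = rng.normal(50, 12, size=30)

v1, v2 = sample1.var(ddof=1), sample2.var(ddof=1)
if v2 > v1:
    F = v2 / v1
    dfn, dfd = len(sample2) - 1, len(sample1) - 1
else:
    F = v1 / v2
    dfn, dfd = len(sample1) - 1, len(sample2) - 1

F_critical = f.ppf(0.95, dfn, dfd)          # table value at the 5% level
print(F, F_critical, "significant" if F > F_critical else "not significant")
```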
ANOVA (Analysis of Variance)
• What is it for? Testing the difference among k means simultaneously.
• What does it assume? The variable is normally distributed with equal standard
  deviations (and variances) in all k populations; each sample is a random sample.
• Test statistic: F
Quick Reference Summary: ANOVA (Analysis of Variance)
• Formulae:

  F = MSgroup / MSerror

  MSgroup = SSgroup / dfgroup,   dfgroup = k − 1
  MSerror = SSerror / dferror,   dferror = N − k

  SSgroup = Σ ni (Ȳi − Ȳ)²
  SSerror = Σ si² (ni − 1)

  Ȳi = mean of group i,   ni = size of sample i
  Ȳ  = overall mean,      N  = total sample size,   k = number of groups
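A minimal sketch (hypothetical data; assumes numpy and scipy) computing F from these formulae and cross-checking it with scipy's one-way ANOVA.

```python
import numpy as np
from scipy import stats

groups = [np.array([6.1, 5.8, 6.4, 6.0]),
          np.array([5.2, 5.5, 5.1, 5.7, 5.4]),
          np.array([6.8, 7.0, 6.5, 6.9])]

k = len(groups)
N = sum(len(g) for g in groups)
grand_mean = np.concatenate(groups).mean()

ss_group = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_error = sum(g.var(ddof=1) * (len(g) - 1) for g in groups)

ms_group = ss_group / (k - 1)
ms_error = ss_error / (N - k)
F_manual = ms_group / ms_error

F_scipy, p = stats.f_oneway(*groups)
print(F_manual, F_scipy, p)     # the two F values agree; reject H0 if p < 0.05
```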
ANOVA
• Null hypothesis: all k groups have the same mean.
• Test statistic: F = MSgroup / MSerror
• Compare F with the null distribution: the F distribution with k − 1 and N − k degrees
  of freedom. How unusual is this test statistic?
  – If P < 0.05: reject H0
  – If P > 0.05: fail to reject H0
Conclusion
• "It is nothing but the truth."
• It has only a single aim: "to improve the efficiency of the action program."
Thank you
