
QTM 100, Spring 2014

Lab 12

PRACTICE
For this lab we will use Dr. Craig Hadley's data set on the social determinants
of wellbeing in adolescents in Ethiopia. The data set is ethiopia.csv; because it
is a comma-separated value file, you must specify comma field separators to
import the data correctly.
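If you prefer the R console to the R Commander menus, the import might look like the sketch below. Because ethiopia.csv does not accompany this handout, the sketch first writes a tiny stand-in file. Its three rows are invented for illustration; only the column names age and muac_cm come from this lab.

```r
# Hypothetical stand-in for ethiopia.csv: the rows are invented,
# only the column names age and muac_cm are taken from the lab.
path <- tempfile(fileext = ".csv")
writeLines(c("age,muac_cm",
             "13,20.5",
             "14,21.0",
             "15,22.3"), path)

# sep = "," tells R that fields are comma separated; header = TRUE
# uses the first line of the file as variable names.
demo <- read.csv(path, header = TRUE, sep = ",")

# Check that the import produced the expected columns and types.
str(demo)
summary(demo)
```

With the real file, the equivalent call would be read.csv("ethiopia.csv", header = TRUE, sep = ","), assuming the file sits in your working directory.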
In the lab practice we investigated factors related to bmi. For your
practice, let's investigate factors related to mid-upper arm circumference
(muac_cm). We want to determine whether mid-upper arm circumference differs
by age and gender.
1. Perform a basic summary of your data set. You should see that age is
considered as a numeric variable (you see a numeric summary of the age
variable). For this practice, let's consider age as a categorical variable. That
is, let's consider the 13 year olds, the 14 year olds, etc., as independent
groups. To do so, you will need to create a new categorical variable in R
related to the age categories.
Data -> Manage variables in active data set -> Convert numeric variables to factors
Highlight age.
Select "use numbers" for factor levels.
Name your new variable age_category.

Now identify how many teenagers you have in each age group.
a. 13 year olds, n= 423
b. 14 year olds, n= 444
c. 15 year olds, n= 446
d. 16 year olds, n= 355
e. 17 year olds, n= 265
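The menu steps above correspond roughly to the following R commands. This is a sketch on a short made-up age vector, not the lab data. In the lab you would apply the same calls to the age column of the imported data set.

```r
# Hypothetical ages; in the lab this would be the age column of the data.
age <- c(13, 14, 14, 15, 16, 17, 17, 13, 15, 15)

# factor() keeps the numbers as level labels, similar in spirit to
# choosing "use numbers" in the R Commander dialog.
age_category <- factor(age)

# Count the observations in each age group.
counts <- table(age_category)
print(counts)
```

The summary of a factor reports counts per level rather than a numeric summary, which is a quick way to confirm the conversion worked.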

2. Describe the distribution of mid-upper arm circumference. What are the
lowest and highest values mid-upper arm circumference takes on? Right
skewed, highest: 54, lowest: 13
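One way to check the shape and the extremes from the console is sketched here. The measurements are hypothetical. In the lab you would use the muac_cm column instead.

```r
# Hypothetical measurements in cm; in the lab use the muac_cm column.
muac_cm <- c(18.2, 19.5, 20.1, 20.4, 21.0, 21.3, 22.8, 24.6, 31.0)

# range() returns the lowest and highest values.
lowest_highest <- range(muac_cm)
print(lowest_highest)

# A mean noticeably above the median is one informal sign of right skew.
print(c(mean = mean(muac_cm), median = median(muac_cm)))

# The histogram shows the shape of the distribution directly.
hist(muac_cm, main = "Mid-upper arm circumference", xlab = "muac_cm")
```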
3. Produce a figure showing the relationship between mid-upper arm
circumference and age category.
a. Which age group has the lowest median mid-upper arm circumference?
13
b. What is the general trend in mid-upper arm circumference by age
group? Mid-upper arm circumference increases as age increases
c. Which age group has the largest outlier in mid-upper arm
circumference? 15911
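A side-by-side boxplot is one figure that answers these questions. The sketch below uses a small invented data frame whose column names follow the lab. The values themselves are assumptions.

```r
# Hypothetical data frame; only the column names follow the lab.
demo <- data.frame(
  muac_cm = c(19.0, 19.8, 20.5, 20.2, 21.1, 21.9, 21.5, 22.4, 30.0),
  age_category = factor(rep(c(13, 14, 15), each = 3))
)

# One box per age group; the formula reads "muac_cm by age_category".
bp <- boxplot(muac_cm ~ age_category, data = demo,
              xlab = "Age category", ylab = "muac_cm")

# Row 3 of bp$stats holds the median of each group.
medians <- bp$stats[3, ]
names(medians) <- bp$names
print(medians)
```

Points drawn beyond the whiskers are the outliers. bp$out lists their values and bp$group says which box each belongs to.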
4. Is there any evidence that mid-upper arm circumference differs significantly
by age category?
a. What are the null and alternative hypotheses?
H0: μ1 = μ2 = μ3 = μ4 = μ5
Ha: at least one mean is different
b. Assessing (part of the) assumptions.
i. Calculate the standard deviation of mid-upper arm
circumference for each age group (rounded to one decimal).
13 year olds, sd= 0.1
14 year olds, sd= 0.2
15 year olds, sd= 0.2
16 year olds, sd= 0.2
17 year olds, sd= 0.2
ii. Which age group has the most variability in mid-upper arm
circumference? 17
iii. Do you think it would be reasonable to assume that mid-upper
arm circumference in each age group has approximately equal
variance? yes
c. Perform the hypothesis test.
i. What is the value of the test statistic (rounded to one decimal)?
157.5
ii. What is the p-value (rounded to four decimals)? < 2.2e-16
d. What is your decision about H0 (reject or fail to reject H0 at
alpha=0.05)? Reject
e. What is your conclusion? There is enough evidence to suggest that
at least one age category has a different mean mid-upper arm circumference
f. What is the estimate of the standard deviation of mid-upper arm
circumference within each age category (rounded to one decimal)?
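The pieces of question 4 can be sketched in R as follows. The data frame is a small hypothetical example with three age groups, so its numbers will not match the lab answers.

```r
# Hypothetical data; only the column names follow the lab.
demo <- data.frame(
  muac_cm = c(19.0, 19.8, 20.5, 20.2, 21.1, 21.9, 21.5, 22.4, 23.3),
  age_category = factor(rep(c(13, 14, 15), each = 3))
)

# Standard deviation within each age group (equal-variance check).
group_sd <- tapply(demo$muac_cm, demo$age_category, sd)
print(round(group_sd, 1))

# One-way ANOVA; the summary table holds the F statistic and p-value.
fit <- aov(muac_cm ~ age_category, data = demo)
tab <- summary(fit)[[1]]
print(tab)
f_stat <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]

# Estimated within-group SD: square root of the residual mean square.
pooled_sd <- sqrt(deviance(fit) / df.residual(fit))
print(round(pooled_sd, 1))
```

The last quantity is what part f asks for. It pools the variability of all groups into one estimate, which is only sensible if the equal-variance assumption in part b looks reasonable.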
5. If you rejected H0, perform the Tukey pairwise comparisons.
a. For all confidence intervals presented, in how many do you conclude
that the mean of one age category is significantly different from the
mean of the other age category?
b. Which confidence interval is closest to containing the null value of
zero?
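A minimal sketch of the Tukey comparisons, again on hypothetical data with three age groups, is shown here. TukeyHSD() is the base R function for this. The data values are assumptions.

```r
# Hypothetical data; only the column names follow the lab.
demo <- data.frame(
  muac_cm = c(19.0, 19.8, 20.5, 20.2, 21.1, 21.9, 21.5, 22.4, 23.3),
  age_category = factor(rep(c(13, 14, 15), each = 3))
)
fit <- aov(muac_cm ~ age_category, data = demo)

# Tukey pairwise comparisons with 95% family-wise confidence intervals.
tukey <- TukeyHSD(fit, conf.level = 0.95)
print(tukey)

# Each row is one pair of groups with columns diff, lwr, upr, p adj.
ci <- tukey$age_category

# An interval excludes zero when both limits have the same sign.
differs <- ci[, "lwr"] * ci[, "upr"] > 0
print(differs)
```

Counting the TRUE entries of differs answers part a. The interval whose nearer limit is closest to zero answers part b.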
6. Is there any evidence that mid-upper arm circumference differs by gender in
addition to age group?
a. First, evaluate the interaction between gender and age group. What
are the null and alternative hypotheses?
b. What is the p-value for the significance test of the interaction (rounded
to four decimals)?
c. At alpha=0.05, is the interaction statistically significant?
d. If not, estimate a two-way ANOVA model with just gender and age
group but no interaction between the two. What are the null and
alternative hypotheses?
e. What is the p-value regarding the significance of gender (rounded to
four decimals)?
f. What do you conclude about the effect of gender in addition to age
category?
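The two models in question 6 could be fit along these lines. The data are invented and balanced (two observations per cell), and the column name gender is an assumption, since the handout does not give the variable's name in ethiopia.csv.

```r
# Hypothetical balanced data; the column name gender is assumed.
demo <- data.frame(
  muac_cm = c(19.0, 19.6, 20.1, 20.9, 20.4, 21.0,
              21.6, 22.2, 21.8, 22.6, 23.1, 23.7),
  age_category = factor(rep(c(13, 14, 15), each = 4)),
  gender = factor(rep(rep(c("female", "male"), each = 2), times = 3))
)

# Model with interaction: the * expands to both main effects
# plus the age_category:gender interaction term.
fit_int <- aov(muac_cm ~ age_category * gender, data = demo)
tab_int <- summary(fit_int)[[1]]
print(tab_int)
p_interaction <- tab_int[["Pr(>F)"]][3]  # third row is the interaction

# Additive model: main effects only, no interaction.
fit_add <- aov(muac_cm ~ age_category + gender, data = demo)
tab_add <- summary(fit_add)[[1]]
print(tab_add)
p_gender <- tab_add[["Pr(>F)"]][2]       # second row is gender
```

Note that aov() reports sequential sums of squares. With unbalanced data such as the lab's, the results can depend on the order of the terms and may differ from the table a menu-driven analysis produces.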
7. In light of the above exercises, why do you think it might be important to
account for gender or age group when trying to determine if another variable
like place of residence or household income is associated with health?
8. Practice with the F-distribution. To plot the F-distribution, go to
Distributions -> Continuous distributions -> F distributions -> Plot F distribution
Select "Plot density function".
a. Sketch an F distribution with df1=1 and df2=30. What is the p-value
(rounded to four decimals) for an F test statistic of
i. 2
ii. 5
b. Sketch an F distribution with df1=30 and df2=30. What is the p-value
(rounded to four decimals) for an F test statistic of
i. 2
ii. 5
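The p-value of an F test is the area under the F density to the right of the test statistic. In R, pf() with lower.tail = FALSE gives that area and df() gives the density for sketching. The following is a small sketch for the four cases above.

```r
# Upper-tail areas: P(F > f) for each test statistic and df pair.
p_values <- c(
  f2_df1_1  = pf(2, df1 = 1,  df2 = 30, lower.tail = FALSE),
  f5_df1_1  = pf(5, df1 = 1,  df2 = 30, lower.tail = FALSE),
  f2_df1_30 = pf(2, df1 = 30, df2 = 30, lower.tail = FALSE),
  f5_df1_30 = pf(5, df1 = 30, df2 = 30, lower.tail = FALSE)
)
print(round(p_values, 4))

# Density curves; start slightly above 0 because the df1 = 1
# density grows without bound as x approaches 0.
curve(df(x, df1 = 1, df2 = 30), from = 0.01, to = 6,
      xlab = "F", ylab = "Density", ylim = c(0, 1.5))
curve(df(x, df1 = 30, df2 = 30), from = 0.01, to = 6,
      add = TRUE, lty = 2)
legend("topright", legend = c("df1=1, df2=30", "df1=30, df2=30"),
       lty = c(1, 2))
```

Comparing the two curves shows why the same test statistic gives different p-values: the shape of the F distribution, and so its tail area, depends on both degrees of freedom.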
