INSTRUCTIONS:

To complete this assignment:


Conduct the analyses using Stata as per the computing exercises for the modules.
Include only relevant Stata output as part of the assignment or marks will be deducted,
but do not include more than one copy of each table/graph.
Do not submit Stata output separately with your assignment: the relevant Stata
output should be copied and pasted into your assignment for corresponding questions.
Submit your assignment as ONE Word document.

Assignment 2: BIOSTATISTICS
(Total marks 50 - to be scaled to 25%)
QUESTION 1
(Total: 24 marks)
To assess the effect of two exercise programs on lowering the maximal heart rate (beats per minute) of
athletes after a treadmill test, a fictitious data set, BSHrate2015.dta, was obtained from a trial
of 60 athletes who were randomised to one of two exercise groups (aerobic and strength). Equal numbers
of female and male athletes were included.
The differences in athletes' fitness levels, as measured by heart rate after a treadmill test,
between the two exercise programs (aerobic and strength) are of interest. Gender differences need to
be accounted for as well. In this question, you are given one continuous dependent variable Y
(HRATE) and two categorical independent variables X (exercise program and gender), as listed in Table 1:

Table 1. Variables in BSHrate2015.dta

Variable    Description
HRATE       Heart rate after treadmill test (beats/minute)
GROUP       1 = Aerobic, 2 = Strength
GENDER      1 = Female, 2 = Male

Copy and paste relevant Stata output into each question if applicable, and fully interpret the
results you present.

1. Conduct exploratory analyses using descriptive statistics and plots.


1.1 Assess the normality of the dependent variable Y. (1 mark)
A. List the measures you used to assess normality.
B. Draw a conclusion from your assessment.
1.2 Test (independent samples t test or ANOVA) the association between the dependent
variable Y and each categorical independent variable X to assess the strength of the
association between the Xs and Y. i.e., for each independent variable, are there
significant differences in population mean heart rate between the groups defined by
the independent variable X?
A. State the test you used.
B. Draw a conclusion from your test, reporting the p value.
(4 marks)

2. Obtain the mean heart rate, standard deviation (both to two decimal places) and number of athletes for
each gender group under each program type, and complete the following table:

Gender    Program     Heart rate
                      mean       s.d.
Female    Aerobic
          Strength
          Total
Male      Aerobic
          Strength
          Total
Total     Aerobic
          Strength
          Total

(4 marks)
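
As a cross-check on the Stata output for this table, the cell statistics can be recomputed by hand. The sketch below is an illustration only, written in Python rather than Stata; the function name cell_summary and the (gender, program, hrate) tuple layout are assumptions, not part of the assignment. It returns the number, mean and sample standard deviation for every cell and every "Total" margin, rounded to two decimal places.

```python
from collections import defaultdict
from statistics import mean, stdev


def cell_summary(rows):
    """Return {(gender, program): (n, mean, sd)}, including 'Total' margins.

    rows: iterable of (gender, program, hrate) tuples (hypothetical layout).
    Every cell needs at least two observations, because the sample
    standard deviation is undefined for a single value.
    """
    cells = defaultdict(list)
    for gender, program, hrate in rows:
        # Each observation counts in its own cell and in three margins.
        for key in ((gender, program), (gender, "Total"),
                    ("Total", program), ("Total", "Total")):
            cells[key].append(hrate)
    return {
        key: (len(values), round(mean(values), 2), round(stdev(values), 2))
        for key, values in cells.items()
    }
```

The nine keys of the returned dictionary correspond to the nine rows of the table above. The submitted figures should still come from Stata, as the instructions require.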
3. Perform regression analysis.


3.1 Name the multiple regression model you think is appropriate for this question.
(1 mark)
3.2 Fit the model you recommended in Question 3.1 for hrate on program and gender.
(2 marks)
3.3 Based on the results you obtained in Question 3.2, test the hypothesis that there is no
interaction in the population between program and gender, using information from the
ANOVA table. Comment on whether a further model is necessary.
(2 marks)

4. Evaluate the goodness-of-fit of the final model.


4.1 List the measures/criteria you used to evaluate the goodness-of-fit of the final model.
4.2 Interpret and comment on the measures/criteria, with a conclusion.
(2 marks)

5. Make detailed interpretations and conclusions based on your final model.


5.1 Write down the regression equation (with estimated regression coefficients rounded
to two decimal places).
(1 mark)

5.2 Draw a detailed conclusion for the final model.


(5 marks)
6. Based on your final model,
6.1 Interpret the constant in the final model.
6.2 Calculate the predicted mean heart rate for female athletes who participated in the
aerobic program.
(2 marks)
QUESTION 2
(Total: 26 marks)
In this question, the interest lies in exploring the predictors of obesity as measured by a person's
body mass index (BMI). A fictitious data set (BSBMI2015.dta) was collected from a random sample of
102 adults. Information on the variables is given below in Table 2. Your task is to investigate the
relationship between BMI (dependent variable) and age, gender, smoking status, socio-economic
status and whether the person regularly participates in physical activity, using an appropriate
regression procedure.

Table 2. Variables in BSBMI2015.dta

Variable    Description
gender      The gender of the participant: { 1 = Male, 2 = Female }
smoking     Whether the person smokes or not: { 1 = Yes, 2 = No }
ses         The socio-economic status of the participant: { 1 = Lower, 2 = Medium, 3 = Higher }
physact     Whether the person regularly participates in physical activity: { 1 = Yes, 2 = No }
age         The age of the participant (in years)
bmi         Body mass index (in kg/m²)

Hint:
i. Treat all the explanatory variables equally i.e. there is no major variable of interest and the
aim is to build a parsimonious model to determine significant predictors of BMI.
ii. Do NOT test for interaction or confounding effects.
iii. Use Stata to obtain the statistics and analyses. Copy and paste relevant sections of the
computer output into each question if applicable, and fully interpret the results you present.

1. Exploratory analyses using descriptive statistics and plots.


1.1 Assess and comment on the normality of the dependent variable bmi.
A. List the measures you used to assess normality.
B. Draw a conclusion from your assessment.


(1 mark)
1.2 Examine the linear relationship between BMI and age. What is your conclusion?
(1 mark)
1.3 Test the association for Y (BMI) against each categorical X (independent
samples t tests or one-way ANOVA) to assess the strength of the association
between the Xs and BMI, i.e., for each factor, are there significant differences
between the groups?
A. State the test you used.
B. Draw a conclusion from your test.
(6 marks)
2. Details of your model building process.


Initially include all the independent variables, and then build a parsimonious regression
model for BMI using a backward elimination process, deleting non-significant variables one
by one. Treat all the independent variables equally, i.e., there is no major variable of interest.
(4 marks)
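
The backward elimination loop described above can be sketched as follows. This is a minimal illustration in Python, not a replacement for fitting the models in Stata: fit_pvalues is a hypothetical callable standing in for "refit the regression and read off one p value per variable", and the 0.05 removal threshold is an assumption, since the assignment does not state one. For a factor with more than two levels, such as ses, the p value supplied should be the overall test for the factor rather than that of a single dummy variable.

```python
def backward_eliminate(candidates, fit_pvalues, alpha=0.05):
    """Remove the least significant variable one at a time.

    candidates  : list of variable names in the full starting model.
    fit_pvalues : hypothetical callable; given the current list of names,
                  it refits the model and returns {name: p value}.
    alpha       : removal threshold (0.05 is an assumption).
    Returns (kept_variables, removal_order).
    """
    kept = list(candidates)
    removed = []
    while kept:
        pvals = fit_pvalues(kept)  # refit with the remaining variables
        worst = max(kept, key=lambda name: pvals[name])
        if pvals[worst] <= alpha:
            break  # every remaining variable is significant: stop
        kept.remove(worst)  # drop only one variable per refit
        removed.append(worst)
    return kept, removed
```

The key design point is that the model is refitted after every single removal, because the p values of the remaining variables can change once a variable has been dropped.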

3. Assessment of assumptions for the final model obtained in Question 2 above (include your
interpretations and conclusions).
3.1 Assess and comment on the normality of the standardised residuals;
3.2 Assess and comment on the assumption of the constant variation;
3.3 Assess and comment on the assumption of equal variances.
(3 marks)

4. Assess the goodness-of-fit of the final model.


4.1 Interpret and comment on the adjusted R² value.
4.2 Check and comment on the values of the standardised residuals.
(2 marks)
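
For reference when reading the Stata output for 4.1, adjusted R² penalises R² for the number of predictors: adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the sample size and p is the number of estimated slope coefficients (excluding the constant). A small Python sketch of this formula follows; the function and argument names are illustrative assumptions.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared from R-squared, sample size n and p predictors.

    p counts estimated slope coefficients and excludes the constant.
    """
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 residual degrees of freedom")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)
```

Stata reports this value directly in the regression output, so the function is only a way to see how R², n and p combine.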

5. Detailed interpretations and conclusions.


5.1 Write down the regression equation based on the final model (rounded to two
decimal places).
(1 mark)
5.2 Write your own detailed conclusions with regard to the relationships between all the
independent variables listed in Table 2 and BMI, based on the analyses you
conducted.
(2 marks)
5.3 Interpret the regression coefficients and their confidence interval(s) to describe
the relationships between the independent variables and BMI based on the final
model.
(4 marks)
6. Based on your final model,


6.1 Obtain the mean predicted value of BMI for a 30-year-old person who participated in
physical activity (rounded to two decimal places).
6.2 Mrs X concluded that the difference in the mean predicted BMI between her 15-year-old
twin daughters (one participated in physical activity and the other did not) is 1.57. Do you
think her conclusion is correct? Justify your answer.
(2 marks)
