Page 1 of 15

Name ____Vanessa Sanchez

NUR 627: Advanced Epidemiology and Biostatistics for Nursing


Assignment 2; DUE APRIL 10, 2016
This test includes two sections.
== The FIRST SECTION includes 19 questions
== The SECOND SECTION includes 7 questions related to analyzing and interpreting the data.
Download and use the file sample_data_Assignment2_Spring.sav
The total score is 45 points (20% of the grade).
Name the file with your last name and attach the file to blackboard in Assignment2 or email
the answers to magdashaheen@cdrewu.edu by 4-10-2016

1. What is a Type I error? Give example [TWO POINTS]


Type I error: a false positive. It occurs when the null hypothesis is true but is rejected
by the test.
Example: The alpha-fetoprotein (AFP) test screens the mother's blood during pregnancy for
AFP to assess risk. Abnormally high or low levels may indicate Down syndrome.
If a Type I error (false positive) occurs, the test wrongly indicates that the patient may have a
child with Down syndrome, which could lead to a pregnancy being terminated for no reason.

2. What type of measurement can be used to measure the following variables: fasting
glucose level, pain scale score, and marital status? [THREE POINTS]
Fasting glucose level: Ratio (quantitative, continuous), because it has a true zero and can take
any value within the clinically possible range.
Pain scale score: Ordinal, because the categories have a natural order depending on the
score (none, mild, moderate, severe).
Marital status: Nominal (qualitative), because the categories have no order.

3. What type of statistics can be used to describe the variables fasting glucose level, pain
scale score, and marital status? [THREE POINTS]
Fasting glucose level: Parametric, for example a two-sample t-test.
Pain scale score: Nonparametric, for example a chi-square test or a Spearman rank correlation coefficient.
Marital status: Nonparametric, for example a chi-square test.

4. What is Type II error? Give example [TWO POINTS]


When the null hypothesis is false and you fail to reject it, you make a Type II error (a false
negative). The probability of making a Type II error is β (beta), which depends on the power of
the test (power = 1 - β). You can decrease your risk of committing a Type II error by ensuring
your test has enough power, for example by making the sample size large enough to detect a
practical difference when one truly exists.
Example: A medical researcher wants to compare the effectiveness of two medications.
Null hypothesis: The two medications are equally effective.
A Type II error occurs if the researcher concludes that the medications are the same when, in
fact, they are different. This could potentially be life threatening if the less effective
medication is sold to the public.
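The link between sample size, power, and the Type II error rate can be sketched numerically. This is a minimal illustration, assuming a two-sided two-sample z-test with equal group sizes and a known common standard deviation; the function name and the example numbers are hypothetical and are not taken from the assignment.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    delta: true difference between the group means (assumed)
    sigma: common standard deviation of both groups (assumed known)
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value for alpha
    se = sigma * sqrt(2 / n_per_group)          # SE of the mean difference
    # Probability of landing beyond the critical value in the direction
    # of the true effect (the negligible opposite tail is ignored).
    return std_normal.cdf(abs(delta) / se - z_crit)


for n in (20, 64, 200):
    power = approx_power(delta=0.5, sigma=1.0, n_per_group=n)
    print(f"n per group = {n:3d}  power = {power:.2f}  beta = {1 - power:.2f}")
```

As the group size grows, power rises and β falls, which is the sense in which a larger sample reduces the risk of a Type II error.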
5. t statistics = -3.25 and p-value=0.03 describe the difference between women and men for
mental health score. If alpha is set to 0.05, is the p-value of 0.03 statistically significant? In a
sentence, interpret the p value of 0.03. Provide a rationale for your answer [ONE POINT]
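The p-value is statistically significant.
Rationale: Since the p-value of 0.03 is less than the alpha of 0.05, the null hypothesis of no
difference between women and men in mental health score is rejected. If there were truly no
difference, a t statistic at least as extreme as -3.25 would be expected only about 3% of the time.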

6. Was there a significant difference in Working status for the three levels of insurance
(uninsured, Medicaid-enrolled, and privately insured) if χ² = 27.39, p < .001? Provide a
rationale for your answer. [ONE POINT]
Rationale: There is a significant difference in working status across the three levels of
insurance (uninsured, Medicaid-enrolled, and privately insured). Since the p-value for the calculated
chi-square is less than .001, and therefore less than 0.05, we reject the null hypothesis; the
result is statistically significant.
7. Does a set of scores with most of its values above the mean have a negatively or positively
skewed distribution? Provide a rationale for your answer. [ONE POINT]
Rationale: It would have a negatively skewed distribution. Most of its values are above the
mean, and a few extreme low scores form a long left tail in the negative direction on the number
line. The mean is pulled in the direction of the extreme scores, so in this scenario the extreme
scores are low, and the mean is smaller than the median and lies to the left of the peak.
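A small numeric check can make the direction of the skew concrete. The scores below are hypothetical and chosen only so that most values sit above the mean; the skewness formula used is the simple moment-based (population) version.

```python
from statistics import mean, median, pstdev

# Hypothetical scores: most values are high, one extreme low score.
scores = [2, 7, 8, 8, 9, 9, 9, 10, 10, 10]

m = mean(scores)
med = median(scores)
sd = pstdev(scores)

# Moment-based (population) skewness: the average cubed z-score.
skewness = sum(((x - m) / sd) ** 3 for x in scores) / len(scores)
share_above_mean = sum(x > m for x in scores) / len(scores)

print(f"mean = {m}, median = {med}")
print(f"share of scores above the mean = {share_above_mean:.0%}")
print(f"skewness = {skewness:.2f}")
```

In this sketch the low outlier drags the mean below the median, and the skewness coefficient comes out negative.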
8. A study found that the correlation r value for the relationship between fasting glucose level
and the length of stay in the hospital is 0.62 and p-value was 0.02. If alpha is set to be 0.05, is
this r value of 0.62 statistically significant? Interpret the r value [STRENGTH AND
DIRECTION] and provide a rationale for your answer. [TWO POINTS]
The r value of 0.62 is statistically significant.
Rationale: The r value ranges from -1 to 1, and the greater the absolute value of the correlation
coefficient, the stronger the linear relationship. The value of r in this case (r = 0.62) indicates
that there is a positive linear relationship of moderate strength between fasting glucose level and
the length of hospital stay: patients with higher fasting glucose tend to stay longer. Also, the
p-value of 0.02 is less than the alpha of 0.05, so we reject the null hypothesis of no correlation;
therefore, the correlation is statistically significant.

9. If the researchers had set the level of significance or alpha=0.01, would the results of p=0.001
still be statistically significant? Provide a rationale for your answer. [ONE POINT]
Yes. The p-value of 0.001 is less than the alpha of 0.01, so the null hypothesis would still be
rejected and the result would remain statistically significant.
Use the following paragraph to answer questions 10 and 11:
A researcher randomizes 700 coronary artery disease (CAD) patients to either Medicine A or
Medicine B. The outcome of interest is myocardial infarction (MI). In the statistical section of
the published paper, the researcher writes that alpha was set at 0.05 and 700 patients were
required to achieve 80% power to detect a difference of 15% or more between the two study
arms. The researcher reports that 25% of the patients on Medicine A had an MI while 7% of
patients on Medicine B had an MI with p = 0.03.
10. Is the study result statistically significant? Provide rationale [ONE POINT]
The study result is statistically significant because the p-value of 0.03 is less than the alpha
value of 0.05. The null hypothesis of no difference between the study arms is rejected in favor of
the alternative hypothesis: among patients with CAD, the proportion who had an MI differs between
Medicine A (25%) and Medicine B (7%).

11. Write a sentence that interprets the p value of 0.03 in relation to the study results. Use 0.03 in
your sentence. [ONE POINT]
A p-value of 0.03 means that, if there were truly no difference in MI rates between Medicine A
and Medicine B, a difference at least as large as the one observed would occur by chance only
about 3% of the time.
Select one answer for the following questions and justify your answer (i.e., why you
select a specific answer):
12. What is the statistical test (procedure) that is used to determine whether a significant
difference exists between three or more group means? [ONE POINT]
A) t-test
B) ANOVA
C) Correlation coefficient
D) Mann Whitney U test


Rationale: The ANOVA test is used to test differences in three or more group means or
means for two or more independent variables.
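The F statistic behind a one-way ANOVA compares the variation between group means with the variation within groups. The following is a minimal sketch of that calculation; the function name and the three small groups of scores are hypothetical and serve only as an illustration.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group means versus the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: values versus their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for three groups.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A larger F means the group means differ more than would be expected from the spread inside each group.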
13. What type of hypothesis is represented by the statement women who smoke are as
likely to have low-birth-weight babies as women who do not? [ONE POINT]
A) Alternative hypothesis
B) Non-directional
C) Research
D) Null hypothesis
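Rationale: The statement says that smokers and non-smokers are equally likely to have
low-birth-weight babies, i.e., that there is no difference between the groups. A hypothesis of
no difference or no relationship is put forward as the basis for statistical testing and is
assumed true until the data provide evidence against it.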

14. The nurse researcher is calculating the standard deviation. What is the standard
deviation? [ONE POINT]
A) The average amount of deviation of values from the mode and is calculated for
every other score
B) The average amount of deviation of values from the median and is calculated for
every other score
C) The average amount of deviation of values from the mean and is calculated for
every score
D) The average amount of deviation of values from the median and is calculated for
every score
Rationale: The standard deviation is a measure of how spread out the values are around the mean.
Its symbol is σ (the Greek letter sigma). It is calculated as the square root of the variance.
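The calculation can be followed step by step on a handful of values. This sketch uses hypothetical scores and the sample version of the formula (dividing by n - 1), which is an assumption about which version is wanted.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical scores, used only to illustrate the calculation.
scores = [4, 8, 6, 5, 7]

m = mean(scores)
# Every score contributes its squared deviation from the mean.
squared_deviations = [(x - m) ** 2 for x in scores]
sample_variance = sum(squared_deviations) / (len(scores) - 1)
sample_sd = sqrt(sample_variance)

print(f"mean = {m}, variance = {sample_variance}, SD = {sample_sd:.3f}")
```

The hand calculation agrees with the standard library's `statistics.stdev`, which uses the same n - 1 denominator.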

15. What test would a nurse researcher use to test hypotheses about group differences in
proportions? [ONE POINT]
A) t-test
B) ANOVA
C) Correlation coefficient
D) Chi-square
Rationale: The Chi-squared test can be used to test differences in proportions between two or
more groups
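How the chi-square statistic compares observed and expected counts can be sketched as follows. The function name and the 2 x 2 table of counts are hypothetical; the expected counts are those implied by no relationship between the row and column variables.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts: rows are two groups, columns are outcome yes / no.
chi2, df = chi_square_statistic([[10, 20], [20, 10]])
print(f"chi-square({df}) = {chi2:.3f}")
```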
16. The independent variable is weight gain during pregnancy. The dependent variable is the
infant's birth weight. What is the appropriate test statistic? [ONE POINT]
A) t-test
B) ANOVA
C) Chi-square
D) Pearson's r
Rationale: Pearson's r measures the direction and strength of the linear relationship between the
independent variable (weight gain during pregnancy) and the dependent variable (infant's birth
weight), both of which are continuous. It is calculated using the mean and the standard deviation
of both variables and takes a value between +1 and -1, where +1 is a total positive correlation,
-1 is a total negative correlation, and 0 is no linear correlation at all.
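A minimal sketch of the Pearson calculation is shown below. The function name and the paired observations are hypothetical and are not drawn from any data set mentioned in the assignment.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired observations (not from the assignment data set).
weight_gain = [1, 2, 3, 4, 5]
birth_weight = [2, 4, 5, 4, 5]
print(f"r = {pearson_r(weight_gain, birth_weight):.3f}")
```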

17. The nurse researcher is reading about the standard deviation of a sampling distribution.
What is this called? [ONE POINT]
A) Sampling error
B) Standard error
C) Variance
D) Mean square
Rationale : The standard error of the mean (SEM) is the standard deviation of a sampling
distribution. It estimates the variability between sample means from multiple samples from the
same population. SEM is also a good measurement in determining how precise the mean of the
sample estimates the population mean. The lower the SEM value, the more precise the estimates
of the population mean, and vice versa.
18. What is the name for the shape of distribution that occurs when the nurse researcher
has a bell-shaped curve distribution? [ONE POINT]
A) Frequency
B) Unimodal
C) Multimodal
D) Normal
Rationale: Normal distributions are symmetrical with a single central peak at the mean
of the data (unimodal). They are neither flat nor skewed. Therefore, a normal distribution has a
bell-shaped curve. All normal distributions share the same properties: about 68% of
the observations lie within 1 standard deviation of the mean, about 95% lie within 2 standard
deviations of the mean, and about 99.7% lie within 3 standard deviations of the mean.
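The 68%, 95%, and 99.7% figures can be checked directly from the standard normal distribution. This short sketch uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Probability of falling within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```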
19. What parametric statistical method(s) a researcher can use to determine if the mean
body mass index of the population is the same for two groups of subjects (group1=diet
restriction; group2=none). (EACH IS ONE POINT =TOTAL=THREE POINTS)
A. Statistical test: a two-sample t-test. It is used to determine whether the means of
two independent groups differ. It also calculates a range of values (a confidence interval)
that is likely to include the difference between the population means.
B. Null hypothesis: the two group means are the same, i.e., the
independent variable and dependent variable are not related.
C. Alternative hypothesis: the two group means are different, i.e., the
independent variable and dependent variable are related.

USING THE DATA SET DOWNLOADED FROM BLACKBOARD


sample_data_Assignment2_Spring.sav

Answer the following questions:


1. Do frequency for the following variables and interpret the findings:
[FOUR POINTS]
Agecat (Age category), Gender, diabetes (history of diabetes), dhosp (died in hospital).

Rationale:
Age category: 36.5% are between the ages 55-64 years old. This is the highest percentage of the
group. 12.3% are 75 and older, and this is the lowest percentage by age.

Gender: 50.5% are male and 49.5% are female.


History of diabetes: 12.6% have diabetes and 87.4% have no history of diabetes.
Died in hospital: 8.6% of the population from this study is missing information. For the rest of
the results, 23.7% died in the hospital, and 67.7% did not die in the hospital.

2. Do descriptive statistics and histogram with normal distribution and interpret the results
for the following variables: [THREE POINTS]
Los_rehab (length of stay for rehabilitation), cost (total treatment and rehabilitation costs
in thousands), and fasting_glucose_level (fasting glucose level)

Descriptive Statistics

Variable                                                N     Minimum  Maximum  Mean     Std. Error  Std. Deviation
Total treatment and rehabilitation costs in thousands   1048  2.12     126.36   32.9396  .77052      24.94402
Length of stay for rehabilitation                       787   0        36       16.39    .448        12.565
Fasting glucose level                                   1048  70       165      110.45   .491        15.897
Valid N (listwise)                                      787


The histogram for length of stay for rehabilitation is unimodal and symmetrical, with a wide spread
ranging from 0 days to 36 days. The mean was 16.39 days with a standard deviation of 12.57. The
histogram for total treatment cost is unimodal and positively skewed, with a mean of 32.9 thousand
dollars and a standard deviation of 24.94. Fasting glucose level has a symmetrical, unimodal
histogram, with a range of 70 to 165, a mean of 110.45, and a standard deviation of 15.90.
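The standard errors reported with the means follow from the standard deviations and sample sizes via SE = SD / sqrt(N). The sketch below recomputes them from the values in the Descriptive Statistics table; the function name and the short labels are hypothetical.

```python
from math import sqrt


def standard_error(sd, n):
    """Standard error of the mean from a standard deviation and sample size."""
    return sd / sqrt(n)


# (label, standard deviation, N) as reported in the Descriptive Statistics table.
reported = [
    ("cost", 24.94402, 1048),
    ("los_rehab", 12.565, 787),
    ("fasting_glucose_level", 15.897, 1048),
]
for label, sd, n in reported:
    print(f"{label}: SE = {standard_error(sd, n):.3f}")
```

The recomputed values agree with the reported standard errors of .77052, .448, and .491 to the precision shown.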

3. Is there a difference between those who died in the hospital and those who did not die in
the hospital in the following variables:
- age (age at admission in years) [ONE POINT]
- fasting glucose level [ONE POINT]
a. What statistical test you will use?
An independent-samples t-test will be used to test for a difference in means between the two
independent groups (died in hospital vs. did not die in hospital).

b. Is the difference between the two groups statistically significant? Explain and
interpret the findings

Of the 958 subjects with data, 710 did not die in the hospital and 248 died in the hospital. The
mean age for those who did not die in the hospital is 61.83 with an SD of 8.793, while the mean
age for subjects who died in the hospital is 63.43. The mean fasting blood sugar is 110.38 for
those who died in the hospital and 110.57 for those who did not die in the hospital. The test
statistics for age and fasting blood glucose are -2.419 and -0.164, respectively. The age p-value
is 0.016 (less than 0.05), so the difference in age is statistically significant and we reject
the null hypothesis. The fasting blood glucose p-value is 0.87 (more than 0.05), so there is no
significant difference between the groups and we do not reject the null hypothesis.
4. Is there a difference between the Age groups in the following variables:
- fasting glucose level (fasting_glucose_level) [ONE POINT]
- los_rehab (length of stay for rehabilitation) [ONE POINT]
a. What statistical test you will use?

b. Is the difference statistically significant? Explain and interpret the finding
Group 1 (fasting glucose level and age category): The difference is not statistically significant
because the p-values of 0.158 and 0.161 are greater than 0.05; therefore, we do not reject the null
hypothesis.
Group 2 (length of stay for rehabilitation and age category): The difference is also not
statistically significant because the p-values of 0.113 and 0.110 are greater than 0.05, so we do
not reject the null hypothesis.
In other words, since neither result is statistically significant, we cannot conclude that
fasting glucose level or length of stay for rehabilitation differs between age groups.
5. Is there a correlation between age (age at admission in years), los_rehab (length of stay
for rehabilitation), cost (total treatment and rehabilitation costs in thousands) and
fasting_glucose_level (fasting glucose level)? [TWO POINTS]
a. Report the correlation coefficient (r) [direction and strength] and interpret the
results.

The correlation coefficient r is a measure of the direction and strength of a linear relationship
and ranges from -1 to +1. According to the r values displayed under Pearson Correlation, most of
the relationships in the matrix are weak and in the positive direction, with r values of 0.14,
0.10, 0.006, etc. The positive direction indicates that as one variable increases, the other tends
to increase as well, but only slightly. On a scatterplot, the points would look widely scattered
with at most a faint upward trend, so these correlations have little practical meaning.
There is, however, a moderate positive correlation of 0.611 between length of stay for
rehabilitation and total treatment and rehabilitation costs. This means that there is a moderate
linear relationship between the 2 variables: as the length of stay for rehab increases, the total
costs tend to increase as well.
6. Is there a relationship between patients' death in the hospital (dhosp) and the following
variables:
a) Diabetes (history of diabetes) [ONE POINT]
b) Bp (Blood pressure). [ONE POINT]

a. What statistical test you will use?


Pearson Chi-Square Test
b. Is any of these relationships statistically significant?
i. If yes, which relationship?

Neither chi-square test shows a statistically significant relationship between death in the
hospital and history of diabetes or blood pressure.
ii. Explain and interpret the findings.
χ²(1) = 1.028, p > .05. Since the p-value (0.311) is greater than alpha (0.05), the null
hypothesis is retained. We conclude there is no significant relationship between a patient's
history of diabetes and death in the hospital.
χ²(2) = 2.239, p > .05. Since the p-value (0.326) is greater than alpha (0.05), the null
hypothesis is retained. We conclude there is no significant relationship between a patient's
blood pressure and death in the hospital.
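The two reported p-values can be recovered from the chi-square statistics and their degrees of freedom. This is a minimal sketch using closed forms that hold only for 1 and 2 degrees of freedom; the function name is hypothetical, and a general-purpose statistics package would normally be used instead.

```python
from math import erfc, exp, sqrt


def chi_square_p_value(chi2, df):
    """Upper-tail p-value for a chi-square statistic (df = 1 or 2 only)."""
    if df == 1:
        # A chi-square(1) variable is a squared standard normal variable.
        return erfc(sqrt(chi2 / 2))
    if df == 2:
        # A chi-square(2) variable is exponential with mean 2.
        return exp(-chi2 / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")


print(f"diabetes:       p = {chi_square_p_value(1.028, 1):.3f}")
print(f"blood pressure: p = {chi_square_p_value(2.239, 2):.3f}")
```

Both results round to the p-values given in the answer (0.311 and 0.326), and both exceed the alpha of 0.05.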

7. Write a ONE PAGE summary report for the results of the study and its impact on nursing
practice (i.e., summarize the findings from question 1 to 6) [TWO POINTS]
The frequency of the variables provides descriptive information about the background
characteristics of the sample in this study. In the age category, 36.5% are between 55 and 64
years old and 12.3% are in the 75+ category. For gender, 50.5% are male and 49.5% are
female, so there were slightly more males than females. For history of diabetes, the majority of
the sample (87.4%) have no history of diabetes and 12.6% have diabetes. For blood pressure, 60.4%
have normal blood pressure, 27.8% have hypertension, and 11.8% have hypotension. For died in
hospital, information is missing for 8.6% of the study group (90 patients). Of the total sample,
23.7% died in the hospital and 67.7% did not.
With the use of descriptive statistics we are able to describe and summarize data related
to age in years and length of stay for rehabilitation. The age range is 41 with maximum 86 and
minimum 45. The mean age is 62.46 and SD is 9.113. The range of length of stay for
rehabilitation is 36 with maximum 36 and minimum 0. The mean is 16.39 and SD is 12.565. The
mean is the best single point in the distribution for summarizing a set of values and the standard
deviation tells us how much on average the values deviate from the mean.
The independent-samples t-test was used to evaluate whether age differed between those who
died in the hospital and those who did not. Based on the data, 710 patients did not die in the
hospital and 248 died in the hospital. The mean age of those who did not die in the hospital is
61.83 with an SD of 8.655. The mean age of those who died in the hospital is 63.43 with an SD of
9.766. The t value is -2.419 with a standard error of the difference of 0.661. The p-value is
0.016, which is less than 0.05, so we reject the null hypothesis. We conclude that there is a
statistically significant difference in age in years between the individuals who died in the
hospital and those who did not.
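The t value and standard error in this paragraph can be approximately reproduced from the group sizes, means, and standard deviations alone. The sketch below assumes the equal-variance (pooled) form of the test and uses a normal approximation for the p-value, which is reasonable only because the degrees of freedom are large; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance two-sample t statistic from group summary statistics."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / se_diff
    return t_stat, se_diff, n1 + n2 - 2


# Group 1: did not die in hospital; group 2: died in hospital (values from the report).
t_stat, se_diff, df = pooled_t_from_summary(710, 61.83, 8.655, 248, 63.43, 9.766)

# With df this large, the standard normal is a close stand-in for the t distribution.
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"t({df}) = {t_stat:.2f}, SE of difference = {se_diff:.3f}, p is roughly {p_approx:.3f}")
```

Small differences from the reported t of -2.419 are expected because the means and standard deviations in the report are rounded.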
The one-way ANOVA was used to analyze the difference in age between the blood pressure
groups. The difference is not statistically significant because p is 0.804, which is larger than
the alpha of 0.05, so we fail to reject the null hypothesis. There is insufficient evidence of a
difference in age between the blood pressure groups.
A correlation is an association between variables. Correlation analysis is useful to
describe the direction and magnitude of a relationship between two variables. To test whether
there was a relationship between age and length of stay for rehabilitation, a correlation was
computed. The p-value is reported as 0.000 (that is, p < 0.001), which is less than the alpha set
at 0.01, so the correlation between age in years and length of stay for rehabilitation is
statistically significant. The correlation coefficient r is 0.140, so the relationship is in the
positive direction but very weak.
The chi-square test was used to analyze whether died in hospital was related to diabetes
and to blood pressure. A chi-square test does not produce a correlation coefficient; it tests
whether two categorical variables are associated. For died in hospital and diabetes, the p-value
is 0.311. For died in hospital and blood pressure, the p-value is 0.326. Both p-values are larger
than the alpha of 0.05, so we cannot reject the null hypothesis. The relationships between died
in hospital and diabetes and between died in hospital and blood pressure are not statistically
significant.