
1. Textbook Exercise 1.118
(Figure: four standard normal curves, each with the z-axis marked from -3 to 3 and the relevant area shaded.)
a. z > 1.75: 1 - 0.9599 = 0.0401, about 4% (area on the right)
b. z < 1.75: 0.9599, about 96%
c. z < -0.80: 0.2119, about 21%
d. -0.80 < z < 1.75: 0.9599 - 0.2119 = 0.7480, about 75%
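The table lookups for this exercise can be cross-checked in R. This is a minimal sketch using the base function pnorm (the normal cumulative distribution function); the variable names p_a to p_d are chosen here for illustration only.

```r
# Standard normal areas for parts a-d; pnorm(z) returns P(Z < z)
p_a <- pnorm(1.75, lower.tail = FALSE)  # P(Z > 1.75), about 0.0401
p_b <- pnorm(1.75)                      # P(Z < 1.75), about 0.9599
p_c <- pnorm(-0.80)                     # P(Z < -0.80), about 0.2119
p_d <- pnorm(1.75) - pnorm(-0.80)       # P(-0.80 < Z < 1.75): area between the two cutoffs
print(round(c(a = p_a, b = p_b, c = p_c, d = p_d), 4))
```

The area between two cutoffs is the difference of the two left-tail areas, which is why part d subtracts one pnorm value from the other.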
2. Textbook Exercise 1.122
Mean = 100, standard deviation = 15
Consider X < 70, which is 2 standard deviations below the mean:
z = (70 - 100)/15 = -2
P(Z < -2) = 0.0228, so about 2.28% of the adults are developmentally disabled.
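The same calculation can be sketched in R, assuming scores follow a normal distribution with the mean and standard deviation given above. The names mu, sigma, and p_below_70 are illustrative.

```r
# Standardize the cutoff and find the area below it
mu <- 100
sigma <- 15
z <- (70 - mu) / sigma                          # z-score of the cutoff
p_below_70 <- pnorm(70, mean = mu, sd = sigma)  # same area as pnorm(z)
print(c(z = z, p = round(p_below_70, 4)))
```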

3. Textbook Exercise 2.5. (Comment: The day of the week is often thought of as a categorical variable. If,
however, the sequential nature of the weekdays matters, it can be thought of as a quantitative variable instead.)
a. The cases of the study are tweets.
b. There are three quantitative variables (click count, time, and length) and two categorical variables (day and sex).
Click count is the response variable because it measures the outcome. The explanatory variables are the time of
day, the day of the week, the sex of the person posting the tweet, and the length of the tweet, because they
explain changes in the number of clicks.

4. Textbook Exercise 2.34. (Hint: think lurking variables.)


a. The association is negative and curved.
b. For countries with fewer than 40 Internet users per 100 people, the birth rate increased. For countries with
more than 40 Internet users per 100 people, the relationship tends ..., which suggests that heavier Internet
use goes with a more developed country (a possible lurking variable).

5. Textbook Exercise 2.59. (Hint: When is correlation a good measure of relationship in general?)
a. correlation = -0.72971
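# Load the data set chosen interactively; header = TRUE reads column names from the first row
data <- read.csv(file.choose(), header = TRUE)
attach(data)  # makes the columns Users and BirthRate2011 available by name
plot(Users, BirthRate2011, main = "Scatterplot", xlab = "Users", ylab = "Birthrate")
cor(Users, BirthRate2011)  # correlation reported in part a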

b. Correlation is not a good summary of the relationship between these two variables because the scatterplot
shows curvature, and correlation measures only the strength of a linear relationship.
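To illustrate why curvature matters, here is a small sketch on synthetic data. The data are an assumption made up for this example and are not the Internet-use data set. Here y is an exact function of x, yet the correlation falls short of -1 because the relationship is curved rather than linear.

```r
# Synthetic, purely illustrative data: y is completely determined by x
x <- 1:100
y <- 40 * exp(-x / 20)   # decreasing, strongly curved relationship
r <- cor(x, y)           # Pearson correlation measures linear association only
print(r)
```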
6. Textbook Exercise 2.75. (I recommend using R. See p. 29 in the class notes for an example. To see how
to load the data into R, find examples in the longer R reference provided in the syllabus.)
a. plot(Population, Undergrads, main="Scatterplot", xlab="Population", ylab="Undergrads")

b. The relationship is linear, positive, and strong, with a few outliers.
c. x̅ = 5,955,551   sx = 6,620,733
   y̅ = 302,136     sy = 358,460
   r = 0.98367

b = r(sy/sx)
  = 0.98367(358,460/6,620,733)
  = 0.05326

a = y̅ − b x̅
  = 302,136 − (0.05326 × 5,955,551)
  = −15,056.64 ≈ −15,057
ŷ=a+bx
ŷ = −15,057 + 0.05326x

> plot(Population, Undergrads, main="Scatterplot", xlab="Population", ylab="Undergrads")
> abline(a=-15057, b=.05326, col='red')

(Figure: scatterplot of Undergrads versus Population with the fitted line drawn in red.)
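The hand calculation of the slope and intercept can be sketched as a small R function. The function name ls_line and its argument names are hypothetical helpers introduced here; the inputs are the summary statistics from part c. Because the slope is not rounded before the intercept is computed, the intercept comes out slightly different from the hand-computed value.

```r
# Least-squares line from summary statistics: b = r * sy / sx, a = y_bar - b * x_bar
ls_line <- function(x_bar, s_x, y_bar, s_y, r) {
  b <- r * s_y / s_x      # slope
  a <- y_bar - b * x_bar  # intercept, using the unrounded slope
  c(intercept = a, slope = b)
}

fit <- ls_line(x_bar = 5955551, s_x = 6620733,
               y_bar = 302136, s_y = 358460, r = 0.98367)
print(fit)
```

When the raw data are loaded, lm(Undergrads ~ Population) fits the same line directly.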
7. Textbook Exercise 2.76.
x̅ = 4,367,448   sx = 3,310,957
y̅ = 220,134     sy = 165,270
r = 0.97081
b = r(sy/sx)
  = 0.97081(165,270/3,310,957)
  = 0.04846
a = y̅ − b x̅
  = 220,134 − (0.04846 × 4,367,448)
  = 8,487.47
ŷ = a + bx
  = 8,487.47 + 0.04846x

8. Textbook Exercise 2.77.


a. ŷ = −15,057 + 0.05326 × 4,000,000
= 197,983
b. ŷ = 8,487.47 + 0.04846 × 4,000,000
= 202,327.47 ≈ 202,327
c. The four largest states are outliers, but they did not change the prediction for a median-sized state by
much: the two lines give 197,983 and 202,327.
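The two predictions can be reproduced with a short R sketch. The helper predict_line and the variable names are hypothetical; the coefficients are the rounded values from Exercises 2.75 and 2.76.

```r
# Evaluate a fitted line y-hat = a + b * x at a given x
predict_line <- function(a, b, x) a + b * x

x_new <- 4000000
pred_275 <- predict_line(a = -15057,  b = 0.05326, x = x_new)  # line from Exercise 2.75
pred_276 <- predict_line(a = 8487.47, b = 0.04846, x = x_new)  # line from Exercise 2.76
print(c(pred_275 = pred_275, pred_276 = pred_276, difference = pred_276 - pred_275))
```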

9. Textbook Exercise 2.128 (Hint: Simpson's paradox).


a. Death rate for patients in poor condition
Hospital A 3.8%
Hospital B 4%
The death rate for patients in poor condition is lower in Hospital A.

b. Death rate for patients in good condition
Hospital A 1%
Hospital B 1.3%
The death rate for patients in good condition is lower in Hospital A.

c. Hospital A is safer for both patients in good condition and patients in poor ... you should choose Hospital A.
d. Most patients arrive in poor condition, and many are simply too far gone to survive even the best medical care.
That raises the average death rate for all patients by a larger amount than it is lowered by Mercy's good record with
patients in good condition.
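The reversal can be sketched in R. The patient counts below are assumptions chosen only so that the rates within each condition match the percentages in parts a and b; they do not come from this document. With these counts, Hospital A has the lower death rate within each condition but the higher death rate overall, because most of its patients arrive in poor condition.

```r
# Illustrative counts (assumed, not from the source) that reproduce the rates above
hospitals <- data.frame(
  hospital  = c("A", "A", "B", "B"),
  condition = c("good", "poor", "good", "poor"),
  died      = c(6, 57, 8, 8),
  patients  = c(600, 1500, 600, 200)
)

# Death rate within each hospital and condition
hospitals$rate <- hospitals$died / hospitals$patients

# Death rate for each hospital after combining both conditions
totals <- aggregate(cbind(died, patients) ~ hospital, data = hospitals, FUN = sum)
totals$rate <- totals$died / totals$patients

print(hospitals)
print(totals)
```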
