
T-Tests

1. Car mileage and weight:


a) The response variable is mileage, and the explanatory variable is weight.
b) $\hat{y} = 45.6 - 0.0052x$; the y-intercept is 45.6 and the slope is -0.0052.
c) For each 1000 pound increase in the vehicle's weight, the predicted mileage will decrease
by 5.2 miles per gallon.
d) The y-intercept is the predicted miles per gallon for a car that weighs 0 pounds. This is far
outside the range of the car weights in this database and, therefore, does not have
contextual meaning for these data.
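The slope interpretation in (c) can be checked by evaluating the fitted line at two weights that differ by 1000 pounds. This is a minimal sketch: the function name `predicted_mileage` and the two sample weights are illustrative assumptions; only the intercept 45.6 and the slope -0.0052 come from the solution.

```python
def predicted_mileage(weight_lb):
    """Predicted miles per gallon from the fitted line y-hat = 45.6 - 0.0052x."""
    return 45.6 - 0.0052 * weight_lb

# Hypothetical weights, chosen only so that they differ by 1000 pounds.
mpg_at_3000 = predicted_mileage(3000)
mpg_at_4000 = predicted_mileage(4000)

# The change in predicted mileage per 1000 pounds equals 1000 times the slope.
change_per_1000_lb = mpg_at_4000 - mpg_at_3000
print(round(change_per_1000_lb, 1))
```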
2. Children of working females:
a) $\hat{y} = 5.2 - 0.044(28) = 3.968$ (rounds to 4.0)
b) $\hat{y} = 5.2 - 0.044(91) = 1.196$ (rounds to 1.2)
c) $y - \hat{y} = 2.3 - 1.196 = 1.104$ (rounds to 1.1)
d) The y-intercept indicates that for nations with no female economic activity, the predicted
fertility rate is 5.2. As x increases from 0 to 100, the predicted fertility rate decreases from
5.2 to 0.8.
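The arithmetic in (a) through (c) can be reproduced in a few lines. This is a sketch under the assumption that the fitted line is y-hat = 5.2 - 0.044x, as the numbers in the solution imply; the function name `predicted_fertility` is illustrative.

```python
def predicted_fertility(x):
    """Predicted fertility rate for female economic activity x (fitted line)."""
    return 5.2 - 0.044 * x

pred_28 = predicted_fertility(28)   # part (a)
pred_91 = predicted_fertility(91)   # part (b)

# Part (c): residual = observed value minus predicted value.
observed_91 = 2.3
residual_91 = observed_91 - pred_91

print(round(pred_28, 3), round(pred_91, 3), round(residual_91, 3))
```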
3. Dollars and thousands of dollars:
Slope when income is in dollars: 1.50/1000 = 0.0015
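Rescaling x changes the slope but not the predictions. The sketch below illustrates this with a made-up income value; the income figure and the variable names are assumptions, and only the slope 1.50 comes from the solution.

```python
slope_per_thousand = 1.50                       # slope when income is in thousands
slope_per_dollar = slope_per_thousand / 1000    # slope when income is in dollars

# Hypothetical income, expressed in both units.
income_thousands = 20.0
income_dollars = income_thousands * 1000

# The slope's contribution to the prediction is the same in either unit.
contribution_thousands = slope_per_thousand * income_thousands
contribution_dollars = slope_per_dollar * income_dollars
print(contribution_thousands, contribution_dollars)
```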
4. When can you compare slopes?:
a) For a $1000 increase in GDP, the predicted percentage using cell phones increases by
2.62, and the predicted percentage using the Internet increases by 1.55.
b) Both response variables are percentages, so the slopes can be compared. Because the slope
for cell phone use (2.62) is larger than the slope for Internet use (1.55), an increase in
GDP is associated with a greater predicted increase in the percentage using cell phones
than in the percentage using the Internet.
5. Weight, height, and fat:
a) (i) Percentage of body fat and body mass index have the strongest association.
(ii) Height and body mass index have the weakest association.
b) There is a fairly strong, positive association between height and weight. As one goes up,
the other tends to go up.
c) $r^2$ = (0.553)(0.553) = 0.306 (rounds to 0.31). $r^2$ summarizes the reduction in sum of
squared errors in predicting y using the regression line instead of using the mean of y. In
this case, the sum of squared errors is 31% less when we use the regression equation.
d) None of these results would differ if height and weight were instead measured with metric
units.
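The claim in (d) follows because the correlation is unchanged by a change of units. The sketch below checks this on made-up data; the heights and weights are invented purely for illustration, and `pearson_r` is a hand-written helper, not part of the original exercise.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented measurements (not from the data set in the exercise).
heights_in = [62, 65, 68, 70, 73]
weights_lb = [120, 140, 150, 175, 190]

# The same measurements converted to metric units.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

r_imperial = pearson_r(heights_in, weights_lb)
r_metric = pearson_r(heights_cm, weights_kg)

# Part (c): r squared for the correlation reported in the solution.
r_squared = 0.553 ** 2
print(round(r_squared, 3))
```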
6. Verbal and Math SAT:
a) y = 250 + 0.5(500) = 500. Generally, at the x-value equal to its mean, the predicted
value of y is equal to its mean.
b) We can find the correlation as follows: $r = b\left(\frac{s_x}{s_y}\right) = 0.5\left(\frac{100}{100}\right) = 0.5$.
When the x and y variables have the same spread, the correlation equals the slope.
c) $r^2$ = (0.5)(0.5) = 0.25; The sum of squared errors is 25% less when we use the
regression equation instead of the mean of y.
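The relation between slope and correlation used in (b) is r = b(s_x/s_y). A minimal sketch follows; the helper name `correlation_from_slope` is an illustrative assumption, and the inputs are the values used in the solution.

```python
def correlation_from_slope(b, s_x, s_y):
    """Correlation recovered from the slope and the two standard deviations."""
    return b * s_x / s_y

r = correlation_from_slope(0.5, 100, 100)   # part (b)
r_squared = r ** 2                          # part (c)
print(r, r_squared)
```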
7. SAT regression toward mean:
a) y = 250 + 0.5(800) = 650
b) The predicted y value will be 0.5 standard deviations above the mean, for every one
standard deviation above the mean that x is. Here, x = 800 is three standard deviations
above the mean; so the predicted y value is 0.5(3) = 1.5 standard deviations above the
mean.
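Both routes to the prediction, the regression line in (a) and the standard-deviation argument in (b), can be compared numerically. This sketch assumes means of 500 and standard deviations of 100 for both variables, which is what the solutions to exercises 6 and 7 imply; the variable names are illustrative.

```python
mean_x, sd_x = 500, 100
mean_y, sd_y = 500, 100
r = 0.5

x = 800
z_x = (x - mean_x) / sd_x          # x is this many standard deviations above its mean
z_pred = r * z_x                   # predicted y is r times as many standard deviations above
pred_from_z = mean_y + z_pred * sd_y

pred_from_line = 250 + 0.5 * x     # regression line from part (a)
print(z_x, z_pred, pred_from_z, pred_from_line)
```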
8. GPAs and TV watching:
a) The correlation of -0.353 (rounds to -0.35) indicates that there is a negative relation
between the two variables. The more one watches television, the lower his or her college
GPA tends to be. The proportional reduction in error of 0.125 (rounds to 0.13) indicates
that the sum of squared errors is 13% less when we use the regression equation instead
of the mean of y.
b) We would expect that student to be (2)(0.505) = 1.01 standard deviations above the
mean on high school GPA. With regression to the mean, the predicted y is relatively
closer to its mean than x is to its mean.
9. t-score?:
a) df = n - 2 = 25 - 2 = 23
b) -2.069 (rounds to -2.07) and 2.069 (rounds to 2.07)
c) We'd use 2.07
10. More boys are bad?:
a) The negative slope indicates a negative association between life length and number of
sons: having more sons is associated with a shorter predicted life length.
b) i) Assumptions: Assume randomization, linear trend with normal conditional distribution
for y and the same standard deviation at different values of x.
ii) Hypotheses: The null hypothesis that the variables are independent is $H_0$: $\beta = 0$. The
two-sided alternative hypothesis of dependence is $H_a$: $\beta \neq 0$.
iii) Test statistic: t = b/se = - 0.65/0.29 = - 2.241.
iv) P-value: The P-value is 0.026.
v) Conclusion: If $H_0$ were true that the population slope $\beta$ = 0, it would be unusual to get a
sample slope at least as far from 0 as b = - 0.65. In fact, the probability would be 0.026.
The P-value gives very strong evidence that an association exists between number of
sons and life length.
c) The 95% confidence interval is $b \pm t_{.025}(se) = -0.651 \pm 1.966(0.29)$. The confidence
interval is (-1.220, -0.080) which rounds to (-1.2, -0.1). The plausible values for the true
population slope range from -1.2 to -0.1. It is not plausible that the true slope is 0.
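The test statistic and confidence interval can be recomputed from the reported slope and standard error. This is a sketch, not a full analysis: the helper `slope_t_and_ci` is an illustrative assumption, the t multiplier 1.966 is taken from the solution rather than computed, and the slope is entered as -0.65, the rounded value used in the test statistic.

```python
def slope_t_and_ci(b, se, t_crit):
    """Return the t statistic b/se and the interval b +/- t_crit * se."""
    t_stat = b / se
    margin = t_crit * se
    return t_stat, (b - margin, b + margin)

t_stat, (lower, upper) = slope_t_and_ci(b=-0.65, se=0.29, t_crit=1.966)

# The interval lies entirely below 0, which agrees with rejecting a zero slope.
excludes_zero = upper < 0
print(round(t_stat, 3), round(lower, 3), round(upper, 3), excludes_zero)
```

The same helper applies to any slope test of this form once b, se, and the t multiplier are known.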
11. Student GPAs:
a) i) Assumptions: Assume randomization, linear trend with normal conditional distribution
for y and the same standard deviation at different values of x.
ii) Hypotheses: The null hypothesis that the variables are independent is $H_0$: $\beta = 0$. The
two-sided alternative hypothesis of dependence is $H_a$: $\beta \neq 0$.
iii) Test statistic: t = b/se = 0.6369/0.1442 = 4.42 (or just look at the printout for the test
statistic).
iv) P-value: The P-value is 0.000.
v) Conclusion: If $H_0$ were true that the population slope $\beta$ = 0, it would be very unusual
(the probability would be almost 0) to get a sample slope at least as far from 0 as b =
0.6369. The P-value is below the significance level of 0.05, so we can reject the null
hypothesis. We have very strong evidence that an association exists between high
school and college GPA.
b) The 95% confidence interval is $b \pm t_{.025}(se) = 0.6369 \pm 2.002(0.1442)$.
The confidence interval is (0.348, 0.926) which rounds to (0.3, 0.9). Zero is not a
plausible value for this slope; as was concluded in the significance test, it is not plausible that
there is no association.
12. Predicting house prices:
a) The residual df, 98, equals n - 2; therefore, the sample size was 100.
b) The sample predicted mean selling price was y = 9.2 + 77.0(1.53) = 127.010, or
$127,010.
c) The estimated residual standard deviation of y is the square root of the MS Error, 1349.
The square root of 1349 is 36.729.
d) The prediction interval is: $\hat{y} \pm 2s$, or $127.010 \pm 2(36.729)$; (53.552, 200.468) which rounds
to (53.6, 200.5).
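Parts (b) through (d) chain together: the prediction, the residual standard deviation, and the approximate interval. A minimal sketch using only the numbers quoted in the solution follows; the variable names are illustrative, and the interval uses the rough multiplier 2 as in the solution rather than an exact t value.

```python
from math import sqrt

y_hat = 9.2 + 77.0 * 1.53   # predicted selling price, in thousands of dollars
s = sqrt(1349)              # residual standard deviation = sqrt(MS Error)

# Approximate 95% prediction interval: y-hat plus or minus 2s.
lower = y_hat - 2 * s
upper = y_hat + 2 * s
print(round(y_hat, 3), round(s, 3), round(lower, 1), round(upper, 1))
```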
13. Predicting clothes purchases:
a) The value under Fit, 448, is the predicted amount spent on clothes in the past year for
those in the 12th grade of school.
b) The 95% confidence interval of (427, 469) is the range of plausible values for the
population mean of dollars spent on clothes for 12th grade students in the school.
c) The 95% prediction interval of (101, 795) is the range of plausible values for the
individual observations (dollars spent on clothes) for all the 12th grade students at the
school.
14. Savings grow exponentially:
a) $y = \alpha\beta^x = (100)(1.10)^1 = 110$
b) $y = \alpha\beta^x = (100)(1.10)^5 = 161.05$
c) $y = \alpha\beta^x = (100)(1.10)^x$
d) The first year after which you'll have more than $200 is the 8th:
$y = \alpha\beta^x = (100)(1.10)^8 = 214.36$
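The answer to (d) can be found by stepping through the years until the balance first exceeds 200. This sketch assumes a starting amount of 100 and a growth factor of 1.10 per year, as in the solution; the function name `balance` is illustrative.

```python
def balance(years, principal=100.0, growth=1.10):
    """Amount after the given number of years of compound growth."""
    return principal * growth ** years

# Search for the first year in which the balance is above 200.
year = 0
while balance(year) <= 200:
    year += 1

print(year, round(balance(year), 2))
```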
15. U.S. population growth:
a) $\hat{y} = 68.33 \times 1.1418^0 = 68.33$ million. $\hat{y} = 68.33 \times 1.1418^{11} = 293.83$ million
b) 1.1418 is the multiplicative effect on y for a one-unit increase in x.
c) This suggests a very good fit of data to model. The high correlation indicates a linear
relation between the log of the y values and the x values.
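The statements in (b) and (c) can be illustrated with the fitted equation: each one-unit step in x multiplies the prediction by 1.1418, so the log of the prediction rises by a constant amount. A sketch follows; the function name `predicted_population` is an assumption, and the coefficients are those in the solution.

```python
from math import log

def predicted_population(x):
    """Predicted population in millions from y-hat = 68.33 * 1.1418**x."""
    return 68.33 * 1.1418 ** x

# Ratio of successive predictions: the multiplicative effect of one unit of x.
ratios = [predicted_population(x + 1) / predicted_population(x) for x in range(11)]

# Differences of successive logs: constant, so log(y-hat) is linear in x.
log_steps = [log(predicted_population(x + 1)) - log(predicted_population(x))
             for x in range(11)]

print(round(predicted_population(0), 2), round(predicted_population(11), 2))
```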
