Simple Linear Regression and Correlation

CHAPTER 17
SIMPLE LINEAR REGRESSION
AND CORRELATION

SECTIONS 1 - 2
MULTIPLE CHOICE QUESTIONS
In the following multiple-choice questions, please circle the correct answer.
1.

The regression line y = 3 + 2x has been fitted to the data points (4, 8), (2, 5), and (1, 2).
The sum of the squared residuals will be:
a. 7
b. 15
c. 8
d. 22
ANSWER: d
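
The arithmetic behind question 1 can be checked with a short sketch. The function name `sum_squared_residuals` is hypothetical and introduced only for illustration; it evaluates the fitted line at each x and sums the squared differences from the observed y.

```python
def sum_squared_residuals(points, intercept, slope):
    """Sum of (y - (intercept + slope * x))**2 over all (x, y) points."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in points)


# Data points and fitted line y = 3 + 2x from question 1.
points = [(4, 8), (2, 5), (1, 2)]
sse = sum_squared_residuals(points, intercept=3, slope=2)

# Residuals are -3, -2 and -3, so the sum of squares is 9 + 4 + 9.
print(sse)  # 22
```

The same helper applies to any question that gives a fitted line and a small set of points.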

2.

If an estimated regression line has a y-intercept of 10 and a slope of 4, then when x = 2
the actual value of y is:
a. 18
b. 15
c. 14
d. unknown
ANSWER: d

3.

Given the least squares regression line y = 5 - 2x:
a. the relationship between x and y is positive
b. the relationship between x and y is negative
c. as x increases, so does y
d. as x decreases, so does y
ANSWER: b

4.

A regression analysis between weight (y in pounds) and height (x in inches) resulted in
the following least squares line: y = 120 + 5x. This implies that if the height is increased
by 1 inch, the weight, on average, is expected to:
a. increase by 1 pound
b. decrease by 1 pound
c. increase by 5 pounds
d. increase by 24 pounds
ANSWER: c

5.

A regression analysis between sales (in $1000) and advertising (in $100) resulted in the
following least squares line: y = 75 + 6x. This implies that if advertising is $800, then
the predicted amount of sales (in dollars) is:
a. $4875
b. $123,000
c. $487,500
d. $12,300
ANSWER: b

6.

A regression analysis between sales (in $1000) and advertising (in $) resulted in the
following least squares line: y = 80,000 + 5x. This implies that an:
a. increase of $1 in advertising is expected, on average, to result in an increase of $5 in
sales
b. increase of $5 in advertising is expected, on average, to result in an increase of $5,000
in sales
c. increase of $1 in advertising is expected, on average, to result in an increase of
$80,005 in sales
d. increase of $1 in advertising is expected, on average, to result in an increase of $5,000
in sales
ANSWER: d

7.

Which of the following techniques is used to predict the value of one variable on the
basis of other variables?
a. Correlation analysis
b. Coefficient of correlation
c. Covariance
d. Regression analysis
ANSWER: d

8.

The residual is defined as the difference between:
a. the actual value of y and the estimated value of y
b. the actual value of x and the estimated value of x
c. the actual value of y and the estimated value of x
d. the actual value of x and the estimated value of y
ANSWER: a
9.

In the simple linear regression model, the y-intercept represents the:


a. change in y per unit change in x
b. change in x per unit change in y
c. value of y when x = 0
d. value of x when y = 0
ANSWER: c

10.

In the first order linear regression model, the population parameters of the y-intercept and
the slope are estimated respectively, by:
a. b0 and b1
b. b0 and β1
c. β0 and b1
d. β0 and β1
ANSWER: a

11.

In the simple linear regression model, the slope represents the:


a. value of y when x = 0
b. average change in y per unit change in x
c. value of x when y = 0
d. average change in x per unit change in y
ANSWER: b

12.

In regression analysis, the residuals represent the:


a. difference between the actual y values and their predicted values
b. difference between the actual x values and their predicted values
c. square root of the slope of the regression line
d. change in y per unit change in x
ANSWER: a

13.

In the first-order linear regression model, the population parameters of the y-intercept and
the slope are, respectively,
a. b0 and b1
b. b0 and β1
c. β0 and b1
d. β0 and β1
ANSWER: d

14.

In a simple linear regression problem, the following statistics are calculated from a
sample of 10 observations: Σ(x - x̄)(y - ȳ) = 2250, s_x = 10, Σx = 50, Σy = 75.
The least squares estimates of the slope and y-intercept are respectively:
a. 1.5 and 0.5
b. 2.5 and 1.5
c. 1.5 and 2.5
d. 2.5 and -5.0
ANSWER: d
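
A minimal sketch of the calculation in question 14, taking the given statistics to be a sum of cross-deviations of 2250, s_x = 10, Σx = 50 and Σy = 75 (that reading of the statistics is an assumption). The variable names are illustrative only.

```python
n = 10
sum_cross_dev = 2250.0      # sum of (x - x_bar) * (y - y_bar)
s_x = 10.0                  # sample standard deviation of x
sum_x, sum_y = 50.0, 75.0   # assumed to be the sums of x and y

cov_xy = sum_cross_dev / (n - 1)        # sample covariance: 2250 / 9 = 250
b1 = cov_xy / s_x ** 2                  # slope: 250 / 100 = 2.5
b0 = sum_y / n - b1 * (sum_x / n)       # intercept: 7.5 - 2.5 * 5 = -5.0

print(b1, b0)
```

Under that reading the slope is 2.5 and the intercept is -5.0.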

15.

If a simple linear regression model has no y-intercept, then:


a. all values of x are zero
b. all values of y are zero
c. when y = 0 so does x
d. when x = 0 so does y
ANSWER: d

16.

In the least squares regression line y = 3 - 2x, the predicted value of y equals:
a. 1.0 when x = -1.0
b. 2.0 when x = 1.0
c. 2.0 when x = -1.0
d. 1.0 when x = 1.0
ANSWER: d

17.

The least squares method for determining the best fit minimizes:
a. total variation in the dependent variable
b. sum of squares for error
c. sum of squares for regression
d. All of the above
ANSWER: b

18.

What do we mean when we say that a simple linear regression model is statistically
useful?
a. All the statistics computed from the sample make sense
b. The model is an excellent predictor of y
c. The model is practically useful for predicting y
d. The model is a better predictor of y than the sample y
ANSWER: d

TRUE / FALSE QUESTIONS
19.

An inverse relationship between an independent variable x and a dependent variable y
means that as x increases, y decreases, and vice versa.
ANSWER: T

20.

A direct relationship between an independent variable x and a dependent variable y
means that the variables x and y increase or decrease together.
ANSWER: T

21.

Another name for the residual term in a regression equation is random error.
ANSWER: T

22.

A simple linear regression equation is given by ŷ = 5.25 + 3.8x. The point estimate of y
when x = 4 is 20.45.
ANSWER: T

23.

The vertical spread of the data points about the regression line is measured by the y-intercept.
ANSWER: F

24.

The method of least squares requires that the sum of the squared deviations between
actual y values in the scatter diagram and y values predicted by the regression line be
minimized.
ANSWER: T

25.

A regression analysis between sales (in $1000) and advertising (in $) resulted in the
following least squares line: y = 60 + 5x. This implies that an increase of $1 in
advertising is expected to result in an increase of $65 in sales.
ANSWER: F

26.

A regression analysis between weight (y in pounds) and height (x in inches) resulted in
the following least squares line: y = 135 + 6x. This implies that if the height is
increased by 1 inch, the weight is expected to increase by an average of 6 pounds.
ANSWER: T

27.

The residual r_i is defined as the difference between the actual value y_i and the estimated
value ŷ_i.
ANSWER: T

28.

The regression line y = 2 + 3x has been fitted to the data points (4,11), (2,7), and (1,5).
The sum of squares for error will be 10.0.
ANSWER: T

29.

A regression analysis between sales (in $1000) and advertising (in $100) resulted in the
following least squares line: y = 77 + 8x. This implies that if advertising is $600, then
the predicted amount of sales (in dollars) is $125,000.
ANSWER: T

30.

The residuals are observations of the error variable ε. Consequently, the minimized sum
of squared deviations is called the sum of squares for error, denoted SSE.
ANSWER: T

31.

Statisticians have shown that the sample y-intercept b0 and sample slope coefficient b1 are
unbiased estimators of the population regression parameters β0 and β1, respectively.
ANSWER: T

32.

If cov(x, y) = 7.5075 and s_x² = 3.5, then the sample slope coefficient is 2.145.
ANSWER: T

33.

The first order linear model is sometimes called the simple linear regression model.
ANSWER: T

34.

To create a deterministic model, we start with a probabilistic model that approximates the
relationship we want to model.
ANSWER: F

35.

The residual represents the discrepancy between the observed dependent variable and its
predicted or estimated average value.
ANSWER: T

STATISTICAL CONCEPTS & APPLIED QUESTIONS
FOR QUESTIONS 36 AND 37, USE THE FOLLOWING NARRATIVE:
Narrative: Car Speed and Gas Mileage
An economist wanted to analyze the relationship between the speed of a car (x) and its gas
mileage (y). As an experiment a car is operated at several different speeds and for each speed the
gas mileage is measured. These data are shown below.
Speed (x)         25   35   45   50   60   65   70
Gas Mileage (y)   40   39   37   33   30   27   25

36.

{Car Speed and Gas Mileage Narrative} Determine the least squares regression line.
ANSWER:
ŷ = 50.6563 - 0.3531x

37.

{Car Speed and Gas Mileage Narrative} Estimate the gas mileage of a car traveling 70
mph.
ANSWER:
When x = 70, y = 25.9393 mpg
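
The least squares estimates for the car speed data can be reproduced with a small sketch. The function `least_squares` is a hypothetical helper written for this illustration; it uses the usual formulas b1 = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and b0 = ȳ - b1·x̄.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


speed = [25, 35, 45, 50, 60, 65, 70]
mileage = [40, 39, 37, 33, 30, 27, 25]

b0, b1 = least_squares(speed, mileage)
predicted_at_70 = b0 + b1 * 70

print(round(b0, 4), round(b1, 4))   # 50.6563 -0.3531
print(round(predicted_at_70, 4))    # 25.9375
```

With unrounded coefficients the prediction at 70 mph is 25.9375; the answer 25.9393 comes from plugging the rounded coefficients 50.6563 and 0.3531 into the line.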

38.

The following 10 observations of variables x and y were collected.


x    1    2    3    4    5    6    7    8    9   10
y   25   22   21   19   14   15   12   10    6    2

Find the least squares regression line, and the estimated value of y when x = 3.
ANSWER:
ŷ = 27.733 - 2.389x. When x = 3, ŷ = 20.566

39.

A scatter diagram includes the following data points:


x    3    2    5    4    5
y    8    6   12   10   14

Two regression models are proposed: (1) ŷ = 1.2 + 2.5x, and (2) ŷ = 5.5 + 4.0x.
Using the least squares method, which of these regression models provides the better fit
to the data? Why?
ANSWER:
SSE = 4.95 and 593.25 for models 1 and 2, respectively. Therefore, model (1) fits the data
better than model (2).
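
The comparison in question 39 can be sketched as follows, treating the two proposed lines as ŷ = 1.2 + 2.5x and ŷ = 5.5 + 4.0x. The helper name `sse` is hypothetical; the model with the smaller sum of squares for error is the better fit under the least squares criterion.

```python
def sse(points, intercept, slope):
    """Sum of squared differences between observed y and the line's prediction."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in points)


points = [(3, 8), (2, 6), (5, 12), (4, 10), (5, 14)]

sse_model_1 = sse(points, intercept=1.2, slope=2.5)
sse_model_2 = sse(points, intercept=5.5, slope=4.0)
better_model = 1 if sse_model_1 < sse_model_2 else 2

print(round(sse_model_1, 2), round(sse_model_2, 2), better_model)  # 4.95 593.25 1
```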
40.

Consider the following data values of variables x and y.

x    2    4    6    8   10   13
y    7   11   17   21   27   36

a. Determine the least squares regression line.
b. Find the predicted value of y for x = 9.
c. What does the value of the slope of the regression line tell you?
ANSWER:
a. ŷ = 0.934 + 2.637x
b. When x = 9, ŷ = 24.667
c. If x increases by one unit, y on average will increase by 2.637.
FOR QUESTIONS 41 THROUGH 45, USE THE FOLLOWING NARRATIVE:
Narrative: Sunshine and Skin Cancer
A medical statistician wanted to examine the relationship between the amount of sunshine (x) in
hours, and incidence of skin cancer (y). As an experiment he found the number of skin cancers
detected per 100,000 of population and the average daily sunshine in eight counties around the
country. These data are shown below.
Average Daily Sunshine (x)     5    7    6    7    8    6    4    3
Skin Cancer per 100,000 (y)    7   11    9   12   15   10    7    5

41.

{Sunshine and Skin Cancer Narrative} Determine the least squares regression line.
ANSWER:
ŷ = -1.115 + 1.846x

42.

{Sunshine and Skin Cancer Narrative} Draw a scatter diagram of the data and plot the
least squares regression line on it.
ANSWER:
[Figure: "Average Daily Sunshine Line Fit Plot". x-axis: Average Daily Sunshine; y-axis: Skin Cancer; series: Skin Cancer, Predicted Skin Cancer, Linear (Predicted Skin Cancer).]

43.

{Sunshine and Skin Cancer Narrative} Estimate the number of skin cancer per 100,000
of population for 6 hours of sunshine.
ANSWER:
When x = 6, y = 9.961

44.

{Sunshine and Skin Cancer Narrative} What does the value of the slope of the regression
line tell you?
ANSWER:
If the amount of sunshine x increases by one hour, the amount of skin cancer y increases
by an average of 1.846 per 100,000 of population.

45.

{Sunshine and Skin Cancer Narrative} Calculate the residual corresponding to the pair (x,
y) = (8, 15).
ANSWER:
e = y - ŷ = 15 - 13.653 = 1.347

FOR QUESTIONS 46 THROUGH 49, USE THE FOLLOWING NARRATIVE:


NARRATIVE: Sales and Experience
The general manager of a chain of furniture stores believes that experience is the most important
factor in determining the level of success of a salesperson. To examine this belief she records last
month's sales (in $1,000s) and the years of experience of 10 randomly selected salespeople.
These data are listed below.
Salesperson   Years of Experience   Sales
1             0                     7
2             2                     9
3             10                    20
4             3                     15
5             8                     18
6             5                     14
7             12                    20
8             7                     17
9             20                    30
10            15                    25

46.

{Sales and Experience Narrative} Draw a scatter diagram of the data to determine
whether a linear model appears to be appropriate.
ANSWER:
[Figure: "Scatter Diagram". x-axis: Years of Experience; y-axis: Sales.]

It appears that a linear model is appropriate.


47.

{Sales and Experience Narrative} Determine the least squares regression line.
ANSWER:
ŷ = 8.63 + 1.0817x

48.

{Sales and Experience Narrative} Interpret the value of the slope of the regression line.
ANSWER:
For each additional year of experience, monthly sales of a salesperson increase by an
average of $1,081.7.

49.

{Sales and Experience Narrative} Estimate the monthly sales for a salesperson with 16
years of experience.
ANSWER:
When x = 16, ŷ = 25.94

FOR QUESTIONS 50 THROUGH 53, USE THE FOLLOWING NARRATIVE:


Narrative: Income and Education
A professor of economics wants to study the relationship between income (y in $1000s) and
education (x in years). A random sample of eight individuals is taken and the results are shown
below.
Education (x)   16   11   15    8   12   10   13   14
Income (y)      58   40   55   35   43   41   52   49

50.

{Income and Education Narrative} Draw a scatter diagram of the data to determine
whether a linear model appears to be appropriate.
ANSWER:
[Figure: "Scatter Diagram". x-axis: Years of Education; y-axis: Income.]

It appears that a linear model is appropriate.


51.

{Income and Education Narrative} Determine the least squares regression line.
ANSWER:
ŷ = 10.6165 + 2.9098x

52.

{Income and Education Narrative} Interpret the value of the slope of the regression line.
ANSWER:
For each additional year of education, the income increases by an average of $2,909.80.

53.

{Income and Education Narrative} Estimate the income of an individual with 15 years of
education.
ANSWER:
When x = 15, ŷ = 54.264 (in $1000s), or $54,264

FOR QUESTIONS 54 THROUGH 57, USE THE FOLLOWING NARRATIVE:


Narrative: Game Winnings and Education
An ardent fan of television game shows has observed that, in general, the more educated the
contestant, the less money he or she wins. To test her belief she gathers data about the last eight
winners of her favorite game show. She records their winnings in dollars and the number of years
of education. The results are as follows.

Contestant   Years of Education   Winnings ($)
1            11                   750
2            15                   400
3            12                   600
4            16                   350
5            11                   800
6            16                   300
7            13                   650
8            14                   400

54.

{Game Winnings and Education Narrative} Draw a scatter diagram of the data to
determine whether a linear model appears to be appropriate.
ANSWER:
[Figure: "Scatter Diagram". x-axis: Years of Education; y-axis: Winnings.]

It appears that a linear model is appropriate.


55.

{Game Winnings and Education Narrative} Determine the least squares regression line.
ANSWER:
ŷ = 1735 - 89.1667x

56.

{Game Winnings and Education Narrative} Interpret the value of the slope of the
regression line.
ANSWER:
For each additional year of education a contestant has, his or her winnings on TV game
shows decrease by an average of approximately $89.17.

57.

{Game Winnings and Education Narrative} Estimate the game winnings for a contestant
with 15 years of education.
ANSWER:
When x = 15, y = $397.50

FOR QUESTIONS 58 THROUGH 61, USE THE FOLLOWING NARRATIVE:


Narrative: Movie Revenues
A financier whose specialty is investing in movie productions has observed that, in general,
movies with big-name stars seem to generate more revenue than those movies whose stars are
less well known. To examine his belief he records the gross revenue and the payment (in $
millions) given to the two highest-paid performers in the movie for ten recently released movies.
Movie   Cost of Two Highest Paid Performers   Gross Revenue
1       5.3                                   48
2       7.2                                   65
3       1.3                                   18
4       1.8                                   20
5       3.5                                   31
6       2.6                                   26
7       8.0                                   73
8       2.4                                   23
9       4.5                                   39
10      6.7                                   58

58.

{Movie Revenues Narrative} Draw a scatter diagram of the data to determine whether a
linear model appears to be appropriate.
ANSWER: It appears that a linear model is appropriate.
[Figure: "Scatter Diagram". x-axis: Payment to Top Two Stars; y-axis: Gross Revenue.]

59.

{Movie Revenues Narrative} Determine the least squares regression line.
ANSWER:
ŷ = 4.225 + 8.285x

60.

{Movie Revenues Narrative} Interpret the value of the slope of the regression line.
ANSWER:
For each additional million dollars paid to the two highest paid performers, the gross revenue
of the movie increases by an average of $8.285 million.

61.

{Movie Revenues Narrative} Estimate the gross revenue of a movie if the two highest
paid performers received 6 million dollars.
ANSWER:
When x = 6, y = $53.935 million

FOR QUESTIONS 62 THROUGH 65, USE THE FOLLOWING NARRATIVE:


NARRATIVE: Cost of Books
The editor of a major academic book publisher claims that a large part of the cost of books is the
cost of paper. This implies that larger books will cost more money. As an experiment to analyze
the claim, a university student visits the bookstore and records the number of pages and the
selling price of twelve randomly selected books. These data are listed below.
Book   Number of Pages   Selling Price ($)
1      844               55
2      727               50
3      360               35
4      915               60
5      295               30
6      706               50
7      410               40
8      905               53
9      1058              65
10     865               54
11     677               42
12     912               58

62.

{Cost of Books Narrative} Determine the least squares regression line.
ANSWER:
ŷ = 19.387 + 0.0414x

63.

{Cost of Books Narrative} Draw a scatter diagram of the data and plot the least squares
regression line on it.
ANSWER:

[Figure: "Number of Pages Line Fit Plot". x-axis: Number of Pages; y-axis: Selling Price; series: Selling Price, Predicted Selling Price, Linear (Predicted Selling Price).]

64.

{Cost of Books Narrative} Interpret the value of the slope of the regression line.
ANSWER:
For every additional page, the price of a book increases by an average of about 4 cents.

65.

{Cost of Books Narrative} Estimate the selling price for a 650-page book.
ANSWER:
When x = 650, ŷ = 19.387 + 0.0414(650) = $46.297

FOR QUESTIONS 66 THROUGH 68, USE THE FOLLOWING NARRATIVE:


Narrative: Accidents and Precipitation
A statistician investigating the relationship between the amount of precipitation (in inches) and
the number of automobile accidents gathered data for 10 randomly selected days. The results
are shown below.
Day   Precipitation (inches)   Number of Accidents
1     0.05                     5
2     0.12                     6
3     0.05                     2
4     0.08                     4
5     0.10                     8
6     0.35                     14
7     0.15                     7
8     0.30                     13
9     0.10                     7
10    0.20                     10

66.

{Accidents and Precipitation Narrative} Find the least squares regression line.
ANSWER:
ŷ = 2.3704 + 34.864x

67.

{Accidents and Precipitation Narrative} Estimate the number of accidents in a day with
0.25 inches of precipitation.
ANSWER:
When x = 0.25, ŷ = 11.09, or about 11 accidents

68.

{Accidents and Precipitation Narrative} What does the slope of the least squares
regression line tell you?
ANSWER:
For each additional inch of precipitation, the number of accidents on average increases by
34.864 (about 35 accidents).

FOR QUESTIONS 69 THROUGH 73, USE THE FOLLOWING NARRATIVE:


Narrative: Willie Nelson Concert
At a recent Willie Nelson concert, a survey was conducted that asked a random sample of 20
people their age and how many concerts they have attended since the first of the year. The
following data were collected:
Age                  62   57   40   49   67   54   43   65   54   41
Number of Concerts    6    5    4    3    5    5    2    6    3    1

Age                  44   48   55   60   59   63   69   40   38   52
Number of Concerts    3    2    4    5    4    5    4    2    1    3

An Excel output follows:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.80203
R Square             0.64326
Adjusted R Square    0.62344
Standard Error       0.93965
Observations         20

ANOVA
             df    SS         MS         F          Significance F
Regression    1    28.65711   28.65711   32.45653   2.1082E-05
Residual     18    15.89289    0.88294
Total        19    44.55

            Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept   -3.01152       1.18802          -2.53491   0.02074   -5.50746    -0.5156
Age          0.12569       0.02206           5.69706   0.00002    0.07934     0.1720

DESCRIPTIVE STATISTICS

                      Age       Concerts
Mean                  53        3.65
Standard Error        2.1849    0.3424
Standard Deviation    9.7711    1.5313
Sample Variance       95.4737   2.3447
Count                 20        20

SPEARMAN RANK CORRELATION COEFFICIENT = 0.8306

69.

{Willie Nelson Concert Narrative} Draw a scatter diagram of the data to determine
whether a linear model appears to be appropriate to describe the relationship between the
age and number of concerts attended by the respondents.
ANSWER:
[Figure: "Scatter Diagram". x-axis: Age; y-axis: Number of Concerts.]

A linear model appears to be appropriate to describe the relationship between the age and
number of concerts attended by the respondents.
70.

{Willie Nelson Concert Narrative} Determine the least squares regression line.
ANSWER:
ŷ = -3.0115 + 0.1257x

71.

{Willie Nelson Concert Narrative} Plot the least squares regression line on the scatter
diagram.
ANSWER:
[Figure: "Scatter Diagram with Trendline". x-axis: Age; y-axis: Number of Concerts.]

72.

{Willie Nelson Concert Narrative} Interpret the value of the slope of the regression line.
ANSWER:
For every additional year of age, the number of concerts attended increases on average by
0.1257. Equivalently we may say, for every additional 20 years of age, the number of
concerts attended increases on average by about 2.50.

73.

{Willie Nelson Concert Narrative} Estimate the number of Willie Nelson concerts
attended by a 64 year old person.
ANSWER:
When x = 64, y = 5.03 (about 5 concerts)

FOR QUESTIONS 74 THROUGH 77, USE THE FOLLOWING NARRATIVE:


Narrative: Oil Quality and Price
Quality of oil is measured in API gravity degrees: the higher the degrees API, the higher the
quality. The table shown below is produced by an expert in the field who believes that there is a
relationship between quality and price per barrel.
Oil degrees API   Price per barrel (in $)
27.0              12.02
28.5              12.04
30.8              12.32
31.3              12.27
31.9              12.49
34.5              12.70
34.0              12.80
34.7              13.00
37.0              13.00
41.0              13.17
41.0              13.19
38.8              13.22
39.3              13.27

A partial Minitab output follows:

Descriptive Statistics
Variable    N    Mean     StDev   SE Mean
Degrees    13    34.60    4.613   1.280
Price      13    12.730   0.457   0.127

Covariances
           Degrees     Price
Degrees    21.281667
Price       2.026750   0.208833

Regression Analysis
Predictor   Coef       StDev      T       P
Constant    9.4349     0.2867     32.91   0.000
Degrees     0.095235   0.008220   11.59   0.000

S = 0.1314   R-Sq = 92.46%   R-Sq(adj) = 91.7%

Analysis of Variance
Source           DF   SS       MS       F        P
Regression        1   2.3162   2.3162   134.24   0.000
Residual Error   11   0.1898   0.0173
Total            12   2.5060

74.

{Oil Quality and Price Narrative} Draw a scatter diagram of the data to determine
whether a linear model appears to be appropriate to describe the relationship between the
quality of oil and price per barrel.
ANSWER:

[Figure: "Scatter Diagram". x-axis: Degrees; y-axis: Price.]

A linear model appears to be appropriate to describe the relationship between the quality
of oil and price per barrel.
75.

{Oil Quality and Price Narrative} Determine the least squares regression line.
ANSWER:
ŷ = 9.4349 + 0.095235x

76.

{Oil Quality and Price Narrative} Plot the least squares regression line on the scatter
diagram.
ANSWER:
[Figure: "Scatter Diagram". x-axis: Degrees; y-axis: Price.]

77.

{Oil Quality and Price Narrative} Interpret the value of the slope of the regression line.
ANSWER:
For every additional API gravity degree, the price of oil per barrel increases by an
average of 9.52 cents.

SECTIONS 3 - 4
MULTIPLE CHOICE QUESTIONS
In the following multiple-choice questions, please circle the correct answer.
78.

In a simple linear regression problem, the following sum of squares are produced:
Σ(y_i - ȳ)² = 200, Σ(y_i - ŷ_i)² = 50, and Σ(ŷ_i - ȳ)² = 150. The percentage of the
variation in y that is explained by the variation in x is:
a. 25%
b. 75%
c. 33%
d. 50%
ANSWER: b
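
As a quick check of question 78, the coefficient of determination is the share of total variation accounted for by the regression. This sketch assumes the three sums are the total sum of squares (200), the sum of squares for error (50) and the sum of squares for regression (150); the variable names are illustrative.

```python
sst = 200.0   # total variation in y
sse = 50.0    # unexplained variation (sum of squares for error)
ssr = 150.0   # explained variation (sum of squares for regression)

r_squared = ssr / sst            # 0.75
r_squared_alt = 1 - sse / sst    # same value via the unexplained share

print(f"{r_squared:.0%}")  # 75%
```

Both routes give the same result because the total variation splits exactly into the explained and unexplained parts.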

79.

In simple linear regression, most often we perform a two-tail test of the population slope
β1 to determine whether there is sufficient evidence to infer that a linear relationship
exists. The null hypothesis is stated as:
a. H0: β1 = 0
b. H0: β1 = b1
c. H0: β1 = r
d. H0: β1 = s
ANSWER: a

80.

Testing whether the slope of the population regression line could be zero is equivalent to
testing whether the:
a. sample coefficient of correlation could be zero
b. standard error of estimate could be zero
c. population coefficient of correlation could be zero
d. sum of squares for error could be zero
ANSWER: c

81.

Given that s_x² = 500, s_y² = 750, cov(x, y) = 100, and n = 6, the standard error of
estimate is:
a. 12.247
b. 24.933
c. 30.2076
d. 11.180
ANSWER: c
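
The answer to question 81 can be reproduced from the summary statistics alone. The sketch below assumes the givens are the sample variances of x and y (500 and 750) and uses the shortcut SSE = (n - 1)(s_y² - cov(x, y)² / s_x²); variable names are illustrative.

```python
import math

n = 6
var_x = 500.0    # sample variance of x (assumed reading of the givens)
var_y = 750.0    # sample variance of y (assumed reading of the givens)
cov_xy = 100.0

sse = (n - 1) * (var_y - cov_xy ** 2 / var_x)   # 5 * (750 - 20) = 3650
std_error = math.sqrt(sse / (n - 2))            # sqrt(912.5)

print(round(std_error, 4))  # 30.2076
```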

82.

The symbol for the population coefficient of correlation is:
a. r
b. ρ
c. r²
d. ρ²
ANSWER: b

83.

Given that the sum of squares for error is 60 and the sum of squares for regression is 140,
then the coefficient of determination is:
a. 0.429
b. 0.300
c. 0.700
d. 0.837
ANSWER: c

84.

A regression line using 25 observations produced SSR = 118.68 and SSE = 56.32. The
standard error of estimate was:
a. 2.1788
b. 1.5648
c. 1.5009
d. 2.2716
ANSWER: b

85.

The symbol for the sample coefficient of correlation is:
a. r
b. ρ
c. r²
d. ρ²
ANSWER: a

86.

Given the least squares regression line y = -2.48 + 1.63x, and a coefficient of
determination of 0.81, the coefficient of correlation is:
a. -0.85
b. 0.85
c. -0.90
d. 0.90
ANSWER: d

87.

Which value of the coefficient of correlation r indicates a stronger correlation than 0.65?
a. 0.55
b. -0.75
c. 0.60
d. -0.45
ANSWER: b

88.

If the coefficient of determination is 0.975, then the slope of the regression line:
a. must be positive
b. must be negative
c. could be either positive or negative
d. None of the above.
ANSWER: c
89.

In regression analysis, if the coefficient of determination is 1.0, then:


a. the sum of squares for error must be 1.0
b. the sum of squares for regression must be 1.0
c. the sum of squares for error must be 0.0
d. the sum of squares for regression must be 0.0
ANSWER: c

90.

The sample correlation coefficient between x and y is 0.375. It has been found out that
the p-value is 0.744 when testing H0: ρ = 0 against the one-sided alternative H1: ρ < 0.
To test H0: ρ = 0 against the two-sided alternative H1: ρ ≠ 0 at a significance level
of 0.193, the p-value is:
a. 0.372
b. 1.488
c. 0.256
d. 0.512
ANSWER: d
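
The relationship between one-sided and two-sided p-values used in this question can be sketched as below, assuming a test statistic with a symmetric distribution (such as t). Both function names are hypothetical helpers written for this illustration.

```python
def two_sided_from_one_sided(p_one_sided):
    """Two-sided p-value: twice the smaller tail area."""
    return 2 * min(p_one_sided, 1 - p_one_sided)


def one_sided_from_two_sided(p_two_sided, sample_agrees_with_h1):
    """One-sided p-value: half the two-sided value if the sample statistic
    points in the direction of H1, otherwise one minus that half."""
    half = p_two_sided / 2
    return half if sample_agrees_with_h1 else 1 - half


p_two = two_sided_from_one_sided(0.744)
print(round(p_two, 3))  # 0.512

# With r = 0.375 > 0, an upper-tail alternative agrees with the sample
# and a lower-tail alternative does not.
print(round(one_sided_from_two_sided(0.256, True), 3))   # 0.128
print(round(one_sided_from_two_sided(0.256, False), 3))  # 0.872
```

The significance level (0.193) plays no role in computing the p-value; it only matters for the reject or do-not-reject decision.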

91.

Correlation analysis is used to determine:


a. the strength of the relationship between x and y
b. the least squares estimates of the regression parameters
c. the predicted value of y for a given value of x
d. the coefficient of determination
ANSWER: a

92.

If the coefficient of correlation is 0.80 then, the percentage of the variation in y that is
explained by the variation in x is:
a. 80%
b. 64%
c. -80%
d. -64%
ANSWER: b

93.

If all the points in a scatter diagram lie on the least squares regression line, then the
coefficient of correlation must be:
a. 1.0
b. -1.0
c. either 1.0 or -1.0
d. 0.0
ANSWER: c

94.

If the coefficient of correlation is 0.60, then the coefficient of determination is:
a. -0.60
b. -0.36
c. 0.36
d. 0.40
ANSWER: c

95.

In regression analysis, if the coefficient of correlation is 1.0, then:


a. the sum of squares for error is 1.0
b. the sum of squares for regression is 1.0
c. the sum of squares for error and sum of squares for regression are equal
d. the sum of squares for regression and total variation in y are equal
ANSWER: d

96.

If the coefficient of correlation between x and y is close to 1.0, this indicates that:
a. y causes x to happen
b. x causes y to happen
c. both (a) and (b)
d. there may or may not be any causal relationship between x and y
ANSWER: d

97.

For the values of the coefficient of determination listed below, which one implies the
greatest value of the sum of squares for regression given that the total variation in y is
1800?
a. 0.69
b. 0.96
c. 0.58
d. 0.85
ANSWER: b

98.

When all the actual and predicted values of y are equal, the standard error of estimate will
be:
a. -1.0
b. 1.0
c. 0.0
d. 2.0
ANSWER: c

99.

Which of the following statistics and procedures can be used to determine whether a
linear model should be employed?
a. The standard error of estimate
b. The coefficient of determination
c. The t-test of the slope
d. All of the above
ANSWER: d

100.

In testing the hypotheses H0: β1 = 0 vs. H1: β1 ≠ 0, the following statistics are
available: n = 10, b0 = 1.8, b1 = 2.45, s_b1 = 1.20, and y = 6. The value of the
test statistic is:
a. 2.042
b. 0.306
c. 1.50
d. -0.300
ANSWER: a
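
The test statistic in question 100 is the estimated slope divided by its standard error. A minimal sketch, assuming the value 1.20 is the standard error of b1 and that the hypothesized slope is zero; names are illustrative.

```python
n = 10
b1 = 2.45            # sample slope
se_b1 = 1.20         # standard error of the slope (assumed reading)
beta1_null = 0.0     # slope under the null hypothesis

t_stat = (b1 - beta1_null) / se_b1
degrees_of_freedom = n - 2

print(round(t_stat, 3), degrees_of_freedom)  # 2.042 8
```

The intercept and the y value given in the question are not needed for this statistic.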

101.

The standard error of estimate s is given by:


a. SSE/(n - 2)
b. √SSE/(n - 2)
c. √(SSE/(n - 2))
d. SSE/√(n - 2)
ANSWER: c

102.

If the standard error of estimate s = 20 and n = 10, then the sum of squares for error,
SSE, is:
a. 400
b. 3200
c. 4000
d. 40000
ANSWER: b

103.

The smallest value that the standard error of estimate s can assume is:
a. -1
b. 0
c. 1
d. 2
ANSWER: b

104.

If cov(x, y) = 1260, s_x² = 1600 and s_y² = 1225, then the coefficient of determination is:
a. 0.7875
b. 1.0286
c. 0.8100
d. 0.7656
ANSWER: c
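
Question 104 follows from R² = cov(x, y)² / (s_x² · s_y²). A short sketch, assuming the two given values are the sample variances of x and y; variable names are illustrative.

```python
import math

cov_xy = 1260.0
var_x = 1600.0   # sample variance of x (assumed reading of the givens)
var_y = 1225.0   # sample variance of y (assumed reading of the givens)

r = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))   # 1260 / (40 * 35) = 0.9
r_squared = cov_xy ** 2 / (var_x * var_y)            # 0.81

print(round(r, 2), round(r_squared, 2))  # 0.9 0.81
```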

105.

The standard error of estimate s is a measure of the:


a. variation of y around the regression line
b. variation of x around the regression line
c. variation of y around the mean y
d. variation of x around the mean x
ANSWER: a

106.

The Pearson coefficient of correlation r equals 1 when there is no:


a. explained variation
b. unexplained variation
c. y-intercept in the model
d. outliers
ANSWER: b

107.

In regression analysis, the coefficient of determination R² measures the amount of
variation in y that is:
a. caused by the variation in x
b. explained by the variation in x
c. unexplained by the variation in x
d. None of the above
ANSWER: b

108.

If we are interested in determining whether two variables are linearly related, it is
necessary to:
a. perform the t-test of the slope β1
b. perform the t-test of the coefficient of correlation
c. either (a) or (b) since they are identical
d. calculate the standard error of estimate s
ANSWER: c

109.

In a regression problem the following pairs of (x,y) are given: (3,1), (3,-1), (3,0), (3,-2)
and (3,2). That indicates that the:
a. correlation coefficient is -1
b. correlation coefficient is 0
c. correlation coefficient is 1
d. coefficient of determination is between -1 and 1
ANSWER: b

110.

In a regression problem, if the coefficient of determination is 0.95, this means that:


a. 95% of the y values are positive
b. 95% of the variation in y can be explained by the variation in x
c. 95% of the x values are equal
d. 95% of the variation in x can be explained by the variation in y
ANSWER: b

111.

The sample correlation coefficient between x and y is 0.375. It has been found out that
the p-value is 0.256 when testing H0: ρ = 0 against the two-sided alternative
H1: ρ ≠ 0. To test H0: ρ = 0 against the one-sided alternative H1: ρ > 0 at a significance
level of 0.193, the p-value will be equal to:
a. 0.128
b. 0.512
c. 0.744
d. 0.872
ANSWER: a
112.

In simple linear regression, which of the following statements indicate no linear
relationship between the variables x and y?
a. Coefficient of determination is 1.0
b. Coefficient of correlation is 0.0
c. Sum of squares for error is 0.0
d. Sum of squares for regression is relatively large
ANSWER: b

113.

If the sum of squared residuals is zero, then the:


a. coefficient of determination must be 1.0
b. coefficient of correlation must be 1.0
c. coefficient of determination must be 0.0
d. coefficient of correlation must be 0.0
ANSWER: a

114.

In a regression problem, if all the values of the independent variable are equal, then the
coefficient of determination must be:
a. -1.0
b. 0.5
c. 0.0
d. 1.0
ANSWER: c

115.

The standard error of the estimate is a measure of


a. total variation of the y variable
b. the variation around the sample regression line
c. explained variation
d. the variation of the x variable
ANSWER: b

116.

In simple linear regression, the coefficient of correlation r and the least squares estimate
b1 of the population slope β1:
a. must be equal
b. must have opposite signs
c. must have the same sign
d. may have opposite signs or the same sign
ANSWER: c

117.

The coefficient of determination (R²) tells us


a. that the coefficient of correlation is larger than 1
b. whether r has any significance
c. that we should not partition the total variation
d. the proportion of total variation in y that is explained by x
ANSWER: d

118.

In performing a regression analysis involving two numerical variables, we are assuming:


a. the variances of x and y are equal
b. the variation around the line of regression is the same for each x value
c. that x and y are independent
d. All of the above
ANSWER: b

119.

Which of the following assumptions concerning the probability distribution of the
random error term is stated incorrectly?
a. The distribution is normal
b. The mean of the distribution is 0
c. The variance of the distribution increases as x increases
d. The errors are independent
ANSWER: c

120.

If the correlation coefficient (r) = 1.00, then


a. The y intercept (b0) must equal 0
b. The explained variation equals the unexplained variation
c. There is no unexplained variation
d. There is no explained variation
ANSWER: c

121.

In a simple linear regression problem, r and b1


a. may have opposite signs
b. must have the same sign
c. must have opposite signs
d. must be equal
ANSWER: b

122.

The sample correlation coefficient between x and y is 0.375. It has been found that
the p-value is 0.256 when testing H0: ρ = 0 against a two-sided alternative H1: ρ ≠ 0.
To test H0: ρ = 0 against the one-sided alternative H1: ρ < 0 at a significance level of
0.193, the p-value will be equal to
a. 0.128
b. 0.512
c. 0.744
d. 0.872
ANSWER: d
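
Worked note for questions 111 and 122: the two keys (0.128 and 0.872) are consistent with question 111 using the upper-tail alternative H1: ρ > 0 and question 122 using the lower-tail alternative H1: ρ < 0. The sketch below is only an illustration of that conversion; the function name and its arguments are made up for this note and do not come from the text.

```python
def one_sided_p_value(two_sided_p: float, r: float, alternative: str) -> float:
    """Convert a two-sided p-value for H0: rho = 0 into a one-sided p-value.

    alternative is "greater" (H1: rho > 0) or "less" (H1: rho < 0).
    Assumes a symmetric test statistic, as with the t test of rho.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    half = two_sided_p / 2
    # Half the two-sided p-value applies when the sign of r agrees with H1;
    # otherwise the one-sided p-value is the complement of that half.
    agrees = (r > 0) if alternative == "greater" else (r < 0)
    return half if agrees else 1 - half


print(round(one_sided_p_value(0.256, 0.375, "greater"), 3))
print(round(one_sided_p_value(0.256, 0.375, "less"), 3))
```

The stated significance level (0.193) plays no role in computing the p-value; it only matters for the reject / do not reject decision.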
123.

Which of the following is not a required condition for the error variable ε in the simple
linear regression model?
a. The probability distribution of ε is normal.
b. The mean of the probability distribution of ε is zero.
c. The standard deviation of ε is a constant no matter what the value of x.
d. The values of ε are autocorrelated.
ANSWER: d

124.

Testing for existence of correlation is equivalent to


a. testing for the existence of the slope (β1)
b. testing for the existence of the Y intercept (β0)
c. the confidence interval estimate for predicting Y
d. None of the above
ANSWER: a

125.

The coefficient of determination R² measures the amount of:


a. variation in y that is explained by variation in x
b. variation in x that is explained by variation in y
c. variation in y that is unexplained by variation in x
d. variation in x that is unexplained by variation in y
ANSWER: a

126.

If the coefficient of correlation is 0.90, then the percentage of the variation in the
dependent variable y that is explained by the variation in the independent variable x is:
a. 90%
b. 81%
c. 0.90%
d. 0.81%
ANSWER: b

127.

If a researcher wanted to find out if alcohol consumption and grade point average on a 4
point scale are linearly related, he would perform a
a. χ² test for the difference in two proportions
b. χ² test for independence
c. z test for the difference in two proportions
d. t test for no linear relationship between the two variables
ANSWER: d

TRUE / FALSE QUESTIONS
128.

If the value of the sum of squares for error SSE equals zero, then the coefficient of
determination must equal zero.
ANSWER: F

129.

When the actual values y of a dependent variable and the corresponding predicted values
ŷ are the same, the standard error of the estimate will be 1.0.
ANSWER: F

130.

The value of the sum of squares for regression SSR can never be smaller than 0.0.
ANSWER: T

131.

The value of the sum of squares for regression SSR can never be smaller than 1.
ANSWER: F

132.

If all the values of an independent variable x are equal, then regressing a dependent
variable y on x will result in a coefficient of determination of zero.
ANSWER: T

133.

In a simple linear regression model, testing whether the slope β1 of the population
regression line could be zero is the same as testing whether or not the population
coefficient of correlation equals zero.
ANSWER: T

134.

When the actual values y of a dependent variable and the corresponding predicted values
ŷ are the same, the standard error of estimate s will be 0.0.
ANSWER: T

135.

If there is no linear relationship between two variables x and y, the coefficient of
determination must be 1.0.
ANSWER: F

136.

The value of the sum of squares for regression SSR can never be larger than the value of
sum of squares for error SSE.
ANSWER: F

137.

When the actual values y of a dependent variable and the corresponding predicted values
ŷ are the same, the standard error of estimate s will be -1.0.
ANSWER: F

138.

In a simple linear regression problem, the least squares line is ŷ = -3.75 + 1.25x, and
the coefficient of determination is 0.81. The coefficient of correlation must be -0.90.
ANSWER: F

139.

In simple linear regression, the divisor of the standard error of estimate s is n - 2.
ANSWER: T

140.

In a regression problem the following pairs of (x, y) are given: (4,-2), (4,-1), (4,0), (4,1)
and (4,2). That indicates that the coefficient of correlation is 1.
ANSWER: F

141.

The value of the sum of squares for regression SSR can never be larger than the value of
total sum of squares SST.
ANSWER: T

142.

In regression analysis, if the coefficient of determination is 1.0, then the coefficient of
correlation must be 1.0.
ANSWER: F

143.

Correlation analysis is used to determine the strength of the relationship between an
independent variable x and dependent variable y.
ANSWER: T

144.

If the coefficient of correlation is 0.81, then the percentage of the variation in y that is
explained by the regression line is 81%.
ANSWER: F

145.

If all the points in a scatter diagram lie on the least squares regression line, then the
coefficient of correlation must be 1.0.
ANSWER: F

146.

If the standard error of estimate s = 20 and n = 8, then the sum of squares for error SSE
is 2,400.
ANSWER: T
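
Worked note for question 146: the standard error of estimate and SSE are linked by s² = SSE / (n - 2) in simple linear regression. A minimal sketch of that relationship follows; the function names are illustrative only.

```python
import math


def sse_from_standard_error(s: float, n: int) -> float:
    """SSE implied by the standard error of estimate s with n observations."""
    return s ** 2 * (n - 2)


def standard_error_from_sse(sse: float, n: int) -> float:
    """Standard error of estimate implied by SSE with n observations."""
    return math.sqrt(sse / (n - 2))


# Question 146: s = 20 and n = 8
print(sse_from_standard_error(20, 8))
print(standard_error_from_sse(2400, 8))
```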

147.

The probability distribution of the error variable ε is normal, with mean E(ε) = 0, and
standard deviation σε = 1.
ANSWER: F

148.

In a simple linear regression problem, if the coefficient of determination is 0.95, this
means that 95% of the variation in the independent variable x can be explained by
the regression line.
ANSWER: F

149.

Given that cov(x, y) = 10, s_y² = 15, s_x² = 8, and n = 12, the value of the standard error of
estimate s is 2.75.
ANSWER: F

150.

If the error variable is normally distributed, the test statistic for testing H0: β1 = 0 is
Student t distributed with n - 2 degrees of freedom.
ANSWER: T

151.

Given that cov(x, y) = 8.5, s_y² = 8, and s_x² = 10, then the value of the coefficient of
determination is 0.95.
ANSWER: F

152.

The coefficient of determination is the coefficient of correlation squared. That is, R² = r².
ANSWER: T

153.

Given that SSE = 60 and SSR = 540, the proportion of the variation in y that is explained
by the variation in x is 0.90.
ANSWER: T

154.

Given that SSE = 84 and SSR = 358.12, the coefficient of correlation (also called the
Pearson coefficient of correlation) must be 0.90.
ANSWER: F
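
Worked note for questions 153 and 154: with SST = SSR + SSE, the coefficient of determination is SSR / SST. The sketch below (function name chosen for this note) shows why 153 is true and 154 is false: the sums of squares fix only the magnitude of r, not its sign.

```python
import math


def r_squared_from_sums(sse: float, ssr: float) -> float:
    """Coefficient of determination from the error and regression sums of squares."""
    return ssr / (sse + ssr)


r2_153 = r_squared_from_sums(sse=60, ssr=540)
r2_154 = r_squared_from_sums(sse=84, ssr=358.12)

print(round(r2_153, 4))             # proportion of variation explained in question 153
print(round(math.sqrt(r2_154), 4))  # |r| in question 154; r itself may be + or -
```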

155.

Except for the values r = -1, 0, and 1, we cannot be specific in our interpretation of the
coefficient of correlation r. However, when we square it we produce a more meaningful
statistic.
ANSWER: T

156.

A zero population correlation coefficient between a pair of random variables means that
there is no linear relationship between the random variables.
ANSWER: T

157.

Given that cov(x, y) = 8, s_y² = 14, s_x² = 10, and n = 6, the value of the sum of squares for
error SSE is 38.
ANSWER: T
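
Worked note for questions 149, 151, and 157: these items can be checked from the covariance and the two sample variances alone. The sketch below assumes the given values 15, 8, 14, and 10 are the sample variances of y and x (the reading under which question 157 yields SSE = 38); the helper names are illustrative.

```python
import math


def sse_from_moments(cov_xy: float, var_x: float, var_y: float, n: int) -> float:
    """SSE = (n - 1) * (s_y^2 - cov(x, y)^2 / s_x^2)."""
    return (n - 1) * (var_y - cov_xy ** 2 / var_x)


def standard_error_from_moments(cov_xy: float, var_x: float, var_y: float, n: int) -> float:
    """Standard error of estimate: sqrt(SSE / (n - 2))."""
    return math.sqrt(sse_from_moments(cov_xy, var_x, var_y, n) / (n - 2))


def r_squared_from_moments(cov_xy: float, var_x: float, var_y: float) -> float:
    """R^2 = cov(x, y)^2 / (s_x^2 * s_y^2)."""
    return cov_xy ** 2 / (var_x * var_y)


print(round(standard_error_from_moments(10, var_x=8, var_y=15, n=12), 3))  # question 149
print(round(r_squared_from_moments(8.5, var_x=10, var_y=8), 3))            # question 151
print(round(sse_from_moments(8, var_x=10, var_y=14, n=6), 3))              # question 157
```

Under this reading, question 149 is false because 2.75 is the value of s², not s.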

158.

A store manager gives a pre-employment examination to new employees. The test is
scored from 1 to 100. He has data on their sales at the end of one year measured in
dollars. He wants to know if there is any linear relationship between pre-employment
examination score and sales. An appropriate test to use is the t test on the population
correlation coefficient.
ANSWER: T

STATISTICAL CONCEPTS & APPLIED QUESTIONS

FOR QUESTIONS 159 THROUGH 164, USE THE FOLLOWING NARRATIVE:


Narrative: Car Speed and Gas Mileage
An economist wanted to analyze the relationship between the speed of a car (x) and its gas
mileage (y). As an experiment a car is operated at several different speeds and for each speed the
gas mileage is measured. These data are shown below.
Speed         25  35  45  50  60  65  70
Gas Mileage   40  39  37  33  30  27  25

159.

{Car Speed and Gas Mileage Narrative} Calculate the standard error of estimate, and
describe what this statistic tells you about the regression line.
ANSWER:
s = 1.448; the model's fit to these data is good.

160.

{Car Speed and Gas Mileage Narrative} Do these data provide sufficient evidence at the
5% significance level to infer that a linear relationship exists between higher speeds and
lower gas mileage?
ANSWER:
H0: ρ = 0 vs. H1: ρ ≠ 0
Rejection region: | t | > t0.025,5 = 2.571
Test statistic: t = -9.754
Conclusion: Reject the null hypothesis. Yes, these data provide sufficient evidence at the
5% significance level to infer that a linear relationship exists between higher speeds and
lower gas mileage.

161.

{Car Speed and Gas Mileage Narrative} Predict with 99% confidence the gas mileage of
a car traveling 55 mph.
ANSWER:
31.236 ± 6.284. Thus, LCL = 24.952, and UCL = 37.52

162.

{Car Speed and Gas Mileage Narrative} Calculate the Pearson coefficient of correlation.
ANSWER:
r = -0.975

163.

{Car Speed and Gas Mileage Narrative} What does the coefficient of correlation tell you
about the direction and strength of the relationship between the two variables?
ANSWER:
There is a very strong negative linear relationship between car speed and gas mileage.

164.

{Car Speed and Gas Mileage} Calculate the coefficient of determination and interpret its
value.
ANSWER:
R² = 0.95. This means that 95% of the total variation in gas mileage can be explained by
the speed of the car.
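
Worked note for questions 159 through 164: the answers above can be reproduced from the seven data pairs with the standard least squares formulas. The sketch below is an illustration using only the Python standard library; the helper name and the hard-coded critical value (t with 0.005 in the upper tail and n - 2 = 5 degrees of freedom, taken from a t table) are assumptions of this note, not part of the original answer key.

```python
import math

speed = [25, 35, 45, 50, 60, 65, 70]
mileage = [40, 39, 37, 33, 30, 27, 25]


def fit_simple_regression(x, y):
    """Least squares fit of y on x, returning the usual summary statistics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = syy - sxy ** 2 / sxx
    s = math.sqrt(sse / (n - 2))          # standard error of estimate
    return {
        "n": n, "x_bar": x_bar, "sxx": sxx, "b0": b0, "b1": b1, "s": s,
        "t": b1 / (s / math.sqrt(sxx)),   # t statistic for the slope
        "r": sxy / math.sqrt(sxx * syy),  # Pearson coefficient of correlation
        "r2": sxy ** 2 / (sxx * syy),     # coefficient of determination
    }


fit = fit_simple_regression(speed, mileage)

# 99% prediction interval for one car traveling 55 mph (question 161)
T_CRIT = 4.032  # assumed table value: t(0.005, 5 df)
x_g = 55
y_hat = fit["b0"] + fit["b1"] * x_g
half_width = T_CRIT * fit["s"] * math.sqrt(
    1 + 1 / fit["n"] + (x_g - fit["x_bar"]) ** 2 / fit["sxx"]
)

for name in ("s", "t", "r", "r2"):
    print(name, round(fit[name], 3))
print("prediction:", round(y_hat, 3), "+/-", round(half_width, 3))
```

With unrounded coefficients the point prediction comes out slightly below the 31.236 in the key; the difference is rounding in the coefficients.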

165.

The following 10 observations of variables x and y were collected.


x    1   2   3   4   5   6   7   8   9  10
y   25  22  21  19  14  15  12  10   6   2

a. Calculate the standard error of estimate.
b. Test to determine if there is enough evidence at the 5% significance level to indicate
that x and y are negatively linearly related.
c. Calculate the coefficient of correlation, and describe what this statistic tells you about
the regression line.
ANSWER:
a. s = 1.322
b. H0: β1 = 0 vs. H1: β1 < 0
Rejection region: t < -t0.05,8 = -1.860
Test statistic: t = -16.402
Conclusion: Reject the null hypothesis. Yes, there is enough evidence at the 5%
significance level to indicate that x and y are negatively linearly related.
c. r = -0.9854. This indicates a very strong negative linear relationship between the two
variables.
166.

Consider the following data values of variables x and y.


x    2   4   6   8  10  13
y    7  11  17  21  27  36

a. Calculate the coefficient of determination, and describe what this statistic tells you
about the relationship between the two variables.
b. Calculate the Pearson coefficient of correlation. What sign does it have? Why?
c. What does the coefficient of correlation calculated tell you about the direction and
strength of the relationship between the two variables?
ANSWER:
a. R² = 0.995. This means that 99.5% of the variation in the dependent variable y is
explained by the variation in the independent variable x.
b. r = 0.9975. It is positive since the slope of the regression line is positive.
c. There is a very strong (almost perfect) positive linear relationship between the two
variables.

FOR QUESTIONS 167 THROUGH 171, USE THE FOLLOWING NARRATIVE:


Narrative: Sunshine and Skin Cancer
A medical statistician wanted to examine the relationship between the amount of sunshine (x)
and incidence of skin cancer (y). As an experiment he found the number of skin cancers detected
per 100,000 of population and the average daily sunshine in eight counties around the country.
These data are shown below.
Average Daily Sunshine     5   7   6   7   8   6   4   3
Skin Cancer per 100,000    7  11   9  12  15  10   7   5

167.

{Sunshine and Skin Cancer Narrative} Calculate the standard error of estimate, and
describe what this statistic tells you about the regression line.
ANSWER:
s = 0.9608; the model's fit to these data is good.

168.

{Sunshine and Skin Cancer Narrative} Can we conclude at the 1% significance level that
there is a linear relationship between sunshine and skin cancer?
ANSWER:
H0: ρ = 0 vs. H1: ρ ≠ 0
Rejection region: | t | > t0.005,6 = 3.707
Test statistic: t = 8.485
Conclusion: Reject the null hypothesis. Yes, we conclude at the 1% significance level that
there is a linear relationship between sunshine and skin cancer.

169.

{Sunshine and Skin Cancer Narrative} Calculate the coefficient of determination and
interpret it.
ANSWER:
R² = 0.9231. This means that 92.31% of the variation in the incidence of skin cancer is
explained by the variation in the amount of sunshine.

170.

{Sunshine and Skin Cancer Narrative} Calculate the Pearson coefficient. What sign does
it have? Why?
ANSWER:
r = 0.9608. It is positive since the slope of the regression line (b1 = 1.846) is positive.

171.

{Sunshine and Skin Cancer Narrative} What does the coefficient of correlation calculated
tell you about the direction and strength of the relationship between the two variables?
ANSWER:
There is a very strong (almost perfect) positive linear relationship between the two
variables.
FOR QUESTIONS 172 THROUGH 177, USE THE FOLLOWING NARRATIVE:

Narrative: Sales and Experience
The general manager of a chain of furniture stores believes that experience is the most important
factor in determining the level of success of a salesperson. To examine this belief she records last
month's sales (in $1,000s) and the years of experience of 10 randomly selected salespeople.
These data are listed below.
Salesperson   Years of Experience   Sales
1             0                     7
2             2                     9
3             10                    20
4             3                     15
5             8                     18
6             5                     14
7             12                    20
8             7                     17
9             20                    30
10            15                    25

172.

{Sales and Experience Narrative} Determine the standard error of estimate and describe
what this statistic tells you about the regression line.
ANSWER:
s = 1.5724; the model's fit is good.

173.

{Sales and Experience Narrative} Determine the coefficient of determination and discuss
what its value tells you about the two variables.
ANSWER:
R² = 0.9536, which means that 95.36% of the variation in sales is explained by the
variation in years of experience of the salesperson.

174.

{Sales and Experience Narrative} Calculate the Pearson correlation coefficient. What
sign does it have? Why?
ANSWER:
r = 0.9765. It has a positive sign since the slope of the regression line (b1 = 1.0817) is
positive.

175.

{Sales and Experience Narrative} Conduct a test of the population coefficient of
correlation to determine at the 5% significance level whether a linear relationship exists
between years of experience and sales.
ANSWER:
H0: ρ = 0 vs. H1: ρ ≠ 0
Rejection region: | t | > t0.025,8 = 2.306
Test statistic: t = 12.8258
Conclusion: Reject the null hypothesis. Yes, a linear relationship exists between years of
experience and sales.

176.

{Sales and Experience Narrative} Conduct a test of the population slope to determine at
the 5% significance level whether a linear relationship exists between years of experience
and sales.
ANSWER:

H0: β1 = 0 vs. H1: β1 ≠ 0
Rejection region: | t | > t0.025,8 = 2.306
Test statistic: t = 12.8258
Conclusion: Reject the null hypothesis. Yes, a linear relationship exists between years of
experience and sales.
177.

{Sales and Experience Narrative} Do the tests of ρ and β1 in the previous two questions
provide the same results? Explain.
ANSWER:
Yes; both tests have the same value of the test statistic, the same rejection region, and of
course the same conclusion. This is not a coincidence; the two tests are identical.
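
Worked note for questions 175 through 177: the identity between the two tests can be checked numerically. The sketch below computes the t statistic once from the slope, t = b1 / s_b1, and once from the correlation, t = r √[(n - 2) / (1 - r²)], using the Sales and Experience data; the function and variable names are chosen for this note only.

```python
import math

experience = [0, 2, 10, 3, 8, 5, 12, 7, 20, 15]
sales = [7, 9, 20, 15, 18, 14, 20, 17, 30, 25]


def slope_and_correlation_t(x, y):
    """Return (t from the slope test, t from the correlation test)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx                                   # least squares slope
    s = math.sqrt((syy - sxy ** 2 / sxx) / (n - 2))  # standard error of estimate
    r = sxy / math.sqrt(sxx * syy)                   # sample correlation

    t_slope = b1 / (s / math.sqrt(sxx))
    t_corr = r * math.sqrt((n - 2) / (1 - r ** 2))
    return t_slope, t_corr


t_slope, t_corr = slope_and_correlation_t(experience, sales)
print(round(t_slope, 4), round(t_corr, 4))
```

Both expressions reduce algebraically to the same quantity, which is why the two tests always share the same statistic, degrees of freedom, and conclusion.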

FOR QUESTIONS 178 THROUGH 183, USE THE FOLLOWING NARRATIVE:


Narrative: Income and Education
A professor of economics wants to study the relationship between income (y in $1000s) and
education (x in years). A random sample of eight individuals is taken and the results are shown
below.
Education   16  11  15   8  12  10  13  14
Income      58  40  55  35  43  41  52  49

178.

{Income and Education Narrative} Determine the standard error of estimate and describe
what this statistic tells you about the regression line.
ANSWER:
s = 2.436; the model's fit to these data is good.

179.

{Income and Education Narrative} Determine the coefficient of determination and
discuss what its value tells you about the two variables.
ANSWER:
R² = 0.9223, which means that 92.23% of the variation in income is explained by the
variation in years of education.

180.

{Income and Education Narrative} Calculate the Pearson correlation coefficient. What
sign does it have? Why?
ANSWER:
r = 0.9604. It has a positive sign since the slope of the regression line (b1 = 2.9098) is
positive.

181.

{Income and Education Narrative} Conduct a test of the population coefficient of
correlation to determine at the 5% significance level whether a linear relationship exists
between years of education and income.
ANSWER:
H0: ρ = 0 vs. H1: ρ ≠ 0
Rejection region: | t | > t0.025,6 = 2.447
Test statistic: t = 8.439
Conclusion: Reject the null hypothesis. Yes, a linear relationship exists between years of
education and income.

182.

{Income and Education Narrative} Conduct a test of the population slope to determine at
the 5% significance level whether a linear relationship exists between years of education
and income.
ANSWER:
H0: β1 = 0 vs. H1: β1 ≠ 0
Rejection region: | t | > t0.025,6 = 2.447
Test statistic: t = 8.439
Conclusion: Reject the null hypothesis. Yes, a linear relationship exists between years of
education and income.

183.

{Income and Education Narrative} Do the tests of ρ and β1 in the previous two questions provide
the same results? Explain.
ANSWER:
Yes; both tests have the same value of the test statistic, the same rejection region, and of
course the same conclusion. This is not a coincidence; the two tests are identical.

FOR QUESTIONS 184 THROUGH 189, USE THE FOLLOWING NARRATIVE:


Narrative: Game Winnings and Education
An ardent fan of television game shows has observed that, in general, the more educated the
contestant, the less money he or she wins. To test her belief she gathers data about the last eight
winners of her favorite game show. She records their winnings in dollars and the number of years
of education. The results are as follows.
Contestant   Years of Education   Winnings
1            11                   750
2            15                   400
3            12                   600
4            16                   350
5            11                   800
6            16                   300
7            13                   650
8            14                   400

184.

{Game Winnings and Education Narrative} Determine the standard error of estimate and
describe what this statistic tells you about the regression line.
ANSWER:
s = 59.395; the model's fit to these data is good.

185.

{Game Winnings and Education Narrative} Determine the coefficient of determination
and discuss what its value tells you about the two variables.
ANSWER:
R² = 0.9185, which means that 91.85% of the variation in TV game shows winnings is
explained by the variation in years of education.

186.

{Game Winnings and Education Narrative} Calculate the Pearson correlation coefficient.
What sign does it have? Why?
ANSWER:
r = -0.9584. It has a negative sign since the slope of the regression line (b1 = -89.1667)
is negative.

187.

{Game Winnings and Education Narrative} Conduct a test of the population coefficient
of correlation to determine at the 5% significance level whether a linear relationship
exists between years of education and TV game shows winnings.
ANSWER:
H0: ρ = 0 vs. H1: ρ ≠ 0
Rejection region: | t | > t0.025,6 = 2.447
Test statistic: t = -8.2227
Conclusion: Reject the null hypothesis. Yes, a linear relationship exists between years of
education and TV game shows winnings.
188.

{Game Winnings and Education Narrative} Conduct a test of the population slope to
determine at the 5% significance level whether a linear relationship exists between years
of education and TV game shows winnings.
ANSWER:

H0: β1 = 0 vs. H1: β1 ≠ 0
Rejection region: | t | > t0.025,6 = 2.447
Test statistic: t = -8.2227
Conclusion: Reject the null hypothesis. Yes, a linear relationship exists between years of
education and TV game shows winnings.
189.

{Game Winnings and Education Narrative} Do the tests of ρ and β1 in the previous two
questions provide the same results? Explain.
ANSWER:
Yes. This is not a coincidence; the two tests are identical.

FOR QUESTIONS 190 THROUGH 195, USE THE FOLLOWING NARRATIVE:


Narrative: Movie Revenues
A financier whose specialty is investing in movie productions has observed that, in general,
movies with big-name stars seem to generate more revenue than those movies whose stars are
less well known. To examine his belief he records the gross revenue and the payment (in $
millions) given to the two highest-paid performers in the movie for ten recently released movies.
Movie   Cost of Two Highest-Paid Performers   Gross Revenue
1       5.3                                   48
2       7.2                                   65
3       1.3                                   18
4       1.8                                   20
5       3.5                                   31
6       2.6                                   26
7       8.0                                   73
8       2.4                                   23
9       4.5                                   39
10      6.7                                   58

190.

{Movie Revenues Narrative} Determine the standard error of estimate and describe what
this statistic tells you about the regression line.
ANSWER:
s = 2.0247; the model's fit to these data is good.

191.

{Movie Revenues Narrative} Determine the coefficient of determination and discuss
what its value tells you about the two variables.
ANSWER:
R² = 0.9908, which means that 99.08% of the variation in gross revenue is explained by
the variation in payment to the two highest-paid performers.

192.

{Movie Revenues Narrative} Calculate the Pearson correlation coefficient. What sign
does it have? Why?
ANSWER:
r = 0.9954. It has a positive sign since the slope of the regression line (b1 = 8.285) is
positive.

193.

{Movie Revenues Narrative} Conduct a test of the population coefficient of correlation to
determine at the 5% significance level whether a linear relationship exists between
payment to the two highest-paid performers and gross revenue.
ANSWER:
H0: ρ = 0 vs. H1: ρ ≠ 0
Rejection region: | t | > t0.025,8 = 2.306
Test statistic: t = 29.304
Conclusion: Reject the null hypothesis. Yes, a linear relationship exists between payment
to the two highest-paid performers and gross revenue.

194.

{Movie Revenues Narrative} Conduct a test of the population slope to determine at the
5% significance level whether a linear relationship exists between payment to the two
highest-paid performers and gross revenue.
ANSWER:

H0: β1 = 0 vs. H1: β1 ≠ 0
Rejection region: | t | > t0.025,8 = 2.306
Test statistic: t = 29.304
Conclusion: Reject the null hypothesis. Yes, a linear relationship exists between payment
to the two highest-paid performers and gross revenue.
195.

{Movie Revenues Narrative} Do the ρ and β1 tests in the previous questions provide the
same results? Explain.
ANSWER:
Yes; both tests have the same value of the test statistic, the same rejection region, and of
course the same conclusion. This is not a coincidence; the two tests are identical.

FOR QUESTIONS 196 AND 197, USE THE FOLLOWING NARRATIVE:
Narrative: Cost of Books
The editor of a major academic book publisher claims that a large part of the cost of books is the
cost of paper. This implies that larger books will cost more money. As an experiment to analyze
the claim, a university student visits the bookstore and records the number of pages and the
selling price of twelve randomly selected books. These data are listed below.
Book   Number of Pages   Selling Price ($)
1      844               55
2      727               50
3      360               35
4      915               60
5      295               30
6      706               50
7      410               40
8      905               53
9      1058              65
10     865               54
11     677               42
12     912               58

196.

{Cost of Books Narrative} Determine the coefficient of determination and discuss what
its value tells you.
ANSWER:
R² = 0.9378, which means that 93.78% of the variation in the price of books is explained
by the variation in the number of pages.

197.

{Cost of Books Narrative} Can we infer at the 5% significance level that the editor is
correct?
ANSWER:

H0: β1 = 0 vs. H1: β1 ≠ 0
Rejection region: | t | > t0.025,10 = 2.228
Test statistic: t = 12.2814
Conclusion: Reject the null hypothesis. Yes, we can infer at the 5% significance level that
the editor is correct.

FOR QUESTIONS 198 THROUGH 202, USE THE FOLLOWING NARRATIVE:
Narrative: Automobile Accidents and Precipitation
A statistician investigating the relationship between the amount of precipitation (in inches) and
the number of automobile accidents gathered data for 10 randomly selected days. The results
are shown below.

Day   Precipitation   Number of Accidents
1     0.05            5
2     0.12            6
3     0.05            2
4     0.08            4
5     0.10            8
6     0.35            14
7     0.15            7
8     0.30            13
9     0.10            7
10    0.20            10

198.

{Automobile Accidents and Precipitation Narrative} Calculate the standard error of
estimate, and describe what this statistic tells you about the regression line.
ANSWER:
s = 1.3207; the model's fit to these data is good.

199.

{Automobile Accidents and Precipitation Narrative} Determine the coefficient of
determination and discuss what its value tells you about the two variables.
ANSWER:
R² = 0.893, which means that 89.3% of the variation in the number of accidents is
explained by the variation in the amount of precipitation.

200.

{Automobile Accidents and Precipitation Narrative} Conduct a test of the population
slope to determine whether these data allow us to conclude at the 10% significance level
that the amount of precipitation and the number of accidents are linearly related.
ANSWER:

H0: β1 = 0 vs. H1: β1 ≠ 0
Rejection region: | t | > t0.05,8 = 1.860
Test statistic: t = 8.1709
Conclusion: Reject the null hypothesis. Yes, these data allow us to conclude at the 10%
significance level that the amount of precipitation and the number of accidents are
linearly related.

201.

{Automobile Accidents and Precipitation Narrative} Conduct a test of the population
coefficient of correlation to determine whether these data allow us to conclude at the 10%
significance level that the amount of precipitation and the number of accidents are
linearly related.
ANSWER:

H0: ρ = 0 vs. H1: ρ ≠ 0
Rejection region: | t | > t0.05,8 = 1.860
Test statistic: t = 8.1709
Conclusion: Reject the null hypothesis. Yes, these data allow us to conclude at the 10%
significance level that the amount of precipitation and the number of accidents are
linearly related.
202.

{Automobile Accidents and Precipitation Narrative} Do the β1 and ρ tests in the
previous two questions provide the same results? Explain.
ANSWER:
Yes, the two tests are identical to each other.

FOR QUESTIONS 203 THROUGH 208, USE THE FOLLOWING NARRATIVE:


Narrative: Willie Nelson Concert
At a recent Willie Nelson concert, a survey was conducted that asked a random sample of 20
people their age and how many concerts they have attended since the first of the year. The
following data were collected:
Age                  62  57  40  49  67  54  43  65  54  41
Number of Concerts    6   5   4   3   5   5   2   6   3   1

Age                  44  48  55  60  59  63  69  40  38  52
Number of Concerts    3   2   4   5   4   5   4   2   1   3

An Excel output follows:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.80203
R Square             0.64326
Adjusted R Square    0.62344
Standard Error       0.93965
Observations         20

ANOVA
             df    SS          MS          F           Significance F
Regression    1    28.65711    28.65711    32.45653    2.1082E-05
Residual     18    15.89289     0.88294
Total        19    44.55

             Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept    -3.01152       1.18802          -2.53491   0.02074   -5.50746    -0.5156
Age           0.12569       0.02206           5.69706   0.00002    0.07934     0.1720

DESCRIPTIVE STATISTICS

                      Age        Concerts
Mean                  53         3.65
Standard Error        2.1849     0.3424
Standard Deviation    9.7711     1.5313
Sample Variance       95.4737    2.3447
Count                 20         20

SPEARMAN RANK CORRELATION COEFFICIENT = 0.8306

203.

{Willie Nelson Concert Narrative} Determine the standard error of estimate and describe
what this statistic tells you about the model's fit.

ANSWER:
s = 0.9396, and since the sample mean ȳ = 3.65, we would have to admit that the
standard error of estimate is not very small. On the other hand, it is not a large number
either. Because there is no predefined upper limit on s, it is difficult in this problem to
assess the model in this way. However, using other criteria, it seems that the model's fit to
these data is reasonable.

204.

{Willie Nelson Concert Narrative} Determine the coefficient of determination and
discuss what its value tells you about the two variables.
ANSWER:
R² = 0.64326, which means that 64.326% of the variation in number of concerts attended
is explained by the variation in age of the attendees.

205.

{Willie Nelson Concert Narrative} Calculate the Pearson correlation coefficient. What
sign does it have? Why?
ANSWER:
r = 0.80203. It has a positive sign since the slope of the regression line, b1, is positive.

206.

{Willie Nelson Concert Narrative} Conduct a test of the population coefficient of
correlation to determine at the 5% significance level whether a linear relationship exists
between age and number of concerts attended.
ANSWER:
H0: ρ = 0 vs. H1: ρ ≠ 0
Rejection region: | t | > t0.025,18 = 2.101
Test statistic: t = r √[(n - 2) / (1 - r²)] = 5.6971
Conclusion: Reject the null hypothesis. Yes, a linear relationship exists between age and
number of concerts attended.
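
Worked note for question 206: the test statistic can be recovered from the printed summary values alone, since t = r √[(n - 2) / (1 - r²)]. The sketch below plugs in Multiple R and the number of observations from the Excel output, and also compares the result with the square root of the ANOVA F statistic, which equals |t| in simple linear regression. The function name is illustrative.

```python
import math


def t_from_correlation(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Values read from the Excel output for the concert data
multiple_r = 0.80203
observations = 20
f_statistic = 32.45653

t_stat = t_from_correlation(multiple_r, observations)
print(round(t_stat, 3))
print(round(math.sqrt(f_statistic), 3))
```

Small differences in the last decimal places come from Multiple R being printed to only five decimals.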

207.

{Willie Nelson Concert Narrative} Conduct a test of the population slope to determine at
the 5% significance level whether a linear relationship exists between age and number of
concerts attended.
ANSWER:

H0: β1 = 0 vs. H1: β1 ≠ 0
Rejection region: | t | > t0.025,18 = 2.101
Test statistic: t = 5.6971
Conclusion: Reject the null hypothesis. Yes, we can infer at the 5% significance level
that a linear relationship exists between age and number of concerts attended.
208.

{Willie Nelson Concert Narrative} Do the ρ and β1 tests in the previous two questions
provide the same results? Explain.
ANSWER:
Yes; both tests have the same value of the test statistic, the same rejection region, and of
course the same conclusion. This is not a coincidence; the two tests are identical.
FOR QUESTIONS 209 THROUGH 214, USE THE FOLLOWING NARRATIVE:
Narrative: Oil Quality and Price
Quality of oil is measured in API gravity degrees: the higher the degrees API, the higher the
quality. The table shown below is produced by an expert in the field who believes that there is a
relationship between quality and price per barrel.
Oil degrees API   Price per barrel (in $)
27.0              12.02
28.5              12.04
30.8              12.32
31.3              12.27
31.9              12.49
34.5              12.70
34.0              12.80
34.7              13.00
37.0              13.00
41.0              13.17
41.0              13.19
38.8              13.22
39.3              13.27

A partial statistical software output follows:

Descriptive Statistics
Variable    N     Mean      StDev    SE Mean
Degrees     13    34.60     4.613    1.280
Price       13    12.730    0.457    0.127

Covariances
            Degrees      Price
Degrees     21.281667
Price        2.026750    0.208833

Regression Analysis
Predictor    Coef        StDev       T        P
Constant     9.4349      0.2867      32.91    0.000
Degrees      0.095235    0.008220    11.59    0.000

S = 0.1314    R-Sq = 92.46%    R-Sq(adj) = 91.7%

Analysis of Variance
Source            DF    SS        MS        F         P
Regression         1    2.3162    2.3162    134.24    0.000
Residual Error    11    0.1898    0.0173
Total             12    2.5060

209.

{Oil Quality and Price Narrative} Determine the standard error of estimate and describe
what this statistic tells you.
ANSWER:
$s_\varepsilon = 0.1314$. Since the sample mean $\bar{y} = 12.73$, the standard error of estimate is judged to
be small, and we may say that the model fits the data well.

210.

{Oil Quality and Price Narrative} Determine the coefficient of determination and discuss
what its value tells you about the two variables.
ANSWER:
$R^2 = 0.9246$, which means that 92.46% of the variation in the oil price per barrel is
explained by the variation in the API degrees.

211.

{Oil Quality and Price Narrative} Calculate the Pearson correlation coefficient. What
sign does it have? Why?
ANSWER:
$r = 0.9616$. It has a positive sign since the slope of the regression line, $b_1$, is positive.

212.

{Oil Quality and Price Narrative} Conduct a test of the population coefficient of
correlation to determine at the 5% significance level whether a linear relationship exists
between the quality of oil and price per barrel.
ANSWER:
$H_0: \rho = 0$ vs. $H_1: \rho \neq 0$
Rejection region: $|t| > t_{0.025,11} = 2.201$
Test statistic: $t = r \sqrt{(n - 2) / (1 - r^2)} = 11.61$
Conclusion: Reject the null hypothesis. Yes, we can infer at the 5% significance level
that a linear relationship exists between the quality of oil and price per barrel.

213.

{Oil Quality and Price Narrative} Conduct a test of the population slope to determine at
the 5% significance level whether a linear relationship exists between the quality of oil
and price per barrel.
ANSWER:
$H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$
Rejection region: $|t| > t_{0.025,11} = 2.201$
Test statistic: t = 11.59 (from Minitab output)
Conclusion: Reject the null hypothesis. Yes, we can infer at the 5% significance level that
a linear relationship exists between the quality of oil and price per barrel.
214.

{Oil Quality and Price Narrative} Do the $\rho$ and $\beta_1$ tests in the previous two questions
provide the same results? Explain.
ANSWER:
Yes; both tests have the same value of the test statistic (the small difference between
11.61 and 11.59 is due to rounding in Minitab output), the same rejection region, and of
course the same conclusion. This is not a coincidence; the two tests are identical.
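
The agreement between the two test statistics can be checked numerically. The sketch below assumes only the summary values printed in the regression output for the oil data (R-Sq, the slope coefficient and its standard deviation, and n = 13); the variable names are illustrative.

```python
import math

# Summary values copied from the regression output for the oil data.
n = 13
r_squared = 0.9246   # R-Sq
b1 = 0.095235        # slope coefficient (Degrees)
s_b1 = 0.008220      # standard deviation of the slope coefficient

# t statistic for H0: rho = 0; r is the positive root because the slope is positive.
r = math.sqrt(r_squared)
t_rho = r * math.sqrt((n - 2) / (1 - r_squared))

# t statistic for H0: beta_1 = 0.
t_beta = b1 / s_b1

print(f"t (correlation test) = {t_rho:.2f}")
print(f"t (slope test)       = {t_beta:.2f}")
```

Both values come out near 11.6; the remaining gap reflects the rounding of the printed inputs.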


SECTION 6
MULTIPLE CHOICE QUESTIONS
In the following multiple-choice questions, please circle the correct answer.
215.

In order to estimate with 95% confidence the expected value of y for a given value of x
in a simple linear regression problem, a random sample of 10 observations is taken.
Which of the following t-table values listed below would be used?
a. 2.228
b. 2.306
c. 1.860
d. 1.812
ANSWER: b

216.

Given a specific value of x and confidence level, which of the following statements is
correct?
a. The confidence interval estimate of the expected value of y can be calculated but the
prediction interval of y for the given value of x cannot be calculated.
b. The confidence interval estimate of the expected value of y will be wider than the
prediction interval.
c. The prediction interval of y for the given value of x can be calculated but the
confidence interval estimate of the expected value of y cannot be calculated.
d. The confidence interval estimate of the expected value of y will be narrower than the
prediction interval.
ANSWER: d

217.

In order to predict with 90% confidence the expected value of y for a given value of x in a
simple linear regression problem, a random sample of 10 observations is taken. Which of
the following t-table values listed below would be used?
a. 2.228
b. 2.306
c. 1.860
d. 1.812
ANSWER: c

218.

The confidence interval estimate of the expected value of y for a given value of x,
compared to the prediction interval of y for the same given value of x and confidence
level, will be
a. wider
b. narrower
c. the same
d. impossible to know
ANSWER: b

219.

In order to predict with 99% confidence the expected value of y for a given value of x in a
simple linear regression problem, a random sample of 10 observations is taken. Which of
the following t-table values listed below would be used?
a. 1.860
b. 2.306
c. 2.896
d. 3.355
ANSWER: d

220.

The width of the confidence interval estimate for the predicted value of y depends on
a. the standard error of the estimate
b. the value of x for which the prediction is being made
c. the sample size
d. All of the above
ANSWER: d

221.

In order to predict with 80% confidence the expected value of y for a given value of x in a
simple linear regression problem, a random sample of 15 observations is taken. Which of
the following t-table values listed below would be used?
a. 1.350
b. 1.771
c. 2.160
d. 2.650
ANSWER: a

222.

In order to predict with 98% confidence the expected value of y for a given value of x in a
simple linear regression problem, a random sample of 15 observations is taken. Which of
the following t-table values listed below would be used?
a. 1.350
b. 1.771
c. 2.160
d. 2.650
ANSWER: d

TRUE / FALSE QUESTIONS

223.

In developing a 95% confidence interval for the expected value of y from a simple linear
regression problem involving a sample of size 10, the appropriate table value would be
1.86.
ANSWER: F

224.

In developing an 80% prediction interval for the particular value of y from a simple linear
regression problem involving a sample of size 12, the appropriate table value would be
1.372.
ANSWER: T

225.

In developing a 90% prediction interval for the particular value of y from a simple linear
regression problem involving a sample of size 14, the appropriate table value would be
2.179.
ANSWER: F

226.

In order to predict with 95% confidence a particular value of y for a given value of x in
a simple linear regression problem, a random sample of 20 observations is taken. The
appropriate table value that would be used is 2.101.
ANSWER: T

227.

The confidence interval estimate of the expected value of y will be narrower than the
prediction interval for the same given value of x and confidence level. This is because
there is less error in estimating a mean value as opposed to predicting an individual value.
ANSWER: T

228.

The confidence interval estimate of the expected value of y will be wider than the
prediction interval for the same given value of x and confidence level. This is because
there is more error in estimating a mean value as opposed to predicting an individual
value.
ANSWER: F

229.

In developing a 90% confidence interval for the expected value of y from a simple linear
regression problem involving a sample of size 15, the appropriate table value would be
1.761.
ANSWER: F

230.

In developing a 99% confidence interval for the expected value of y from a simple linear
regression problem involving a sample of size 25, the appropriate table value would be
2.807
ANSWER: T

231.

The prediction interval for a particular value of y is always wider than the confidence
interval for the mean value of y, given the same data set, x value, and confidence level.
ANSWER: T

BASIC TECHNIQUES & APPLIED QUESTIONS
232.

A medical statistician wanted to examine the relationship between the amount of sunshine
(x) and incidence of skin cancer (y). As an experiment he found the number of skin
cancers detected per 100,000 of population and the average daily sunshine in eight
counties around the country. These data are shown below.
Average Daily Sunshine     5    7    6    7    8    6    4    3
Skin Cancer per 100,000    7   11    9   12   15   10    7    5

Predict with 95% confidence the skin cancers per 100,000 in a county with a daily
average of 6.5 hours of sunshine.
ANSWER:
10.884 ± 2.525. Thus, LCL = 8.359 and UCL = 13.409
FOR QUESTIONS 233 THROUGH 235, USE THE FOLLOWING NARRATIVE:
Narrative: Sales and Experience
The general manager of a chain of furniture stores believes that experience is the most important
factor in determining the level of success of a salesperson. To examine this belief she records last
month's sales (in $1,000s) and the years of experience of 10 randomly selected salespeople.
These data are listed below.
Salesperson    Years of Experience    Sales
1              0                      7
2              2                      9
3              10                     20
4              3                      15
5              8                      18
6              5                      14
7              12                     20
8              7                      17
9              20                     30
10             15                     25

233.

{Sales and Experience Narrative} Predict with 95% confidence the monthly sales of a
salesperson with 10 years of experience.
ANSWER:
19.447 ± 3.819. Thus, LCL = 15.628 (in $1000s) and UCL = 23.266 (in $1000s)

234.

{Sales and Experience Narrative} Estimate with 95% confidence the average monthly
sales of all salespersons with 10 years of experience.
ANSWER:
19.447 ± 1.199. Thus, LCL = 18.248 (in $1000s) and UCL = 20.646 (in $1000s)

235.

{Sales and Experience Narrative} Which interval in the previous two questions is
narrower: the confidence interval estimate of the expected value of y or the prediction
interval for the same given value of x (10 years) and same confidence level? Why?
ANSWER:
The confidence interval estimate of the expected value of y is narrower than the
prediction interval for the same given value of x (10 years) and same confidence level.
This is because there is less error in estimating a mean value as opposed to predicting an
individual value.
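
The two margins reported for questions 233 and 234 come from the same formula and differ only by the extra 1 under the square root in the prediction interval. The following is a minimal sketch that recomputes them from the raw Sales and Experience data; the critical value 2.306 (8 degrees of freedom) is read from a t table rather than computed, and all variable names are illustrative.

```python
import math

# Sales and Experience data from the narrative.
x = [0, 2, 10, 3, 8, 5, 12, 7, 20, 15]      # years of experience
y = [7, 9, 20, 15, 18, 14, 20, 17, 30, 25]  # sales in $1,000s
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of estimate, with n - 2 degrees of freedom.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))

t_crit = 2.306   # t_{0.025, 8}, taken from a t table
x_g = 10         # given value of x
y_hat = b0 + b1 * x_g
dist = (x_g - x_bar) ** 2 / s_xx

# Prediction interval (individual value) vs. confidence interval (mean value).
pi_margin = t_crit * s_e * math.sqrt(1 + 1 / n + dist)
ci_margin = t_crit * s_e * math.sqrt(1 / n + dist)

print(f"y_hat = {y_hat:.3f}")
print(f"prediction interval margin = {pi_margin:.3f}")
print(f"confidence interval margin = {ci_margin:.3f}")
```

Because the prediction interval carries the additional 1 under the square root, its margin is larger than the confidence interval margin for any x and any confidence level.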

FOR QUESTIONS 236 THROUGH 238, USE THE FOLLOWING NARRATIVE:


Narrative: Income and Education
A professor of economics wants to study the relationship between income (y in $1000s) and
education (x in years). A random sample of eight individuals is taken and the results are shown
below.
Education    16    11    15     8    12    10    13    14
Income       58    40    55    35    43    41    52    49

236.

{Income and Education Narrative} Predict with 95% confidence the income of an
individual with 10 years of education.
ANSWER:
39.715 ± (2.447)(2.710) = 39.715 ± 6.631. Thus, LCL = 33.084 (in $1000s) and UCL = 46.346 (in $1000s)

237.

{Income and Education Narrative} Estimate with 95% confidence the average income of
all individuals with 10 years of education.
ANSWER:
39.715 ± (2.447)(1.188) = 39.715 ± 2.907. Thus, LCL = 36.808 (in $1000s) and UCL = 42.622 (in $1000s)

238.

{Income and Education Narrative} Which interval in the previous two questions is
narrower: the confidence interval estimate of the expected value of y or the prediction
interval for the same given value of x (10 years) and same confidence level? Why?
ANSWER:
The confidence interval estimate of the expected value of y is narrower than the
prediction interval for the same given value of x (10 years) and same confidence level.
This is because there is less error in estimating a mean value as opposed to predicting an
individual value.

FOR QUESTIONS 239 THROUGH 242, USE THE FOLLOWING NARRATIVE:
Narrative: Game Winnings and Education
An ardent fan of television game shows has observed that, in general, the more educated the
contestant, the less money he or she wins. To test her belief she gathers data about the last eight
winners of her favorite game show. She records their winnings in dollars and the number of years
of education. The results are as follows.
Contestant    Years of Education    Winnings
1             11                    750
2             15                    400
3             12                    600
4             16                    350
5             11                    800
6             16                    300
7             13                    650
8             14                    400

239.

{Game Winnings and Education Narrative} Predict with 95% confidence the winnings of a
contestant who has 15 years of education.
ANSWER:
397.500 ± 159.213. Thus, LCL = $238.287 and UCL = $556.713

240.

{Game Winnings and Education Narrative} Predict with 95% confidence the winnings of a
contestant who has 10 years of education.
ANSWER:
843.333 ± 179.971. Thus, LCL = $663.362 and UCL = $1023.304

241.

{Game Winnings and Education Narrative} Estimate with 95% confidence the average winnings of all
contestants who have 15 years of education.
ANSWER:
397.500 ± 64.998. Thus, LCL = $332.502 and UCL = $462.498

242.

{Game Winnings and Education Narrative} Estimate with 95% confidence the average winnings of all
contestants who have 10 years of education.
ANSWER:
843.333 ± 106.141. Thus, LCL = $737.192 and UCL = $949.474

FOR QUESTIONS 243 THROUGH 245, USE THE FOLLOWING NARRATIVE:

Narrative: Movie Revenues


A financier whose specialty is investing in movie productions has observed that, in general,
movies with big-name stars seem to generate more revenue than those movies whose stars are
less well known. To examine his belief he records the gross revenue and the payment (in $
millions) given to the two highest-paid performers in the movie for ten recently released movies.
Movie    Cost of Two Highest-Paid Performers    Gross Revenue
1        5.3                                    48
2        7.2                                    65
3        1.3                                    18
4        1.8                                    20
5        3.5                                    31
6        2.6                                    26
7        8.0                                    73
8        2.4                                    23
9        4.5                                    39
10       6.7                                    58

243.

{Movie Revenues Narrative} Predict with 95% confidence the gross revenue of a movie
whose top two stars earn $5.0 million.
ANSWER:
45.65 ± 4.916. Thus, LCL = 40.734 (in $1,000,000s) and UCL = 50.566 (in
$1,000,000s)

244.

{Movie Revenues Narrative} Estimate with 95% confidence the average gross revenue of
a movie whose top two stars earn $5.0 million.
ANSWER:
45.65 ± 1.54. Thus, LCL = 44.11 (in $1,000,000s) and UCL = 47.19 (in $1,000,000s)

245.

{Movie Revenues Narrative} Which interval in the previous two questions is narrower:
the confidence interval estimate of the expected value of y or the prediction interval for
the same given value of x ($5.0 million) and same confidence level? Why?
ANSWER:
The confidence interval estimate of the expected value of y is narrower than the
prediction interval for the same given value of x ($5.0 million) and same confidence level.
This is because there is less error in estimating a mean value as opposed to predicting an
individual value.

FOR QUESTIONS 246 THROUGH 248, USE THE FOLLOWING NARRATIVE:


Narrative: Cost of Books
The editor of a major academic book publisher claims that a large part of the cost of books is the
cost of paper. This implies that larger books will cost more money. As an experiment to analyze
the claim, a university student visits the bookstore and records the number of pages and the
selling price of twelve randomly selected books. These data are listed below.
Book    Number of Pages    Selling Price ($)
1       844                55
2       727                50
3       360                35
4       915                60
5       295                30
6       706                50
7       410                40
8       905                53
9       1058               65
10      865                54
11      677                42
12      912                58

246.

{Cost of Books Narrative} Predict with 90% confidence the selling price of a book with
900 pages.
ANSWER:
56.647 ± 5.311. Thus, LCL = $51.336 and UCL = $61.958

247.

{Cost of Books Narrative} Estimate with 90% confidence the average selling price of all
books with 900 pages.
ANSWER:
56.647 ± 1.803. Thus, LCL = $54.844 and UCL = $58.450

248.

{Cost of Books Narrative} Which interval in the previous two questions is narrower: the
confidence interval estimate of the expected value of y or the prediction interval for the
same given value of x (900 pages) and same confidence level? Why?
ANSWER:
The confidence interval estimate of the expected value of y is narrower than the
prediction interval for the same given value of x (900 pages) and same confidence level.
This is because there is less error in estimating a mean value as opposed to predicting an
individual value.

FOR QUESTIONS 249 THROUGH 251, USE THE FOLLOWING NARRATIVE:


Narrative: Automobile Accidents and Precipitation
A statistician investigating the relationship between the amount of precipitation (in inches) and
the number of automobile accidents gathered data for 10 randomly selected days. The results
are shown below.
Day    Precipitation    Number of Accidents
1      0.05             5
2      0.12             6
3      0.05             2
4      0.08             4
5      0.10             8
6      0.35             14
7      0.15             7
8      0.30             13
9      0.10             7
10     0.20             10

249.

{Automobile Accidents and Precipitation Narrative} Predict with 95% confidence the
number of accidents that occur when there is 0.40 inches of rain.
ANSWER:
16.316 ± 4.032. Thus, LCL = 12.284 and UCL = 20.348

250.

{Automobile Accidents and Precipitation Narrative} Estimate with 95% confidence the
average daily number of accidents when the daily precipitation is 0.25 inches.
ANSWER:
11.086 ± 1.377. Thus, LCL = 9.709 and UCL = 12.463

251.

{Automobile Accidents and Precipitation Narrative} Which interval in the previous two
questions is narrower: the confidence interval estimate of the expected value of y or the
prediction interval for the same given value of x and same confidence level?
Why?
ANSWER:
The confidence interval estimate of the expected value of y is narrower than the
prediction interval for the same given value of x and same confidence level.
This is because there is less error in estimating a mean value as opposed to predicting an
individual value.

FOR QUESTIONS 252 THROUGH 254, USE THE FOLLOWING NARRATIVE:


Narrative: Willie Nelson Concert
At a recent Willie Nelson concert, a survey was conducted that asked a random sample of 20
people their age and how many concerts they have attended since the first of the year. The
following data were collected:
Age                   62  57  40  49  67  54  43  65  54  41
Number of Concerts     6   5   4   3   5   5   2   6   3   1

Age                   44  48  55  60  59  63  69  40  38  52
Number of Concerts     3   2   4   5   4   5   4   2   1   3

An Excel output follows:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.80203
R Square             0.64326
Adjusted R Square    0.62344
Standard Error       0.93965
Observations         20

DESCRIPTIVE STATISTICS
                      Age        Concerts
Mean                  53         3.65
Standard Error        2.1849     0.3424
Standard Deviation    9.7711     1.5313
Sample Variance       95.4737    2.3447
Count                 20         20

SPEARMAN RANK CORRELATION COEFFICIENT = 0.8306

ANOVA
              df    SS          MS          F           Significance F
Regression    1     28.65711    28.65711    32.45653    2.1082E-05
Residual      18    15.89289    0.88294
Total         19    44.55

             Coefficients    Standard Error    t Stat      P-value    Lower 95%    Upper 95%
Intercept    -3.01152        1.18802           -2.53491    0.02074    -5.50746     -0.5156
Age          0.12569         0.02206           5.69706     0.00002    0.07934      0.1720

252.


{Willie Nelson Concert Narrative} Predict with 95% confidence the number of concerts
attended by a 45-year-old individual.
ANSWER:
2.645 ± 2.057. Thus, LCL = 0.588 and UCL = 4.702

253.

{Willie Nelson Concert Narrative} Estimate with 95% confidence the average number of
concerts attended by all 45-year-old individuals.
ANSWER:
2.645 ± 0.577. Thus, LCL = 2.068 and UCL = 3.222

254.

{Willie Nelson Concert Narrative} Which interval in the previous two questions is
narrower: the confidence interval estimate of the expected value of y or the prediction
interval for the same given value of x (45 years) and same confidence level? Why?
ANSWER:
The confidence interval estimate of the expected value of y is narrower than the
prediction interval for the same given value of x (45 years) and same confidence level.
This is because there is less error in estimating a mean value as opposed to predicting an
individual value.

FOR QUESTIONS 255 THROUGH 257, USE THE FOLLOWING NARRATIVE:


Narrative: Oil Quality and Price
Quality of oil is measured in API gravity degrees: the higher the degrees API, the higher the
quality. The table shown below is produced by an expert in the field who believes that there is a
relationship between quality and price per barrel.
Oil degrees API    Price per barrel (in $)
27.0               12.02
28.5               12.04
30.8               12.32
31.3               12.27
31.9               12.49
34.5               12.70
34.0               12.80
34.7               13.00
37.0               13.00
41.0               13.17
41.0               13.19
38.8               13.22
39.3               13.27

A partial Minitab output follows:

Descriptive Statistics
Variable    N     Mean      StDev    SE Mean
Degrees     13    34.60     4.613    1.280
Price       13    12.730    0.457    0.127

Covariances
           Degrees      Price
Degrees    21.281667
Price      2.026750     0.208833

Regression Analysis
Predictor   Coef        StDev       T        P
Constant    9.4349      0.2867      32.91    0.000
Degrees     0.095235    0.008220    11.59    0.000

S = 0.1314    R-Sq = 92.46%    R-Sq(adj) = 91.7%

Analysis of Variance
Source            DF    SS        MS        F         P
Regression        1     2.3162    2.3162    134.24    0.000
Residual Error    11    0.1898    0.0173
Total             12    2.5060

255.

{Oil Quality and Price Narrative} Predict with 95% confidence the oil price per barrel for
an API degree of 35.
ANSWER:
12.768 ± (2.201)(0.1314)(1.038) = 12.768 ± 0.300. Thus, LCL = 12.468 and UCL =
13.068

256.

{Oil Quality and Price Narrative} Estimate with 95% confidence the average oil price per
barrel for an API degree of 35.
ANSWER:
12.768 ± (2.201)(0.1314)(0.2785) = 12.768 ± 0.081. Thus, LCL = 12.687 and UCL =
12.849

257.

{Oil Quality and Price Narrative} Which interval in the previous two questions is
narrower: the confidence interval estimate of the expected value of y or the prediction
interval for the same given value of x (35 degrees API) and same confidence level? Why?
ANSWER:
The confidence interval estimate of the expected value of y is narrower than the
prediction interval for the same given value of x (35 degrees API) and same confidence level.
This is because there is less error in estimating a mean value as opposed to predicting an
individual value.
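
The factors 1.038 and 0.2785 used in questions 255 and 256 can be recovered from the printed summary statistics alone. The sketch below assumes the values shown in the Minitab output (n, the mean and variance of Degrees, S, and the coefficients) and the table value 2.201 for 11 degrees of freedom; variable names are illustrative.

```python
import math

# Summary values copied from the Minitab output for the oil data.
n = 13
x_bar = 34.60             # mean of Degrees
var_x = 21.281667         # sample variance of Degrees (from the covariance table)
s_e = 0.1314              # standard error of estimate (S)
b0, b1 = 9.4349, 0.095235
t_crit = 2.201            # t_{0.025, 11}, taken from a t table

s_xx = (n - 1) * var_x    # sum of squared deviations of x
x_g = 35.0                # given API degree
y_hat = b0 + b1 * x_g
dist = (x_g - x_bar) ** 2 / s_xx

pi_factor = math.sqrt(1 + 1 / n + dist)   # prediction interval factor
ci_factor = math.sqrt(1 / n + dist)       # confidence interval factor

pi_margin = t_crit * s_e * pi_factor
ci_margin = t_crit * s_e * ci_factor

print(f"y_hat = {y_hat:.3f}")
print(f"PI factor = {pi_factor:.3f}, margin = {pi_margin:.3f}")
print(f"CI factor = {ci_factor:.4f}, margin = {ci_margin:.3f}")
```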


SECTION 7
MULTIPLE CHOICE QUESTIONS
In the following multiple-choice questions, please circle the correct answer.
258.

The standardized residual is defined as:
a. residual divided by the standard error of estimate
b. residual multiplied by the square root of the standard error of estimate
c. residual divided by the square of the standard error of estimate
d. residual multiplied by the standard error of estimate
ANSWER: a

259.

The least squares method requires that the variance $\sigma_\varepsilon^2$ of the error variable $\varepsilon$ is a
constant no matter what the value of x is. When this requirement is violated, the
condition is called:
a. non-independence of $\varepsilon$
b. homoscedasticity
c. heteroscedasticity
d. influential observation
ANSWER: c

260.

When the variance $\sigma_\varepsilon^2$ of the error variable $\varepsilon$ is a constant no matter what the value of x
is, this condition is called:
a. homocausality
b. heteroscedasticity
c. homoscedasticity
d. heterocausality
ANSWER: c

261.

If the plot of the residuals is fan shaped, which assumption of regression analysis is
violated?
a. Normality
b. Homoscedasticity
c. Independence of errors
d. No assumptions are violated, the graph should resemble a fan
ANSWER: b

262.

In regression analysis we use the Spearman rank correlation coefficient to measure and
test to determine whether a relationship exists between the two variables if
a. one or both variables may be ordinal
b. both variables are interval but the normality requirement is not met
c. both (a) and (b)
d. neither (a) nor (b)
ANSWER: c

263.

The sample Spearman rank correlation coefficient, where a and b are the ranks of x and y,
respectively, is given by
a. rs cov a, b / sa / sb
b. rs cov a, b / sa sb
c. rs cov a, b / sa sb
d. $r_s = \mathrm{cov}(a, b) / (s_a s_b)$
ANSWER: d
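
As an illustration of the definition, the sample Spearman coefficient is simply the Pearson formula applied to the ranks a and b. The following is a minimal sketch using only the Python standard library; the helper names `average_ranks` and `spearman` are hypothetical, and assigning tied values the average of their ranks is an assumption about the tie-handling convention.

```python
import statistics

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """r_s = cov(a, b) / (s_a * s_b), where a and b are the ranks of x and y."""
    a, b = average_ranks(x), average_ranks(y)
    a_bar, b_bar = statistics.mean(a), statistics.mean(b)
    cov_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)
    return cov_ab / (statistics.stdev(a) * statistics.stdev(b))

# A perfectly monotone increasing relationship gives r_s = 1,
# a perfectly monotone decreasing one gives r_s = -1.
print(spearman([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
print(spearman([1, 2, 3], [3, 2, 1]))
```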

TRUE / FALSE QUESTIONS

264.

The variance $\sigma_\varepsilon^2$ of the error variable $\varepsilon$ is required to be constant. When this requirement is
satisfied, the condition is called homoscedasticity.
ANSWER: T

265.

The variance $\sigma_\varepsilon^2$ of the error variable $\varepsilon$ is required to be constant. When this requirement is
violated, the condition is called heteroscedasticity.
ANSWER: T

266.

We standardize residuals in the same way we standardize all variables, by subtracting the
mean and dividing by the variance.
ANSWER: F

267.

An outlier is an observation that is unusually small or unusually large.


ANSWER: T

268.

One method of diagnosing heteroscedasticity is to plot the residuals against the predicted
values of y, then look for a change in the spread of the plotted values.
ANSWER: T

269.

Regardless of the value of x, the standard deviation of the distribution of y values about
the regression line is the same. This assumption of equal standard deviations about the
regression line is called residual analysis.
ANSWER: F

270.

Data that exhibit an autocorrelation effect violate the regression assumption of
independence.
ANSWER: T

271.

When n is greater than 30, the sample Spearman rank correlation coefficient rs is
approximately normally distributed with mean of 0 and standard deviation of 1.
ANSWER: F

272.

Given that n = 37, and the value of the sample Spearman rank correlation coefficient $r_s$ =
0.35, the value of the test statistic for testing $H_0: \rho_s = 0$ is z = 2.10.
ANSWER: T
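
The value z = 2.10 follows from the large-sample result that, under the null hypothesis, $r_s$ is approximately normal with mean 0 and standard deviation $1/\sqrt{n-1}$. A minimal sketch of the arithmetic, with illustrative variable names:

```python
import math

n = 37
r_s = 0.35

# Under H0, r_s is approximately normal with mean 0 and
# standard deviation 1 / sqrt(n - 1) when n is large.
sd_rs = 1 / math.sqrt(n - 1)

# Standardizing r_s gives z = r_s * sqrt(n - 1).
z = r_s / sd_rs

print(f"standard deviation of r_s = {sd_rs:.4f}")
print(f"z = {z:.2f}")
```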

273.

Another name for Pearson coefficient of correlation is the Spearman rank correlation
coefficient.
ANSWER: F

STATISTICAL CONCEPTS & APPLIED QUESTIONS
FOR QUESTIONS 274 THROUGH 278, USE THE FOLLOWING NARRATIVE:
Narrative: Sales and Experience
The general manager of a chain of furniture stores believes that experience is the most important
factor in determining the level of success of a salesperson. To examine this belief she records last
month's sales (in $1,000s) and the years of experience of 10 randomly selected salespeople.
These data are listed below.
Salesperson    Years of Experience    Sales
1              0                      7
2              2                      9
3              10                     20
4              3                      15
5              8                      18
6              5                      14
7              12                     20
8              7                      17
9              20                     30
10             15                     25

274.

{Sales and Experience Narrative} Use the regression equation $\hat{y} = 8.63 + 1.0817x$ to
determine the predicted values of y.
ANSWER:
$\hat{y}$: 8.630, 10.793, 19.447, 11.875, 17.284, 14.039, 21.610, 16.202, 30.264, and 24.856

275.

{Sales and Experience Narrative} Use the predicted and actual values of y to calculate the
residuals.
ANSWER:
$r_i$: -1.630, -1.793, 0.553, 3.125, 0.716, -0.039, -1.610, 0.798, -0.264, and 0.144

276.

{Sales and Experience Narrative} Plot the residuals against the predicted values of y.
Does the variance appear to be constant?
ANSWER:
[Figure: Residuals versus Predicted; vertical axis: Residuals, horizontal axis: Predicted Values]

It appears that heteroscedasticity is not a problem.


277.

{Sales and Experience Narrative} Compute the standardized residuals.


ANSWER:
-1.100, -1.210, 0.373, 2.108, 0.483, -0.026, -1.086, 0.538, -0.178, and 0.097

278.

{Sales and Experience Narrative} Identify possible outliers.


ANSWER:
The point (3, 15) is a possible outlier since its standardized residual 2.108 exceeds 2.0.
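
The standardized residuals and the outlier check can be reproduced from the raw data. This sketch divides each residual by the sample standard deviation of the residuals (n - 1 divisor), which is an assumption: it is the convention that reproduces the value 2.108 quoted above, whereas dividing by the standard error of estimate gives slightly different values. Variable names are illustrative.

```python
import statistics

# Sales and Experience data from the narrative.
x = [0, 2, 10, 3, 8, 5, 12, 7, 20, 15]      # years of experience
y = [7, 9, 20, 15, 18, 14, 20, 17, 30, 25]  # sales in $1,000s
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual = actual value minus predicted value.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Assumed convention: divide by the sample standard deviation of the residuals.
s_r = statistics.stdev(residuals)
standardized = [r / s_r for r in residuals]

# Flag observations whose standardized residual exceeds 2.0 in absolute value.
flagged = [(xi, yi) for xi, yi, z in zip(x, y, standardized) if abs(z) > 2.0]

print([round(r, 3) for r in residuals])
print([round(z, 3) for z in standardized])
print(flagged)
```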

FOR QUESTIONS 279 THROUGH 283, USE THE FOLLOWING NARRATIVE:


Narrative: Income and Education
A professor of economics wants to study the relationship between income (y in $1000s) and
education (x in years). A random sample of eight individuals is taken and the results are shown
below.
Education    16    11    15     8    12    10    13    14
Income       58    40    55    35    43    41    52    49

279.

{Income and Education Narrative} Use the regression equation $\hat{y} = 10.6165 + 2.9098x$ to
determine the predicted values of y.
ANSWER:
$\hat{y}$: 57.173, 42.624, 54.263, 33.895, 45.534, 39.714, 48.444, and 51.353

280.

{Income and Education Narrative} Use the predicted and actual values of y to calculate
the residuals.
ANSWER:
$r_i$: 0.827, -2.624, 0.737, 1.105, -2.534, 1.286, 3.556, and -2.353.

281.

{Income and Education Narrative} Plot the residuals against the predicted values of y.
Does the variance appear to be constant?
ANSWER:
[Figure: Residuals versus Predicted; vertical axis: Residuals, horizontal axis: Predicted Values]

It appears that heteroscedasticity is not a problem.


282.

{Income and Education Narrative} Compute the standardized residuals.


ANSWER:
0.367, -1.164, 0.327, 0.490, -1.124, 0.570, 1.577, and -1.044

283.

{Income and Education Narrative} Identify possible outliers.


ANSWER:
No outliers exist, since no observation has a standardized residual whose absolute value
exceeds 2.0.

FOR QUESTIONS 284 THROUGH 288, USE THE FOLLOWING NARRATIVE:


Narrative: Game Winnings and Education
An ardent fan of television game shows has observed that, in general, the more educated the
contestant, the less money he or she wins. To test her belief she gathers data about the last eight
winners of her favorite game show. She records their winnings in dollars and the number of years
of education. The results are as follows.


Contestant    Years of Education    Winnings
1             11                    750
2             15                    400
3             12                    600
4             16                    350
5             11                    800
6             16                    300
7             13                    650
8             14                    400

284.

{Game Winnings and Education Narrative} Use the regression equation
$\hat{y} = 1735 - 89.1667x$ to determine the predicted values of y.
ANSWER:
$\hat{y}$: 754.167, 397.500, 665.000, 308.333, 754.167, 308.333, 575.833, and 486.667

285.

{Game Winnings and Education Narrative} Use the predicted and actual values of y to
calculate the residuals.
ANSWER:
$r_i$: -4.167, 2.500, -65.000, 41.667, 45.833, -8.333, 74.167, and -86.667

286.

{Game Winnings and Education Narrative} Plot the residuals against the predicted values
of y. Does the variance appear to be constant?
ANSWER:
[Figure: Residuals versus Predicted; vertical axis: Residuals, horizontal axis: Predicted Values]

The variance appears to be constant.

287.

{Game Winnings and Education Narrative} Compute the standardized residuals.


ANSWER:
The standardized residuals are: -0.076, 0.045, -1.182, 0.758, 0.833, -0.152, 1.349, and
-1.576.

288.

{Game Winnings and Education Narrative} Identify possible outliers.


ANSWER:
No outliers exist, since no observation has a standardized residual whose absolute value
exceeds 2.0.

FOR QUESTIONS 289 THROUGH 293, USE THE FOLLOWING NARRATIVE:


Narrative: Movie Revenues
A financier whose specialty is investing in movie productions has observed that, in general,
movies with big-name stars seem to generate more revenue than those movies whose stars are
less well known. To examine his belief he records the gross revenue and the payment (in $
millions) given to the two highest-paid performers in the movie for ten recently released movies.
Movie    Cost of Two Highest-Paid Performers    Gross Revenue
1        5.3                                    48
2        7.2                                    65
3        1.3                                    18
4        1.8                                    20
5        3.5                                    31
6        2.6                                    26
7        8.0                                    73
8        2.4                                    23
9        4.5                                    39
10       6.7                                    58

289.

{Movie Revenues Narrative} Use the regression equation $\hat{y} = 4.225 + 8.285x$ to determine
the predicted values of y.
ANSWER:
$\hat{y}$: 48.137, 63.878, 14.996, 19.139, 33.223, 25.767, 70.506, 24.110, 41.508, and 59.736.

290.

{Movie Revenues Narrative} Use the predicted and actual values of y to calculate the
residuals.
ANSWER:
$r_i$: -0.137, 1.122, 3.004, 0.861, -2.223, 0.233, 2.494, -1.110, -2.508, and -1.736

291.

{Movie Revenues Narrative} Plot the residuals against the predicted values of y. Does the
variance appear to be constant?
ANSWER:
[Figure: Residuals versus Predicted; vertical axis: Residuals, horizontal axis: Predicted Values]

It appears that heteroscedasticity is not a problem.


292.

{Movie Revenues Narrative} Compute the standardized residuals.


ANSWER:
The standardized residuals are: -0.072, 0.588, 1.574, 0.451, -1.165, 0.122, 1.306, -0.581,
-1.314, and -0.909.

293.

{Movie Revenues Narrative} Identify possible outliers.


ANSWER:
No outliers exist, since no observation has a standardized residual whose absolute value
exceeds 2.0.

FOR QUESTIONS 294 THROUGH 301, USE THE FOLLOWING NARRATIVE:


Narrative: Willie Nelson Concert
At a recent Willie Nelson concert, a survey was conducted that asked a random sample of 20
people their age and how many concerts they have attended since the first of the year. The
following data were collected:
Age                   62  57  40  49  67  54  43  65  54  41
Number of Concerts     6   5   4   3   5   5   2   6   3   1

Age                   44  48  55  60  59  63  69  40  38  52
Number of Concerts     3   2   4   5   4   5   4   2   1   3

An Excel output follows:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.80203
R Square             0.64326
Adjusted R Square    0.62344
Standard Error       0.93965
Observations         20

DESCRIPTIVE STATISTICS
                      Age        Concerts
Mean                  53         3.65
Standard Error        2.1849     0.3424
Standard Deviation    9.7711     1.5313
Sample Variance       95.4737    2.3447
Count                 20         20

SPEARMAN RANK CORRELATION COEFFICIENT = 0.8306

ANOVA
              df    SS          MS          F           Significance F
Regression    1     28.65711    28.65711    32.45653    2.1082E-05
Residual      18    15.89289    0.88294
Total         19    44.55

             Coefficients    Standard Error    t Stat      P-value    Lower 95%    Upper 95%
Intercept    -3.01152        1.18802           -2.53491    0.02074    -5.50746     -0.5156
Age          0.12569         0.02206           5.69706     0.00002    0.07934      0.1720

294.

{Willie Nelson Concert Narrative} Use the regression equation $\hat{y} = -3.0115 + 0.1257x$ to
determine the predicted values of y.
ANSWER:
The predicted values $\hat{y}$ are:
4.781  4.153  2.016  3.147  5.410  3.776  2.393  5.158  3.776  2.142
2.519  3.022  3.901  4.530  4.404  4.907  5.661  2.016  1.765  3.524

295.

{Willie Nelson Concert Narrative} Use the predicted values and the actual values of y to
calculate the residuals.
ANSWER:
The residuals $r = y - \hat{y}$ are:
 1.219   0.847   1.984  -0.147  -0.410   1.224  -0.393   0.842  -0.776  -1.142
 0.481  -1.022   0.099   0.470  -0.404   0.093  -1.661  -0.016  -0.765  -0.524

296.

{Willie Nelson Concert Narrative} Plot the residuals against the predicted values of y.

160

Chapter Seventeen

ANSWER:
[Figure: Residuals versus Predicted. Vertical axis: Residuals (-2.000 to 3.000); horizontal axis: Predicted.]
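
The predicted values and residuals paired in this plot can be recomputed from the raw survey data. The sketch below is a minimal illustration in Python, assuming the ordinary least squares formulas (slope = Sxy / Sxx, intercept = ȳ - slope × x̄); the variable names are illustrative and are not part of the Excel output.

```python
# Survey data from the narrative: age (x) and number of concerts (y).
ages = [62, 57, 40, 49, 67, 54, 43, 65, 54, 41,
        44, 48, 55, 60, 59, 63, 69, 40, 38, 52]
concerts = [6, 5, 4, 3, 5, 5, 2, 6, 3, 1,
            3, 2, 4, 5, 4, 5, 4, 2, 1, 3]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(concerts) / n

# Least squares estimates: slope = Sxy / Sxx, intercept = y_bar - slope * x_bar.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, concerts))
s_xx = sum((x - x_bar) ** 2 for x in ages)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Predicted value and residual for each observation, in data order.
predicted = [intercept + slope * x for x in ages]
residuals = [y - y_hat for y, y_hat in zip(concerts, predicted)]

print(f"y-hat = {intercept:.4f} + {slope:.4f} x")
for x, y_hat, r in zip(ages, predicted, residuals):
    print(f"{x:3d} {y_hat:6.3f} {r:7.3f}")
```

The fitted coefficients should agree with the Excel output (about -3.0115 and 0.1257) up to rounding, and the residuals should sum to zero apart from floating-point error.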

297.

{Willie Nelson Concert Narrative} Does it appear that heteroscedasticity is a problem?
Explain.
ANSWER:
The variance of the error variable appears to be constant; therefore heteroscedasticity is
not a problem.
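
One rough way to back up this visual judgment is to compare the spread of the residuals at low and high predicted values. The split-sample comparison below is a heuristic chosen here for illustration, not a method stated in the text; it uses the predicted values and residuals listed in questions 294 and 295, and all names are illustrative.

```python
import statistics

# Predicted values and residuals as listed in questions 294 and 295.
predicted = [4.781, 4.153, 2.016, 3.147, 5.410, 3.776, 2.393, 5.158, 3.776, 2.142,
             2.519, 3.022, 3.901, 4.530, 4.404, 4.907, 5.661, 2.016, 1.765, 3.524]
residuals = [1.219, 0.847, 1.984, -0.147, -0.410, 1.224, -0.393, 0.842, -0.776, -1.142,
             0.481, -1.022, 0.099, 0.470, -0.404, 0.093, -1.661, -0.016, -0.765, -0.524]

# Order the residuals by predicted value and split them into two halves.
ordered = [r for _, r in sorted(zip(predicted, residuals), key=lambda p: p[0])]
half = len(ordered) // 2
low_sd = statistics.stdev(ordered[:half])
high_sd = statistics.stdev(ordered[half:])

# A ratio near 1 is consistent with constant variance; a ratio far from 1 is not.
ratio = low_sd / high_sd
print(f"low half SD = {low_sd:.3f}, high half SD = {high_sd:.3f}, ratio = {ratio:.2f}")
```

With only 20 observations this is an informal check that supplements the residual plot rather than replacing it.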

298.

{Willie Nelson Concert Narrative} Draw a histogram of the residuals.


ANSWER:

[Figure: Histogram of the residuals. Vertical axis: Frequency (0 to 10); horizontal axis: Residuals.]

299.

{Willie Nelson Concert Narrative} Does it appear that the errors are normally
distributed? Explain.

ANSWER:
The histogram is positively skewed. The errors may not be normally distributed.
300.

{Willie Nelson Concert Narrative} Use the residuals to compute the standardized
residuals.
ANSWER:
The standardized residuals r / s are:
1.297 0.902 2.111 -0.157 -0.436 1.303 -0.418 0.896 -0.826 -1.215
0.512 -1.087 0.105 0.500 -0.430 0.099 -1.768 -0.017 -0.814 -0.558

301.

{Willie Nelson Concert Narrative} Identify possible outliers.


ANSWER:
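Observation 3 (age 40, 4 concerts) has a standardized residual of 2.111, whose absolute
value exceeds 2.0, so it is a possible outlier. None of the other 19 observations has a
standardized residual whose absolute value exceeds 2.0.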
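
The standardization in question 300 and the 2.0 cutoff can be checked mechanically. This is a minimal sketch, assuming the residuals listed in question 295 and the standard error s = 0.93965 from the Excel output; the names `standardized` and `flagged` are illustrative.

```python
# Residuals from question 295, in data order, and s from the Excel output.
residuals = [1.219, 0.847, 1.984, -0.147, -0.410, 1.224, -0.393, 0.842, -0.776, -1.142,
             0.481, -1.022, 0.099, 0.470, -0.404, 0.093, -1.661, -0.016, -0.765, -0.524]
s = 0.93965

# Standardized residual = residual divided by the standard error of estimate.
standardized = [r / s for r in residuals]

# Keep (observation number, standardized residual) pairs beyond the cutoff.
flagged = [(i, round(z, 3)) for i, z in enumerate(standardized, start=1) if abs(z) > 2.0]
print(flagged)
```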

FOR QUESTIONS 302 THROUGH 309, USE THE FOLLOWING NARRATIVE:


Narrative: Oil Quality and Price
Quality of oil is measured in API gravity degrees: the higher the degrees API, the higher the
quality. The table shown below is produced by an expert in the field who believes that there is a
relationship between quality and price per barrel.
Oil degrees API    Price per barrel (in $)
27.0               12.02
28.5               12.04
30.8               12.32
31.3               12.27
31.9               12.49
34.5               12.70
34.0               12.80
34.7               13.00
37.0               13.00
41.0               13.17
41.0               13.19
38.8               13.22
39.3               13.27

A partial Minitab output follows:

Descriptive Statistics
Variable    N     Mean      StDev    SE Mean
Degrees     13    34.60     4.613    1.280
Price       13    12.730    0.457    0.127

Covariances
            Degrees      Price
Degrees     21.281667
Price       2.026750     0.208833

Regression Analysis
Predictor   Coef        StDev       T        P
Constant    9.4349      0.2867      32.91    0.000
Degrees     0.095235    0.008220    11.59    0.000

S = 0.1314    R-Sq = 92.46%    R-Sq(adj) = 91.7%

Analysis of Variance
Source            DF    SS        MS        F         P
Regression        1     2.3162    2.3162    134.24    0.000
Residual Error    11    0.1898    0.0173
Total             12    2.5060

302.

{Oil Quality and Price Narrative} Use the regression equation ŷ = 9.4349 + 0.095235x to
determine the predicted values of y.
ANSWER:
The predicted values ŷ are: 12.006, 12.149, 12.368, 12.416, 12.473, 12.721, 12.673,
12.740, 12.959, 13.340, 13.340, 13.130, and 13.178.
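
As a cross-check, the coefficients of the regression equation can be recovered from the covariances and means in the Minitab output. The sketch assumes the usual least squares identities b1 = cov(x, y) / var(x) and b0 = ȳ - b1 × x̄; the variable names are illustrative, and small differences from the printed coefficients are expected because the means are rounded.

```python
# Values read from the Minitab output (covariances and means).
cov_xy = 2.026750   # covariance of Degrees and Price
var_x = 21.281667   # variance of Degrees
x_bar, y_bar = 34.60, 12.730

# Least squares identities for the slope and intercept.
slope = cov_xy / var_x
intercept = y_bar - slope * x_bar

# Degrees API in the order given in the narrative.
degrees = [27.0, 28.5, 30.8, 31.3, 31.9, 34.5, 34.0,
           34.7, 37.0, 41.0, 41.0, 38.8, 39.3]
predicted = [round(intercept + slope * x, 3) for x in degrees]

print(round(slope, 6), round(intercept, 4))
print(predicted)
```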

303.

{Oil Quality and Price Narrative} Use the predicted values and the actual values of y to
calculate the residuals.
ANSWER:
The residuals r = y - ŷ are: 0.014, -0.109, -0.048, -0.146, 0.017, -0.021, 0.127, 0.260,
0.041, -0.170, -0.150, 0.090, and 0.092.
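
These residuals tie back to the Minitab output: their sum of squares should reproduce the Residual Error SS, and the square root of that sum divided by n - 2 should reproduce S. A minimal sketch, using the rounded residuals listed above (so agreement is only to about three decimals); the names are illustrative.

```python
import math

# Residuals as listed in the answer to question 303.
residuals = [0.014, -0.109, -0.048, -0.146, 0.017, -0.021, 0.127,
             0.260, 0.041, -0.170, -0.150, 0.090, 0.092]

n = len(residuals)
sse = sum(r ** 2 for r in residuals)   # sum of squared residuals
s = math.sqrt(sse / (n - 2))           # standard error of estimate, n - 2 degrees of freedom

print(f"SSE = {sse:.4f}, s = {s:.4f}")
```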

304.

{Oil Quality and Price Narrative} Plot the residuals against the predicted values ŷ.
ANSWER:
[Figure: Residuals Versus the Fitted Values (response is Price). Vertical axis: Residual (-0.2 to 0.3); horizontal axis: Fitted Value (12.0 to 13.4).]


305.

{Oil Quality and Price Narrative} Does it appear that heteroscedasticity is a problem?
Explain.
ANSWER:
The variance of the error variable appears to be constant; therefore heteroscedasticity is
not a problem.

306.

{Oil Quality and Price Narrative} Draw a histogram of the residuals.


ANSWER:
[Figure: Histogram of the Residuals (response is Price). Vertical axis: Frequency (0 to 5); horizontal axis: Residual (-0.2 to 0.3).]

307.

{Oil Quality and Price Narrative} Does it appear that the errors are normally distributed?
Explain.

ANSWER:
The histogram is fairly symmetric; therefore we may conclude that the errors are
normally distributed.
308.

{Oil Quality and Price Narrative} Use the residuals to compute the standardized
residuals.
ANSWER:
The standardized residuals r / s are: 0.105, -0.830, -0.366, -1.109, 0.130, -0.156,
0.967, 1.982, 0.315, -1.290, -1.138, 0.685, and 0.703.


309.

{Oil Quality and Price Narrative} Identify possible outliers.


ANSWER:
There are no outliers since none of the 13 observations has a standardized residual whose
absolute value exceeds 2.0. However, observation 8 (34.7 degrees API, $13.00 per barrel),
with a standardized residual of 1.982, is close to the cutoff and may be an outlier.
