
Set of Questions to Choose From: Tentative Exam

Q1. [5 points] Imagine that you are head of personnel at Huge Corp. The CEO keeps getting
other people's mail by mistake, so she asks you to conduct a study of mailroom productivity. You
take a random sample of 27 mailroom employees and gather data on the following variables:

productivity (PROD = letters correctly sorted per minute),
experience (EXP = months of experience in the Huge Corp. mailroom), and
aptitude score (SCORE = score on the test they took when they applied for a job at Huge Corp.)

You generate the following regression equation:


Predicted PROD = 2.0 + 0.5 EXP + 0.2 SCORE

1. Identify the independent variable(s) and the dependent variable(s).

2. Ernest has just been hired, and has an aptitude score of 80. What level of productivity would
you expect from Ernest?

3. How much is Ernest's productivity expected to improve after he gains an additional 6 months
of experience?

4. What is the best interpretation of the intercept term (2.0) in the regression equation?

5. Jack has 3 months more experience than Jill, but Jill's aptitude score is 20 points higher than
Jack's. Who is expected to be more productive?
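
The arithmetic behind parts 2, 3 and 5 can be checked with a short script. This is only a sketch: the coefficients are copied from the equation above, while the function name and the choice to treat a new hire as having EXP = 0 are assumptions made for illustration.

```python
# Coefficients copied from the fitted equation in Q1.
INTERCEPT, B_EXP, B_SCORE = 2.0, 0.5, 0.2


def predicted_prod(exp_months: float, score: float) -> float:
    """Predicted letters correctly sorted per minute (hypothetical helper)."""
    return INTERCEPT + B_EXP * exp_months + B_SCORE * score


# A just-hired employee is assumed to have EXP = 0; SCORE = 80 is given.
ernest_now = predicted_prod(0, 80)

# Expected change after 6 more months of experience, SCORE held fixed.
gain_6_months = predicted_prod(6, 80) - ernest_now

# Jack minus Jill: Jack has +3 months of EXP, Jill has +20 SCORE points.
# A negative value means Jill is predicted to be more productive.
jack_minus_jill = B_EXP * 3 - B_SCORE * 20

print(ernest_now, gain_6_months, jack_minus_jill)
```

Because the model is linear, the 6-month gain depends only on the EXP coefficient and not on the aptitude score used.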

Q2. [3 points] Using total SAT score (SATSUM) and High-school grades (HSGPA) to predict
first-year college grades (FYGPA):

Predicted FYGPA = -0.873 + 0.00144 SATSUM + 0.58 HSGPA

R² = 0.358, residual SD = 0.594

1. Which measure(s) describe how well the two predictors, as a team, predict first-year college
grades?

2. What percentage of the variability in first-year college grades is not explained by SATSUM
and HSGPA?
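
As a quick check for part 2, the unexplained share of variance is the complement of R². The sketch below uses only the R² value reported above; the variable names are illustrative.

```python
r_squared = 0.358  # reported for the two-predictor model in Q2

# R^2 is the share of variance explained, so the remainder is unexplained.
unexplained_share = 1.0 - r_squared

print(f"{unexplained_share:.1%} of the variance in FYGPA is left unexplained")
```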

Q3. [10 points] Below are several regression analyses involving data for the 50 states in the U.S.,
measured in 2001. We will consider several models predicting income per capita (INCOME),
from the following predictor variables:

BA: percentage of state residents with bachelor's degrees (ranging from 15 to 33 percent)
COMMUTE: the average commute time between home and work for the state's residents
(ranging from 16 to 32 minutes)

Model  Regression Equation                              R²     Adj. R²  Residual SD
1      Predicted INCOME = 11345 + 392 COMMUTE           .169   .152     3077
2      Predicted INCOME = 6665 + 585 BA                 .559   .550     2242
3      Predicted INCOME = 3277 + 536 BA + 193 COMMUTE   .596   .579     2168

Detailed output for Model 3:


            Coefficients  Standard Error  t Stat  P-value
Intercept   3277          2390.9          1.4     0.1770
BA          536           76.1            7.0     6.94E-09
COMMUTE     193           92.6            2.1     0.0429

1. Determine the correlation coefficient (r) between INCOME and BA.

2. For the state of Florida, INCOME = $21,557, BA = 22.3, and COMMUTE = 26.2.
Use the best overall model to determine the predicted value of INCOME for Florida, and
whether the residual for Florida is positive or negative.

3. Connecticut's BA score is 4 points higher than New York's BA score, but Connecticut's
COMMUTE score is 7 points lower than New York's COMMUTE score. Based on this
information, which state is predicted to have higher per capita income?

4. What is the best interpretation of the intercept term (3,277) in Model 3?

5. Construct a 95% confidence interval for the additional income associated with each additional
minute of commute time, holding constant the percentage of people with bachelor's degrees, in
Model 3.
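
The calculations in Q3 can be sketched as follows. Two assumptions are made here and are not stated in the exam: Model 3 is taken as the "best overall model" because it has the highest adjusted R² and the lowest residual SD, and the t critical value (about 2.012 for 50 - 3 = 47 degrees of freedom) is read from a t table rather than computed. All names are illustrative.

```python
import math

# Part 1. Model 2 regresses INCOME on BA alone, so |r| = sqrt(R^2),
# and r takes the sign of the slope (585 > 0).
r_income_ba = math.copysign(math.sqrt(0.559), 585)

# Model 3 coefficients and the COMMUTE standard error, from the output above.
B0, B_BA, B_COMMUTE = 3277, 536, 193
SE_COMMUTE = 92.6


def predicted_income(ba: float, commute: float) -> float:
    """Model 3 prediction (assumed here to be the best overall model)."""
    return B0 + B_BA * ba + B_COMMUTE * commute


# Part 2. Florida: residual = actual - predicted.
florida_pred = predicted_income(22.3, 26.2)
florida_resid = 21557 - florida_pred

# Part 3. Connecticut minus New York: BA is +4, COMMUTE is -7.
ct_minus_ny = B_BA * 4 - B_COMMUTE * 7

# Part 5. 95% CI for the COMMUTE slope, df = 50 - 3 = 47.
T_CRIT = 2.012  # assumed value from a t table, not computed here
ci_commute = (B_COMMUTE - T_CRIT * SE_COMMUTE,
              B_COMMUTE + T_CRIT * SE_COMMUTE)

print(round(r_income_ba, 3), round(florida_pred, 1), round(florida_resid, 1))
print(ct_minus_ny, tuple(round(x, 1) for x in ci_commute))
```

A positive `ct_minus_ny` means Connecticut is predicted to have the higher income; an interval that excludes zero agrees with the COMMUTE p-value being below 0.05.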

Q4. [7 points] We will now consider factors predicting the top speed of n = 74 cars. The
relevant variables:
TOPSPEED: Top speed of the car (in miles per hour)
WEIGHT: Vehicle weight (in pounds)
HORSEPOWER: Maximum horsepower of the car's engine

Summaries of two regressions predicting TOPSPEED are below:

Model 1: TOPSPEED regressed on WEIGHT
Predicted TOPSPEED = 85.24 + 0.00839 WEIGHT
R² = .313   Adjusted R² = .304   Residual Standard Deviation = 8.72

Model 2: TOPSPEED regressed on WEIGHT and HORSEPOWER
Predicted TOPSPEED = 96.44 - 0.00792 WEIGHT + 0.349 HORSEPOWER
R² = .987   Adjusted R² = .986   Residual Standard Deviation = 1.22

1. Suppose all we know about a car in the sample is that it weighs 4000 pounds. What would we
predict its top speed to be?

2. Suppose all we know about a car in the sample is that it weighs 4000 pounds and it has 200
horsepower. What would we predict its top speed to be?

3. How much of the variability in TOPSPEED is explained by HORSEPOWER and WEIGHT
together?

4. Test for the incremental effect of adding HORSEPOWER to the regression (that is, moving
from Model 1 to Model 2).
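
One common way to approach parts 1, 2 and 4 is sketched below. The exam does not say which test is intended for part 4; this sketch assumes the partial F-test based on the change in R², with q = 1 added regressor and n - k - 1 = 74 - 2 - 1 = 71 residual degrees of freedom in the full model. Function and variable names are illustrative.

```python
N = 74  # cars in the sample


def model1(weight: float) -> float:
    """TOPSPEED predicted from WEIGHT alone."""
    return 85.24 + 0.00839 * weight


def model2(weight: float, horsepower: float) -> float:
    """TOPSPEED predicted from WEIGHT and HORSEPOWER."""
    return 96.44 - 0.00792 * weight + 0.349 * horsepower


pred_weight_only = model1(4000)
pred_weight_hp = model2(4000, 200)

# Partial F-test for adding HORSEPOWER (assumed approach, see lead-in):
# F = [(R2_full - R2_reduced) / q] / [(1 - R2_full) / (n - k - 1)]
R2_REDUCED, R2_FULL = 0.313, 0.987
Q = 1                 # regressors added
DF_RESID = N - 2 - 1  # full model has k = 2 slopes plus an intercept

f_stat = ((R2_FULL - R2_REDUCED) / Q) / ((1 - R2_FULL) / DF_RESID)

print(round(pred_weight_only, 2), round(pred_weight_hp, 2), round(f_stat, 1))
```

The statistic is compared with the tabulated 5 percent critical value of F(1, 71), which is roughly 4. With a single added regressor this F equals the square of the t statistic on HORSEPOWER in Model 2.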

Q5. [4 points] MULTIPLE CHOICE: Circle the one best answer to each question.

(1) If the error term is heteroskedastic, then the usual formula for the standard error of the least-
squares coefficient estimator is
a. still consistent.
b. inconsistent, with an upward bias.
c. inconsistent, with a downward bias.
d. inconsistent, and may be biased upward or downward.

(2) Under the null hypothesis of no heteroskedasticity, the Goldfeld-Quandt test statistic is close
to:
a. minus one.
b. zero.
c. one.
d. two.
e. four.

(3) Under the null hypothesis of no autocorrelation, the Durbin-Watson test statistic is close to
a. minus one.
b. zero.
c. one.
d. two.
e. four.

(4) Which test for serial correlation is still valid when the regressors in the original equation
include a lagged value of the dependent variable, e.g., y_t = β1 + β2 x_t + β3 y_{t-1} + ε_t ?
a. A t-test from a regression of the least-squares residual on its lag (without an intercept).
b. The Durbin-Watson test.
c. The F-test.
d. None of the above.

Q6. [10 points] The regression equation y = β1 + β2 x + ε was estimated using 80 cross-sectional
observations on countries, by ordinary least squares. To check for heteroskedasticity related to
population, separate regressions were run for the 32 countries with the lowest populations and
the 32 countries with the highest populations. The sum of squared residuals for the low-
population countries was 240. The sum of squared residuals for the high-population countries
was 90.
a. Compute unbiased estimates of the variance of the error term in the two subsamples.

Variance for high-population subsample =

Variance for low-population subsample =

b. Given these results, which subsample appears to lie closer to the true regression line: the
low-population-countries or the high-population countries? Explain your answer.

c. Test the null hypothesis of homoskedasticity, against the (one-sided) alternative
hypothesis that low-population countries have higher error variance, at 5 percent
significance using a Goldfeld-Quandt test. Give the value of the test statistic, the critical
point, and your conclusion (accept or reject the null hypothesis of homoskedasticity).

Value of test statistic =

Critical point =

Conclusion =
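
The Goldfeld-Quandt calculation in Q6 can be sketched as below. Each subsample regression has 32 observations and 2 estimated parameters (intercept and slope), giving 30 degrees of freedom. The critical value of about 1.84 for F(30, 30) at 5 percent is an assumed table lookup, not computed by the code, and the helper name is illustrative.

```python
K = 2  # parameters estimated in each subsample regression


def error_variance(ssr: float, n: int, k: int = K) -> float:
    """Unbiased estimate of the error variance: SSR / (n - k)."""
    return ssr / (n - k)


var_low = error_variance(240, 32)   # low-population subsample
var_high = error_variance(90, 32)   # high-population subsample

# Goldfeld-Quandt statistic: the variance suspected to be larger goes on top.
gq_stat = var_low / var_high

F_CRIT = 1.84  # assumed 5% critical value of F(30, 30), read from a table
reject_homoskedasticity = gq_stat > F_CRIT

print(var_low, var_high, round(gq_stat, 3), reject_homoskedasticity)
```

The subsample with the smaller estimated error variance is the one whose observations lie closer, on average, to the regression line.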

Q7. [5 points] The regression equation y = β1 + β2 x + ε was estimated using 40 time-series
observations. To check for serial correlation, a Durbin-Watson statistic was computed. The value
of the Durbin-Watson statistic turned out to be 1.4.
a. Test the null hypothesis of no serial correlation against the alternative hypothesis of
positive serial correlation at 5 percent significance. Give the critical point(s) and your
conclusion (accept or reject the null hypothesis of no serial correlation).

Critical point(s) =

Conclusion =

b. Compute an estimate of ρ (rho), the serial correlation parameter, based on the value of
the Durbin-Watson statistic.

Estimate of rho =
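
The two parts of Q7 can be sketched as follows. The bounds dL ≈ 1.44 and dU ≈ 1.54 are assumed values from a standard Durbin-Watson table (n = 40, one regressor besides the intercept, 5 percent); they are not given in the exam. The estimate of rho uses the usual approximation d ≈ 2(1 - rho). The function name is illustrative.

```python
def dw_positive_test(d: float, d_lower: float, d_upper: float) -> str:
    """Decision for H0: no serial correlation vs. H1: positive correlation."""
    if d < d_lower:
        return "reject"
    if d > d_upper:
        return "do not reject"
    return "inconclusive"


d = 1.4
D_LOWER, D_UPPER = 1.44, 1.54  # assumed table values (n = 40, 5% level)

decision = dw_positive_test(d, D_LOWER, D_UPPER)

# Since d is approximately 2 * (1 - rho), rho is approximately 1 - d / 2.
rho_hat = 1 - d / 2

print(decision, round(rho_hat, 2))
```
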
Q8. [6 points] Consider the following regression model:
