
MGCR 650 BUSINESS TOOLS

MBAJAPAN
SAMPLE FINAL EXAM
Solutions

Professors: Derek Hart and Philippe Levy

NAME: ___________________________________________ ID: _______________________


(Please print legibly)

INSTRUCTIONS:

1. There are four parts to this examination. Part I is made up of eight multiple-choice and
short answer questions which are worth 3 marks each. Parts II, III, and IV are case studies.

2. Excel printouts are provided for questions 6 and 7 of Part I. These are needed to answer
the above-mentioned questions.

3. The Excel printouts for the statistics case (Part II) are provided as a separate handout.

4. The McGill University Code of Conduct applies.

5. The exam is open-book. Non-text storing calculators are permitted.

6. The exam is 17 pages (including this cover page); the separate handout with Excel
printouts is 8 pages.

7. Write clearly. Any illegible answers will receive a zero. To receive full marks in
Parts II, III, and IV, you must show all work and logic.

8. Answer all questions in the space provided on the exam paper. When answering questions in
Part II, use the Excel printouts to support your comments and conclusions. Page 13 is a blank
page to be used to answer Part III.

9. Both the exam and separate handout must be returned at the end of the exam.

10. GOOD LUCK!

MARKER ONLY

Parts
I.     ______/24
II.    ______/26
III.   ______/30
IV.    ______/20
TOTAL: ______/100


Part I (Statistics)
(Total marks 24)
Time: 45 minutes
The following information is used for questions 1 and 2

The manager of a grocery store has taken a random sample of 100 customers. The average length of
time it took the customers in the sample to check out was 3.1 minutes with a standard deviation of
0.5 minutes. We want to test to determine whether or not the mean waiting time of all customers is
significantly more than 3 minutes.

1. What is the p-value?

a. 0.025
b. 0.0456
c. 0.05
d. 0.0228
e. None of the above answers is correct.

H0: µ ≤ 3    HA: µ > 3
Z = (3.1 - 3) / (0.5/√100) = 0.1/0.05 = 2
P-value = P(Z > 2) = 0.5 - 0.4772 = 0.0228

Answer: d

2. At 95% confidence, it can be concluded that the mean of the population is

a. significantly greater than 3
b. not significantly greater than 3
c. significantly less than 3
d. significantly greater than 0.0228
e. None of the above answers is correct.

p-value = 0.0228 < α = 0.05, so reject H0: the mean is significantly greater than 3.

Answer: a
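
The arithmetic for questions 1 and 2 can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name upper_tail_z_test is an illustrative choice, not part of the exam, and the normal approximation is assumed because n = 100 is large.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(sample_mean, mu0, s, n):
    """Return (z, p_value) for H0: mu <= mu0 versus HA: mu > mu0."""
    z = (sample_mean - mu0) / (s / sqrt(n))
    p_value = 1.0 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value


# Grocery store sample from questions 1 and 2
z, p_value = upper_tail_z_test(sample_mean=3.1, mu0=3.0, s=0.5, n=100)
alpha = 0.05
reject_h0 = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The exact normal tail area is about 0.02275, which rounds to the table value 0.0228 used in the solution.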

3. From a population of 200 elements, a sample of 49 elements is selected. It is determined that the
sample mean is 56 and the sample standard deviation is 14. The standard error of the mean (the
standard deviation of the sampling distribution) is?

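a. 3
b. 2
c. greater than 2
d. less than 2
e. None of the above answers is correct.

Note: n/N = 49/200 = 24.5% > 5%. Therefore the finite population correction factor must be used.

\[ \sigma_{\bar{x}} = \frac{s}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{14}{\sqrt{49}} \sqrt{\frac{200 - 49}{200 - 1}} = 2 \times 0.871 = 1.742 < 2 \]

Answer: d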
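
The finite population correction in question 3 can be sketched as follows. The function name standard_error_fpc and the 5% threshold argument are illustrative assumptions; the threshold mirrors the rule of thumb quoted in the note.

```python
from math import sqrt


def standard_error_fpc(s, n, N, threshold=0.05):
    """Standard error of the mean, applying the finite population
    correction when the sampling fraction n/N exceeds the threshold."""
    se = s / sqrt(n)
    if n / N > threshold:
        se *= sqrt((N - n) / (N - 1))
    return se


# Question 3: N = 200, n = 49, s = 14
se = standard_error_fpc(s=14, n=49, N=200)
print(round(se, 3))
```

Without the correction the standard error would be exactly 2 (option b); the correction is what pushes it below 2.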

4. The business manager of a local health clinic is interested in estimating the difference between the
fees for extended office visits in her center and the fees of a newly opened group practice. She
gathered the following information regarding the two offices using Excel.

                          Health Clinic    Group Practice
Sample size               50 visits        45 visits
Sample mean               $21              $19
Standard deviation (S)    $2.75            $3.00

Develop a 95% confidence interval estimate for the difference between the average fees of the two
offices assuming the CLT is used to obtain the result.

a. 0.8384 to 3.1616
b. 1.3163 to 2.6837
c. 1.0251 to 2.9749
d. 1.078 to 2.922
e. None of the above answers is correct

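\[ \mu_{HC} - \mu_{GP} \in (\bar{x}_{HC} - \bar{x}_{GP}) \pm Z_{\alpha/2} \sqrt{\frac{S_{HC}^2}{n_{HC}} + \frac{S_{GP}^2}{n_{GP}}} = (21 - 19) \pm 1.96 \sqrt{\frac{2.75^2}{50} + \frac{3^2}{45}} \]

\[ \mu_{HC} - \mu_{GP} \in 2 \pm 1.96 \times 0.5927 = 2 \pm 1.1616 \;\rightarrow\; 0.8384 \le \mu_{HC} - \mu_{GP} \le 3.1616 \]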

Answer: a
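
The interval in question 4 can be reproduced with the standard library. This is a sketch under the same large-sample (CLT) assumption as the solution; the function name diff_means_ci is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_ci(x1, s1, n1, x2, s2, n2, confidence=0.95):
    """Large-sample z interval for mu1 - mu2 (unpooled variances)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 at 95%
    margin = z * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin


# Health Clinic versus Group Practice
low, high = diff_means_ci(x1=21, s1=2.75, n1=50, x2=19, s2=3.00, n2=45)
print(f"{low:.4f} to {high:.4f}")
```

Because the interval lies entirely above zero, the data suggest the clinic's mean fee is higher than the group practice's at the 95% level.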

5. A politician has commissioned a survey of blue-collar and white-collar employees in her
constituency. The survey reveals that 286 out of 542 blue-collar workers intend to vote for her in the
next election whereas 428 out of 955 white-collar workers intend to vote for her.

Estimate with 95% confidence the difference in population proportions.

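a. (0.0027, 0.0132)
b. (0.027, 0.0132)
c. (0.0027, 0.132)
d. (0.027, 0.132)
e. Not enough information

With \hat{P}_{BC} = 286/542 = 0.5277 and \hat{P}_{WC} = 428/955 = 0.4482:

\[ P_{BC} - P_{WC} \in (\hat{P}_{BC} - \hat{P}_{WC}) \pm Z_{\alpha/2} \sqrt{\frac{\hat{P}_{BC}\,\hat{Q}_{BC}}{n_{BC}} + \frac{\hat{P}_{WC}\,\hat{Q}_{WC}}{n_{WC}}} \]

\[ P_{BC} - P_{WC} \in (0.5277 - 0.4482) \pm 1.96 \times 0.0268 = 0.0795 \pm 0.0525 \;\rightarrow\; 0.027 \le P_{BC} - P_{WC} \le 0.132 \]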

Answer: d
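
The same check works for the proportions in question 5. A minimal sketch, assuming the usual unpooled standard error for an interval estimate; the function name diff_proportions_ci is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample z interval for p1 - p2 using unpooled sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Blue-collar (286 of 542) versus white-collar (428 of 955)
low, high = diff_proportions_ci(x1=286, n1=542, x2=428, n2=955)
print(f"({low:.3f}, {high:.3f})")
```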

The following information is used for questions 6 and 7

A consumer research organization is attempting to determine whether there is any difference in
kilometers per liter for fully-loaded 22-foot trucks leased from three different companies: A-Haul,
Bertz, and Glyder. Five of these trucks are rented from each company. Each truck is driven with the
same weight cargo over the same 300-kilometer route and the kilometers per liter recorded. The
results of the test are:

        Type of Truck
A-Haul    Bertz    Glyder
5.4       8.1      12.7
6.7       3.2      13.3
8.1       13.7     8.4
7.9       10.7     12.4
4.9       9.7      13.0

The null hypothesis is that there is no difference in mean kilometers per liter between the various
fully-loaded 22-foot trucks. Use these data to answer questions 6 and 7. An Excel output of the
ANOVA is given below.

Anova: Single Factor

SUMMARY
Groups    Count    Sum     Average    Variance
A-Haul    5        33      6.6        2.07
Bertz     5        45.4    9.08       14.972
Glyder    5        59.8    11.96      4.073

ANOVA
Source of Variation    SS         df    MS        F        F crit
Between Groups         71.957     2     35.979    5.112
Within Groups          84.460     12    7.038
Total                  156.417    14

6. What is the F statistic, comparing the between-groups mean square with the within-groups
mean square?

a. 35.979
b. 7.038
c. 4.073
d. 3.8850
e. 5.1118

F = MS between / MS within = 35.979 / 7.038 = 5.112

Answer: e

7. What can we conclude about the null hypothesis that there is no difference in mean kilometers per
liter between the various types of trucks?

a. the null hypothesis can be rejected at the 5% significance level
b. the null hypothesis can be rejected at the 1% significance level
c. the null hypothesis can be accepted at the 5% significance level
d. the null hypothesis can be accepted at the 1% significance level
e. both a and d

F(2,12) at α = 0.05 equals 3.89
F(2,12) at α = 0.01 equals 6.93
Since 3.89 < F = 5.112 < 6.93, the null hypothesis is rejected at the 5% level but cannot be
rejected at the 1% level.

Answer: e
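
The ANOVA table for questions 6 and 7 can be rebuilt from the raw truck data. This is a minimal standard-library sketch; the function name one_way_anova is illustrative, and the two critical values are copied from the solution above rather than computed, since the standard library has no F distribution.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (ss_between, ss_within, f_stat) for a single-factor ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat


a_haul = [5.4, 6.7, 8.1, 7.9, 4.9]
bertz = [8.1, 3.2, 13.7, 10.7, 9.7]
glyder = [12.7, 13.3, 8.4, 12.4, 13.0]

ss_between, ss_within, f_stat = one_way_anova([a_haul, bertz, glyder])

# Critical values quoted in the solution for F(2, 12)
F_CRIT_05, F_CRIT_01 = 3.89, 6.93
reject_at_5 = f_stat > F_CRIT_05
reject_at_1 = f_stat > F_CRIT_01
print(f"F = {f_stat:.4f}, reject at 5%: {reject_at_5}, reject at 1%: {reject_at_1}")
```

The F statistic falls between the two critical values, which is why both statements a and d hold.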

The following information is used for question 8

In a regression model involving 30 observations, the following estimated regression equation was
obtained:

\[ \hat{Y} = 17 + 4X_1 - 3X_2 + 8X_3 + 8X_4 \]

For this model SSR = 700 and SSE = 100.

8. The computed F statistic for testing the significance of the above model is
a. 43.75
b. 0.875
c. 50.19
d. 7.00

MSR = SSR / k = 700/4 = 175
MSE = SSE / (n - k - 1) = 100/25 = 4
F = MSR / MSE = 175/4 = 43.75

Answer: a

Using the critical F value at 95%, the conclusion is that the

a. model is not significant
b. model is significant
c. slope of X1 is significant
d. slope of X2 is significant

Critical F value: F(4,25) at α = 0.05 equals 2.76 < T.S. = 43.75

Therefore the model is significant.

Answer: b
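
The overall F test in question 8 reduces to a few lines of arithmetic. A minimal sketch; the function name regression_f is illustrative, and the critical value 2.76 is taken from the solution rather than computed.

```python
def regression_f(ssr, sse, n, k):
    """Overall F statistic for a regression with k predictors and n observations."""
    msr = ssr / k            # mean square due to regression
    mse = sse / (n - k - 1)  # mean square error
    return msr / mse


f_stat = regression_f(ssr=700, sse=100, n=30, k=4)
r_squared = 700 / (700 + 100)  # SSR / SST

F_CRIT_05 = 2.76  # F(4, 25) at alpha = 0.05, as quoted in the solution
model_significant = f_stat > F_CRIT_05
print(f_stat, r_squared, model_significant)
```

Note that option b (0.875) is the R² of the model, not its F statistic.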

Part II (Statistics) (26 marks)
Time: 45 minutes
Canadian Industrial Supplies

Canadian Industrial Supplies is a manufacturer of industrial shelving, tables and chairs, office partitions as
well as many other plant and office furnishings. As a new employee in the Human Resources department,
you have been assigned the task of analyzing the salaries of workers involved in the production process. In
order to accomplish this task, you have decided to develop a multiple regression model to predict the
workers' weekly salaries. The following information is available from the personnel files of each worker in
the company.

1. X1 = length of employment in months

2. X2 = age in years

3. X3 = 0 for female employees
      = 1 for male employees

4. X4 = 0 for employees with technical jobs
      = 1 for employees with clerical jobs

Using the personnel files and a frame constructed from the employee numbers, a simple random sample of
49 workers involved in the production process was selected. The data corresponding to their current weekly
salaries, lengths of employment, ages, gender, and job classifications are analyzed.

With the assistance of the company computer and using Excel's Data Analysis, your goal is to use step-wise
regression to develop the "best" fitting model. You decide to set the required R² at a given level and develop
a model that uses the minimum number of variables.

The Director of Personnel is a long-time employee of the company, who has worked her way up through the
ranks. She has accomplished this without the benefit of a formal education involving the study of statistics
(poor soul). She has expressed concern that this goal cannot be achieved. You wish to respond to her
statement:

"Is it possible to construct a simple, predictive model that will explain 75% or more of the
changes in weekly salaries using the data mentioned above?"

The data for the 49 cases are listed on the accompanying printout along with the Excel printouts. Consider
all this information when responding to the following questions.

a) Is this data obtained from a process over time (time series data) or a cross-sectional study (snapshot
at one point in time)? Explain. (2 marks)

This is a classic example of cross-sectional data. A sequence plot would not provide any useful
information since the data were not collected over time. With respect to this company, the data
represent a snapshot of the situation at a certain point in time.

b) Using the simple-r matrix for the variables, what can we say about the variables used in this study?
Can your comments be confirmed from the other Excel printouts? (6 marks)

The first thing we note is that two of the independent variables are related to Salaries
at an r-value > 0.5, while the other two are not strongly related to Salaries.

The second thing we notice is that the independent variables have low r-values when compared to
each other. This suggests that multicollinearity should not be a problem.

But when the step-wise regression moves from salaries vs employ to salaries vs employ and age, the
t-value for employ drops and the age variable is not significant. This suggests that age and employ
are correlated. This would seem reasonable given the nature of the variables.

c) Various regression models for Salaries (response) versus years of employment, age, gender, and job
type (clerical) were run. Consider all the regression models that have been run. Which model would
be the best? Why would this model be favored over the other models? (Do not discuss the regression
assumptions)
(6 marks)

Since the required R² is set at 75%, the model must first meet this criterion. The first model
(Salaries vs Employ) is significant, but its R² = 0.744 falls just short of the requirement.

For the second model, Salaries vs Employ and Age, R² = 0.744 and the p-value for Age is
0.988 > α = 0.05. Age adds nothing to the model.

The third model adds Gender, which is significant at p-value = 0.007. The adjusted R² = 0.768,
which takes the model above the required 75%.

By the time the other variable is added, we can see that the two significant independent variables
are Years of Employment and Gender. The model using only these two has adjusted R² = 0.773 > 0.75,
and the ANOVA table provides evidence that the model is significant: F = 83 and the p-value is
approximately zero. Both independent variables are significant, as we see from the Excel printout.

The best model is the Salaries versus Years of Employment and Gender model.
d) Does the model Salaries versus years of employment and gender seem to satisfy the normality and
independent residuals assumptions of the linear regression model? (6 marks)

Normality: From the Nscores plot we can see that the dependent variable (Salaries) is
approximately normally distributed. The line has a few wrinkles but is close enough to a straight line
to provide evidence that salaries is approx. normal.

Randomness: From the three scatter plots it can be seen that the residuals are randomly
scattered around the horizontal line e = 0. There is no discernible pattern.

Overall the model is not too bad. Given the robustness of the model, it can be used to predict.

e) Using all available information, respond to the Director of Personnel's concern: "Is it possible
to construct a predictive model that will explain 75% or more of the differences between weekly
salaries using the data mentioned above?"
(6 marks)

Yes, it is possible to construct a model that will meet the requirements. The model is salaries
versus employ and gender.

\[ \hat{y} = 404.97 + 0.74\,(\text{employ}) + 39.19\,(\text{gender}) \]

This model has an adjusted R² = 0.773, which is greater than 0.75. The F-value is 83, which results in a
p-value that is approximately zero. The model as a whole is highly significant. Each of the independent
variables is also significant, with p-values of 0.00 and 0.01 for employ and gender.

Finally, the assumptions are within the range that would allow the use of this model to predict
salaries using number of years of employment and gender.
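
To illustrate how the fitted equation would be used, the sketch below plugs values into it. The coefficients are those reported above; the function name predict_salary and the example worker (employ = 60, gender = 1) are hypothetical. The employ value must be in the same units as the data used to fit the model (X1 is defined in months in the case description).

```python
def predict_salary(employ, gender):
    """Predicted weekly salary from the fitted model in part (e).

    employ: length of employment, in the units used to fit the model
    gender: 0 for female employees, 1 for male employees
    """
    return 404.97 + 0.74 * employ + 39.19 * gender


# Hypothetical worker, for illustration only
example = predict_salary(employ=60, gender=1)

# The gender coefficient is the predicted gap at equal length of employment
gap = predict_salary(employ=60, gender=1) - predict_salary(employ=60, gender=0)
print(round(example, 2), round(gap, 2))
```

Predictions are only as reliable as the range of the sample; values of employ far outside those observed in the 49 cases would be extrapolation.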
