
Question 1.

[12 marks]
Market research has indicated that customers are likely to bypass Roma tomatoes that
weigh less than 70 grams. A produce company produces Roma tomatoes that average
74.0 grams with a standard deviation of 3.2 grams.
(a) [2 marks] Assuming that the normal distribution is a reasonable model for the
weights of these tomatoes, what proportion of Roma tomatoes are currently
undersize (less than 70g)?
(b) [2 marks] How much must a Roma tomato weigh to be among the heaviest 10%?
(c) [2 marks] The aim of the current research is to reduce the proportion of
undersized tomatoes to no more than 2%. One way of reducing this proportion is to
reduce the standard deviation. If the average size of the tomatoes remains 74.0
grams, what must the target standard deviation be to achieve the 2% goal?
(d) [3 marks] The company claims that the goal of 2% undersized tomatoes is
reached. To test this, a random sample of 25 tomatoes is taken. What is the
distribution of the number of undersized tomatoes in this sample if the company's claim is true?
Explain your reasoning.
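
One way to set up the calculations in Question 1 is sketched below using only the Python standard library. The variable names are illustrative, and the binomial model in part (d) rests on the assumption that the 25 tomatoes are sampled independently; treat this as a sketch under those assumptions rather than a model answer.

```python
from math import comb
from statistics import NormalDist

MEAN_G, SD_G, LIMIT_G = 74.0, 3.2, 70.0  # figures stated in the question

weights = NormalDist(mu=MEAN_G, sigma=SD_G)

# (a) P(X < 70 g): proportion of undersized tomatoes under the normal model
prop_undersize = weights.cdf(LIMIT_G)

# (b) 90th percentile: the weight that cuts off the heaviest 10%
heaviest_10_cutoff = weights.inv_cdf(0.90)

# (c) Solve (70 - 74) / sigma = z_0.02 for sigma, keeping the mean at 74 g
target_sd = (LIMIT_G - MEAN_G) / NormalDist().inv_cdf(0.02)

# (d) Assumption: tomatoes are sampled independently, so the number of
#     undersized tomatoes among 25 is modelled as Binomial(n=25, p=0.02)
def undersize_count_pmf(k, n=25, p=0.02):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

print(f"(a) proportion undersized: {prop_undersize:.4f}")
print(f"(b) cutoff for heaviest 10%: {heaviest_10_cutoff:.2f} g")
print(f"(c) target standard deviation: {target_sd:.3f} g")
print(f"(d) P(no undersized tomatoes in 25): {undersize_count_pmf(0):.4f}")
```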

Question 2:
In an article in Marketing Science, Silk and Berndt investigate the output of advertising
agencies. They describe ad agency output by finding the shares of dollar billing volume
coming from various media categories such as network television, spot television,
newspaper, radio, and so forth.
Suppose that a random sample of 400 U.S. advertising agencies gives an average
percentage share of billing volume from network television equal to 7.46 percent with a
standard deviation of 1.42 percent. Further, suppose that a random sample of 400 U.S.
advertising agencies gives an average percentage share of billing volume from spot
television commercials equal to 12.44 percent with a standard deviation of 1.55 percent.
Using the sample information, does it appear that the mean percentage share of billing
volume from spot television commercials for the U.S. advertising agencies is greater than
the mean percentage share of billing volume from network television? Explain.

Module #3: Sampling Distributions, Estimates, and Hypothesis Testing

Question 3:
[3] Identify which of these types of sampling is used: random, systematic, convenience,
stratified, or cluster.
a) The instructor of this course observed at a Walnut Creek Police sobriety
checkpoint at which every fifth driver was stopped and interviewed. Some drivers
were arrested.
b) The instructor of this course observed professional wine tasters working at a
winery in Napa Valley, CA. Assume that a taste test involved three different wines
randomly selected from each of five different wineries.
c) The U.S. Department of Corrections collects data about returning prisoners by
randomly selecting five federal prisons and surveying all of the prisoners in each
of the prisons.
d) In a Gallup poll, 1003 adults were called after their telephone numbers were
randomly generated by a computer, and 20% of them said that they get news on
the Internet every day.
e) The instructor of this course surveyed all of the students in one of the
instructor's statistics classes to obtain sample data consisting of the number of
credit cards each student possesses.

Question 4:
[4] In the March 16, 1998, issue of Fortune magazine, the results of a survey of 2,221 MBA
students from across the United States conducted by the Stockholm-based academic
consulting firm Universum showed that only 20 percent of MBA students expect to stay at
their first job five years or more. Source: Shelly Branch, "MBAs: What Do They Really
Want," Fortune (March 16, 1998), p. 167.
a) Assuming that a random sample was selected, construct a 98% confidence
interval for the proportion of all U.S. MBA students who expect to stay at their first
job five years or more.
b) Based on the interval from a), can you conclude that there is strong evidence that
less than one-fourth of all U.S. MBA students expect to stay? Explain why.
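
A minimal sketch of the interval in part a), assuming the usual large-sample normal approximation for a proportion. The function name `proportion_ci` is illustrative and does not come from any library.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence):
    """Large-sample (normal approximation) interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Figures stated in the question: 20 percent of 2,221 students, 98% confidence
lower, upper = proportion_ci(p_hat=0.20, n=2221, confidence=0.98)
print(f"98% CI: ({lower:.4f}, {upper:.4f})")

# Part b) compares the whole interval with one-fourth
print("entire interval below 0.25:", upper < 0.25)
```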

Question 5:
[5] An earlier study claims that U.S. adults spend an average of 114 minutes with their
families per day. A recently taken sample of 25 adults showed that they spend an
average of 109 minutes per day with their families. The sample standard deviation is 11
minutes. Assume that the time spent by adults with their families has an approximate
normal distribution. We wish to test whether the mean time spent currently by all adults
with their families is less than 114 minutes a day.

a) Construct a 95% confidence interval for the mean time spent by all adults with their
families.
b) Does the sample information support that the mean time spent currently by all adults
with their families is less than 114 minutes a day? Explain your conclusion in words.
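
The two parts can be sketched as follows. The Python standard library has no t distribution, so the critical values are copied from a standard t table for 24 degrees of freedom. The question does not state a significance level for part b); 0.05 is assumed here.

```python
from math import sqrt

n, x_bar, s, mu_0 = 25, 109.0, 11.0, 114.0  # figures stated in the question

# Critical values taken from a t table with n - 1 = 24 degrees of freedom
T_975_DF24 = 2.064  # two-sided 95% interval
T_95_DF24 = 1.711   # one-sided test, assumed alpha = 0.05

std_error = s / sqrt(n)

# a) 95% confidence interval for the mean
ci_lower = x_bar - T_975_DF24 * std_error
ci_upper = x_bar + T_975_DF24 * std_error

# b) H0: mu = 114 versus H1: mu < 114 (left-tailed test)
t_stat = (x_bar - mu_0) / std_error
reject_h0 = t_stat < -T_95_DF24

print(f"95% CI: ({ci_lower:.2f}, {ci_upper:.2f})")
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```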

Question 6:
[6] When 40 people used the Weight Watchers diet for one year, their mean weight loss
was 3.00 lb. (based on data from Comparison of the Atkins, Ornish, Weight Watchers,
and Zone Diets for Weight Loss and Heart Disease Reduction, by Dansinger, et al.,
Journal of the American Medical Association, Vol. 293, No. 1). Assume that the standard
deviation of all such weight changes is σ = 4.9 lb. We shall use a 0.01 significance level
to test the claim that the mean weight loss is greater than 0.
a) Set up the null and alternative hypotheses, and perform the hypothesis test.
b) Based on these results, does the diet appear to be effective? Does the diet appear to
have practical significance?
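
Because the population standard deviation is taken as known, part a) can be set up as a right-tailed z test. The sketch below uses illustrative variable names and only the figures given in the question.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma, alpha = 40, 3.00, 4.9, 0.01  # figures stated in the question
mu_0 = 0.0  # H0: mu = 0 versus H1: mu > 0

z_stat = (x_bar - mu_0) / (sigma / sqrt(n))
p_value = 1 - NormalDist().cdf(z_stat)    # right-tail probability
z_crit = NormalDist().inv_cdf(1 - alpha)  # right-tail critical value

print(f"z = {z_stat:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.6f}")
print("reject H0:", z_stat > z_crit)
```

Statistical significance here does not settle part b): whether an average loss of 3.00 lb over a year matters in practice is a separate judgement.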

Question 7:
[7] In the case of Castaneda v. Partida, it was found that during a period of 11 years in
Hidalgo County, Texas, 870 people were selected for grand jury duty, and 39% of them
were Americans of Mexican ancestry. Among the people eligible for grand jury duty,
79.1% were Americans of Mexican ancestry. We shall use a 0.01 significance level to test
the claim that the selection process is biased against Americans of Mexican ancestry.
(a) Set up the null and alternative hypotheses, and perform the hypothesis test.
(b) Does the jury selection system appear to be fair?
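
A sketch of the one-proportion z test for part (a), assuming the normal approximation to the binomial is adequate for n = 870. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_hat, p_0, n, alpha = 0.39, 0.791, 870, 0.01  # figures stated in the question

# H0: p = 0.791 versus H1: p < 0.791 (left-tailed test)
std_error = sqrt(p_0 * (1 - p_0) / n)  # standard error computed under H0
z_stat = (p_hat - p_0) / std_error
p_value = NormalDist().cdf(z_stat)     # left-tail probability

# The p-value is so small that it may print as 0.0 in floating point
print(f"z = {z_stat:.2f}, p-value = {p_value:.3g}")
print("reject H0:", p_value < alpha)
```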

Question 8:
[8] A local television station has added a consumer spot to its nightly news. The
consumer reporter has recently bought sixteen bottles of aspirin from a local drugstore
and has counted the aspirins in each bottle. Although the bottles advertised 500
aspirins, the reporter found the following numbers with the mean count 498.8125:
499, 498, 496, 501, 493, 495, 497, 502, 496, 502, 499, 501, 500, 498, 501, 503
The consumer reporter claims that this is an obvious case of the public being taken
advantage of. Using a confidence interval estimate method or a hypothesis testing
method, do you think that the reporter's claim is justifiable?
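
Both approaches can be sketched from the raw counts. This assumes the counts are roughly normally distributed. The critical value is copied from a standard t table for 15 degrees of freedom, and the 95% confidence level is an assumption because the question does not state one.

```python
from math import sqrt
from statistics import mean, stdev

counts = [499, 498, 496, 501, 493, 495, 497, 502,
          496, 502, 499, 501, 500, 498, 501, 503]
ADVERTISED = 500
T_975_DF15 = 2.131  # t table, 15 degrees of freedom, two-sided 95%

n = len(counts)
x_bar = mean(counts)
s = stdev(counts)  # sample standard deviation (divisor n - 1)
std_error = s / sqrt(n)

# Test statistic for H0: mu = 500
t_stat = (x_bar - ADVERTISED) / std_error

# 95% confidence interval for the mean count per bottle
ci_lower = x_bar - T_975_DF15 * std_error
ci_upper = x_bar + T_975_DF15 * std_error

print(f"mean = {x_bar}, s = {s:.4f}, t = {t_stat:.3f}")
print(f"95% CI: ({ci_lower:.2f}, {ci_upper:.2f})")
```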

Module #4: Two-Sample Tests and Simple Linear Regression

Question 9:
[9] In an article in Marketing Science, Silk and Berndt investigate the output of
advertising agencies. They describe ad agency output by finding the shares of dollar
billing volume coming from various media categories such as network television, spot
television, newspaper, radio, and so forth.
Suppose that a random sample of 400 U.S. advertising agencies gives an average
percentage share of billing volume from network television equal to 7.46 percent with a
standard deviation of 1.42 percent. Further, suppose that a random sample of 400 U.S.
advertising agencies gives an average percentage share of billing volume from spot
television commercials equal to 12.44 percent with a standard deviation of 1.55 percent.
Using the sample information, does it appear that the mean percentage share of billing
volume from spot television commercials for the U.S. advertising agencies is greater than
the mean percentage share of billing volume from network television? Explain.
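
One possible setup is a large-sample z test for the difference between two means. It assumes the two samples of 400 agencies are independent, and that the samples are large enough for the sample standard deviations to stand in for the population values. The helper name `two_sample_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """z statistic for H0: mu_1 - mu_2 = 0 with large independent samples."""
    std_error = sqrt(sd_1**2 / n_1 + sd_2**2 / n_2)
    return (mean_1 - mean_2) / std_error

# Sample 1: spot television; sample 2: network television
z_stat = two_sample_z(12.44, 1.55, 400, 7.46, 1.42, 400)

# H1: mu_spot > mu_network, so the p-value is a right-tail probability.
# It is small enough that it may print as 0 in floating point.
p_value = 1 - NormalDist().cdf(z_stat)

print(f"z = {z_stat:.2f}, p-value = {p_value:.3g}")
```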

Question 10:
[10] A random sample of the birth weights of 186 babies has a mean of 3103g and a
standard deviation of 696g (based on data from Cognitive Outcomes of Preschool
Children with Prenatal Cocaine Exposure, by Singer et al., Journal of the American
Medical Association, Vol. 291, No. 20). These babies were born to mothers who did not
use cocaine during their pregnancies. Further, a random sample of the birth weights of
190 babies born to mothers who used cocaine during their pregnancies has a mean of
2700g and a standard deviation of 645g. Does cocaine use appear to affect the birth
weight of a baby? Substantiate your conclusion.

Question 11:
[11] The owner of an intra-city moving company typically has his most experienced
manager predict the total number of labor hours that will be required to complete an
upcoming move. This approach had proved useful in the past, but he would like to be
able to develop a more accurate method of predicting the labor hours by using the
amount of cubic feet moved. In a preliminary effort to provide a more accurate method,
he has collected data for 36 moves, in which the travel time was an insignificant portion
of the labor hours worked.

The data are in the Excel file MOVING.xls, downloadable from File or from the Companion
Website at www.pearsonhighered.com/levine via the Excel Data Files link.
a) Set up a scatter diagram.
b) Assuming a linear relationship, find the regression coefficients, b0, b1, and its
regression equation.
c) Interpret the meaning of the slope b1 in this problem.
d) Predict the labor hours for moving 500 cubic feet.
e) What factors besides the cubic feet moved might affect labor hours?
f) Determine the coefficient of determination, r², and interpret its meaning.
g) Find the standard error of the estimate.
h) How useful do you think this regression model is for labor hours?
i) Determine if the assumption of normality is violated by using the normal
probability plot for residuals.
j) At the 0.05 level of significance, is there evidence of a linear relationship between
the numbers of cubic feet moved and labor hours?
k) Set up a 95% confidence interval estimate of the population slope, β1.
l) Set up a 95% confidence interval estimate of the average labor hours for all
moves of 500 cubic feet.
m) Set up a 95% confidence interval of the labor hours of an individual move that has
500 cubic feet.
n) Explain the difference in the results obtained in (l) and (m).
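
The arithmetic behind parts b), d), f), and g) can be sketched with ordinary least-squares formulas. The five data points below are hypothetical stand-ins, since the values in MOVING.xls are not reproduced here. The printed numbers therefore illustrate the method only, not the answer.

```python
from math import sqrt

# Hypothetical data for illustration only; the real values are in MOVING.xls
cubic_feet = [200, 300, 400, 500, 600]
labor_hours = [12, 17, 21, 27, 31]

n = len(cubic_feet)
x_bar = sum(cubic_feet) / n
y_bar = sum(labor_hours) / n

# Sums of squares and cross-products about the means
ss_xx = sum((x - x_bar) ** 2 for x in cubic_feet)
ss_yy = sum((y - y_bar) ** 2 for y in labor_hours)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(cubic_feet, labor_hours))

b1 = ss_xy / ss_xx        # slope: change in labor hours per extra cubic foot
b0 = y_bar - b1 * x_bar   # intercept

r_squared = ss_xy**2 / (ss_xx * ss_yy)    # coefficient of determination
sse = ss_yy - b1 * ss_xy                  # error (residual) sum of squares
std_error_estimate = sqrt(sse / (n - 2))  # standard error of the estimate

predicted_500 = b0 + b1 * 500             # point prediction at 500 cubic feet

print(f"fitted line: hours = {b0:.3f} + {b1:.4f} * cubic_feet")
print(f"r^2 = {r_squared:.4f}, standard error = {std_error_estimate:.4f}")
print(f"predicted hours for 500 cubic feet: {predicted_500:.2f}")
```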

Question 12:
[4] An auto manufacturing company wanted to investigate how the price of one of
its car models depreciates with age. The research department at the company
took a sample of eight cars of this model and collected the following information
on the ages (in years) and prices (in hundreds of dollars) of these cars. The data
are in USEDCAR.xls.
Age (x):    (see USEDCAR.xls)
Price (y):  16   74   40   19   124   36   33   89

a) Find the value of the linear correlation coefficient r.
b) Find the value of the coefficient of determination r², and interpret the meaning for
this problem.
c) At the 0.05 level of significance, is there a significant linear relationship between
the two variables?
d) Determine the adequacy of the fit of the model.
e) Evaluate whether the assumptions of regression (LINE) have been seriously
violated.
f) If there is a linear correlation, what is the regression equation?
g) Interpret the meaning of the slope b1 in this problem.
h) Interpret the meaning of the Y-intercept b0 in this problem. Does it make sense
as far as this model is concerned? Explain why.
i) Set up a 95% confidence interval estimate of the population slope.
j) Set up a 95% confidence interval estimate of the average price for all cars of this
model after 7 years.
k) Set up a 95% confidence interval of the price of an individual car of this model after
7 years.
l) Explain the difference in the results obtained in (j) and (k).
