Solutions to Statistics Quiz 1

Question 1

The amount of rainfall in a certain city per day has mean 1.3 m and variance 2.8 m². The amount of rainfall
for a random sample of 60 days is collected.

(i) Calculate the mean of the sample mean.

(ii) Calculate the variance of the sample mean.

(iii) Find the probability that the mean amount of rainfall exceeds 0.8 m.

(Think about it, no answers given)

State an assumption used in obtaining this estimate in part (iii).

Solution:

Let X m be the amount of rainfall in the city.

E(X) = 1.3 and Var(X) = 2.8

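(i) $E(\bar{X}) = E(X) = 1.3$

(ii) $\mathrm{Var}(\bar{X}) = \dfrac{\mathrm{Var}(X)}{n} = \dfrac{2.8}{60} = \dfrac{7}{150} \approx 0.0467$

(iii) Since $n = 60$ is large, by the Central Limit Theorem $\bar{X} \sim N\left(1.3, \dfrac{7}{150}\right)$ approximately, so

$P(\bar{X} > 0.8) = 0.98968 \approx 0.990$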

The assumption necessary is that the amount of rainfall on any one day is independent of the amount of rainfall on
other days, i.e. $X_1, X_2, \ldots, X_{60}$ are independent of each other.
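As a cross-check on parts (ii) and (iii), the calculation can be sketched in Python using only the standard library. The variable names are illustrative, and the normal model for the sample mean assumes that n = 60 is large enough for the Central Limit Theorem to apply.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the question
mu, var, n = 1.3, 2.8, 60

mean_of_sample_mean = mu          # E(X-bar) = E(X)
var_of_sample_mean = var / n      # Var(X-bar) = Var(X) / n

# Assumption: n = 60 is large enough for the CLT, so X-bar is roughly normal
x_bar = NormalDist(mean_of_sample_mean, sqrt(var_of_sample_mean))
p_exceeds = 1 - x_bar.cdf(0.8)    # P(X-bar > 0.8)

print(f"Var(X-bar) = {var_of_sample_mean:.4f}, P(X-bar > 0.8) = {p_exceeds:.3f}")
```

Note that `NormalDist` takes the standard deviation, not the variance, as its second argument, hence the square root.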

Question 2

An electrical company offers free repairs for the first 3 years for each refrigerator sold. The company
records the number of free repairs made to 90 refrigerators in the first 3 years.

No. of repairs 0 1 2 3 4

Frequency 44 37 5 3 1

(i) Find the sample mean.

It is assumed that the number of free repairs, X, offered by the company for a refrigerator in the first 3 years
follows a Poisson distribution with the value of the mean found above.

(ii) Find P( X = 2).

(iii) A random sample of 9 refrigerators sold on a particular day was taken. Find the probability that a total
of at least 15 free repairs are required on these 9 refrigerators in the first 3 years.

The company also sells freezers and provides free repairs for the first 3 years. The number of free repairs a
freezer requires can be modelled by an independent Poisson distribution with mean 0.2. A customer buys a
refrigerator and a freezer at the same time.

(iv) Find the probability that

(a) the two appliances require a total of 4 free repairs between them in the first 3 years.

(b) each appliance requires exactly 2 free repairs in the first 3 years.

Think about it (no answer given): explain why there is a difference in answers to (iv)(a) and (iv)(b).

Solution:

(i) Sample mean = (0 × 44 + 1 × 37 + 2 × 5 + 3 × 3 + 4 × 1)/90 = 60/90 = 2/3 or 0.667

2
(ii) X ∼ Po  
3

P( X = 2) = 0.114

 2
X 1 + X 2 + ... + X 9 ∼ Po  9 ×  ,i.e.Po ( 6 )
 3
(iii)
P( X 1 + X 2 + ... + X 9 ≥ 15) = 1 − P( X 1 + X 2 + ... + X 9 < 15) = 0.00140

(iv) Let Y be the number of free repairs a freezer requires for the first 3 years.

Y ∼ Po(0.2)

2 
(a) X + Y ∼ Po  + 0.2 
3 

P( X +Y = 4) = 0.00988

(b) P( X = 2)P(Y = 2) = 0.00187

The event in part (b) is only one of the cases counted in part (a), hence its probability is smaller. In (a), other
possibilities include X = 0 and Y = 4, or X = 1 and Y = 3, to name a few.
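The Poisson answers above can be reproduced without a GC. The following is a minimal sketch in standard-library Python; `poisson_pmf` and `poisson_cdf` are helper names introduced here for illustration, not part of the original solution. The last line also sums over every way of splitting 4 repairs between the two appliances, which shows why (iv)(a) is larger than (iv)(b).

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

# Frequency table from the question
repairs = [0, 1, 2, 3, 4]
freq = [44, 37, 5, 3, 1]
lam = sum(r * f for r, f in zip(repairs, freq)) / sum(freq)  # sample mean, 2/3

p_ii = poisson_pmf(2, lam)                           # (ii)  P(X = 2)
p_iii = 1 - poisson_cdf(14, 9 * lam)                 # (iii) P(total >= 15), total ~ Po(6)
p_iv_a = poisson_pmf(4, lam + 0.2)                   # (iv)(a) P(X + Y = 4)
p_iv_b = poisson_pmf(2, lam) * poisson_pmf(2, 0.2)   # (iv)(b) P(X = 2) P(Y = 2)

# (iv)(a) again, as a sum over all splits X = k, Y = 4 - k; (b) is the k = 2 term
p_iv_a_split = sum(poisson_pmf(k, lam) * poisson_pmf(4 - k, 0.2) for k in range(5))

print(p_ii, p_iii, p_iv_a, p_iv_b)
```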

Question 3

The magazine Nearly Eighteen has a web site, on which it recently ran a pop trivia quiz with ten questions.
The results of the first 1000 entries were analysed. The numbers of questions answered correctly, x, are
illustrated by the frequency diagram.

Find the mode of the data.

Find the median of the data.

Calculate the mean of the data.

Calculate the standard deviation of the data.

Each question in the quiz was in fact of the multiple choice variety, with four possible answers. Three points
are awarded for a question answered correctly, and one point is deducted for a question which is not
answered correctly. So, if x questions are answered correctly, the number of points, y, is given by y = 4x - 10.

Hence find the mean number of points scored.

Hence find the standard deviation of the number of points scored, giving your answer correct to 3 decimal
places.

Solution:

mode = 7

Key the data into the GC:

Number correct   0    1    2    3    4    5    6    7    8    9   10

Frequency       10   20   40   65  100  140  160  175  140  100   50


From GC, median = 6

From GC, mean = 6100/1000 = 6.1

From GC, s.d of data = 2.25

Mean of y = $4\bar{x} - 10 = 4(6.1) - 10 = 14.4$

Variance of y = $4^2$ × (variance of x),

so standard deviation of y = 4 × (standard deviation of x) = 4(2.24722) = 8.989
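For readers without a GC, the summary statistics can be recomputed from the frequency table. This is a sketch in standard-library Python; the variable names are illustrative, and the standard deviation divides by n (not n − 1), which is the convention that reproduces the value 2.24722 used above.

```python
from math import sqrt
from statistics import median

scores = list(range(11))                                   # 0 to 10 correct answers
freq = [10, 20, 40, 65, 100, 140, 160, 175, 140, 100, 50]
n = sum(freq)                                              # 1000 entries

mode_x = scores[freq.index(max(freq))]                     # score with the highest frequency
mean_x = sum(x * f for x, f in zip(scores, freq)) / n
# Standard deviation of the data, dividing by n
sd_x = sqrt(sum(f * (x - mean_x) ** 2 for x, f in zip(scores, freq)) / n)

# Expand the table into raw data to read off the median
raw = [x for x, f in zip(scores, freq) for _ in range(f)]
median_x = median(raw)

# Linear coding y = 4x - 10: the mean shifts and scales, the s.d. only scales
mean_y = 4 * mean_x - 10
sd_y = 4 * sd_x

print(mode_x, median_x, mean_x, round(sd_x, 2), round(mean_y, 1), round(sd_y, 3))
```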

Question 4

Dr Yeo, a statistics professor, has been monitoring the length of his lectures. He aims for each lecture to be
between 50 and 60 minutes long, but in fact the lengths of his lectures are given by the random variable
X which is normally distributed with mean 57 minutes and standard deviation 5 minutes. The lengths of
different lectures are independent of one another.

(i) Find the probability that an individual lecture lasts between 50 and 60 minutes.

(ii) During a particular week, Dr Yeo gives four lectures. Find the probability that their total length is more
than 4 hours.

(iii) Dr Yeo is asked to provide a series of lectures to be broadcast on an educational programme on TV. He
is instructed to reduce the length of his lectures such that the duration of his lectures is now given by the
random variable 0.5X. Find, to the nearest minute, the minimum time required on the educational
programme to ensure that there is a probability of at least 0.85 that there is sufficient time for the lecture.

Solution:

$X \sim N(57, 5^2)$

(i) $P(50 < X < 60) = 0.645$

(ii) $T = X_1 + X_2 + X_3 + X_4 \sim N(4 \times 57, 4 \times 5^2)$

$P(T > 4 \times 60) = 0.115$

(iii) $0.5X \sim N(0.5 \times 57, 0.5^2 \times 5^2)$

$P(0.5X < t) \ge 0.85$

$t \ge 31.09$

Hence, least t = 32 minutes.
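The three parts can be checked with `statistics.NormalDist` from the Python standard library. This is only a sketch with illustrative variable names. It highlights the difference between a sum of four independent lectures (variance 4 × 5²) and a scaled lecture 0.5X (variance 0.5² × 5²).

```python
from math import ceil, sqrt
from statistics import NormalDist

X = NormalDist(57, 5)                       # one lecture: mean 57, s.d. 5

# (i) P(50 < X < 60)
p_i = X.cdf(60) - X.cdf(50)

# (ii) Total of four independent lectures: variances add, so s.d. = sqrt(4) * 5
T = NormalDist(4 * 57, sqrt(4) * 5)
p_ii = 1 - T.cdf(4 * 60)                    # P(T > 240 minutes)

# (iii) Scaled lecture 0.5X: the s.d. scales by 0.5
half = NormalDist(0.5 * 57, 0.5 * 5)
t_min = half.inv_cdf(0.85)                  # smallest t with P(0.5X < t) >= 0.85
minutes = ceil(t_min)                       # round up to a whole minute

print(round(p_i, 3), round(p_ii, 3), round(t_min, 2), minutes)
```

Rounding up rather than to the nearest minute is deliberate: 31 minutes would give a probability slightly below 0.85.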


Question 5

The following illustration shows the shape of the sampling distribution of the sample mean for n = 5, 10, 15, 20,
30.

(Refer to attachment)

Pick the shape for which n = 30.

Solution:

For n =30 (reasonably large n), the sampling distribution of the sample mean should be close to a normal
distribution, hence (c) is the answer.

This is the result of CLT.

Question 6

What is the standard deviation of the distribution of $\bar{X}_{50}$ (the mean of a sample of size 50), given that
$X \sim B(12, 0.5)$?

Solution:

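E(X) = 12 × 0.5 = 6

Var(X) = 12 × 0.5 × 0.5 = 3

$\mathrm{Var}(\bar{X}_{50}) = \dfrac{\mathrm{Var}(X)}{50} = \dfrac{3}{50} = \dfrac{6}{100}$

Hence standard deviation $= \sqrt{\dfrac{6}{100}} = \dfrac{\sqrt{6}}{10} \approx 0.245$.

Question 7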

A garage has 60 cars for hire. Sometimes, a hirer makes a booking but does not show up. This is called a
"no-show". On the average, 6% of the bookings are no-shows. The garage manager accepts up to 65
bookings before saying that there are no more cars for hire. If at least 5 of these bookings are no-shows,
then there are enough cars for all the hirers.

On a particular day, 65 bookings were made.

Using a suitable approximation, find the probability that

(i) there are exactly 5 no-shows.

(ii) there are enough cars for all the hirers.

The manager would like to review the number of bookings accepted before saying that there are no more
cars for hire. Find the largest number of bookings he can accept if he has 60 cars for hire and the
probability that there are enough cars for all the hirers is at least 0.8.

Solution:

Let X be the number of “no-shows” on a particular day with 65 bookings.

X ∼ B(65, 0.06)

Since n = 65 is large and p = 0.06 is small, we use a Poisson distribution as an approximation: X ∼ Po(65 × 0.06), i.e. X ∼ Po(3.9)

(i) P(X =5) = 0.152

(ii) P( X ≥ 5) = 1 − P( X < 5) = 1 − P( X ≤ 4) = 0.352

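With 60 or fewer bookings there are always enough cars, so only numbers of bookings above 60 need to be
considered.

From (ii), if there are 65 bookings, the probability that there are enough cars is 0.352, so to have a
higher probability (at least 0.8) of having enough cars for hire, the number of bookings has to be
less than 65.

So we work out the probability of having enough cars for hire when there are 61, 62, 63 or 64
bookings. With n bookings, the number of no-shows is approximately Po(0.06n), and there are enough cars
if at least n − 60 of the bookings are no-shows.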

When there are 61 bookings, P(enough cars for hire) = 0.974

When there are 62 bookings, P(enough cars for hire) = 0.886 >0.8

When there are 63 bookings, P(enough cars for hire) = 0.728 <0.8

Hence the largest number of bookings is 62.
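The search over the number of bookings can be sketched in standard-library Python. The function names `poisson_cdf` and `p_enough_cars` are introduced here for illustration only. As in the solution, the number of no-shows among n bookings is approximated by Po(0.06n).

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

def p_enough_cars(bookings, cars=60, p_no_show=0.06):
    """Poisson approximation to P(number of no-shows >= bookings - cars)."""
    needed = bookings - cars              # no-shows needed so that everyone gets a car
    if needed <= 0:
        return 1.0                        # never more hirers than cars
    lam = bookings * p_no_show            # mean number of no-shows
    return 1 - poisson_cdf(needed - 1, lam)

# Probability of having enough cars for 61 to 65 bookings
probs = {b: p_enough_cars(b) for b in range(61, 66)}
largest = max(b for b, p in probs.items() if p >= 0.8)

for b, p in probs.items():
    print(b, round(p, 3))
print("largest number of bookings:", largest)
```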


Question 8

If the sampling distribution of the sample mean for n = 5 has a variance of 50, what will be the variance of
the sampling distribution of the sample mean for n = 10, given that the sample means are obtained from
the same population in both instances?

Solution:

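$\mathrm{Var}(\bar{X}_5) = 50$

$\mathrm{Var}(\bar{X}_5) = \dfrac{\mathrm{Var}(X)}{5}$, so $\mathrm{Var}(X) = 5 \times 50 = 250$

$\mathrm{Var}(\bar{X}_{10}) = \dfrac{\mathrm{Var}(X)}{10} = \dfrac{250}{10} = 25$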

Question 9

Given that E(X) = 3, E(Y) = 4, Var(X) = 25, Var(Y) = 16, find

(i) E(2X - 3Y)

(ii) Var(2X - 3Y)

(iii) E(X1 + X2 - (Y1 + Y2 + Y3 ))

(iv) Var(X1 + X2 - (Y1 + Y2 + Y3 ))

Solution:

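(i) $E(2X - 3Y) = 2E(X) - 3E(Y) = -6$

(ii) Assuming X and Y are independent, $\mathrm{Var}(2X - 3Y) = 2^2\,\mathrm{Var}(X) + 3^2\,\mathrm{Var}(Y) = 244$

(iii) $E(X_1 + X_2 - (Y_1 + Y_2 + Y_3)) = 2E(X) - 3E(Y) = -6$

(iv) Assuming all five observations are independent, $\mathrm{Var}(X_1 + X_2 - (Y_1 + Y_2 + Y_3)) = 2\,\mathrm{Var}(X) + 3\,\mathrm{Var}(Y) = 98$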
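The contrast between (ii) and (iv) is that 2X is one observation with coefficient 2 (variance multiplied by 2²), whereas X1 + X2 is two separate observations with coefficient 1 each. A small sketch in Python makes this explicit. The helper `linear_combination` is hypothetical, and it assumes all the variables involved are independent.

```python
def linear_combination(terms):
    """Mean and variance of sum(a_i * V_i) for independent V_i.

    terms: list of (coefficient, mean, variance) tuples, one per variable.
    """
    mean = sum(a * m for a, m, _ in terms)
    var = sum(a * a * v for a, _, v in terms)   # coefficients are squared
    return mean, var

EX, VX = 3, 25    # E(X), Var(X) from the question
EY, VY = 4, 16    # E(Y), Var(Y) from the question

# (i), (ii): 2X - 3Y is two variables with coefficients 2 and -3
mean_a, var_a = linear_combination([(2, EX, VX), (-3, EY, VY)])

# (iii), (iv): X1 + X2 - (Y1 + Y2 + Y3) is five variables with coefficients +1 or -1
mean_b, var_b = linear_combination([(1, EX, VX)] * 2 + [(-1, EY, VY)] * 3)

print(mean_a, var_a, mean_b, var_b)
```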

Question 10

Given that X follows a binomial distribution with mean 15 and probability of success 0.5, calculate P(X > 8).

Solution:

np = 15 and p = 0.5, hence n = 30, i.e. X ∼ B(30, 0.5).

P(X > 8) = 1 − P(X ≤ 8) = 0.992
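The binomial tail can be checked directly from the probability mass function. This is a minimal sketch using `math.comb` from the Python standard library, with illustrative variable names.

```python
from math import comb

mean, p = 15, 0.5
n = round(mean / p)                      # np = 15 gives n = 30

# P(X <= 8) summed term by term from the binomial pmf
p_at_most_8 = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(9))
p_more_than_8 = 1 - p_at_most_8          # P(X > 8)

print(n, round(p_more_than_8, 3))
```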


Question 11

It is given that X ∼ Po( µ ) and P( X > 3) = 0.201. Find the value of µ .

Find the least value of n, $n \in \mathbb{Z}^+$, such that P(X > n) < 0.01.

Solution:

P( X > 3) = 0.201
⇒ P( X ≤ 3) = 0.799

From GC, the value of µ for which P(X ≤ 3) = 0.799 is found.

Hence µ = 2.30.

P(X > n) < 0.01

⇒ P(X ≤ n) > 0.99

With µ = 2.30, P(X ≤ 5) = 0.970 and P(X ≤ 6) = 0.991.

Hence least n = 6.
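Where a GC is not available, µ can be found numerically. The sketch below uses bisection, which works because P(X ≤ 3) decreases as µ increases. The helper `poisson_cdf` and the search interval [0, 10] are assumptions made for illustration.

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

# Solve P(X <= 3) = 0.799 for mu by bisection on an assumed interval [0, 10]
target = 0.799
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = (lo + hi) / 2
    if poisson_cdf(3, mid) > target:
        lo = mid          # cdf too large, so mu must be larger
    else:
        hi = mid
mu = (lo + hi) / 2

# Least n with P(X <= n) > 0.99, i.e. P(X > n) < 0.01
least_n = 0
while poisson_cdf(least_n, mu) <= 0.99:
    least_n += 1

print(round(mu, 2), least_n)
```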

Question 12

The weights of a large group of high school students are normally distributed with mean 55 kg
and standard deviation 5 kg. Which of the following is true?

About 15.9% of the students will be over 60kg.

About 2.3% of the students will be below 45kg.

Half of them can be expected to weigh less than 55kg.

About 60% of the students will weigh between 53 and 63kg.

All the above statements are true.

Solution:

Let X kg be the weight of a student. $X \sim N(55, 5^2)$

Check that

P(X>60) = 0.159

P(X<45) = 0.0228

P(X<55) = 0.5

P(53<X<63) = 0.601

Hence all the above statements are true.
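Each of the four probabilities can be verified with `statistics.NormalDist`. This is a short sketch in standard-library Python with illustrative variable names.

```python
from statistics import NormalDist

W = NormalDist(55, 5)                      # weight in kg: mean 55, s.d. 5

p_over_60 = 1 - W.cdf(60)                  # about 15.9%
p_below_45 = W.cdf(45)                     # about 2.3%
p_below_55 = W.cdf(55)                     # exactly half, by symmetry about the mean
p_53_to_63 = W.cdf(63) - W.cdf(53)         # about 60%

print(round(p_over_60, 3), round(p_below_45, 4), p_below_55, round(p_53_to_63, 3))
```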

Question 13

Which of the following is a statistic?

Select all that apply.

Note: A statistic is a characteristic of a sample. It is used to estimate the value of a population parameter.

• Sample mean

• Sample proportion

• Standard deviation of a sample

• Median of a sample

Solution:

All of them are statistics, as each is a characteristic of a sample and is used to estimate the corresponding population parameter.

But we focus mainly on sample mean for the A levels.

Question 14

In a certain desert, sandstorms occur randomly at an average rate of two every three days.

Use a suitable approximation to find the probability that in 60 consecutive days, the average number of
sandstorms per day is less than half.

Solution:

2
Let X be the number of sandstorms in a day. X ∼ Po  
3

E(X) = Var(X) = 2/3


 2 23 
By CLT (n =60 is large), X ∼ N  , 
 3 60 

 1
P  X <  = 0.0569
 2

Question 15

When the test scores for a class are first calculated, the median score is found to be 33 marks.

The following amendments were made after the students had checked their scripts:

Original Score 26 28 29 32 32 37 45

Amended Score 25 29 32 26 35 42 43

Which of the following best describes the nature of the median after the amendments were made?

Solution:

Only the change from 32 to 35 affects the median, as every other score stays in its respective "half" (below or above 33).

With the increase in score from 32 to 35, the new median’s ‘position’ would have moved to the right. This
means that it could remain as 33 or could have increased.

Hence the answer is ‘The amended median is not less than the original median.’

Question 16

What is the Central Limit Theorem? Select the best answer.

The sampling distribution of the sample mean is normal.

The sampling distribution of the sample mean is normal if the sample size is big enough.

The sampling distribution of the sample mean is approximately normal.

The sampling distribution of the sample mean is approximately normal if the sample size is big enough.

$E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$ when n is large.

$E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \sigma^2$ when n is large.

Question 17

The mass of turkeys in a bird park is found to have mean 6.7kg and standard deviation 3.1kg.

Find the probability that the mean mass of a random sample of 300 turkeys is between 6.5 and 6.8kg.

(Think about it, no answers given) Give a reason why it is not necessary to assume that the mass of a
turkey is normally distributed in order to carry out the calculation above.

Solution:

Let X kg be the mass of a turkey.

Since n = 300 is large, by CLT, $\bar{X} \sim N\left(6.7, \dfrac{3.1^2}{300}\right)$ approximately.

$P(6.5 < \bar{X} < 6.8) = 0.580$

Since n is large, by CLT $\bar{X}$ is approximately normal, hence there is no need to assume that X is
normal.
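The point that normality of X is not needed can be illustrated by simulation. The sketch below draws turkey masses from a skewed gamma distribution with the same mean and standard deviation. The gamma population is purely a hypothetical choice for illustration and is not given in the question. The proportion of sample means falling between 6.5 and 6.8 should still land close to the CLT value of 0.580.

```python
import random

random.seed(0)                        # fixed seed so the run is repeatable

mu, sigma, n = 6.7, 3.1, 300          # population mean, s.d. and sample size from the question

# Hypothetical skewed population: gamma with mean = shape * scale = 6.7
# and variance = shape * scale**2 = 3.1**2
scale = sigma**2 / mu
shape = mu / scale

trials = 2000
hits = 0
for _ in range(trials):
    # Mean of one random sample of 300 turkeys
    sample_mean = sum(random.gammavariate(shape, scale) for _ in range(n)) / n
    if 6.5 < sample_mean < 6.8:
        hits += 1

estimate = hits / trials              # simulated P(6.5 < X-bar < 6.8)
print(round(estimate, 3))
```

The estimate varies with the seed and the number of trials, so it should be read as a rough check rather than an exact value.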

Question 18

Select all that apply. The Central Limit Theorem applies when

• the sample size is sufficiently large.

• we are looking at the sampling distribution of the sample mean.

• the population does not have a normal distribution.

• the sample size is small.

Question 19

The number of goals per game scored by teams playing at home in the Premier League may be modelled by
a Poisson distribution. Team Rovers score an average of 1.63 goals per game played at home. In a
particular season, they play 19 home games. Using a suitable approximation, find the probability that
Rovers will score more than 35 goals in that particular season.

Solution:

Let X be the number of goals scored by Rovers in 19 home games.

X ∼ Po(19 × 1.63), i.e. X ∼ Po(30.97)

Using a normal distribution as an approximation, X ∼ N(30.97, 30.97) approximately.

P(X > 35) ≈ P(X > 35.5) (with continuity correction) = 0.208
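To see what the continuity correction does, the normal approximation can be evaluated with and without it and compared with the exact Poisson tail. This is a sketch in standard-library Python with illustrative variable names. The exact figure is computed only for comparison and is not part of the original solution.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

lam = 19 * 1.63                                  # mean goals in 19 home games, 30.97

# Normal approximation N(lam, lam); NormalDist takes the standard deviation
approx = NormalDist(lam, sqrt(lam))
p_with_cc = 1 - approx.cdf(35.5)                 # P(X > 35) with continuity correction
p_without_cc = 1 - approx.cdf(35)                # same tail without the correction

# Exact Poisson tail: P(X > 35) = 1 - P(X <= 35)
p_exact = 1 - sum(exp(-lam) * lam**k / factorial(k) for k in range(36))

print(round(p_with_cc, 3), round(p_without_cc, 3), round(p_exact, 3))
```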