LCCI International Qualifications

Business Statistics Level 3

Model Answers
Series 4 2011 (3009)

For further information contact us:

Tel. +44 (0) 8707 202909 Email. enquiries@ediplc.com www.lcci.org.uk


How to use this booklet

Model Answers have been developed by EDI to offer additional information and guidance to Centres, teachers and candidates as they prepare for LCCI International Qualifications. The contents of this booklet are divided into 3 elements:

(1) Questions: reproduced from the printed examination paper

(2) Model Answers: summary of the main points that the Chief Examiner expected to see in the answers to each question in the examination paper, plus a fully worked example or sample answer (where applicable)

(3) Helpful Hints: where appropriate, additional guidance relating to individual questions or to examination technique

Teachers and candidates should find this booklet an invaluable teaching tool and an aid to success. EDI provides Model Answers to help candidates gain a general understanding of the standard required. The general standard of model answers is one that would achieve a Distinction grade. EDI accepts that candidates may offer other answers that could be equally valid.

© Education Development International plc 2011 All rights reserved; no part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise without prior written permission of the Publisher. The book may not be lent, resold, hired out or otherwise disposed of by way of trade in any form of binding or cover, other than that in which it is published, without the prior consent of the Publisher

3009/4/11/MA


QUESTION 1

(a) State four characteristics of the normal distribution. (4 marks)

Each day from Monday to Friday John commutes to work by train. His journey takes an average of 40 minutes to complete with a standard deviation of 10 minutes.

(b) Assuming his journey times are normally distributed, find the probability that his journey takes:

(i) less than 20 minutes
(ii) less than 55 minutes
(iii) between 30 minutes and 60 minutes

to complete. (8 marks)

(c) If John always leaves home 60 minutes before his scheduled work start time, what is the probability that he is not late for work on any day in a normal five day week? (3 marks)

John buys a rail ticket each morning he arrives at the station. Over a typical 80 day period he notes that on 13 days there was no queue at the ticket office, on 25% of the days there was a queue of between 1 and 5 people and on the remaining days there was a queue of 6 or more. He also noted that on 5% of the occasions there was a pricing error on his ticket. Assume the length of the queue and possible ticket errors are independent events and that the above data is representative of all journeys made.

(d) Find the probability that on a randomly chosen day there is a queue at the ticket office and no error on the ticket purchased. (5 marks)

(Total 20 marks)

MODEL ANSWER TO QUESTION 1

Syllabus Topic 4.2: Normal Distribution

(a) Symmetrical; bell-shaped; mean, median and mode are equal; asymptotic to the x axis.   4x1

(b) (i) z = (x − x̄)/σ = (20 − 40)/10 = −20/10 = −2, p = 0.977, Ans = 1 − 0.977 = 0.023   1m 1ft 1cao

(ii) z = (x − x̄)/σ = (55 − 40)/10 = 15/10 = 1.5, Ans p = 0.933   1ft 1cao

(iii) z = (x − x̄)/σ = (30 − 40)/10 = −10/10 = −1.0, p = 0.841   1cao

z = (x − x̄)/σ = (60 − 40)/10 = 20/10 = 2.0, p = 0.977   1cao

Answer 0.841 + 0.977 − 1 = 0.818   1cao

(c) 0.977^5 = 0.891   1m 1ft 1cao

(d) 0.5875 and 0.1625 seen   1m 1cao

0.95   1cao

(0.5875 + 0.25) x 0.95 = 0.796   1m 1cao
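The Question 1 probabilities can be cross-checked numerically. The following is a minimal sketch, assuming the normal cumulative probability is computed from the error function rather than read from tables; the helper name normal_cdf and the variable names are illustrative only, and results may differ from table values in the third decimal place.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """P(X < x) for a normal distribution with the given mean and sd."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

MEAN, SD = 40.0, 10.0  # journey time in minutes, from the question

# (b) journey-time probabilities
p_under_20 = normal_cdf(20, MEAN, SD)
p_under_55 = normal_cdf(55, MEAN, SD)
p_30_to_60 = normal_cdf(60, MEAN, SD) - normal_cdf(30, MEAN, SD)

# (c) not late on any of five independent days
p_on_time_all_week = normal_cdf(60, MEAN, SD) ** 5

# (d) a queue of any length AND no pricing error (independent events)
p_no_queue = 13 / 80
p_queue = 1 - p_no_queue          # same as 0.25 + 0.5875
p_queue_no_error = p_queue * (1 - 0.05)

print(round(p_under_20, 3), round(p_under_55, 3), round(p_30_to_60, 3))
print(round(p_on_time_all_week, 3), round(p_queue_no_error, 3))
```

Part (c) treats the five days as independent, which is the same assumption the model answer makes when it raises 0.977 to the fifth power.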


QUESTION 2

A book company publishes books with the following associated costs.

Book                           A   B   C   D   E   F   G   H   I   J
Quantity Produced (000), x     2   5   6   7   9  11  15  11  10   6
Production Costs £(000), y    10  12  13  14  16  18  21  19  16  10

You are given the following totals:

Σx² = 798   Σy² = 2347   Σxy = 1342

(a) Calculate the least squares regression equation for production cost on quantity produced. (7 marks)

(b) Interpret the meaning of the coefficients in your answer to part (a). (4 marks)

(c) Calculate the correlation coefficient for these data. (4 marks)

(d) Use your regression equation to calculate the expected production costs of producing 12000 copies of a book. Comment on the likely accuracy of your answer. (5 marks)

(Total 20 marks)


MODEL ANSWER TO QUESTION 2

Syllabus Topic 3.1: Correlation and regression

(a) ∑x = 82, ∑y = 149   1cao, 1cao

b = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²)
  = (10 x 1342 − 82 x 149) / (10 x 798 − 82²)
  = 1202/1256 = 0.957   1m 1ft 1cao

a = (∑y − b∑x)/n = (149 − 0.957 x 82)/10 = 7.05   1cao

y = 7.05 + 0.957x   1cao

(b) 7.05 (£7050) are the fixed costs. 0.957 (£957 per 1000 produced) are the variable costs associated with the number of books printed.   2x2

(c) r = (n∑xy − (∑x)(∑y)) / √[(n∑x² − (∑x)²)(n∑y² − (∑y)²)]
      = (10 x 1342 − 82 x 149) / √[(10 x 798 − 82²)(10 x 2347 − 149²)]
      = 1202/1262 = 0.952

1ft 1cao 2 of num/denom, 1cao 1m 1cao

(d) 7.05 + 0.957x = 7.05 + 0.957 x 12 = 7.05 + 11.48 = 18.53 (£000)   1m 1ft 1cao

As the correlation coefficient is close to one there is very good association between the variables and therefore confidence can be high in the forecast.   2 (or ref. interpolation)
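The regression and correlation arithmetic in Question 2 can be reproduced directly from the ten data pairs. This is a sketch only, using the same summation formulas as the model answer; the variable names are illustrative, and because b is not rounded before a is calculated, the last decimal place may differ slightly from the hand-worked figures.

```python
from math import sqrt

# Quantity produced (000) and production cost (GBP 000) for books A to J
x = [2, 5, 6, 7, 9, 11, 15, 11, 10, 6]
y = [10, 12, 13, 14, 16, 18, 21, 19, 16, 10]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(v * v for v in x)
sum_yy = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Least squares slope and intercept for y on x
numerator = n * sum_xy - sum_x * sum_y
b = numerator / (n * sum_xx - sum_x ** 2)
a = (sum_y - b * sum_x) / n

# Product moment correlation coefficient
r = numerator / sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))

# Expected cost (GBP 000) of producing 12 (000) copies
forecast_12 = a + b * 12

print(f"y = {a:.2f} + {b:.3f}x, r = {r:.3f}, forecast = {forecast_12:.2f}")
```

A quantity of 12 (000) lies inside the observed range of 2 to 15 (000), so the forecast is an interpolation, which is the point the mark scheme's reference to interpolation is rewarding.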


QUESTION 3

A random sample of stations operated by a rapid transit rail company shows the following number of tickets sold between 10.00 am and 11.00 am over a sixteen day period.

Number of tickets sold

27 32 35 26 43 33 20 28
39 19 45 23 53 51 28 44

(a) Find the arithmetic mean and standard deviation for the above data. (6 marks)

(b) Estimate the 99% confidence interval for the mean number of tickets sold at a station between 10.00 am and 11.00 am. (5 marks)

The company has introduced a new on-line system of buying tickets. In the previous year the average sales of tickets between 10.00 am and 11.00 am was 41.

(c) Test whether the new system has significantly reduced the number of tickets bought between 10.00 am and 11.00 am. (9 marks)

(Total 20 marks)


MODEL ANSWER TO QUESTION 3

Syllabus Topic 2.1: Measures of location
Syllabus Topic 2.2: Measures of Location and Dispersion
Syllabus Topic 5.1: Procedure for Significance Tests
Syllabus Topic 5.3: Significance Tests and Confidence Intervals

(a) ∑x = 546   1cao

Mean = 546/16 = 34.125 (34.13)   1m 1cao

∑x² = 20342   1m

sd = √[(20342 − 546²/16)/15] = 10.676   1m 1cao

(b) 99% confidence interval: n − 1 degrees of freedom = 15, critical t value = 2.95   1cao

ci = x̄ ± t(0.99) × σ/√n = 34.125 ± 2.95 × 10.676/√16   1m 1ft

= 34.125 ± 7.874   1cao

= 26.251 to 41.999   1cao

(c) Null hypothesis: The mean number of tickets sold has not changed.   1 = Ho

Alternative hypothesis: The mean number of tickets sold has reduced.   1 = H1

n − 1 degrees of freedom = 15, critical t value = 2.13/2.95   1 df 1 cv

t = (x̄ − µ)/(sd/√n) = (34.125 − 41)/(10.676/√16) = −6.875/2.669 = −2.58   1m 1ft 1cao

Conclusions: The magnitude of the calculated t value is 2.58. Reject the null hypothesis at the 0.05 significance level: the mean number of tickets sold has reduced. At the 0.01 significance level the t value is smaller in magnitude than the critical t value, so do not reject the null hypothesis: the mean number of tickets sold has not reduced.   2 needs reference to 0.01 level
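The Question 3 working can be checked with a few lines of code. This is a sketch under stated assumptions: the critical t value of 2.95 is simply the figure quoted in the model answer (it is not computed here), and the variable names are illustrative.

```python
from math import sqrt

tickets = [27, 32, 35, 26, 43, 33, 20, 28,
           39, 19, 45, 23, 53, 51, 28, 44]

n = len(tickets)
mean = sum(tickets) / n

# Sample standard deviation (divisor n - 1, as in the model answer)
sd = sqrt(sum((v - mean) ** 2 for v in tickets) / (n - 1))
se = sd / sqrt(n)  # standard error of the mean

# (b) 99% confidence interval, using the critical value from the model answer
T_99 = 2.95
ci_low, ci_high = mean - T_99 * se, mean + T_99 * se

# (c) test statistic against last year's mean of 41 tickets
PREVIOUS_MEAN = 41
t_stat = (mean - PREVIOUS_MEAN) / se

print(f"mean = {mean:.3f}, sd = {sd:.3f}")
print(f"99% CI = {ci_low:.3f} to {ci_high:.3f}, t = {t_stat:.2f}")
```

The statistic comes out negative because the sample mean is below 41; its magnitude is what gets compared with the tabulated critical value.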


QUESTION 4

(a) State the circumstances, in time series analysis, in which finding the average seasonal variation by the additive method is to be preferred to finding the average seasonal variation by the multiplicative method. (4 marks)

A traffic survey on a busy road into the main business area of town showed the following data (thousands of vehicles) over a three week period.

          Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
Week 1      12      25      28        30         35       40       30
Week 2      13      27      30        33         38       42       31
Week 3      14      28      32        35         40       44       33

(b) Calculate a trend for this data using an appropriate moving average. (4 marks)

(c) Estimate the average daily variations. (4 marks)

(d) Forecast the number of vehicles for Monday and Tuesday of the fourth week. (6 marks)

(e) If it is known that the fourth week is a holiday week, how might this affect your result in part (d)? (2 marks)

(Total 20 marks)


MODEL ANSWER TO QUESTION 4

Syllabus Topic 3.2: Time Series

(a) The additive method is used in preference to the multiplicative method when the differences between the moving average and the original data are not related to the value of the moving average.   2 x 2 can use diagrams

(b)
Day          Vehicles 000   Moving Total   Moving Average (Trend)   Differences
Week 1 S          12
       M          25
       T          28
       W          30            200              28.57                  1.43
       T          35            201              28.71                  6.29
       F          40            203              29.00                 11.00
       S          30            205              29.29                  0.71
Week 2 S          13            208              29.71                -16.71
       M          27            211              30.14                 -3.14
       T          30            213              30.43                 -0.43
       W          33            214              30.57                  2.43
       T          38            215              30.71                  7.29
       F          42            216              30.86                 11.14
       S          31            218              31.14                 -0.14
Week 3 S          14            220              31.43                -17.43
       M          28            222              31.71                 -3.71
       T          32            224              32.00                  0.00
       W          35            226              32.29                  2.71
       T          40
       F          44
       S          33

1m 1cao 1m 1cao

(c)
                    S        M        T        W        T        F        S
Week 1                                        1.43     6.29    11.00     0.71
Week 2           -16.71    -3.14    -0.43     2.43     7.29    11.14    -0.14
Week 3           -17.43    -3.71     0.00     2.71
Total S Var      -34.14    -6.86    -0.43     6.57    13.57    22.14     0.57
Average S Var    -17.07    -3.43    -0.21     2.19     6.79    11.07     0.29

1cao 1 transfer ft 1m 1cao

(d) Rate of change = (32.29 − 28.57)/(15 − 1) = 3.71/14 = 0.265 (0.27)

Estimated trend week 4 Monday = 32.29 + 0.265 x 5 = 33.61
Estimated trend week 4 Tuesday = 32.29 + 0.265 x 6 = 33.88

1m 1cao 1m 1cao (4 marks + 2 marks)

Estimated number of vehicles Monday = 33.61 − 3.43 = 30.18 (000)
Estimated number of vehicles Tuesday = 33.88 − 0.21 = 33.66 (000)

1m 1cao

(e) It is reasonable to assume that the number of vehicles will be fewer than the forecast as fewer people go into work. The forecast will overestimate the number of vehicles.   2
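The moving-average method used in Question 4 can be sketched in code as follows. This is an illustration under stated assumptions, not a prescribed method: days are indexed from 0 (week 1 Sunday) to 20 (week 3 Saturday), the additive model is used, the trend is extended with the average rate of change between the first and last moving averages, and all names are hypothetical. Intermediate values are not rounded, so results can differ from hand working in the second decimal place.

```python
# Thousands of vehicles, Sunday to Saturday, weeks 1 to 3
vehicles = [12, 25, 28, 30, 35, 40, 30,
            13, 27, 30, 33, 38, 42, 31,
            14, 28, 32, 35, 40, 44, 33]
PERIOD = 7
HALF = PERIOD // 2  # a 7-point average is centred 3 days after its first day

# (b) moving totals and moving averages (the trend)
totals = [sum(vehicles[i:i + PERIOD])
          for i in range(len(vehicles) - PERIOD + 1)]
trend = [t / PERIOD for t in totals]

# (c) differences from trend, grouped by day of week (0 = Sunday)
diffs = {}
for i, tr in enumerate(trend):
    day = i + HALF
    diffs.setdefault(day % PERIOD, []).append(vehicles[day] - tr)
seasonal = {d: sum(v) / len(v) for d, v in diffs.items()}

# (d) extend the trend linearly and add the average daily variation
slope = (trend[-1] - trend[0]) / (len(trend) - 1)
last_trend_day = len(trend) - 1 + HALF  # index 17, week 3 Wednesday

def forecast(day_index):
    """Additive forecast for a 0-based day index beyond the trend."""
    steps = day_index - last_trend_day
    return trend[-1] + slope * steps + seasonal[day_index % PERIOD]

monday_w4 = forecast(22)   # week 4 Monday
tuesday_w4 = forecast(23)  # week 4 Tuesday

print(f"Monday {monday_w4:.2f}, Tuesday {tuesday_w4:.2f} (000 vehicles)")
```

A 7-point average is used because the data repeat on a weekly cycle; since 7 is odd, each average falls on an actual day and no further centring is needed.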


QUESTION 5

The records of a company contain the following data for the number of customers entering its stores per day.

                                    February   March
Mean number of customers per day      3560      3650
Standard deviation                     225       260
Median                                3420      3780
Sample size                            120       120

(a) Calculate and compare the coefficient of variation for the two sets of data. Explain what the coefficient of variation measures. (4 marks)

(b) Calculate and compare the coefficient of skewness for the two sets of data, explaining what the coefficient of skew measures. (6 marks)

(c) Test whether there has been a significant increase in the mean number of customers entering the company’s stores between the two months. (8 marks)

(d) Would your conclusion have been different if a two tail test had been used? Explain your reasoning. (2 marks)

(Total 20 marks)


MODEL ANSWER TO QUESTION 5

Syllabus Topic 2.3: Measures of Skewness
Syllabus Topic 2.4: Coefficient of Variation
Syllabus Topic 5.1: Procedure for Significance Tests
Syllabus Topic 5.3: Significance Tests and Confidence Intervals

(a) Coefficient of variation = (standard deviation/mean) x 100

February = 225/3560 x 100 = 6.3%
March = 260/3650 x 100 = 7.1%

Measures the relative variation of a data set. Both coefficients of variation are small, indicating there is little variation compared with the mean values. The figure for March is greater than February.   2x2

(b) Coefficient of skew = 3(mean − median)/sd   1m

February = 3(3560 − 3420)/225 = 1.87
March = 3(3650 − 3780)/260 = −1.5   2x1 cao

Coefficient of skew gives an impression of the shape of the data. Positive skew suggests a longer tail of data towards higher values of a distribution whilst negative skew shows a longer tail of data towards lower values of a distribution. The data shows positive skew in February compared with negative skew in March. February’s data is also more skewed than March’s, reflecting a greater difference between the mean and median.   3 needs some ref to what c of skew measures.

(c) Null hypothesis: There has been no significant change in the mean number of customers entering the stores.

Alternative hypothesis: There has been a significant increase in the mean number of customers entering the stores.   2x1

Critical z value = 1.64/2.33   1cao

z = (3650 − 3560)/√(225²/120 + 260²/120) = 90/31.388 = 2.87   1m 1ft 1cao

Conclusion: Reject the null hypothesis at both the 0.05 and 0.01 significance levels. The mean number of customers entering the stores has increased.   2x1 needs ref to 1%

(d) No. The critical value for z in a two tail test at the 0.01 significance level is 2.58, which is less than 2.87, so the null hypothesis is still rejected.   2ft
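The three calculations in Question 5 can be sketched from the summary statistics alone. The dictionary layout and function names below are illustrative assumptions; the formulas are the ones used in the model answer (coefficient of variation, Pearson's coefficient of skew based on the median, and a z statistic for the difference of two means with large samples).

```python
from math import sqrt

feb = {"mean": 3560, "sd": 225, "median": 3420, "n": 120}
mar = {"mean": 3650, "sd": 260, "median": 3780, "n": 120}

def coefficient_of_variation(s):
    """Standard deviation as a percentage of the mean."""
    return s["sd"] / s["mean"] * 100

def pearson_skew(s):
    """Pearson's coefficient of skew: 3(mean - median)/sd."""
    return 3 * (s["mean"] - s["median"]) / s["sd"]

# (c) z statistic for an increase from February to March
se_diff = sqrt(feb["sd"] ** 2 / feb["n"] + mar["sd"] ** 2 / mar["n"])
z = (mar["mean"] - feb["mean"]) / se_diff

for name, s in (("February", feb), ("March", mar)):
    print(name, round(coefficient_of_variation(s), 1), round(pearson_skew(s), 2))
print("z =", round(z, 2))
```

Taking March minus February in the numerator gives a positive z when customer numbers have risen, which matches the direction of the one tail alternative hypothesis.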


QUESTION 6

(a) Explain the impact of a business setting its quality control limits too wide. (4 marks)

Quality control procedures are used which set the warning limits at the 0.025 probability point and action limits at the 0.001 probability point. This means, for example, that the upper action limit is set so that the probability of the means exceeding the limit is 0.001.

The average weight of farmed trout is 125 grams with a standard deviation of 10 grams. Samples of 7 fish at a time are taken from a tank of fish to check the average weight.

(b) (i) Draw a control chart to monitor the process. (8 marks)

(ii) 6 samples are taken with the following mean weights (grams):

Weight per fish (grams)   128.5   131.2   124.6   137.4   134.6   123.2

Plot these data on your control chart and comment appropriately. (4 marks)

(c) An improved feeding regime has decreased the standard deviation to 7 grams. Assuming the arithmetic mean remains the same and the sample size is 7, what is the probability that a sample mean lies outside the upper warning limit? (4 marks)

(Total 20 marks)


MODEL ANSWER TO QUESTION 6

Syllabus Topic 1.1: Graphical Presentation
Syllabus Topic 5.6: Quality Control

(a) By setting the quality control limits too wide the number of samples rejected will decrease and costs of resetting the process will be reduced. The business may gain a reputation for poor quality which may benefit its competitors.   2x2

(b) (i) Warning limits = x̄ ± 1.96 × σ/√n = 125 ± 1.96 × 10/√7 = 125 ± 7.4 = 117.6 to 132.4   1 for 1.96, 1m 1cao

Action limits = x̄ ± 3.09 × σ/√n = 125 ± 11.7 = 113.3 to 136.7   1 for 3.09, 1cao limits

(ii) [Control chart: horizontal lines at UAL, UWL, Mean, LWL and LAL, with the six sample means plotted]

Title, axes & scale 1
Plot of lines 2
Plot of means 2

Comment: Sample 4 is outside the upper action limit. The process should be stopped.   2

(c) z = (x − x̄)/(σ/√n) = (132.4 − 125)/(7/√7) = 7.4/2.65 = 2.8, p = 0.997   1m 1ft z = 2.8 1cao

Answer = 1 − 0.997 = 0.003   1cao
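The control limits and the part (c) probability in Question 6 can be verified with a short sketch. The multipliers 1.96 and 3.09 are the ones used in the model answer; the variable names are illustrative, and the tail probability is computed from the error function rather than read from tables, so it may differ slightly from the tabulated 0.003.

```python
from math import erf, sqrt

MEAN, SD, N = 125.0, 10.0, 7   # grams, grams, fish per sample
se = SD / sqrt(N)              # standard error of a sample mean

# (b)(i) warning and action limits as (lower, upper) pairs
warning = (MEAN - 1.96 * se, MEAN + 1.96 * se)
action = (MEAN - 3.09 * se, MEAN + 3.09 * se)

# (b)(ii) which of the six sample means fall outside the action limits
sample_means = [128.5, 131.2, 124.6, 137.4, 134.6, 123.2]
outside_action = [i + 1 for i, m in enumerate(sample_means)
                  if not action[0] <= m <= action[1]]

# (c) new sd of 7 grams, upper warning limit still taken as 132.4
new_se = 7 / sqrt(N)
z = (132.4 - MEAN) / new_se
p_above_uwl = 1 - 0.5 * (1 + erf(z / sqrt(2)))

print("warning:", [round(v, 1) for v in warning])
print("action:", [round(v, 1) for v in action])
print("samples outside action limits:", outside_action)
print("z =", round(z, 2), "P(above UWL) =", round(p_above_uwl, 3))
```

The list outside_action numbers the samples from 1 so that it can be compared directly with the comment in the model answer.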


QUESTION 7

(a) Explain why, when the sample size increases, the sample means cluster more closely about the population mean. (4 marks)

Eight pieces of rope are divided into two parts and subjected to two chemical treatments. After the treatment the ropes are subjected to breaking strain tests.

Rope Sample                          a     b     c     d     e     f     g     h
Breaking Strain (kg) Chemical A    1025  1038  1052  1068  1114  1250  1098  1058
Breaking Strain (kg) Chemical B    1047  1028  1064  1077  1125  1239  1112  1072

(b) Test whether there is any difference between the effects of the two chemicals on breaking strain. (12 marks)

(c) Explain what is meant by a type 2 error. Is it likely that a type 2 error has occurred in your answer to part (b)? Explain how you reached your conclusion. (4 marks)

(Total 20 marks)


MODEL ANSWER TO QUESTION 7

Syllabus Topic 2.2: Measures of Location and Dispersion Ungrouped Data
Syllabus Topic 5.1: Procedure for Significance Tests
Syllabus Topic 5.2: Type I and Type II Errors
Syllabus Topic 5.4: Significance Tests and Confidence Intervals

(a) When samples of a given size are taken, the distribution of the sample means is referred to as the sampling distribution of the mean. The formula is se = σ/√n; as n increases, for any given mean, the value of se will decrease.   2x2

(b) Null hypothesis: The two chemicals have no different effect on breaking strain.

Alternative hypothesis: The two chemicals have a different effect on breaking strain.   2

Degrees of freedom = n − 1 = 8 − 1 = 7   1

Critical t = 2.37/3.50   1

Chemical A   Chemical B     d
   1025         1047        22
   1038         1028       -10
   1052         1064        12
   1068         1077         9
   1114         1125        11
   1250         1239       -11
   1098         1112        14
   1058         1072        14
                     ∑d =   61

∑d² = 1443, Mean = 7.625, sd = 11.819   1 sd 1m 1cao

t = (d̄ − 0)/(σ/√n) = (7.625 − 0)/4.18 = 1.82   1m 1ft 1cao

Conclusion: The calculated value of t is less than the critical value of t at the 0.05 significance level. Do not reject the null hypothesis. The two chemicals have no different effect on breaking strain.   1 1

(c) A type 2 error is accepting a null hypothesis when it should be rejected. A type 2 error may have occurred as the null hypothesis was accepted.   2x2
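The paired-difference test in Question 7 can be reproduced as follows. This is a sketch only: the critical value 2.37 is the 0.05 level figure quoted in the model answer rather than something computed here, and the variable names are illustrative.

```python
from math import sqrt

# Breaking strain (kg) for rope samples a to h under each treatment
chem_a = [1025, 1038, 1052, 1068, 1114, 1250, 1098, 1058]
chem_b = [1047, 1028, 1064, 1077, 1125, 1239, 1112, 1072]

# Paired differences, B minus A, as in the model answer
d = [b - a for a, b in zip(chem_a, chem_b)]
n = len(d)
mean_d = sum(d) / n

# Sample standard deviation of the differences (divisor n - 1)
sd_d = sqrt(sum((v - mean_d) ** 2 for v in d) / (n - 1))
se_d = sd_d / sqrt(n)

# Test statistic for H0: mean difference = 0, with n - 1 = 7 df
t_stat = (mean_d - 0) / se_d

T_CRIT_5PCT = 2.37  # two tail critical value quoted in the model answer
reject_h0 = abs(t_stat) > T_CRIT_5PCT

print(f"mean d = {mean_d}, sd = {sd_d:.3f}, t = {t_stat:.2f}, reject: {reject_h0}")
```

A paired test is appropriate here because each rope is cut in two and both halves are tested, so the two readings in a pair share the same rope.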


QUESTION 8

The table below shows the Retail Prices Index (RPI), 1987 = 100, and the weekly wage paid to an employee.

            2006    2007    2008    2009    2010
RPI        194.2   203     211.1   210.9   219.3
Wages £    480     510     526     549     558

(a) Convert both series of data to index numbers with 2006 as the base year = 100. (4 marks)

(b) Calculate, in real terms, how much the employee’s wage has risen between 2006 and 2010. (3 marks)

The company wishes to survey its employees regarding their attitudes to changes in pension arrangements. The finance director has been told that the payroll listing can be used as the sampling frame.

(c) Explain what is meant by the sampling frame and how the payroll listing may be used as a sampling frame. (4 marks)

(d) Explain how the following methods can be used to select a sample of employees, indicating one advantage and one disadvantage of each method.

(i) Systematic sampling
(ii) Random sampling
(iii) Stratified Random sampling.

(9 marks)

(Total 20 marks)


MODEL ANSWER TO QUESTION 8

Syllabus Topic 1.2: Survey Methodology
Syllabus Topic 2.5: Index Numbers

(a) Rebase the Retail Prices Index by dividing each value by 194.2 and multiplying by 100. Convert wages to an index by dividing each value by 480 and multiplying by 100, to give:

               2006    2007    2008    2009    2010
RPI index      100     104.5   108.7   108.6   112.9   m1 1cao
Wage index     100     106.3   109.6   114.4   116.3   m1 1cao

(b) Increase in real wage = (116.3/112.9) x 100% − 100% = 2.9% (3%)   1m 1ft 1cao

(c) A sampling frame is a list of all the relevant population. The payroll is a list of all the employees of a business and will be up-to-date and relevant. However some people may not be on the payroll, for example, new employees who may not receive a pension.   2x2

(d) (i) Systematic sampling: A random number is selected, for example 4, and then the 4th, 14th, 24th etc. are selected for a 10% sample.
Advantage: It is simple, cheap, and quick.
Disadvantage: There may be a natural periodicity in the data; it is not random, therefore the standard error cannot, legitimately, be calculated.   1 1 1

(ii) Random sampling: Each person in the sampling frame is identified and numbered. The sample size is selected and random numbers are generated to allow the sample to be identified.
Advantages: All persons have an equal chance of being selected. The standard error can be estimated.
Disadvantages: More time consuming and costly. May be difficult to contact all the identified sample.   1 1 1

(iii) Stratified Random sampling: The sampling frame has to be sub-divided into relevant groupings/strata, for example, age, gender, salary. A random sample is taken for each subset identified.
Advantages: The results will reflect more closely the composition of the employees’ views.
Disadvantages: More costly and time consuming as more preparatory work has to be undertaken to identify the sub groups.   1 1 1
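The index number working in parts (a) and (b) of Question 8 can be sketched as below. The helper name rebase and the variable names are illustrative assumptions. The real-terms rise is calculated from unrounded indices, which gives roughly 2.9%; working from the indices rounded to one decimal place gives roughly 3%, which is why the model answer quotes both.

```python
years = [2006, 2007, 2008, 2009, 2010]
rpi = [194.2, 203, 211.1, 210.9, 219.3]   # RPI, 1987 = 100
wages = [480, 510, 526, 549, 558]          # weekly wage in pounds

def rebase(series):
    """Express every value as an index with the first value = 100."""
    base = series[0]
    return [v / base * 100 for v in series]

# (a) both series with 2006 = 100
rpi_index = rebase(rpi)
wage_index = rebase(wages)

# (b) real-terms rise in wages from 2006 to 2010, as a percentage
real_rise_pct = (wage_index[-1] / rpi_index[-1] - 1) * 100

for yr, p, w in zip(years, rpi_index, wage_index):
    print(yr, f"{p:.1f}", f"{w:.1f}")
print(f"real rise = {real_rise_pct:.1f}%")
```

Dividing the wage index by the price index removes the effect of inflation, so the result shows how much more the 2010 wage buys than the 2006 wage.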


EDI International House Siskin Parkway East Middlemarch Business Park Coventry CV3 4PE UK Tel. +44 (0) 8707 202909 Fax. +44 (0) 2476 516505 Email. enquiries@ediplc.com www.ediplc.com
