
Statistics for Data Analysis Module Code B8IT103

QQI
Higher Diploma in Data Analytics

NOVEMBER 2015 EXAMINATIONS

Module Code: B8IT103

Module Description: Statistics for Data Analytics

Examiner: Thomas Fitzsimons

Internal Moderator: Niall Larkin

External Examiner: Paul Stynes


Date: Wednesday, 18th November 2015
Time: 3pm to 5pm

INSTRUCTIONS TO CANDIDATES
Attachments: Statistical tables and formulas (at back of the paper).
Calculators are permitted in this exam.

Time allowed is 2 hours


Answer 3 questions out of 4
All questions carry 100 marks


Question 1

a) Describe what is meant in statistics by a Normal Distribution.


(40 marks)

b) Suppose a College Lecturer wants to select 6 students out of a class of 12 students
to make a presentation. Calculate how many different samples of the 6 students can be
selected from the 12.
(30 Marks)
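
A minimal sketch of the counting in part (b), assuming the order in which the students are chosen does not matter and no student is chosen twice, so that the count is the binomial coefficient "12 choose 6". The function name count_samples is illustrative only.

```python
import math


def count_samples(class_size: int, sample_size: int) -> int:
    """Number of unordered samples drawn without replacement."""
    # Assumption: selection order is irrelevant, so combinations
    # (not permutations) are counted: n! / (k! * (n - k)!).
    return math.comb(class_size, sample_size)


n_samples = count_samples(12, 6)
print(f"Different samples of 6 from 12: {n_samples}")
```

If the order of the presenters did matter, math.perm(12, 6) would be the relevant count instead.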

c) Using a diagram to illustrate your answer, describe what is meant by the terms
Null and Alternate hypotheses. In your answer show what is meant by a Type I
and Type II error, and describe on what basis you would reject or accept a Null
Hypothesis.
(30 Marks)

Question 2

Part A
A computer tablet manufacturer requires aluminium covers that can support a weight
of 15kg, and has issued a tender to suppliers to fulfil an order of 20,000 covers. The
successful supplier must either meet or exceed the 15kg standard. Using the following
sample data from one supplier:

KG
15.5
17.3
13.8
14.9
15.2
15.6
16.2
14.7
19.9
15.1
16.2
17.0
16.5
15.7
15.0

(A) Specify the Null and Alternate hypotheses to test the supplier's claim that their
covers meet or exceed the 15kg standard.
(B) Calculate the Z score to determine if there is enough evidence to reject the Null
Hypothesis.
(C) Select an appropriate alpha (α) value for your test.
(D) Explain in detail what your answer means.
(40 Marks)
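
The arithmetic behind (B) can be sketched as follows. Two assumptions are made here that the paper does not state: the population standard deviation is unknown, so the sample standard deviation (n - 1 denominator) is used in its place, and the hypothesised mean is taken to be the 15kg standard. The choice of hypotheses and alpha is left to the candidate.

```python
import math
import statistics

# Sample weights (kg) from the supplier, as listed in the question.
weights_kg = [15.5, 17.3, 13.8, 14.9, 15.2, 15.6, 16.2, 14.7,
              19.9, 15.1, 16.2, 17.0, 16.5, 15.7, 15.0]
required_kg = 15.0  # hypothesised mean (assumption: the 15kg standard)

n = len(weights_kg)
sample_mean = statistics.mean(weights_kg)
# Assumption: sample standard deviation stands in for the unknown sigma.
sample_sd = statistics.stdev(weights_kg)
standard_error = sample_sd / math.sqrt(n)

# Test statistic: distance of the sample mean from 15kg in standard errors.
z_score = (sample_mean - required_kg) / standard_error

print(f"n = {n}, mean = {sample_mean:.3f}, sd = {sample_sd:.3f}")
print(f"z = {z_score:.2f}")
```

With only 15 observations and an estimated standard deviation, the same statistic would often be referred to a t distribution with n - 1 degrees of freedom rather than to the Normal tables.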


Part B

The ABC Battery Company has developed a new battery. The engineer in charge
claims that the new battery will operate continuously for at least 7 minutes longer
than the old battery. To test the claim, the company selects a simple random sample of
100 new batteries and 100 old batteries. The old batteries run continuously for 190
minutes with a standard deviation of 20 minutes; the new batteries, 200 minutes with
a standard deviation of 40 minutes. Test the engineer's claim that the new batteries run
at least 7 minutes longer than the old. (Assume that there are no outliers in either
sample.)

I. Calculate the t-Statistic to determine if there is a significant difference
between sample data.
II. Specify the NULL and Alternate Hypotheses.
III. Select an appropriate alpha (α) value for your test.
IV. Use the Statistics Tables in Appendix 1 to determine if there is, or is not, a
significant difference between the two samples.
V. Explain in detail what your answer means.

(40 Marks)
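
A sketch of the calculation in step I, under the assumption (not stated in the paper) that the two variances are not pooled, i.e. the unequal-variance (Welch) form of the two-sample t-statistic. The hypothesised difference of 7 minutes comes from the engineer's claim. All variable names are illustrative.

```python
import math

# Summary statistics given in the question (minutes).
n_new, mean_new, sd_new = 100, 200.0, 40.0
n_old, mean_old, sd_old = 100, 190.0, 20.0
claimed_difference = 7.0  # difference in means under the engineer's claim

# Per-sample contribution to the variance of the difference in means.
var_term_new = sd_new ** 2 / n_new
var_term_old = sd_old ** 2 / n_old

# Assumption: unpooled standard error (variances not assumed equal).
standard_error = math.sqrt(var_term_new + var_term_old)
t_statistic = ((mean_new - mean_old) - claimed_difference) / standard_error

# Welch-Satterthwaite approximation to the degrees of freedom.
welch_df = (var_term_new + var_term_old) ** 2 / (
    var_term_new ** 2 / (n_new - 1) + var_term_old ** 2 / (n_old - 1)
)

print(f"standard error = {standard_error:.3f}")
print(f"t = {t_statistic:.3f}, df = {welch_df:.1f}")
```

If equal variances were assumed instead, a pooled standard error with n_new + n_old - 2 degrees of freedom would be used; because both samples have 100 observations, the t value would be the same and only the degrees of freedom would change.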

Part C

Describe the t-statistic and when you would use it.

(20 Marks)


Question 3

A Sales Manager wants to compare sales figures for three new brands (X, Y, and Z)
for the first quarter (Q1) of the year.

Q1 Sales
Brand X    Brand Y    Brand Z
4          7          3
5          9          2
6          8          1

I. Report the Sample Means for the three brands.
II. Report the Grand Mean.
III. Find the Total Sum of Squares SST.
IV. Find the Sum of Squares Within-Groups SSW.
V. Find the Sum of Squares Between-Groups SSB.
VI. Select an appropriate alpha (α) value for your test.

VII. Test the hypothesis that the mean sales for the three brands are the same.
Report the null and alternative hypotheses, test statistic, df values, P-value,
and interpret.
VIII. Construct an ANOVA table for displaying the results of this analysis.

(100 marks total)
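
The quantities in steps I to V and the F ratio used in step VII can be sketched as below, using the usual one-way ANOVA decomposition SST = SSW + SSB. The variable names are illustrative, and the P-value is not computed here because it depends on the F tables (or a statistics library) and on the chosen alpha.

```python
# Q1 sales for each brand, as given in the table.
sales = {"X": [4, 5, 6], "Y": [7, 9, 8], "Z": [3, 2, 1]}

# Step I: sample mean of each brand.
group_means = {brand: sum(v) / len(v) for brand, v in sales.items()}

# Step II: grand mean over all nine observations.
all_values = [x for v in sales.values() for x in v]
grand_mean = sum(all_values) / len(all_values)

# Step III: total sum of squares (each value against the grand mean).
sst = sum((x - grand_mean) ** 2 for x in all_values)
# Step IV: within-groups sum of squares (each value against its own mean).
ssw = sum((x - group_means[b]) ** 2 for b, v in sales.items() for x in v)
# Step V: between-groups sum of squares (group means against grand mean).
ssb = sum(len(v) * (group_means[b] - grand_mean) ** 2 for b, v in sales.items())

# Degrees of freedom and the F ratio of the two mean squares.
df_between = len(sales) - 1
df_within = len(all_values) - len(sales)
f_statistic = (ssb / df_between) / (ssw / df_within)

print(group_means, grand_mean)
print(f"SST = {sst}, SSW = {ssw}, SSB = {ssb}")
print(f"F({df_between}, {df_within}) = {f_statistic}")
```

The F ratio would then be compared with the tabulated critical value for (2, 6) degrees of freedom at the alpha selected in step VI.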

Question 4

Part (A)

A genetics engineer was attempting to cross a tiger and a cheetah. She predicted a
phenotypic outcome of the traits she was observing to be in the following ratio 4
stripes only: 3 spots only: 9 both stripes and spots. When the cross was performed
and she counted the individuals she found 50 with stripes only, 41 with spots only and
85 with both.

Use a Chi-square test; did she get the predicted outcome?

(50 Marks)
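
A sketch of the goodness-of-fit calculation, assuming expected counts are obtained by scaling the predicted 4:3:9 ratio to the number of individuals observed. The significance level of 0.05 is an assumption made for illustration; 5.991 is the tabulated Chi-square critical value for 2 degrees of freedom at that level.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed counts: stripes only, spots only, both stripes and spots.
observed = [50, 41, 85]
predicted_ratio = [4, 3, 9]

# Expected counts: the predicted ratio scaled to the observed total.
total = sum(observed)
expected = [total * r / sum(predicted_ratio) for r in predicted_ratio]

chi_square = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1

# Assumption: alpha = 0.05, giving a critical value of 5.991 for df = 2.
critical_value = 5.991
reject_null = chi_square > critical_value

print(f"expected = {expected}")
print(f"chi-square = {chi_square:.3f}, df = {degrees_of_freedom}")
print(f"reject predicted ratio at alpha = 0.05: {reject_null}")
```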


Part (B)

1. A poker-dealing machine is supposed to deal cards at random, as if from an infinite
deck.
In a test, you counted 1600 cards, and observed the following:
Spades 404
Hearts 420
Diamonds 400
Clubs 376

A. Use a Chi-square test to measure the discrepancy between the observed results
and the expected results. Explain the significance of your result.

(25 Marks)
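
For this first test, a sketch of the discrepancy measure, assuming that "at random" means each of the four suits is equally likely, so that a quarter of the 1600 cards is expected in each suit. The 0.05 significance level is an illustrative assumption; 7.815 is the tabulated critical value for 3 degrees of freedom at that level.

```python
# Observed suit counts from the 1600-card test.
observed_suits = {"Spades": 404, "Hearts": 420, "Diamonds": 400, "Clubs": 376}

total_cards = sum(observed_suits.values())
# Assumption: a random deal makes every suit equally likely.
expected_per_suit = total_cards / len(observed_suits)

# Pearson chi-square: sum of (O - E)^2 / E over the four suits.
chi_square_suits = sum(
    (count - expected_per_suit) ** 2 / expected_per_suit
    for count in observed_suits.values()
)
df_suits = len(observed_suits) - 1

# Assumption: alpha = 0.05, giving a critical value of 7.815 for df = 3.
critical_value_suits = 7.815

print(f"expected per suit = {expected_per_suit}")
print(f"chi-square = {chi_square_suits:.2f}, df = {df_suits}")
print(f"exceeds critical value: {chi_square_suits > critical_value_suits}")
```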

2. The same poker-dealing machine is again supposed to deal cards at random, as if
from an infinite deck, but this time jokers are included.
In a test, you counted 1662 cards, and observed the following:
Spades 404
Hearts 420
Diamonds 400
Clubs 356
Jokers 84

A) How many jokers would you expect out of 1662 random cards? How many of
each suit?
B) Is it possible that the cards are really random? Or are the discrepancies too
large?

(25 Marks)

END OF EXAMINATION
