Statistics Assignment

Student Ref: 20177530
Question 1
a) Frequency Distribution Table
Class Interval    Midpoint    Frequency
100.1 – 103       101.55      15
103.1 – 106       104.55      11
106.1 – 109       107.55      14
109.1 – 112       110.55      19
112.1 – 115       113.55      12
115.1 – 118       116.55      14
118.1 – 121       119.55      15
Figure 1
Figure 2
The frequency distribution table was created by arranging the data into separate
classes and determining their ranges. The midpoint of each class was then
calculated by averaging its lower and upper limits:
$$\text{Midpoint} = \frac{\text{Class}_h + \text{Class}_l}{2}$$
The frequency of each class (the number of resistors in it) is found either by
manually counting the values that fall in that range or by using the following
Excel formula, which counts the values at or above the lower limit of the class
and subtracts the number of values above its upper limit:
=COUNTIF(A3:A102,">=118.1")-COUNTIF(A3:A102,">121")
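The midpoint and counting steps can also be sketched outside the spreadsheet. The Python below is only an illustration: the class limits come from the table above, while the function names and the `sample` list are hypothetical (the raw resistor values are not reproduced in this document).

```python
# Class limits (lower, upper) taken from the frequency distribution table.
CLASSES = [(100.1, 103), (103.1, 106), (106.1, 109), (109.1, 112),
           (112.1, 115), (115.1, 118), (118.1, 121)]


def midpoint(lower, upper):
    """Average of the lower and upper class limits."""
    return (lower + upper) / 2


def count_in_class(values, lower, upper):
    """Mirror of COUNTIF(range, ">=lower") - COUNTIF(range, ">upper")."""
    at_or_above_lower = sum(1 for v in values if v >= lower)
    above_upper = sum(1 for v in values if v > upper)
    return at_or_above_lower - above_upper


# Rounded to 2 decimal places to match the table.
midpoints = [round(midpoint(lo, hi), 2) for lo, hi in CLASSES]

# Hypothetical sample, NOT the assignment's data set.
sample = [100.4, 102.9, 110.2, 118.1, 119.5, 121.0, 121.3]
last_class_count = count_in_class(sample, 118.1, 121)
```

With the real data in `sample`, calling `count_in_class` once per entry of `CLASSES` would reproduce the frequency column.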
b) The data set was entered into a spreadsheet in order to produce the
histogram and calculate the mean.
The mean can be calculated by entering the following formula into a
spreadsheet cell:
=AVERAGE(A3:A102)
This returns a mean of 110.621.

The mean can also be estimated from the histogram by summing the products of
each bar's frequency and midpoint and then dividing by the sum of the
frequencies:
Midpoint    Frequency    Midpoint x Frequency
101.55      15           1523.25
104.55      11           1150.05
107.55      14           1505.7
110.55      19           2100.45
113.55      12           1362.6
116.55      14           1631.7
119.55      15           1793.25
Total                    11067
Total / sum of frequencies    110.67
Figure 3
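This method gives a good estimate of the mean, but it is less accurate than the
spreadsheet calculation because every value in a class is treated as if it lay
at the class midpoint.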
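The grouped-mean arithmetic can be checked with a short Python sketch. The midpoints and frequencies are copied from the table above; the variable names are illustrative only.

```python
# Midpoints and frequencies from the frequency distribution table.
midpoints = [101.55, 104.55, 107.55, 110.55, 113.55, 116.55, 119.55]
frequencies = [15, 11, 14, 19, 12, 14, 15]

# Sum of (midpoint x frequency) divided by the sum of the frequencies.
total_products = sum(m * f for m, f in zip(midpoints, frequencies))
total_frequency = sum(frequencies)
grouped_mean = total_products / total_frequency

print(f"{grouped_mean:.2f}")  # prints 110.67
```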
c) The first method of determining the mode of the data is to enter the
following spreadsheet function into a cell:
=MODE(A3:A102)
This returns 118.2. The mode is the most commonly occurring value in the data
set and can also be found by manual counting. The mode can also be estimated
from the histogram by drawing a diagonal line from each upper corner of the
highest block to the upper corner of the adjacent block on the opposite side,
as shown below:
A vertical line is then plotted through the point where the two diagonal lines
intersect. The value at the point where it crosses the horizontal axis
determines the mode. This is a crude method and not very accurate compared to
the calculated method using the raw data.
Figure 4
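The intersection of the two diagonals can also be found algebraically: it lies at the lower edge of the tallest bar plus the bar width scaled by d_left / (d_left + d_right), where d_left and d_right are the height differences to the neighbouring bars. The sketch below assumes the tallest bar starts at 109.1 and is 3 units wide; the bar edges actually used in the plotted histogram are not stated, so the result shifts with that choice. The function name is hypothetical.

```python
def histogram_mode(lower_edge, width, f_modal, f_left, f_right):
    """x-coordinate where the two diagonals inside the tallest bar cross."""
    d_left = f_modal - f_left    # height difference to the left neighbour
    d_right = f_modal - f_right  # height difference to the right neighbour
    return lower_edge + width * d_left / (d_left + d_right)


# Tallest bar: 109.1 – 112 (f = 19); neighbours have f = 14 and f = 12.
# The lower edge of 109.1 is an assumption taken from the class interval.
mode_estimate = histogram_mode(109.1, 3, 19, 14, 12)  # 109.1 + 3 * 5 / 12
```

This estimate uses only the bar heights, so it reflects where the tallest class is skewed toward its neighbours rather than the single most repeated raw value.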
A cumulative data plot represents all the resistors in ascending order. From this
we can determine the mode by identifying the longest horizontal section (zero
gradient) of the trend, as shown in figure 5:
Figure 5
d) The median of the data set is determined by entering the following
formula into a cell in the spreadsheet:
=MEDIAN(B3:B102)
This returns the value 110.95, the middle value of the data set once the values
have been sorted. There are 100 resistor values in the range, so the median
lies between the 50th and 51st values, which are 110.9 and 111.0 respectively.
To calculate this manually we average the two values as follows:
$$\frac{110.9 + 111.0}{2} = 110.95$$
To determine the median from the histogram, a vertical line is drawn that divides
the total area of the histogram into two equal parts. The total area of the
histogram is 3 x 100 = 300 units (class width x total frequency). Halving this
gives 150, the area required on either side of the median. To achieve this the
largest rectangle (area 19 x 3 = 57 units) must be split so that
150 – (45 + 33 + 42) = 30 units lie to the left and
150 – (36 + 42 + 45) = 27 units lie to the right.
This places the median at approximately 111. This estimate is slightly less
accurate than the first method.
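The area split can be reproduced with a few lines of Python. This is a sketch under one stated assumption: the rectangle containing the median is taken to start at 109.1, the lower limit of its class, because the exact bar edges of the histogram are not given.

```python
# Frequencies from the frequency distribution table; each bar is 3 units wide.
frequencies = [15, 11, 14, 19, 12, 14, 15]
WIDTH = 3
MEDIAN_CLASS = 3      # index of the 109.1 – 112 class (the largest rectangle)
LOWER_EDGE = 109.1    # assumed left edge of that rectangle

areas = [f * WIDTH for f in frequencies]
half_area = sum(areas) / 2

# Area of the largest rectangle that must lie on each side of the median.
area_left = half_area - sum(areas[:MEDIAN_CLASS])
area_right = half_area - sum(areas[MEDIAN_CLASS + 1:])

# Move across the rectangle in proportion to the area needed on the left.
median_estimate = LOWER_EDGE + WIDTH * area_left / areas[MEDIAN_CLASS]
```

Under this assumption the estimate is about 110.7, which rounds to the value of approximately 111 quoted above.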
Question 2
a) Here we determine the standard deviation from the following data:
Maximum Load    Number of Cables
84 – 88         4
89 – 93         10
94 – 98         24
99 – 103        34
104 – 108       28
109 – 113       12
114 – 118       6
119 – 123       2
Figure 6
To achieve this another table is formed which details the class midpoint x, the
coded value x_c = (x – 101)/5, the frequency f, and the products x_c f and
x_c² f, with a step size (class width) of 5.
x        x_c    f      x_c f    x_c² f
86       -3     4      -12      36
91       -2     10     -20      40
96       -1     24     -24      24
101      0      34     0        0
106      1      28     28       28
111      2      12     24       48
116      3      6      18       54
121      4      2      8        32
Totals          120    22       262
Figure 7
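From figure 7 we can calculate the mean of x_c, and from it the mean of x,
using the following formulae:
$$\bar{x}_c = \frac{\sum x_c f}{\sum f} = \frac{22}{120} = 0.183$$
$$\bar{x} = 101 + 0.183 \times 5 = 101.917$$
The following formula is used to determine the standard deviation:
$$\sigma_c = \sqrt{\frac{\sum x_c^2 f}{\sum f} - \bar{x}_c^2} = \sqrt{\frac{262}{120} - 0.183^2} = \sqrt{2.15} = 1.466$$
Then multiply this by the step increment value of 5 to find the standard
deviation:
$$\sigma = 1.466 \times 5 = 7.331$$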
This value of 7.331 represents the typical distance (strictly, the
root-mean-square distance) of the values in the data set from the mean, which
indicates that this set has a relatively large dispersion and variability of
values.
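The coded-value method above can be verified with a short Python sketch. The midpoints and frequencies are taken from the tables in this question; the assumed mean of 101 and step of 5 are the values used in the calculation, and the variable names are illustrative.

```python
import math

# Class midpoints and frequencies from the cable-load table.
midpoints = [86, 91, 96, 101, 106, 111, 116, 121]
frequencies = [4, 10, 24, 34, 28, 12, 6, 2]
ASSUMED_MEAN = 101  # midpoint whose coded value is 0
STEP = 5            # class width

# Coded values x_c = (x - 101) / 5; every difference is an exact multiple of 5.
coded = [(x - ASSUMED_MEAN) // STEP for x in midpoints]

sum_f = sum(frequencies)
sum_cf = sum(c * f for c, f in zip(coded, frequencies))
sum_c2f = sum(c * c * f for c, f in zip(coded, frequencies))

mean_coded = sum_cf / sum_f
mean = ASSUMED_MEAN + mean_coded * STEP
sigma_coded = math.sqrt(sum_c2f / sum_f - mean_coded ** 2)
sigma = sigma_coded * STEP

print(f"mean = {mean:.3f}, sigma = {sigma:.3f}")
```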
b) To obtain the standard deviation using a spreadsheet the following
formulae were entered:
Row   A     B      C                  D      E             F             G
1     MAX LOAD     x                  x_c    f             x_c f         x_c² f
2     84    88     =((B2-A2)/2)+A2    -3     4             =D2*E2        =D2*D2*E2
3     89    93     =((B3-A3)/2)+A3    -2     10            =D3*E3        =D3*D3*E3
4     94    98     =((B4-A4)/2)+A4    -1     24            =D4*E4        =D4*D4*E4
5     99    103    =((B5-A5)/2)+A5    0      34            =D5*E5        =D5*D5*E5
6     104   108    =((B6-A6)/2)+A6    1      28            =D6*E6        =D6*D6*E6
7     109   113    =((B7-A7)/2)+A7    2      12            =D7*E7        =D7*D7*E7
8     114   118    =((B8-A8)/2)+A8    3      6             =D8*E8        =D8*D8*E8
9     119   123    =((B9-A9)/2)+A9    4      2             =D9*E9        =D9*D9*E9
11    Totals                                 =SUM(E2:E9)   =SUM(F2:F9)   =SUM(G2:G9)

Summary cells:
x̄_c (cell J1)            =F11/E11
Mean x̄                   =C5+(J1*(C3-C2))
σ_c (cell J5)             =SQRT((G11/E11)-(J1*J1))
Standard Deviation σ      =J5*(C3-C2)
The mean, x̄, was calculated as 101.917 and the standard deviation, σ, as 7.331,
which agree with the previously calculated values.
Question 3
a) Using the Binomial distribution, we can determine the probabilities of
defective components with this formula:
$$(q+p)^n = q^n + nq^{n-1}p + \frac{n(n-1)}{2!}q^{n-2}p^2 + \frac{n(n-1)(n-2)}{3!}q^{n-3}p^3 + \dots$$
i. Probability that 0 will be defective:
3% of 50 equals 1.5 expected defectives, so the probability that a single
component is defective is 1.5/50 = 0.03.
Therefore p = 0.03, q = 0.97 and n = 50.
$$q^n = 0.97^{50} = 0.218$$
ii. Probability that 1 will be defective:
$$nq^{n-1}p = 50 \times 0.97^{49} \times 0.03 = 0.337$$
iii. Probability that 1 or fewer will be defective:
$$q^n + nq^{n-1}p = 0.97^{50} + (50 \times 0.97^{49} \times 0.03) = 0.218 + 0.337 = 0.555$$
iv. Probability that 2 or more will be defective:
$$1 - (q^n + nq^{n-1}p) = 1 - [0.97^{50} + (50 \times 0.97^{49} \times 0.03)] = 1 - 0.555 = 0.445$$
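These four binomial probabilities can be checked with a minimal Python sketch. It uses the general term of the expansion, C(n, k) p^k q^(n-k), with n = 50 and p = 0.03 from the working above; the helper name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k defectives in n trials: C(n, k) p^k q^(n-k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 50, 0.03

p0 = binomial_pmf(0, n, p)        # q^n
p1 = binomial_pmf(1, n, p)        # n q^(n-1) p
p_at_most_1 = p0 + p1
p_at_least_2 = 1 - p_at_most_1    # complement of "1 or fewer"

for label, value in [("P(0)", p0), ("P(1)", p1),
                     ("P(<=1)", p_at_most_1), ("P(>=2)", p_at_least_2)]:
    print(f"{label} = {value:.3f}")
```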
b) Using the Poisson distribution, we can determine the probabilities of
defective fuses with this formula:
$$e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \dots\right)$$
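i. Probability that 0 will be defective:
$$\lambda = np = 250 \times 0.01 = 2.5$$
$$e^{-2.5} \times 1 = 0.082$$
ii. Probability that 1 will be defective:
$$\lambda e^{-\lambda} = 2.5e^{-2.5} = 0.205$$
iii. Probability that 2 will be defective:
$$\frac{\lambda^2 e^{-\lambda}}{2!} = \frac{2.5^2 \times e^{-2.5}}{2 \times 1} = 0.257$$
iv. Probability that 3 or more will be defective:
$$1 - \left(e^{-\lambda} + \lambda e^{-\lambda} + \frac{\lambda^2 e^{-\lambda}}{2!}\right) = 1 - (0.082 + 0.205 + 0.257) = 0.456$$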
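The Poisson terms can be checked the same way. The sketch below uses the general term λ^k e^(-λ) / k! with λ = np = 2.5 from the working above; the helper name `poisson_pmf` is illustrative.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k defectives: lam^k e^(-lam) / k!."""
    return lam ** k * exp(-lam) / factorial(k)


lam = 250 * 0.01  # lambda = n p

p0 = poisson_pmf(0, lam)
p1 = poisson_pmf(1, lam)
p2 = poisson_pmf(2, lam)
p_at_least_3 = 1 - (p0 + p1 + p2)  # complement of "2 or fewer"

for label, value in [("P(0)", p0), ("P(1)", p1),
                     ("P(2)", p2), ("P(>=3)", p_at_least_3)]:
    print(f"{label} = {value:.3f}")
```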
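c) To compare the results of the two distributions for a machine that produces
10% defective items, with a sample size of 10, we can see how they differ for
2 defective items.
10% = 0.1 = p, therefore q = 0.9 and n = 10.
Binomial:
$$(q+p)^n = q^n + nq^{n-1}p + \frac{n(n-1)}{2!}q^{n-2}p^2 + \frac{n(n-1)(n-2)}{3!}q^{n-3}p^3 + \dots$$
$$\frac{n(n-1)}{2!}q^{n-2}p^2 = \frac{10(10-1)}{2 \times 1} \times 0.9^{10-2} \times 0.1^2 = 0.194$$
Poisson:
$$\lambda = np = 10 \times 0.1 = 1$$
$$e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \dots\right)$$
$$\frac{\lambda^2 e^{-\lambda}}{2!} = \frac{1^2 e^{-1}}{2 \times 1} = 0.184$$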
There is a difference of 0.010 between the two approaches. This can be
explained by the fact that the Poisson distribution describes the situation
where the number of trials has no limit, whereas the binomial distribution
involves a finite, known number of trials, which is a key variable in the
probability calculation. The Poisson distribution is a good approximation to the
binomial distribution when n is large and p is small, but in this instance,
where n is only 10, it is more accurate to use the binomial method, even though
the Poisson method saves on computation.
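The comparison for 2 defective items can be reproduced with a short Python sketch, using n = 10 and p = 0.1 from the working above. The variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, k = 10, 0.1, 2
lam = n * p  # Poisson parameter lambda = n p

# Binomial term: C(n, k) p^k q^(n-k)
binomial_p2 = comb(n, k) * p ** k * (1 - p) ** (n - k)

# Poisson term: lambda^k e^(-lambda) / k!
poisson_p2 = lam ** k * exp(-lam) / factorial(k)

difference = binomial_p2 - poisson_p2
print(f"binomial = {binomial_p2:.3f}, poisson = {poisson_p2:.3f}, "
      f"difference = {difference:.3f}")
```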
Question 4
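For the sample size, N = 7, the mean is
$$\bar{x} = \frac{1.12+1.15+1.10+1.14+1.15+1.10+1.11}{7} = 1.1243\ \Omega\,\mathrm{m}^{-1}$$
and the standard deviation is
$$\sigma = \sqrt{\frac{(1.12-1.1243)^2 + (1.15-1.1243)^2 + (1.10-1.1243)^2 + \dots}{7}} = \sqrt{\frac{0.00297143}{7}} = 0.0206\ \Omega\,\mathrm{m}^{-1}$$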
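The mean and standard deviation can be checked with Python's standard `statistics` module. The seven readings are those listed in the mean calculation above; `pstdev` is used because the working divides by N rather than N - 1.

```python
from statistics import fmean, pstdev

# The seven specific-resistance readings from the mean calculation.
readings = [1.12, 1.15, 1.10, 1.14, 1.15, 1.10, 1.11]

mean = fmean(readings)
sigma = pstdev(readings)  # population form: divides by N, as in the working

print(f"mean = {mean:.4f}, sigma = {sigma:.4f}")
```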
With the mean and standard deviation calculated, we must find z, the normal
standard variate, for a 99% confidence level. The partial area under the curve
between the mean and z is
$$\frac{99 \div 2}{100} = 0.495$$
In the table of partial areas under the standardised normal curve in figure 8,
this area has been identified by the red square, which corresponds to z = 2.57.
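Using the formula:
$$\bar{x} \pm \frac{z\sigma}{\sqrt{N}}$$
We can evaluate the confidence limits:
$$1.124 + \frac{2.57 \times 0.0206}{\sqrt{7}} = 1.144\ \Omega\,\mathrm{m}^{-1}$$
$$1.124 - \frac{2.57 \times 0.0206}{\sqrt{7}} = 1.104\ \Omega\,\mathrm{m}^{-1}$$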
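The table lookup and the two limits can be sketched in Python. The mean, standard deviation and z = 2.57 are the rounded values from the working above; `NormalDist().inv_cdf` from the standard library is used here only as a cross-check on the table, and it returns a slightly more precise z than the two-decimal table entry.

```python
from math import sqrt
from statistics import NormalDist

MEAN, SIGMA, N = 1.1243, 0.0206, 7  # values from the working above
Z_TABLE = 2.57                      # value read from the table in figure 8

# A partial area of 0.495 either side of the mean is a cumulative area of 0.995.
z_exact = NormalDist().inv_cdf(0.5 + 0.495)

half_width = Z_TABLE * SIGMA / sqrt(N)
lower, upper = MEAN - half_width, MEAN + half_width

print(f"z (computed) = {z_exact:.3f}")
print(f"limits: {lower:.3f} to {upper:.3f}")
```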
Figure 8 – Partial areas under the standardised normal curve [1]
[1] John Bird, Higher Engineering Mathematics, 2006, p. 561

Another method is to employ the Student's t distribution, where the value for v
has been identified by the red square:
Figure 9 – Percentile values (tp) for Student's t distribution with v degrees of
freedom (shaded area = p) [2]
So, for t0.99 with v = 7 – 1 = 6 degrees of freedom, the table gives tc = 3.14.
[2] John Bird, Higher Engineering Mathematics, 2006, p. 587

From this we can carry out the following calculation:
$$\bar{x} \pm \frac{t_c s}{\sqrt{N-1}} = 1.1243 \pm \frac{3.14(0.0206)}{\sqrt{7-1}} = 1.1243 \pm 0.0264$$
Which gives the two confidence limits:
$$1.1243 + 0.0264 = 1.151\ \Omega\,\mathrm{m}^{-1}$$
$$1.1243 - 0.0264 = 1.098\ \Omega\,\mathrm{m}^{-1}$$
This indicates that there is a 99% chance that the true specific resistance of the
wire lies between 1.098 Ω m⁻¹ and 1.151 Ω m⁻¹.
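The t-based limits can be checked with a minimal Python sketch. The mean, standard deviation and tc = 3.14 are taken directly from the working above; the standard library has no Student's t table, so the table value is entered as a constant rather than computed.

```python
from math import sqrt

MEAN, S, N = 1.1243, 0.0206, 7  # values from the working above
T_C = 3.14                      # table value quoted for v = 6

half_width = T_C * S / sqrt(N - 1)
lower, upper = MEAN - half_width, MEAN + half_width

print(f"half-width = {half_width:.4f}")
print(f"limits: {lower:.3f} to {upper:.3f}")
```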
