
Math 1040 Skittles Term Project

Data Collection
I started this project by buying a 2.17-ounce bag of Original Skittles and recording the
number of candies of each color. My classmates did the same, and we pooled our counts into a
class data set to use as the reference for the next step of the project. Using StatCrunch, I then
made a pie chart, a Pareto chart, and a box plot. The objective of this project is to understand
and apply the concepts we are learning throughout the semester.
Stat 1040 Spring 2016 Skittles Data

S# ID      Total  Red  Orange  Yellow  Green  Purple
S00782426     60   14       8      13     13      12
S00683437     62   13      14      13      9      13
S00100509     60    8      12      13     11      16
S00849029     58   16      11       6     12      13
S00776901     58   16      12      12      4      14
S00560594     56   11      12      10     13      10
S00803934     60   11       7      17     14      11
S00775245     59   11      19      10      9      10
S00725046     58   11      16       7     12      12
S00208021     59   13      12       9     14      11
S00746197     63   11      15      12     13      12
S00266921     61   16       8       9     15      13
S00822804     58   13      11      12     11      11
S00619174     59   15       9      14     10      11
S00842982     60   16      16      10      9       9
S00760913     61   15      13       9     11      13
S00854511     64   15       8      13     11      17
S00823237     60   11      14       9     16      10
S00818486     54    6      13      14     10      11
S00604208     62   17      16       8      7      14
S00814210     57   10       9      14     19       5
S00625361     55   21      11       7      5      11
S00688475     60   11      14      12     15       8
S00798227     56   15      11       8     10      12
S00784553     58   16       9      11     15       7
S00576852     57   10       5      12     12      18
S00727349     66   18      10      10     13      15

Total       1601  360     315     294    313     319

Each student in the class will purchase one 2.17-ounce bag of Original Skittles and
record the following data:

Individual Data:
Total candies  Red  Orange  Yellow  Green  Purple
           60   11       7      17     14      11

Class Data: Total number of bags 27
Total candies  Red  Orange  Yellow  Green  Purple
         1601  360     315     294    313     319

Comparing my individual data with the class data, I noticed that my bag had a larger share
of green (23.3%) and yellow (28.3%) Skittles than the class as a whole (19.6% and 18.4%).
The percentages of orange (11.7%), purple (18.3%), and red (18.3%) Skittles in my bag were
lower than in the class data (19.7%, 19.9%, and 22.5%).
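
The comparison above can be reproduced with a few lines of Python. This is only a sketch: the dictionary and function names are my own, and the orange count of 7 for my bag is inferred from the 11.7% figure and the total of 60.

```python
# Colour counts for one bag and for the whole class (27 bags).
my_bag = {"red": 11, "orange": 7, "yellow": 17, "green": 14, "purple": 11}
class_totals = {"red": 360, "orange": 315, "yellow": 294, "green": 313, "purple": 319}


def percentages(counts):
    """Return each colour's share of the total, in percent."""
    total = sum(counts.values())
    return {colour: 100 * n / total for colour, n in counts.items()}


my_pct = percentages(my_bag)
class_pct = percentages(class_totals)

# Print the two shares side by side with the signed difference.
for colour in my_bag:
    diff = my_pct[colour] - class_pct[colour]
    print(f"{colour:<7} mine {my_pct[colour]:5.1f}%  class {class_pct[colour]:5.1f}%  diff {diff:+5.1f}")
```

A positive difference means the color was more common in my bag than in the class data.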

Summary statistics for the total number of candies per bag:
Mean  Std. dev.  Median  Min  Max  Q1  Q3
59.3       2.71      59   54   66  58  61
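
As a cross-check of the StatCrunch output, the same summary can be sketched with Python's standard `statistics` module. The list `totals` holds the 27 bag totals from the class data. Quartile conventions differ between tools, so treat the quartile call as an assumption; for this data set the module's default "exclusive" method happens to give the same Q1 and Q3 as reported here.

```python
import statistics

# Total candies per bag for the 27 bags in the class data set.
totals = [60, 62, 60, 58, 58, 56, 60, 59, 58, 59, 63, 61, 58, 59,
          60, 61, 64, 60, 54, 62, 57, 55, 60, 56, 58, 57, 66]

mean = statistics.mean(totals)
sd = statistics.stdev(totals)  # sample standard deviation (divides by n - 1)
q1, median, q3 = statistics.quantiles(totals, n=4)  # default "exclusive" method

print(f"n = {len(totals)}, mean = {mean:.1f}, std. dev. = {sd:.2f}")
print(f"min = {min(totals)}, Q1 = {q1}, median = {median}, Q3 = {q3}, max = {max(totals)}")
```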

In general, red is the most common color in the class data and yellow the least common,
so it seems that the company puts more red Skittles than yellow Skittles in its bags.

The box plot suggests a roughly symmetric, approximately normal distribution. There is one
outlier in the total number of Skittles per bag, the bag with 66 candies. An outlier like this
can affect the mean and the standard deviation.
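
The outlier can be checked by hand with the usual box plot rule. This sketch assumes the common 1.5 x IQR fence convention, which the project itself does not state, and it recomputes the quartiles with Python's default method.

```python
import statistics

# Total candies per bag for the 27 bags in the class data set.
totals = [60, 62, 60, 58, 58, 56, 60, 59, 58, 59, 63, 61, 58, 59,
          60, 61, 64, 60, 54, 62, 57, 55, 60, 56, 58, 57, 66]

q1, _, q3 = statistics.quantiles(totals, n=4)
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the quartiles are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in totals if x < lower_fence or x > upper_fence]

print(f"fences: {lower_fence} to {upper_fence}, outliers: {outliers}")
```

Under this rule the smallest bag (54) stays inside the lower fence, so only the largest bag is flagged.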

1. Do the graphs reflect what you expected to see? Does the overall data collected by
the whole class agree with your own data from a single bag of candies?
The graphs look very similar to what I expected. The mean for the class was 59.3
Skittles per bag, which is close to the 60 Skittles in my bag. Outliers may have
affected the data a little, but not much.

2. What is the shape of the distribution? Do the graphs reflect what you expected to
see? Does the overall data collected by the whole class agree with your own data
from a single bag of candies?
The frequency histogram appears to have a bell-shaped (approximately normal)
distribution. The overall data collected by my classmates were close to what I got
in my bag, so my own data agree with theirs. Pie graphs and bar charts are great
ways to graph categorical data.
Reflection
Categorical data are values or observations that can be sorted into groups or categories,
for example labels such as names or colors. Numerical data are values or observations that
can be counted or measured, such as height. Pie graphs and bar charts are used to graph
categorical data, like the ones I used to separate the Skittles by color. A pie chart makes the
information easy to understand because most people are visual.
Box plots, histograms, and scatter plots are best for graphing numerical data, for instance
the box plot I made. They are quantitative displays because they are built from numerical
observations. For quantitative data, the five-number summary is needed to draw a box plot,
and the mean describes the center. Categorical data don't need these kinds of calculations.

Confidence Interval Estimates


o Explain in general the purpose and meaning of a confidence interval.
A confidence interval gives an estimated range of values which is likely to include an
unknown population parameter, the estimated range being calculated from a given set of
sample data.
o Construct a 99% confidence interval estimate for the true proportion of yellow candies.

$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = (2.575)\sqrt{\frac{(0.184)(0.816)}{1601}} = 0.0249$$

$$0.1591 < p < 0.2089$$

I'm 99% confident that the true proportion of yellow candies is between 15.91%
and 20.89%.
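
The proportion interval can be sketched in Python as follows. The function name is hypothetical, and the critical value comes from `statistics.NormalDist` rather than the rounded table value 2.575, so the last digit may differ slightly from the hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value z_{alpha/2} from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin


# 294 yellow candies out of 1601 in the class data, 99% confidence.
low, high, margin = proportion_ci(294, 1601, 0.99)
print(f"E = {margin:.4f}, {low:.4f} < p < {high:.4f}")
```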

o Construct a 95% confidence interval estimate for the true mean number of candies per
bag.

$$E = t_{\alpha/2}\frac{s}{\sqrt{n}} = (2.056)\left(\frac{2.71}{\sqrt{27}}\right) = 1.0723$$

$$58.23 < \mu < 60.37$$

I'm 95% confident that the interval 58.23 to 60.37 contains the true value of $\mu$.
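
The arithmetic for the mean interval can be checked with a short Python sketch. The standard library has no t-distribution, so the critical value 2.056 (26 degrees of freedom) is simply copied from the calculation above; the variable names are my own.

```python
from math import sqrt

n = 27          # number of bags
mean = 59.3     # sample mean, candies per bag
s = 2.71        # sample standard deviation
t_crit = 2.056  # t critical value for 95% confidence, 26 df (from the table used above)

margin = t_crit * s / sqrt(n)
low, high = mean - margin, mean + margin

print(f"E = {margin:.4f}, {low:.2f} < mu < {high:.2f}")
```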

o Construct a 98% confidence interval estimate for the standard deviation of the number of
candies per bag.

$$\sqrt{\frac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_L}}$$

$$\sqrt{\frac{(27-1)(2.71)^2}{45.642}} < \sigma < \sqrt{\frac{(27-1)(2.71)^2}{12.198}}$$

$$2.05 < \sigma < 3.96$$

I'm 98% confident that the standard deviation of the number of candies per bag lies
between 2.05 and 3.96.
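
The same interval can be sketched in Python. The two chi-square critical values (26 degrees of freedom) are taken directly from the calculation above rather than computed, since the standard library has no chi-square distribution; the variable names are my own.

```python
from math import sqrt

n = 27               # number of bags
s = 2.71             # sample standard deviation
chi2_right = 45.642  # right-tail critical value, 26 df (from the table used above)
chi2_left = 12.198   # left-tail critical value, 26 df (from the table used above)

# The larger critical value gives the lower limit, and vice versa.
low = sqrt((n - 1) * s**2 / chi2_right)
high = sqrt((n - 1) * s**2 / chi2_left)

print(f"{low:.2f} < sigma < {high:.2f}")
```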

Hypothesis Tests
The best way to determine whether a statistical hypothesis is true would be to examine the
entire population. Since that is usually impractical and very expensive, researchers prefer
to examine a random sample from the population. If the sample data are not consistent with
the statistical hypothesis, the hypothesis is rejected.
There are two types of statistical hypotheses.
Null hypothesis. The null hypothesis, denoted by H0, is usually the hypothesis that sample
observations result purely from chance.
Alternative hypothesis. The alternative hypothesis, denoted by H1 or Ha, is the hypothesis
that sample observations are influenced by some non-random cause.
A result is statistically significant when it is unlikely to have occurred by chance alone.
The significance level, chosen before the test, determines which outcomes lead to rejection of
the null hypothesis. When the p-value is less than the significance level, we reject the null
hypothesis. If the p-value isn't less than the significance level, we fail to reject the null
hypothesis, because there isn't enough evidence to support the alternative.
o Use a 0.05 significance level to test the claim that 20% of all Skittles candies are red.

$$H_0: p = 0.20$$
$$H_1: p \neq 0.20$$

Critical values: $z = \pm 1.96$ (area 0.025 in each tail)

$$z_0 = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.2249 - 0.20}{\sqrt{\frac{(0.20)(0.80)}{1601}}} = 2.49$$

From the data above, our test statistic (2.49) is greater than our critical value (1.96).
Therefore, by definition, we reject our null hypothesis. In this case, we do have sufficient
evidence to reject the claim that 20% of all Skittles candies are red.
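
A sketch of the same test in Python, using only the standard library. The function name is hypothetical, and the p-value is an addition for illustration, since the project itself uses the critical-value method. Because the code uses the unrounded sample proportion, the statistic may differ from 2.49 in the last digit.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided z test for a single proportion against the claimed value p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


alpha = 0.05
z, p_value = one_proportion_z_test(360, 1601, 0.20)  # 360 red out of 1601
critical = NormalDist().inv_cdf(1 - alpha / 2)       # about 1.96
reject = abs(z) > critical

print(f"z = {z:.2f}, critical = {critical:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Both decision rules agree here: the statistic is beyond the critical value, and the p-value is below the significance level.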

o Use a 0.01 significance level to test the claim that the mean number of candies in a bag of
Skittles is 55.

$$H_0: \mu = 55$$
$$H_1: \mu \neq 55$$

Critical values: $t = \pm 2.779$ (area 0.005 in each tail)

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{59.3 - 55}{2.71/\sqrt{27}} = 8.24$$

From the data above, our test statistic (8.24) is greater than our critical value (2.779).
Therefore, by definition, we reject our null hypothesis. In this case, we do have sufficient
evidence to reject the claim that the mean number of candies in a bag of Skittles is 55.
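
The test statistic can also be computed from the raw bag totals, as in this sketch. The critical value 2.779 is copied from the calculation above because the standard library has no t-distribution. Working from the unrounded mean and standard deviation gives a statistic slightly below the hand-calculated 8.24, which illustrates the effect of rounding.

```python
from math import sqrt
import statistics

# Total candies per bag for the 27 bags in the class data set.
totals = [60, 62, 60, 58, 58, 56, 60, 59, 58, 59, 63, 61, 58, 59,
          60, 61, 64, 60, 54, 62, 57, 55, 60, 56, 58, 57, 66]

mu0 = 55        # claimed mean number of candies per bag
t_crit = 2.779  # two-sided critical value, alpha = 0.01, 26 df (from the table used above)

n = len(totals)
t_stat = (statistics.mean(totals) - mu0) / (statistics.stdev(totals) / sqrt(n))
reject = abs(t_stat) > t_crit

print(f"t = {t_stat:.2f}, critical = {t_crit}, reject H0: {reject}")
```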
Reflection
An interval estimate is defined by two numbers between which a population parameter is
estimated to lie, for example $\hat{p} - E < p < \hat{p} + E$. The sample must be a simple
random sample, that is, a sample selected at random from a larger population. The sampling
distribution must be approximately normal. For hypothesis testing, a normal distribution is
required unless the sample size is greater than 30.
If the sample (1) is obtained using simple random sampling, (2) has no outliers, and (3)
is normally distributed, then the t-distribution is used to test the hypothesis.
Our sample came from 27 students, fewer than 30, so the data needed to be approximately
normally distributed. The bell-shaped graphs included in Part II of the assignment helped show
visually that the data were approximately normally distributed. There were no major outliers.
And while the group performing the Skittles project was not huge, I felt that there were enough
bags to adequately represent the larger population.

Possible sources of error include the sampling method and the quality of the samples
themselves. The sampling technique could have been better if someone else, or the whole
class, had checked the counting procedure for each bag of Skittles. There could also be errors
in the calculations for the tests, because several different formulas were used and
intermediate values were rounded. I would suggest adding more specific guidance on rounding
to the term project, for example how many decimal places are needed for the t statistic.

Reflective writing and e-portfolio:

This project was very beneficial in helping me develop a better understanding of
statistics and how these concepts can be applied to real-world problems. The information I
learned in this class will be very helpful in my future career as a nurse. There are many ways
statistics can be used in a hospital setting. For instance, it may be used to calculate the average
number of people examined per day, week, month, or year. It can also be a very powerful tool in
determining the time interval at which a patient should be given a particular medicine. This
project has also taught me how to interpret data from surveys and to distinguish valid
professional papers from those that come from questionable sources.
This type of research helped me understand statistics beyond the formulas and
numbers. Most mathematics classes are all about memorizing formulas and getting the right
answer, but this class has challenged me to think about what the answer means and to draw
conclusions from my results. I had a hard time understanding this class at the beginning
because it challenged my way of thinking about mathematics.
StatCrunch, Minitab, and Excel produce graphs that are very helpful in presentations,
and I will be using those programs in my major. We are wired to take in visual content faster
and more effectively than words or numbers. That is why I find it so amusing how many
advertising companies use this to present their studies in a way that makes us buy their
products. Now that I have learned so much about statistics, I can put this knowledge into
practice, analyze those graphs, and decide whether the studies were well done or not.
