
Katelynn Robbins

Report Introduction
The purpose of this project is to analyze different aspects of a bag of skittles.
The first step is to collect data: count the number of skittles of each color in the
bag and the total number of skittles in the bag. The second step is to make a pie
chart and a Pareto chart showing how many skittles there are of each color, then
analyze the charts, consider which methods are most effective for this kind of data,
and compare the class values with my own bag of skittles. The third step is to
calculate the mean, the standard deviation, and the 5-number summary of the number
of skittles per bag, create a frequency histogram of those counts, analyze the
results, and observe the differences between categorical and quantitative data. The
fourth step is to construct and interpret confidence interval estimates for the true
proportion of orange candies, the true mean number of candies per bag, and the true
standard deviation of the number of candies per bag. The fifth step is to conduct
hypothesis tests of the claim that 20% of skittles are purple and of the claim that
the mean number of candies per bag is 58. The report ends with a reflection on the
assignment.
Organizing and Displaying Categorical Data: Colors

Skittles By Color Pie Chart

[Pie chart: Purple 19%, Green 22%, Red 20%, Orange 19%, Yellow 21%]

Skittles by Color Pareto Chart

[Pareto chart. Vertical axis: # of skittles (0 to 300). Horizontal axis: Color of
skittle (Green, Yellow, Red, Orange, Purple).]

Color     My Data   Class Data
Green     17        237
Yellow    7         226
Red       10        247
Orange    16        263
Purple    10        224

The graphs display generally what I expected. Although the colors in my own bag of
skittles were not evenly distributed, I expected the colors to be close to evenly
distributed across all of the bags combined. The graphs show that they are: each
color makes up between 19% and 22% of the total.
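The uneven split in my own bag can be made concrete with a short calculation. The sketch below is only an illustration: it uses the "My Data" counts from the table, and the names my_bag and color_shares are made up for this example.

```python
# Counts of each color in my own bag (the "My Data" column).
my_bag = {"Green": 17, "Yellow": 7, "Red": 10, "Orange": 16, "Purple": 10}

total = sum(my_bag.values())  # total number of candies in the bag


def color_shares(counts):
    """Return each color's share of the bag as a fraction of the total."""
    n = sum(counts.values())
    return {color: count / n for color, count in counts.items()}


shares = color_shares(my_bag)

# Pareto ordering: categories sorted from most frequent to least frequent.
pareto_order = sorted(my_bag, key=my_bag.get, reverse=True)

for color in pareto_order:
    print(f"{color:<7}{my_bag[color]:>3}  {shares[color]:.1%}")
```

If the colors were perfectly even, every share would be 20%. In this bag the shares range from about 12% (yellow) to about 28% (green).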
Organizing and Displaying Quantitative Data: the Number of Candies per Bag
Candies per bag
Mean                 59.9
Standard Deviation   2.28
Minimum              55.0
Q1                   59.0
Q2                   60.0
Q3                   62.0
Maximum              63.0

# of Skittles in Each Bag

[Frequency histogram. Horizontal axis: Total # of Skittles (55 to 63). Vertical
axis: Frequency (0 to 6).]

                              My Data   Class Data
Total # of skittles in bag    60        1197
# of Bags                     1         20

The distribution does not follow an obvious pattern in how many skittles there are
in each bag. I expected the number of skittles per bag to look more normally
distributed. The class data produced a mean of 59.9, which is very close to the
count in my own bag of skittles (60). The box plot (see scanned-in paper) also
shows that the distribution is skewed to the left.
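The reported mean and the left skew can be checked from the summary values alone. This is a rough sketch, not the method used in class: it works only from the totals and the 5-number summary reported above, and the variable names are made up for this example.

```python
# Summary values reported for the class data (20 bags).
total_candies = 1197
bags = 20
minimum, q1, median, q3, maximum = 55.0, 59.0, 60.0, 62.0, 63.0

# Mean number of candies per bag; 59.85 rounds to the reported 59.9.
mean = total_candies / bags

# 1.5 * IQR fences, a common rule of thumb for flagging outliers.
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# With no value outside the fences, the box plot whiskers reach the
# minimum and the maximum.
lower_whisker = q1 - minimum
upper_whisker = maximum - q3

# A mean below the median and a longer lower whisker both hint at left skew.
left_skew_hint = mean < median and lower_whisker > upper_whisker

print(mean, lower_whisker, upper_whisker, left_skew_hint)
```

The lower whisker (4 candies long) is much longer than the upper whisker (1 candy long), which agrees with the left skew seen in the box plot.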
Reflection
Categorical data groups observations into named categories or labels, while
quantitative data consists of numerical measurements or counts. Pie charts and bar
graphs make sense for categorical data because they display how the categories
relate to one another. 5-number summaries and frequency histograms make sense for
quantitative data because they help display how the data is distributed.
Calculations such as the mean or the standard deviation do not make sense for
categorical data, because the categories are labels rather than numbers; only the
counts and proportions of each category can be calculated. Many calculations can be
performed on quantitative data, however, such as the 5-number summary, the mean,
and the standard deviation. Because the numbers in quantitative data actually hold
meaning, you can analyze the distribution of the data.
Confidence Interval Estimates
A confidence interval is a range of values used to estimate the true value of a
population parameter.

We are 95% confident that the interval (0.167, 0.211) contains the true population
proportion of orange skittles.
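One way to arrive at an interval like this is the normal-approximation interval for a proportion. The sketch below rests on an assumption: it takes 226 orange candies among the 1197 counted, a figure inferred from the 19% orange share in the pie chart and the midpoint of the reported interval rather than stated in the text. The function name proportion_ci is made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# ASSUMPTION: 226 orange candies out of 1197 (inferred, not stated in the text).
low, high = proportion_ci(226, 1197, 0.95)
print(f"({low:.3f}, {high:.3f})")
```

Under that assumption the result rounds to (0.167, 0.211).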

We are 99% confident that the interval (58.441, 61.359) contains the true
population mean of the number of skittles per bag.
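An interval for the mean with an unknown population standard deviation is usually built from the t distribution. The sketch below uses the reported mean (59.9), standard deviation (2.28), and number of bags (20). The critical value 2.861 is taken from a t table for 19 degrees of freedom and 99% confidence rather than computed, and the function name mean_ci is made up for this example.

```python
from math import sqrt


def mean_ci(sample_mean, sample_sd, n, t_crit):
    """t-based confidence interval for a population mean."""
    margin = t_crit * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Table value: two-sided 99% critical value of t with n - 1 = 19 degrees of freedom.
T_CRIT_99_DF19 = 2.861

low, high = mean_ci(59.9, 2.28, 20, T_CRIT_99_DF19)
print(f"({low:.3f}, {high:.3f})")
```

With these inputs the result rounds to (58.441, 61.359).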

We are 98% confident that the interval (1.652, 3.597) contains the true population
standard deviation of the number of skittles per bag.
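An interval for the standard deviation is usually built from the chi-square distribution. The sketch below uses the reported standard deviation (2.28) and number of bags (20). The two critical values are taken from a chi-square table for 19 degrees of freedom and 98% confidence rather than computed, and the function name sd_ci is made up for this example.

```python
from math import sqrt


def sd_ci(sample_sd, n, chi2_right, chi2_left):
    """Chi-square confidence interval for a population standard deviation."""
    df = n - 1
    lower = sqrt(df * sample_sd ** 2 / chi2_right)
    upper = sqrt(df * sample_sd ** 2 / chi2_left)
    return lower, upper


# Table values for 19 degrees of freedom and 98% confidence:
CHI2_RIGHT = 36.191  # area 0.01 to the right
CHI2_LEFT = 7.633    # area 0.99 to the right

low, high = sd_ci(2.28, 20, CHI2_RIGHT, CHI2_LEFT)
print(f"({low:.3f}, {high:.3f})")
```

With these inputs the result rounds to (1.652, 3.597).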
Hypothesis Tests
A hypothesis test is a procedure for testing a claim about a property of a population.

There is not sufficient evidence to reject the claim that the population proportion of
Purple Skittles is 20%.
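A claim about a proportion can be tested with a one-proportion z test. The sketch below uses the class count of 224 purple candies out of 1197. The significance level of 0.05 is an assumption, since the level used is not stated here, and the function name proportion_z_test is made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Two-tailed z test of H0: p = p0. Returns the test statistic and p-value."""
    p_hat = successes / n
    # Under H0 the standard error uses the claimed proportion p0.
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


ALPHA = 0.05  # ASSUMPTION: significance level not stated in the report

z, p_value = proportion_z_test(224, 1197, 0.20)
reject_null = p_value < ALPHA
print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject_null}")
```

The test statistic is roughly -1.11 and the p-value is well above 0.05, so the null hypothesis is not rejected.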

There is sufficient evidence to reject the claim that the population mean is 58
skittles per bag.
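A claim about a mean can be tested with a one-sample t test. The sketch below uses the reported mean (59.9), standard deviation (2.28), and number of bags (20). The significance level of 0.05 is an assumption, the critical value 2.093 is taken from a t table for 19 degrees of freedom, and the function name mean_t_statistic is made up for this example.

```python
from math import sqrt


def mean_t_statistic(sample_mean, sample_sd, n, mu0):
    """Test statistic for a one-sample t test of H0: mu = mu0."""
    return (sample_mean - mu0) / (sample_sd / sqrt(n))


# ASSUMPTION: two-tailed test at the 0.05 significance level.
# Table value: critical t for 19 degrees of freedom with 0.025 in each tail.
T_CRIT_95_DF19 = 2.093

t_stat = mean_t_statistic(59.9, 2.28, 20, 58)
reject_null = abs(t_stat) > T_CRIT_95_DF19
print(f"t = {t_stat:.2f}, reject H0: {reject_null}")
```

The test statistic is about 3.73, which is beyond the critical value, so the null hypothesis is rejected.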

Reflection
The conditions for doing interval estimates and hypothesis tests for population
proportions are:
1. The sample must be a random sample.
2. The conditions for a binomial distribution must be satisfied.
3. There must be at least 5 successes and 5 failures.
Our samples meet these conditions because the skittles were a simple
random sample, there is a fixed number of trials, and there are two categories
of outcomes, which are either orange or not orange. There are also at least 5
that are orange and at least 5 that are not orange.
The conditions for doing interval estimates and hypothesis tests for population
means are:
1. The sample must be a random sample.
2. The population must be normally distributed and/or the sample size must be
greater than 30.
Our samples do not meet these conditions because the population is not
necessarily normally distributed, and the sample size is 20, which is less than
30.
The conditions for doing interval estimates for population standard deviations are:
1. The sample must be a random sample.
2. The population must be normally distributed.
Our samples do not meet these conditions because the population is not
necessarily normally distributed.
A type I error is the mistake of rejecting the null hypothesis when in reality it is true.
For example, I rejected the null hypothesis that the population mean is 58. If that
null hypothesis were actually true, my rejection would be a type I error.
A type II error is the mistake of not rejecting the null hypothesis when in reality it is
false. For example, I failed to reject the null hypothesis that 20% of all skittles
are purple. If that null hypothesis were actually false, my decision would be a type II error.
The sampling method that we used could be improved by getting skittles from
different states, creating a more representative sample. It could also be improved by
adding more bags to the data.
The conclusions that I have come to from my statistical research are that skittles
seem to be pretty evenly proportioned between all of the different colors, and that
there seem to be approximately 60 skittles in every bag.

Project Reflection

Throughout this skittles project, I have applied what I have learned while attending my
statistics class. I used Excel to generate graphs, calculated confidence interval estimates,
and conducted hypothesis tests. Most importantly, this assignment trained me to use my analytical
skills to interpret the accuracy of and the meaning behind each graph or interval.
While I was creating the bar graph to compare how many candies of each color there
were, Excel started the y-axis of the graph at 150, distorting the results. Looking at this
zoomed-in graph without realizing it, I was led to believe that there was an absurd difference
between the number of green skittles and the number of purple skittles. If I had not taken the
time to analyze these results and recognize that something was wrong with them, I would have
created a misleading graph that could have changed the way that people look at the skittles
company. However, since I understood that there was a problem and was able to fix it, I was
able to create a graph that accurately represents the data.
Numbers are just numbers. They don't mean anything without an accurate conclusion.
When conducting the hypothesis test of the claim that the mean number of skittles per
bag is 58, I came to the conclusion that I needed to reject the null hypothesis. Without a
full knowledge of what to do with that information, I first wrote the conclusion
"There is sufficient evidence to reject the null hypothesis." However, that does not take the
conclusion far enough. To fully understand what to do with this information, I needed to connect
the answer back to the claim. This led me to my new conclusion: "There is sufficient evidence to
reject the claim that the population mean is 58 skittles per bag."
Statistics, whether you like it or not, is a part of our everyday life. By understanding the
basics of statistics, and how to interpret and analyze them, we can create accurate representations
of data, come to correct and applicable conclusions, and understand the importance of these
statistics and the effect that they have on us.
