MATH1040
Skittles Project
For this assignment, each student recorded how many Skittles of each color were in a small bag.
Once every student had their counts, the counts were combined into one class data set. That was the
first part of this assignment. Below is a list of the counts I got from my own Skittles bag.
The following section shows the overall sample that was gathered by our class. It is shown in a
pie chart and a Pareto chart, separated by color. The sample size for this data is 3076 Skittles.
In the class data the colors go from Red, Orange, Green, Purple, and Yellow. When comparing this
to my own data I can see that mine isn't in that same order; mine goes from Orange, Purple, Green,
Yellow, and Red. Although mine is different, a lot of the numbers are pretty close to the overall
data. There is definitely a variety of combinations across bags, so each one will look different.
This section displays a frequency histogram, a boxplot, and a 5-number summary of the data. We are
focusing on the number of candies per bag.
Although the boxplot and the histogram are different graphs, they use the same information and
follow the same pattern: both are skewed left. The mean number of candies per bag was 60.3 with a
standard deviation of 3.44. To explain the boxplot a little more, it shows that the minimum number
of Skittles per bag was 50 and the maximum number per bag was 65. The first line of the box is the
first quartile (Q1), which was 59 Skittles; the second line is the median, which was 61 Skittles;
and the third line is the third quartile (Q3), which was 62 Skittles. The 5-number summary
supports the shape of the distribution seen in both graphs. Each individual person may have a
different number per bag; mine in particular had 54, which is close to the minimum.
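The 5-number summary can also be used to check the left skew and look for unusual bags. The sketch below is only an illustration: it uses the values reported above, and the 1.5 x IQR fence rule is an assumption added here, not something stated in the project.

```python
# Five-number summary and mean reported in the text (candies per bag).
minimum, q1, median, q3, maximum = 50, 59, 61, 62, 65
mean = 60.3
my_bag = 54  # the author's own bag

# Interquartile range and the usual 1.5 * IQR fences (assumed rule).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Bags outside the fences would be flagged as potential outliers.
min_is_outlier = minimum < lower_fence
max_is_outlier = maximum > upper_fence
my_bag_is_outlier = my_bag < lower_fence or my_bag > upper_fence

# Two informal signs of left skew: mean below median, longer lower whisker.
mean_below_median = mean < median
lower_whisker_longer = (q1 - minimum) > (maximum - q3)

print("IQR:", iqr, "fences:", lower_fence, upper_fence)
print("min outlier:", min_is_outlier, "max outlier:", max_is_outlier)
print("my bag outlier:", my_bag_is_outlier)
print("left-skew signs:", mean_below_median, lower_whisker_longer)
```

Under that assumed rule, both signs of left skew hold, which agrees with the shape described above.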
There is a difference between categorical data and quantitative data, and this project consists of
both. Counting how many candies there are per bag is an example of quantitative data because each
value is a numerical count. It is used to see the sample size of the data as well. Once you
combine the candies and split them into groups by color, as in this project, it's categorical data.
Confidence intervals show whether the value we are looking for falls within a specific range. The
higher the confidence level, the more confident you can be that the value falls within the
interval. An example for this data is to find the true proportion of yellow candies and be 99%
confident. The number of Yellow Skittles out of the total number of Skittles is 581. As a
reminder, the number of all the Skittles is 3076. Below I will insert an image with the problem I
worked out to make a 99% confidence interval. After working out the numbers both by hand and by
calculator we can make a conclusion. The conclusion would say, we are 99% confident that the true
proportion of Yellow Skittles lies within that interval.
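Since the worked image is not reproduced here, the calculation can be sketched in code. This is a minimal sketch, assuming the standard normal-approximation interval for one proportion; the function name `proportion_ci` is made up for illustration, and only the counts 581 and 3076 come from the text.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for one proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 581 yellow Skittles out of 3076 total, as reported in the text.
low, high = proportion_ci(581, 3076, 0.99)
print(f"99% CI for the yellow proportion: ({low:.4f}, {high:.4f})")
```

With these inputs the interval should come out at roughly 17% to 21%, though the hand calculation in the original image may round differently.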
Another example would be constructing a 95% confidence interval for the true mean number of
candies per bag. Below I will insert an image of the work by hand; I also did it on the
calculator. The mean number of Skittles for all classes is 615.2. So after conducting the 95%
confidence interval for the true mean number of candies per bag, we can say that we are 95%
confident that the true mean number of Skittles is between 587.11 and 643.29.
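As a cross-check on the per-bag scale, a t-interval can be sketched from the per-bag summary reported earlier (mean 60.3, standard deviation 3.44). Two things here are assumptions, not values from the worked image: the sample size of 51 bags (taken from the 51 people mentioned later in the report) and the critical value 2.009, a table value for 95% confidence with 50 degrees of freedom. Because these inputs differ from the 615.2 figure used above, the result will not match the interval quoted there.

```python
from math import sqrt


def mean_ci(sample_mean, sample_sd, n, t_crit):
    """t-based confidence interval for a mean from summary statistics."""
    margin = t_crit * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Per-bag mean and standard deviation reported in the text.
# ASSUMPTIONS: n = 51 bags; t_crit = 2.009 (table value, 95%, df = 50).
low, high = mean_ci(60.3, 3.44, 51, 2.009)
print(f"95% CI for mean candies per bag: ({low:.2f}, {high:.2f})")
```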
For the following section we are going to focus on hypothesis tests. What we want to find out is
whether a claim about the population is supported by the sample we have. I will insert an image
below showing both hypothesis tests. Using this data, we will conduct a hypothesis test, using a
significance level of 0.05, to test the claim that 20% of all Skittles candies are red. To begin
the hypothesis test we have to write a null hypothesis and an alternative hypothesis. The null
hypothesis always states that the proportion or mean is equal to the value we are testing. The
alternative is what we are trying to find evidence for, whether it is more than, less than, or
not equal. For this case we will write the null hypothesis as H0: p = 0.20, and the alternative
hypothesis as H1: p ≠ 0.20. We put the values in the calculator and find that our p-value is 0.23
and our z value is 1.21. We then compare the p-value to the level of significance. In this case
our p-value of 0.23 is more than our level of significance of 0.05, so we fail to reject the null
hypothesis. We do not have sufficient evidence to reject the claim that 20% of Skittles are red.
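The calculator step for this test can be sketched as a two-sided one-proportion z-test. The text does not give the red count, so the 642 used below is an assumption, backed out from the reported z value of 1.21 and the total of 3076. The function name is also made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0 against H1: p != p0."""
    p_hat = successes / n
    # The standard error uses the null value p0, not the sample proportion.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


RED_COUNT = 642  # ASSUMPTION: not stated in the text, inferred from z = 1.21
z, p_value = one_proportion_z_test(RED_COUNT, 3076, 0.20)
reject_null = p_value < 0.05

print(f"z = {z:.2f}, p-value = {p_value:.2f}, reject H0: {reject_null}")
```

With that assumed count, the sketch leads to the same decision as the text: the p-value is above 0.05, so the null hypothesis is not rejected.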
Another example uses the data and a significance level to test the claim that the mean number of
candies in a bag of Skittles is 55. We will follow the same process as above and begin with the
null hypothesis and alternative hypothesis. The null hypothesis would be H0: μ = 55, and the
alternative hypothesis would be H1: μ ≠ 55. We put the values in the calculator and find that the
p-value is 0.000 and the t value is 11. Comparing the p-value to the level of significance, we
find that the p-value is less than the level of significance, so we reject the null hypothesis.
There is sufficient evidence to support the claim that the mean number of Skittles per bag is
not 55.
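The t value in this test can be reproduced from the per-bag summary reported earlier (mean 60.3, standard deviation 3.44). This is a sketch under assumptions: the sample size of 51 bags is taken from the 51 people mentioned in the report, and the comparison uses a table critical value of 2.009 (two-sided, 0.05 level, 50 degrees of freedom) in place of the calculator's p-value.

```python
from math import sqrt


def one_sample_t(sample_mean, sample_sd, n, mu0):
    """t statistic for H0: mu = mu0 from summary statistics."""
    return (sample_mean - mu0) / (sample_sd / sqrt(n))


# ASSUMPTION: n = 51 bags; mean and sd are the per-bag values from the text.
t_stat = one_sample_t(60.3, 3.44, 51, 55)

# ASSUMPTION: table critical value for a two-sided test, alpha 0.05, df 50.
T_CRIT = 2.009
reject_null = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.1f}, reject H0: {reject_null}")
```

A t statistic this far beyond the critical value is consistent with the very small p-value reported by the calculator.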
Before we do a hypothesis test, we have to meet 3 conditions: the observations are independent,
the sample-size check is greater than or equal to 10, and the sample is less than 5% of the
population. If these conditions are not met we cannot do the hypothesis test. During this project
there are many possible errors that could have been made, but using a calculator eliminates most
of these errors. There could've been errors when entering the data from each individual's bag of
Skittles and when adding up the total number of Skittles. The sampling method could've been
improved if each person could check the recorded data against their own counts, but doing that
with 51 people is a lot. Doing this project I was able to apply the concepts learned in class to
something real and not just a story problem in the book, which makes it interesting.