Klekas 2nd
Math 1040
Skittles Project
Introduction
This project will be conducted with Skittles and is used to show the total number of each
colour of Skittle along with the total per bag. A class study will take place in which each
student records the total number of Skittles and the number of each colour. Each student will
use a 2.17 ounce bag of original Skittles, counting only whole candies and disregarding any
partial candies. The goal of the study is to analyze and record the number of Skittles per bag,
along with the proportion of each colour within the sample, and to observe the mean. Each
proportion will be represented as a decimal rounded to three places and shown in a pie chart and
a Pareto chart for the colours. After the class data are recorded, they will be put into a
5-number summary to one decimal place, with the standard deviation rounded to two decimal
places. The 5-number summary will be shown in a boxplot, along with a histogram. The data will
then be compared to the personal data. In the personal data the most common colour was red at
25% of the bag. For the class data it was yellow, at 23.9% of the total amount. The bar graph
for the personal data has no shape
The total number of candies for the class data was 1207, and for the personal data it was 60.
The mean of the class data is 60.35. There are 3 outliers: 56, 57, and 63. Although 56 is an
outlier, it is still our minimum, and 63 is also an outlier but is our maximum. The shape of the curve
Categorical data is something you can't add up, like purple and red. The Pareto and pie
charts of the colour data display the Skittles' categorical data. A graph that wouldn't make
much sense here is a line graph, because that compares quantitative data. The quantitative data
are the counts from the class and personal data; these are summarized by the box plot, 5-number
summary and histogram. A graph used for categorical data, such as a pie chart, usually cannot
be used for quantitative data, since there is only one category of data. Means, medians,
maximums and minimums don't work for categorical data; they are used for quantitative data.
This is because they describe one group of the same kind of data and not different types like
purple and red; there isn't much of a maximum for purple and red.
Introduction
A confidence interval is used to estimate the true population parameter from a sample. In
order to do this, our sample must meet certain conditions: it is a random sample of Skittles,
the sample is less than 5% of the population of Skittles, and it is a large sample. We want a
random sample of Skittles to remove as much bias as we can from the experiment. We want a
sample that is less than 5% of the population because that way, if we were to remove some of
the Skittles, it won't impact the next group of data or experiment;
I am trying to estimate p = the true proportion of yellow coloured skittles. Our best guess
Conditions
Now we’ll construct a 95% confidence interval estimate for the true mean number of candies per
bag.
We are trying to estimate μ, which is the average number of Skittles in a bag. Our best
guess is that x̄ = 60.35, but due to sampling variability we're unlikely to be correct. We will
Conditions
● Large enough sample size? No, because 20 < 30; proceed with caution. [X]
I am 95% confident that the interval from 59.54 to 61.16 captures the true mean number of
Skittles per bag.
These results seem to be pretty accurate after looking at the box plot and the histogram.
My proportion of each colour was roughly in the same range as we predicted each colour to
have, which was .2 of the bag's worth of Skittles. Within my bag I had exactly 60 Skittles, so
the 95% CI for the number of Skittles per bag was true for me.
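The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes the class sample is the 20 bags mentioned in the sample-size condition and uses only numbers stated in this report; the variable names are illustrative.

```python
# Figures quoted in the report.
class_total = 1207               # total candies across the class data
n_bags = 20                      # bags in the class sample (from "20 < 30")
ci_low, ci_high = 59.54, 61.16   # reported 95% interval for the mean
my_bag = 60                      # candies in the personal bag

sample_mean = class_total / n_bags            # class mean per bag
midpoint = (ci_low + ci_high) / 2             # centre of the interval
margin = (ci_high - ci_low) / 2               # margin of error
my_bag_inside = ci_low <= my_bag <= ci_high   # is the personal count inside?

print(f"sample mean  = {sample_mean:.2f}")
print(f"CI midpoint  = {midpoint:.2f}")
print(f"margin       = {margin:.2f}")
print(f"60 inside CI = {my_bag_inside}")
```

The midpoint of the interval matches the class mean, and the personal count of 60 falls between the two endpoints.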
sample. Hypothesis testing also has condition checks: a random sample to remove any selection
bias, independence so the probability doesn't significantly change for each sample, and a large
We will be using a 0.05 significance level to test the claim that 20% of all Skittles are red.
1. True proportion p = red Skittles is less than p = .20 since p̂ = .25. It is possible that there
is no change and we got a smaller proportion due to sampling variability. There will be a
2. H0: p = .20
a. H1: p < .20
3. Conditions
c. Is the sample size large? No, 9.6 < 10. We will proceed with caution. [X]
5. Since the p-value is > alpha (.8335 > .05), we fail to reject the null hypothesis, and
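The p-value and the 9.6 in the sample-size check can be reproduced with a short sketch. It assumes a one-proportion z-test on the personal bag (n = 60, p̂ = .25) with the left-tailed alternative written in step 2; the report does not name the test, so treat this as one plausible reading rather than the exact procedure used.

```python
from math import sqrt
from statistics import NormalDist

n = 60         # candies in the personal bag (assumed to be the sample)
p_hat = 0.25   # sample proportion of red
p0 = 0.20      # proportion under the null hypothesis
alpha = 0.05   # significance level

large_sample_check = n * p0 * (1 - p0)   # condition value, needs >= 10
se = sqrt(p0 * (1 - p0) / n)             # standard error under H0
z = (p_hat - p0) / se                    # test statistic
p_value = NormalDist().cdf(z)            # left tail, for H1: p < p0
reject = p_value < alpha

print(f"n*p0*(1-p0) = {large_sample_check:.1f}")
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Because p̂ = .25 lies above .20, the left-tail area is large, which is why the p-value is far above alpha.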
We will use a .01 significance level to test the claim that the mean number of candies in a bag
of Skittles is 55.
It appears at first that the true mean μ = number of Skittles in a bag is greater than 55
because x̄ = 60.35. It is possible that the true mean is 55 and we were wrong due to sampling
2. Conditions:
4. Because the p-value is less than alpha (0.00 < .01), we reject the null
hypothesis and can conclude that the true mean number of Skittles in a bag is greater than
55.
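For reference, a test of this kind is usually based on the one-sample t statistic. Assuming a t-test on the n = 20 class bags, and writing s for the class standard deviation (its value is not restated in this section), the statistic would take the form:

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
  = \frac{60.35 - 55}{s / \sqrt{20}},
\qquad \text{df} = n - 1 = 19
```

A sample mean more than 5 candies above the claimed value, combined with a small standard error, gives a large t and therefore a p-value that rounds to 0.00.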
Reflection
The conditions for the hypothesis testing and CI are the same. Since some conditions
were not met, some of these conclusions could possibly be wrong, but we still proceeded with
caution. If we had more bags of Skittles, the distributions would have been increasingly normal to