
Richard Kwan

Klekas 2nd

Math 1040

May 29, 2018

Skittles Project

Introduction

This project uses Skittles to examine the number of candies of each colour and the total number of candies per bag. In a class study, each student records the total number of Skittles in a bag and the count of each colour. Each student uses a 2.17-ounce bag of Original Skittles, counting only whole candies and disregarding any partial candies. The goal of the study is to analyze and record the number of Skittles per bag, along with the proportion of each colour within the sample, and to observe the mean. Each proportion will be given as a decimal rounded to three places and shown in a pie chart and a Pareto chart of the colours. After the class data are recorded, they will be summarised in a 5-number summary rounded to one decimal place, with the standard deviation rounded to two decimal places. The 5-number summary will be displayed in a boxplot, alongside a histogram of the data. The class data will then be compared with the data from a single bag of Skittles.


The class data pie chart differs from the personal data pie chart, as expected. In the personal data the most common colour was red, at 25% of the bag. In the class data it was yellow, at 23.9% of the total. The bar graph for the personal data has no clear shape, while the class data has a more uniform shape.
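The percentages quoted above can be reproduced from the recorded colour counts. The following is a minimal Python sketch, not part of the original project; the dictionary and function names are illustrative assumptions, while the counts are the ones recorded for the class and for the personal bag.

```python
# Colour counts as recorded in the project (class total and one personal bag).
class_counts = {"Red": 244, "Orange": 236, "Yellow": 289, "Green": 222, "Purple": 216}
personal_counts = {"Red": 15, "Orange": 11, "Yellow": 13, "Green": 8, "Purple": 13}


def proportions(counts):
    """Return each colour's share of the total, rounded to three decimal places."""
    total = sum(counts.values())
    return {colour: round(n / total, 3) for colour, n in counts.items()}


class_props = proportions(class_counts)
personal_props = proportions(personal_counts)

# Print the two sets of proportions side by side for comparison.
for colour in class_counts:
    print(f"{colour:<7} class={class_props[colour]:.3f} personal={personal_props[colour]:.3f}")
```

Sorting either dictionary by value in descending order gives the bar order a Pareto chart would use.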

              Red    Orange    Yellow    Green    Purple    Total in Bag
Class data    244    236       289       222      216       1207
Personal      15     11        13        8        13        60
Q3 61
Mid/Q2 60.5
Q1 60
Min 56
SD 1.73
Mean 60.35

The total number of candies was 1207 for the class data and 60 for the personal data. The mean of the class data is 60.35 candies per bag. There are three outliers: 56, 57, and 63. Although 56 is an outlier, it is still our minimum, and 63 is an outlier but also our maximum. The shape of the histogram looks to be skewed right.
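One common way to flag outliers is the 1.5 × IQR rule. The project does not say which rule was used, so the rule and the names below are assumptions; the quartiles and the three flagged values are taken from the summary above.

```python
# Quartiles from the class 5-number summary.
q1, q3 = 60.0, 61.0

# Assumed rule: a value is an outlier if it lies more than 1.5 * IQR beyond a quartile.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr


def is_outlier(value):
    """Return True if the value falls outside the 1.5 * IQR fences."""
    return value < lower_fence or value > upper_fence


# Check the values named as outliers in the text.
for value in (56, 57, 63):
    print(value, is_outlier(value))
```

Under this assumed rule the fences are 58.5 and 62.5, so all three values named in the text fall outside them.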

Categorical data are labels, such as purple and red, that cannot be added up. The Pareto and pie charts of colour display the categorical Skittles data. A line graph would not make much sense for these data, because it is meant for quantitative data. The quantitative data are the counts of candies per bag from the class and personal data; these are summarised by the boxplot, 5-number summary, and histogram. A graph used for categorical data usually cannot be used for quantitative data; a pie chart, for example, would not work because there is only one category of data. Means, medians, maximums, and minimums do not work for categorical data; they are used for quantitative data. This is because they describe a single set of numerical values rather than different categories like purple and red: there is no meaningful maximum of purple and red.

Introduction

A confidence interval is used to estimate a true population parameter from a sample. To do this, our sample must meet certain conditions: it must be a random sample of Skittles, the sample must be less than 5% of the population of Skittles, and the sample size must be large. We want a random sample of Skittles to remove as much bias as we can from the experiment. We want a sample that is less than 5% of the population so that removing some of the Skittles does not affect the next group of data or experiment; however, we also want a large enough sample size to ensure normality.

I am trying to estimate p, the true proportion of yellow Skittles. Our best guess is p̂ = .2167, but because of sampling variability this estimate is unlikely to be exactly correct, so we will calculate a 99% z-interval for p.

Conditions

● Random sample of skittles [✔]

● Independence condition, assuming the population is 24,140 [✔]

● Large sample size: not met, since 9.6 < 10, so proceed with caution [X]

99% CI = (.07967, .35366)


I am 99% confident that the interval from .07967 to .35366 captures the true proportion of yellow Skittles.
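The 99% interval above can be reproduced with a short calculation. This is a minimal sketch that assumes the standard one-proportion z-interval, p̂ ± z*·sqrt(p̂(1 − p̂)/n), with the personal bag's 13 yellow candies out of 60; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample values from the personal bag: 13 yellow candies out of 60.
yellow, n = 13, 60
confidence = 0.99

p_hat = yellow / n
# Critical value z* for a two-sided interval (about 2.576 for 99%).
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
# Standard error of the sample proportion.
std_err = sqrt(p_hat * (1 - p_hat) / n)
margin = z_star * std_err

lower, upper = p_hat - margin, p_hat + margin
print(f"99% CI = ({lower:.5f}, {upper:.5f})")
```

The interval is wide because a single bag of 60 candies gives a fairly large standard error.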

Now we’ll construct a 95% confidence interval estimate for the true mean number of candies per

bag.

We are trying to estimate μ, the average number of Skittles in a bag. Our best guess is x̄ = 60.35, but due to sampling variability this estimate is unlikely to be exactly correct. We will calculate a 95% t-interval for μ.

Conditions

● Random Sample of Skittles [✔]

● Independent assuming N > 400 [✔]

● Large enough sample size? No, because 20 < 30; proceed with caution [X]

95% CI = (59.54 , 61.16)

I am 95% confident that the interval from 59.54 to 61.16 captures the true mean number of Skittles per bag.
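The 95% interval for the mean can be checked in the same way. This sketch assumes the usual one-sample t-interval, x̄ ± t*·s/sqrt(n), using the mean, standard deviation, and sample size of 20 bags reported above. The critical value t* = 2.093 is the standard table value for 19 degrees of freedom; it is hard-coded here as an assumption because the Python standard library has no t distribution.

```python
from math import sqrt

# Summary statistics reported for the class data.
x_bar, s, n = 60.35, 1.73, 20

# Assumed table value of t* for a two-sided 95% interval with n - 1 = 19 degrees of freedom.
t_star = 2.093

# Standard error of the sample mean and the resulting margin of error.
std_err = s / sqrt(n)
margin = t_star * std_err

lower, upper = x_bar - margin, x_bar + margin
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```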

These results seem reasonable after looking at the boxplot and the histogram. The colour proportions in my bag were roughly in the range we predicted for each colour, which was .2 of the Skittles in a bag. My bag had exactly 60 Skittles, which falls within the 95% CI for the mean number of Skittles per bag.

Hypothesis testing is used to check whether a claim about a population parameter is supported by the sample. Hypothesis testing also has condition checks: a random sample, to remove any selection bias; independence, so the probability doesn't change significantly for each sample; and a large sample size, to ensure normality.

We will be using a 0.05 significance level to test the claim that 20% of all Skittles are red.

1. Let P = the true proportion of red Skittles. We test whether P is less than .20; the sample gives p̂ = .25. It is possible that there is no difference and the sample proportion differs from .20 only because of sampling variability. A 1-sided z-test for proportions will be conducted (α = .05).

2. H0: P = .20   H1: P < .20

3. Conditions

a. Random Sample of skittles [✔]

b. Independent assuming N>1,220. [✔]

c. Is the sample size large? No, np(1 − p) = 60(.20)(.80) = 9.6 < 10. We will proceed with caution. [X]

4. P(p̂ < .25) = P(z < .968) = .8335

5. Since the p-value is greater than alpha (.8335 > .05), we fail to reject the null hypothesis: there is not sufficient evidence to conclude that the true proportion of red Skittles is less than 20%.
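The test statistic and p-value in the steps above can be recomputed as follows. This is a sketch that assumes the standard one-proportion z-test with the personal bag (15 red out of 60). Because the alternative is H1: P < .20, the p-value is taken as the area to the left of the test statistic. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.20          # claimed proportion of red Skittles (null hypothesis)
red, n = 15, 60    # red candies and total candies in the personal bag
alpha = 0.05

p_hat = red / n

# Large-sample check used in the conditions: n * p0 * (1 - p0) should be at least 10.
large_sample_check = n * p0 * (1 - p0)

# Test statistic, using the standard error under the null hypothesis.
std_err = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / std_err

# Left-tailed p-value, matching H1: P < .20.
p_value = NormalDist().cdf(z)
reject_null = p_value < alpha

print(f"check = {large_sample_check:.1f}, z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if reject_null else "fail to reject H0")
```

Since p̂ is above .20, a left-tailed test cannot find evidence for P < .20, which is why the p-value is so large.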

We will use a .01 significance level to test the claim that the mean number of candies in a bag of Skittles is 55.

It appears at first that μ, the true mean number of Skittles in a bag, is greater than 55 because x̄ = 60.35. It is possible that the true mean is 55 and our sample mean is higher only because of sampling variability. A 1-sided t-test for means will be used to decide.

1. H0: μ = 55   H1: μ > 55

2. Conditions:

a. Random Sample of Skittles [✔]

b. Independent assuming N > 400 [✔]

c. Large enough sample size? No, because 20 < 30 [X]

3. P(x̄ > 60.35) = P(t > 13.86) ≈ 0.00000000001

4. Because the p-value is less than alpha (0.00 < .01), we reject the null hypothesis and can conclude that the true mean number of Skittles in a bag is greater than 55.
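The test statistic for the mean can be recomputed from the reported summary statistics. This sketch assumes the usual one-sample t statistic, t = (x̄ − 55)/(s/sqrt(n)), and compares it with an assumed table critical value of 2.539 (one-sided, α = .01, 19 degrees of freedom) instead of computing an exact p-value. With the rounded standard deviation of 1.73 it gives about 13.83, slightly below the 13.86 reported above, presumably because the report used the unrounded standard deviation.

```python
from math import sqrt

mu0 = 55                         # claimed mean number of Skittles per bag
x_bar, s, n = 60.35, 1.73, 20    # summary statistics reported for the class data

# Assumed table value for a one-sided test at alpha = .01 with n - 1 = 19 degrees of freedom.
t_critical = 2.539

# One-sample t statistic.
t_stat = (x_bar - mu0) / (s / sqrt(n))
reject_null = t_stat > t_critical

print(f"t = {t_stat:.2f}, reject H0: {reject_null}")
```

A statistic this far beyond the critical value leads to the same decision as the very small p-value reported above.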

Reflection

The conditions for hypothesis testing and for confidence intervals are the same. Since some conditions were not met, some of these conclusions could possibly be wrong, but we still proceeded with caution. If we had more bags of Skittles, the sampling distributions would be closer to normal and our conclusions would be more accurate.
