
Each student in a Math 1040 class was asked to buy one 2.17 ounce bag of Skittles and record
the number of whole candies of each of the five colors (yellow, red, green, purple, and orange)
in the bag. Once we had gathered the data from everyone in the class, we used it to compare and
contrast the proportion of each color of candy per bag. Below are a few charts summarizing the
data that was collected.

In the graphs above it is easy to see that in some of the graphs one or two colors appear
more often in each bag. In the pie chart showing the total proportions, yellow and purple were
present in higher amounts across the class overall. When I compare my bag with the class totals,
my graph does not show what I expected to see, because red is significantly more common than
yellow or purple. Therefore the overall data collected does not reflect my bag of candies. In the
Pareto chart I also see that purple and yellow are present in very similar amounts, and that
orange, red, and green are less common, but not by much.

Column   Mean   Std. dev.   Median   Min   Max   Q1   Q3
Total    59.5   4.69        60       36    64    59   61.5

Most of the data falls from 60-66, but a fair amount falls from 54-60 and a very small amount
from 36-42. The bags with totals from 36-42 seem to be outliers, because those counts lie up to
about 5 standard deviations below the mean, which makes them misleading when compared to the
rest of the data collected. When I compare my bag of candies to the whole class, it is easy to see
that my bag falls within one standard deviation of the mean. My bag therefore does not seem to be
misleading but in fact typical, because it falls where about 68% of the data should be found. In
my bag I found 63 whole candies. In total, 32 bags were counted when gathering this information,
one from each member of the class.
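
The distances quoted above can be checked with a short calculation. The following is a minimal sketch that assumes the rounded mean (59.5) and standard deviation (4.69) from the summary statistics; the function name z_score is only illustrative.

```python
# Rounded summary statistics reported for the class data (assumed exact here).
MEAN = 59.5
STD_DEV = 4.69

def z_score(count, mean=MEAN, std_dev=STD_DEV):
    """Number of standard deviations that a bag's count lies from the mean."""
    return (count - mean) / std_dev

smallest_bag = z_score(36)  # the minimum count in the class data
my_bag = z_score(63)        # the count in my own bag

print(f"36 candies: z = {smallest_bag:.2f}")
print(f"63 candies: z = {my_bag:.2f}")
print("My bag within one standard deviation:", abs(my_bag) <= 1)
```

With these rounded inputs the smallest bag lies about 5 standard deviations below the mean, and the bag of 63 lies less than one standard deviation above it.
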
Categorical data can also be referred to as qualitative data. This type of data might have an
order, but if it does not, that is fine, because in categorical data the order does not matter. In
quantitative data the variable being measured is numeric and can be ordered, for example weight
or length. Examples of categorical data are gender (female and male) or days of the week, where
the order does not matter. Quantitative data can be used with prices to see which price is the
highest and which is the lowest.

Confidence Interval Estimate


Confidence intervals are preferred to point estimates and to plain interval estimates because only
confidence intervals indicate both the precision and the uncertainty of the estimate. In other
words, confidence intervals are used to express the degree of uncertainty associated with a sample
statistic. A confidence interval is an interval estimate combined with a probability statement.
95% confidence interval for purple candies = (11.018, 13.544)
99% confidence interval for true mean of candies per bag = (57.395, 61.665)
98% confidence interval estimate for the standard deviation of number of candies per bag =
(3.661 < sigma < 6.753)

The requirements for the formulas that we used are that the sample must be a simple random
sample, the observations must be independent, and p multiplied by n must be at least 5. In the 1st
confidence interval p multiplied by n is equal to 6.25, which allows us to use the formula. For
the 2nd confidence interval, since n is greater than 30, we can use a z score. For the 3rd
confidence interval, since the standard deviation was given, we were able to use the variance
formula to find the 98% confidence interval.
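
As an illustration of the 2nd interval, here is a minimal sketch of a z-based confidence interval for the mean. It assumes the rounded summary statistics (mean 59.5, standard deviation 4.69, n = 32), and the function name mean_confidence_interval is hypothetical. Because the inputs are rounded, the result is close to, but not exactly, the interval reported above.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(mean, std_dev, n, confidence):
    """z-based confidence interval for a population mean."""
    # Two-tailed critical value, about 2.576 for 99% confidence.
    z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_critical * std_dev / sqrt(n)
    return mean - margin, mean + margin

# Rounded class statistics; the reported interval presumably used unrounded values.
low, high = mean_confidence_interval(mean=59.5, std_dev=4.69, n=32, confidence=0.99)
print(f"99% CI for mean candies per bag: ({low:.3f}, {high:.3f})")
```

The margin of error comes out at roughly 2.14 candies, which matches the half-width of the reported interval (57.395, 61.665).
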
Hypothesis Test
Hypothesis testing is a process by which an analyst tests a statistical hypothesis. The goal is to
either reject or fail to reject the null hypothesis. A hypothesis test is used to infer, from
sample data, a result about the larger population the sample came from.
0.01 significance level to test the claim that 20% of all Skittles candies are green: test statistic = -.0849
0.05 significance level to test the claim that the mean number of Skittles per bag is 56: test statistic = 4.257
After testing the claim that 20% of all Skittles candies are green, we can conclude that at the
0.01 significance level we fail to reject the null hypothesis, because -1.20 falls between -2.58
and 2.58, the critical values for a 99% confidence level. For the second claim we reject the null
hypothesis, because 4.26 falls in the rejection region, outside the critical values for a 95%
confidence level.
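
The decision for the second claim can be sketched as a two-tailed z test. This is a minimal example that assumes the rounded summary statistics (mean 59.5, standard deviation 4.69, n = 32); the function name mean_z_test is hypothetical. With rounded inputs the test statistic comes out near 4.22 rather than the 4.257 reported above, but the conclusion is the same.

```python
from math import sqrt
from statistics import NormalDist

def mean_z_test(sample_mean, claimed_mean, std_dev, n, alpha):
    """Two-tailed z test of the claim that the population mean equals claimed_mean."""
    test_statistic = (sample_mean - claimed_mean) / (std_dev / sqrt(n))
    # Critical value for a two-tailed test, about 1.96 when alpha is 0.05.
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    reject_null = abs(test_statistic) > critical_value
    return test_statistic, critical_value, reject_null

# Claim: the mean number of candies per bag is 56, tested at the 0.05 level.
stat, crit, reject = mean_z_test(59.5, 56, 4.69, 32, alpha=0.05)
print(f"test statistic = {stat:.2f}, critical value = {crit:.2f}, reject = {reject}")
```
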

Reflection
The requirements that must be met in order to construct a confidence interval for a proportion
are 1) the sample must be a simple random sample, 2) the conditions for the binomial distribution
are satisfied, 3) there are at least 5 successes and 5 failures. The sample does meet all the
requirements. The requirements for testing a claim about a population proportion share the 1st
two requirements listed above. The 3rd requirement is different, though: np must be greater than
or equal to 5, and nq must be greater than or equal to 5.
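
The requirement on np and nq can be checked directly. The sketch below assumes n = 32 and np = 6.25 as stated earlier in the report; the value of p is back-calculated from those two numbers, and the function name is hypothetical.

```python
def meets_success_failure_condition(n, p, minimum=5):
    """Check that expected successes (n*p) and failures (n*q) both reach the minimum."""
    q = 1 - p
    return n * p >= minimum and n * q >= minimum

# n and n*p are taken from the report; p is derived from them (an assumption).
n = 32
p = 6.25 / n

print("np =", n * p, "nq =", n * (1 - p))
print("Requirement met:", meets_success_failure_condition(n, p))
```
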
The requirements for the interval estimate and hypothesis test for a population mean are 1) the
sample must be a simple random sample. 2) Either or both of these conditions are satisfied: the
population is normally distributed or n is greater than 30. Our sample meets this requirement
because n is greater than 30.
The requirements for the interval estimate for a population standard deviation are 1) the sample
must be a simple random sample. 2) The population must have a normal distribution. We treated
this requirement as met because our sample size is greater than 30, although a large sample does
not by itself guarantee a normally distributed population.
The possible errors that could have occurred include entering incorrect data when putting the
data sheet together, for example typing in the number of green candies as the number of purple
candies. There also could have been an error when each individual counted their candies. To
reduce the number of errors, we could have had someone double-check each count, and also had
someone review the data sheet and compare it to each sample submitted by us (the students).

Reflective Writing
In the project above our class collected a small sample of 32 bags of Skittles, which were
randomly picked. We then counted the number of candies in each bag along with the number of
candies of each color per bag. Once everyone had gathered their individual information, all of
our results were given to our instructor, who put everyone's data into one data sheet and gave
each of us a copy. From that I learned how to conduct an experiment. I also learned in this class
that each of the samples was collected independently: one sample did not depend on what another
classmate's results were. Each sample was counted individually and contributed to the data set
of 32 bags.
Another thing I learned from this project was that errors can occur if one is not specific
about how the data should be collected and what specific material should be used. For example,
one of the students used a smaller bag than the rest of us, who used a 2.17 ounce bag. If this
student had not redone their data collection with the correct bag, our results would have been
wrong due to the fact that not everyone used the same size bag. From this I learned the
importance of a control in a sample and the importance of being specific with the instructions on
how exactly the sample should be gathered.
When we got in our groups and compared our pie charts, we were able to see that not
everyone's proportion of each color of candy was the same as in the pie chart of the whole class.
With this I learned the importance of the average (mean) of the sample. This helped me
understand why it is best to conduct an experiment with a large number of samples, to get as
close as possible to the true population average. I also learned that the larger the sample, the
more errors can occur, and that the expense of the experiment can get very high.

In the last part of the project we were asked to construct confidence intervals and
hypothesis tests to find various information about our data. For example, we had to construct a
99% confidence interval for the estimate of the true mean number of candies per bag. I had a
hard time with this interval because the calculations I was getting did not make sense; for
example, one of my answers for the estimate of the true mean number of candies was .3465. I had
to look at the data and think logically about whether .3465 even made sense. When I thought about
it I realized that I must have calculated something wrong, because when I looked at the sample
data collected most bags had anywhere from 56 to 63 pieces of candy. Common sense helped me
realize that it was absurd for one of the interval limits to be .3465 when the mean was very
different from my answer. There is a quote, the source of which I can't seem to recall, that says
"Statistics is simple math with common sense." I believe this is true, because in this case
common sense helped me check my work and realize how wrong I was.
