
Name: Katie Alpert

Block: 4

Group Members names: Hallie, Quinn, and Conor


The Observed and Expected Amount of Each Color of M&M Per Bag
Introduction (paragraph form)
A Chi-Square analysis is used to determine whether the difference between one's observed
results and expected results is due to chance, to a sampling error, or to an original
expectation that was incorrect. A Chi-Square lab helps reveal why a scientist obtained
results that were not anticipated. Before beginning a Chi-Square lab, a person must
identify the null hypothesis. In general, a null hypothesis states that the process being
studied does not change the control, or that nothing is present that will change the
control, so the control group and the treated group are not any different. In other words,
apart from chance, the observed and expected data sets are the same. The null hypothesis
for a Chi-Square lab is that any difference between the observed and expected data is due
to chance. This hypothesis is always the same for Chi-Square analyses.
The purpose of a Chi-Square lab is to decide whether to accept or reject the null
hypothesis. The Chi-Square value is found by comparing the observed (O) and expected (E)
results using the equation
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where the sum is taken over all of the categories. Once this value has been calculated,
it is compared with a table of critical values. To find the correct critical value, one
must first find the experiment's degrees of freedom by subtracting one from the number of
categories. Then, find this number on the chart of critical values and locate where it
intersects with the 0.05 column. The 0.05 level means that, if only chance were at work,
a Chi-Square value this large or larger would be expected only 5% of the time. If the
Chi-Square value is smaller than the critical value, the null hypothesis is accepted,
which means that any difference in the data is attributed solely to chance. If the
Chi-Square value is larger than the critical value, the null hypothesis is rejected: the
difference between the observed and expected results is too large to be due to chance
alone, so there must be another explanation for the discrepancy.
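
The calculation and decision rule described above can be sketched in Python. This is only a minimal illustration and was not part of the lab itself: the names `chi_square` and `decide` are hypothetical, the two-category counts are made up for the example, and the lookup table holds the standard 0.05-level critical values for 1 to 5 degrees of freedom.

```python
# Standard Chi-Square critical values at the 0.05 level,
# keyed by degrees of freedom (only 1-5 are listed in this sketch).
CRITICAL_0_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def decide(chi2, n_categories):
    """Compare a Chi-Square value with the 0.05 critical value."""
    df = n_categories - 1  # degrees of freedom = categories - 1
    critical = CRITICAL_0_05[df]
    verdict = "reject" if chi2 > critical else "accept"
    return df, critical, verdict


# Made-up example with two categories: 60 and 40 observed, 50 and 50 expected.
toy = chi_square([60, 40], [50, 50])
print(toy, decide(toy, 2))

# The two Chi-Square values reported in this lab (six color categories each).
print(decide(24.001, 6))  # single bag  -> (5, 11.07, 'reject')
print(decide(10.52, 6))   # class data  -> (5, 11.07, 'accept')
```

Keeping the critical values in a small lookup table mirrors the way the printed chart is used by hand: find the row for the degrees of freedom, then read the 0.05 column.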

Problem (question)
Does the Chi-Square analysis of the number of M&Ms of each color in one bag accept or
reject the null hypothesis? Does the observed number of M&Ms deviate from the expected
number of each color, and if so, is the deviation due to chance?


Hypothesis (statement)
Any difference between the observed and expected data is due to chance.
Materials (list)

The materials used include:

One bag of M&Ms

One piece of paper towel

Procedure (number each)


1. Wash hands thoroughly.
2. Place one piece of paper towel on a table.
3. Open the bag of M&Ms and pour them onto the paper towel.
4. Separate the M&Ms by color on the paper towel.
5. Count and record the number of each color.
6. Record the number of each group of color in the observed column of a data table.
7. Calculate the Chi-Square value of this bag of M&Ms using a data table and the
equation: $$\chi^2 = \sum \frac{(O - E)^2}{E}$$
8. Once the Chi-Square value has been found, find the degrees of freedom for the
data.
9. Compare the Chi-Square value with the critical value at the 0.05 probability level
for that number of degrees of freedom.
10. Determine whether the null hypothesis was accepted or rejected.
Results (data, chart, drawing, table, etc)
Data Table 1: The Calculation of the Chi-Square Value and the Observed Number of
M&Ms

Calculations                  Brown    Blue     Orange   Green    Red      Yellow   Total
Observed (O)                  48       109      67       82       34       34       374
Expected (E)                  48.62    89.76    74.8     59.8     48.6     52.4     374
Difference (O-E)              -0.62    19.24    -7.8     22.2     -14.6    -18.4    N/A
Difference Squared (O-E)^2    0.384    370.17   60.84    491.07   213.16   338.56   N/A
(O-E)^2/E                     0.008    4.12     0.813    8.21     4.39     6.46     N/A
Chi-Square value (sum)        N/A      N/A      N/A      N/A      N/A      N/A      24.001
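
The rows of Data Table 1 can be recomputed with a short Python sketch. The observed and expected counts are copied from the table; the variable names are only illustrative. Because the table rounds its intermediate values, the recomputed total may differ slightly from 24.001, but it should land close to 24.

```python
colors = ["Brown", "Blue", "Orange", "Green", "Red", "Yellow"]
observed = [48, 109, 67, 82, 34, 34]               # Observed (O) row of Data Table 1
expected = [48.62, 89.76, 74.8, 59.8, 48.6, 52.4]  # Expected (E) row of Data Table 1

contributions = {}
for color, o, e in zip(colors, observed, expected):
    diff = o - e                           # Difference (O-E)
    contributions[color] = diff ** 2 / e   # (O-E)^2/E for this color
    print(f"{color:<7} O-E = {diff:7.2f}   (O-E)^2 = {diff ** 2:7.2f}   "
          f"(O-E)^2/E = {contributions[color]:6.3f}")

# The Chi-Square value is the sum of the per-color contributions.
chi2 = sum(contributions.values())
print(f"Chi-Square value: {chi2:.2f}")
```

Printing each color's contribution makes it easy to see which colors drive the result; here Green, Yellow, Red, and Blue account for nearly all of the total.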

Data Table 2: The Calculation of the Chi-Square Value and the Observed Number of
M&Ms for All of the Class Data

Calculations                  Brown    Blue     Orange   Green    Red      Yellow   Total
Observed (O)                  315      532      474      408      281      276      2286
Expected (E)                  297.18   548.64   457.2    365.76   297.18   320.04   2286
Difference (O-E)              17.82    -16.64   16.8     42.24    -16.18   -44.04   N/A
Difference Squared (O-E)^2    317.55   276.89   282.24   1784.22  38.19    1939.52  N/A
(O-E)^2/E                     1.07     0.50     0.62     2.14     0.13     6.06     N/A
Chi-Square value (sum)        N/A      N/A      N/A      N/A      N/A      N/A      10.52

Error Analysis (paragraph form)

Some errors that may have affected the data are counting errors made when tallying the
number of M&Ms of each color. Additionally, calculation errors may have occurred when
finding the Chi-Square value. For example, mistakes may have been made when subtracting
the expected value from the observed value, when squaring that difference, or when
dividing the squared difference by the expected quantity. Lastly, the total of all of
these values may have been slightly inaccurate.

Discussion and Conclusion (paragraph form)


The null hypothesis is that any difference between the observed and expected data is due
to chance. For the first set of data (one bag), the null hypothesis was rejected because
the Chi-Square value was greater than the critical value. This could be due to the fact
that the machines were not set correctly and therefore did not distribute the M&Ms
according to the expected percentages. The results for the first set were also not as
expected because when factories fill the bags of M&Ms, they do not fill each bag with a
certain percentage of each color; the percentages are only accurate across a whole run of
production. This causes the calculation for one bag of M&Ms to deviate from the expected
values, while many bags of M&Ms combined do accept the null hypothesis. For the second
data set (the class data), the null hypothesis was accepted because the greater amount of
data made the results more accurate.
The Chi-Square value for the first data set was 24.001. Because the data had 5 degrees
of freedom, the value needed to be less than 11.07 to accept the null hypothesis. This
set of data deviated too much from the expected values for the difference to be due to
chance. In the second data set, the Chi-Square value was 10.52, which is less than 11.07,
so the combined data from the two classes did accept the null hypothesis. The increased
amount of data led to greater accuracy in the results. The first set of data varied
greatly from the expected values that were given by the makers of M&Ms. For example,
there were 109 observed blue M&Ms, while about 89.76 blue M&Ms were expected. Deviations
like this were found throughout this trial because the producers of the M&Ms did not fill
each bag with the same percentage of each color, yet as a whole, each batch of M&Ms is
produced based upon the given percentages. This is seen in the second set of data, where
the expected number of blue M&Ms was about 548.64 and the observed number was 532. These
numbers are very close and satisfy the null hypothesis because there is not too much
deviation from the expected outcome. Additionally, the deviation that did occur was due
to chance, as the accepted hypothesis states.
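
The blue M&M comparison above can be put on a common scale with a little arithmetic, sketched here in Python. The helper name `percent_deviation` is hypothetical; the counts are the blue values from the two data tables.

```python
def percent_deviation(observed, expected):
    """Deviation of the observed count from the expected count,
    expressed as a percentage of the expected count."""
    return (observed - expected) / expected * 100


bag_blue = percent_deviation(109, 89.76)     # single bag, Data Table 1
class_blue = percent_deviation(532, 548.64)  # class data, Data Table 2

print(f"Single bag: {bag_blue:+.1f}% from expected")   # +21.4%
print(f"Class data: {class_blue:+.1f}% from expected")  # -3.0%
```

Expressed this way, the single bag had about 21% more blue M&Ms than expected, while the class data came within about 3% of the expected number.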
While the Chi-Square value of one bag of M&Ms does not satisfy the null hypothesis, the
value for many bags combined does satisfy it. This is because each run, or batch, of
M&Ms is produced according to the expected percentages. However, after a run has been
produced, each bag is filled randomly from the entire batch, and the bags are filled by
weight rather than by color. This is supported by the bag weights: one bag weighed
323.64 g, another 325.69 g, and a third 324.44 g. A fourth bag weighed 357.49 g, but this
may have been an error in which the bag was filled with too many M&Ms off of a conveyor
line. This random filling causes the percentage of each color in a single bag to vary
quite widely. It also shows why a large number of bags, and therefore a greater data set,
better depicts the intended percentages of each color. Different packages of M&Ms from
the same run, taken together, will match the percentages that the Mars Company gives to
the public with little variation, and that variation is due to chance, as stated by the
null hypothesis. This explains why the null hypothesis was rejected for the first trial
and why that result was not highly out of the ordinary. The number of each color of M&M
in a single package varies widely because the percentages are not used to fill each bag.
Instead, the bags are packed randomly out of a large run of M&Ms that is produced based
upon the percentages stated by the Mars Company. These runs do support the null
hypothesis because there is very little deviation from the expected data, which means
that any variation seen in the observed data for the second data set is due to chance.
The more data there is, the closer the results are to the expected values, because each
run is created with a set amount of each color.
In the future, one could research whether, based upon these results, a smaller bag of
M&Ms would deviate farther from the expected percentages because it is a smaller pool of
data. One could also ask whether the data from multiple runs would be even closer to the
expected percentages. Additionally, one could question what would make a large group of
data, such as a run or multiple runs, reject the null hypothesis, and whether this is
even possible. One recommendation for the future is to use a more precise or efficient
way of counting the M&Ms. The counts may also be more accurate if one person counted all
of the M&Ms. Lastly, every calculation done on a calculator should be double checked to
ensure that all of the math was done correctly.
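
The question of whether a smaller pool of M&Ms deviates farther from the expected percentages can be explored with a small simulation. The sketch below is an illustration under stated assumptions, not a result from this lab: the color shares are inferred from the Expected rows of the data tables, the sample size of 55 is a hypothetical small bag, and treating every M&M as an independent random draw is a simplification of how bags are really filled.

```python
import random

# ASSUMPTION: color shares inferred from the Expected rows of the data tables.
COLOR_SHARES = {"Brown": 0.13, "Blue": 0.24, "Orange": 0.20,
                "Green": 0.16, "Red": 0.13, "Yellow": 0.14}


def mean_abs_deviation(sample_size, rng):
    """Fill one simulated sample at random and return the average absolute gap,
    in percentage points, between observed and expected color shares."""
    colors = list(COLOR_SHARES)
    draws = rng.choices(colors, weights=list(COLOR_SHARES.values()), k=sample_size)
    gaps = [abs(draws.count(c) / sample_size - COLOR_SHARES[c]) for c in colors]
    return 100 * sum(gaps) / len(gaps)


def average_over_trials(sample_size, trials=200, seed=0):
    """Average the deviation over many simulated samples of the same size."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total = sum(mean_abs_deviation(sample_size, rng) for _ in range(trials))
    return total / trials


# 55 is a hypothetical small bag; 374 and 2286 are the totals from the data tables.
for size in (55, 374, 2286):
    print(size, round(average_over_trials(size), 2))
```

In this simulation the average gap between observed and expected color shares shrinks as the sample grows, which matches the idea that a larger pool of data sits closer to the expected percentages.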

Sources
https://joshmadison.com/2007/12/02/mms-color-distribution-analysis/
