Introduction
Statement of Task
I am investigating the relationship between SAT scores and the family income of
test takers around the world. I have collected data on mean SAT scores grouped by
the family income of the test takers. A number of mathematical processes are used
to analyze the data: a scatter plot of the data, calculation of the least squares
regression line, and calculation of the correlation coefficient. I then carry out
a χ² test on the data to test whether SAT scores depend on the family income of
the test takers.
Mathematical Investigation
Collected Data
Table 1: Mean SAT scores per section, categorized by family income of test taker,
in 2007
Columns: Family income of test takers; Percentage of test takers within each
family income group; Critical reading; Math; Writing
[Graph 1: scatter plot of Average SAT score against Family Income of SAT Takers ($ in Thousands)]
Graph 1 shows the average SAT score against the family income of the test taker.
At this stage there appears to be a very strong positive correlation: SAT scores
improve as family income increases. (The graph was generated in Microsoft Excel.)
Calculation of the Least Squares Regression
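$$y - \bar{y} = \frac{s_{xy}}{s_x^2}\,(x - \bar{x})$$
where $s_{xy} = \frac{\sum xy}{n} - \bar{x}\,\bar{y}$ and $s_x^2 = \frac{\sum x^2}{n} - \bar{x}^2$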
x        y       xy           x²
15000    1301    19515000     225000000
25000    1371    34275000     625000000
35000    1363    47705000     1225000000
45000    1427    64215000     2025000000
55000    1462    80410000     3025000000
65000    1487    96655000     4225000000
75000    1508    113100000    5625000000
85000    1522    129370000    7225000000
95000    1559    148105000    9025000000
∑x = 495000    ∑y = 13000    ∑xy = 733350000    ∑x² = 33225000000
$\bar{x}$ = 55000    $\bar{y}$ = 1444.4    $\bar{x}\,\bar{y}$ = 79444444.44    $\sum x^2 / n$ = 3691666667
These are the calculated values used in finding the least squares regression line.
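The sums in the table above can be turned into a regression line with a few lines of Python. This is a minimal sketch, not the original working: it assumes the usual textbook form gradient = s_xy / s_x², with s_xy = Σxy/n − x̄ȳ and s_x² = Σx²/n − x̄², and the function name `least_squares_line` is hypothetical.

```python
# Family income (x, in dollars) and mean SAT score (y), copied from the table above.
incomes = [15000, 25000, 35000, 45000, 55000, 65000, 75000, 85000, 95000]
scores = [1301, 1371, 1363, 1427, 1462, 1487, 1508, 1522, 1559]


def least_squares_line(xs, ys):
    """Return (gradient, intercept) of the least squares line of y on x.

    Assumed formulas: gradient = s_xy / s_x^2, with
    s_xy = sum(xy)/n - mean(x)*mean(y) and s_x^2 = sum(x^2)/n - mean(x)^2.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) / n - x_bar * y_bar
    s_x_squared = sum(x * x for x in xs) / n - x_bar ** 2
    gradient = s_xy / s_x_squared
    # The line passes through the mean point (x_bar, y_bar).
    intercept = y_bar - gradient * x_bar
    return gradient, intercept


gradient, intercept = least_squares_line(incomes, scores)
print(f"y = {gradient:.6f}x + {intercept:.1f}")  # y = 0.003058x + 1276.2
```

Under these assumptions, the gradient suggests a rise of roughly 3 SAT points for each additional $1000 of family income within the range of the data.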
Calculation of Pearson’s Correlation Coefficient
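$$r = \frac{s_{xy}}{s_x\,s_y}$$
where $s_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$, $s_y = \sqrt{\frac{\sum (y - \bar{y})^2}{n}}$ and $s_{xy} = \frac{\sum xy}{n} - \bar{x}\,\bar{y}$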
x        y       $(x - \bar{x})^2$    $(y - \bar{y})^2$
15000    1301    1600000000    20576.30864
25000    1371    900000000     5394.08642
35000    1363    400000000     6633.197531
45000    1427    100000000     304.308642
55000    1462    0             308.1975309
65000    1487    100000000     1810.975309
75000    1508    400000000     4039.308642
85000    1522    900000000     6014.864198
95000    1559    1600000000    13122.97531
∑x = 495000    ∑y = 13000    $\sum (x - \bar{x})^2$ = 6000000000    $\sum (y - \bar{y})^2$ = 58204.22222
$\bar{x}$ = 55000    $\bar{y}$ = 1444.4
These are the calculated values used in finding the Correlation Coefficient.
$r \approx 0.9819360378$
The calculation suggests that the association in the data is very strong, since
$0.90 \le r^2 < 1$ (here $r^2 \approx 0.964$).
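The correlation coefficient can be checked numerically. The following is a sketch under the assumption that r = s_xy / (s_x s_y) with population (divide-by-n) standard deviations; the function name `pearson_r` is hypothetical and not part of the original investigation.

```python
import math

# Family income (x, in dollars) and mean SAT score (y), copied from the table above.
incomes = [15000, 25000, 35000, 45000, 55000, 65000, 75000, 85000, 95000]
scores = [1301, 1371, 1363, 1427, 1462, 1487, 1508, 1522, 1559]


def pearson_r(xs, ys):
    """Pearson's correlation coefficient, assuming r = s_xy / (s_x * s_y)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Covariance and standard deviations, all dividing by n.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / n)
    return s_xy / (s_x * s_y)


r = pearson_r(incomes, scores)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")  # r = 0.9819, r^2 = 0.9642
```

Printing both values makes the distinction explicit: r is about 0.98, while r² is about 0.96.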
[Graph 2: plot of average SAT score against Family Income of SAT Takers ($ in Thousands)]
Graph 2 indicates that there is a strong positive linear correlation. This is also
indicated by the value of the correlation coefficient, r ≈ 0.98 (r² ≈ 0.96). (The
graph was generated in Microsoft Excel.)
Calculation of a χ² test
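For a 2 × 2 contingency table, the observed values are laid out as:
         B1     B2     Total
A1       A      B      A+B
A2       C      D      C+D
Total    A+C    B+D    N
The expected values are calculated from the row and column totals:
         B1              B2              Total
A1       (A+B)(A+C)/N    (A+B)(B+D)/N    A+B
A2       (C+D)(A+C)/N    (C+D)(B+D)/N    C+D
Total    A+C             B+D             N
The test statistic is $\chi^2_{calc} = \sum \frac{(f_o - f_e)^2}{f_e}$, where $f_o$ is an
observed value and $f_e$ is the corresponding expected value.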
Degrees of freedom measure the number of values in the final calculation that
are free to vary: df = (rows − 1)(columns − 1) = (2 − 1)(2 − 1) = 1.
Null hypothesis (H0): SAT scores and family income are independent of each
other.
Alternative hypothesis (H1): SAT scores and family income are not independent of
each other.
Table 4: Observed values
                   Score
Income ($)         1300-1430    1431-1561    Total
15000 – 55000      4            1            5
56000 – 96000      0            4            4
Total              4            5            9
Table 4 shows the observed values for SAT score against family income. The data
points have been grouped into ranges of SAT score and of the family income of the
test takers.
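The grouping into a 2 × 2 table can be reproduced by tallying the nine data points. This is an illustrative sketch: the cut-offs (income up to 55000, score up to 1430) are read from the table ranges, and the helper name `tally_2x2` is hypothetical.

```python
# Family income (dollars) and mean SAT score for the nine data points.
incomes = [15000, 25000, 35000, 45000, 55000, 65000, 75000, 85000, 95000]
scores = [1301, 1371, 1363, 1427, 1462, 1487, 1508, 1522, 1559]


def tally_2x2(xs, ys, income_cutoff, score_cutoff):
    """Count data points in each cell of a 2x2 table.

    Rows: income <= cutoff, income > cutoff.
    Columns: score <= cutoff, score > cutoff.
    """
    table = [[0, 0], [0, 0]]
    for income, score in zip(xs, ys):
        row = 0 if income <= income_cutoff else 1
        col = 0 if score <= score_cutoff else 1
        table[row][col] += 1
    return table


observed = tally_2x2(incomes, scores, income_cutoff=55000, score_cutoff=1430)
print(observed)  # [[4, 1], [0, 4]]
```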
                   Score
Income ($)         1300-1430         1431-1561         Total
15000 – 55000      (4+1)(4+0)/9      (4+1)(1+4)/9      4+1
56000 – 96000      (0+4)(4+0)/9      (0+4)(1+4)/9      0+4
Total              4+0               1+4               9
                   Score
Income ($)         1300-1430    1431-1561    Total
15000 – 55000      2.22222      2.77778      5
56000 – 96000      1.77778      2.22222      4
Total              4            5            9
Table 6 shows the expected values calculated from the totals in Table 4.
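The expected values follow from the row and column totals of the observed table. A minimal sketch, assuming the standard rule expected = (row total × column total) / N; the function name `expected_frequencies` is hypothetical.

```python
# Observed counts: rows are the income groups, columns are the score groups.
observed = [[4, 1], [0, 4]]


def expected_frequencies(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for row in expected:
    print([round(value, 4) for value in row])
# [2.2222, 2.7778]
# [1.7778, 2.2222]
```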
The χ² critical value at 5% significance with 1 degree of freedom is 3.841. As the
calculated χ² value is greater than the critical value, 5.760 > 3.841, the null
hypothesis is rejected and SAT score is assumed to be dependent on family income.
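The calculated χ² value can be reproduced from the observed table. This is a sketch, assuming the uncorrected statistic Σ (observed − expected)² / expected with no continuity correction; the critical value 3.841 is the one quoted in the text, and the function name `chi_squared` is hypothetical.

```python
# Observed counts: rows are the income groups, columns are the score groups.
observed = [[4, 1], [0, 4]]
CRITICAL_VALUE = 3.841  # 5% significance, 1 degree of freedom (quoted in the text)


def chi_squared(table):
    """Return (statistic, degrees of freedom) for a contingency table.

    Uses sum((observed - expected)^2 / expected), no continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            statistic += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


chi2, df = chi_squared(observed)
print(f"chi-squared = {chi2:.3f}, df = {df}, reject H0: {chi2 > CRITICAL_VALUE}")
# chi-squared = 5.760, df = 1, reject H0: True
```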
Discussion/Validity
Limitations
One limitation of the data is that it reflects only the people who filled in
the family income section when signing up for the SAT. There is no evidence that
the data represents everyone who has taken the SAT, as some test takers may not
have filled in that section.
Another limitation is that not everyone in the world decides to take the SAT;
people who cannot afford it or who take alternative tests are left out. The data
also does not state how many SAT takers were included. For these reasons the data
may be insufficient and inaccurate.
There could also be a limitation due to culture and race. The data does not
mention culture or race, which might affect the results: for example, more
American than Asian test takers may have reported their family income in the
survey.
Another limitation is that every value in the table of expected values for the
χ² test is less than 5, which reduces the validity of the test.
In addition, the amount of data is limited: 9 data points may not be sufficient
to reflect the correlation between SAT scores and family income from a world
perspective.
Lastly, many other factors may be involved in the correlation between SAT
scores and family income, such as the reasons for a high family income and the IQ
of the SAT test takers.
Conclusion
Works Cited
Rampell, Catherine. "SAT Scores and Family Income - NYTimes.com." The Economy
scores-and-family-income/>.
Downey, Joel. "SAT Scores Rise with Family Income." Cleveland OH Local News,
Breaking News, Sports & Weather - Cleveland.com. 10 Apr. 2008. Web. 01 Nov. 2010.
<http://www.cleveland.com/pdgraphics/index.ssf/2008/04/sat_scores_rise_with_family_in.html>.
Whiffen, Glen, John Owen, Robert Haese, Sandra Haese, and Mark Bruce. "Two
Studies SL. By Mal Coad. [S.l.]: Haese And Harris Pub, 2010. 581-82. Print.