Scatterplot Project

Sudhakar 1
Sharanya Sudhakar
Prof. Saraswati Bala
Math 130
January 31, 2016
Height Vs. Knee Length

This project examines the relationship between a person's height and their knee height. To
begin with, data were collected in class from 20 students. Each student's height was measured
first: the student stood with their back against the wall, their height was marked on the wall,
and the mark was then measured with a measuring tape. To ensure a correct height measurement,
boots or other footwear and any hair ornaments that might give a false reading were taken off.
Next, each student's knee height was measured while the student sat in a chair with their
thighs parallel to the floor. With the footwear off once again, a ruler was set on the thigh
so that it jutted out past the knee. The height from the floor to the ruler was then measured
and recorded as the knee height.
Knee Height Vs. Height - Scatterplot

[Figure: scatterplot of Height (vertical axis, 58 to 78) against Knee Height (horizontal axis, 18 to 25). Legend: Male, Female, Linear (Male).]

The regression line has the equation y = 22.9 + 2.10x, where x is the knee height and y is the predicted height, and the correlation is r = 0.86.
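As a rough check on the equation above, the following sketch fits a least-squares line to the 21 (knee height, height) pairs listed in the residual table of this report. It is an illustration only: the report does not say how the line was computed or which rows were used (the plot legend mentions "Linear (Male)"), so fitting all 21 rows is an assumption, and the helper name `least_squares` is hypothetical.

```python
import math

# Knee height (x) and height (y), copied from the residual table in this
# report (21 rows, units as recorded there).
knee = [21, 22.5, 22, 22.5, 22, 21, 22.5, 21.5, 23.5, 21.5, 22.8,
        22, 20, 23, 21.78, 19.8, 21.2, 22.8, 23, 24.5, 19.2]
height = [67, 66.5, 70, 70, 68.5, 67, 67, 69.5, 73, 67, 70,
          71.5, 66, 71.5, 69, 65, 67, 72.5, 72.5, 77, 64.5]


def least_squares(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = least_squares(knee, height)
print(f"y = {intercept:.2f} + {slope:.2f}x, r = {r:.2f}")
```

Small differences from the reported coefficients would be expected if the original fit used rounded values or a different subset of rows.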
Predicting the height of three individuals with the given regression line, we have:

Person   Gender   Knee Height   Predicted Height
A        M        24.1          73.51
B        F        22.5          70.15
C        F        18.2          61.12

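The three predictions can be reproduced by substituting each knee height into y = 22.9 + 2.10x. The sketch below does this; the function name `predict_height` and the constant names are illustrative choices, not part of the original report.

```python
# Coefficients of the regression line reported above: y = 22.9 + 2.10x.
INTERCEPT = 22.9
SLOPE = 2.10


def predict_height(knee_height):
    """Predicted height for a given knee height, using the reported line."""
    return INTERCEPT + SLOPE * knee_height


# Knee heights of persons A, B and C from the table above.
people = [("A", 24.1), ("B", 22.5), ("C", 18.2)]
predictions = {name: round(predict_height(k), 2) for name, k in people}

for name, value in predictions.items():
    print(name, value)
```

Note that person C's knee height of 18.2 is below the smallest knee height in the collected data (19.2), so that prediction is an extrapolation beyond the range of the data.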
In examining the scatterplot we look at the overall form, direction and strength of
the relationship, and finally at outliers or deviations from the pattern. The form of
the current scatterplot is linear, and the line drawn through it is the regression
line. The direction is positive: as the knee height increases, so does the height of
the person in question. There is hence a positive correlation (r = 0.86). The strength
of the relationship is judged by how close the points are to the regression line and
is measured by the value of r. Since r = 0.86 is close to 1, the linear relationship
is strong; r squared is about 0.74, so the regression line accounts for roughly 74% of
the variation in height. Outliers fall well above or below the general pattern. In
this case we have a maximum at Knee Height 24.5 and Height 77 and a minimum at Knee
Height 19.2 and Height 64.5, but no outliers: removing these maximum or minimum
values from the data does not improve the r value, so they are clearly part of the
data, and the data set as a whole is well suited to calculating the regression line.
Residuals: A residual is the difference between the measured y value and the
predicted y value (ŷ) for one data point. The total residual is the sum of these
differences over all data points.

x        y       ŷ         y - ŷ
21       67      67        0
22.5     66.5    70.15     -3.65
22       70      69.1      0.9
22.5     70      70.15     -0.15
22       68.5    69.1      -0.6
21       67      67        0
22.5     67      70.15     -3.15
21.5     69.5    68.05     1.45
23.5     73      72.25     0.75
21.5     67      68.05     -1.05
22.8     70      70.78     -0.78
22       71.5    69.1      2.4
20       66      64.9      1.1
23       71.5    71.2      0.3
21.78    69      68.638    0.362
19.8     65      64.48     0.52
21.2     67      67.42     -0.42
22.8     72.5    70.78     1.72
23       72.5    71.2      1.3
24.5     77      74.35     2.65
19.2     64.5    63.22     1.28
Total Residual             4.93

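The residual column can be recomputed from the x and y columns with the reported line y = 22.9 + 2.10x. The following is a minimal sketch under that assumption; the variable names are illustrative.

```python
# x (knee height) and y (height) columns from the residual table.
knee = [21, 22.5, 22, 22.5, 22, 21, 22.5, 21.5, 23.5, 21.5, 22.8,
        22, 20, 23, 21.78, 19.8, 21.2, 22.8, 23, 24.5, 19.2]
height = [67, 66.5, 70, 70, 68.5, 67, 67, 69.5, 73, 67, 70,
          71.5, 66, 71.5, 69, 65, 67, 72.5, 72.5, 77, 64.5]

# Reported regression line: y_hat = 22.9 + 2.10 * x.
INTERCEPT = 22.9
SLOPE = 2.10

predicted = [INTERCEPT + SLOPE * x for x in knee]
# Residual = measured y minus predicted y, one value per data point.
residuals = [y - y_hat for y, y_hat in zip(height, predicted)]
total_residual = sum(residuals)

for x, y, y_hat, res in zip(knee, height, predicted, residuals):
    print(f"{x:>6} {y:>6} {y_hat:>8.3f} {res:>7.3f}")
print("Total residual:", round(total_residual, 2))
```

For an exact least-squares line with an intercept, the residuals sum to zero, so a nonzero total here most likely reflects the rounding of the coefficients to 22.9 and 2.10.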
Mean and Median:

Mean(x)=21.9
Mean(y)=69.1
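Notice that the regression line passes through the point of means, (21.9, 69.1), up to
the rounding of its coefficients, and that this point lies close to the data point
with Knee height 21.78 and Height 69.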
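The means can be checked directly from the 21 rows of the residual table, as in the sketch below. The median is also computed because the heading mentions it, although the report itself does not state a median; the variable names are illustrative.

```python
from statistics import mean, median

# x (knee height) and y (height) columns from the residual table.
knee = [21, 22.5, 22, 22.5, 22, 21, 22.5, 21.5, 23.5, 21.5, 22.8,
        22, 20, 23, 21.78, 19.8, 21.2, 22.8, 23, 24.5, 19.2]
height = [67, 66.5, 70, 70, 68.5, 67, 67, 69.5, 73, 67, 70,
          71.5, 66, 71.5, 69, 65, 67, 72.5, 72.5, 77, 64.5]

mean_knee = mean(knee)
mean_height = mean(height)
median_knee = median(knee)      # not reported in the original text
median_height = median(height)  # not reported in the original text

# Height predicted by the reported line y = 22.9 + 2.10x at the mean knee height.
line_at_mean = 22.9 + 2.10 * mean_knee

print("Mean(x) =", round(mean_knee, 1))
print("Mean(y) =", round(mean_height, 1))
print("Line at Mean(x) =", round(line_at_mean, 2))
```

The value of the line at Mean(x) differs slightly from Mean(y) because the reported coefficients are rounded; an unrounded least-squares line passes through the point of means exactly.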
Conclusion:
With a scatterplot we are able not only to plot two variables against each other but
also to measure how strongly they are correlated. The correlation and the regression
line describe the trend, relate one variable to the other and allow a decent
prediction within the range of the data used. The stronger the correlation, the more
accurate the prediction and the closer to zero the residuals. The correlation value
shows how strongly one variable is related to the other; if it is weak, we can move on
to another pair of variables that gives a better correlation and so fine-tune our
prediction. For example, if Knee Height Vs. Height has the better correlation (in this
case), then we base our prediction on its regression line; but if arm length and
height can be correlated and have a better correlation value, then measuring arm
length might give a more accurate prediction of height, with residuals nearer zero.
The data range in this case does not cover dwarfism, for which this regression line
may not hold; whether it does would have to be shown separately. Thus the scatterplot
not only helps us find the trend of the data set, it also helps in making predictions
within the range of the data.
