
CHAPTER 3, SECTION 3, DAY 2 (Reiner)

Date: ___________________________

Residuals: When we study the LSRL, we look at the overall pattern as well as the
points that deviate from the regression line. The LSRL is the line that makes the sum of the
squared vertical distances from the data points to the line as small as possible. Because
these distances represent the “left-over” variation after fitting the data with the regression
line, they are called ____________________________.
Formula:

Example 3.14: Does the age at which a child begins to talk predict a later score on a test
of mental ability? A study of the development of young children recorded, for each of 21
children, the age in months at which the child spoke his or her first word and the child’s
Gesell Adaptive Score, the result of an aptitude test taken much later. The data appear as follows…

The following is a scatterplot of the data with “age at first word” as the explanatory
variable x and “Gesell score” as the response variable y. Looking at the graph, we can see
that the association shows a ____________________ direction and has a __________________
strength when talking about the linear correlation.

Suppose the correlation coefficient is r = -0.640 and the LSRL is ŷ = 109.8738 – 1.1270x.

For Child 1, who spoke her first word at 15 months, we predict the score:
The residual for Child 1 is:
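
The arithmetic for a prediction and a residual can be checked with a short Python sketch. The coefficients come from the LSRL quoted above; the function names are made up for this illustration, and the observed score in the last lines is a hypothetical placeholder, not a value from the study. Use the real score from the data table when filling in the notes.

```python
# Coefficients of the LSRL quoted in the notes: y-hat = 109.8738 - 1.1270x
INTERCEPT = 109.8738
SLOPE = -1.1270


def predict_gesell(age_months):
    """Predicted Gesell score for a given age at first word (in months)."""
    return INTERCEPT + SLOPE * age_months


def residual(observed_score, age_months):
    """Residual = observed y minus predicted y."""
    return observed_score - predict_gesell(age_months)


# Prediction for a child who spoke her first word at 15 months.
print(round(predict_gesell(15), 4))  # 92.9688

# HYPOTHETICAL observed score, for illustration only (not from the data table).
example_observed = 100
print(round(residual(example_observed, 15), 4))  # 7.0312
```

A positive residual means the point lies above the line; a negative residual means it lies below.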
Facts about residuals: Residuals show how far each data point is away from the
regression line, helping us assess how well a regression line fits our data. For the least-
squares regression line, the residuals have a special property:
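
One way to explore this numerically: for a least-squares line, the residuals add up to zero (apart from rounding error), so their mean is zero as well. The sketch below fits a line by the usual slope and intercept formulas and checks the sum. The data set is a small made-up example, not the Gesell data, and the function names are assumptions for this illustration.

```python
def fit_lsrl(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(xs, ys):
    """Observed y minus predicted y for every data point."""
    intercept, slope = fit_lsrl(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up toy data, for illustration only.
toy_x = [1, 2, 3, 4, 5]
toy_y = [2.0, 4.1, 5.9, 8.2, 9.8]

res = residuals(toy_x, toy_y)
print(abs(sum(res)) < 1e-9)  # True: the residuals sum to zero up to rounding
```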

Note:

Example 3.14 (cont’d): The summary of all residuals is as follows:

We can plot these data in a “residual plot,” which compares ____________________ against the
_______________________. The line y = 0 corresponds to the least-squares regression line.

When the LSRL is a good fit…

And when it’s not…

Important data points to consider when studying regression:


Outlier:

An influential observation:
Example 3.14 (cont’d): The graph below shows how Child 19 is an outlier and Child 18 is
both an outlier and an influential point in the data.
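
A common way to check whether a point is influential is to refit the line without it and see how much the slope changes. The sketch below does this on a small made-up data set in which one point sits far from the others in the x direction. The numbers are hypothetical and are not the Gesell data; the function name is an assumption for this illustration.

```python
def lsrl_slope(points):
    """Slope of the least-squares line through a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    return sxy / sxx


# Made-up toy data: four points clustered together, one far out in x.
toy = [(1, 5.0), (2, 5.2), (3, 4.9), (4, 5.1), (20, 1.0)]

slope_all = lsrl_slope(toy)            # fit using every point
slope_without = lsrl_slope(toy[:-1])   # fit with the far-out point removed

print(round(slope_all, 4))      # -0.2268
print(abs(slope_without) < 1e-9)  # True: essentially flat without that point
```

In this toy example a single point is responsible for nearly all of the slope, which is the behavior to look for when judging influence.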

Residuals of influential points:

Example 3.15: The strong influence of Child 18 makes the original regression of Gesell
score on age at first word misleading. The original data have a coefficient of determination
r² = 0.41, meaning _____________________________________________________________. This can
make the connection between age at first word and Gesell score quite interesting to parents.
However, without Child 18 included in the data, the coefficient of determination is r² = 0.11.
What should a child development researcher who is studying this relationship do?
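
As a quick consistency check, the coefficient of determination for a least-squares line with one explanatory variable is the square of the correlation coefficient. The sketch below uses r = -0.640 from Example 3.14; the function name is an assumption for this illustration.

```python
def coefficient_of_determination(r):
    """r squared: the fraction of variation in y accounted for by the LSRL on x."""
    return r ** 2


r_all = -0.640  # correlation quoted in Example 3.14 (all 21 children)
print(round(coefficient_of_determination(r_all), 2))  # 0.41
```

Squaring removes the sign, so r² alone does not tell us the direction of the association.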

Assignment:
pp. 173–176 #3.46
