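We can also calculate the correlation between more than two variables.
Definition 1: Given variables x, y and z, we define the multiple correlation coefficient
$$R_{z,xy} = \sqrt{\frac{r_{xz}^2 + r_{yz}^2 - 2\, r_{xz}\, r_{yz}\, r_{xy}}{1 - r_{xy}^2}}$$
where rxz, ryz, rxy are as defined in Definition 2 of Basic Concepts of Correlation.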
Here x and y are viewed as the independent variables and z is the dependent variable.
We also define the multiple coefficient of determination to be the square of the
multiple correlation coefficient.
Often the subscripts are dropped and the multiple correlation coefficient and multiple
coefficient of determination are written simply as R and R² respectively. These
definitions may also be expanded to more than two independent variables. With just one
independent variable the multiple correlation coefficient is simply r.
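As a rough illustration of these definitions, the sketch below computes R and R² for one dependent and two independent variables from the three pairwise correlations, assuming the standard two-predictor formula. The function names and the data are made up for illustration and are not part of Excel or the Real Statistics Resource Pack.

```python
import math

def pearson(u, v):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)

def multiple_correlation(z, x, y):
    """Multiple correlation of dependent z with independents x and y."""
    r_xz, r_yz, r_xy = pearson(x, z), pearson(y, z), pearson(x, y)
    r_squared = (r_xz**2 + r_yz**2 - 2 * r_xz * r_yz * r_xy) / (1 - r_xy**2)
    return math.sqrt(r_squared)

# Illustrative data only (not from the article's example)
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
z = [3, 3, 7, 7, 11, 11]  # z = x + y exactly, so R should be 1 up to rounding

R = multiple_correlation(z, x, y)
print("R =", R, "R^2 =", R**2)
```

The formula divides by 1 − r_xy², so it breaks down when x and y are perfectly correlated with each other; a production version would need to guard against that case.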
Unfortunately R is not an unbiased estimate of the population multiple correlation
coefficient; the bias is most noticeable for small samples. A relatively unbiased version
of R is given by the adjusted R.
Definition 2: If R is Rz,xy as defined above (or similarly for more variables) then
the adjusted multiple coefficient of determination is
$$R_{adj}^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$
where k = the number of independent variables and n = the number of data elements in
the sample for z (which should be the same as the number of elements in the samples for x and y).
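The adjustment can be sketched in a few lines, assuming the usual form 1 − (1 − R²)(n − 1)/(n − k − 1). The function name and the input values are hypothetical.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more than k + 1 observations")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Hypothetical inputs: R^2 = 0.5 from n = 11 observations and k = 2 predictors
print(adjusted_r_squared(0.5, n=11, k=2))  # 1 - 0.5 * 10 / 8 = 0.375
```

The adjusted value never exceeds the unadjusted R², and the gap widens as n shrinks or k grows, which is where the bias of R matters most.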
Excel Data Analysis Tools: In addition to the various correlation functions described
elsewhere, Excel provides the Covariance and Correlation data analysis tools.
The Covariance tool calculates the pairwise population covariances for all the variables
in the data set. Similarly the Correlation tool calculates the various correlation
coefficients as described in the following example.
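For readers working outside Excel, the output of the two tools can be approximated as follows. This is a minimal sketch that assumes each variable is stored as one list of equal length; the function names and data are illustrative, and the code is not a description of how Excel computes its results.

```python
import math

def population_covariance(u, v):
    """Population covariance: divides by n rather than n - 1."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n

def covariance_matrix(columns):
    """Pairwise population covariances for every pair of variables."""
    return [[population_covariance(a, b) for b in columns] for a in columns]

def correlation_matrix(columns):
    """Pairwise correlations, obtained by scaling the covariance matrix."""
    cov = covariance_matrix(columns)
    m = len(columns)
    sd = [math.sqrt(cov[i][i]) for i in range(m)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(m)] for i in range(m)]

# Illustrative data only: three variables with four observations each
data = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
    [5, 3, 8, 9],
]

for row in covariance_matrix(data):
    print(row)
for row in correlation_matrix(data):
    print(row)
```

Both matrices are symmetric, and the diagonal of the correlation matrix is 1 because every variable is perfectly correlated with itself.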
Example 1: We expand the data in Example 2 of Correlation Testing via the t Test to
include a number of other statistics. The data for the first few states are shown in
Figure 1.
Observation: Suppose we look at the relationship between GPA (grade point average)
and Salary 5 years after graduation and discover there is a high correlation between
these two variables. As has been mentioned elsewhere, this is not to say that doing well
in school causes a person to get a higher salary. In fact it is entirely possible that there is
a third variable, say IQ, that correlates well with both GPA and Salary (although this
would not necessarily imply that IQ is the cause of the higher GPA and higher salary).
In this case, it is possible that the correlation between GPA and Salary is a consequence
of the correlation between IQ and GPA and between IQ and Salary. To test this we need
to determine the correlation between GPA and Salary eliminating the influence of IQ
from both variables, i.e. the partial correlation.
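The idea of removing the influence of a third variable can be sketched with the standard first-order partial correlation formula, which needs only the three pairwise correlations. The function names and the GPA, salary and IQ figures below are invented for illustration.

```python
import math

def pearson(u, v):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)

def partial_correlation(a, b, control):
    """Correlation of a and b with the linear influence of control removed from both."""
    r_ab = pearson(a, b)
    r_ac = pearson(a, control)
    r_bc = pearson(b, control)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))

# Invented figures for illustration only
gpa = [2.8, 3.1, 3.4, 3.6, 3.9, 3.2]
salary = [48, 52, 61, 60, 75, 58]   # in thousands
iq = [98, 105, 112, 118, 127, 104]

print("plain correlation:  ", pearson(gpa, salary))
print("partial correlation:", partial_correlation(gpa, salary, iq))
```

If the control variable is uncorrelated with both of the other variables, the partial correlation reduces to the ordinary correlation, which gives a convenient sanity check.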
Property 1:
where D = 1 − (A + B + C).
Real Statistics Functions: The Real Statistics Resource Pack contains the following
supplemental functions:
CORREL_ADJ(R1, R2) = adjusted correlation coefficient for the data sets defined by
ranges R1 and R2
MCORREL(R, R1, R2) = multiple correlation of dependent variable z with x and y
where the samples for z, x and y are the ranges R, R1 and R2 respectively
Observation: Definition 1 defines the multiple correlation coefficient Rz,xy and
corresponding multiple coefficient of determination for three variables x, y and z. These
definitions can be extended to more than three variables as described
in Advanced Multiple Correlation.
E.g. if R1 is an m × n data range containing the data for n variables then the
supplemental function RSquare(R1, k) calculates the multiple coefficient of
determination for the kth variable with respect to the other variables in R1. The multiple
correlation coefficient for the kth variable with respect to the other variables in R1 can
be calculated by the formula =SQRT(RSquare(R1, k)).
Thus if R1, R2 and R3 are the three columns of the m × 3 data range R, with R1 and R2
containing the samples for the independent variables x and y and R3 containing the
sample data for dependent variable z, then =MCORREL(R3, R1, R2) yields the same
result as =SQRT(RSquare(R, 3)).
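The equivalence can be checked numerically outside Excel. The sketch below is not the Resource Pack's implementation: it assumes the well-known identity R² = 1 − 1/c, where c is the kth diagonal entry of the inverse of the correlation matrix, and compares the result with the three-variable formula. The names r_square and mcorrel are hypothetical stand-ins that merely echo the worksheet functions, and the data are made up.

```python
import math

def pearson(u, v):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def r_square(columns, k):
    """R^2 of the kth variable (1-based) with respect to all the other variables."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    return 1 - 1 / invert(corr)[k - 1][k - 1]

def mcorrel(z, x, y):
    """Multiple correlation of z with x and y from the pairwise correlations."""
    r_xz, r_yz, r_xy = pearson(x, z), pearson(y, z), pearson(x, y)
    return math.sqrt((r_xz**2 + r_yz**2 - 2 * r_xz * r_yz * r_xy) / (1 - r_xy**2))

# Illustrative data only: x and y independent variables, z dependent
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
z = [3, 4, 6, 8, 10, 13]

print(mcorrel(z, x, y))
print(math.sqrt(r_square([x, y, z], 3)))
```

The two printed values should agree up to rounding. The matrix route also works for more than two independent variables, whereas the closed-form expression in mcorrel is specific to the three-variable case.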
Observation: Similarly the definition of the partial correlation coefficient (Definition
3) can be extended to more than three variables as described in Advanced Multiple
Correlation.