A matrix is a mathematical way of organizing information that can be arranged as rows and columns. For many types of tables of data, these matrices can simplify the mathematical formulae used to express various multivariate analyses. In fact, most multivariate texts present formulas for the data analysis and theory in terms of matrix algebra. For this reason, we spend time in this unit reviewing and introducing some basic concepts and operations of matrix algebra. These concepts and operations will be introduced with reference to a multivariate data example.
The data set consists of 4 measurements made on the skulls of 2 groups of individuals, males and females. Thus, an observation consists of the 4 skull measurements on a given skull and of the sex of the individual to which the skull belonged. We might have several objectives in the data analysis, including a description (means, medians, variances, covariances, correlations) of the measurements according to the sex of the individual and a hypothesis test to determine if the skulls differ in size or shape between the sexes. The data are presented in Table 1, while various data summaries are given in Tables 2-6.
TABLE 1. Four skull measurements on males and females.
Sex   Length   Basilar   Zygomatic Arch   Postorbital
F 6287 4845 3218 996
F 6583 4992 3300 1107
F 6518 5023 3246 1035
F 6432 4790 3249 1117
F 6450 4888 3259 1060
F 6379 4844 3266 1115
F 6424 4855 3322 1065
F 6615 5088 3280 1179
F 6760 5206 3337 1219
F 6521 5011 3208 989
F 6416 4889 3200 1001
F 6511 4910 3230 1100
F 6540 4997 3320 1078
F 6780 5259 3358 1174
F 6336 4781 3165 1126
F 6472 4954 3125 1178
F 6476 4896 3148 1066
F 6276 4709 3150 1134
F 6693 5177 3236 1131
F 6328 4792 3214 1018
F 6661 5104 3395 1141
F 6266 4721 3257 1031
F 6660 5146 3374 1069
F 6624 5032 3384 1154
F 6331 4819 3278 1008
F 6298 4683 3270 1150
M 6460 4962 3286 1100
M 6252 4773 3239 1061
M 5772 4480 3200 1097
M 6264 4806 3179 1054
M 6622 5113 3365 1071
M 6656 5100 3326 1012
M 6441 4918 3153 1061
M 6281 4821 3133 1071
M 6606 5060 3227 1064
M 6573 4977 3392 1110
M 6563 5025 3234 1090
M 6552 5086 3292 1010
M 6535 4939 3261 1065
M 6573 4962 3320 1091
M 6537 4990 3309 1059
M 6302 4761 3204 1135
M 6449 4921 3256 1068
M 6481 4887 3233 1124
M 6368 4824 3258 1130
M 6372 4844 3306 1137
M 6592 5007 3284 1148
M 6229 4746 3257 1153
M 6391 4834 3244 1169
M 6560 4981 3341 1038
M 6787 5181 3334 1104
M 6384 4834 3195 1064
M 6282 4757 3180 1179
M 6340 4791 3300 1110
M 6394 4879 3272 1241
M 6153 4557 3214 1039
M 6348 4886 3160 991
M 6534 4990 3310 1028
M 6509 4951 3282 1104
TABLE 2. Summary statistics output from SAS PROC CORR for the skulls of females.
TABLE 3. Summary statistics output from SAS PROC CORR for the skulls of males.
TABLE 4. Variances and covariances output from SAS PROC CORR for the skulls of females.
TABLE 5. Variances and covariances output from SAS PROC CORR for the skulls of males.
TABLE 6. Pearson correlations output from SAS PROC CORR. Correlations for the skulls of females
are given in the upper right triangle while correlations for the skulls of males are given in the lower left
triangle.
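The summaries in Tables 2-6 were produced with SAS PROC CORR; the same quantities (means, variances, covariances, Pearson correlations) can be sketched in Python with NumPy. This is a rough stand-in, not the original SAS code, and it uses only the first five female rows of Table 1 to keep the example short:

```python
import numpy as np

# First five female skulls from Table 1 (Length, Basilar, Zygomatic Arch, Postorbital)
females = np.array([
    [6287, 4845, 3218,  996],
    [6583, 4992, 3300, 1107],
    [6518, 5023, 3246, 1035],
    [6432, 4790, 3249, 1117],
    [6450, 4888, 3259, 1060],
], dtype=float)

means = females.mean(axis=0)                 # column means, one per measurement
cov = np.cov(females, rowvar=False)          # unbiased (n-1) variances and covariances
corr = np.corrcoef(females, rowvar=False)    # Pearson correlations

print(means)           # [6454.  4907.6 3254.4 1063. ]
print(np.round(corr, 3))
```

Running the same code on all 26 female (or 33 male) rows would reproduce the entries of Tables 2-6.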
For example, let $x = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$ be a vector whose $x_1$ element is 1 and whose $x_2$ element is 2. We can plot this vector in 2-space ($p = 2$) by simply locating the end-point or head of the vector according to the axis specifications above and then drawing a line from the origin out to this end-point.
[Figure: the vector $x$ drawn as an arrow from the origin $(0, 0)$ to the point $(1, 2)$ in the $(x_1, x_2)$ plane.]
Vectors can be specified as row vectors, such as $x = (1 \;\; 2)$, or as column vectors, as defined above. This distinction becomes important later on when we perform operations on the vectors. Here, we have written the vector with an underscore to emphasize that it is a vector and not a scalar. For the most part, variables $x$, $y$, and $z$ will refer to vectors.
2. A column vector can be changed to a row vector and vice versa using the transpose operator.
   Let $x = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$, then $x' = (1 \;\; 2)$.
3. Multiplication of a vector by a positive constant expands or contracts the vector. Multiplication of a vector by a negative constant likewise expands or contracts the vector, but also reverses its direction by 180 degrees (opposite direction).
   $$3\begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 3 \\ 6 \end{pmatrix} \quad\text{or}\quad (-3)\begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} -3 \\ -6 \end{pmatrix}$$
4. Addition and subtraction of vectors is also possible. The vectors must be conformable, i.e.,
they must both either be row vectors or column vectors and they must have the same num-
ber of elements. Vector addition and subtraction are performed by adding or subtracting the
corresponding elements from each vector.
   Let $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and $y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$, then $x + y = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \end{pmatrix}$.
Let's estimate the population average skull size by ignoring the sex to which a skull belongs. Thus, we need a weighted average of the vectors of means of the 2 sexes:
$$\frac{26}{26+33}\begin{pmatrix} 6486 \\ 4939 \\ 3261 \\ 1094 \end{pmatrix} + \frac{33}{26+33}\begin{pmatrix} 6429 \\ 4898 \\ 3259 \\ 1090 \end{pmatrix} = \begin{pmatrix} 6454.1 \\ 4916.1 \\ 3259.9 \\ 1091.8 \end{pmatrix}$$
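The weighted average above is easy to check numerically; a minimal NumPy sketch:

```python
import numpy as np

# Sex-specific mean vectors (Length, Basilar, Zygomatic Arch, Postorbital)
xbar_f = np.array([6486., 4939., 3261., 1094.])   # 26 female skulls
xbar_m = np.array([6429., 4898., 3259., 1090.])   # 33 male skulls
n_f, n_m = 26, 33

# Weighted average of the two mean vectors, ignoring sex
xbar = (n_f * xbar_f + n_m * xbar_m) / (n_f + n_m)
print(np.round(xbar, 1))   # [6454.1 4916.1 3259.9 1091.8]
```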
5. The length of a vector can be computed using the Pythagorean theorem, which can be generalized to $p$ dimensions. Let $x' = (x_1 \;\; x_2 \;\; \cdots \;\; x_p)$, then the length of $x$ is
   $$\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_p^2}.$$
   Let $x$ be the vector of postorbital measurements for the females. Then the length of this vector would be
   $$\|x\| = \sqrt{996^2 + 1107^2 + \cdots + 1150^2} = 5587.05.$$
6. The usual inner product of two conformable vectors $x$ and $y$ is defined to be $x'y = x_1 y_1 + x_2 y_2 + \cdots + x_p y_p$. Thus the length of a vector can be written in terms of its inner product with itself as $\|x\| = \sqrt{x'x}$.
   Let $x = \begin{pmatrix} 6287 \\ 4845 \\ 3218 \\ 996 \end{pmatrix}$ and $y = \begin{pmatrix} 6583 \\ 4992 \\ 3300 \\ 1107 \end{pmatrix}$, then
   $$x'y = (6287 \;\; 4845 \;\; 3218 \;\; 996)\begin{pmatrix} 6583 \\ 4992 \\ 3300 \\ 1107 \end{pmatrix} = 6287(6583) + 4845(4992) + 3218(3300) + 996(1107) = 77{,}295{,}533.$$
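Both the length (item 5) and the inner product (item 6) are one-liners in NumPy; a small sketch using the female skull data:

```python
import numpy as np

# First two female skulls from Table 1
x = np.array([6287., 4845., 3218., 996.])
y = np.array([6583., 4992., 3300., 1107.])

inner = x @ y               # x'y, the usual inner product
length_x = np.sqrt(x @ x)   # ||x|| = sqrt(x'x)

# Postorbital measurements for all 26 females (fourth column of Table 1)
post_f = np.array([996, 1107, 1035, 1117, 1060, 1115, 1065, 1179, 1219, 989,
                   1001, 1100, 1078, 1174, 1126, 1178, 1066, 1134, 1131, 1018,
                   1141, 1031, 1069, 1154, 1008, 1150], dtype=float)
length_post = np.linalg.norm(post_f)   # matches the 5587.05 computed in item 5

print(inner)        # 77295533.0
print(length_post)
```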
Are the vectors $x = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, $y = \begin{pmatrix} 3 \\ 4 \\ 1 \end{pmatrix}$, and $z = \begin{pmatrix} 5 \\ 6 \\ 3 \end{pmatrix}$ linearly dependent?
We need constants $c_1, c_2, c_3$, not all zero, such that $c_1 x + c_2 y + c_3 z = 0$, i.e.,
$$c_1 + 3c_2 + 5c_3 = 0, \qquad c_1 + 4c_2 + 6c_3 = 0, \qquad c_1 + c_2 + 3c_3 = 0.$$
Solve the system of equations by, for example, subtracting the first equation from the second, and then the third equation from the second:
$$c_2 + c_3 = 0 \qquad\text{and}\qquad 3c_2 + 3c_3 = 0,$$
or that $c_2 = -c_3$.
Since we are trying to show linear dependence, pick some non-zero values for $c_2$ and $c_3$ that fit the last equation. For example, take $c_2 = 1$ and $c_3 = -1$. Using any one of the original equations, this implies that $c_1 = 2$. Thus, these 3 constants are non-zero and they solve all 3 equations. Therefore, the 3 vectors must be linearly dependent: one can be written as a linear combination of the other 2. What this means in terms of geometry is that although there are 3 vectors and each has 3 coordinate values (3 axis values), these 3 vectors can actually be plotted in fewer than 3 dimensions by finding the appropriate coordinate system (here, 2 axes; in linear algebra this minimal set of axes is called a basis).
Are the vectors $x = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, $y = \begin{pmatrix} 3 \\ 4 \\ 1 \end{pmatrix}$, and $z = \begin{pmatrix} 3 \\ 6 \\ 5 \end{pmatrix}$ linearly dependent?
$$c_1\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + c_2\begin{pmatrix} 3 \\ 4 \\ 1 \end{pmatrix} + c_3\begin{pmatrix} 3 \\ 6 \\ 5 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \quad\text{or}\quad \begin{aligned} c_1 + 3c_2 + 3c_3 &= 0 \\ c_1 + 4c_2 + 6c_3 &= 0 \\ c_1 + c_2 + 5c_3 &= 0. \end{aligned}$$
Solve the system of equations by, for example, subtracting the first equation from the second, and then the third equation from the second:
$$c_2 + 3c_3 = 0 \qquad\text{and}\qquad 3c_2 + c_3 = 0,$$
which together force $c_3 = 0$, and then $c_2 = 0$ and $c_1 = 0$.
Since the only values for the $c_i$ that will solve the equations are $c_i = 0$, the vectors are linearly independent. Note: this does not imply that they are mutually perpendicular.
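Linear dependence can also be checked numerically via the rank of the matrix whose columns are the vectors; a sketch covering both examples:

```python
import numpy as np

# Columns are the vectors x, y, z from each example
A_dep = np.column_stack(([1, 1, 1], [3, 4, 1], [5, 6, 3]))   # first example
A_ind = np.column_stack(([1, 1, 1], [3, 4, 1], [3, 6, 5]))   # second example

rank_dep = np.linalg.matrix_rank(A_dep)   # 2 < 3, so the vectors are dependent
rank_ind = np.linalg.matrix_rank(A_ind)   # 3, so the vectors are independent

# The hand-derived combination c1 = 2, c2 = 1, c3 = -1 annihilates the first set
combo = A_dep @ np.array([2, 1, -1])
print(rank_dep, rank_ind, combo)   # 2 3 [0 0 0]
```

The rank (2 here for the dependent set) is exactly the number of axes in the minimal coordinate system, i.e., the dimension of the basis mentioned above.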
[Figure: the vector $y$ and its projection $\mathrm{Proj}(y \to x)$ drawn along the vector $x$.]
The lengths of these projections will often correspond to regression coefficients in normal theory linear and multiple regression ($\hat{\beta} = (X'X)^{-1}(X'y)$, where the vector $y$ is projected into the space of the predictor variables).
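The link between projection and regression coefficients can be illustrated with a small sketch; the two vectors here are hypothetical, not taken from the skull data:

```python
import numpy as np

x = np.array([1., 2.])   # hypothetical predictor vector
y = np.array([2., 3.])   # hypothetical response vector

# Scalar projection coefficient: b = x'y / x'x (a no-intercept regression slope)
b = (x @ y) / (x @ x)
proj = b * x             # Proj(y -> x), the piece of y lying along x

# The same coefficient from the general formula beta-hat = (X'X)^(-1) X'y
X = x.reshape(-1, 1)
b_hat = np.linalg.inv(X.T @ X) @ (X.T @ y)

print(b, proj)   # 1.6 [1.6 3.2]
```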
11. A matrix $X_{(n \times p)}$ is a rectangular arrangement of numbers into $n$ rows and $p$ columns. A very common representation is the data matrix, whose rows are the observations and whose columns are the variables. This would be the data of Table 1. Typically, we would have a separate matrix for males and females and would discard the column corresponding to sex. The other tables could also be described as matrices.
12. A square matrix has the same number of rows as columns, n = p . The covariance matrix
and correlation matrix (Tables 4, 5, and 6) are square matrices.
13. The transpose of a matrix $X$ is the new matrix $X'$ whose rows are the columns of $X$ and whose columns are the rows of $X$, such that $x'_{ij} = x_{ji}$, where $i$ and $j$ are the matrix indices for rows and columns with the row index listed first.
    If $X = \begin{pmatrix} 1 & 3 \\ 4 & 2 \end{pmatrix}$ then $X' = \begin{pmatrix} 1 & 4 \\ 3 & 2 \end{pmatrix}$.
14. The matrix $X$ is a square symmetric matrix if $X = X'$. Notice that the covariance matrix and correlation matrix are symmetric matrices. Note also that Table 6 is actually 2 different correlation matrices written as one by exploiting this fact.
15. The identity matrix $I$ is a square symmetric matrix that has 1s on the main diagonal ($i_{jj} = 1$ for $j = 1, 2, \ldots, n$) and 0s elsewhere.
    For $n = 3$, $I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$.
If $X = \begin{pmatrix} 1 & 3 \\ 4 & 2 \end{pmatrix}$ then $2X = \begin{pmatrix} 2 & 6 \\ 8 & 4 \end{pmatrix}$.
19. Two matrices can be added if they are conformable for addition, i.e., they have the same
number of rows and columns. Addition is performed by adding together the corresponding
elements of each matrix and placing the results in the corresponding elements of a new
matrix of the same dimensions.
Compute the pooled variance-covariance matrix of the skull measurements
by pooling the variance-covariance matrices of the males and females.
Note that we must take into account the degrees of freedom of each matrix.
$$S_{pool} = \frac{(n_m - 1)S_m + (n_f - 1)S_f}{n_m + n_f - 2}$$
where $S_m$ and $S_f$ are the variance-covariance matrices for the males and females, respectively, and $n_m$ and $n_f$ are the number of skulls for the males and females, respectively. To save some space with this example, the matrix entries will be rounded to integers.
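The pooling formula is easy to express in code. The sketch below uses made-up 2x2 variance-covariance matrices purely for illustration; the real 4x4 entries are those of Tables 4 and 5:

```python
import numpy as np

# Hypothetical 2x2 variance-covariance matrices standing in for Tables 4 and 5
S_f = np.array([[19000., 17000.], [17000., 18000.]])   # females (illustrative values)
S_m = np.array([[33000., 28000.], [28000., 26000.]])   # males (illustrative values)
n_f, n_m = 26, 33   # skull counts from Table 1

# Degrees-of-freedom weighted pool: S_pool = ((n_m-1)S_m + (n_f-1)S_f) / (n_m + n_f - 2)
S_pool = ((n_m - 1) * S_m + (n_f - 1) * S_f) / (n_m + n_f - 2)
print(np.round(S_pool))
```

Weighting by degrees of freedom ($n - 1$, not $n$) keeps the pooled estimate unbiased, which is why the formula tracks them explicitly.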
20. Matrix multiplication can also be performed if the matrices are conformable. Let $X$ be a $p \times n$ matrix and let $Y$ be an $n \times k$ matrix; then the product $XY$ is defined, and the resulting matrix will have dimensions $p \times k$. The matrix multiplication is essentially a set of inner products of the rows of the first matrix with the columns of the second matrix, with each result entered into the new matrix at the position defined by the row of the first matrix and the column of the second matrix used in the inner product. Note that the order of the matrices in the operation is very important. Note also that pre- or post-multiplication of a matrix by a conformable identity matrix returns the original matrix.
    If $X = \begin{pmatrix} 1 & 3 \\ 4 & -2 \end{pmatrix}$ and $Y = \begin{pmatrix} 5 & 2 \\ 1 & -3 \end{pmatrix}$ then
    $$XY = \begin{pmatrix} 1 & 3 \\ 4 & -2 \end{pmatrix}\begin{pmatrix} 5 & 2 \\ 1 & -3 \end{pmatrix} = \begin{pmatrix} 1(5) + 3(1) & 1(2) + 3(-3) \\ 4(5) + (-2)(1) & 4(2) + (-2)(-3) \end{pmatrix} = \begin{pmatrix} 8 & -7 \\ 18 & 14 \end{pmatrix}$$
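Item 20's rules (order matters; multiplying by a conformable identity matrix returns the matrix) can be checked with a small sketch; the 2x2 matrices below are illustrative:

```python
import numpy as np

X = np.array([[1, 3], [4, -2]])
Y = np.array([[5, 2], [1, -3]])

XY = X @ Y   # inner products of rows of X with columns of Y
YX = Y @ X   # a different matrix entirely: order matters

I2 = np.eye(2, dtype=int)
print(XY)            # [[ 8 -7] [18 14]]
print(I2 @ X)        # pre-multiplying by I returns X unchanged
```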
21. The trace of a matrix is the sum of its diagonal elements.
23. The matrix inverse of the square matrix $X$ is the matrix $X^{-1}$ such that $X^{-1}X = XX^{-1} = I$. For the inverse to exist, the columns of $X$ must be linearly independent (full rank). If the matrix is not of full rank then the matrix is called singular, while if it is of full rank then it is called non-singular.
    For a $2 \times 2$ matrix $X$, the inverse is $X^{-1} = \frac{1}{|X|}\begin{pmatrix} x_{22} & -x_{12} \\ -x_{21} & x_{11} \end{pmatrix}$, where $|X|$ is the determinant of the matrix.
    If $X = \begin{pmatrix} 1 & 3 \\ 3 & -2 \end{pmatrix}$ then
    $$X^{-1} = \begin{pmatrix} 1 & 3 \\ 3 & -2 \end{pmatrix}^{-1} = \frac{1}{-11}\begin{pmatrix} -2 & -3 \\ -3 & 1 \end{pmatrix} = \begin{pmatrix} 0.18181818 & 0.27272727 \\ 0.27272727 & -0.09090909 \end{pmatrix}.$$
Note that the inverse of a square symmetric matrix is also square and sym-
metric. Further note that in many statistical applications due to the high
correlations among variables, the inverse can be very sensitive to rounding
errors. Thus, it is important to use good subroutines and statistical software
to avoid numerical problems.
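The 2x2 inverse can be double-checked with NumPy; the symmetric matrix below mirrors the worked example (signs as reconstructed here), and the check confirms both $XX^{-1} = I$ and the symmetry of the inverse:

```python
import numpy as np

X = np.array([[1., 3.], [3., -2.]])   # square, symmetric, full rank
X_inv = np.linalg.inv(X)

det = np.linalg.det(X)    # -11 for this matrix
check = X @ X_inv         # the identity, up to rounding error

print(np.round(X_inv, 8))
print(np.round(check))
```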
The above results also imply the characteristic equation, $|A - \lambda I| = 0$, which can be used to find the eigenvalues of $A$.
$$|A - \lambda I| = \left|\begin{pmatrix} 1 & 0 \\ 1 & 3 \end{pmatrix} - \lambda\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right| = \begin{vmatrix} 1-\lambda & 0 \\ 1 & 3-\lambda \end{vmatrix} = (1-\lambda)(3-\lambda) - (1)(0) = 0$$
The roots of this quadratic equation can be found using the quadratic formula or, here, by inspection. Thus, $\lambda_1 = 1$ and $\lambda_2 = 3$ are the eigenvalues.
For $\lambda_1 = 1$, the eigenvector satisfies
$$\begin{pmatrix} 1 & 0 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 1\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad\text{or}\quad \begin{aligned} x_1 &= x_1 \\ x_1 + 3x_2 &= x_2 \end{aligned} \quad\text{or}\quad x_1 = -2x_2.$$
For example, $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 \\ -1 \end{pmatrix}$ is a solution. The normalized eigenvector $e_1$ can be found by dividing this solution by its length: $e_1 = \frac{1}{\sqrt{5}}\begin{pmatrix} 2 \\ -1 \end{pmatrix}$.
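The eigenvalues and normalized eigenvectors can be recovered numerically; a sketch for the same 2x2 matrix:

```python
import numpy as np

A = np.array([[1., 0.], [1., 3.]])
eigvals, eigvecs = np.linalg.eig(A)   # columns of eigvecs are unit-length eigenvectors

print(sorted(eigvals))   # eigenvalues 1 and 3, as found by hand

# Each pair satisfies the defining relation A e = lambda e;
# the eigenvector for lambda = 1 is proportional to (2, -1)
for lam, e in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ e, lam * e))
```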
$$= \begin{pmatrix} 53474.014 \\ 3641.7324 \\ 2824.7904 \\ 1189.3712 \end{pmatrix}$$
You should also note that the matrix of eigenvectors of a square, symmetric matrix is also an orthogonal matrix:
$$E'E = \begin{pmatrix} 0.99999997 & 9.566176 \times 10^{-8} & 1.5926028 \times 10^{-7} & 3.4196496 \times 10^{-7} \\ 9.566176 \times 10^{-8} & 1.0000001 & 1.3958974 \times 10^{-7} & 1.589027 \times 10^{-7} \\ 1.5926028 \times 10^{-7} & 1.3958974 \times 10^{-7} & 0.9999998 & 7.740809 \times 10^{-8} \\ 3.4196496 \times 10^{-7} & 1.589027 \times 10^{-7} & 7.740809 \times 10^{-8} & 1.0000005 \end{pmatrix} \approx I$$
main diagonal elements are the corresponding eigenvalues of $X$,
$$D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_k \end{pmatrix}.$$
You should also note that $P' = \begin{pmatrix} e_1' \\ e_2' \\ \vdots \\ e_k' \end{pmatrix}$.
As an example, use the matrix $E$ above for the matrix $P$, then construct the diagonal matrix $D$ as
$$D = \begin{pmatrix} 53474.014 & 0 & 0 & 0 \\ 0 & 3641.7324 & 0 & 0 \\ 0 & 0 & 2824.7904 & 0 \\ 0 & 0 & 0 & 1189.3712 \end{pmatrix}.$$
28. The form $x'Ax$, where $x$ is a vector and $A$ is a symmetric matrix, is called a quadratic form as it involves square and cross-product terms when expanded. Note that $A = \left(\frac{1}{n_m} + \frac{1}{n_f}\right)S_{pool}$ in our earlier formula. Also note that in order for
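A quadratic form expands into square and cross-product terms; a small numeric sketch with an illustrative symmetric $A$ (not the pooled covariance matrix):

```python
import numpy as np

# Quadratic form x'Ax with a small symmetric A (illustrative values)
A = np.array([[2., 1.], [1., 3.]])
x = np.array([1., 2.])

q = x @ A @ x
print(q)   # 18.0

# The same value written out as square and cross-product terms:
# a11*x1^2 + 2*a12*x1*x2 + a22*x2^2
expanded = A[0, 0]*x[0]**2 + 2*A[0, 1]*x[0]*x[1] + A[1, 1]*x[1]**2
```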
Further, these results imply that the squares of the singular values of $X$ are the non-zero eigenvalues of $X'X$ and $XX'$, assuming that $X$ is of full column rank.
$$D = \begin{pmatrix} 45131.02 & 0 & 0 & 0 \\ 0 & 377.19 & 0 & 0 \\ 0 & 0 & 297.96 & 0 \\ 0 & 0 & 0 & 148.97 \end{pmatrix}$$
U =
0.1910442 0.039454 -0.289166 0.2150759
0.1986437 0.0109427 0.0138016 -0.112158
0.1973333 -0.15866 -0.231725 0.0123482
0.1933035 0.1957425 0.1819122 -0.308274
0.1947336 0.0700373 -0.078581 -0.113865
0.193244 0.1556038 0.0951482 0.171743
0.1944312 0.2584742 -0.088927 0.0379134
0.2003849 -0.16754 0.1873947 0.2195149
0.2047745 -0.212409 0.2406047 0.2440917
0.1967973 -0.229306 -0.339616 -0.237416
0.1935504 -0.066441 -0.240768 -0.226629
0.1958693 -0.02393 0.0754968 -0.285667
0.198091 0.0462814 -0.114289 0.0845266
0.2058031 -0.247272 0.049245 0.3098562
0.1909718 0.0254284 0.2312493 -0.094161
0.1951365 -0.313074 0.3369767 -0.025048
0.194365 -0.187348 0.0195847 -0.452141
0.1890062 0.0988286 0.3002431 -0.119194
0.2022619 -0.400064 0.0257814 0.0012229
0.1910822 0.1088505 -0.154361 -0.132992
0.2021647 0.0596883 -0.002213 0.2673777
0.1895836 0.3096762 -0.110239 0.004265
0.2022993 -0.054062 -0.250534 0.2262461
0.2006189 0.1419394 0.090107 0.1597449
0.19196 0.2103423 -0.250101 0.0881389
0.1900654 0.4033163 0.3018608 0.0045121
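The SVD relationships above ($U$ has orthonormal columns; squared singular values equal the non-zero eigenvalues of $X'X$) hold for any full-column-rank matrix, so they can be confirmed on a random stand-in for the 26 x 4 data matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((26, 4))   # stand-in for a 26 x 4 data matrix

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Squared singular values equal the eigenvalues of X'X
eig_XtX = np.linalg.eigvalsh(X.T @ X)
print(np.allclose(np.sort(s**2), np.sort(eig_XtX)))   # True

# U has orthonormal columns, like the U printed above
print(np.allclose(U.T @ U, np.eye(4)))                # True
```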