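$$A = (a_{ij})_{m \times n} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
where $a_{ij}$ denotes the element in the $i$th row and $j$th column of $A$, often referred
to as the $(i,j)$-element of $A$.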
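$$A^T = (a_{ij})^T_{m \times n} = (a_{ji})_{n \times m} = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix}$$
or $v^T = [v_1 \;\; v_2 \;\; \cdots \;\; v_n]$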
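For $A = (a_{ij})_{m \times n}$ and $B = (b_{ij})_{n \times l}$:
$$AB = C = (c_{ij})_{m \times l} \quad \text{with} \quad c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$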
$$u^T v = v^T u = \sum_{i=1}^{n} u_i v_i = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n$$
0
0
Quadratic form
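$$x^T A x = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$$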
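$$\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = a_{11} x^2 + (a_{12} + a_{21})\, xy + a_{22} y^2$$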
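As a quick numerical check of the double-sum form, the following Python sketch evaluates a quadratic form both as $\sum_i \sum_j a_{ij} x_i x_j$ and through the expanded expression for $n = 2$. The matrix entries and the vector are illustrative values chosen here, not taken from the notes.

```python
def quadratic_form(A, x):
    """Return x^T A x computed as the double sum of a_ij * x_i * x_j."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def quadratic_form_2x2(A, x, y):
    """Expanded form for n = 2: a11 x^2 + (a12 + a21) x y + a22 y^2."""
    return A[0][0] * x**2 + (A[0][1] + A[1][0]) * x * y + A[1][1] * y**2


# Illustrative (assumed) values; A is deliberately not symmetric.
A = [[2, -1], [3, 4]]
print(quadratic_form(A, [5, 7]), quadratic_form_2x2(A, 5, 7))
```

Both calls return the same number because the off-diagonal entries only enter through their sum $a_{12} + a_{21}$.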
Determinant
$|A| = a_{1j} A_{1j} + a_{2j} A_{2j} + a_{3j} A_{3j}$ (expansion by column $j$)
For example,
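$$|A| = \begin{vmatrix} 2 & 4 & 5 \\ 0 & 3 & -2 \\ -3 & 6 & 8 \end{vmatrix} = 2 \begin{vmatrix} 3 & -2 \\ 6 & 8 \end{vmatrix} - 4 \begin{vmatrix} 0 & -2 \\ -3 & 8 \end{vmatrix} + 5 \begin{vmatrix} 0 & 3 \\ -3 & 6 \end{vmatrix} \quad \text{(expansion by row 1)}$$
$$= 2 \begin{vmatrix} 3 & -2 \\ 6 & 8 \end{vmatrix} + (-3) \begin{vmatrix} 4 & 5 \\ 3 & -2 \end{vmatrix} \quad \text{(expansion by column 1)}$$
$$= 2\,[24 - (-12)] - 3\,(-8 - 15) = 141$$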
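The cofactor expansion can be sketched as a short recursive Python function. The 3×3 matrix below is an assumed reading of the example: the signs of its entries were chosen so that the expansion reproduces the quoted value 141. The helper names `minor` and `det` are hypothetical.

```python
def minor(A, i, j):
    """Return A with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A, col=0):
    """Determinant by cofactor expansion along column `col` (0-based)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # Cofactor A_ij = (-1)^(i+j) times the determinant of the minor.
    return sum((-1) ** (i + col) * A[i][col] * det(minor(A, i, col))
               for i in range(n))


# Assumed entries (signs inferred), chosen to match the value 141.
A = [[2, 4, 5],
     [0, 3, -2],
     [-3, 6, 8]]
print([det(A, col=j) for j in range(3)])
```

Expanding along any of the three columns gives the same determinant, which is a convenient sanity check on the sign pattern of the cofactors.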
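$$\begin{vmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ 0 & a_{22} & \cdots & a_{n2} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{vmatrix} = a_{11} a_{22} \cdots a_{nn}$$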
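Vectors $v_1, v_2, \dots, v_k$ are said to be linearly independent if
$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k = 0$$
holds only for scalars $\alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$.
Vectors $v_1, v_2, \dots, v_k$ are said to be linearly dependent if
$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k = 0$$
holds for at least one $\alpha_i \neq 0$.
Equivalently, $v_1, v_2, \dots, v_k$ are linearly dependent if there exists a $v_i$ such that,
for some scalars $\beta_j$, $j \neq i$,
$$v_i = \sum_{j \neq i} \beta_j v_j = -\sum_{j \neq i} \frac{\alpha_j}{\alpha_i} v_j \quad (\alpha_i \neq 0)$$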
Rank of matrix
Example A1.1.
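$$A = \begin{bmatrix} 1 & 3 & 2 & 0 \\ 2 & 8 & 8 & 2 \\ 3 & 6 & 9 & 0 \\ 4 & 10 & 1 & -3 \end{bmatrix} \xrightarrow{\substack{R_2 - 2R_1 \\ R_3 - 3R_1 \\ R_4 - 4R_1}} \begin{bmatrix} 1 & 3 & 2 & 0 \\ 0 & 2 & 4 & 2 \\ 0 & -3 & 3 & 0 \\ 0 & -2 & -7 & -3 \end{bmatrix}$$
$$\xrightarrow{\substack{R_3 + 1.5 R_2 \\ R_4 + R_2}} \begin{bmatrix} 1 & 3 & 2 & 0 \\ 0 & 2 & 4 & 2 \\ 0 & 0 & 9 & 3 \\ 0 & 0 & -3 & -1 \end{bmatrix} \xrightarrow{R_4 + \frac{1}{3} R_3} \begin{bmatrix} 1 & 3 & 2 & 0 \\ 0 & 2 & 4 & 2 \\ 0 & 0 & 9 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$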
Thus Rank(A) = 3.
Note: The matrix A need not be square for its rank to be determined in this way.
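The row-reduction procedure can be sketched in Python with exact rational arithmetic. The 4×4 matrix is an assumed reading of Example A1.1 (its fourth column and the signs are inferred from the row operations), so treat it as illustrative. The function name `rank` is hypothetical.

```python
from fractions import Fraction


def rank(A):
    """Rank via forward elimination to row echelon form (exact arithmetic)."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # index of the next pivot row
    for c in range(cols):
        # Find a row at or below r with a non-zero entry in column c.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r  # number of non-zero rows in the echelon form


# Assumed entries for Example A1.1 (partly inferred).
A = [[1, 3, 2, 0],
     [2, 8, 8, 2],
     [3, 6, 9, 0],
     [4, 10, 1, -3]]
print(rank(A))                          # square case
print(rank([[1, 2, 3], [2, 4, 6]]))     # non-square case
```

The second call illustrates the note above: the same elimination works for a non-square matrix.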
Inverse matrix
(A1.1)
Then $A^{-1} = B$.
An $n \times n$ matrix $A$ is said to be orthogonal if $AA^T = A^T A = I$, or $A^T = A^{-1}$.
$A = [a_1 \;\; \cdots \;\; a_n]$ is orthogonal if and only if $a_1, \dots, a_n$ are orthonormal vectors
in the sense that $a_i^T a_i = 1$ and $a_i^T a_j = 0$ for $i \neq j$, $1 \le i, j \le n$.
Example A1.2.
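$$A = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 3 & 2 & -1 \end{bmatrix} \;\Rightarrow\; [A \mid I] = \left[\begin{array}{ccc|ccc} 0 & 1 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 3 & 2 & -1 & 0 & 0 & 1 \end{array}\right]$$
$$\xrightarrow{R_1 \leftrightarrow R_2} \left[\begin{array}{ccc|ccc} 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 3 & 2 & -1 & 0 & 0 & 1 \end{array}\right] \xrightarrow{R_3 - 3R_1} \left[\begin{array}{ccc|ccc} 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 2 & -4 & 0 & -3 & 1 \end{array}\right]$$
$$\xrightarrow{R_3 - 2R_2} \left[\begin{array}{ccc|ccc} 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & -6 & -2 & -3 & 1 \end{array}\right] \xrightarrow{R_3 / (-6)} \left[\begin{array}{ccc|ccc} 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1/3 & 1/2 & -1/6 \end{array}\right]$$
$$\xrightarrow{\substack{R_1 - R_3 \\ R_2 - R_3}} \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -1/3 & 1/2 & 1/6 \\ 0 & 1 & 0 & 2/3 & -1/2 & 1/6 \\ 0 & 0 & 1 & 1/3 & 1/2 & -1/6 \end{array}\right]$$
$$A^{-1} = \begin{bmatrix} -1/3 & 1/2 & 1/6 \\ 2/3 & -1/2 & 1/6 \\ 1/3 & 1/2 & -1/6 \end{bmatrix} = \frac{1}{6} \begin{bmatrix} -2 & 3 & 1 \\ 4 & -3 & 1 \\ 2 & 3 & -1 \end{bmatrix}$$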
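The Gauss-Jordan procedure of reducing $[A \mid I]$ to $[I \mid A^{-1}]$ can be sketched in Python with exact fractions. The matrix below is an assumed reading of Example A1.2 (the entry $-1$ in the last position is inferred), and `inverse` is a hypothetical helper name.

```python
from fractions import Fraction


def inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination on [A | I]."""
    n = len(A)
    # Build the augmented matrix [A | I] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[c], M[pivot] = M[pivot], M[c]          # row swap if needed
        p = M[c][c]
        M[c] = [x / p for x in M[c]]             # scale pivot row to 1
        for i in range(n):
            if i != c and M[i][c] != 0:          # clear the rest of column c
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]                # right half is A^{-1}


# Assumed entries for Example A1.2 (sign of the last entry inferred).
A = [[0, 1, 1],
     [1, 0, 1],
     [3, 2, -1]]
A_inv = inverse(A)
print(A_inv)
```

The order of row operations differs slightly from a hand computation, but the resulting inverse is unique, so the output can be compared entry by entry.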
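Alternatively, $A^{-1}$ of $A = (a_{ij})_{n \times n}$ can be determined by
$$A^{-1} = \frac{1}{|A|} \, (A_{ij})^T_{n \times n}$$
(A1.2)
where $A_{ij}$ is the cofactor of $a_{ij}$. For example,
$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \;\Rightarrow\; A^{-1} = \frac{1}{|A|} \begin{bmatrix} A_{11} & A_{21} \\ A_{12} & A_{22} \end{bmatrix} = \frac{1}{a_{11} a_{22} - a_{12} a_{21}} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}$$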
Linear equations
(A1.3)
$(A - \lambda I) v = 0$
(A1.4)
(A1.5)
(A1.6)
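$$(\lambda_1 - \lambda_2)\alpha_2 v_2 + \cdots + (\lambda_1 - \lambda_m)\alpha_m v_m = 0$$
$$(\lambda_1 - \lambda_2)\alpha_2 = \cdots = (\lambda_1 - \lambda_m)\alpha_m = 0$$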
(A1.7)
Similar Transform
$$(V^{-1} A V)(V^{-1} v) = V^{-1} A v = V^{-1} \lambda v = \lambda (V^{-1} v)$$
(A1.8)
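Let $v_j = [v_{1j} \;\; v_{2j} \;\; \cdots \;\; v_{nj}]^T$. Then
$$AV = [\lambda_1 v_1 \;\; \lambda_2 v_2 \;\; \cdots \;\; \lambda_n v_n] = \begin{bmatrix} \lambda_1 v_{11} & \lambda_2 v_{12} & \cdots & \lambda_n v_{1n} \\ \lambda_1 v_{21} & \lambda_2 v_{22} & \cdots & \lambda_n v_{2n} \\ \vdots & \vdots & & \vdots \\ \lambda_1 v_{n1} & \lambda_2 v_{n2} & \cdots & \lambda_n v_{nn} \end{bmatrix}$$
$$= \begin{bmatrix} v_{11} & v_{12} & \cdots & v_{1n} \\ v_{21} & v_{22} & \cdots & v_{2n} \\ \vdots & \vdots & & \vdots \\ v_{n1} & v_{n2} & \cdots & v_{nn} \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = VD$$
(A1.9)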
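$$(A - \lambda_1 I)[v_1 \;\; v_2] = (A - I)[v_1 \;\; v_2] = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ -1 & -1 \end{bmatrix} = 0 \quad \text{and}$$
$$(A - \lambda_2 I) v_3 = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = 0 \;\Rightarrow\; V = [v_1 \;\; v_2 \;\; v_3] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}$$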
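A small Python check of the relation $AV = VD$ for this example is sketched below. The matrix $A$, the eigenvalues 1, 1, 2 and the signs in $V$ are assumptions inferred from the displayed matrices $A - \lambda_1 I$ and $A - \lambda_2 I$, so the snippet is illustrative rather than a transcription of the notes.

```python
def matmul(X, Y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


# Assumed matrix with eigenvalue 1 (twice) and eigenvalue 2.
A = [[1, 0, 0],
     [0, 1, 0],
     [1, 1, 2]]
# Columns are the eigenvectors v1, v2 (eigenvalue 1) and v3 (eigenvalue 2).
V = [[1, 0, 0],
     [0, 1, 0],
     [-1, -1, 1]]
D = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 2]]

print(matmul(A, V) == matmul(V, D))
```

Equality of the two products column by column is exactly the statement $A v_j = \lambda_j v_j$ for each eigenpair.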
Triangularisation
(A1.10)
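$$(A - \lambda_i I)^l v = 0, \quad l = 1, 2, \dots, m_i$$
(A1.11)
If $(A - \lambda I) v = 0$ has only one linearly independent solution $v_1$, then any other
eigenvector of $A$ has the form $c v_1$ for some scalar $c \neq 0$.
Let $v_2$ be a solution to $(A - \lambda I)^2 v = 0$, linearly independent of $v_1$. Then
$(A - \lambda I) v_2$ is an eigenvector, which implies $(A - \lambda I) v_2 = c v_1$ for some $c \neq 0$,
so that $A v_2 = c v_1 + \lambda v_2$. Thus
$$AV = A[v_1 \;\; v_2] = [A v_1 \;\; A v_2] = [\lambda v_1 \;\; c v_1 + \lambda v_2] = \lambda [v_1 \;\; v_2] + [0 \;\; c v_1] = \lambda V + [0 \;\; c v_1]$$
(A1.12)
Since $[V^{-1} v_1 \;\; V^{-1} v_2] = V^{-1} [v_1 \;\; v_2] = V^{-1} V = I \;\Rightarrow\; c V^{-1} v_1 = c\,[1 \;\; 0]^T = [c \;\; 0]^T$,
it follows from (A1.12) that $V^{-1} A V$ is upper triangular:
$$V^{-1} A V = V^{-1} \left( \lambda V + [0 \;\; c v_1] \right) = \lambda I + [0 \;\; c V^{-1} v_1] = \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} 0 & c \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} \lambda & c \\ 0 & \lambda \end{bmatrix}$$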
Orthogonal transform
(A1.13)
$v_2, \dots, v_m$, where $\|v\| = \sqrt{v^T v}$
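$$V^T V = \begin{bmatrix} v_1^T \\ \vdots \\ v_n^T \end{bmatrix} [v_1 \;\; \cdots \;\; v_n] = \begin{bmatrix} v_1^T v_1 & \cdots & v_1^T v_n \\ \vdots & \ddots & \vdots \\ v_n^T v_1 & \cdots & v_n^T v_n \end{bmatrix} = \begin{bmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{bmatrix} = I$$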
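$$A^{1/2} = V \, \mathrm{diag}\!\left(\sqrt{\lambda_1}, \dots, \sqrt{\lambda_n}\right) V^T$$
$$A^{1/2} A^{1/2} = V \, \mathrm{diag}\!\left(\sqrt{\lambda_1}, \dots, \sqrt{\lambda_n}\right) V^T V \, \mathrm{diag}\!\left(\sqrt{\lambda_1}, \dots, \sqrt{\lambda_n}\right) V^T = V \, \mathrm{diag}(\lambda_1, \dots, \lambda_n) \, V^T$$
$$= V (V^T A V) V^T = (V V^T) A (V V^T) = A$$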
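To make the square-root construction concrete, the sketch below builds $A^{1/2} = V \, \mathrm{diag}(\sqrt{\lambda_1}, \sqrt{\lambda_2}) \, V^T$ for a 2×2 symmetric matrix and checks that squaring it recovers $A$. The matrix $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, with eigenvalues 3 and 1 and orthonormal eigenvectors $(1, 1)/\sqrt{2}$ and $(1, -1)/\sqrt{2}$, is an illustrative choice made here, not an example from the notes.

```python
import math


def matmul(X, Y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


# Assumed example: A = [[2, 1], [1, 2]] has eigenvalues 3 and 1.
r = 1.0 / math.sqrt(2.0)
V = [[r, r], [r, -r]]            # orthonormal eigenvectors as columns
eigenvalues = [3.0, 1.0]
root_D = [[math.sqrt(eigenvalues[0]), 0.0],
          [0.0, math.sqrt(eigenvalues[1])]]

A_half = matmul(matmul(V, root_D), transpose(V))   # A^{1/2}
A_back = matmul(A_half, A_half)                    # should equal A
print(A_back)
```

The result agrees with $A$ up to floating-point rounding, which mirrors the algebraic argument above: the inner factor $V^T V$ collapses to $I$.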
Appendix A2
Then
$$f(x, y) = f_{X|Y}(x \mid y) \, f_Y(y) \quad \text{and} \quad f_X(x) = \int f(x, y) \, dy$$
Define
$$g(y) = E[X \mid Y = y] = \int x \, f_{X|Y}(x \mid y) \, dx$$
Then by definition, $E[X \mid Y] = g(Y)$.
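It follows that
$$E\{E[X \mid Y]\} = E[g(Y)] = \int g(y) f_Y(y) \, dy = \iint x \, f_{X|Y}(x \mid y) f_Y(y) \, dx \, dy$$
$$= \iint x \, f(x, y) \, dx \, dy = \int x \left( \int f(x, y) \, dy \right) dx = \int x f_X(x) \, dx = E[X]$$
(A2.1)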
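The identity $E\{E[X \mid Y]\} = E[X]$ can be checked on a small discrete example, with sums playing the role of the integrals. The joint probabilities below are made-up illustrative values, and the function names are hypothetical.

```python
from fractions import Fraction as F

# Assumed joint pmf of (X, Y), keyed by (x, y); values sum to 1.
joint = {(0, 0): F(1, 8), (1, 0): F(3, 8), (0, 1): F(1, 4), (2, 1): F(1, 4)}


def marginal_y(joint):
    """f_Y(y): sum the joint pmf over x."""
    f_y = {}
    for (x, y), p in joint.items():
        f_y[y] = f_y.get(y, F(0)) + p
    return f_y


def g(y, joint, f_y):
    """g(y) = E[X | Y = y] = sum_x x f(x, y) / f_Y(y)."""
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / f_y[y]


f_y = marginal_y(joint)
mean_x = sum(x * p for (x, _), p in joint.items())           # E[X]
mean_of_g = sum(g(y, joint, f_y) * p for y, p in f_y.items())  # E[g(Y)]
print(mean_x, mean_of_g)
```

Exact fractions are used so that the two expectations can be compared with `==` rather than a tolerance.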
$$E[h(X, Y)] = E\{E[h(X, Y) \mid Y]\} = \int E[h(X, y) \mid Y = y] \, f_Y(y) \, dy$$
(A2.2)
(A2.3)
Appendix A3
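$$s_i(\theta) = \frac{\partial}{\partial \theta} \log l(\theta \mid X_i) \quad \text{and} \quad v_i(\theta) = \frac{\partial^2}{\partial \theta^2} \log l(\theta \mid X_i)$$
$$\log L(\theta) = \sum_{i=1}^{n} \log l(\theta \mid X_i)$$
Then
$$\frac{\partial}{\partial \theta} \log L(\theta) = \sum_{i=1}^{n} s_i(\theta) \quad \text{and} \quad \frac{\partial^2}{\partial \theta^2} \log L(\theta) = \sum_{i=1}^{n} v_i(\theta)$$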
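$$E[s_1(\theta_0)] = \int \left. \frac{\partial \log f(x \mid \theta)}{\partial \theta} \right|_{\theta = \theta_0} f(x \mid \theta_0) \, dx = \int \left. \frac{f(x \mid \theta_0)}{f(x \mid \theta)} \frac{\partial f(x \mid \theta)}{\partial \theta} \right|_{\theta = \theta_0} dx$$
$$= \left. \frac{\partial}{\partial \theta} \int f(x \mid \theta) \, dx \right|_{\theta = \theta_0} = \left. \frac{\partial (1)}{\partial \theta} \right|_{\theta = \theta_0} = 0$$
(A3.1)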
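Hence
$$E\left[ \sum_{i=1}^{n} s_i(\theta_0) \right] = n E[s_1(\theta_0)] = 0$$
(A3.2)
$$0 = \sum_{i=1}^{n} s_i(\hat\theta) \approx \sum_{i=1}^{n} s_i(\theta_0) + \sum_{i=1}^{n} v_i(\theta_0) \, (\hat\theta - \theta_0)$$
(A3.3)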
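$$\hat\theta - \theta_0 \approx -\frac{\frac{1}{n} \sum_{i=1}^{n} s_i(\theta_0)}{\frac{1}{n} \sum_{i=1}^{n} v_i(\theta_0)} \to -\frac{E[s_1(\theta_0)]}{E[v_1(\theta_0)]} = 0 \quad \text{as } n \to \infty$$
(A3.4)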
Asymptotic normality
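$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} s_i(\theta_0) \to N\left(0, \mathrm{Var}[s_1(\theta_0)]\right) \quad \text{in distribution as } n \to \infty$$
(A3.5)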
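By (A3.3),
$$\sqrt{n} \, (\hat\theta - \theta_0) \approx -\frac{\frac{1}{\sqrt{n}} \sum_{i=1}^{n} s_i(\theta_0)}{\frac{1}{n} \sum_{i=1}^{n} v_i(\theta_0)} \to N\left(0, \frac{\mathrm{Var}[s_1(\theta_0)]}{\{E[v_1(\theta_0)]\}^2}\right)$$
(A3.6)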
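With $f_1 = f(X_1 \mid \theta)$,
$$v_1 = \frac{\partial^2 \log f_1}{\partial \theta^2} = \frac{1}{f_1} \frac{\partial^2 f_1}{\partial \theta^2} - \frac{1}{f_1^2} \left( \frac{\partial f_1}{\partial \theta} \right)^2 = \frac{1}{f_1} \frac{\partial^2 f_1}{\partial \theta^2} - s_1^2$$
(A3.7)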
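$$E[v_1(\theta_0)] = \int \left. \frac{1}{f(x \mid \theta_0)} \frac{\partial^2 f(x \mid \theta)}{\partial \theta^2} \right|_{\theta = \theta_0} f(x \mid \theta_0) \, dx - E[s_1^2(\theta_0)]$$
$$= \left. \frac{\partial^2}{\partial \theta^2} \int f(x \mid \theta) \, dx \right|_{\theta = \theta_0} - E[s_1^2(\theta_0)] = \left. \frac{\partial^2 (1)}{\partial \theta^2} \right|_{\theta = \theta_0} - E[s_1^2(\theta_0)] = -E[s_1^2(\theta_0)]$$
(A3.8)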
$$\mathrm{Var}[s_1(\theta_0)] = E[s_1^2(\theta_0)] - \{E[s_1(\theta_0)]\}^2 = E[s_1^2(\theta_0)] = -E[v_1(\theta_0)]$$
(A3.9)
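The two score identities, $E[s_1(\theta_0)] = 0$ and $E[s_1^2(\theta_0)] = -E[v_1(\theta_0)]$, can be checked numerically for a specific model. The sketch below assumes an exponential density $f(x \mid \theta) = \theta e^{-\theta x}$, which is not a model used in the notes. For it, $s_1 = 1/\theta - x$ and $v_1 = -1/\theta^2$. The midpoint-rule integrator, the truncation point and the value $\theta_0 = 2$ are likewise illustrative choices.

```python
import math


def integrate(g, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [0, upper]."""
    h = upper / steps
    return h * sum(g((k + 0.5) * h) for k in range(steps))


theta0 = 2.0                       # assumed true parameter value


def f(x):
    """Assumed density f(x | theta0) = theta0 * exp(-theta0 * x)."""
    return theta0 * math.exp(-theta0 * x)


def s(x):
    """Score: d/dtheta log f(x | theta) at theta0, i.e. 1/theta0 - x."""
    return 1.0 / theta0 - x


v = -1.0 / theta0 ** 2             # second derivative of log f (constant in x)
upper = 40.0 / theta0              # truncation point; the tail beyond is negligible

mean_score = integrate(lambda x: s(x) * f(x), upper)           # approx. E[s1]
mean_score_sq = integrate(lambda x: s(x) ** 2 * f(x), upper)   # approx. E[s1^2]
print(mean_score, mean_score_sq, -v)
```

For this density the support does not depend on $\theta$, so differentiation and integration can be interchanged and both identities hold.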
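Consequently,
$$\mathrm{Var}\left[ \sum_{i=1}^{n} s_i(\theta_0) \right] = n \, \mathrm{Var}[s_1(\theta_0)] = -n E[v_1(\theta_0)] = -E\left[ \sum_{i=1}^{n} v_i(\theta_0) \right] = I(\theta_0)$$
(A3.10)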
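$$\hat\theta - \theta_0 \sim N\left(0, \frac{-E[v_1(\theta_0)]}{n \{E[v_1(\theta_0)]\}^2}\right) = N\left(0, \frac{1}{-n E[v_1(\theta_0)]}\right) = N\left(0, \frac{1}{I(\theta_0)}\right) \;\Rightarrow\; \hat\theta \sim N\left(\theta_0, \frac{1}{I(\theta_0)}\right)$$
This also shows that, for large $n$, the variance of the MLE $\hat\theta$ of $\theta_0$ can be
approximated by
$$\mathrm{Var}(\hat\theta) \approx \frac{1}{I(\theta_0)}$$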
0 0 () 0 0
(A3.11)
E v1 0
E s1 0 0
By the multivariate central limit theorem, (A3.5) remains valid, and similar
arguments to (A3.8) show that the variance matrix in (A3.5) is
$$\mathrm{Var}[s_1(\theta_0)] = E\left[ s_1(\theta_0) \, s_1(\theta_0)^T \right] = -E[v_1(\theta_0)]$$
(A3.12)
Combining (A3.5), (A3.11), (A3.12) and the law of large numbers, we get
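$$\frac{1}{\sqrt{n}} I(\theta_0)(\hat\theta - \theta_0) \approx -\frac{1}{n} \sum_{i=1}^{n} v_i(\theta_0) \, \sqrt{n} \, (\hat\theta - \theta_0) \approx \frac{1}{\sqrt{n}} \sum_{i=1}^{n} s_i(\theta_0) \to N\left(0, -E[v_1(\theta_0)]\right) \quad \text{in distribution as } n \to \infty$$
(A3.13)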
(A3.14)
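$$-n E[v_1(\theta_0)] = I(\theta_0) \;\Rightarrow\; \hat\theta - \theta_0 \sim N\left(0, \; I(\theta_0)^{-1} I(\theta_0) I(\theta_0)^{-1}\right) = N\left(0, \; I(\theta_0)^{-1}\right)$$
and consequently,
$$\hat\theta \sim N\left(\theta_0, \; I(\theta_0)^{-1}\right)$$
with variance estimator $\widehat{\mathrm{Var}}(\hat\theta) = I(\hat\theta)^{-1}$.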
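$$I(\theta_0)^{1/2} = \{-n E[v_1(\theta_0)]\}^{1/2} \quad \text{with inverse} \quad I(\theta_0)^{-1/2} = \frac{1}{\sqrt{n}} \{-E[v_1(\theta_0)]\}^{-1/2}$$
$$I(\theta_0)^{1/2} (\hat\theta - \theta_0) = I(\theta_0)^{-1/2} I(\theta_0) (\hat\theta - \theta_0) = \{-E[v_1(\theta_0)]\}^{-1/2} \, \frac{1}{\sqrt{n}} I(\theta_0)(\hat\theta - \theta_0)$$
$$\to N\left(0, \; \{-E[v_1(\theta_0)]\}^{-1/2} \{-E[v_1(\theta_0)]\} \{-E[v_1(\theta_0)]\}^{-1/2}\right) = N(0, I_k)$$
It follows that
$$(\hat\theta - \theta_0)^T I(\theta_0) (\hat\theta - \theta_0) \sim \chi^2_k$$
(A3.15)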
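$x_{(1)} \le \cdots \le x_{(n)}$
$$L(\theta) = \begin{cases} \theta^{-n} & \text{if } \theta \ge x_{(n)} \\ 0 & \text{if } \theta < x_{(n)} \end{cases}$$
$$\Pr(X_{(n)} \le x) = \Pr(X_1 \le x, \dots, X_n \le x) = \left( \frac{x}{\theta} \right)^n I_{[0, \theta]}(x) + I_{(\theta, \infty)}(x)$$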
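$$\Pr\left( n(\theta - X_{(n)}) \le x \right) = \Pr\left( X_{(n)} \ge \theta - \frac{x}{n} \right) = 1 - \Pr\left( X_{(n)} < \theta - \frac{x}{n} \right) = 1 - \left( 1 - \frac{x}{n\theta} \right)^n$$
Consequently,
$$\lim_{n \to \infty} \Pr\left( n(\theta - X_{(n)}) \le x \right) = \lim_{n \to \infty} \left[ 1 - \left( 1 - \frac{x}{n\theta} \right)^n \right] = 1 - e^{-x/\theta}$$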
Thus the MLE of $\theta$ for the uniform distribution over the interval $[0, \theta]$ is not
asymptotically normal.
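The non-normal limit can be illustrated numerically. For $U(0, \theta)$ the exact distribution function of $n(\theta - X_{(n)})$ is $1 - (1 - x/(n\theta))^n$ for $0 \le x \le n\theta$, and the sketch below compares it with the exponential limit $1 - e^{-x/\theta}$. The function names and the values of $x$, $n$ and $\theta$ are illustrative assumptions.

```python
import math


def exact_cdf(x, n, theta):
    """Pr(n (theta - X_(n)) <= x) for a U(0, theta) sample, 0 <= x <= n*theta."""
    return 1.0 - (1.0 - x / (n * theta)) ** n


def limit_cdf(x, theta):
    """Limiting distribution function: exponential with mean theta."""
    return 1.0 - math.exp(-x / theta)


theta = 2.0  # assumed value of the upper endpoint
for n in (10, 1_000, 1_000_000):
    print(n, exact_cdf(1.0, n, theta), limit_cdf(1.0, theta))
```

The scaling by $n$ rather than $\sqrt{n}$, and the exponential rather than normal limit, are the two features that distinguish this case from the regular theory.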
The above arguments for consistency and asymptotic normality are based on
complete data. For survival data subject to censoring, the results remain valid,
but the derivations become more complex.
For example, the contribution $l(\theta \mid X_i)$ to the likelihood becomes
$$l(\theta \mid X_i) = f(X_i \mid \theta)^{\delta_i} \, S(X_i \mid \theta)^{1 - \delta_i}$$
(A3.16)
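The censored-data likelihood contribution can be sketched in Python as follows. The exponential model ($f(x \mid \theta) = \theta e^{-\theta x}$, $S(x \mid \theta) = e^{-\theta x}$), the convention that $\delta_i = 1$ marks an observed failure and $\delta_i = 0$ a censored time, and the data values are all assumptions made for illustration. For this assumed model the maximiser has the closed form $\sum_i \delta_i / \sum_i x_i$.

```python
import math


def log_likelihood(theta, times, deltas):
    """Sum of log l(theta | X_i) = delta_i log f + (1 - delta_i) log S."""
    total = 0.0
    for x, d in zip(times, deltas):
        log_f = math.log(theta) - theta * x   # log density (assumed exponential)
        log_S = -theta * x                    # log survival function
        total += d * log_f + (1 - d) * log_S
    return total


def exponential_mle(times, deltas):
    """Closed-form maximiser for the assumed exponential model."""
    return sum(deltas) / sum(times)


# Made-up data: delta = 1 is an observed failure, delta = 0 is censored.
times = [2.0, 3.5, 1.2, 4.0, 0.8]
deltas = [1, 0, 1, 0, 1]

theta_hat = exponential_mle(times, deltas)
print(theta_hat, log_likelihood(theta_hat, times, deltas))
```

Censored observations contribute only through the survival function, which is why they add to the denominator $\sum_i x_i$ but not to the event count in the numerator.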
Let $X^*$ denote the failure time (not subject to censoring) with cdf $F(x \mid \theta)$
and $C$ the censoring random variable.
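$$\Pr(X = c) = \Pr(X^* > c) = 1 - F(c \mid \theta_0) = S(c \mid \theta_0)$$
$$s_1 = \frac{\partial \log l}{\partial \theta} = \delta_1 \frac{1}{f} \frac{\partial f}{\partial \theta} + (1 - \delta_1) \frac{1}{S} \frac{\partial S}{\partial \theta}$$
(A3.17)
where $l = l(\theta \mid X_1)$, $f = f(X_1 \mid \theta)$ and $S = S(X_1 \mid \theta)$.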
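It follows that
$$E[s_1 \mid C] = E\left[ \frac{\partial \log l}{\partial \theta} \,\Big|\, C \right] = E\left[ I(X^* \le C) \frac{1}{f} \frac{\partial f}{\partial \theta} + I(X^* > C) \frac{1}{S} \frac{\partial S}{\partial \theta} \,\Big|\, C \right]$$
$$= \int_0^C \frac{f(x \mid \theta_0)}{f(x \mid \theta)} \frac{\partial f(x \mid \theta)}{\partial \theta} \, dx + \frac{S(C \mid \theta_0)}{S(C \mid \theta)} \frac{\partial S(C \mid \theta)}{\partial \theta}$$
$$E[s_1(\theta_0)] = E\{E[s_1(\theta_0) \mid C]\} = E\left[ \int_0^C \frac{\partial f(x \mid \theta)}{\partial \theta} \, dx + \frac{\partial S(C \mid \theta)}{\partial \theta} \right]_{\theta = \theta_0}$$
(A3.18)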
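$$E[s_1(\theta_0)] = E\left[ \frac{\partial}{\partial \theta} \int_0^C f(x \mid \theta) \, dx + \frac{\partial S(C \mid \theta)}{\partial \theta} \right]_{\theta = \theta_0} = E\left[ \frac{\partial}{\partial \theta} \{F(C \mid \theta) + S(C \mid \theta)\} \right]_{\theta = \theta_0} = E\left[ \frac{\partial (1)}{\partial \theta} \right] = 0$$
(A3.19)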
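$$v_1 = \frac{\partial^2 \log l}{\partial \theta \, \partial \theta^T} = \frac{\partial}{\partial \theta^T} \left[ \delta_1 \frac{1}{f} \frac{\partial f}{\partial \theta} + (1 - \delta_1) \frac{1}{S} \frac{\partial S}{\partial \theta} \right]$$
$$= \delta_1 \left[ -\frac{1}{f^2} \frac{\partial f}{\partial \theta} \frac{\partial f}{\partial \theta^T} + \frac{1}{f} \frac{\partial^2 f}{\partial \theta \, \partial \theta^T} \right] + (1 - \delta_1) \left[ -\frac{1}{S^2} \frac{\partial S}{\partial \theta} \frac{\partial S}{\partial \theta^T} + \frac{1}{S} \frac{\partial^2 S}{\partial \theta \, \partial \theta^T} \right]$$
$$= -\left[ \delta_1 \left( \frac{1}{f} \frac{\partial f}{\partial \theta} \right) \left( \frac{1}{f} \frac{\partial f}{\partial \theta} \right)^T + (1 - \delta_1) \left( \frac{1}{S} \frac{\partial S}{\partial \theta} \right) \left( \frac{1}{S} \frac{\partial S}{\partial \theta} \right)^T \right] + \delta_1 \frac{1}{f} \frac{\partial^2 f}{\partial \theta \, \partial \theta^T} + (1 - \delta_1) \frac{1}{S} \frac{\partial^2 S}{\partial \theta \, \partial \theta^T}$$
$$= -s_1 s_1^T + I(X^* \le C) \frac{1}{f} \frac{\partial^2 f}{\partial \theta \, \partial \theta^T} + I(X^* > C) \frac{1}{S} \frac{\partial^2 S}{\partial \theta \, \partial \theta^T}$$
(A3.20)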
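$$E[v_1(\theta_0)] = -E\left[ s_1(\theta_0) s_1(\theta_0)^T \right] + E\left[ \int_0^C \frac{\partial^2 f(x \mid \theta)}{\partial \theta \, \partial \theta^T} \, dx + \frac{\partial^2 S(C \mid \theta)}{\partial \theta \, \partial \theta^T} \right]_{\theta = \theta_0}$$
$$= -E\left[ s_1(\theta_0) s_1(\theta_0)^T \right] + E\left[ \frac{\partial^2}{\partial \theta \, \partial \theta^T} \{F(C \mid \theta) + S(C \mid \theta)\} \right]_{\theta = \theta_0} = -E\left[ s_1(\theta_0) s_1(\theta_0)^T \right]$$
(A3.21)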
By the law of large numbers and central limit theorem, (A3.19) and (A3.21)
show that the following properties of the MLE remain true for survival data:
(i) $\hat\theta \to \theta_0$ in probability as $n \to \infty$; and
(ii) $\hat\theta \sim N\left(\theta_0, \; I(\theta_0)^{-1}\right)$
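If the data include both uncensored and censored points, let $x_j$ be the largest
censored point. Then $\delta_j = 0$ and
$$L(\theta) = \prod_{i=1}^{n} \left( \frac{1}{\theta} \right)^{\delta_i} \left( 1 - \frac{x_i}{\theta} \right)^{1 - \delta_i} = \theta^{-n} \prod_{i=1}^{n} (\theta - x_i)^{1 - \delta_i}$$
(A3.22)
$$\frac{\partial \log L(\theta)}{\partial \theta} = \frac{\partial}{\partial \theta} \left[ -n \log \theta + \sum_{i=1}^{n} (1 - \delta_i) \log(\theta - x_i) \right] = -\frac{n}{\theta} + \sum_{i=1}^{n} \frac{1 - \delta_i}{\theta - x_i}$$
$$= -\frac{n}{\theta} + \frac{1}{\theta - x_j} + \sum_{i \neq j} \frac{1 - \delta_i}{\theta - x_i} \to \infty \quad \text{as } \theta \to x_j \; (\theta > x_j)$$
and $\theta \, \partial \log L(\theta) / \partial \theta \to -n + \sum_{i=1}^{n} (1 - \delta_i) < 0$ as $\theta \to \infty$.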
0 2
1 ( 1)
2
Hence 2 if 1 x2 x( n ) 2 ; or x( n ) x2 if x2 2 .
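Furthermore, by (A3.18),
$$E[s_1(\theta_0)] = E\left[ -\int_0^C \frac{1}{\theta_0^2} I(x < \theta_0) \, dx + \frac{C}{\theta_0^2} I(C < \theta_0) \right]$$
$$= E\left[ -\frac{C}{\theta_0^2} I(C < \theta_0) - \frac{\theta_0}{\theta_0^2} I(C \ge \theta_0) + \frac{C}{\theta_0^2} I(C < \theta_0) \right] = -\frac{1}{\theta_0} E[I(C \ge \theta_0)] = -\frac{1}{\theta_0} \Pr(C \ge \theta_0)$$
(A3.23)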
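Similarly by (A3.20),
$$E[v_1(\theta_0)] = -E[s_1^2(\theta_0)] + E\left[ \int_0^C \frac{\partial^2 f(x \mid \theta)}{\partial \theta^2} \, dx + \frac{\partial^2 S(C \mid \theta)}{\partial \theta^2} \right]_{\theta = \theta_0}$$
$$= -E[s_1^2(\theta_0)] + E\left[ \int_0^C \frac{2}{\theta_0^3} I(x < \theta_0) \, dx - \frac{2C}{\theta_0^3} I(C < \theta_0) \right] = -E[s_1^2(\theta_0)] + \frac{2}{\theta_0^2} \Pr(C \ge \theta_0)$$
(A3.24)
Hence
$$E[s_1(\theta_0)] = 0 \quad \text{and} \quad E[v_1(\theta_0)] = -E[s_1^2(\theta_0)]$$
(A3.25)
if and only if $\Pr(C \ge \theta_0) = 0$, or $\Pr(C < \theta_0) = 1$.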
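Recall that a key regularity condition for the properties of the MLE is that the
support of the failure distribution does not depend on unknown parameters.
This generally ensures that integration and differentiation can be interchanged in
(A3.1) and (A3.8), or in (A3.19) and (A3.21), in order to obtain (A3.25).
For example, $U(0, \theta)$ does not satisfy this condition, and in this case
$$\int \frac{\partial f}{\partial \theta} \, dx = \int_0^\theta \frac{\partial}{\partial \theta} \left( \frac{1}{\theta} \right) dx = -\int_0^\theta \frac{1}{\theta^2} \, dx = -\frac{1}{\theta} \neq 0 = \frac{\partial}{\partial \theta} \int f \, dx$$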
This is, however, not a necessary condition. Even if it fails, (A3.25) may still
hold, such as in a censored $U(0, \theta)$ with $\Pr(C < \theta_0) = 1$.
When the support of the failure distribution depends on unknown parameters,
we need to check (A3.25) specifically instead of relying on the general theory.