Intermediate Biostatistics
Introduction to Cox Regression
History
Example 1: Study of publication bias
By Kaplan-Meier methods
From: Publication bias: evidence of delayed publication in a cohort study of clinical research projects BMJ 1997;315:640-645 (13 September)
Columns: # not published, # published
Null: 29, 23, 1.00
Non-significant trend: 16
Significant: 47, 99
Example 2: Study of mortality in academy award winners for screenwriting
Kaplan-Meier methods
From: Longevity of screenwriters who win an academy award: longitudinal study BMJ 2001;323:1491-1496 (22-29 December)
Relative increase in death rate for winners (%):
37 (10 to 70)
HR=1.37; interpretation: 37% higher incidence of death for winners compared with nominees.
HR=1.35; interpretation: 35% higher incidence of death for winners compared with nominees, even after adjusting for potential confounders.
32 (6 to 64)
36 (10 to 69)
39 (12 to 73)
33 (7 to 65)
37 (10 to 70)
39 (12 to 73)
40 (13 to 75)
43 (14 to 79)
36 (9 to 68)
32 (6 to 64)
40 (11 to 76)
35 (7 to 70)
Characteristics of Cox Regression
Continuous predictors
E.g.: hmohiv dataset from the lab
Characteristics of Cox Regression, continued
Assumptions of Cox Regression
The model
Components:
A baseline hazard function that is left unspecified but must be positive (= the hazard when all covariates are 0)
A linear function of a set of k fixed covariates that is exponentiated (= the relative risk)
$$h_i(t) = \lambda_0(t)\, e^{\beta_1 x_{i1} + \dots + \beta_k x_{ik}}$$
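The model
Proportional hazards:
Hazard for person i (e.g., a smoker) divided by the hazard for person j gives the hazard ratio:
$$HR_{i,j} = \frac{h_i(t)}{h_j(t)} = \frac{\lambda_0(t)\, e^{\beta_1 x_{i1} + \dots + \beta_k x_{ik}}}{\lambda_0(t)\, e^{\beta_1 x_{j1} + \dots + \beta_k x_{jk}}} = e^{\beta_1 (x_{i1} - x_{j1}) + \dots + \beta_k (x_{ik} - x_{jk})}$$
$$HR_{smoking} = \frac{h_i(t)}{h_j(t)} = \frac{\lambda_0(t)\, e^{\beta_{smoking}(1) + \beta_{age}(60)}}{\lambda_0(t)\, e^{\beta_{smoking}(0) + \beta_{age}(60)}} = e^{\beta_{smoking}(1-0)} = e^{\beta_{smoking}}$$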
The model: continuous predictor
$$HR = \frac{h_i(t)}{h_j(t)} = \frac{\lambda_0(t)\, e^{\beta_{smoking}(0) + \beta_{age}(70)}}{\lambda_0(t)\, e^{\beta_{smoking}(0) + \beta_{age}(60)}} = e^{\beta_{age}(70-60)} = e^{\beta_{age}(10)}$$
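The two hazard-ratio calculations above can be checked numerically. The sketch below is only an illustration: the coefficient values (0.5 for smoking, 0.05 per year of age) and the baseline hazard values are hypothetical, not taken from any fitted model, and the helper names `hazard` and `hazard_ratio` are made up for this example.

```python
import math

# Hypothetical coefficients, chosen only for illustration (not from a fitted model).
BETA = {"smoking": 0.5, "age": 0.05}

def hazard(baseline, covariates, beta=BETA):
    """Cox hazard: baseline hazard times exp(linear predictor)."""
    linear_predictor = sum(beta[name] * value for name, value in covariates.items())
    return baseline * math.exp(linear_predictor)

def hazard_ratio(cov_i, cov_j, beta=BETA):
    """HR for person i versus person j; the baseline hazard cancels."""
    return math.exp(sum(beta[name] * (cov_i[name] - cov_j[name]) for name in beta))

smoker_60 = {"smoking": 1, "age": 60}
nonsmoker_60 = {"smoking": 0, "age": 60}
nonsmoker_70 = {"smoking": 0, "age": 70}

hr_smoking = hazard_ratio(smoker_60, nonsmoker_60)    # exp(beta_smoking)
hr_age_10 = hazard_ratio(nonsmoker_70, nonsmoker_60)  # exp(10 * beta_age)

# The ratio is the same whatever the baseline hazard happens to be at time t.
for baseline in (0.01, 0.2, 3.0):
    ratio = hazard(baseline, smoker_60) / hazard(baseline, nonsmoker_60)
    assert math.isclose(ratio, hr_smoking)
```

Because the baseline hazard cancels, the hazard ratio depends only on the difference in covariate values, which is why a 10-year age difference gives the same ratio at any starting age.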
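$$L_p(\beta) = \prod_{i=1}^{m} L_i = \left(\frac{h_1(1)}{h_1(1)+h_2(1)+h_3(1)+h_4(1)+h_5(1)+h_6(1)}\right) \times \left(\frac{h_2(3)}{h_2(3)+h_3(3)+h_4(3)+h_5(3)+h_6(3)}\right) \times \left(\frac{h_3(4)}{h_3(4)+\dots+h_6(4)}\right) \times \left(\frac{h_5(12)}{h_5(12)+h_6(12)}\right) \times \left(\frac{h_6(18)}{h_6(18)}\right)$$
where m is the number of event times. Each denominator sums the hazards over the risk set: everyone still at risk at that event time.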
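The PL
$$L_p(\beta) = \prod_{i=1}^{m} L_i = \left(\frac{\lambda_0(t=1)\, e^{\beta x_1}}{\lambda_0(1) e^{\beta x_1} + \lambda_0(1) e^{\beta x_2} + \lambda_0(1) e^{\beta x_3} + \lambda_0(1) e^{\beta x_4} + \lambda_0(1) e^{\beta x_5} + \lambda_0(1) e^{\beta x_6}}\right) \times \dots \times \left(\frac{\lambda_0(18)\, e^{\beta x_6}}{\lambda_0(18)\, e^{\beta x_6}}\right)$$
The baseline hazard cancels from every term:
$$L_p(\beta) = \prod_{i=1}^{m} L_i = \left(\frac{e^{\beta x_1}}{e^{\beta x_1} + e^{\beta x_2} + e^{\beta x_3} + e^{\beta x_4} + e^{\beta x_5} + e^{\beta x_6}}\right) \times \dots \times 1$$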
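The PL
$$L_p(\beta) = \prod_{i=1}^{n} \left( \frac{e^{\beta x_i}}{\sum_{j \in R(t_i)} e^{\beta x_j}} \right)^{\delta_i}$$
$$\log L_p(\beta) = \sum_{i=1}^{n} \delta_i \left[ \beta x_i - \log\left( \sum_{j \in R(t_i)} e^{\beta x_j} \right) \right]$$
where $\delta_i$ is 1 if subject i had the event and 0 if censored, and $R(t_i)$ is the risk set at time $t_i$.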
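Maximum likelihood estimation
$$\log L_p(\beta) = \sum_{i=1}^{n} \delta_i \left[ \beta x_i - \log\left( \sum_{j \in R(t_i)} e^{\beta x_j} \right) \right]$$
Once you've written out the log of the PL, maximize the function with respect to $\beta$.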
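As a rough illustration of the maximization step, the sketch below writes out the log partial likelihood for a single covariate and maximizes it by Newton-Raphson. The six-subject dataset is hypothetical (no tied times), and the function names are invented for this example; statistical packages use more general and more careful routines.

```python
import math

# Hypothetical toy data (no tied times): follow-up time, event indicator, one covariate.
times = [1, 3, 4, 10, 12, 18]
events = [1, 1, 1, 0, 1, 1]
x = [1, 0, 1, 1, 0, 0]

def log_partial_likelihood(beta):
    """Sum over events of beta*x_i - log(sum over the risk set of exp(beta*x_j))."""
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i == 1:
            risk_sum = sum(math.exp(beta * x[j]) for j, t_j in enumerate(times) if t_j >= t_i)
            total += beta * x[i] - math.log(risk_sum)
    return total

def score_and_information(beta):
    """First derivative and negative second derivative of the log partial likelihood."""
    score, info = 0.0, 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i == 1:
            risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
            w = [math.exp(beta * x[j]) for j in risk]
            mean_x = sum(wj * x[j] for wj, j in zip(w, risk)) / sum(w)
            mean_x2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk)) / sum(w)
            score += x[i] - mean_x            # observed x minus risk-set weighted mean
            info += mean_x2 - mean_x ** 2     # risk-set weighted variance of x
    return score, info

def fit(beta=0.0, tol=1e-10, max_iter=50):
    """Newton-Raphson: beta <- beta + score / information, until the step is tiny."""
    for _ in range(max_iter):
        score, info = score_and_information(beta)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

beta_hat = fit()
```

At the maximum the score (the first derivative) is zero, so checking that `score_and_information(beta_hat)[0]` is close to 0 is a quick sanity test of convergence.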
Variance of $\hat{\beta}$
$$\mathrm{Var}(\hat{\beta}) = I(\hat{\beta})^{-1}$$
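Hypothesis Testing
H0: $\beta = 0$
1. Wald test:
$$Z = \frac{\hat{\beta} - 0}{\text{asymptotic standard error}(\hat{\beta})}$$
2. Likelihood ratio test:
$$-2 \ln \frac{L_p(\text{reduced})}{L_p(\text{full})}$$
which is compared with a chi-square distribution whose degrees of freedom equal the number of parameters dropped from the full model.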
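Both test statistics are simple to compute once a model has been fitted. The sketch below uses hypothetical values for the estimate, its standard error, and the two log partial likelihoods; it assumes the reduced model drops exactly one parameter, so the likelihood ratio statistic is referred to a chi-square distribution with 1 degree of freedom.

```python
import math

def wald_test(beta_hat, se):
    """Wald Z statistic for H0: beta = 0 and its two-sided normal p-value."""
    z = (beta_hat - 0.0) / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided tail area of N(0, 1)
    return z, p

def likelihood_ratio_test(loglik_reduced, loglik_full):
    """-2 ln(L_reduced / L_full), with a p-value from chi-square on 1 df."""
    stat = -2.0 * (loglik_reduced - loglik_full)
    p = math.erfc(math.sqrt(stat / 2.0))   # upper tail of chi-square with 1 df
    return stat, p

# Hypothetical fitted values, for illustration only.
z, p_wald = wald_test(beta_hat=0.5, se=0.25)
lr, p_lr = likelihood_ratio_test(loglik_reduced=-250.0, loglik_full=-246.0)
```

The inputs are log likelihoods, so the ratio inside the logarithm becomes a difference.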
Ties
In SAS:
option on the model statement:
ties=exact/efron/breslow/discrete
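Exact, continued
$$L = \sum_{i=1}^{15!} P(O_i)$$
$$\text{e.g.: } P(O_{15!}) = \left(\frac{e^{\beta x_{100}}}{e^{\beta x_1} + e^{\beta x_2} + \dots + e^{\beta x_{100}}}\right) \left(\frac{e^{\beta x_{96}}}{e^{\beta x_1} + e^{\beta x_2} + \dots + e^{\beta x_{99}}}\right) \left(\frac{e^{\beta x_{93}}}{e^{\beta x_1} + e^{\beta x_2} + \dots + e^{\beta x_{95}} + e^{\beta x_{97}} + e^{\beta x_{98}} + e^{\beta x_{99}}}\right) \cdots$$
Each P(O_i) has 15 terms; sum the 15! P(O_i)'s.
Hugely complex computation, so approximations are needed.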
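To see why the exact calculation becomes expensive, the sketch below enumerates every ordering of the tied events by brute force and sums the ordering probabilities. The toy risk set (5 subjects, 2 tied events) and the coefficient are hypothetical, and the function name is invented here; this is not how statistical packages implement the exact method.

```python
import math
from itertools import permutations

def exact_tied_likelihood(beta, x_risk, tied):
    """Sum, over every ordering of the tied events, of the probability of that ordering.

    x_risk: covariate values of everyone at risk at the tied time.
    tied:   indices (into x_risk) of the subjects whose events are tied.
    """
    weights = [math.exp(beta * xj) for xj in x_risk]
    total = 0.0
    for ordering in permutations(tied):
        remaining = sum(weights)        # risk-set denominator before the first event
        prob = 1.0
        for idx in ordering:
            prob *= weights[idx] / remaining
            remaining -= weights[idx]   # that subject leaves the risk set
        total += prob
    return total

# Hypothetical toy case: 5 at risk, 2 tied events, so only 2! = 2 orderings.
x_risk = [1, 0, 1, 0, 0]
value = exact_tied_likelihood(beta=0.7, x_risk=x_risk, tied=[0, 1])

# With 15 tied events there are 15! orderings to sum over.
n_orderings = math.factorial(15)
```

With 15 tied events the loop would need more than a trillion orderings, which is why approximations are used in practice.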
Discrete method
Assumes time is truly discrete.
When would time be discrete?
When events are only periodic, such as:
--Winning an Olympic medal (can only happen every 4 years)
--Missing this class (can only happen on Mondays or Wednesdays at 3:15pm)
--Voting for President (can only happen every 4 years)
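Discrete method
$$L_1 = \frac{\prod_{i=1}^{15} \text{Odds}_i}{\prod_{j \in P(1)} \text{Odds}_j + \prod_{j \in P(2)} \text{Odds}_j + \dots}$$
where P(1), P(2), ... are the possible sets of 15 subjects from the risk set.
A recursive algorithm makes it possible to calculate.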
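The structure of the discrete-method term can be illustrated by brute force on a tiny risk set. The sketch below assumes each subject's odds are proportional to exp(beta * x), with the common baseline factor cancelling between numerator and denominator; the function name and data are hypothetical, and real software relies on a recursive algorithm rather than enumerating every subset.

```python
import math
from itertools import combinations

def discrete_tied_likelihood(beta, x_risk, tied):
    """Product of the odds of those who had the event, divided by the sum of
    that product over every possible set of the same size from the risk set."""
    odds = [math.exp(beta * xj) for xj in x_risk]   # odds up to a common factor that cancels
    numerator = math.prod(odds[i] for i in tied)
    denominator = sum(
        math.prod(odds[j] for j in subset)
        for subset in combinations(range(len(x_risk)), len(tied))
    )
    return numerator / denominator

# Hypothetical toy case: 5 at risk, 2 tied events, so C(5, 2) = 10 candidate sets.
value = discrete_tied_likelihood(beta=0.7, x_risk=[1, 0, 1, 0, 0], tied=[0, 1])
```

The number of candidate sets grows combinatorially with the size of the risk set, which is what makes brute-force enumeration impractical for realistic data.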
Ties: conclusion
We'll see how to implement these methods in SAS and compare them (often it doesn't matter much!).
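Evaluation of Proportional Hazards assumption:
Recall proportional hazards concept:
Hazard for person i (e.g., a smoker): $h_i(t) = \lambda_0(t)\, e^{\beta x_i}$
Hazard for person j (e.g., a non-smoker): $h_j(t) = \lambda_0(t)\, e^{\beta x_j}$
Hazard ratio for smoking:
$$HR = \frac{h_i(t)}{h_j(t)} = \frac{\lambda_0(t)\, e^{\beta x_i}}{\lambda_0(t)\, e^{\beta x_j}} = e^{\beta (x_i - x_j)}$$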
$$h_i(t) = \lambda_0(t)\, e^{\beta x_i}$$
$$P_i(X > t) = S_i(t) = e^{-\int_0^t h_i(u)\,du} = e^{-\int_0^t \lambda_0(u)\, e^{\beta x_i}\,du}$$
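Evaluation of Proportional Hazards assumption:
$$h_i(t) = HR \cdot h_j(t)$$
$$S_j(t) = e^{-\int_0^t h_j(u)\,du} \quad \text{and} \quad S_i(t) = e^{-\int_0^t HR\, h_j(u)\,du}$$
$$S_i(t) = e^{-HR \int_0^t h_j(u)\,du} = \left(e^{-\int_0^t h_j(u)\,du}\right)^{HR} \;\Rightarrow\; S_i(t) = S_j(t)^{HR}$$
Take logs, then multiply both sides by a negative and take logs again:
$$\ln S_i(t) = HR \ln S_j(t) \;\Rightarrow\; \ln(-\ln S_i(t)) = \ln(HR) + \ln(-\ln S_j(t))$$
So under proportional hazards the log-minus-log survival curves are parallel, separated by ln(HR).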
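The relationship between the two survival curves can be checked numerically. The sketch below assumes, purely for simplicity, a constant hazard for person j and a hypothetical hazard ratio of 2; neither value comes from the lecture data.

```python
import math

HR = 2.0        # hypothetical hazard ratio of person i versus person j
RATE_J = 0.1    # hypothetical constant hazard for person j (assumption for simplicity)

def survival_j(t):
    """S_j(t) = exp(-integral of h_j) for a constant hazard."""
    return math.exp(-RATE_J * t)

def survival_i(t):
    """S_i(t) when h_i(t) = HR * h_j(t)."""
    return math.exp(-HR * RATE_J * t)

def log_minus_log(s):
    """log(-log(S)), the transformation used in log-minus-log plots."""
    return math.log(-math.log(s))

gaps = []
for t in (1, 5, 10, 20):
    s_i, s_j = survival_i(t), survival_j(t)
    assert math.isclose(s_i, s_j ** HR)   # S_i(t) = S_j(t)^HR
    gaps.append(log_minus_log(s_i) - log_minus_log(s_j))
```

Every entry of `gaps` equals ln(HR): the vertical distance between the two log-minus-log curves does not change with time, which is what a plot of these curves is meant to reveal.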
Evaluation of Proportional Hazards assumption:
e.g., graph we'll produce in lab
Interaction coefficient: the coefficient on the covariate multiplied by time.
If the interaction coefficient is significant, this indicates non-proportionality, and at the same time its inclusion in the model corrects for non-proportionality!
A negative value indicates that the effect of x decreases linearly with time.
A positive value indicates that the effect of x increases linearly with time.
This introduces the concept of a time-dependent covariate.
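When a covariate-by-time interaction is in the model, the hazard ratio for x is no longer a single number. A minimal sketch, assuming a model with terms for x and x multiplied by time and using hypothetical coefficient values:

```python
import math

# Hypothetical coefficients for illustration: main effect of x and the x-by-time interaction.
BETA_X = 0.8
BETA_XT = -0.05

def hazard_ratio_at(t, beta_x=BETA_X, beta_xt=BETA_XT):
    """HR for a one-unit difference in x at time t when the model includes x and x*t."""
    return math.exp(beta_x + beta_xt * t)

# The log hazard ratio changes linearly with time; a negative interaction shrinks the effect.
hr_by_time = {t: hazard_ratio_at(t) for t in (0, 6, 12, 24)}
```

Setting the interaction coefficient to 0 recovers a constant hazard ratio, that is, proportional hazards.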
Time-dependent covariates
Time-dependent covariates: Example data
4 events
ID  Time  Fracture  StartOC  StopOC
1   12    1         0        12
2   11    0         10       11
3   20    1         .        .
4   24    0         0        24
5   19    0         0        11
6   6     1         .        .
7   17    1         1        7
1. Time-independent predictor
Time-dependent covariates
Order by Time
ID  Time  Fracture  StartOC  StopOC
6   6     1         .        .
2   11    0         10       11
1   12    1         0        12
7   17    1         1        7
5   19    0         0        11
3   20    1         .        .
4   24    0         0        24
Time-dependent covariates
3 OC users at baseline
4 non-users at baseline
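$$L_p(\beta_{oc}) = \frac{e^{\beta(0)}}{5e^{\beta(1)} + 2e^{\beta(0)}} \times \frac{e^{\beta(1)}}{4e^{\beta(1)} + e^{\beta(0)}} \times \frac{e^{\beta(1)}}{3e^{\beta(1)} + e^{\beta(0)}} \times \frac{e^{\beta(0)}}{e^{\beta(1)} + e^{\beta(0)}}$$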
Time-dependent...
Time-dependent covariates
First event at time 6
The PL at t=6
$$L_p(\beta_{oc}) = \frac{e^{\beta x_6(t=6)}}{e^{\beta x_1(6)} + e^{\beta x_2(6)} + e^{\beta x_3(6)} + e^{\beta x_4(6)} + e^{\beta x_5(6)} + e^{\beta x_6(6)} + e^{\beta x_7(6)}} = \frac{e^{\beta(0)}}{3e^{\beta(0)} + 4e^{\beta(1)}}$$
x is time-dependent
Time-dependent covariates
Second event at time 12
The PL at t=12
$$L_p(\beta_{oc}) = \frac{e^{\beta(0)}}{3e^{\beta(0)} + 4e^{\beta(1)}} \times \frac{e^{\beta(1)}}{2e^{\beta(1)} + 3e^{\beta(0)}}$$
Time-dependent covariates
Third event at time 17
The PL at t=17
$$L_p(\beta_{oc}) = \frac{e^{\beta(0)}}{3e^{\beta(0)} + 4e^{\beta(1)}} \times \frac{e^{\beta(1)}}{2e^{\beta(1)} + 3e^{\beta(0)}} \times \frac{e^{\beta(0)}}{e^{\beta(1)} + 3e^{\beta(0)}}$$
Time-dependent covariates
Fourth event at time 20
The PL at t=20
$$L_p(\beta_{oc}) = \frac{e^{\beta(0)}}{3e^{\beta(0)} + 4e^{\beta(1)}} \times \frac{e^{\beta(1)}}{2e^{\beta(1)} + 3e^{\beta(0)}} \times \frac{e^{\beta(0)}}{e^{\beta(1)} + 3e^{\beta(0)}} \times \frac{e^{\beta(0)}}{e^{\beta(1)} + e^{\beta(0)}}$$
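The four terms of the time-dependent partial likelihood can be rebuilt directly from the example data by re-evaluating each subject's OC status at every fracture time. The sketch below is one way to do that bookkeeping; it assumes a subject counts as an OC user at time t when StartOC <= t <= StopOC (the convention that reproduces the terms shown for t=12), and the function names are invented for this illustration.

```python
import math

# Example data from the table: (id, time, fracture, start_oc, stop_oc); None = no OC use recorded.
DATA = [
    (1, 12, 1, 0, 12), (2, 11, 0, 10, 11), (3, 20, 1, None, None), (4, 24, 0, 0, 24),
    (5, 19, 0, 0, 11), (6, 6, 1, None, None), (7, 17, 1, 1, 7),
]

def on_oc(start, stop, t):
    """1 if the subject is using OCs at time t, else 0 (assumed interval: start <= t <= stop)."""
    return int(start is not None and start <= t <= stop)

def risk_set_terms():
    """For each fracture time: (time, x of the case, users at risk, non-users at risk)."""
    terms = []
    for _, t_event, fracture, start, stop in sorted(DATA, key=lambda row: row[1]):
        if fracture == 1:
            at_risk = [row for row in DATA if row[1] >= t_event]
            users = sum(on_oc(r[3], r[4], t_event) for r in at_risk)
            terms.append((t_event, on_oc(start, stop, t_event), users, len(at_risk) - users))
    return terms

def partial_likelihood(beta):
    """Product over fracture times of e^(beta*x_case) / (users*e^beta + nonusers)."""
    value = 1.0
    for _, x_case, users, nonusers in risk_set_terms():
        value *= math.exp(beta * x_case) / (users * math.exp(beta) + nonusers)
    return value
```

Calling `risk_set_terms()` returns one tuple per fracture, and the user and non-user counts in each tuple are the multipliers that appear in the corresponding denominator.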