Censoring Mechanisms
Data are censored if we do not know the exact values of
each observation but we do have information about the
value of each observation relative to one or more bounds.
Censoring is a key feature of survival data and has a
profound impact on the estimation procedure. Censoring
occurs in a number of different forms:
Right Censoring
Data are right censored if the censoring mechanism cuts
short observations in progress.
Eg: the ending of a mortality investigation before all the
lives being observed have died. Persons still alive when the
investigation ends are right censored: we only know that
their lifetimes exceed some value.
[Figure: timeline from Start (t = 0); observation is censored at time t, before Death, so T > t.]
Left Censoring
Data are left censored if the censoring mechanism prevents
us from knowing when entry into the state that we wish to
observe took place.
Eg: medical studies in which patients are subject to
regular examinations. Discovery of a condition tells us only
that the onset fell in the period since the previous
examination; the time elapsed since onset has been left
censored.
[Figure: timeline marking Fell ill (t1), Found ill (t2) and Death (t3).]
Interval Censoring
Data are interval censored if the observational plan only
allows us to say that an event of interest fell within some
interval of time.
Eg: an actuarial investigation where we only know the
calendar year of death. Both right and left censoring can be
seen as special cases of interval censoring.
[Figure: timeline from Start (t = 0); Alive at t1, Dead at t2, with Death falling between t1 and t2.]
Random Censoring
If the censoring is random then the time Ci at which the
observation of the i-th lifetime is censored is a random
variable. The observation will be censored if Ci < Ti, where
Ti is the (random) lifetime of the i-th life.
Random censoring is a special case of right censoring.
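The rule "censored if Ci < Ti" can be sketched in a few lines of Python. This is only an illustration: the function name `observe` and the lifetimes and censoring times below are made up, not taken from any study.

```python
def observe(lifetimes, censor_times):
    """Return (observed_time, died) pairs under right censoring."""
    observed = []
    for t, c in zip(lifetimes, censor_times):
        if c < t:
            # Censored: we only learn that the lifetime exceeds c.
            observed.append((c, False))
        else:
            # Death observed at the true lifetime t.
            observed.append((t, True))
    return observed

# Hypothetical values for illustration only.
T = [2.5, 7.0, 4.0, 9.5]   # lifetimes Ti
C = [5.0, 3.0, 6.0, 9.5]   # censoring times Ci
data = observe(T, C)
print(data)
```

In practice only the pairs in `data` are available to the analyst; the underlying Ti of a censored life is never seen.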
Non-Informative Censoring
Censoring is non-informative if it gives no information
about the lifetimes {Ti}. If, for each life i, Ti and Ci are
independent, then the censoring is non-informative.
Note: if we are dealing with events that are statistically
independent the likelihood function representing all the
events is simply the product of the likelihood functions for
each individual event. This greatly simplifies the
mathematics required in the analysis!
Type I Censoring
If the censoring times {Ci} are known in advance then the
mechanism is called Type I Censoring.
Eg: a mortality investigation may end on a particular date.
Type II Censoring
If observation is continued until a predetermined number of
deaths has occurred then Type II censoring is said to be
present. In this case the number of deaths is non-random.
Eg: Type II censoring occurs in reliability testing when
components may be tested until a certain number fail. This
kind of censoring is uncommon in mortality studies.
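The difference between the two schemes can be illustrated with a small Python sketch. The function names `type_i` and `type_ii` and the lifetimes are hypothetical, and the Type II version assumes no tied lifetimes.

```python
def type_i(lifetimes, end_time):
    """Type I: observation stops at a time fixed in advance."""
    return [(min(t, end_time), t <= end_time) for t in lifetimes]

def type_ii(lifetimes, r):
    """Type II: observation stops at the r-th death, r fixed in advance.

    Assumes no ties among the lifetimes.
    """
    stop = sorted(lifetimes)[r - 1]  # time of the r-th death
    return [(min(t, stop), t <= stop) for t in lifetimes]

# Hypothetical component lifetimes, for illustration only.
lifetimes = [12.0, 3.0, 8.0, 20.0, 5.0]

print(type_i(lifetimes, 10.0))  # stopping time fixed; number of deaths varies
print(type_ii(lifetimes, 3))    # number of deaths fixed; stopping time varies
```

Under Type I the stopping time is fixed and the number of deaths depends on the data; under Type II it is the other way round.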
If $c_j$ lives are censored in the interval $(t_j, t_{j+1})$ then
the times at which the lives are censored within this
interval are $t_{j1}, t_{j2}, \ldots, t_{jc_j}$.
Example
A group of 15 laboratory rats are injected with a new drug.
They are observed over the next 30 days.
The following events occur:
Day   Event
3     Rat 4 dies from effects of drug.
4     Rat 13 dies from effects of drug.
6     Rat 7 gnaws through bars of cage and escapes.
11    Rats 6 and 9 die from effects of drug.
17    Rat 1 killed by other rats.
21    Rat 10 dies from effects of drug.
24    Rat 8 freed during raid by animal liberation activists.
25    Rat 12 accidentally freed by journalist reporting earlier raid.
26    Rat 5 dies from effects of drug.
30    Investigation closes. Remaining rats hold street party.
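As a rough sketch, the event list can be turned into the numbers of deaths d_j and lives at risk n_j at each death time. The encoding below is an assumption: only deaths from the drug count as deaths, while the escape, the killing by other rats, the two releases and survival to day 30 are all treated as censoring. That matches the death times 3, 4, 11, 21 and 26 used in the solution later in these notes.

```python
from collections import Counter

N_RATS = 15

# (day, died) for each rat that left observation before day 30.
# died=True: death from the drug; died=False: censored.
events = [
    (3, True), (4, True), (6, False), (11, True), (11, True),
    (17, False), (21, True), (24, False), (25, False), (26, True),
]
# The remaining rats are censored when the investigation closes.
events += [(30, False)] * (N_RATS - len(events))

deaths = Counter(day for day, died in events if died)
death_times = sorted(deaths)
d = [deaths[t] for t in death_times]
# A rat is at risk at time t if it is still under observation just before t.
n = [sum(1 for day, _ in events if day >= t) for t in death_times]

for t_j, d_j, n_j in zip(death_times, d, n):
    print(f"t_j = {t_j:2d}  d_j = {d_j}  n_j = {n_j}")
```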
The probability of a death at time $t_j$ is $P(T = t_j) = F(t_j) - F(t_j^-)$.
Censored Observations
The probability that a life should survive to be censored at
time $t_{jl}$ is $1 - F(t_{jl})$ (assuming non-informative
censoring). As observed deaths and censored observations are
assumed to be independent, the total likelihood is:
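$$L = \prod_{j=1}^{k} \left( P(T = t_j) \right)^{d_j} \prod_{j=0}^{k} \prod_{l=1}^{c_j} P(T > t_{jl})$$

$$L = \prod_{j=1}^{k} \left( F(t_j) - F(t_j^-) \right)^{d_j} \prod_{j=0}^{k} \prod_{l=1}^{c_j} \left( 1 - F(t_{jl}) \right)$$

$$= \prod_{j=1}^{k} \left( \frac{F(t_j) - F(t_j^-)}{1 - F(t_j^-)} \right)^{d_j} \left( 1 - F(t_j^-) \right)^{d_j} \prod_{j=0}^{k} \prod_{l=1}^{c_j} \left( 1 - F(t_{jl}) \right)$$

Taking $F$ to be constant between death times, so that
$F(t_{jl}) = F(t_j)$:

$$= \prod_{j=1}^{k} \left( \frac{F(t_j) - F(t_j^-)}{1 - F(t_j^-)} \right)^{d_j} \prod_{j=1}^{k} \left( 1 - F(t_j^-) \right)^{d_j} \prod_{j=0}^{k} \left( 1 - F(t_j) \right)^{c_j}$$

$$= \prod_{j=1}^{k} \left( \frac{P(T = t_j)}{P(T \ge t_j)} \right)^{d_j} \prod_{j=1}^{k} \left( P(T \ge t_j) \right)^{d_j} \prod_{j=0}^{k} \left( P(T > t_j) \right)^{c_j}$$

Writing $\lambda_j = P(T = t_j \mid T \ge t_j)$ for the discrete
hazard, and using $P(T > t_j) = P(T \ge t_{j+1})$ and
$P(T > t_0) = 1$:

$$= \prod_{j=1}^{k} \lambda_j^{d_j} \prod_{j=1}^{k} \left( P(T \ge t_j) \right)^{d_j} \left( P(T \ge t_{j+1}) \right)^{c_j}$$

$$= \prod_{j=1}^{k} \lambda_j^{d_j} \, \left( P(T \ge t_2) \right)^{c_1} \prod_{j=2}^{k} \left( P(T \ge t_j) \right)^{d_j} \left( P(T \ge t_{j+1}) \right)^{c_j}$$

where the last line uses $P(T \ge t_1) = 1$.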
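Also, since:

$$1 - \lambda_j = P(T > t_j \mid T \ge t_j) \quad \text{for } j = 1, 2, \ldots, k$$

we can write:

$$(1 - \lambda_1)(1 - \lambda_2) = P(T > t_1 \mid T \ge t_1) \, P(T > t_2 \mid T \ge t_2) = \frac{P(T > t_1)}{P(T \ge t_1)} \cdot \frac{P(T > t_2)}{P(T \ge t_2)} = P(T > t_2)$$

More generally:

$$(1 - \lambda_1)(1 - \lambda_2) \cdots (1 - \lambda_j) = P(T > t_j) = P(T \ge t_{j+1}) \quad \text{for } j = 1, 2, \ldots, k$$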
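$$1 - F(t) = \prod_{t_j \le t} (1 - \lambda_j)$$

$$\left( P(T \ge t_j) \right)^{d_j} \left( P(T \ge t_{j+1}) \right)^{c_j} = (1 - \lambda_1)^{d_j + c_j} \cdots (1 - \lambda_{j-1})^{d_j + c_j} (1 - \lambda_j)^{c_j} \quad \text{for } j = 2, 3, \ldots, k.$$

As $n_j = d_j + c_j + n_{j+1}$ we have:

$$\left( P(T \ge t_j) \right)^{d_j} \left( P(T \ge t_{j+1}) \right)^{c_j} = (1 - \lambda_1)^{n_j - n_{j+1}} \cdots (1 - \lambda_{j-1})^{n_j - n_{j+1}} (1 - \lambda_j)^{c_j}$$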
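$$\left( P(T \ge t_2) \right)^{c_1} \prod_{j=2}^{k} \left( P(T \ge t_j) \right)^{d_j} \left( P(T \ge t_{j+1}) \right)^{c_j} = (1 - \lambda_1)^{c_1} \prod_{j=2}^{k} (1 - \lambda_1)^{n_j - n_{j+1}} \cdots (1 - \lambda_{j-1})^{n_j - n_{j+1}} (1 - \lambda_j)^{c_j}$$

$$= (1 - \lambda_1)^{c_1} \left[ (1 - \lambda_1)^{n_2 - n_3} (1 - \lambda_2)^{c_2} \right] \left[ (1 - \lambda_1)^{n_3 - n_4} (1 - \lambda_2)^{n_3 - n_4} (1 - \lambda_3)^{c_3} \right] \cdots \left[ (1 - \lambda_1)^{n_k - n_{k+1}} \cdots (1 - \lambda_{k-1})^{n_k - n_{k+1}} (1 - \lambda_k)^{c_k} \right]$$

$$= \prod_{j=1}^{k} (1 - \lambda_j)^{c_j + n_{j+1}} = \prod_{j=1}^{k} (1 - \lambda_j)^{n_j - d_j}$$

(with $n_{k+1} = 0$). Hence:

$$L = \prod_{j=1}^{k} \lambda_j^{d_j} \prod_{j=1}^{k} (1 - \lambda_j)^{n_j - d_j} = \prod_{j=1}^{k} \lambda_j^{d_j} (1 - \lambda_j)^{n_j - d_j}$$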
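The likelihood is maximised by:

$$\hat{\lambda}_j = \frac{d_j}{n_j} \qquad (1 \le j \le k)$$

(Proof on board)
In other words we estimate the discrete hazard function,
$\lambda_j$, by dividing the number of deaths observed at time
$t_j$ by the number of lives at risk at time $t_j$
($1 \le j \le k$). This is a very intuitive result.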
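The slides leave the proof to the board. One way to fill it in, assuming the likelihood takes the product form $L = \prod_j \lambda_j^{d_j}(1-\lambda_j)^{n_j-d_j}$ and that $0 < d_j < n_j$, is to maximise the log-likelihood separately in each $\lambda_j$:

```latex
\begin{align*}
\log L &= \sum_{j=1}^{k} \left[ d_j \log \lambda_j + (n_j - d_j) \log (1 - \lambda_j) \right] \\
\frac{\partial \log L}{\partial \lambda_j}
  &= \frac{d_j}{\lambda_j} - \frac{n_j - d_j}{1 - \lambda_j} = 0
  \;\Longrightarrow\; d_j (1 - \lambda_j) = (n_j - d_j) \lambda_j
  \;\Longrightarrow\; \hat{\lambda}_j = \frac{d_j}{n_j} \\
\frac{\partial^2 \log L}{\partial \lambda_j^2}
  &= -\frac{d_j}{\lambda_j^2} - \frac{n_j - d_j}{(1 - \lambda_j)^2} < 0
\end{align*}
```

The negative second derivative confirms that the stationary point is a maximum. Each term of the sum involves only one $\lambda_j$, which is why the hazards can be estimated one death time at a time.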
Example
Using the data from the observation of laboratory rats,
calculate the Kaplan-Meier estimate of F(t).
Solution
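j   t_j   d_j   n_j   λ_j = d_j/n_j   1 - λ_j   1 - ∏(1 - λ_k)
1    3     1    15    0.06667         0.93333   0.06667
2    4     1    14    0.07143         0.92857   0.13333
3   11     2    12    0.16667         0.83333   0.27778
4   21     1     9    0.11111         0.88889   0.35802
5   26     1     6    0.16667         0.83333   0.46502

The product in the last column runs over k = 1, ..., j.

Therefore:

$$\hat{F}(t) = \begin{cases}
0 & \text{for } 0 \le t < 3 \\
0.06667 & \text{for } 3 \le t < 4 \\
0.13333 & \text{for } 4 \le t < 11 \\
0.27778 & \text{for } 11 \le t < 21 \\
0.35802 & \text{for } 21 \le t < 26 \\
0.46502 & \text{for } t \ge 26
\end{cases}$$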
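The tabular calculation can be checked with a short Python sketch. The function name `kaplan_meier` is illustrative, not a library call, and the inputs are the death times, deaths and numbers at risk from the rat example.

```python
def kaplan_meier(death_times, d, n):
    """Return [(t_j, F_hat(t_j))], where F_hat = 1 - prod(1 - d_j / n_j)."""
    survival = 1.0
    estimates = []
    for t_j, d_j, n_j in zip(death_times, d, n):
        survival *= 1.0 - d_j / n_j   # multiply in the factor (1 - lambda_j)
        estimates.append((t_j, 1.0 - survival))
    return estimates

# Rat example: death times, deaths and lives at risk.
t = [3, 4, 11, 21, 26]
d = [1, 1, 2, 1, 1]
n = [15, 14, 12, 9, 6]

estimates = kaplan_meier(t, d, n)
for t_j, F in estimates:
    print(f"F_hat(t) = {F:.5f} from t = {t_j}")
```

Carrying full precision through the product, rather than rounding each row, avoids small discrepancies in the fifth decimal place of the later rows.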
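Greenwood's formula for the variance of the Kaplan-Meier
estimate:

$$\operatorname{Var}[\hat{F}(t)] \approx \left( 1 - \hat{F}(t) \right)^2 \sum_{t_j \le t} \frac{d_j}{n_j (n_j - d_j)}$$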
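The integrated hazard is:

$$\Lambda_t = \int_0^t \lambda_s \, ds + \sum_{t_j \le t} \lambda_j$$

Estimating each discrete hazard by

$$\hat{\lambda}_j = \frac{d_j}{n_j}$$

gives the Nelson-Aalen estimate:

$$\hat{\Lambda}_t = \sum_{t_j \le t} \frac{d_j}{n_j}$$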
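As an illustration, the Nelson-Aalen estimate of the integrated hazard is a running sum of d_j / n_j over the death times. The sketch below applies it to the rat data; the function name `nelson_aalen` is hypothetical.

```python
def nelson_aalen(death_times, d, n):
    """Return [(t_j, Lambda_hat(t_j))], the cumulative sum of d_j / n_j."""
    cumulative = 0.0
    estimates = []
    for t_j, d_j, n_j in zip(death_times, d, n):
        cumulative += d_j / n_j
        estimates.append((t_j, cumulative))
    return estimates

# Rat example: death times, deaths and lives at risk.
t = [3, 4, 11, 21, 26]
d = [1, 1, 2, 1, 1]
n = [15, 14, 12, 9, 6]

hazard = nelson_aalen(t, d, n)
for t_j, lam in hazard:
    print(f"Lambda_hat = {lam:.5f} from t = {t_j}")
```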
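Variance of the Nelson-Aalen estimate:

$$\operatorname{Var}[\hat{\Lambda}_t] \approx \sum_{t_j \le t} \frac{d_j (n_j - d_j)}{n_j^3}$$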
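$$\hat{F}_{KM}(t) = 1 - \prod_{t_j \le t} \left( 1 - \frac{d_j}{n_j} \right) = 1 - \left( 1 - \frac{d_1}{n_1} \right) \left( 1 - \frac{d_2}{n_2} \right) \cdots \left( 1 - \frac{d_s}{n_s} \right)$$

$$\approx 1 - e^{-d_1 / n_1} \, e^{-d_2 / n_2} \cdots e^{-d_s / n_s} = 1 - \exp\left( -\sum_{t_j \le t} \frac{d_j}{n_j} \right) = 1 - \exp(-\hat{\Lambda}_t) = \hat{F}_{NA}(t)$$

where $t_s$ is the last death time not exceeding $t$, and the
approximation uses $1 - x \approx e^{-x}$ for small $x$.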
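To see how close the two estimates are in practice, the following sketch compares them on the rat data. Since exp(-x) >= 1 - x, the Nelson-Aalen based estimate of F(t) can never exceed the Kaplan-Meier one. The variable names are illustrative.

```python
import math

# Rat example: deaths and lives at risk at each death time.
death_times = [3, 4, 11, 21, 26]
d = [1, 1, 2, 1, 1]
n = [15, 14, 12, 9, 6]

survival_km = 1.0   # running product of (1 - d_j / n_j)
cum_hazard = 0.0    # running sum of d_j / n_j
rows = []
for t_j, d_j, n_j in zip(death_times, d, n):
    survival_km *= 1.0 - d_j / n_j
    cum_hazard += d_j / n_j
    f_km = 1.0 - survival_km
    f_na = 1.0 - math.exp(-cum_hazard)
    rows.append((t_j, f_km, f_na))

for t_j, f_km, f_na in rows:
    print(f"t = {t_j:2d}  F_KM = {f_km:.5f}  F_NA = {f_na:.5f}")
```

The gap widens at later death times, where d_j / n_j is larger and the approximation is rougher.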
Example
The following data relate to 12 patients who had an
operation. Time 0 is the start of the investigation.

Patient   Time of operation   Time observation   Reason observation
No.       (in weeks)          ended (in weeks)   ended
1          0                  120                Censored
2          0                   68                Death
3          0                   40                Death
4          4                  120                Censored
5          5                   35                Censored
6         10                   40                Death
7         20                  120                Censored
8         44                  115                Death
9         50                   90                Death
10        63                   98                Death
11        70                  120                Death
12        80                  110                Death
You can assume that the censoring was non-informative
with regard to the survival of any individual patient.
(i)
(ii)
(iii)
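Because the patients join the investigation at different times, a natural first step is to measure each patient's duration from the operation rather than from time 0. This is an assumption here, not something the slide states. The sketch below does that and tabulates deaths and lives at risk, with the further assumption that a life censored at exactly a death time is still counted as at risk at that time.

```python
# (patient, operation_week, end_week, died) from the patient table.
patients = [
    (1, 0, 120, False), (2, 0, 68, True), (3, 0, 40, True),
    (4, 4, 120, False), (5, 5, 35, False), (6, 10, 40, True),
    (7, 20, 120, False), (8, 44, 115, True), (9, 50, 90, True),
    (10, 63, 98, True), (11, 70, 120, True), (12, 80, 110, True),
]

# Duration since operation, in weeks, with the death indicator.
durations = [(end - start, died) for _, start, end, died in patients]

death_times = sorted({t for t, died in durations if died})
d = [sum(1 for t, died in durations if died and t == t_j) for t_j in death_times]
# Assumption: censoring at t_j is taken to occur just after the deaths at t_j.
n = [sum(1 for t, _ in durations if t >= t_j) for t_j in death_times]

for t_j, d_j, n_j in zip(death_times, d, n):
    print(f"t_j = {t_j:3d}  d_j = {d_j}  n_j = {n_j}")
```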