11.6 Kaplan-Meier survival curve

  • Because of censoring we do not observe the survival time $T$ for every subject, so we cannot estimate $S(t)$ by simply counting how many are alive at each point in the study.

  • Define:

    • $d_j$: the distinct times at which a death is observed, in increasing order.

    • $r_j$: the number of subjects still alive and uncensored just before $d_j$ (the number at risk).

    • $q_j$: the number that die at time $d_j$ (typically just 1!)

  • The ratio $(r_j - q_j)/r_j$ is the fraction of those at risk that survive past time $d_j$.

  • This fraction is an estimate of the probability $\Pr(T > d_j \mid T > d_{j-1})$.

Note that the estimate at time $d_j$ uses only the subjects who are still uncensored at $d_j$, but this includes subjects who could become censored later! It takes care of censoring ‘automatically’.
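
As a small illustration with made-up numbers (not taken from the text), suppose six subjects have observed times 2, 3+, 4, 5+, 6, 7+, where + marks a censored time. The death times are then $d_1 = 2$, $d_2 = 4$, $d_3 = 6$, and the fractions surviving past each one are:

$$
\begin{aligned}
r_1 = 6,\; q_1 = 1 &: \quad \frac{r_1 - q_1}{r_1} = \frac{5}{6} \\
r_2 = 4,\; q_2 = 1 &: \quad \frac{r_2 - q_2}{r_2} = \frac{3}{4} \\
r_3 = 2,\; q_3 = 1 &: \quad \frac{r_3 - q_3}{r_3} = \frac{1}{2}
\end{aligned}
$$

The subject censored at time 3 is counted among the six at risk at $d_1 = 2$ but has left the risk set by $d_2 = 4$, so that subject still contributes information for as long as they were observed.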

  • The text shows how one can decompose $S(d_k)$ into these more elemental probabilities:

$$S(d_k) = \Pr(T > d_k \mid T > d_{k-1}) \times \cdots \times \Pr(T > d_2 \mid T > d_1)\,\Pr(T > d_1)$$

  • This leads to the Kaplan-Meier estimator:

$$\hat{S}(d_k) = \prod_{j=1}^{k} \left( \frac{r_j - q_j}{r_j} \right)$$

  • Note also that:

$$\ln \hat{S}(d_k) = \sum_{j=1}^{k} \ln\left( \frac{r_j - q_j}{r_j} \right)$$
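
The estimator can be sketched in a few lines of plain Python. This is only a minimal illustration, not the text's or any library's implementation: the function name `kaplan_meier`, the `times`/`events` encoding (1 = death observed, 0 = censored), and the six data points are all assumptions made up for the example.

```python
import math

def kaplan_meier(times, events):
    """Return a list of (d_j, r_j, q_j, S_hat(d_j)) tuples.

    times  : observed time per subject (death time or censoring time)
    events : 1 if the death was observed, 0 if the subject was censored
    (Hypothetical helper for illustration only.)
    """
    # d_j: distinct times at which at least one death was observed
    death_times = sorted({t for t, e in zip(times, events) if e == 1})

    rows = []
    surv = 1.0
    for d in death_times:
        # r_j: subjects still under observation just before d_j
        r = sum(1 for t in times if t >= d)
        # q_j: deaths observed exactly at d_j
        q = sum(1 for t, e in zip(times, events) if t == d and e == 1)
        # Running product of (r_j - q_j) / r_j
        surv *= (r - q) / r
        rows.append((d, r, q, surv))
    return rows

# Made-up data: three deaths (times 2, 4, 6) and three censored subjects
times = [2, 3, 4, 5, 6, 7]
events = [1, 0, 1, 0, 1, 0]

rows = kaplan_meier(times, events)
for d, r, q, s in rows:
    print(f"d_j={d}  r_j={r}  q_j={q}  S_hat={s:.4f}")

# The log form: summing ln((r_j - q_j) / r_j) and exponentiating
# should recover the same final estimate as the running product.
log_sum = sum(math.log((r - q) / r) for _, r, q, _ in rows)
same = math.isclose(math.exp(log_sum), rows[-1][3])
```

With this toy data the risk sets are 6, 4 and 2, so the estimates are the running products 5/6, 5/8 and 5/16. The log form would fail if the last subject at risk died (the final factor would be 0), which is why the example ends with a censored observation.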