# Kaplan-Meier Survival Curves

Survival analysis is used in many contexts.  Some examples include:

• Medical research: fraction of patients living for a certain amount of time after treatment.
• Economics: length of time people remain unemployed after a job loss.
• Engineering: time until failure of machine parts.
• Ecology: how long fleshy fruits remain on plants before they are removed by frugivores.

How can you estimate the probability of survival from empirical data?  One method is to compute a Kaplan-Meier curve (also known as the Kaplan-Meier estimator).  The Kaplan-Meier curve estimates survival as the share of individuals who have not yet failed (e.g., broken, died, found a job) by a given point in time.

One advantage of the Kaplan-Meier curve is that it can take censored data into account.  Examples include cases where the study ends before failure (in the medical case, often death) is observed, and cases where an individual leaves observation within the study horizon for other reasons (e.g., removal of a part before failure, patient attrition).
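To make the role of censoring concrete, here is a minimal Python sketch of the standard product-limit calculation.  The function name `kaplan_meier` and the toy data are illustrative assumptions, not taken from the Excel example.  At each time with at least one observed failure, the running survival probability is multiplied by (1 - failures / number at risk); censored individuals simply drop out of the at-risk count without counting as failures.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct failure time.

    durations: time until failure or censoring for each individual
    observed:  True if the failure was observed, False if censored
    """
    failures = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # failures and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)
        if d:
            # Share of those still at risk who survive past time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone who failed or was censored at t is no longer at risk
        at_risk -= exits[t]
    return curve


# Hypothetical data: six individuals, two of them censored (False)
durations = [1, 2, 2, 3, 4, 5]
observed = [True, True, False, True, False, True]
for time, prob in kaplan_meier(durations, observed):
    print(f"t={time}: S(t)={prob:.3f}")
```

Note that the individual censored at time 4 never lowers the curve directly; the censoring only shrinks the at-risk group used for the failure at time 5.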

Kaplan-Meier curves are particularly useful for measuring the effectiveness of medical treatments.  In randomized controlled trials (RCTs) comparing a drug treatment against a placebo, one can create two Kaplan-Meier curves to compare the survival probabilities of the two trial arms.  Visually, one can often clearly see whether the treatment is more effective than the control.

However, is there a statistical method for determining whether the two survival functions are equivalent?  In fact, the answer is yes.  It is called the log-rank test, and it tests the null hypothesis that the two groups share the same survival function.
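As a rough illustration, the two-group log-rank statistic can be sketched in plain Python.  The function name `logrank_test` and the sample data are hypothetical.  The sketch uses the usual textbook form: at each failure time, compare observed failures in group A with the number expected if both groups shared one survival function, then refer the squared sum, divided by its variance, to a chi-square distribution with one degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Return (chi_square, p_value) for a two-group log-rank test.

    times_*:  time until failure or censoring for each individual
    events_*: True if the failure was observed, False if censored
    """
    failure_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in failure_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of group A's failures at time t
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("not enough data to compute the test statistic")
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical trial arms with no censoring
control = [1, 2, 3, 4, 5]
treated = [10, 11, 12, 13, 14]
chi_square, p_value = logrank_test(control, [True] * 5, treated, [True] * 5)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the two survival functions differ; with identical arms the statistic is zero and the p-value is one.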

The Healthcare Economist has worked through an example in Excel to demonstrate how to calculate the survival functions as well as the log-rank test (EXAMPLE).