Survival analysis is used in many contexts. Some examples include:
- Medical research: fraction of patients living for a certain amount of time after treatment.
- Economics: length of time people remain unemployed after a job loss.
- Engineering: time until failure of machine parts.
- Ecology: how long fleshy fruits remain on plants before they are removed by frugivores.
How can you estimate the probability of survival from empirical data? One method is to compute a Kaplan-Meier curve (or Kaplan-Meier estimator). The Kaplan-Meier curve calculates survival as the share of individuals who do not fail (e.g., break, die, find a job) before a given point in time.
One advantage of the Kaplan-Meier curve is that it can take censored data into account. Censoring occurs when a study ends before failure is observed (in the medical case, often death) or when observation stops for other reasons within the study horizon (e.g., removal of a part before failure, patient attrition).
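The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `kaplan_meier` and the sample data are hypothetical, and the code assumes the common convention that an individual censored at time t still counts as at risk at time t. At each time with at least one observed failure, the running survival probability is multiplied by (1 - failures / number at risk).

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct failure time.

    durations: observed time for each individual
    events:    1 if failure was observed at that time, 0 if censored
    """
    failures = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # failures and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)
        if d:
            # Share of those still at risk who survive past time t
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who failed or was censored at t leaves the risk set
        at_risk -= exits[t]
    return curve


# Hypothetical data: six individuals, two of them censored (event = 0)
durations = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Note that the censored individuals never trigger a drop in the curve; they only shrink the number at risk for later time points, which is how the estimator uses the partial information they provide.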
Kaplan-Meier curves are particularly useful for measuring the effectiveness of medical treatments. In randomized controlled trials (RCTs) comparing a drug treatment against a placebo, one can create two Kaplan-Meier curves to compare the survival probabilities of the two trial arms. Visually, one can often clearly see whether the treatment is more effective than the control.
However, is there a statistical method for determining whether the two survival functions differ by more than chance? The answer is yes. It is called the log-rank test, and it tests the null hypothesis that the two survival functions are identical.
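For readers who prefer code to spreadsheets, here is a rough Python sketch of a two-group log-rank test. It is an illustration under stated assumptions rather than a definitive implementation: the function name `logrank_test` and the sample data are hypothetical, and the sketch uses the standard formulation in which observed failures in one arm are compared with the number expected if both arms shared the same survival function, with a hypergeometric variance and a chi-square distribution with one degree of freedom for the p-value.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi_square, p_value).

    times_*:  observed time for each individual in the arm
    events_*: 1 if failure was observed, 0 if censored
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    failure_times = sorted({t for t, e in zip(all_times, all_events) if e})

    observed_a = expected_a = variance = 0.0
    for t in failure_times:
        # Number still at risk in each arm just before time t
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # Failures at time t in each arm
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b

        observed_a += d_a
        expected_a += d * n_a / n  # expected failures in arm A under the null
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("variance is zero; the test is undefined for this data")

    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical trial: every control patient fails before any treated patient
control_times, control_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
treated_times, treated_events = [6, 7, 8, 9, 10], [1, 1, 1, 1, 1]
chi2, p_value = logrank_test(control_times, control_events,
                             treated_times, treated_events)
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the two survival curves differ by more than chance alone would explain. With samples this small the chi-square approximation is rough, so the numbers here are only meant to show the mechanics.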
The Healthcare Economist has worked through an example in Excel to demonstrate how to calculate the survival functions as well as the log-rank test (EXAMPLE).