Attrition Bias

If you are evaluating the treatment effect of a policy or medical intervention, does it matter if some of your subjects leave the sample? In many cases, the answer is ‘yes’.

The Problem

As outlined in Grasdal (2001), the effect of the treatment is simply:

• Δ = E(Y|X, T=1) − E(Y|X, T=0)

However, we do not always observe Y. If subjects leave the study, their outcomes are missing. Let A indicate attrition, with A=0 for subjects who remain in the sample and A=1 for those who drop out. Each of the two components of the equation above can then be decomposed as follows:

• E(Y|X, T=1) = p_T E(Y|X, T=1, A=0) + (1 − p_T) E(Y|X, T=1, A=1)
• E(Y|X, T=0) = p_C E(Y|X, T=0, A=0) + (1 − p_C) E(Y|X, T=0, A=1)

where p_T is the probability that someone in the treatment group remains in the sample, p_T = p(A=0|X, T=1), and p_C is the corresponding probability for the control group, p_C = p(A=0|X, T=0).

Rearranging terms we get:

• Δ = [E(Y|X, T=1, A=0) − E(Y|X, T=0, A=0)] − (1 − p_T)[E(Y|X, T=1, A=0) − E(Y|X, T=1, A=1)] + (1 − p_C)[E(Y|X, T=0, A=0) − E(Y|X, T=0, A=1)]

The first term in brackets is what we observe. The second term in brackets is the difference in outcomes between those who stay and those who drop out within the treatment group; the third is the same difference within the control group. If attrition is random, outcomes do not differ by attrition status, so both of these differences are zero and the observed difference among those who remain is an unbiased estimate of the treatment effect. If attrition is not random, the last two terms generally do not vanish and that estimate is biased.
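To make the decomposition concrete, here is a minimal numerical sketch in Python. It builds each arm's full-sample mean as the weighted average from the two bulleted equations, with p = p(A=0|X, T) as the weight on the A=0 (observed) group, and compares the true effect Δ with the difference computed from observed cases only. The `Arm` class, the function names, and all of the numbers are hypothetical and chosen only for illustration; in a real study the mean outcome of those who drop out is not observed.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """One study arm (treatment or control). All values are made up."""
    p_retained: float     # p(A=0 | X, T): weight on the observed group
    mean_retained: float  # E(Y | X, T, A=0): observed
    mean_attrited: float  # E(Y | X, T, A=1): unobserved in practice

    def full_mean(self) -> float:
        # Weighted average over the A=0 and A=1 groups
        return (self.p_retained * self.mean_retained
                + (1 - self.p_retained) * self.mean_attrited)

    def attrition_gap(self) -> float:
        # (1 - p) * [E(Y | A=0) - E(Y | A=1)]
        return (1 - self.p_retained) * (self.mean_retained - self.mean_attrited)


def decompose(treated: Arm, control: Arm):
    """Return (observed difference, true effect, bias of the observed difference)."""
    observed = treated.mean_retained - control.mean_retained
    true_effect = treated.full_mean() - control.full_mean()
    return observed, true_effect, observed - true_effect


# Non-random attrition: those who drop out have lower outcomes in both arms.
treated = Arm(p_retained=0.8, mean_retained=10.0, mean_attrited=6.0)
control = Arm(p_retained=0.9, mean_retained=7.0, mean_attrited=5.0)
observed, true_effect, bias = decompose(treated, control)
print(f"observed={observed:.2f} true={true_effect:.2f} bias={bias:.2f}")
# prints: observed=3.00 true=2.40 bias=0.60

# Random attrition: outcomes do not depend on A, so the bias is zero.
_, _, bias_random = decompose(Arm(0.8, 10.0, 10.0), Arm(0.9, 7.0, 7.0))
print(f"bias under random attrition={bias_random:.2f}")
```

In this example the bias equals the treated arm's attrition gap minus the control arm's, so it can be non-zero even when both arms lose subjects, and it disappears once the outcomes of those who stay and those who leave coincide.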

Potential Solutions

If one knows the source of the attrition bias, one can model it explicitly. Explicit models are typically sample selection models in which two regression models are estimated simultaneously. “The first model is a regression model that addresses the research question, with the hypotheses of the study being examined by the regression of the dependent variable on the key independent variables in the study. The second model includes the variables that are causing attrition, with the dependent variable being a dichotomous variable indicating either continued participation or nonparticipation in the study. The error terms of the substantive dependent variable in the first regression model and the participation dependent variable in the second regression model are correlated. A significant correlation between the two error terms indicates attrition bias.”
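One common way to write down the two-equation setup described in the quote is sketched below. The notation is an assumption made here for illustration and is not taken from the quoted source: S_i = 1 denotes continued participation (A = 0 in the decomposition above), X_i are the regressors of substantive interest, and Z_i are the variables thought to drive attrition.

```latex
\begin{align}
  Y_i &= X_i \beta + \varepsilon_i
      && \text{(outcome equation; } Y_i \text{ observed only if } S_i = 1\text{)} \\
  S_i &= \mathbf{1}\{ Z_i \gamma + u_i > 0 \}
      && \text{(participation equation)} \\
  \rho &= \operatorname{corr}(\varepsilon_i, u_i)
      && \text{(} \rho \neq 0 \text{ indicates attrition bias)}
\end{align}
```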

If the source of the bias is unknown, one can use the Heckman selection model. The first step of the Heckman selection model “…not only tests for attrition bias but also creates an outcome variable, which Heckman calls λ (lambda). Thus, a λ value is computed for all cases in the study, and it represents the proxy variable that explains the causation of attrition in the study…The second step of Heckman’s procedure is to merge the λ value of each participant into the larger data set and then include it…in the regression equation that is used to test the hypotheses in the study. Including λ in the equation solves the problem of specification error and leads to more accurate regression coefficients.”
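The λ term is usually the inverse Mills ratio, λ(c) = φ(c)/Φ(c), evaluated at each participant's fitted index c from the first-step participation model. The Python sketch below is not a full Heckman estimator. It assumes the first-step index is already known, assumes the two error terms are jointly standard normal with correlation ρ, and uses hypothetical function names and parameter values. Under those assumptions it checks by simulation that the mean outcome error among retained participants is ρ·λ(c), which is the omitted term that adding λ to the second-step regression is meant to absorb.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def inverse_mills(index: float) -> float:
    """lambda(index) = pdf(index) / cdf(index) for the standard normal."""
    return STD_NORMAL.pdf(index) / STD_NORMAL.cdf(index)


def simulate_retained_error_mean(index: float, rho: float,
                                 n: int = 200_000, seed: int = 0) -> float:
    """Simulated mean of the outcome error among retained cases.

    A case is retained when index + u > 0. The errors (eps, u) are drawn as
    standard normals with correlation rho (an assumption of this sketch).
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        u = rng.gauss(0, 1)
        eps = rho * u + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        if index + u > 0:  # participant stays in the sample
            kept.append(eps)
    return fmean(kept)


if __name__ == "__main__":
    index, rho = 0.5, 0.6  # hypothetical first-step index and error correlation
    simulated = simulate_retained_error_mean(index, rho)
    implied = rho * inverse_mills(index)
    print(f"simulated mean error among retained: {simulated:.3f}")
    print(f"rho * lambda(index):                 {implied:.3f}")
```

When ρ = 0 the implied term is zero, which matches the quoted point that a significant correlation between the two error terms is what signals attrition bias. In applied work the index would come from an estimated probit model rather than being supplied by hand.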

Empirical Investigation

A study by Grasdal examines attrition in a randomized field trial of a rehabilitation programme designed to bring long-term sick-listed workers with musculoskeletal problems back to work in Bergen, Norway. The study found that “Both the parametric and the semi-parametric sample estimators that were considered indicated that sample attrition biased outcome data regarding posttreatment earnings, while the data regarding sick leave status remained unbiased. The sample selection estimators of post-treatment earnings perform quite well in terms of correcting for attrition bias and estimating treatment effects not very different from the experimental benchmark.”

“…The analysis also demonstrates an inherent paradox in the ‘common support’ approach, which prescribes exclusion from the analysis of observations outside of common support for the selection probability. The more important treatment status is as a determinant of attrition, the larger is the proportion of treated with support for the selection probability outside the range, for which comparison with untreated counterparts is possible.”

Source: