You have data. You see that when people receive an intervention, the outcomes differ. Is this effect causal?
For the difference to be interpreted as causal, four critical assumptions must hold.
Stable Unit Treatment Value Assumption (SUTVA). This assumes that the treatment one unit receives does not affect the outcomes of other units. For instance, consider the case where you imposed a quality improvement program in a clinic and compared units that received the training to those that did not. The impact of the program may be understated if people in the quality improvement program told their peers in the control arm about the program over lunch and the control arm group decided to implement some of the improvement techniques. Similarly, when dealing with contagious diseases, treating one person may clearly benefit others. More informally, SUTVA means no contagion or spillover effects.
Additionally, it assumes that the treatment is stable over time. This assumption works well for drugs and technology. For a surgical intervention, however, treatment efficacy may improve over time as surgeons become more familiar with a technique. Thus, the causal effect at the start of an intervention may differ from the effect at the end.
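To make the spillover point concrete, here is a minimal arithmetic sketch. All numbers and the function name `observed_difference` are hypothetical, and it assumes a simple linear leak: the control arm picks up some fixed share of the true effect.

```python
def observed_difference(baseline, true_effect, spillover_share):
    """Treated-minus-control difference when controls absorb part of the effect.

    spillover_share is a hypothetical fraction (0 to 1) of the true effect
    that leaks into the control arm, e.g. via lunch conversations.
    """
    treated_mean = baseline + true_effect
    control_mean = baseline + spillover_share * true_effect
    return treated_mean - control_mean


# Hypothetical values: baseline quality score 10, true program effect 5.
no_spillover = observed_difference(10.0, 5.0, 0.0)
with_spillover = observed_difference(10.0, 5.0, 0.4)

print(f"no spillover:   {no_spillover:.1f}")
print(f"with spillover: {with_spillover:.1f}")
```

With any positive spillover share, the observed difference is smaller than the true effect, which is the understatement described above.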
Consistency assumption. A key thought experiment in causal inference compares the outcome a person would have if they received an intervention A (i.e., a=1) with the outcome they would have if they did not receive it (i.e., a=0). The consistency assumption basically says that if the person does (or does not) receive the treatment of interest, the potential outcome Y1 (or Y0) corresponds to the outcome actually observed. Formally, Y=Ya if A=a ∀ a.
Ignorability assumption. Also known as the ‘no unmeasured confounders’ assumption, this says that once we condition on the relevant observed confounders (X), treatment assignment is independent of the potential outcomes. For instance, let’s say that everyone who has a severe case of COVID-19 gets a ventilator and those who are not infected or have a mild case of COVID-19 do not get a ventilator. Here there is clearly selection bias, and the ignorability assumption is violated. If we compared outcomes for patients who received a ventilator with outcomes for those who did not, those receiving a ventilator would have worse outcomes. However, the ventilator itself is not causing the worse outcomes; rather, the ignorability assumption is being violated. If, on the other hand, we could perfectly measure COVID-19 severity with a covariate X, one could find the causal effect of being on a ventilator. Formally, this assumption is that Y0, Y1 ⊥ A|X.
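The ventilator example can be sketched with a small table of made-up counts. Everything here is hypothetical and chosen only for illustration: the numbers, the names `counts`, `mortality`, `naive_difference`, and `stratum_difference`, and the assumption that severity is the only confounder and is measured as a binary X. Unlike the all-or-nothing story above, some patients in each severity group are ventilated, so that within-group comparisons are possible.

```python
# Hypothetical counts, invented for illustration only.
# Key: (severe X, ventilated A). Value: (number of patients, number of deaths).
counts = {
    (1, 1): (80, 32),  # severe, ventilated
    (1, 0): (20, 12),  # severe, not ventilated
    (0, 1): (20, 1),   # mild, ventilated
    (0, 0): (80, 8),   # mild, not ventilated
}


def mortality(cells):
    """Pooled death rate over a list of (patients, deaths) cells."""
    patients = sum(n for n, _ in cells)
    deaths = sum(d for _, d in cells)
    return deaths / patients


def naive_difference(counts):
    """Ventilated-minus-unventilated mortality, ignoring severity."""
    treated = [cell for (_, a), cell in counts.items() if a == 1]
    untreated = [cell for (_, a), cell in counts.items() if a == 0]
    return mortality(treated) - mortality(untreated)


def stratum_difference(counts, x):
    """Ventilated-minus-unventilated mortality within severity level x."""
    return mortality([counts[(x, 1)]]) - mortality([counts[(x, 0)]])


naive = naive_difference(counts)
within = {x: stratum_difference(counts, x) for x in (0, 1)}

print(f"naive difference:  {naive:+.2f}")
print(f"mild patients:     {within[0]:+.2f}")
print(f"severe patients:   {within[1]:+.2f}")
```

In these invented numbers the naive comparison makes ventilators look harmful, while the comparison within each severity level points the other way. That sign flip is the selection bias the ignorability assumption is meant to rule out once we condition on X.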
Positivity assumption. Positivity assumes that everyone has some chance of getting each treatment level. For instance, a researcher might want to know the impact of getting a ventilator on patients without COVID-19. However, we would never give a person a ventilator if they don’t have any disease (ignoring the case where they’d need a ventilator for other causes). If there is no chance that a given population (e.g., people without COVID-19) would receive the intervention, then we cannot measure the average treatment effect across all patient types X. If this does occur, one would need to exclude from the analysis the patient subgroups where positivity is violated. Formally, positivity requires that P(A=a|X=x) > 0 ∀ a,x.
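A rough empirical check for positivity is to look for covariate-by-treatment cells that contain no observations. The sketch below assumes a discrete covariate and uses hypothetical data; the function name `positivity_violations` is illustrative, not from any library. Note that an empty cell in a sample only suggests a violation; it does not prove that the true probability is zero.

```python
from collections import Counter


def positivity_violations(rows, treatment_levels=(0, 1)):
    """Return (x, a) cells with no observations.

    rows is a list of (x, a) pairs: covariate stratum and treatment received.
    """
    cell_counts = Counter(rows)
    strata = sorted({x for x, _ in rows})
    return [
        (x, a)
        for x in strata
        for a in treatment_levels
        if cell_counts[(x, a)] == 0
    ]


# Hypothetical sample: nobody without COVID-19 is ventilated.
rows = (
    [("covid", 1)] * 30
    + [("covid", 0)] * 70
    + [("no_covid", 0)] * 50
)

violations = positivity_violations(rows)
print(violations)

# Restrict the analysis to strata where every treatment level is observed.
bad_strata = {x for x, _ in violations}
kept = [row for row in rows if row[0] not in bad_strata]
print(len(kept))
```

Dropping the flagged strata mirrors the exclusion step described above, at the cost of narrowing the population to which the estimated effect applies.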
How do these assumptions help us identify causal effects from observed data?
Let’s say that from the observed data we can compute:
E[Y | A=a, X=x]
If the consistency assumption is valid, we know that the outcome a person has is the same as their potential outcome under the treatment they actually received, and thus:
E[Y | A=a, X=x] = E[Ya | A=a, X=x]
If the ignorability assumption holds, we know that assignment is as good as random within levels of X and there are no unmeasured confounders. In short, once we condition on covariates, the treatment assignment mechanism does not matter, and we can further simplify the equation to:
E[Ya | A=a, X=x] = E[Ya | X=x]
If one wants the marginal causal effect across all subgroups, we can simply average over the x’s. Specifically,
E[Ya] = Σx E[Y | A=a, X=x] P(X=x)
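The averaging step, often called standardization, can be sketched in a few lines: estimate E[Ya] as the sum over x of E[Y | A=a, X=x] weighted by P(X=x). The counts below are hypothetical, the function name `standardized_mean` is illustrative, and the sketch assumes a single discrete covariate that satisfies the assumptions above.

```python
# Hypothetical counts, invented for illustration only.
# Key: (severe X, ventilated A). Value: (number of patients, number of deaths).
counts = {
    (1, 1): (80, 32),
    (1, 0): (20, 12),
    (0, 1): (20, 1),
    (0, 0): (80, 8),
}


def standardized_mean(counts, a):
    """Estimate E[Ya] as the sum over x of E[Y | A=a, X=x] * P(X=x)."""
    total = sum(n for n, _ in counts.values())
    strata = sorted({x for x, _ in counts})
    mean = 0.0
    for x in strata:
        # P(X=x): share of all patients who fall in stratum x.
        n_x = sum(n for (xx, _), (n, _) in counts.items() if xx == x)
        n_cell, deaths = counts.get((x, a), (0, 0))
        if n_cell == 0:
            # An empty cell means positivity fails in this sample.
            raise ValueError(f"no observations with X={x}, A={a}")
        mean += (deaths / n_cell) * (n_x / total)
    return mean


risk_treated = standardized_mean(counts, 1)
risk_untreated = standardized_mean(counts, 0)
effect = risk_treated - risk_untreated

print(f"E[Y1] estimate: {risk_treated:.3f}")
print(f"E[Y0] estimate: {risk_untreated:.3f}")
print(f"marginal effect: {effect:+.3f}")
```

The explicit error on an empty cell is a design choice: without positivity, E[Y | A=a, X=x] is undefined for that stratum, so the sum cannot be computed over the full population.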
For more detail, see biostatistician Jason Roy’s video on “Causal Assumptions”.