Nobel laureate James Heckman has a nice summary of how applied econometricians and policy researchers should define causality. Some of the more interesting points I have excerpted below.
On the source of randomness in a sample
“One reason why many statistical models are incomplete is that they do not specify the sources of randomness generating variability among agents, i.e., they do not specify why otherwise observationally identical people make different choices and have different outcomes given the same choice. They do not distinguish what is in the agent’s information set from what is in the observing statistician’s information set, although the distinction is fundamental in justifying the properties of any estimator for solving selection and evaluation problems. They do not distinguish uncertainty from the point of view of the agent whose behavior is being analyzed from variability as analyzed by the observing analyst. They are also incomplete because they are recursive. They do not allow for simultaneity in choices of outcomes of treatment that are at the heart of game theory and models of social interactions and contagion (see, e.g., Brock & Durlauf, 2001; Tamer, 2003).”
Unbundling a treatment
Researchers often say that a policy change will cause a change in some outcome measure. However, a policy change is often made up of many components. Which components of the policy change actually influenced the outcomes? In Heckman’s words:
“Many causal models in statistics are black-box devices designed to investigate the impact of “treatments”—often complex packages of interventions—on observed outcomes in a given environment. Unbundling the components of complex treatments is rarely done. Explicit scientific models go into the black box to explore the mechanism(s) producing the effects.”
Outcomes vs. Utilities
Most researchers pick an outcome variable of interest and assume that if the outcome increases (for a beneficial outcome measure), then people are better off. This may not be the case, however. For instance, Bill Clinton’s welfare reform act (PRWORA) may have increased employment rates and income for single mothers, but the mothers’ utility may have decreased. The single mothers may (or may not) have valued spending time caring for their children more than working.
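To make the distinction concrete, here is a minimal sketch with made-up numbers. It assumes a simple utility function of my own choosing (income plus a dollar value per hour spent at home); neither the functional form nor the figures come from Heckman’s paper.

```python
# Illustrative only: the utility function and all numbers are hypothetical.

def utility(income, home_hours, value_per_home_hour):
    """Assumed quasi-linear utility: income plus the dollar value of time at home."""
    return income + value_per_home_hour * home_hours

# Hypothetical situation of one mother before and after a work requirement.
before = {"income": 12_000, "home_hours": 2_000}
after = {"income": 20_000, "home_hours": 500}

income_change = after["income"] - before["income"]

def utility_change(value_per_home_hour):
    """Change in utility for a mother who values an hour at home at the given rate."""
    return (utility(after["income"], after["home_hours"], value_per_home_hour)
            - utility(before["income"], before["home_hours"], value_per_home_hour))

low_valuation = utility_change(4.0)   # values an hour at home at $4
high_valuation = utility_change(8.0)  # values an hour at home at $8

print(income_change, low_valuation, high_valuation)
```

The measured outcome (income) rises by the same amount in both cases, but the mother who places a high value on time at home ends up worse off. An evaluation based on income alone cannot tell these two mothers apart.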
Problems with non-linearity
Issues such as “social interactions, contagion and general equilibrium effects” can complicate causal inference.
What are you measuring?
Let us assume that Y is the outcome variable of interest. Y depends on what state, s, you are in. For instance, in a treatment/no treatment world, Y(s) is the outcome if you were treated and Y(s’) is the outcome if you were not treated. D(s)=1 if you were actually treated and D(s)=0 if you did not receive treatment in the data. Thus, we can measure various things:
- Average Treatment Effect (ATE): E[Y(s) − Y(s’)]. This is the average effect if all individuals moved from an untreated to a treated state.
- Treatment on the Treated (TT): E[(Y(s) − Y(s’)) | D(s) = 1]. This looks at the average effect of treatment only on those who were treated. This is important if only certain individuals select into the treatment group, or if the policy change is only relevant for certain individuals.
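- Treatment on the Untreated (TUT): E[(Y(s) − Y(s’)) | D(s) = 0]. This is the average effect that treatment would have had on those who were not treated, which matters when considering whether to extend a program to people it does not currently reach. Note that this is different from spillovers, where a treatment affects people who never receive it. For instance, instituting a work training program for treated individuals may reduce community college enrollment and thus may affect untreated individuals (e.g., if the community college closes from lack of enrollment).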
- Policy Relevant Treatment Effect (PRTE): E_p[Y(s)] − E_p’[Y(s)]. This parameter compares the average outcomes under two different policy choices, p and p’.
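The parameters above can differ substantially when people select into treatment based on how much they expect to gain. The following is a rough simulation sketch, not anything taken from Heckman’s paper: the distributions, the selection rule (take treatment when the individual gain exceeds a cost), and the two cost levels standing in for policies p and p’ are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical potential outcomes: y0 plays the role of Y(s'), y1 of Y(s).
y0 = rng.normal(0.0, 1.0, n)
gain = rng.normal(1.0, 1.0, n)  # individual treatment effect Y(s) - Y(s')
y1 = y0 + gain

def treated_under(cost):
    """Assumed selection rule: a person takes treatment if their gain exceeds the cost."""
    return gain > cost

# Baseline world: treatment costs 1.0, so only high-gain people select in.
d = treated_under(1.0)

ate = gain.mean()      # E[Y(s) - Y(s')]
tt = gain[d].mean()    # E[Y(s) - Y(s') | D = 1]
tut = gain[~d].mean()  # E[Y(s) - Y(s') | D = 0]
share = d.mean()       # fraction treated

def mean_outcome(cost):
    """Average realized outcome when treatment costs `cost` and people self-select."""
    take = treated_under(cost)
    return np.where(take, y1, y0).mean()

# PRTE: compare a subsidy policy (cost 0.5) against the baseline policy (cost 1.0).
prte = mean_outcome(0.5) - mean_outcome(1.0)

print(f"ATE={ate:.3f}  TT={tt:.3f}  TUT={tut:.3f}  PRTE={prte:.3f}")
```

Because people with larger gains select into treatment, TT exceeds ATE, which in turn exceeds TUT. The ATE is also the weighted average of the other two: ATE = P(D=1)·TT + P(D=0)·TUT. The PRTE here is driven only by the people whose treatment choice changes between the two policies.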
Heckman, James (2008) “Econometric Causality” NBER WP #13934.