Two‐Stage Residual Inclusion: An Overview

Researchers often want to measure the effect of an intervention in the real world. Doing this in practice is difficult. For instance, consider measuring health outcomes among individuals who visit doctors compared to those who don’t. Inevitably, individuals who visit doctors will have worse outcomes. Why? Are doctors killing patients? This is clearly a…

Nested g-computation procedure

What is the difference in health care cost when two different treatments are used? This question is challenging because cumulative health care cost is often censored, either by death or by lack of continuous enrollment. Lin (2000) addressed this issue (see paper and my blog write-up). The problem with this approach, however,…

Dealing with time-censored cost data

We health economists deal with medical cost data all the time.  One challenge we all face is that the medical cost data is often censored.  The censoring may occur because the patient dies.  If you are using administrative health insurance claims data, censoring may occur because people switch their health plan and leave your sample.…

The problem with odds ratios

Many researchers use logit models to estimate the effect of specific variables on a binary (i.e., 0 or 1) outcome.  How are these models derived?  How are odds ratios calculated?  What are the problems with odds ratios?  I answer all these questions in this post, following a lovely summary by Norton and Dowd (2018). Deriving…

Longitudinal Modelling of Healthcare Expenditures: Challenges and Solutions

Previous analyses, such as Basu and Manning (2009), have addressed the problem of the mass of health care expenditures at $0. In typical economic analyses, we assume that the dependent variable is normally distributed. In the case of health care expenditures, however, a large number of people have $0 expenditures (i.e., healthy individuals). Further, among sick individuals that…

Stratified Covariate Balancing

When selection bias is an issue, many researchers use propensity score matching to ensure that observable differences in patient characteristics are balanced between individuals who receive a given treatment and those who do not. If unobservable characteristics are correlated with observable characteristics, propensity score matching generally works well. Cases where propensity score matching does not work well include…

What is a Pseudo R-squared?

When running an ordinary least squares (OLS) regression, one common metric to assess model fit is the R-squared (R²). The R² metric is calculated as follows: $R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$. The dependent variable is y, the predicted value from the OLS regression is ŷ, and the average value of y across all observations…
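
To make the R-squared formula concrete, here is a minimal Python sketch that computes one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the example numbers are illustrative assumptions, not taken from the post; the fitted values are made up rather than produced by an actual OLS fit.

```python
def r_squared(y, y_hat):
    """Return 1 - SS_res / SS_tot for observed values y and predictions y_hat."""
    if len(y) != len(y_hat):
        raise ValueError("y and y_hat must have the same length")
    y_bar = sum(y) / len(y)                                    # mean of the observed values
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # residual sum of squares
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y has no variation")
    return 1 - ss_res / ss_tot


# Hypothetical data: observed outcomes and made-up fitted values.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(y, y_hat), 4))  # approximately 0.98
```

Two sanity checks follow directly from the formula: predictions that match y exactly give a value of 1, and predicting the mean of y for every observation gives 0.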