Nested g-computation procedure

What is the difference in health care cost when two different treatments are used? This question is challenging because cumulative health care cost is often censored, either by death or by a lack of continuous enrollment. Lin (2000) addressed this issue (see the paper and my blog write-up). The problem with this approach, however,…

Dealing with time-censored cost data

We health economists deal with medical cost data all the time. One challenge we all face is that medical cost data are often censored. The censoring may occur because the patient dies. If you are using administrative health insurance claims data, censoring may also occur because people switch health plans and leave your sample.…

The problem with odds ratios

Many researchers use logit models to estimate the effect of specific variables on a binary (i.e., 0 or 1) outcome.  How are these models derived?  How are odds ratios calculated?  What are the problems with odds ratios?  I answer all these questions in this post, following a lovely summary by Norton and Dowd (2018). Deriving…
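
As a small illustration of how an odds ratio is calculated, here is a minimal sketch. It is not the derivation in Norton and Dowd (2018); the function names and the example probabilities are hypothetical and chosen only to show the arithmetic.

```python
import math


def odds(p):
    """Convert a probability p (0 < p < 1) into odds, p / (1 - p)."""
    return p / (1.0 - p)


def odds_ratio(p_treated, p_control):
    """Ratio of the odds of the outcome in two groups."""
    return odds(p_treated) / odds(p_control)


# Hypothetical outcome probabilities: 20% with treatment, 10% without.
p_treated, p_control = 0.20, 0.10

or_value = odds_ratio(p_treated, p_control)   # ratio of the two odds
risk_ratio = p_treated / p_control            # ratio of the two probabilities

# In a logit model with a single binary regressor, the coefficient on that
# regressor is the difference in log-odds, so exponentiating it gives the
# odds ratio.
beta = math.log(odds(p_treated)) - math.log(odds(p_control))

print(f"odds ratio = {or_value:.2f}, risk ratio = {risk_ratio:.2f}")
```

With these made-up numbers the odds ratio is larger than the risk ratio, which is one reason odds ratios are easy to misread as relative risks.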

Longitudinal Modelling of Healthcare Expenditures: Challenges and Solutions

Previous analyses, such as Basu and Manning (2009), have addressed the problem of the mass of health care expenditures at $0. In typical economic analyses, we assume that the dependent variable is normally distributed. In the case of health care expenditures, however, a large number of people have $0 expenditures (i.e., healthy individuals). Further, among sick individuals that…

Stratified Covariate Balancing

When selection bias is an issue, many researchers use propensity score matching to ensure that observable differences in patient characteristics are balanced between individuals who receive a given treatment and those who do not. If unobservable characteristics are correlated with observable characteristics, propensity score matching generally works well. Cases where propensity score matching does not work well include…

What is a Pseudo R-squared?

When running an ordinary least squares (OLS) regression, one common metric to assess model fit is the R-squared (R²). The R² metric is calculated as follows. R² = 1 – [Σi (yi – ŷi)²] / [Σi (yi – ȳ)²] The dependent variable is y, the predicted value from the OLS regression is ŷ, and the average value of y across all observations…
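
The R-squared formula quoted in this excerpt can be sketched in a few lines of Python. This is only an illustration under assumed toy data; the `r_squared` helper and the numbers are hypothetical, and the OLS fit uses NumPy's least-squares solver.

```python
import numpy as np


def r_squared(y, y_hat):
    """R-squared = 1 - (residual sum of squares) / (total sum of squares)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)      # sum over i of (y_i - yhat_i)^2
    ss_tot = np.sum((y - y.mean()) ** 2)   # sum over i of (y_i - ybar)^2
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data, roughly linear in x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# OLS with an intercept: regress y on a constant and x.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

r2 = r_squared(y, y_hat)
print(f"R-squared = {r2:.4f}")
```

A model that always predicts the sample mean of y gets an R-squared of 0, and a perfect fit gets 1.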

Optimal Matching Techniques

In randomized controlled trials, participants are randomized to different groups where each group receives a unique intervention (or control). This process ensures that any differences in the outcomes of interest are due entirely to the interventions under investigation. While RCTs are useful, they are expensive to run, are highly controlled and suffer from their own…