How do HEOR studies handle missing data?

That is the question answered in a paper by Mukherjee et al. (2023). The authors define an “HEOR study” for this paper as

…real-world evidence studies that conducted a secondary/post-hoc analysis using randomized controlled trial (RCT) data, and a within-trial cost-utility analysis in which the outcome of interest was costs or PROs including preference-based utilities (e.g., EQ-5D).

The most appropriate approach for imputing missing data depends on the assumptions about how the data are missing:

  • Missing completely at random (MCAR). The probability that an observation is missing does not depend on the observed or unobserved values of any variable in the study.
  • Missing at random (MAR). The probability that a particular variable is missing is associated with observed values in the dataset (either observed values of other variables or observed values of the same variable at previous timepoints), but not with the missing values themselves. Because the missing values are never observed, one cannot test whether MAR holds in a dataset.
  • Missing not at random (MNAR). The probability that a particular variable is missing is related to the underlying value of that specific variable. MNAR can be ignorable (when missing values occur independently of the data collection process) or non-ignorable (when there is a structural cause to the missingness mechanism that depends on unobserved variables or the missing value itself).
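
To make the three mechanisms concrete, here is a minimal simulation sketch. It is not taken from the paper: the variable names, the data-generating model (a utility-like outcome `y` that depends linearly on a fully observed covariate `x`) and the missingness probabilities are all assumptions chosen for illustration. It deletes outcomes under each mechanism and compares the complete-case mean with the mean of the full data.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical data (an assumption, not from the paper):
# x is a fully observed baseline covariate,
# y is a utility-like outcome correlated with x.
n = 20000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [0.7 + 0.1 * x + rng.gauss(0.0, 0.1) for x in xs]
true_mean = statistics.fmean(ys)


def complete_case_mean(p_missing):
    """Mean of y over the rows that survive a given missingness rule."""
    kept = [y for x, y in zip(xs, ys) if rng.random() >= p_missing(x, y)]
    return statistics.fmean(kept)


# MCAR: every outcome has the same 30% chance of being missing.
mcar_mean = complete_case_mean(lambda x, y: 0.3)

# MAR: missingness depends only on the observed covariate x
# (low x makes a missing outcome more likely).
mar_mean = complete_case_mean(lambda x, y: logistic(-1.0 - 1.5 * x))

# MNAR: missingness depends on the outcome value itself
# (low y makes a missing outcome more likely).
mnar_mean = complete_case_mean(lambda x, y: logistic(-1.0 - 15.0 * (y - 0.7)))

print(f"full-data mean:           {true_mean:.3f}")
print(f"complete-case mean, MCAR: {mcar_mean:.3f}")
print(f"complete-case mean, MAR:  {mar_mean:.3f}")
print(f"complete-case mean, MNAR: {mnar_mean:.3f}")
```

In this setup, the complete-case mean should stay close to the full-data mean under MCAR and drift upward under MAR and MNAR. The difference between the last two is that the MAR bias can in principle be removed by conditioning on `x`, whereas the MNAR bias cannot be corrected from the observed data alone.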

Various techniques are available to address missing data, including complete-case analysis (CCA), available-case (AC) analysis, multiple imputation (MI), multiple imputation by chained equations (MICE), and predictive mean matching.
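
As a rough illustration of how two of these techniques differ, the sketch below contrasts a complete-case analysis with a simplified multiple imputation on simulated data where the outcome is missing at random given a covariate. Everything here is an assumption made for the example rather than the method of any study in the review: the data-generating model, the helper names `fit_line` and `multiple_imputation_mean`, and the use of a bootstrap refit in place of a formal posterior draw of the regression parameters. The imputed estimates are pooled with Rubin's rules.

```python
import math
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Hypothetical trial data: x is fully observed, y is missing at random given x.
n = 10000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
full_ys = [0.7 + 0.1 * x + rng.gauss(0.0, 0.1) for x in xs]
true_mean = statistics.fmean(full_ys)
# P(missing | x) = 1 / (1 + exp(1 + 1.5 x)): low x means more missing outcomes.
ys = [None if rng.random() < 1.0 / (1.0 + math.exp(1.0 + 1.5 * x)) else y
      for x, y in zip(xs, full_ys)]


def fit_line(px, py):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual sd)."""
    mx, my = statistics.fmean(px), statistics.fmean(py)
    sxx = sum((x - mx) ** 2 for x in px)
    b = sum((x - mx) * (y - my) for x, y in zip(px, py)) / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(px, py))
    return a, b, math.sqrt(rss / (len(px) - 2))


def multiple_imputation_mean(xs, ys, m=20):
    """Impute missing y m times, then pool the mean with Rubin's rules."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    estimates, variances = [], []
    for _ in range(m):
        # Bootstrap the observed rows to reflect parameter uncertainty.
        boot = [rng.choice(observed) for _ in observed]
        a, b, sd = fit_line([x for x, _ in boot], [y for _, y in boot])
        # Stochastic regression imputation: prediction plus random noise.
        completed = [y if y is not None else a + b * x + rng.gauss(0.0, sd)
                     for x, y in zip(xs, ys)]
        estimates.append(statistics.fmean(completed))
        variances.append(statistics.variance(completed) / len(completed))
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)
    total_variance = within + (1.0 + 1.0 / m) * between  # Rubin's rules
    return pooled, math.sqrt(total_variance)


cca_mean = statistics.fmean(y for y in ys if y is not None)
mi_mean, mi_se = multiple_imputation_mean(xs, ys)

print(f"full-data mean:      {true_mean:.3f}")
print(f"complete-case mean:  {cca_mean:.3f}")
print(f"multiple imputation: {mi_mean:.3f} (SE {mi_se:.4f})")
```

Because the imputation model includes the covariate that drives missingness, the pooled estimate should land much closer to the full-data mean than the complete-case estimate does. In applied work one would normally rely on an established MI or MICE implementation rather than hand-rolled code like this.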

To better understand which approaches are commonly used in health economics and outcomes research (HEOR), the authors conducted a systematic literature review in PubMed and examined which statistical methods were used to address missing cost, utility, or patient-reported outcome data.

The authors found that multiple imputation, multiple imputation by chained equations, and complete-case analysis were the most commonly used methods:

From 1433 identified records, 40 papers were included. Thirteen studies were economic evaluations. Thirty studies used multiple imputation with 17 studies using multiple imputation by chained equation, while 15 studies used a complete-case analysis. Seventeen studies addressed missing cost data and 23 studies dealt with missing outcome data. Eleven studies reported a single method while 20 studies used multiple methods to address missing data.

The authors note that although they found a large body of HEOR methodological literature on how to handle missing data in an RCT context, very few studies have actually implemented these recommendations and imputed the missing data. You can read the full article here.