Transportability of Comparative Effectiveness Evidence Across Countries

Let’s say that you have an international clinical trial showing that a new drug (SuperDrug) performs better than the previous standard of care (OldDrug). Also assume that individuals with a specific comorbidity (let’s call it EF) respond less well to SuperDrug. If you live in a country where comorbidity EF is common, how well do you think SuperDrug will work in your population?

This is the question posed by Turner et al. (2023) in their recent PharmacoEconomics paper. The general problem that country-level decision makers face is the following:

When study populations are not randomly selected from a target population, external validity is more uncertain and it is possible that distributions of effect modifiers (characteristics that predict variation in treatment effects) differ between the trial sample and target population

Many of you may have guessed that my comorbidity EF actually stands for an effect modifier. Four classes of effect modifiers the authors consider include:

Patient/disease characteristics (e.g. biomarker prevalence),
Setting (e.g. location of and access to care),
Treatment (e.g. timing, dosage, comparator therapies, concomitant medications)
Outcomes (e.g. follow-up or timing of measurements)

See Beal et al. (2022) for a potential checklist for effect modifiers.

In their paper, the authors examine the problem of transportability. What is transportability?

Whereas generalisability relates to whether inferences from a study can be extended to a target population from which the study dataset was sampled, transportability relates to whether inferences can be extended to a separate (external) population from which the study sample was not derived.

https://link.springer.com/article/10.1007/s40273-023-01323-1

Key cross-country differences that may make transportability problematic include effect modifiers such as disease characteristics, comparator therapies and treatment settings.

What is the problem of interest?

Typically, decision makers are interested in the target population average treatment effect (PATE): the average effect of treatment if all individuals in the target population were assigned the treatment. However, researchers commonly have access only to a sample and must estimate the study sample average treatment effect (SATE).
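To make the SATE/PATE distinction concrete, here is a minimal Python sketch using the hypothetical SuperDrug example from above. All numbers (the stratum-specific effects and the EF prevalences) are invented for illustration and do not come from Turner et al.; the sketch also assumes EF is the only effect modifier.

```python
# Hypothetical stratum-specific treatment effects (e.g. risk differences).
# These numbers are invented for illustration only.
effect_without_ef = 0.20   # benefit of SuperDrug vs OldDrug when EF is absent
effect_with_ef = 0.05      # smaller benefit when EF is present


def average_effect(prevalence_ef: float) -> float:
    """Average treatment effect in a population with the given EF prevalence."""
    return prevalence_ef * effect_with_ef + (1 - prevalence_ef) * effect_without_ef


sate = average_effect(0.10)   # assumed trial sample: 10% have EF
pate = average_effect(0.40)   # assumed target country: 40% have EF

print(f"SATE = {sate:.3f}, PATE = {pate:.3f}")
```

Under these assumed numbers the trial's average effect overstates the benefit in the target country, purely because EF is more common there.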

Key assumptions to estimate PATE are included below:

Primarily, there are two key questions to address (for RCTs at least): (i) are there differences in the distributions of characteristics between the study sample and the population of the target country/geography, and (ii) are these characteristics effect modifiers [or, for single-arm trials with external controls, prognostic factors]?

One can test for differences in the distribution of covariates using mean differences of propensity scores, examining propensity score distributions, as well as formal diagnostic tests to identify the absence of overlap. Univariate standardized mean differences (and relevant tests) can subsequently be used to examine the drivers of overall differences. If only aggregate data are available, one may be limited to comparing differences in mean values.
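As a rough illustration of the univariate check, the sketch below computes standardized mean differences between a trial sample and a target population. The data are simulated and the covariate names (`age`, `ef`) are hypothetical; the pooled-standard-deviation formula is one common convention, not necessarily the one used in the paper.

```python
import numpy as np


def standardized_mean_difference(x_trial, x_target):
    """Difference in means divided by the pooled standard deviation."""
    x_trial = np.asarray(x_trial, dtype=float)
    x_target = np.asarray(x_target, dtype=float)
    pooled_sd = np.sqrt((x_trial.var(ddof=1) + x_target.var(ddof=1)) / 2)
    return (x_trial.mean() - x_target.mean()) / pooled_sd


rng = np.random.default_rng(0)
# Simulated (hypothetical) covariates: age and an indicator for comorbidity EF
trial = {"age": rng.normal(60, 10, 500), "ef": rng.binomial(1, 0.10, 500)}
target = {"age": rng.normal(65, 10, 2000), "ef": rng.binomial(1, 0.40, 2000)}

smd = {name: standardized_mean_difference(trial[name], target[name]) for name in trial}
for name, value in smd.items():
    print(f"{name}: SMD = {value:.2f}")
```

A commonly cited rule of thumb treats absolute values above roughly 0.1 as a sign of meaningful imbalance, though any threshold is a judgment call.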

To test if a variable is an effect modifier, the authors recommend the following approaches:

Parametric models with treatment-covariate interactions can be used to detect effect modification. Where small study samples result in power issues or where unknown functional forms increase the risk of model misspecification, machine learning techniques such as Bayesian additive regression trees could be considered, and the use of directed acyclic graphs may be particularly crucial for selecting effect modifiers in this case.
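The parametric approach can be sketched with a simple linear model containing a treatment-by-covariate interaction. This is only an illustration on simulated data: the variable names, the effect sizes and the choice of ordinary least squares with classical standard errors are all assumptions, not the authors' specification.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
treat = rng.binomial(1, 0.5, n)      # randomized treatment (1 = SuperDrug)
ef = rng.binomial(1, 0.3, n)         # candidate effect modifier
# Simulated outcome: the treatment effect is 2.0 without EF and 0.5 with EF
y = 1.0 + 2.0 * treat + 0.5 * ef - 1.5 * treat * ef + rng.normal(0, 1, n)

# Design matrix: intercept, treatment, covariate, treatment-by-covariate interaction
X = np.column_stack([np.ones(n), treat, ef, treat * ef])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Classical OLS standard errors
resid = y - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

interaction, interaction_se = beta[3], se[3]
print(f"interaction = {interaction:.2f} (SE {interaction_se:.2f})")
```

An interaction coefficient that is large relative to its standard error suggests that the covariate modifies the treatment effect on the scale of the model; with small trials such a test can easily be underpowered, which is the concern raised in the quote above.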

Approaches for adjusting for effect modifiers vary depending on whether a researcher has access to individual patient data (IPD).

With IPD: Use outcome regression-based methods, matching, stratification, inverse odds of participation weighting and doubly robust methods combining matching/weighting with regression adjustment.
Without IPD: Use population-adjusted indirect treatment comparisons (e.g., matching-adjusted indirect comparisons).
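To give a flavor of how a matching-adjusted indirect comparison reweights data, here is a minimal sketch of method-of-moments weights. It assumes IPD are available for the trial and only aggregate covariate means for the target population. The function name `maic_weights`, the simulated data and the target means are all hypothetical, and the Newton solver is just one way to do the optimization.

```python
import numpy as np


def maic_weights(x_ipd, target_means, n_iter=50):
    """Weights w_i = exp(x_i' a) chosen so weighted IPD means match target means."""
    xc = np.asarray(x_ipd, dtype=float) - np.asarray(target_means, dtype=float)
    a = np.zeros(xc.shape[1])
    for _ in range(n_iter):
        w = np.exp(xc @ a)
        grad = xc.T @ w                      # gradient of sum(exp(xc @ a))
        hess = xc.T @ (xc * w[:, None])      # Hessian of the same objective
        step = np.linalg.solve(hess, grad)
        # Backtracking: halve the step until the objective no longer increases
        t = 1.0
        while np.exp(xc @ (a - t * step)).sum() > w.sum() and t > 1e-8:
            t /= 2
        a = a - t * step
        if np.max(np.abs(t * step)) < 1e-10:
            break
    return np.exp(xc @ a)


rng = np.random.default_rng(2)
n = 1000
# Hypothetical trial IPD: age and an indicator for comorbidity EF
x_trial = np.column_stack([rng.normal(60, 10, n), rng.binomial(1, 0.10, n)])
target_means = np.array([65.0, 0.40])   # hypothetical published aggregates

w = maic_weights(x_trial, target_means)
weighted_means = (w[:, None] * x_trial).sum(axis=0) / w.sum()
ess = w.sum() ** 2 / (w ** 2).sum()     # approximate effective sample size
print("weighted means:", np.round(weighted_means, 3), "ESS:", round(ess, 1))
```

After weighting, the trial's covariate means line up with the target population's. The effective sample size shows how much information is lost in the process, and a very small value warns that the two populations overlap poorly.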

To determine which in-country data (typically real-world data) should be used as the target population, one could consider a variety of tools such as EUnetHTA’s REQueST or the Data Suitability Assessment Tool (DataSAT) from NICE.

You can read more recommendations on how to best validate transportability issues in the full paper here.