How do you evaluate treatment efficacy and safety outside of the clinical trial setting? This is not just a question of academic interest. In last week’s JAMA, Rubin 2021 writes about some of the challenges of evaluating COVID-19 vaccines outside of a clinical trial setting.
In an interesting two-day workshop last week, the Duke Margolis Center for Health Policy attempted to begin to answer this question. The title of the workshop was “Evaluating RWE from Observational Studies in Regulatory Decision-Making: Lessons Learned from Trial Replication Analyses.”
While there was a lot of content covered, there are three main points I’d like to highlight: (i) use caution when dealing with prevalent rather than incident users, (ii) increase transparency, and (iii) RWE and RCT results may differ even if RWE results are valid.
Using incident vs. prevalent patients
In a typical clinical trial, individuals are assigned to different treatment arms and given the relevant medication, typically either the treatment or a placebo. In the real world, however, one simple way to create “arms” is to assign people who are using the treatment to one arm and people who are not using the treatment to another arm [this “assignment” is done in an analytic sense, not in the real world]. If one looks at people who initiate the treatment (incident users), real-world data can be very useful. One must still consider whether there is systematic bias in how doctors prescribe the treatment, but based on Miguel Hernán’s talk on Day 1 of the webinar, identifying incident patients is key.
Another approach would be to compare people using the treatment with those who are not, without requiring them to be incident users. The problem with this approach is that the sample of prevalent users becomes more biased over time. Let’s say that the new treatment works well for half the users and poorly for the other half. The people for whom the treatment did not work would likely stop using it, so eventually you would be left only with individuals for whom the treatment worked. The sample would therefore be biased, and you would overestimate the health benefits of initiating treatment.
Thus, two tips for researchers would be: (i) focus on incident users, and (ii) identify the potential direction of any bias in treatment assignment of incident users and conduct sensitivity analysis to try to bound this bias.
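To make the prevalent-user problem concrete, here is a minimal sketch in Python. All of the numbers are illustrative assumptions rather than figures from the workshop: half of initiators are "responders" who gain 10 points on some outcome, the other half gain nothing, and non-responders stop treatment at a much higher monthly rate. The function name and its parameters are hypothetical.

```python
def mean_benefit_among_current_users(
    months,
    share_responders=0.5,           # assumed: half of initiators respond
    benefit_responder=10.0,         # assumed benefit for responders
    benefit_nonresponder=0.0,       # assumed benefit for non-responders
    monthly_stop_responder=0.05,    # assumed monthly discontinuation rate
    monthly_stop_nonresponder=0.30, # assumed: non-responders quit faster
):
    """Average benefit among people still on treatment after `months`."""
    # Fraction of the original cohort still treated, by responder status.
    on_resp = share_responders * (1 - monthly_stop_responder) ** months
    on_non = (1 - share_responders) * (1 - monthly_stop_nonresponder) ** months
    total_benefit = on_resp * benefit_responder + on_non * benefit_nonresponder
    return total_benefit / (on_resp + on_non)


# Month 0 corresponds to incident users: everyone who initiated treatment.
incident_estimate = mean_benefit_among_current_users(0)

for months in (0, 6, 12, 24):
    print(months, round(mean_benefit_among_current_users(months), 2))
```

Under these assumptions, the average benefit among people still on treatment drifts upward the longer one waits to sample them, even though the benefit of initiating treatment never changes. That gap is the overestimate described above.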
Increasing transparency
When clinical trials are done, they are typically registered at ClinicalTrials.gov. For real-world studies, authors should also publish their study protocols ahead of time and document any deviations from those protocols. Doing this will not only increase the transparency of how real-world evidence is being used, but will also increase the credibility of the study. A number of efforts are under way in this area, such as RCT Duplicate, the Observational Patient Evidence for Regulatory Approval and uNderstanding Disease (OPERAND) Project, and work by the Yale University-Mayo Clinic Center of Excellence in Regulatory Science and Innovation (CERSI).
Why RCT and RWE studies may vary
On Day 2 of the workshop, Michele Jonsson-Funk of the UNC Gillings School of Public Health provided a nice overview of some common reasons why RWE and RCT results may differ even if the RWE results themselves are not biased.
- Random error. The first one is the most obvious. Even if the causal parameter of interest in the population is identical in the RWE and RCT, the estimates may differ in the samples collected simply because of statistical error and random noise. For instance, RWE studies often have larger sample sizes, so the causal effect may be more precisely estimated (if the RWE study design is well done).
- Answering different questions. In RCTs, one may be able to estimate the causal effect on the population of interest. In RWE, uptake of the new treatment may be more common for a specific subgroup. Thus, RCTs may often estimate the causal effect for the population as a whole, whereas the RWE data may estimate the treatment effect on the treated. The latter is a completely valid causal estimate, but one must note that it differs from the causal effect for the population as a whole.
- Different baseline hazard rates. In clinical trials, individuals with multiple comorbidities are often excluded, whereas that is not the case in the real world. On the other hand, some trials focus on later lines of therapy, where individuals may have higher baseline hazards. Either way, the baseline risk, and thus the baseline number of bad events (e.g., deaths, hospitalizations), may differ between the trial and real-world populations. Even if the relative treatment effect is identical in the trial and the real world, the absolute impact will differ; or the absolute impacts may be similar while the relative impacts differ. Either way, different baseline event rates will complicate direct comparisons between trial and real-world data.
- Different treatment. In the clinical trial, dosing is done very systematically. In the real world, dosing may be more flexible (e.g., patients may switch therapies more often). Thus, the “treatment” being evaluated may differ between the trial and the real world. For instance, in the COVID-19 vaccine trials, second doses were given shortly after the first; in the real world, many countries are postponing second doses.
- Outcome differences. Real-world data may have much longer (or shorter) follow-up compared to the trial. Further, if one is using claims data, people can disenroll from health insurance for reasons that may or may not be related to the treatment of interest; dropping out of a clinical trial may occur for very different reasons. Dr. Jonsson-Funk also noted that competing risks may differ between the trial and the real world.
- Adherence differs. As is well known, treatment adherence is typically much worse in the real world. Patients often have cost-sharing burdens, visits to the clinic are less frequent, and the motivation of real-world patients may be lower than that of patients in clinical trials. My own research notes the low real-world adherence levels, particularly for patients with multiple chronic conditions.
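The baseline-risk point in the list above is easy to see with a little arithmetic. The sketch below uses simple event risks over a fixed follow-up period rather than hazards, and the numbers (a 10% baseline risk in the trial, 20% in the real world, a relative risk of 0.8) are made up for illustration. The function names are hypothetical.

```python
def absolute_risk_reduction(baseline_risk, relative_risk):
    """Events avoided per person treated, given the untreated risk and the relative risk."""
    return baseline_risk * (1.0 - relative_risk)


def implied_relative_risk(baseline_risk, risk_reduction):
    """Relative risk implied by a given absolute risk reduction."""
    return (baseline_risk - risk_reduction) / baseline_risk


# Assumed baseline (untreated) event risks over the same follow-up period.
trial_baseline = 0.10
real_world_baseline = 0.20

# Case 1: same relative effect in both settings, different absolute effect.
same_rr = 0.8
arr_trial = absolute_risk_reduction(trial_baseline, same_rr)
arr_real_world = absolute_risk_reduction(real_world_baseline, same_rr)
print("ARR trial:", round(arr_trial, 3), "ARR real world:", round(arr_real_world, 3))

# Case 2: same absolute effect in both settings, different relative effect.
same_arr = 0.02
rr_trial = implied_relative_risk(trial_baseline, same_arr)
rr_real_world = implied_relative_risk(real_world_baseline, same_arr)
print("RR trial:", round(rr_trial, 3), "RR real world:", round(rr_real_world, 3))
```

In the first case the real-world population, with twice the baseline risk, sees twice the absolute benefit from the same relative effect. In the second case an identical absolute benefit translates into a weaker relative effect in the higher-risk population. Neither result implies that either study is biased.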
The workshop videos are now being posted online and I encourage those of you interested to take a look. Interesting throughout.