CER Econometrics

Use of expert elicitation for health care modelling studies

While robust data is preferred when parameterizing economic models, this information is not always available. For instance, it may be unclear to what degree clinical trial results would translate to the real world. Or clinical trial results may need to be extrapolated beyond the trial’s time horizon. Or there may simply be no data available for certain parameters of interest.

In many cases, modelers turn to expert elicitation to help parameterize the model. Sometimes this is done informally (e.g., through interviews); in other cases, structured expert elicitation is used (e.g., a Delphi panel or the SHELF method).

Cadham et al. (2022) conducted a systematic literature review of health care models that used expert elicitation methods. Here is what they found:

  • Frequency of structured expert elicitation. Of the 152 studies identified, 40 used a structured expert elicitation approach, whereas 112 used an informal or indeterminate method. Clinical experts (n=36) were the most common source of input among studies using an unstructured or indeterminate elicitation approach.
  • How were experts selected? The selection method was unreported in 20 of the 40 studies. Eight studies used convenience sampling and 4 used publication records to identify experts. Other methods included snowball sampling, involvement in the product’s clinical trial, or expertise in a specific country. The number of experts ranged from 1 to 30.
  • Elicitation methods. The most frequent methods were a Delphi or modified Delphi process (n=10) and SHELF (n=2). Most elicitations (n=13) were conducted virtually or by phone, while 5 were conducted in person. Web- or spreadsheet-based tools were used for data collection in 9 studies. “Eleven studies elicited point estimates from experts. Of those, the most frequent estimates were the median (n = 5), mean (n = 3), and most likely (n = 3) values. Expert uncertainty was elicited by asking experts to provide ranges or different quantiles in 7 studies, whereas 7 other studies used the histogram method or a modified histogram method (in which experts mark a frequency chart) to elicit expert uncertainty.” (A sketch of the histogram approach appears after this list.)
  • Aggregation. Some studies (n=4) required experts to reach a consensus together (i.e., behavioral aggregation), while the other studies used mathematical aggregation, such as linear pooling (n=18); a sketch of linear pooling also appears after this list. “Performance measures, in which the experts’ estimates are weighted based on their responses to seed variables (also known as calibration questions) with answers known to the facilitator but unknown or not readily available to the experts, were only used in 2 studies.”
  • Uncertainty. Expert uncertainty was most commonly incorporated into the models using one-way or probabilistic sensitivity analysis (n=26), but in some studies (n=3) responses from each expert were simulated independently and then aggregated (see the final sketch below).
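
As a concrete illustration of the histogram (sometimes called “roulette”) method mentioned above, the sketch below converts an expert’s chip allocation across value bins into a discrete probability distribution. The bins, chip counts, and parameter are entirely hypothetical, purely for illustration.

```python
import numpy as np

# Hypothetical bins for some probability parameter (e.g., a 1-year event rate)
bin_edges = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5])

# Hypothetical chip allocation: the expert places 20 chips across the 5 bins,
# with more chips indicating a more likely range of values
chips = np.array([1, 6, 8, 4, 1])

probs = chips / chips.sum()                      # normalize chips to probabilities
bin_mids = (bin_edges[:-1] + bin_edges[1:]) / 2  # represent each bin by its midpoint

mean_estimate = np.sum(probs * bin_mids)         # mean implied by the elicitation
cdf = np.cumsum(probs)
median_mid = bin_mids[np.searchsorted(cdf, 0.5)] # midpoint of the bin holding the median

print(f"Implied mean: {mean_estimate:.3f}; median bin midpoint: {median_mid:.2f}")
```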
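
Linear pooling, the most common mathematical aggregation approach in the review, forms a weighted mixture of the individual expert distributions: f(x) = Σᵢ wᵢ fᵢ(x). Here is a minimal sketch assuming three hypothetical experts whose elicited beliefs have been fit to beta distributions; equal weights are used where a performance-weighted scheme would instead derive the weights from seed (calibration) questions.

```python
import numpy as np
from scipy import stats

# Hypothetical elicited distributions for a probability parameter,
# one per expert (e.g., fitted from elicited ranges or histograms)
experts = [stats.beta(2, 8), stats.beta(3, 6), stats.beta(4, 10)]

# Equal weights here; performance weighting would base these on each
# expert's accuracy on seed (calibration) questions instead
weights = np.full(len(experts), 1 / len(experts))

# The pooled density is the weighted mixture of the expert densities
x = np.linspace(0, 1, 501)
pooled_pdf = sum(w * e.pdf(x) for w, e in zip(weights, experts))

# To sample from the pooled distribution: pick an expert by weight,
# then draw a value from that expert's distribution
rng = np.random.default_rng(42)
idx = rng.choice(len(experts), size=10_000, p=weights)
draws = np.array([experts[i].rvs(random_state=rng) for i in idx])

print(f"Pooled mean ≈ {draws.mean():.3f}")
```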
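
And for the last bullet, a sketch of how elicited uncertainty might flow through a model: a probabilistic sensitivity analysis is run separately for each expert’s distribution, and the resulting outputs are then aggregated. The toy cost model and distributions are hypothetical, not taken from the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Same hypothetical expert distributions as in the pooling sketch above
experts = [stats.beta(2, 8), stats.beta(3, 6), stats.beta(4, 10)]

def toy_model(p_event: float) -> float:
    """Stand-in for an economic model: expected cost given an event probability."""
    return 10_000 * p_event + 500  # hypothetical cost function

n_draws = 5_000

# Simulate each expert's responses independently (one PSA per expert)...
per_expert_outputs = [
    np.array([toy_model(p) for p in e.rvs(size=n_draws, random_state=rng)])
    for e in experts
]

# ...then aggregate the model outputs across experts
pooled = np.concatenate(per_expert_outputs)
lo, hi = np.percentile(pooled, [2.5, 97.5])
print(f"Mean cost ≈ {pooled.mean():,.0f}; 95% interval ({lo:,.0f}, {hi:,.0f})")
```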

More details from the full study are here.