Patient care under uncertainty: Or why I learned to stop loving clinical practice guidelines

The National Cancer Institute defines personalized medicine as a form of medicine that “uses information about a person’s genes, proteins, and environment to prevent, diagnose, and treat disease”. In practice, however, treatment is rarely personalized to the individual, since the evidence needed to support complete personalization is rarely available. Personalization more often means care that varies with some individual characteristics. The President’s Council of Advisors on Science and Technology noted that personalized medicine really requires only “the ability to classify individuals into subpopulations that are uniquely or disproportionately susceptible to a particular disease or responsive to a specific treatment.”

Based on this premise, it appears we have a clear solution: patients should be divided into groups having the same observed covariates, and all patients within a given covariate group should receive the care that yields the highest within‐group mean welfare. This is the goal of evidence-based medicine: find the best treatment for each patient group and ensure physicians provide this care to patients within each group.

In what I believe will be a landmark paper, however, Charles Manski (2018) is skeptical of top-down efforts to enforce evidence-based medicine. Further, while numerous researchers (including those from the Dartmouth Atlas) have argued that the key to improving health care is to reduce unnecessary variation in health care provision, Manski argues that under uncertainty, variation (or, as he calls it, diversification) is vital to resolving real-world uncertainty. I encourage you to read the full paper, but I outline some key points below.

Guidelines are not mandates, but…

Although clinical practice guidelines are nominally just recommendations, in practice they may be much more binding. Health plans and government policies may only reimburse for care that follows practice guidelines. Thus, while the physicians writing the guidelines may clearly recognize that they are suggestions for best practice that need to be tailored to individual situations, bureaucrats from health insurance plans and the government may try to enforce them so as to avoid paying for care outside of practice guidelines.

The effectiveness of clinical guidelines depends on the information available only to physicians…

Manski considers a choice between two treatment options. In his example he examines the choice between active surveillance (option t=A) and aggressive treatment (option t=B), but these could equally be any two treatments. The probability that a treatment works depends on patient covariate vectors x and w, where x is observed by both the clinician and centralized decision-makers, but w is observed only by the physician. To the extent that the unobserved covariates w meaningfully affect treatment effectiveness (or the patient’s utility from a treatment), centralized evidence-based guidelines will be of limited use for treating individual patients; on the other hand, if w is empty and the central planner has access to the same information as the physician, then evidence-based guidelines will be useful. In practice, physicians most often have information about individual patients that is not available to guideline-makers, and thus physicians should, in principle, be able to treat patients better using their own discretion than by blindly following clinical practice guidelines.
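
To see why information seen only by the physician can matter, here is a minimal Python sketch. The success probabilities, the binary covariate w, and its 50/50 split are all made-up assumptions for illustration, not numbers from Manski's paper. A guideline that cannot see w must pick one treatment for everyone with the same x, while a physician who observes w can pick the better treatment for each type of patient.

```python
# success[t][w]: ASSUMED probability that treatment t works for a patient
# whose physician-only covariate takes the value w (all patients share the same x).
success = {
    "A": {0: 0.7, 1: 0.4},   # active surveillance
    "B": {0: 0.5, 1: 0.8},   # aggressive treatment
}
share_w = {0: 0.5, 1: 0.5}   # ASSUMED population shares of w = 0 and w = 1

def mean_success(treatment):
    """Average success rate if every patient in the group gets this treatment."""
    return sum(share_w[w] * success[treatment][w] for w in share_w)

# Guideline: one treatment for the whole covariate group, chosen without seeing w.
guideline_treatment = max(success, key=mean_success)
guideline_welfare = mean_success(guideline_treatment)

# Physician: sees w and picks the better treatment for each value of w.
physician_welfare = sum(
    share_w[w] * max(success[t][w] for t in success) for w in share_w
)

print(guideline_treatment, round(guideline_welfare, 2), round(physician_welfare, 2))
```

In this toy setup the physician does better only because w changes which treatment is best; if the same treatment were best for every value of w, the two approaches would give identical welfare.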

…and how well physicians follow guidelines conditional on available information

The Manski article notes that:

…empirical psychological research comparing evidence‐based statistical predictions with ones made by clinical judgment has concluded that the former consistently outperforms the latter when the predictions are made using the same patient covariates. The gap in performance persists even when clinical judgment uses additional covariates as predictors.

For instance, consider the case of a physician deciding whether or not to order a diagnostic test for a patient. Phelps and Mushlin (1988) showed that physicians can make this decision optimally if they know: (a) the expected patient utility of each treatment in the absence of testing; (b) the probability distribution of the test result; and (c) the expected patient utility of each treatment given knowledge of the test result. It is likely that physicians have only partial knowledge of these quantities.
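
The three ingredients above can be combined into a value-of-testing calculation. The Python sketch below is only an illustration: the prior, the test's sensitivity and specificity, and the utility table are hypothetical numbers I chose, and the two-state, two-treatment setup is a simplification rather than the model in Phelps and Mushlin (1988).

```python
# Hypothetical inputs: two health states, two treatments, one binary test.
prior_sick = 0.3                      # ASSUMED probability of illness before testing
sensitivity, specificity = 0.9, 0.8   # ASSUMED test characteristics

# utility[treatment][state]: ASSUMED patient utility of each (treatment, state) pair
utility = {
    "A": {"sick": 0.4, "healthy": 1.0},   # active surveillance
    "B": {"sick": 0.8, "healthy": 0.7},   # aggressive treatment
}

def expected_utility(treatment, p_sick):
    """Expected utility of a treatment given the probability of illness."""
    u = utility[treatment]
    return p_sick * u["sick"] + (1 - p_sick) * u["healthy"]

def best_expected_utility(p_sick):
    """Expected utility under the best treatment for a given belief."""
    return max(expected_utility(t, p_sick) for t in utility)

# (a) expected utility of the best treatment without testing
eu_no_test = best_expected_utility(prior_sick)

# (b) probability distribution of the test result
p_positive = prior_sick * sensitivity + (1 - prior_sick) * (1 - specificity)
p_negative = 1 - p_positive

# Posterior probability of illness after each result (Bayes' rule)
post_pos = prior_sick * sensitivity / p_positive
post_neg = prior_sick * (1 - sensitivity) / p_negative

# (c) expected utility when the treatment is chosen after seeing the result
eu_with_test = (p_positive * best_expected_utility(post_pos)
                + p_negative * best_expected_utility(post_neg))

# Testing is worthwhile only if this gain exceeds the (utility) cost of the test.
value_of_test = eu_with_test - eu_no_test
print(f"value of testing: {value_of_test:.3f}")
```

The gain from testing can never be negative in this setup, but it shrinks toward zero when the test result would not change the treatment choice, which is one reason partial knowledge of these inputs matters.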

Does this mean that despite theoretical evidence that clinical practice guidelines are inferior, in practice they may be superior?  Manski is still skeptical.

Psychologists have compared the accuracy of risk assessments and diagnoses made by statistical predictors and by clinicians, but they have not compared the accuracy of evaluations of patient preferences over (illness, treatment) outcomes. Thus, the literature has generated findings that may be informative about the accuracy of statistical and clinical assessments of P_xw(·) but not u_xw(·, ·).

In short, even if physicians do not fully leverage the information at hand and clinical guidelines may better predict patient outcomes, physicians still have an advantage if they better understand the preferences of individual patients (or classes of patients) over different outcomes or health states.

The evidence underlying clinical practice guidelines has its own limitations

Further, clinical practice guidelines are themselves imperfect. Guidelines are most often based on evidence from randomized controlled trials (RCTs). While RCTs are the gold standard for causal inference, they are imperfect. For instance, they typically do not capture long-run outcomes, the patient population for the trial may not be representative of patients who use the treatment in the real world, the trials may suffer from patient attrition, and treatment protocols in RCTs may not reflect real-world practices. Manski’s previous work has called applying RCT results to real-world, long-term outcomes “wishful extrapolation”.

Manski raises some other concerns as well. The first is that the hypothesis testing used for FDA approval picks a sample size to detect some clinically meaningful threshold effect. The hypothesis test is then applied assuming a 5% type I error rate and a 20% type II error rate. While this approach helps for regulatory purposes by applying a consistent standard, in some cases a type I error may be much worse than a type II error, and in other cases the reverse may be true. Second, the clinical trial journal articles upon which guidelines are based typically do not report outcomes by patient covariates, or if they do, they do so only for a limited set of covariates and hardly ever report outcomes for interactions of covariates. While reporting results by patient covariates is becoming more common, in the absence of such reporting the ability to apply personalized medicine based on clinical trial results is limited. The YODA Project advocates for sharing of clinical trial data, and having more investigators submit clinical trial data to open-source repositories would help to further the mission of personalized medicine.
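
To show how the 5% and 20% error rates feed into trial design, here is a minimal Python sketch of the textbook normal-approximation sample-size formula for comparing two means. It is a generic illustration, not the procedure of any particular trial or regulator; the two-sided test, the effect size `delta`, and the standard deviation `sigma` are assumptions of mine.

```python
from math import ceil
from statistics import NormalDist

def per_arm_sample_size(delta, sigma, alpha=0.05, power=0.80):
    """Per-arm n for a two-sided z-test comparing two means (normal approximation).

    delta: smallest clinically meaningful difference to detect (ASSUMED input)
    sigma: outcome standard deviation, taken as equal in both arms (ASSUMED input)
    alpha: type I error rate; power = 1 - type II error rate
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the type I error rate
    z_beta = z.inv_cdf(power)            # quantile matching the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)

# Hypothetical trial: detect a 5-point difference when the outcome SD is 10.
n = per_arm_sample_size(delta=5, sigma=10)
print(n)
```

Notice that the formula treats the two error rates as fixed conventions. Nothing in it reflects how costly each type of error would be for patients, which is the heart of Manski's complaint.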

What to do?

So both physicians and clinical guidelines are imperfect. We already knew that. What should we be doing?

Manski recommends applying a framework from decision theory under uncertainty. The clinician should first eliminate dominated treatment options. Then, they should choose among the undominated options. This sounds simple, but Manski notes that:

…there is no optimal way to choose among undominated alternatives. There are only various reasonable ways, each with its own properties.
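
The first step, eliminating dominated options, is easy to sketch in Python. The welfare table below is hypothetical: each action has an assumed welfare in each of three possible states of nature, and an action is dominated if some other action is at least as good in every state and strictly better in at least one.

```python
# welfare[action][i]: ASSUMED welfare of each action in possible state of nature i
welfare = {
    "A": [0.6, 0.6, 0.6],
    "B": [0.9, 0.5, 0.3],
    "C": [0.5, 0.4, 0.3],   # never better than A, so it should be eliminated
}

def dominates(w1, w2):
    """True if w1 is at least as good as w2 in every state and strictly better in one."""
    pairs = list(zip(w1, w2))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)

def undominated(welfare):
    """Return the actions that no other action dominates."""
    return [
        action for action, w in welfare.items()
        if not any(dominates(other_w, w)
                   for other, other_w in welfare.items() if other != action)
    ]

print(undominated(welfare))
```

Here C drops out, but A and B both survive: A is better in two states and B is better in one. Dominance alone cannot rank them, which is exactly where a further decision criterion is needed.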

Many statisticians recommend using a Bayesian approach. In this framework, the decision-maker places a subjective probability distribution on the unknown parameters and maximizes subjective expected utility. While this approach is attractive, how does one pick the subjective probability distribution, given that choosing it is itself a form of knowledge? Thus, Bayesian approaches are attractive for statisticians, but may be more difficult to implement in practice.
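
A small Python sketch shows why the choice of prior matters. The welfare table and both priors are hypothetical; the only point is that two decision-makers facing the same evidence can reach different choices purely because of their subjective distributions.

```python
# welfare[action][i]: ASSUMED welfare of each action in possible state of nature i
welfare = {
    "A": [0.6, 0.6, 0.6],
    "B": [0.9, 0.5, 0.3],
}

def bayes_choice(welfare, prior):
    """Pick the action with the highest subjective expected welfare."""
    expected = {
        action: sum(p * w for p, w in zip(prior, ws))
        for action, ws in welfare.items()
    }
    return max(expected, key=expected.get), expected

# Two ASSUMED subjective priors over the three states of nature.
optimistic = [0.6, 0.3, 0.1]    # puts most weight on the state where B shines
pessimistic = [0.2, 0.3, 0.5]   # puts most weight on the state where B does worst

print(bayes_choice(welfare, optimistic)[0])
print(bayes_choice(welfare, pessimistic)[0])
```

The optimistic prior selects B and the pessimistic prior selects A, so the recommendation is only as credible as the prior behind it.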

Other alternatives include the maximin (MM) and minimax-regret (MR) criteria.

The maximin criterion chooses an action that maximizes the minimum welfare that might possibly occur. The mini-max‐regret criterion considers each state of nature and computes the loss in welfare that would occur if one were to choose a specified action rather than the one that is best in this state. This quantity, called regret, measures the nearness to optimality of the specified action in the state of nature…To achieve a uniformly satisfactory result, he computes the maximum regret of each action; that is, the maximum distance from optimality that the action would yield across all possible states of nature. The MR criterion chooses an action that minimizes this maximum distance from optimality.
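
The two criteria described in the quote can be sketched in a few lines of Python. The welfare table is again hypothetical, and I chose the numbers so that the two criteria disagree.

```python
# welfare[action][i]: ASSUMED welfare of each action in possible state of nature i
welfare = {
    "A": [0.6, 0.6, 0.6],
    "B": [0.9, 0.5, 0.4],
}

def maximin(welfare):
    """Choose the action whose worst-case welfare is highest."""
    return max(welfare, key=lambda action: min(welfare[action]))

def minimax_regret(welfare):
    """Choose the action whose worst-case regret is lowest."""
    n_states = len(next(iter(welfare.values())))
    # Best achievable welfare in each state of nature.
    best = [max(w[s] for w in welfare.values()) for s in range(n_states)]
    # Regret = shortfall from the best action in that state; keep the maximum.
    max_regret = {
        action: max(best[s] - w[s] for s in range(n_states))
        for action, w in welfare.items()
    }
    return min(max_regret, key=max_regret.get), max_regret

print(maximin(welfare))
print(minimax_regret(welfare))
```

With these numbers maximin picks A because its worst case (0.6) beats B's (0.4), while minimax regret picks B because choosing A could forgo up to 0.3 in the first state, whereas choosing B forgoes at most 0.2. Both choices are defensible, which fits Manski's remark that there are only "various reasonable ways" to choose.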

While applying the MR criterion is useful in theory and may replicate some of physicians’ and patients’ own decision-making, an alternative approach is to use the empirical success (ES) rule. The empirical success rule simply chooses the treatment with the highest observed average outcome in the trial. Previous research has shown that ES generally chooses the same treatment option as MR.
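
As a minimal illustration, the ES rule reduces to comparing sample means across trial arms. The outcome data below are made up (1 = success, 0 = failure) and the function name is my own.

```python
from statistics import mean

# ASSUMED trial outcomes by arm: 1 = treatment success, 0 = failure
trial_outcomes = {
    "A": [1, 0, 1, 1, 0, 1, 1, 0],
    "B": [1, 1, 0, 1, 1, 1, 0, 1],
}

def empirical_success(outcomes):
    """Choose the arm with the highest observed average outcome."""
    averages = {arm: mean(values) for arm, values in outcomes.items()}
    return max(averages, key=averages.get), averages

choice, averages = empirical_success(trial_outcomes)
print(choice, averages)
```

The rule deliberately ignores whether the difference between arms is statistically significant; it simply acts on the point estimates.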

The benefits of variation in practice patterns

Manski argues that variation in practice patterns is good for two reasons. The first is diversification. If we knew with certainty that treatment A was better than treatment B, then everyone should use treatment A. In practice, there is uncertainty. Thus, if an RCT says A is better than B, but we later learn that this is not the case or that A is only superior for some patient populations, it is better for physicians to be able to access treatment B as an alternative option. As a simple case, suppose that treatment A is superior to B, but a later manufacturing problem has led A to become tainted and poisonous. Having access to treatment B is beneficial in this case.
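
One stylized way to see the diversification argument is to let a planner split patients between the two treatments and ask which split minimizes worst-case regret. The Python sketch below rests entirely on my own assumptions, not on anything in the post: mean welfare under A is known (`alpha`), mean welfare under B is only known to lie between `beta_low` and `beta_high`, and those bounds straddle `alpha`.

```python
def max_regret(delta, alpha, beta_low, beta_high):
    """Worst-case regret of assigning a fraction delta of patients to treatment B."""
    # If B is as good as it could be, regret comes from the share left on A.
    regret_if_b_best = (1 - delta) * (beta_high - alpha)
    # If B is as bad as it could be, regret comes from the share given B.
    regret_if_b_worst = delta * (alpha - beta_low)
    return max(regret_if_b_best, regret_if_b_worst)

# ASSUMED values: welfare under A is known, welfare under B is only bounded.
alpha, beta_low, beta_high = 0.6, 0.3, 0.8

# Search a grid of allocations for the one with the smallest worst-case regret.
grid = [i / 100 for i in range(101)]
delta_mr = min(grid, key=lambda d: max_regret(d, alpha, beta_low, beta_high))

print(delta_mr)
print(max_regret(0.0, alpha, beta_low, beta_high),   # everyone gets A
      max_regret(1.0, alpha, beta_low, beta_high),   # everyone gets B
      max_regret(delta_mr, alpha, beta_low, beta_high))
```

In this toy example the split allocation has a lower worst-case regret than giving everyone the same treatment. That is the sense in which uniform practice can be riskier than varied practice when the evidence is incomplete.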

A second benefit is learning. As the real-world sample size is likely to be larger than that of a clinical trial, one could use variation in practice patterns to monitor real-world outcomes. In the case of drug treatments, small RCTs may not be sufficient to discover rare but very severe side effects in either treatment arm. Thus, variation in practice patterns allows for this real-world learning.

What do others think of the article?

Commentaries on Manski’s paper from Emma McIntosh, Karl Claxton, and Anirban Basu are also available in the same issue of Health Economics. McIntosh summarizes the article and is generally supportive of Manski’s ideas, whereas Claxton, while agreeing with many of Manski’s principles, is more supportive of the need for guidelines in practice and of using Bayesian decision theory to create them. Basu mentions the use of passive personalization (see Basu et al. 2013), whereby physicians learn‐by‐doing and make individualized decisions for patients even in the absence of scientific evidence. Clinical practice guidelines could serve as a baseline for physicians, who then use their own experience to, in essence, update their prior based on their personal history with different treatments and patient characteristics.

