Value-based payment for providers is often predicated on being able to summarize physician quality in a single composite measure. For instance, Medicare’s Value-Based Payment Modifier (Value Modifier) combines a variety of individual quality metrics across domains to create a single quality score. Payment to physicians is adjusted based on a combination of physician quality and resource use.
The question remains, however, whether these composite scores do a good job of measuring quality. Martsolf, Carle, and Scanlon (2017) note that this may not always be the case:
“However, the creation of such global composite measures is not without risk. When multiple indicators measuring distinct aspects of quality are inappropriately combined into a single measure, the resulting composite measure is not useful or even completely uninterpretable. For example, when indicators measuring unrelated constructs are included in a single score, the high score on some indicators could “hide” low scores on other indicators or vice versa. In this case, the composite measure does not provide a clear quality signal. Inclusion of invalid composite measures could actually hurt quality reporting by leading to physician practice misclassification.”
To take a simple example, Physician A could be excellent at diagnosing a condition but poor at treating it, while Physician B could be excellent at treatment but poor at diagnosis. If this information were known to patients, and all patients went to Physician A for diagnosis and Physician B for treatment, each physician would be excellent at the care he or she actually provides, even though a composite score could rank both physicians as average. This example shows how a composite can mislead when quality is multidimensional. Quality metrics must also be reliable: they should accurately capture underlying physician quality when measured across a reasonable sample size of patients.
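The arithmetic behind this example can be sketched in a few lines of Python. The scores, the 0-100 scale, and the equal weighting are all hypothetical assumptions chosen for illustration; they are not taken from the paper or from any actual scoring rule.

```python
# Hypothetical domain scores on a 0-100 scale (illustrative, not real data).
scores = {
    "Physician A": {"diagnosis": 95, "treatment": 55},
    "Physician B": {"diagnosis": 55, "treatment": 95},
}


def composite(domain_scores):
    """Equal-weight average across domains (an assumed weighting scheme)."""
    return sum(domain_scores.values()) / len(domain_scores)


# One composite number per physician.
composites = {name: composite(s) for name, s in scores.items()}

# The best physician within each domain, which the composite cannot show.
best_by_domain = {
    domain: max(scores, key=lambda name: scores[name][domain])
    for domain in ("diagnosis", "treatment")
}

for name, value in composites.items():
    print(f"{name}: composite = {value}")
for domain, name in best_by_domain.items():
    print(f"Best at {domain}: {name}")
```

Under these assumed numbers both physicians receive the same composite score, even though each is clearly the better choice in one domain.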
While this argument is theoretical, the Martsolf, Carle, and Scanlon (2017) paper examines whether “HEDIS process indicators [can] be used to measure a single construct for the purpose of creating an internally valid global composite measure of physician practice quality.” The authors use physician quality scores from the Puget Sound Health Alliance’s (PSHA) Community Checkup scorecard. Their analytical approach was as follows:
We used measurement models (e.g., confirmatory factor analysis) to investigate the “dimensionality” of 19 specific physician practice quality indicators. In this case, dimensionality refers to the extent to which multiple indicators can be used to assess a single construct or multiple constructs. Specifically, the measurement model approach is used to assess the extent to which a single factor accounts for the observed covariance among indicators. Models that “fit well” do a good job of reproducing the observed covariance matrix.
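To get a feel for what it means for a single factor to account for the covariance among indicators, here is a rough simulation in Python. It is not the authors’ confirmatory factor analysis, only a much simpler check on pairwise correlations, and the number of practices, the noise level, and the indicator names are hypothetical. The idea it relies on is that a one-factor model implies every pair of indicators with nonzero loadings is correlated, so two internally coherent groups of indicators that are uncorrelated with each other cannot be reproduced by a single factor.

```python
import math
import random

random.seed(0)  # fixed seed so the simulation is repeatable


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


N_PRACTICES = 2000  # hypothetical number of practices

# Two independent latent quality dimensions (assumed for illustration).
factor_1 = [random.gauss(0, 1) for _ in range(N_PRACTICES)]
factor_2 = [random.gauss(0, 1) for _ in range(N_PRACTICES)]


def indicator(latent):
    """An observed indicator: the latent score plus measurement noise."""
    return [value + random.gauss(0, 0.5) for value in latent]


# Hypothetical indicators: a1 and a2 load on factor 1, b1 and b2 on factor 2.
indicators = {
    "a1": indicator(factor_1),
    "a2": indicator(factor_1),
    "b1": indicator(factor_2),
    "b2": indicator(factor_2),
}

within_a = pearson(indicators["a1"], indicators["a2"])
within_b = pearson(indicators["b1"], indicators["b2"])
across = pearson(indicators["a1"], indicators["b1"])

print(f"within group a: {within_a:.2f}")
print(f"within group b: {within_b:.2f}")
print(f"across groups:  {across:.2f}")
```

In this simulated setup the within-group correlations are high while the across-group correlation is close to zero, which is the kind of pattern a single-factor model fits poorly. A real analysis would fit the measurement model and examine fit statistics rather than inspect correlations by eye.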
Using this measurement-model approach, the authors’ results “…did not support the psychometric validity of a single unidimensional composite.” These results have important implications. Although many payers and researchers have argued for a single quality measure for physicians and other providers, in practice this single measure may work poorly. Thus, tying reimbursement to quality is problematic, at least if quality is measured using HEDIS process measures. In the words of the authors:
Our results may call into question efforts to create and use single unidimensional measures of physician practice quality, as using such measures can lead to spurious conclusions about quality by hiding important aspects of quality and to increased physician misclassification by exacerbating the measurement error inherent in any given measure. Particularly, performance on an invalid global measure of physician practice quality may obscure practices’ performance on more specific areas of clinical care.
For more details, do read the whole article.
- Martsolf, Grant R., Adam C. Carle, and Dennis P. Scanlon. “Creating Unidimensional Global Measures of Physician Practice Quality Based on Health Insurance Claims Data.” Health Services Research 52, no. 3 (2017): 1061-1078.