A paper by Thomas, Grazier, and Ward (2004) compares six risk adjustment software products. Using each product to calculate physician efficiency scores, they found that “moderate to high levels of agreement were observed among the six risk-adjusted measures of practice efficiency.” However:
“And even though our analyses suggest that 50 percent to 60 percent of adult PCPs identified by their system as being high outliers are likely to be identified by other profiling systems as well, the client has no way to know which of the identified outliers are the ones that multiple systems would agree on. Thus the profiling client must deal with practice efficiency rankings knowing that, in all likelihood, 40 percent to 50 percent of PCPs identified as high outliers are actually not among the least efficient 10 percent of primary care physicians.”
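To make the notion of agreement on high outliers concrete, here is a minimal Python sketch. The physician IDs, the scores, and the helper names `high_outliers` and `overlap_share` are hypothetical illustrations, not taken from the paper. The sketch assumes a higher score means a less efficient practice and flags the top 10 percent under each system.

```python
def high_outliers(scores, fraction=0.10):
    """Return the IDs of the physicians in the top `fraction` of scores."""
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def overlap_share(outliers_a, outliers_b):
    """Share of system A's outliers that system B also flags."""
    return len(outliers_a & outliers_b) / len(outliers_a)


# Hypothetical efficiency scores for 20 physicians under two profiling systems.
system_a = {f"P{i}": float(i) for i in range(20)}
system_b = dict(system_a)
# The two systems disagree about physicians P10 and P18.
system_b["P10"], system_b["P18"] = system_a["P18"], system_a["P10"]

outliers_a = high_outliers(system_a)  # {"P18", "P19"}
outliers_b = high_outliers(system_b)  # {"P10", "P19"}
print(f"{overlap_share(outliers_a, outliers_b):.2f}")  # 0.50
```

In this toy case, half of the physicians flagged by the first system are also flagged by the second. A client who sees only the first system's list cannot tell which half that is.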
The authors also compare two efficiency score metrics. The first is the ratio of the physician’s observed cost to the expected cost based on the risk scores of the physician’s patients. The O/E score is equal to:

$$\text{O/E}_k = \frac{y_k}{Y_k}$$
Above, $y_k$ is physician $k$’s observed cost and $Y_k$ is their expected cost. The authors believe, however, that the O/E score is not ideal: it is biased against providers with few patients in the sample, so physicians with smaller patient panels in the data set are more likely to be flagged as outliers. The authors instead advocate a standardized cost difference (SCD). The SCD is calculated as follows:
The SCD measure explicitly takes into account the physician’s sample size. A large sample size will move the SCD more towards the difference in observed and expected costs; a small sample size will move the SCD score closer to the mean of 0.
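The contrast between the two metrics can be illustrated with a small Python sketch. The `standardized_difference` function below is an assumed z-score-style stand-in (mean cost difference divided by its standard error), not the paper's exact SCD formula. The per-patient standard deviation `SIGMA` and the cost figures are likewise hypothetical.

```python
import math


def oe_ratio(observed, expected):
    """Observed-to-expected cost ratio for one physician's panel."""
    return sum(observed) / sum(expected)


def standardized_difference(observed, expected, sigma):
    """Assumed z-score-style measure (not the paper's formula):
    mean (observed - expected) cost divided by sigma / sqrt(n)."""
    n = len(observed)
    mean_diff = (sum(observed) - sum(expected)) / n
    return mean_diff / (sigma / math.sqrt(n))


SIGMA = 500.0  # hypothetical per-patient standard deviation of cost

# Two physicians whose patients each cost 1200 against an expected 1000.
small_obs, small_exp = [1200.0] * 4, [1000.0] * 4
large_obs, large_exp = [1200.0] * 100, [1000.0] * 100

print(f"{oe_ratio(small_obs, small_exp):.2f}")  # 1.20
print(f"{oe_ratio(large_obs, large_exp):.2f}")  # 1.20
print(f"{standardized_difference(small_obs, small_exp, SIGMA):.2f}")  # 0.80
print(f"{standardized_difference(large_obs, large_exp, SIGMA):.2f}")  # 4.00
```

Both physicians have the same O/E ratio, so the ratio alone cannot distinguish a 4-patient panel from a 100-patient panel. Under the assumed standardization, the small panel's score stays close to 0 while the large panel's score grows, which matches the qualitative behavior described for the SCD.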
Below is a list of the six risk adjustment tools used in the paper:
- Adjusted Clinical Groups from Johns Hopkins University. Adjusted clinical groups cluster health plan members having similar comorbidities into groups that have similar resource requirements and clinical characteristics. The ACG Case-Mix System then uses a branching algorithm to place each patient into one of 82 discrete, mutually exclusive categories based on the mix of clinical groups experienced during the time period under study.
- Burden of Illness Score from MEDecision, Inc. This system is based on MEDecision’s Practice Review System (PRS), which partitions care into episodes of illness and assigns services, severity levels, and medications to these episodes. The BOI Score is a linear-scaled measure that indicates relative health care cost risks associated with the particular mix of episodes experienced by a patient during a defined time period.
- Clinical Complexity Index from Solucient, Inc. The CCI methodology considers age, severity, comorbidity, hospital admissions, and categories of diagnoses (acute, chronic, mental health, and pregnancy) to assign patients into mutually exclusive CCI risk categories. Although the system provides for 1,418 different categories, 95 percent of patients fall into just 45 of these.
- Diagnostic Cost Groups from DxCG, Inc. The DCG system includes a whole family of multiple linear regression models.
- Episode Risk Groups from Symmetry Health Systems, Inc. Like BOI Score, ERGs are episode-based. The episodes underlying ERGs are created using Symmetry’s Episode Treatment Groups (ETG) methodology, a basic illness classification system that uses a series of clinical and statistical algorithms to combine related services into more than 600 mutually exclusive and exhaustive categories. For a given patient, episodes experienced during a time period are mapped into 119 Episode Risk Groups, and then a risk score is determined based on age, gender, and mix of ERGs. The paper’s analyses use the ERG retrospective risk score.
- General Diagnostic Groups from Allegiance LLC. General Diagnostic Groups were developed using the Agency for Health Care Policy and Research’s Clinical Classification Software (CCS). CCS aggregates individual ICD-9-CM codes identified on health care claims into 260 broad diagnosis categories for statistical analysis and reporting. The GDG system then lumps together CCS categories considered to be clinically similar and to have similar associated per-patient charges into 57 diagnostic categories. These 57 diagnostic categories are used as dummy variables in a multiple regression model for predicting health care costs.
- J. William Thomas, Kyle L. Grazier, and Kathleen Ward (2004). “Comparing Accuracy of Risk-Adjustment Methodologies Used in Economic Profiling of Physicians.” Health Services Research, 39(4): 985-1004.