Adjusting Nursing Home Quality Measures

The Nursing Home Compare website provides consumers with quality ratings of thousands of nursing homes (NHs) around the country. Are these ratings accurate? Could they be improved?

These are the questions that Arling, Lewis, Kane, Mueller and Flood analyze in their 2007 HSR paper. The authors find two major flaws in the rankings: 1) the risk adjustment is weak, so the ratings do not fully account for the underlying characteristics of the population each NH serves, and 2) the rankings include no measures of precision.

In order to improve the rankings, the authors use an empirical Bayesian (EB) shrinkage model with risk adjustment.

In the empirical Bayesian model, the prior is estimated from the data rather than specified in advance. The observed data enter through the likelihood, and the posterior distribution is proportional to the product of the prior and the likelihood. Confidence intervals around the EB estimates are constructed from the posterior distribution. In this paper, the authors have data at both the resident and facility level. The prior distribution is estimated from the total nursing home resident population, the likelihood is based on facility-level data, and the posterior combines the two. The authors explain in more detail:

“The influence of the facility’s observed QM [quality measure] rate on the posterior estimate will depend on the size of the facility and the amount of QM variation within and between facilities. The QM rates in larger facilities will be more certain (e.g., have lower standard errors) than in smaller facilities and, thus, will have greater weight or influence on the overall posterior (EB) estimate. Also, QMs with less variation between facilities have a more certain empirical prior (population average QM rate), which then has a greater influence on the posterior. As the prior tends to pull the posterior estimate toward the population mean, EB estimates are referred to as ‘shrinkage’ estimates.”

Using the EB methodology, the standard deviations for most QMs “decreased considerably.” Smaller facilities experienced more shrinkage towards the mean because they have few residents. This is logical, since one outlier patient has a much larger impact on the average QM rate in a NH with 10 residents than in a facility with 100 residents.
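To make the shrinkage idea concrete, here is a minimal sketch in Python. It is not the authors' multilevel model: it assumes a simple method-of-moments estimator with a binomial sampling variance, and the function name eb_shrink and the four facilities are invented for illustration.

```python
from statistics import mean

def eb_shrink(events, residents):
    """Shrink each facility's observed rate toward the population rate.

    Simplified method-of-moments sketch (an assumption for illustration,
    not the multilevel model used in the paper).
    """
    rates = [e / n for e, n in zip(events, residents)]
    pop_rate = sum(events) / sum(residents)  # empirical prior mean
    # Sampling variance of each observed rate: larger facilities are more certain.
    samp_var = [pop_rate * (1 - pop_rate) / n for n in residents]
    # Between-facility variance: spread of rates beyond what sampling noise explains.
    total_var = mean([(r - pop_rate) ** 2 for r in rates])
    between_var = max(total_var - mean(samp_var), 0.0)
    # Weight on the facility's own rate; the remainder goes to the prior mean.
    weights = [between_var / (between_var + v) for v in samp_var]
    shrunk = [w * r + (1 - w) * pop_rate for w, r in zip(weights, rates)]
    return pop_rate, weights, shrunk

# Hypothetical data: two small and two large facilities.
events = [4, 40, 1, 25]
residents = [10, 100, 10, 100]
pop_rate, weights, shrunk = eb_shrink(events, residents)
for n, e, w, s in zip(residents, events, weights, shrunk):
    print(f"n={n:3d} observed={e / n:.3f} weight={w:.3f} EB estimate={s:.3f}")
```

The first two facilities have the same observed rate, but the 10-resident facility gets a much smaller weight on its own data and so ends up closer to the population rate than the 100-resident facility.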

The risk adjustment is calculated in three ways: 1) simply excluding the sickest patients (e.g., those with end-stage disease or in a coma), 2) grouping the sample into different risk strata, and 3) using a logistic regression to estimate a risk adjustment factor for each patient. Each of the risk adjustment methods was found to have a strong effect on the rankings.
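As a rough illustration of the first and third approaches, the sketch below excludes the sickest residents and then compares observed outcomes with the number expected from a logistic model. The coefficients, covariate names and the observed-to-expected scaling are assumptions made for this example; the paper's actual model specification is not reproduced here.

```python
import math

# Hypothetical logistic coefficients; in practice these would be fitted
# on the full resident population.
INTERCEPT = -2.0
COEFS = {"age_over_85": 0.5, "bedfast": 1.2}

def predicted_risk(resident):
    """Predicted probability of the adverse outcome for one resident."""
    z = INTERCEPT + sum(COEFS[k] * resident.get(k, 0) for k in COEFS)
    return 1.0 / (1.0 + math.exp(-z))

def risk_adjusted_rate(residents, population_rate):
    """Observed / expected outcomes, scaled by the population rate."""
    # Approach 1: drop residents in a coma or with end-stage disease.
    eligible = [r for r in residents
                if not r.get("comatose") and not r.get("end_stage")]
    observed = sum(r["outcome"] for r in eligible)
    # Approach 3: expected count from each resident's predicted risk.
    expected = sum(predicted_risk(r) for r in eligible)
    return observed / expected * population_rate

# Two hypothetical facilities with the same raw rate (1 event in 4 residents).
low_risk = [{"outcome": 1}, {"outcome": 0}, {"outcome": 0}, {"outcome": 0}]
high_risk = [dict(r, bedfast=1) for r in low_risk]
adj_low = risk_adjusted_rate(low_risk, population_rate=0.15)
adj_high = risk_adjusted_rate(high_risk, population_rate=0.15)
# An excluded comatose resident does not affect the adjusted rate.
adj_high_excl = risk_adjusted_rate(
    high_risk + [{"outcome": 1, "comatose": True}], population_rate=0.15)
print(f"low-risk facility: {adj_low:.3f}, high-risk facility: {adj_high:.3f}")
```

Both facilities have the same raw rate, but the one serving higher-risk residents receives a lower adjusted rate because more adverse outcomes were expected there.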

One problem the authors acknowledge is that using EB and risk adjustment may let some facilities ‘off the hook.’ Small facilities with sicker-than-average patients may have a poor QM score because of an unlucky spate of ill patients, or they may truly be poor facilities. Bayesian shrinkage moves their scores closer to the mean, so these facilities’ QM ratings are less responsive to quality improvements or backslides than those of larger facilities.

Arling, Lewis, Kane, Mueller and Flood (2007) “Improving Quality Assessment through Multilevel Modeling: The Case of Nursing Home Compare” Health Services Research, 42:3, pp. 1177-1199.