Pay-for-performance (P4P) has long been touted as a means to improve quality. However, since Holmstrom and Milgrom's (1991) paper on multitasking, it has been known that compensating individuals on one measured dimension can lead them to shift effort away from unmeasured dimensions. For instance, if a mortgage broker is compensated only for the number of new mortgages he secures and not for the creditworthiness of the borrowers, he will likely bring in borrowers with bad credit. In the healthcare setting, paying doctors to perform certain tests (e.g., testing A1C levels) may increase the probability that a doctor conducts the A1C test for diabetics, but may decrease the amount of time the physician dedicates to counseling the patient to lose weight or stop smoking.
A paper by Glazer, McGuire and Normand (2008) tries to remedy this problem. Take a look at the following table. Discharges one and two are observable and can be measured, whereas discharge three cannot. For instance, discharges one and two could represent patient mortality for different types of cardiac operations. By contrast, “Discharge Three can be thought of as representing medical discharges associated with Skin, Subcutaneous Tissue, and Breast Disorders, for which in-hospital mortality is very low, that mortality would not be a feasible (or even valid) measure of quality.”
How should the overall quality score be weighted between outcomes one and two? The authors of this paper claim that more weight should be placed on outcome one. Why?
Discharges one and three use the same inputs. Thus, putting more weight on discharge one will compel the hospital to increase the inputs that improve outcomes for discharge one. Since discharges one and three share inputs, quality for discharge three will improve as well. For instance, an increase in nursing staff or computerized records may increase productivity for multiple observed and unobserved outcomes. On the other hand, if discharge two depends on the purchase of a machine that tests for only one condition, less weight should be placed on discharge two, since there are fewer spillovers.
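To make the spillover logic concrete, here is a minimal sketch of a toy model. It is not the model in Glazer, McGuire and Normand (2008); the square-root production functions, the single `unit_cost`, and the names `hospital_response`, `x_shared` and `x_specific` are all assumptions made for illustration. The hospital is rewarded on a weighted score of the two measured outcomes and chooses a shared input (think nursing staff) and a discharge-two-specific input (think the single-condition machine).

```python
from dataclasses import dataclass


@dataclass
class Qualities:
    q1: float  # measured, produced by the shared input
    q2: float  # measured, produced by the discharge-specific input
    q3: float  # unmeasured, produced by the same shared input as q1


def hospital_response(w1: float, w2: float, unit_cost: float = 0.5) -> Qualities:
    """Hospital's best response in a toy model (an assumption, not the paper's).

    The hospital picks x_shared and x_specific to maximize
        w1 * sqrt(x_shared) + w2 * sqrt(x_specific)
        - unit_cost * (x_shared + x_specific).
    The first-order conditions give x = (w / (2 * unit_cost)) ** 2 per input.
    """
    x_shared = (w1 / (2 * unit_cost)) ** 2
    x_specific = (w2 / (2 * unit_cost)) ** 2
    return Qualities(q1=x_shared ** 0.5, q2=x_specific ** 0.5, q3=x_shared ** 0.5)


# Shift weight from outcome two to outcome one and watch unmeasured q3.
for w1 in (0.3, 0.5, 0.7):
    result = hospital_response(w1, 1 - w1)
    print(f"w1={w1:.1f}  q1={result.q1:.2f}  q2={result.q2:.2f}  q3={result.q3:.2f}")
```

In this sketch the unmeasured quality `q3` rises one-for-one with `q1` as weight moves toward outcome one, while extra weight on outcome two buys nothing for discharge three. Real production functions will be messier, but the direction of the effect is the point.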
A necessary condition for this type of measurement to work is that all inputs must be used in at least one of the observable discharge types. Further complications arise from the fact that not all providers use the same inputs to treat patients with the same disease. Also, “…the existing evidence supporting commonality is too general to be usable yet as a basis for modifying profile construction.”
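The necessary condition can be stated as a simple coverage check. The sketch below assumes we could list which inputs each discharge type uses; the input names and the `uncovered_inputs` helper are hypothetical, and in practice this mapping is exactly what the authors say is not yet known precisely enough.

```python
def uncovered_inputs(inputs_by_discharge, measured):
    """Return inputs that no measured discharge type uses.

    An empty result means the necessary condition holds: every input
    appears in at least one observable discharge type.
    """
    measured_inputs = set()
    for discharge in measured:
        measured_inputs |= set(inputs_by_discharge[discharge])
    all_inputs = set()
    for inputs in inputs_by_discharge.values():
        all_inputs |= set(inputs)
    return all_inputs - measured_inputs


# Hypothetical mapping that mirrors the three-discharge example.
inputs_by_discharge = {
    "discharge_1": {"nursing_staff", "computerized_records"},
    "discharge_2": {"single_condition_machine"},
    "discharge_3": {"nursing_staff", "computerized_records"},
}
measured = ["discharge_1", "discharge_2"]

print(uncovered_inputs(inputs_by_discharge, measured))
```

If discharge three also relied on an input that neither measured discharge uses, that input would show up in the result, and no reweighting of outcomes one and two could reward it.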
Nevertheless, thinking about how quality improvements can spill over to other treatments is an important framework to have when policymakers are creating P4P metrics.
- Glazer, Jacob; McGuire, Thomas; and Normand, Sharon-Lise T. (2008). “Mitigating the Problem of Unmeasured Outcomes in Quality Reports,” The B.E. Journal of Economic Analysis & Policy, Vol. 8: Iss. 2 (Contributions), Article 7.
- Holmstrom, B. and P. Milgrom (1991). “Multitask Principal-Agent Analyses: Incentive Contracts, Asset Ownership, and Job Design,” Journal of Law, Economics and Organization, 7 (Special Issue): 24-52.