Weighting has a number of uses. For instance, one can use weighting to estimate population statistics from a sample that is not representative. The Panel Study of Income Dynamics (PSID), for example, oversamples low-income households. To obtain nationally representative mean values, one must reweight the PSID values, either using survey weights or by matching to a nationally representative sample such as the CPS or ACS.
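As a minimal sketch of that reweighting idea, the snippet below computes an inverse-probability-weighted mean. The incomes and selection probabilities are made-up numbers for illustration only (they are not PSID values), and `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    total_weight = sum(weights)
    return sum(w * y for y, w in zip(values, weights)) / total_weight


# Hypothetical sample: low-income households are sampled at 4x the rate of others.
incomes = [20_000, 22_000, 18_000, 21_000, 60_000, 80_000]
selection_prob = [0.04, 0.04, 0.04, 0.04, 0.01, 0.01]  # assumed, for illustration

# Each sampled household stands in for 1 / p households in the population.
weights = [1.0 / p for p in selection_prob]

unweighted = sum(incomes) / len(incomes)    # pulled down by the oversampled group
weighted = weighted_mean(incomes, weights)  # estimate of the population mean

print(f"unweighted mean: {unweighted:,.0f}")
print(f"weighted mean:   {weighted:,.0f}")
```

Because the low-income households are overrepresented, the unweighted mean understates the population mean; the inverse-probability weights undo that distortion.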
Researchers also use weighting when estimating causal effects. A recent working paper by Solon, Haider, and Wooldridge (NBER 2013) examines whether weighting is useful in three applications: (1) to achieve precise estimates by correcting for heteroskedasticity, (2) to achieve consistent estimates by correcting for endogenous sampling, and (3) to identify average partial effects in the presence of unmodeled heterogeneity of effects. I discuss each of these situations below.
Heteroskedasticity occurs when the error variance differs across observations, for example when one subpopulation has more variability than another. It reduces the precision of OLS coefficient estimates. Can one use weighting to correct for it? The authors state:
Now suppose that one estimates that population regression by performing ordinary least squares (OLS) estimation of the regression of log earnings on the race dummy, years of schooling, and a quartic in potential earnings for black and white male household heads in the PSID sample…this estimate [the coefficient on the race dummy] might be distorted by the PSID’s oversampling of low-income households, which surely must lead to an unrepresentative sample with respect to male household heads’ earnings…one can apply a reverse funhouse mirror by using weights. In particular, instead of applying ordinary (i.e., equally weighted) least squares to the sample regression, one can use weighted least squares (WLS), minimizing the sum of squared residuals weighted by the inverse probabilities of selection.
Compared to Wyoming, California offers many more observations of the individual-level decision of whether or not to divorce, and therefore it seems at first that weighting by state population should lead to more precise coefficient estimation. And yet, for the specification shown in Table 1, it appears that weighting by population harms the precision of estimation.
In many cases, however, using WLS actually harms the precision of the estimates. This occurs because “…in many practical applications, the assumption that the individual-level error terms vij are independent is wrong. Instead, the individual-level error terms within a group are positively correlated with each other because they have unobserved group-level factors in common. In current parlance, the individual-level error terms are ‘clustered.'” Thus the true individual error term may be better modelled as:
- $v_{ij} = c_i + u_{ij}$
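where j indexes individuals and i indexes groups. The first term is a group-level component shared by everyone in the group; the second is idiosyncratic to the individual. When the variance of the group-level component is large, weighting by group size makes WLS less precise than OLS rather than more.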
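To see why, it helps to work out the variance of the group-average error, which is what matters when the regression is run on group means (such as state-level rates). The short derivation below rests on assumptions stated here explicitly: the group component and the individual component are independent with variances $\sigma_c^2$ and $\sigma_u^2$, the individual components are independent across individuals, and $J_i$ is the number of individuals in group i.

```latex
\bar{v}_i = \frac{1}{J_i}\sum_{j=1}^{J_i} v_{ij}
          = c_i + \frac{1}{J_i}\sum_{j=1}^{J_i} u_{ij},
\qquad
\operatorname{Var}(\bar{v}_i) = \sigma_c^2 + \frac{\sigma_u^2}{J_i}.
```

If $\sigma_c^2 = 0$, the variance is proportional to $1/J_i$ and weighting by group size is the efficient correction. If $\sigma_c^2$ is sizable and every group is large, the variance is close to $\sigma_c^2$ for all groups, so the group-average errors are nearly homoskedastic and weighting by group size creates heteroskedasticity instead of removing it.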
What should one do to address heteroskedasticity in this case?
One way to go is to…use the OLS residuals to perform the standard heteroskedasticity diagnostics we teach in introductory econometrics. For example, in this situation, the modified Breusch-Pagan test described in Wooldridge (2013, pp. 276-8) comes down to just applying OLS to a simple regression of the squared OLS residuals on the inverse within-group sample size $1/J_i$ [where $J_i$ is the number of individuals in group i]. The significance of the t-ratio for the coefficient on $1/J_i$ indicates whether the OLS residuals display significant evidence of heteroskedasticity… A remarkable feature of this test is that the estimated intercept consistently estimates $\sigma_c^2$, and the estimated coefficient of $1/J_i$ consistently estimates $\sigma_u^2$.
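The auxiliary regression described in that passage can be sketched with a small simulation. This is an illustration under assumed values, not the authors' code: the variance components, group sizes, and coefficients are invented, `simple_ols` is a hypothetical helper, and the t-ratio step of the test is left out for brevity.

```python
import random


def simple_ols(x, y):
    """Return (intercept, slope) from OLS of y on a single regressor x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


rng = random.Random(0)
SIGMA_C2, SIGMA_U2 = 1.0, 4.0  # assumed true variance components
N_GROUPS = 20_000

group_sizes, x, y = [], [], []
for _ in range(N_GROUPS):
    size = rng.choice([1, 2, 5, 10, 50])            # J_i, the group size
    xi = rng.gauss(0.0, 1.0)                        # group-level regressor
    c = rng.gauss(0.0, SIGMA_C2 ** 0.5)             # shared group component
    u_bar = rng.gauss(0.0, (SIGMA_U2 / size) ** 0.5)  # mean of J_i individual errors
    group_sizes.append(size)
    x.append(xi)
    y.append(1.0 + 0.5 * xi + c + u_bar)            # group-average outcome

# Step 1: OLS on the group-level data, then squared residuals.
b0, b1 = simple_ols(x, y)
resid_sq = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]

# Step 2: regress squared residuals on 1 / J_i.
inv_size = [1.0 / size for size in group_sizes]
sigma_c2_hat, sigma_u2_hat = simple_ols(inv_size, resid_sq)

print(f"intercept (estimate of sigma_c^2): {sigma_c2_hat:.2f}")
print(f"slope     (estimate of sigma_u^2): {sigma_u2_hat:.2f}")
```

With many groups, the intercept should land near the assumed group-level variance and the slope near the assumed individual-level variance, which is the feature the quoted passage highlights.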
Other recommendations include:
- Due to inevitable uncertainty about the true variance structure, report heteroskedasticity-robust standard error estimates.
- Report both weighted and unweighted estimates, since the difference between the OLS and WLS estimates can be used as a diagnostic for model misspecification or endogenous sampling.
Endogenous sampling occurs when the criteria used to select the sample are correlated with the error term of one’s regression. For instance, if one regressed income on various (exogenous) factors using unweighted PSID data, the resulting coefficients would be inconsistent because income itself helps determine which households are included in the survey. [The PSID oversamples low-income households.]
In the presence of endogenous sampling, estimation that ignores the endogenous sampling generally will be inconsistent. But if instead one weights the criterion function to be minimized (a sum of squares, a sum of absolute deviations, the negative of a log likelihood, a distance function for orthogonality conditions, etc.) by the inverse probabilities of selection, the estimation becomes consistent.
On the other hand, if the sampling probabilities are independent of the error term, for instance if they vary only with the explanatory variables in the regression equation, then unweighted estimates are already consistent. In that case weighting is unnecessary for consistency and can harm precision.
- If the sampling rate varies endogenously, estimation weighted by the inverse probabilities of selection is needed on consistency grounds.
- The weighted estimation should be accompanied by robust estimation of standard errors.
- When the variation in the sampling rate is exogenous, both weighted and unweighted estimation are consistent for the parameters of a correctly specified model, but unweighted estimation may be more precise.
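The first point in the list above can be illustrated with a simulation in which selection depends on the outcome itself. This is a sketch under invented assumptions (a linear model with slope 2, and selection probabilities of 0.9 for low outcomes versus 0.1 for high ones); `weighted_slope` is a hypothetical helper, not a library function.

```python
import random


def weighted_slope(x, y, w):
    """Slope from weighted least squares of y on x, with an intercept."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


rng = random.Random(42)
TRUE_SLOPE = 2.0  # assumed population coefficient

xs, ys, ws = [], [], []
for _ in range(100_000):
    x = rng.gauss(0.0, 1.0)
    y = 1.0 + TRUE_SLOPE * x + rng.gauss(0.0, 1.0)
    # Endogenous sampling: low outcomes are far more likely to be sampled.
    p = 0.9 if y < 1.0 else 0.1
    if rng.random() < p:
        xs.append(x)
        ys.append(y)
        ws.append(1.0 / p)  # inverse probability of selection

ols_slope = weighted_slope(xs, ys, [1.0] * len(xs))  # ignores the sampling scheme
wls_slope = weighted_slope(xs, ys, ws)               # inverse-probability weighted

print(f"true slope: {TRUE_SLOPE:.2f}")
print(f"unweighted: {ols_slope:.2f}")
print(f"weighted:   {wls_slope:.2f}")
```

In this setup the unweighted slope is pulled toward zero, while the weighted slope stays close to the assumed true value. If the selection probability were instead a function of x alone, both estimators would be consistent and the unweighted one would typically be the more precise.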
Weighting to Estimate Partial Effects
Often, the causal effect of one variable on another varies across subpopulations. For instance, a drug trial compares average outcomes between the treatment and control arms. However, if the drug's effect on outcomes differs by age, one may want to estimate the population average partial effect of the drug.
Assume that the sample has more old people than young people relative to the population at large. In this case, OLS would not identify the population average partial effect, since old people are over-represented in the sample. Additionally:
In least squares estimation, observations with extreme values of the explanatory variables have particularly large influence on the estimates. As a result, the weighted average of the rural and urban effects [in my example, young and old] identified by OLS depends not only on the sample shares of the two sectors, but also on how the within-sector variance of X differs between the two sectors… By reweighting the sample to get the sectoral shares in line with the population shares, WLS eliminates the first reason that OLS fails to identify the population average partial effect, but it does not eliminate the second. As a result, the WLS estimator and the OLS estimator identify different weighted averages of the heterogeneous effects, and neither one identifies the population average effect.
- Do not believe that in the presence of unmodeled heterogeneous effects, weighting to reflect population shares generally identifies the population average partial effect.
- Contrasting the weighted and unweighted estimates can serve as a test for misspecification. The failure to model heterogeneous effects is one sort of misspecification that can generate a significant contrast.
- Where heterogeneous effects are salient, study the heterogeneity; don’t just try to average it out.
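The point that neither estimator recovers the population average partial effect can be checked with a small simulation. Everything here is an assumed setup for illustration: two groups with effects 1 and 3, equal population shares, a sample that is 80% old, and a regressor that varies more among the young. `weighted_slope` is a hypothetical helper.

```python
import random


def weighted_slope(x, y, w):
    """Slope from weighted least squares of y on x, with an intercept."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


rng = random.Random(7)
BETA = {"young": 1.0, "old": 3.0}         # assumed group-specific effects
X_SD = {"young": 3.0, "old": 1.0}         # regressor varies more among the young
POP_SHARE = {"young": 0.5, "old": 0.5}    # assumed population shares
SAMPLE_N = {"young": 8_000, "old": 32_000}  # old people are oversampled

n_total = sum(SAMPLE_N.values())
xs, ys, ws = [], [], []
for group, n in SAMPLE_N.items():
    # Weight = population share / sample share, restoring population shares.
    weight = POP_SHARE[group] / (n / n_total)
    for _ in range(n):
        x = rng.gauss(0.0, X_SD[group])
        xs.append(x)
        ys.append(BETA[group] * x + rng.gauss(0.0, 1.0))
        ws.append(weight)

population_ape = sum(POP_SHARE[g] * BETA[g] for g in BETA)
ols_slope = weighted_slope(xs, ys, [1.0] * len(xs))
wls_slope = weighted_slope(xs, ys, ws)

print(f"population average partial effect: {population_ape:.2f}")
print(f"OLS slope: {ols_slope:.2f}")
print(f"WLS slope: {wls_slope:.2f}")
```

Under these assumed parameters the OLS slope comes out near 1.6 and the WLS slope near 1.2, while the population average partial effect is 2. Both estimators weight each group's effect by its share times its within-group variance of the regressor, so fixing the shares alone does not recover the population average.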
In situations in which you might be inclined to weight, it often is useful to report both weighted and unweighted estimates and to discuss what the contrast implies for the interpretation of the results. And, in many of the situations we have discussed, it is advisable to use robust standard error estimates.
- Gary Solon, Steven J. Haider, and Jeffrey Wooldridge (2013). What Are We Weighting For? NBER Working Paper 18859.