On Saturday, UCSD Economics Professor Dr. Hal White passed away after an extended struggle with cancer. This is a sad day, as Hal was one of my former professors. Here is an excerpt from the obituary written by Dr. Jim Hamilton regarding Dr. White's work.
Hal was one of the world’s leading econometricians. One of his core beliefs was that the models and assumptions that we bring to the data are inevitably flawed and misspecified in some way. It might seem that if you believe that, there’s no hope in trying to do econometrics. But some of Hal’s most remarkable discoveries concerned how to form valid inference even if part of what you assumed was fundamentally wrong.
An example arises in ordinary regression analysis, in which a common assumption is that the variance of the regression model’s error is the same for all observations. Suppose that assumption is wrong, and instead the variance depends in an unknown way on the various explanatory variables. Hal found that it is possible to characterize how that dependence would affect the reliability of the inference from the regression, and construct modified t-statistics or F-statistics that take this into account. This was such a useful contribution that it is now a standard option a user can easily select in any decent regression software package. Hal once lamented to me that this was an example of a contribution that became so successful and widespread that people forgot who came up with it in the first place. Hal’s proposed adjustments are often described as “robust standard errors” or “heteroskedasticity-consistent standard errors”, though I have always introduced them to my students as “White standard errors”.
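As a quick illustration (my addition, not Hamilton's), here is a minimal numpy sketch of the HC0 "sandwich" computation behind White's standard errors. The simulated data and variable names are hypothetical; the point is the contrast between the classical variance formula and White's heteroskedasticity-consistent one.

```python
import numpy as np

# Simulated data where the error variance grows with x,
# violating the constant-variance (homoskedasticity) assumption.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x])        # design matrix with intercept
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)  # noise scale depends on x

# Ordinary least squares: beta_hat = (X'X)^{-1} X'y
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat

# Classical standard errors, valid only under constant error variance
sigma2 = resid @ resid / (n - X.shape[1])
se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

# White (HC0) standard errors: the sandwich
#   (X'X)^{-1} [ sum_i e_i^2 x_i x_i' ] (X'X)^{-1}
meat = X.T @ (X * resid[:, None] ** 2)
cov_white = XtX_inv @ meat @ XtX_inv
se_white = np.sqrt(np.diag(cov_white))

print("classical SEs:", se_classical)
print("White SEs:    ", se_white)
```

With this data-generating process the two sets of standard errors diverge noticeably, which is exactly the situation White's correction was designed for. This is the same computation the packaged options expose, e.g. statsmodels' `OLS(...).fit(cov_type="HC0")` or Stata's `robust` option.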
Hal also showed that this idea generalizes much more broadly, as spelled out in his classic article, "Maximum Likelihood Estimation of Misspecified Models." The maximum likelihood estimator (affectionately known as the "MLE") refers to a particular estimate of parameters derived under the claim that the researcher knows the family of distributions from which the true probability distribution that generated the data comes. Hal's remarkable contribution here was to examine the properties of that inference if you have assumed the wrong class of probability distributions. He referred to that procedure (using an MLE based on an incorrect assumption about the probability distribution) as "quasi maximum likelihood estimation." Again, establishing the properties of such inference seems like (and is!) an astounding result. But when you get into the math, you discover that it makes perfect sense. For example, one could assume (mistakenly, perhaps) that the error terms in the regression model came from a Normal distribution with mean zero and constant variance. If your assumptions were correct, then the MLE turns out to be the usual formula for regression estimation. However, even if your assumption about the probability distribution is wrong, one can show that what you were calling the MLE is usually still giving you a decent estimate of something, namely, an estimate of the best prediction of y if you want to base your prediction on a linear function of x. In fact, White's robust standard errors for ordinary regression prove to be a special case of his general results for quasi maximum likelihood estimation.
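In modern notation (my gloss, not Hamilton's), White's 1982 result can be stated compactly: if the data are really drawn from a density g but you maximize a likelihood based on the family f(·; θ), the quasi-MLE converges to the pseudo-true parameter closest to g in Kullback–Leibler divergence, with a "sandwich" asymptotic variance:

```latex
\hat{\theta}_n \xrightarrow{\;p\;} \theta^{*} \equiv \arg\min_{\theta}\, \mathrm{KL}\!\left(g \,\middle\|\, f(\cdot;\theta)\right),
\qquad
\sqrt{n}\left(\hat{\theta}_n - \theta^{*}\right) \xrightarrow{\;d\;} N\!\left(0,\; A^{-1} B A^{-1}\right),
```

where A = E[∇²_θ log f(y; θ*)] is the expected Hessian of the log-likelihood and B = E[∇_θ log f(y; θ*) ∇_θ log f(y; θ*)ᵀ] is the expected outer product of the score. When the model is correctly specified, the information matrix equality gives B = −A and the sandwich collapses to the familiar inverse information matrix; for the Gaussian regression likelihood, the sandwich is precisely White's heteroskedasticity-consistent covariance above.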
Hal had a host of other very fundamental contributions, ranging from the recognition that neural networks are essentially a statistical inference problem to elegant contributions to asymptotic theory, any number of extremely useful specification tests, and his most recent interest in some very deep ideas about causality and inference. There are, I suspect, a great many papers by Hal and his co-authors that have not yet been published but soon will be, as he remained astonishingly productive up to the end, writing papers faster than the journals could publish them.