The generalized linear model (GLM) is a flexible generalization of ordinary least squares (OLS) regression. OLS restricts each regression coefficient to have a constant effect on the dependent variable. A GLM allows this effect to vary along the range of the explanatory variables.
The basic structure of a GLM is as follows:
- g(E(Y)) = g(μ) = Xβ
- E(Y) = μ = g⁻¹(Xβ)
To estimate the model, one needs three components:
- A random component, which specifies the conditional distribution of the response variable given the explanatory variables. Typically, this distribution is from the exponential family.
- A linear predictor, which is a linear function of the regressors: η = β0 + β1X1 + … + βkXk = Xβ
- A link function which transforms the expectation of the response to the linear predictor. In other words, the link function describes the relationship between the linear predictor and the mean of the distribution function. The link function must be invertible.
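The three components can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a Poisson response with a log link (a common pairing that is not taken from the text above), and the coefficient values, the observation, and all function names are hypothetical.

```python
import math

# Hypothetical coefficients and a single observation, for illustration only.
beta = [0.5, 0.2]   # beta0 (intercept), beta1
x = [1.0, 3.0]      # the leading 1.0 multiplies the intercept

def linear_predictor(x, beta):
    """Linear predictor: eta = beta0 + beta1*x1 + ... + betak*xk."""
    return sum(xj * bj for xj, bj in zip(x, beta))

def log_link(mu):
    """Link function g: maps the mean mu onto the linear-predictor scale."""
    return math.log(mu)

def inverse_log_link(eta):
    """Inverse link: maps the linear predictor back to the mean."""
    return math.exp(eta)

def poisson_log_pmf(y, mu):
    """Random component: log P(Y = y) for a Poisson response with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

eta = linear_predictor(x, beta)   # 0.5 + 0.2 * 3.0
mu = inverse_log_link(eta)        # E(Y) implied by the linear predictor
print(eta, mu, poisson_log_pmf(2, mu))
```

Because the link is invertible, applying `log_link` to `mu` recovers `eta`, which is the round trip that the third component requires.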
The table below lists commonly used link functions and their inverses:
|Link|Link function ηi = g(μi)|Inverse μi = g⁻¹(ηi)|
|Logit|ln[μi/(1 - μi)]|exp(ηi)/[1 + exp(ηi)]|
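As a quick check of the logit row, the short Python sketch below transcribes both formulas literally and confirms that one undoes the other. The function names `logit` and `inv_logit` are chosen here for illustration.

```python
import math

def logit(mu):
    """Logit link: eta = ln[mu / (1 - mu)], defined for 0 < mu < 1."""
    return math.log(mu / (1.0 - mu))

def inv_logit(eta):
    """Inverse logit: mu = exp(eta) / [1 + exp(eta)]."""
    return math.exp(eta) / (1.0 + math.exp(eta))

# Round trip: applying the inverse to the link recovers the original mean.
for mu in (0.1, 0.5, 0.9):
    eta = logit(mu)
    print(mu, eta, inv_logit(eta))
```

This literal form is fine for moderate values of η; for very large η, `math.exp` can overflow, so production code usually uses a numerically safer formulation.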
To estimate the coefficients of a GLM, most researchers use maximum likelihood, although Bayesian approaches can also be used. In Stata, the glm command estimates the coefficients of a generalized linear model. In SAS, the GENMOD procedure can be used.
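To make the maximum likelihood step concrete, here is a minimal sketch of Newton-Raphson iterations for a logistic regression (binary response, logit link) with one regressor. It is a hand-rolled illustration, not the algorithm of any particular package; the data set and the name `fit_logistic` are hypothetical.

```python
import math

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def fit_logistic(x, y, iterations=25, tol=1e-10):
    """Maximum likelihood for logit(mu_i) = b0 + b1 * x_i via Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        mu = [inv_logit(b0 + b1 * xi) for xi in x]
        w = [m * (1.0 - m) for m in mu]          # Bernoulli variance at mu
        # Score vector: gradient of the log-likelihood.
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix [[a, b], [b, c]] = X'WX.
        a = sum(w)
        b = sum(wi * xi for wi, xi in zip(w, x))
        c = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = a * c - b * b
        # Newton step: solve the 2x2 system (X'WX) d = score.
        d0 = (c * u0 - b * u1) / det
        d1 = (a * u1 - b * u0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

# Hypothetical, non-separable data: the outcome tends to be 1 for larger x.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(x, y)
print(b0, b1)
```

At the solution the score vector is zero, so the fitted means sum to the observed number of successes. The sketch has no safeguards such as step halving, and it would not converge on perfectly separable data.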