Ordered Probit (or Logit) Estimation

What is one to do when the dependent variable under investigation is categorical? If the categories are ordered, an ordered probit (or logit) model is a sensible choice. An example where ordered probit estimation is appropriate is an integer ranking of physician quality from one to five. On the other hand, if the dependent variable is the number of surgeries a patient has had, a Poisson model is usually more appropriate, since y is a count variable.

Let us continue with the physician ranking example. Suppose there are three ranking categories: excellent (2), average (1), and poor (0). We assume there is a latent variable y* that is a linear function of a vector of covariates x plus an error term ε. The latent variable, together with two cut points α_1 < α_2, determines which category the physician falls into.

  • y* = xβ + ε; ε|x ~ N(0,1)
  • y=0 if y* < α_1
  • y=1 if α_1 ≤ y* ≤ α_2
  • y=2 if y* > α_2
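
The latent-variable setup can be made concrete with a small simulation. The following is a minimal sketch, not part of the original post: the function names `category` and `simulate`, the coefficient values, and the cut points are all illustrative assumptions.

```python
import random

def category(y_star, alpha1, alpha2):
    """Map a latent score y* to the observed rank 0, 1 or 2."""
    if y_star < alpha1:
        return 0  # poor
    if y_star > alpha2:
        return 2  # excellent
    return 1      # average

def simulate(x, beta, alpha1, alpha2, rng):
    """Draw one observed rank for a covariate vector x."""
    xb = sum(xk * bk for xk, bk in zip(x, beta))
    y_star = xb + rng.gauss(0.0, 1.0)  # epsilon | x ~ N(0, 1)
    return category(y_star, alpha1, alpha2)

# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(0)
draws = [simulate([1.0, 0.5], [0.8, -0.4], -0.5, 1.0, rng) for _ in range(1000)]
```

Only the rank is observed; y* itself never appears in the data, which is why the cut points have to be estimated along with β.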

Now we can calculate the probabilities that a physician will fall into each category.

  • P(y=0|x)=P(xβ + ε<α_1)=P(ε<α_1-xβ)=Φ(α_1-xβ)
  • P(y=1|x)=P(xβ + ε<α_2) - P(xβ + ε<α_1) = Φ(α_2-xβ)-Φ(α_1-xβ)
  • P(y=2|x)=P(xβ + ε>α_2)=1-Φ(α_2-xβ)
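
As a quick check on these formulas, the three probabilities can be computed directly from the standard normal cdf. This is a sketch under assumed values: `ordered_probit_probs` is a hypothetical helper name, and the covariates, coefficients and cut points are made up for illustration.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf

def ordered_probit_probs(x, beta, alpha1, alpha2):
    """Return (P(y=0|x), P(y=1|x), P(y=2|x)) for the three-category model."""
    xb = sum(xk * bk for xk, bk in zip(x, beta))
    p0 = PHI(alpha1 - xb)
    p1 = PHI(alpha2 - xb) - PHI(alpha1 - xb)
    p2 = 1.0 - PHI(alpha2 - xb)
    return p0, p1, p2

# Hypothetical inputs, chosen only for illustration.
probs = ordered_probit_probs([1.0, 0.5], [0.8, -0.4], -0.5, 1.0)
```

By construction the three probabilities sum to one, and each is positive as long as α_1 < α_2.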

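Using maximum likelihood, we can now estimate the cut points α and the coefficient vector β. The log-likelihood contribution of observation i is given below; the sample log-likelihood is the sum of these contributions over all observations: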

  • l_i(α,β)=1{y_i=0}log[Φ(α_1-x_iβ)] + 1{y_i=1}log[Φ(α_2-x_iβ) - Φ(α_1-x_iβ)] + 1{y_i=2}log[1-Φ(α_2-x_iβ)]
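
The log-likelihood above translates almost line for line into code. The sketch below sums the per-observation contributions over a sample; the function name `log_likelihood` and the toy data are assumptions for illustration, and no optimizer is included.

```python
from math import log
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf

def log_likelihood(ys, xs, beta, alpha1, alpha2):
    """Sum of per-observation ordered probit log-likelihood contributions."""
    total = 0.0
    for y, x in zip(ys, xs):
        xb = sum(xk * bk for xk, bk in zip(x, beta))
        if y == 0:
            p = PHI(alpha1 - xb)
        elif y == 1:
            p = PHI(alpha2 - xb) - PHI(alpha1 - xb)
        else:
            p = 1.0 - PHI(alpha2 - xb)
        total += log(p)  # only the term for the observed category is active
    return total

# Toy data with a single covariate, made up for illustration.
ys = [0, 1, 2, 1]
xs = [[-1.0], [0.0], [1.5], [0.3]]
ll = log_likelihood(ys, xs, beta=[1.0], alpha1=-0.5, alpha2=0.5)
```

In practice one would hand the negative of this function to a numerical optimizer, keeping α_1 < α_2 so that the middle probability stays positive.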

If we instead assume that ε|x has the logistic cdf Λ(z) = exp(z)/[1+exp(z)], then replacing Φ with Λ throughout gives the ordered logit model.
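
Switching to the logit version only changes the cdf. The sketch below mirrors the probit probabilities with the logistic cdf swapped in; `logistic_cdf` and `ordered_logit_probs` are hypothetical names, and the branch on the sign of z is an implementation choice to avoid overflow, not something the model requires.

```python
from math import exp

def logistic_cdf(z):
    """exp(z) / (1 + exp(z)), evaluated without overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    e = exp(z)
    return e / (1.0 + e)

def ordered_logit_probs(x, beta, alpha1, alpha2):
    """Return (P(y=0|x), P(y=1|x), P(y=2|x)) under a logistic error."""
    xb = sum(xk * bk for xk, bk in zip(x, beta))
    p0 = logistic_cdf(alpha1 - xb)
    p1 = logistic_cdf(alpha2 - xb) - logistic_cdf(alpha1 - xb)
    p2 = 1.0 - logistic_cdf(alpha2 - xb)
    return p0, p1, p2

# Hypothetical inputs, chosen only for illustration.
logit_probs = ordered_logit_probs([1.0, 0.5], [0.8, -0.4], -0.5, 1.0)
```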

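The quantities of ultimate interest are the response probabilities p_j(x) = P(y=j|x) and how they respond to changes in the covariates. The partial effect of x_k on each probability is: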

  • ∂p_0(x)/∂x_k= -β_kφ(α_1-xβ)
  • ∂p_1(x)/∂x_k= β_k[φ(α_1-xβ)-φ(α_2-xβ)]
  • ∂p_2(x)/∂x_k= β_kφ(α_2-xβ)
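
These derivatives are easy to evaluate at a given x. The following is a minimal sketch with assumed inputs; `partial_effects` is a hypothetical helper, and φ is taken from the standard library's normal density.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # provides the standard normal pdf

def partial_effects(x, beta, alpha1, alpha2, k):
    """Derivatives of (p_0, p_1, p_2) with respect to covariate x_k."""
    xb = sum(xj * bj for xj, bj in zip(x, beta))
    d1 = STD_NORMAL.pdf(alpha1 - xb)
    d2 = STD_NORMAL.pdf(alpha2 - xb)
    bk = beta[k]
    return (-bk * d1, bk * (d1 - d2), bk * d2)

# Hypothetical inputs, chosen only for illustration.
effects = partial_effects([1.0, 0.5], [0.8, -0.4], -0.5, 1.0, k=0)
```

The three effects sum to zero because the probabilities always sum to one. The sign of β_k fixes the direction of the effect on the lowest and highest categories only; the effect on the middle category depends on the sign of φ(α_1-xβ)-φ(α_2-xβ) and can go either way.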

For more information on ordered probits, see the Tokyo Climate Center’s ordered probit explanation as well as the treatment in Econometric Analysis of Cross Section and Panel Data (pp. 504-509) by Wooldridge.