# ANOVA

Let us say that you are a hospital administrator.  You are very clever and have come up with a system to score the quality of the work done by the physicians at your hospital.  To simplify things, let's assume that only 3 physicians work at your hospital.  The physicians' scores are as follows:

• Dr. Albert: 76, 85, 91, 67, 73
• Dr. Burns: 92, 90, 60, 79, 75
• Dr. Collin: 50, 80, 83, 80, 74

The average score for Dr. Albert is 78.4, for Dr. Burns is 79.2 and for Dr. Collin is 73.4.  As the hospital administrator, you want to know whether these differences reflect real differences in doctor quality or are likely due to random chance.  If there were only two doctors, a t-test would suffice, but what test can you use when there are more than two?

The solution to this is to run an ANOVA test.  How do we do this?  Follow these easy steps.

1. Let $j$ index the group ($j = a, b, c$) and $i$ index the observation within each group ($i = 1, 2, \dots, 5$).
2. Calculate the mean of each group ($\mu_j$): $\mu_a = 78.4$; $\mu_b = 79.2$; $\mu_c = 73.4$.
3. Also calculate the mean of the entire sample: $\mu = 77$.
4. Now calculate the Sum of Squares within each group: $SS_{within} = \sum_j \sum_i (X_{ij} - \mu_j)^2$.  This shows how much variation there is in each doctor's own scores.
• $SS_a = (76 - 78.4)^2 + (85 - 78.4)^2 + (91 - 78.4)^2 + (67 - 78.4)^2 + (73 - 78.4)^2 = 367.2$
• $SS_b = 666.8$
• $SS_c = 727.2$
• $SS_{within} = SS_a + SS_b + SS_c = 1761.2$
5. Now calculate the Sum of Squares between groups: $SS_{between} = \sum_j n_j (\mu_j - \mu)^2$, where $n_j$ is the number of observations in group $j$.  This shows how much the doctors' average scores vary around the overall mean.
• $SS_{between} = 5(78.4 - 77)^2 + 5(79.2 - 77)^2 + 5(73.4 - 77)^2 = 98.8$
6. The F-statistic is the ratio of the mean square (MS) between groups to the mean square within groups.  How do we go from the SS to the MS?  That's easy: we just divide each SS by its degrees of freedom.
• $MS_{within} = SS_{within}/(N - J) = SS_{within}/12$.  This is because there are 15 observations and 3 doctors, so 15 - 3 = 12.  Our answer here is: 1761.2/12 = 146.77.
• $MS_{between} = SS_{between}/(J - 1) = SS_{between}/2$.  This is because there are 3 doctors, so 3 - 1 = 2.  Our answer here is: 98.8/2 = 49.4.
7. Now we can calculate the F statistic as: $F = MS_{between}/MS_{within} = 49.4/146.77 = 0.337$.
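8. If we look this up in a table for the F distribution with 2 and 12 degrees of freedom, we get a p-value of .721.  In other words, if all 3 doctors were equally good, we would still see an F statistic at least this large about 72% of the time.  Thus, we fail to reject the null hypothesis that all three doctors are equally good.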
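The steps above can be sketched in a few lines of Python, using only the standard library. The function and variable names here are illustrative choices, not part of the original walkthrough. The p-value helper relies on a closed form for the upper tail of the F distribution that holds only when the numerator degrees of freedom equal 2, as they do with three groups; with more groups you would need a statistics library or a table instead.

```python
# Scores from the example above, keyed by physician.
scores = {
    "Albert": [76, 85, 91, 67, 73],
    "Burns": [92, 90, 60, 79, 75],
    "Collin": [50, 80, 83, 80, 74],
}


def one_way_anova(groups):
    """Return (ss_within, ss_between, f_stat, df_between, df_within)."""
    all_values = [x for values in groups.values() for x in values]
    n_total = len(all_values)
    n_groups = len(groups)
    grand_mean = sum(all_values) / n_total

    ss_within = 0.0
    ss_between = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        # Step 4: squared deviations of each score from its own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in values)
        # Step 5: squared deviation of the group mean from the grand mean,
        # weighted by the number of observations in the group.
        ss_between += len(values) * (group_mean - grand_mean) ** 2

    # Steps 6 and 7: divide by the degrees of freedom and take the ratio.
    df_between = n_groups - 1
    df_within = n_total - n_groups
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_within, ss_between, f_stat, df_between, df_within


def f_pvalue_two_numerator_df(f_stat, df_within):
    """Upper-tail probability of F(2, df_within).

    Assumption: this closed form is valid only for 2 numerator
    degrees of freedom (i.e. exactly three groups).
    """
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


ss_within, ss_between, f_stat, df_between, df_within = one_way_anova(scores)
p_value = f_pvalue_two_numerator_df(f_stat, df_within)
print(f"SSwithin={ss_within:.1f} SSbetween={ss_between:.1f} "
      f"F={f_stat:.3f} p={p_value:.3f}")
# prints: SSwithin=1761.2 SSbetween=98.8 F=0.337 p=0.721
```

The printed values match the hand calculation, which is a useful check that no arithmetic slipped along the way.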

STATA

Is there an easier way to do this?  Yes.  If you have Stata, you could just regress the score on dummy variables for Drs. A, B, and C (leaving out the constant so that all three dummies can be included).  Then you can run a statistical test that the coefficient estimate for Dr. A = the coefficient estimate for Dr. B = the coefficient estimate for Dr. C.  This will give you the same F statistic and p-value that we calculated manually above.
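To see why the regression route gives the same answer, here is a rough sketch in Python rather than Stata syntax, again using only the standard library. It assumes a regression with one dummy per doctor and no constant, and compares it against a restricted regression in which all three doctors share a single coefficient. The helper names (`solve_linear_system`, `ols_rss`) are made up for this illustration.

```python
scores = {
    "Albert": [76, 85, 91, 67, 73],
    "Burns": [92, 90, 60, 79, 75],
    "Collin": [50, 80, 83, 80, 74],
}


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols_rss(x_rows, y):
    """Least squares via the normal equations; return (coefficients, RSS)."""
    k = len(x_rows[0])
    xtx = [[sum(row[i] * row[j] for row in x_rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x_rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in x_rows]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return beta, rss


doctors = list(scores)
y = [s for d in doctors for s in scores[d]]

# Unrestricted model: one dummy per doctor, no constant.
x_unrestricted = [[1.0 if d == other else 0.0 for other in doctors]
                  for d in doctors for _ in scores[d]]
# Restricted model: all doctors share one coefficient (a single constant).
x_restricted = [[1.0] for _ in y]

beta_u, rss_u = ols_rss(x_unrestricted, y)
beta_r, rss_r = ols_rss(x_restricted, y)

q = len(doctors) - 1            # number of equality restrictions
df_resid = len(y) - len(doctors)
f_stat = ((rss_r - rss_u) / q) / (rss_u / df_resid)

print([round(b, 1) for b in beta_u], round(f_stat, 3))
# prints: [78.4, 79.2, 73.4] 0.337
```

The dummy coefficients are just the three group means, the unrestricted residual sum of squares is SSwithin, and the drop in fit from imposing equal coefficients is SSbetween, so the F statistic is the one from the manual calculation.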

Other Resources
Khan Academy has a great description of ANOVA as well.