I tried to find an easily comprehended explanation of the F-statistic for my students but I could not, so here, as a public service, is mine. If you have some other pages you can recommend, please let me know.

Okay, why ANOVA? Why not just do a t-test? Well, let’s say you have five groups. Then you will have ten pairwise comparisons. You compare group 1 to groups 2, 3, 4 and 5. That’s four. Now you compare group 2 to groups 3, 4 and 5. That’s another three t-tests. And so on. So now, you don’t really have a 5% probability of a type I error when p = .05 because you actually had TEN tests. If those ten tests were independent, the chance of at least one false positive would be about 40%, not 5%. If you did 100 tests, you’d expect five of them to turn out significant just by chance. So, let’s just accept that many pairwise tests = bad.
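To put rough numbers on that, here is a small Python sketch. It counts the pairwise comparisons for five groups and estimates the chance of at least one false positive, under the simplifying assumption that the tests are independent (pairwise t-tests on the same groups are not quite independent, so treat the result as a ballpark). The function name `familywise_error_rate` is my own, not something from a library.

```python
from math import comb

def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one type I error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

n_groups = 5
n_pairs = comb(n_groups, 2)  # number of ways to pick 2 groups out of 5

print(n_pairs)                                   # 10
print(round(familywise_error_rate(n_pairs), 3))  # 0.401
```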

Enter ANOVA, short for Analysis of Variance. Let’s talk about a one-way ANOVA for now. You have a continuous, numeric dependent variable – say height. You have a categorical independent variable with two or more levels. You could do ANOVA with just two levels but in that case you might as well do a t-test. In this case, let’s assume that we have children raised eating an unrestricted diet, children who were raised vegetarian and children who were raised vegan. At age 10, we decide to measure all of their heights.

What is our null hypothesis? It is that there is no difference among the means, or

μ1 = μ2 = μ3

Enter the F-test. We are going to state that if there is no difference in the means then the estimate of variance you get from the difference in group means should be the same as the estimate of the population variance you get within groups. The F statistic is calculated like this

$$F = \frac{\text{variance between groups}}{\text{variance within groups}}$$

If the null hypothesis is correct, these two estimates of the variance should be about the same and your F ratio should be near 1.0.
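If you would rather see that than take my word for it, here is a simulation sketch in Python. It draws three groups from the same population over and over (so the null hypothesis is true by construction), computes the F ratio each time using the between and within calculations described in the rest of this post, and averages the results. The population mean of 140 and standard deviation of 6, the group size of 10 and the function names are all made-up choices for illustration.

```python
import random

def f_ratio(groups):
    """One-way ANOVA F ratio: between-group variance over within-group variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def simulate_null_f(n_sims=2000, k=3, n_per_group=10, seed=1):
    """F ratios from repeated samples where every group shares one population."""
    rng = random.Random(seed)
    return [
        f_ratio([[rng.gauss(140, 6) for _ in range(n_per_group)] for _ in range(k)])
        for _ in range(n_sims)
    ]

f_values = simulate_null_f()
mean_f = sum(f_values) / len(f_values)
print(round(mean_f, 2))  # expect a value in the neighborhood of 1
```

Individual F ratios bounce around quite a bit from sample to sample, which is why you need a p-value and not just the ratio itself to decide whether a given F is unusual.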

How to get the within group variance

Well, it’s just like any other time you get a variance. Imagine that group 1 is a sample for a study. What do you do? You sum the squared deviations from the mean and divide by n minus 1, right?

$$s_1^2 = \frac{\sum_{i=1}^{n_1} (x_{i1} - \bar{x}_1)^2}{n_1 - 1}$$

That gives you the within group variance for group 1. You do the same thing for group 2 and group 3.

BUT … not all groups are created equal. What if you have five times as many people in group 3 as you do in group 1 and group 2?

$$s_w^2 = \frac{\sum_{j=1}^{k} (n_j - 1)\, s_j^2}{N - k}$$

Being the reasonable person you are, you weight the within group variances by the degrees of freedom of each group, that is to say, the number of subjects in that group minus 1 (n_j − 1). You add up those weighted variances and divide by the total number of subjects minus the number of groups (N − k). This is your within group estimate of the variance. This is your denominator. Let’s say that the value you get for this is 42.
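Here is how that weighting might look in Python. This is only a sketch: the heights are numbers I invented so the arithmetic comes out clean (they do not produce the 42 used in this post), and the function names are mine.

```python
def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)

def pooled_within_variance(groups):
    """Weight each group's variance by its degrees of freedom (n - 1),
    add them up, and divide by total subjects minus number of groups."""
    n_total = sum(len(g) for g in groups)
    weighted_sum = sum((len(g) - 1) * sample_variance(g) for g in groups)
    return weighted_sum / (n_total - len(groups))

# Hypothetical heights in cm, invented for illustration only
heights = {
    "unrestricted": [140, 141, 142],
    "vegetarian": [136, 138, 140],
    "vegan": [132, 133, 134, 135, 136],
}

for diet, values in heights.items():
    print(diet, sample_variance(values))  # 1.0, 4.0 and 2.5

within = pooled_within_variance(list(heights.values()))
print(within)  # (2*1.0 + 2*4.0 + 4*2.5) / (11 - 3) = 2.5
```

Notice that the vegan group, with five children, counts twice as much toward the pooled estimate as either of the three-child groups.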

Now you need the between groups variance 

First, subtract each group mean from the overall mean. Square that.
Second, multiply by the number in each group
Third, add up the results across all the groups
Fourth, divide by the number of groups minus 1

$$s_b^2 = \frac{\sum_{j=1}^{k} n_j (\bar{x} - \bar{x}_j)^2}{k - 1}$$

Let’s just suppose, for the sake of supposing, that the value you get for this is 108. Your F-ratio is then 108/42 = 2.57.
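The four steps, plus the final division, can be sketched in Python like this. The heights are hypothetical numbers I picked so the overall mean is a round 137, so the result will not match the 108 and 42 supposed above, and `one_way_f` is just a name I chose.

```python
def one_way_f(groups):
    """Return (between-group variance, within-group variance, F ratio)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Steps one to three: square the distance of each group mean from the
    # overall mean, multiply by the group size, and add the results
    ss_between = sum(
        len(g) * (grand_mean - m) ** 2 for g, m in zip(groups, group_means)
    )
    # Step four: divide by the number of groups minus 1
    var_between = ss_between / (k - 1)

    # Within-group variance: squared deviations from each group's own mean,
    # divided by total subjects minus number of groups
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    var_within = ss_within / (n_total - k)

    return var_between, var_within, var_between / var_within

# Hypothetical heights in cm, invented for illustration only
heights = [
    [140, 141, 142],            # unrestricted diet
    [136, 138, 140],            # vegetarian
    [132, 133, 134, 135, 136],  # vegan
]

var_between, var_within, f_value = one_way_f(heights)
print(var_between, var_within, round(f_value, 2))  # 48.0 2.5 19.2

# The arithmetic from the post's own supposed values
print(round(108 / 42, 2))  # 2.57
```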

And that, my dears, is how you get an F value.

Comments

9 Responses to “The F-statistic in ANOVA explained”

  1. Emilio L. Cano on November 29th, 2012 3:47 am

    Great Blog Annmaria. This post would be much better with nice equations (check http://www.mathjax.org/, I use it and it is free and easy)

  2. AnnMaria on November 29th, 2012 5:15 am

    Thanks a lot. I’ve been looking for something like that because I don’t have the patience to do equations with the WordPress menu.

  3. Emilio L. Cano on November 30th, 2012 4:04 am

    You’re welcome!

  4. Elisha on July 9th, 2013 12:59 pm

    I really appreciate this.

  5. cat on January 23rd, 2014 3:24 am

    Thanks, that’s clear!

  6. Steph on September 4th, 2014 1:44 pm

    I understand how to get the F value and why it is important. When the F statistic is “large” then the between group variation is greater than the within group variation. My question is what is a “large” F value. Is it greater than 1? 2? 10?

    Thanks in advance!

  7. Adam on September 5th, 2014 11:25 am

    This is great, there is one correction, however. When describing the equations for the between groups variance, you say to:
    “First, subtract each group mean from the overall mean
    Second, multiply by the number in each group”

    Yet the equation shows to square the results from the first step. I’m not sure if the steps are right, or the equation is right (though, I’m assuming it’s just an omission by the author). Anywho, this was a great explanation, thank you!

  8. AnnMaria on September 5th, 2014 12:56 pm

    You are correct. It should be squared. Made the correction. Thanks for catching it.

  9. AnnMaria on September 5th, 2014 12:59 pm

    Steph -

    An F-value of 1 is VERY low. It says the variance between groups is exactly what you would expect by chance.

    I would look at three things, the F-value, the p-value and the r-square. That’s another post. Maybe I should get on that after I check out of this hotel room which I am supposed to do in 45 seconds (not kidding).
