# Stop and explain the variance

I’m thinking I’m going to need to create a new category on my blog. Here is Day 6 of the 20-day blogging challenge, which, if you are just now tuning in, is (surprise!) 20 days of prompts on teaching. It’s a challenge I decided to undertake for the hell of it, the same reason I do most things in life. Given that this is a particular crunch month for work, it’s kind of amazing I’ve done six of these already.

Today’s prompt was,

What is one thing you wish you were better at? Just one! Why? What can you do about it?

Sort of one and a half: the one thing I wish I were better at in teaching statistics is pacing. I never feel as if I have enough time at the end of the course. On the other hand, I feel at the beginning of the course that I need to spend ample time on the basics. How can you understand explained variance if you don’t understand variance in the first place? The main thing I wish I had spent more time on in the past couple of courses I taught was discussing what we mean by explained variance and residual variance.

Let’s say that we know nothing about each person who walks into a room and we are trying to predict his or her IQ. The mean population IQ is 100, with a standard deviation of 15, which means the variance is 225 (15 squared). With nothing else to go on, our best guess for everyone is the mean, 100, and the variance of the errors in our guesses will be 225. This is the error variance, which is the same as the population variance, since we had no predictors.

Let’s say now we get everyone’s college GPA and we find that the correlation between GPA and IQ is .707 (it’s lower than that in real life, but just pretend). So, now, when Bob comes into the room and I know he has a GPA of 4.0, two standard deviations above the mean, I am going to predict that he has an IQ 1.414 standard deviations above the mean (.707 times 2), which is about 121.

My equation is Y = a + bX, where a = the intercept (which is the mean IQ when X is measured as distance from the mean GPA) and b = the regression coefficient.
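To make the equation concrete, here is a minimal sketch of Bob's prediction in Python. It assumes X is a z-score (standard deviations from the mean GPA), so that a is the mean IQ and b is the correlation times the IQ standard deviation. The function name `predict_iq` is just something I made up for illustration.

```python
MEAN_IQ = 100.0  # population mean from the example
SD_IQ = 15.0     # population standard deviation from the example
R = 0.707        # the pretend correlation between GPA and IQ


def predict_iq(gpa_z, r=R, mean_iq=MEAN_IQ, sd_iq=SD_IQ):
    """Predict IQ from a GPA z-score using Y = a + bX.

    Assumption: X is a z-score, so a = mean IQ and b = r * SD of IQ.
    """
    a = mean_iq
    b = r * sd_iq
    return a + b * gpa_z


bob = predict_iq(2.0)  # Bob's GPA is two standard deviations above the mean
print(round(bob, 2))                       # 121.21 IQ points
print(round((bob - MEAN_IQ) / SD_IQ, 3))   # 1.414 standard deviations above the mean
```

Someone with a perfectly average GPA (a z-score of 0) gets a predicted IQ of exactly 100, which is the sense in which a is "the mean."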

On the average, now, my predictions will be more accurate. In fact, the variance of my errors in prediction is now about 112.5, which is half of 225, since .707 squared is about .50. Instead of being off by 15 points in my prediction, I’m off by about 10.6 points (the square root of 112.5).
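If you want to check that arithmetic, here is a quick sketch that splits the 225 into the part GPA explains and the part left over. It uses only the numbers from the example and the usual rule that the residual variance is (1 minus the correlation squared) times the total variance. The variable names are mine.

```python
import math

SD_IQ = 15.0  # population standard deviation from the example
R = 0.707     # the pretend correlation between GPA and IQ

total_variance = SD_IQ ** 2                # error variance with no predictor
explained = R ** 2 * total_variance        # variance accounted for by GPA
residual = (1 - R ** 2) * total_variance   # error variance that remains

print(round(total_variance, 1))            # 225.0
print(round(explained, 1))                 # 112.5
print(round(residual, 1))                  # 112.5
print(round(math.sqrt(residual), 1))       # 10.6 points, down from 15
```

The explained and residual pieces always add back up to the total, which is really the whole idea of "explained variance" in one line.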

Notice that:

- Having a predictor reduces the error in prediction.
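- The proportion of variance explained is not the correlation, but the correlation SQUARED. Here .707 squared is about .50, so the error variance is cut in half, not by 70.7%.
- If you only have a single predictor, the standardized beta weight in a regression equation equals the correlation.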
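The points above can be checked with a small simulation. This is only a sketch on made-up data: it generates 5,000 pretend people whose GPA and IQ correlate at roughly .707, then compares the standardized slope to the correlation and the leftover variance to 1 minus the correlation squared. The helper names, the seed, and the sample size are all my own choices, and only the Python standard library is used.

```python
import math
import random
import statistics


def zscores(values):
    """Convert raw scores to z-scores using the population SD."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - m) / sd for v in values]


def slope(xs, ys):
    """Least-squares slope for predicting ys from xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squares."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Simulated (not real) data: GPA as z-scores, IQ built to correlate about .707
rng = random.Random(6)
true_r = 0.707
gpa = [rng.gauss(0, 1) for _ in range(5000)]
iq = [100 + 15 * (true_r * g + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1))
      for g in gpa]

gpa_z, iq_z = zscores(gpa), zscores(iq)
r = pearson_r(gpa, iq)
beta = slope(gpa_z, iq_z)  # standardized beta weight

# Share of the IQ variance still unexplained after using GPA
residuals = [y - beta * x for x, y in zip(gpa_z, iq_z)]
unexplained = statistics.pvariance(residuals) / statistics.pvariance(iq_z)

print("correlation:", round(r, 3))
print("standardized beta:", round(beta, 3))
print("unexplained share:", round(unexplained, 3))
print("1 - r squared:", round(1 - r ** 2, 3))
```

The first two printed numbers should match each other, and so should the last two. The sample correlation will land near .707 but not exactly on it, since these are random draws.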

There’s a lot more I have to say about explained variance, but it is past 1 a.m. and I’m trying to get to bed earlier and get up earlier so I won’t die when I’m in North Dakota next week and get up at what is the equivalent of 7:30 a.m. Pacific time. (Because there is no way on God’s green earth I’m going to make it into those schools before 10 a.m. and that’s just a fact. Maybe it’s on God’s white earth, since I believe everything in North Dakota is under a blanket of snow for at least three more months.)