# How much matrix algebra do statistics students REALLY need?

Following a discussion using matrix algebra to show computation in a Multivariate Analysis of Variance, a doctoral student asked me,

“Professor, when will I ever use this? Why do I need to know this?”

He had a valid point. I’m always asking myself why I’m teaching something. Is it because it interests me personally, because it is in the textbook, or because students really need to know it?

Let’s take some things about matrix algebra we always teach students in statistics.

*What conformable means and why it might matter*

Two matrices are conformable if they can be multiplied together, which requires the number of columns in the first matrix to equal the number of rows in the second. When you multiply two matrices, each element in the first row of the first matrix is multiplied by the corresponding element in the first column of the second matrix. You sum the products, and that sum is the first element of the product matrix. You repeat this until you have multiplied every row of the first matrix by every column of the second.

So, you can multiply a 2 x 3 matrix by a 3 x 4 matrix but not vice versa.

Multiplying a matrix of dimension a x b by a matrix of dimension c x d (which is only possible when b equals c) will give you a resulting matrix with a rows and d columns, that is, of dimensions a x d .

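This can give you results that sometimes seem counter-intuitive, like that the product of a 3 x 1 matrix and a 1 x 3 matrix is a 3 x 3 matrix, while the product of a 1 x 3 matrix and a 3 x 1 matrix is a 1 x 1 matrix, that is, a single number.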

It may seem weird that the result of matrix multiplication can either be a larger matrix than both of the matrices you multiplied, or smaller than both of them, but there it is.

If both matrices are square, that is, of dimension n x n, then the resulting product will also be an n x n matrix.

And, of course, any matrix can be multiplied by its transpose because the transpose of an m x n matrix will always be n x m .
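If you want to check these shape rules without doing the arithmetic by hand, here is a minimal sketch in plain Python. The helper names (`shape`, `matmul`, `transpose`) and the example numbers are my own for illustration; in practice you would use a matrix library rather than nested lists.

```python
def shape(m):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return len(m), len(m[0])


def transpose(m):
    """Swap rows and columns, so an m x n matrix becomes n x m."""
    return [list(column) for column in zip(*m)]


def matmul(a, b):
    """Multiply a by b, raising ValueError if they are not conformable."""
    (a_rows, a_cols), (b_rows, b_cols) = shape(a), shape(b)
    if a_cols != b_rows:
        raise ValueError(
            f"not conformable: {a_rows} x {a_cols} times {b_rows} x {b_cols}"
        )
    # Element (i, j) is the sum of products of row i of a and column j of b.
    return [
        [sum(a[i][k] * b[k][j] for k in range(a_cols)) for j in range(b_cols)]
        for i in range(a_rows)
    ]


row = [[1, 2, 3]]         # 1 x 3
col = transpose(row)      # 3 x 1

inner = matmul(row, col)  # 1 x 1: smaller than both inputs
outer = matmul(col, row)  # 3 x 3: larger than both inputs

a = [[1, 2, 3], [4, 5, 6]]                      # 2 x 3
b = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1]]  # 3 x 4
ab = matmul(a, b)                               # 2 x 4

print(shape(inner), shape(outer), shape(ab))
```

Calling `matmul(b, a)` raises a `ValueError`, because a 3 x 4 matrix and a 2 x 3 matrix are not conformable in that order.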

If a square matrix is of full rank, it means that no row can be written as a linear combination of the other rows. If you DO have linear dependence, it means you have redundant measures. Now, I could go on to prove this mathematically and all of it is very interesting to me.

I question, though, whether you really need to know anything about matrix algebra to understand that redundant measures are a bad thing.
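For anyone who does want to see the redundancy show up in the numbers, here is a small sketch that computes rank by Gaussian elimination. The `rank` function and both example matrices are made up for illustration (think of the third column in the first matrix as a total score that is just the sum of two subscale scores); real software uses more numerically careful methods.

```python
from fractions import Fraction


def rank(matrix):
    """Rank by Gaussian elimination, using exact fractions to avoid rounding."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next(
            (r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # Eliminate this column from every row below the pivot row.
        for r in range(pivot_row + 1, n_rows):
            factor = rows[r][col] / rows[pivot_row][col]
            rows[r] = [x - factor * p for x, p in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return pivot_row


# Hypothetical data: the third measure is the sum of the first two.
redundant = [[2, 3, 5], [4, 1, 5], [6, 2, 8]]
# Hypothetical data: three measures that carry separate information.
independent = [[2, 3, 1], [4, 1, 0], [6, 2, 5]]

print(rank(redundant), rank(independent))
```

The first matrix is 3 x 3 but only has rank 2, which is the matrix algebra way of saying that one of the three measures adds nothing new.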

Do you need matrix algebra to explain that we are going to apply coefficients (do you even need to refer to it as a vector?) to the values of each variable for each record and get a predicted score such that

predicted score = b0 + b1X1 + b2X2 + … + bnXn
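For comparison, here is roughly what that looks like both ways: applying the coefficients record by record, and treating the same calculation as a matrix of records times a vector of coefficients. This is only a sketch. The coefficient values, the records, and the `predict` helper are hypothetical.

```python
def predict(X, b):
    """One predicted score per record: the dot product of each row of X with b."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, b)) for row in X]


# Hypothetical coefficients b0, b1, b2 and three records with two predictors each.
b = [1.0, 2.0, 0.5]
records = [[3.0, 4.0], [0.0, 10.0], [1.0, 1.0]]

# Without the word "vector": apply the coefficients to each record directly.
by_hand = [b[0] + b[1] * x1 + b[2] * x2 for x1, x2 in records]

# With matrix algebra: prepend a 1 to every record so b0 acts as the intercept,
# then the predicted scores are the matrix X times the vector b.
X = [[1.0] + row for row in records]
scores = predict(X, b)

print(by_hand, scores)
```

Both versions give the same three predicted scores. The matrix form mostly buys you compact notation once the number of predictors grows.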

When I was in graduate school, calculators that did statistical analyses, even as simple as regression, cost a few hundred dollars, which was the equivalent of three months of my car payment. Computer time was charged to your department by the hour. So … in my first few courses, I did all of my homework problems using a pencil and paper, transposing and inverting matrices – and it was a huge pain in the ass.

Then, I got a job as a research assistant and one of the perks was hours of computer time. I thought I’d died and gone to heaven. It took me less than half an hour to get all of my homework done using SAS (which ran on a mini-computer and spit out printouts that I had to walk across campus to pick up).

My students are learning in a completely different environment. So … do they need to learn the same things in the same way I did? This is a question I ponder a lot.

If they ever need to fit a Bayesian model, they’ll have to program their model in software like Stan, using linear algebra.

They’ll need to know that the linear predictor in a (g)lm is the product of the matrix of predictors and the vector of regression coefficients. As another example, they’ll need to know how they can get the covariance matrix by pre- and post-multiplying the correlation matrix by the diagonal matrix of standard deviations. (This decomposition allows them to specify their priors for the scales and the correlation structure independently.)
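The covariance decomposition can be sketched in a few lines of plain Python. The standard deviations and the correlation below are invented numbers, and `diag` and `matmul` are small helpers written for this example, not part of Stan or any other package.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows (assumes they conform)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def diag(values):
    """Square matrix with the given values on the diagonal and zeros elsewhere."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical scales and correlation structure, specified separately.
sd = [2.0, 3.0]
R = [[1.0, 0.5],
     [0.5, 1.0]]

# Covariance = D R D, where D is the diagonal matrix of standard deviations.
D = diag(sd)
cov = matmul(matmul(D, R), D)

print(cov)
```

The diagonal of the result holds the variances (2 squared and 3 squared) and the off-diagonal holds 0.5 x 2 x 3, which is why the scales and the correlations can be given separate priors.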

Now, there exist programs like Mplus which will fit certain Bayesian models for you, but then you don’t have the flexibility that Bayesian modeling can offer if you know how to specify the models mathematically.

I wrote a whole reply assuming you were talking about a PhD student in statistics. I then realized it made no sense, re-read your first paragraph, and realized that a PhD *student* learning MANOVA is most probably living in a “soft” science somewhere (sociology, psychology, medicine, etc.) rather than someone from a STEM background who would have been expected to learn matrix linear algebra and MANOVA before the end of their second year of undergrad.

You can suggest to this student, regardless of their background, that if they want to actually *do* statistics (as opposed to running other people’s code and then having other people interpret the results) they will need to be able to understand the mechanisms used to get the results. You can’t understand error codes unless you understand the mechanisms in the algorithm that the error codes are referring to. You can’t *really* do regression (beyond blind plug-and-play) without understanding the algorithm that computes the coefficients you get as the result. Etc, etc, etc.
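As one illustration of such a mechanism, here is a sketch of the textbook normal-equations route to least-squares coefficients: solve (X'X) b = X'y. The data are invented and the helper names are hypothetical. This is not a claim about what any particular package does internally, since production software typically uses more numerically stable decompositions such as QR.

```python
from fractions import Fraction


def transpose(m):
    return [list(column) for column in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows (assumes they conform)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def solve(a, y):
    """Solve a x = y by Gauss-Jordan elimination with exact fractions."""
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, y)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            # This is the kind of failure a "singular matrix" error refers to.
            raise ValueError("singular matrix: predictors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Hypothetical data: a column of ones for the intercept plus one predictor.
X = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [[1], [3], [4], [8]]

Xt = transpose(X)
XtX = matmul(Xt, X)                          # 2 x 2 cross-product matrix
Xty = [row[0] for row in matmul(Xt, y)]      # right-hand side X'y
coefficients = solve(XtX, Xty)               # [intercept, slope]

print(coefficients)
```

If two predictors are perfectly redundant, X'X is singular and `solve` raises an error, which is one concrete case where knowing the mechanism makes the error message readable.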

Linear algebra is one of the most powerful tools in mathematics, especially for people in the applied/real world who work with data. The more you understand of it, the more powerful your other knowledge becomes, because it gives you a path to implementation and use.

Interesting, maybe I am using too narrow a definition of matrix algebra. Of course you have to understand what a matrix is and what a vector is, because this is pretty basic. The question is how far students NEED to go. There are certainly basic concepts you can learn more than one way. I don’t think there is any question that learning matrix algebra, or French, for that matter, or any other non-trivial subject, has benefit.

And, you’re right, I’ve done models where I had to specify matrices – but this is far beyond what most of the students I teach are ever going to be expected to do in the normal course of their work. In defense of that student, I don’t think he was questioning whether there was any usefulness for matrix algebra but rather whether it was something that he, personally, would use.

I am a big proponent of high standards, particularly in mathematics, but there is also the fact that only so much material can be covered in a course. The two things I try to eliminate are a) information you should have learned already and b) information not essential to this subject.

After I have covered everything else, I may add in c) information it would definitely be good to know.

For example, in a basic statistics course, we will get to one-way ANOVA. If there is time, I will cover n-way ANOVA but it’s not assumed everyone who had basic statistics knows that and if the class is having trouble getting the earlier material, we may not reach that topic. I’m certainly not saying no one needs to learn a 2 x 3 ANOVA or a mixed model.

As I said, I’ve been giving this a lot of thought – well, this is turning into a post in itself, so I will write more on error codes later.

I was hoping to see an answer to the question of how exactly matrix algebra is used, but instead read another description of it. Every book on matrix algebra simply starts by describing simple operations with matrices, goes on to more complicated stuff, and never bothers to explain how exactly matrix algebra is used, or why multiplying data (basically variables) by the same data makes sense.
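One standard case where multiplying data by the same data makes sense: if X holds mean-centered variables in its columns, then the transpose of X times X gives the sums of squares and cross-products, and dividing by n - 1 gives the sample covariance matrix. Below is a minimal sketch with invented numbers and hypothetical helper names.

```python
from fractions import Fraction


def transpose(m):
    return [list(column) for column in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows (assumes they conform)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Hypothetical data: four records (rows) measured on two variables (columns).
data = [[1, 2], [2, 4], [3, 5], [4, 9]]
n = len(data)

# Center each column by subtracting its mean (exact fractions avoid rounding).
means = [Fraction(sum(column), n) for column in zip(*data)]
centered = [[value - mean for value, mean in zip(row, means)] for row in data]

# "Data times the same data": X'X holds sums of squares and cross-products.
sscp = matmul(transpose(centered), centered)

# Dividing by n - 1 turns those sums into sample variances and covariances.
cov = [[entry / (n - 1) for entry in row] for row in sscp]

print(sscp)
print(cov)
```

The diagonal entries are each variable multiplied by itself (sums of squares, then variances), and the off-diagonal entries are the two variables multiplied by each other (cross-products, then covariances).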

You can multiply a 3 x 2 matrix by a 2 x 3 matrix! You should have said, e.g. you can multiply a 2 x 3 matrix by a 3 x 4 matrix but not vice versa!

Oh, you’re right. Fixed it. Thanks.