statistics

Variance and Eigenvalues

By AnnMaria De Mars, September 6, 2015

I find this scree plot of eigenvalues very helpful in identifying the number of factors. A scree plot is a plot of the eigenvalues by the factor number.

I realized this is only helpful if one understands what an eigenvalue is.

First of all, go way back to Stat 101 and remember that a correlation is the covariance of z-scores. Z-scores have a standard deviation of 1, and since the square of 1 is 1, they also have a variance of 1.
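If you want to check the Stat 101 claim yourself, here is a minimal sketch using only the Python standard library. The data are made up (the 0.6 slope and the sample size of 200 are arbitrary assumptions), and the helper names `z_scores` and `covariance` are just for illustration.

```python
import random
import statistics

random.seed(0)
# Made-up data: y depends partly on x, so the two are correlated.
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.6 * xi + random.gauss(0, 1) for xi in x]

def z_scores(values):
    """Standardize to mean 0 and standard deviation 1 (sample SD, n - 1)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def covariance(a, b):
    """Sample covariance with the n - 1 denominator."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    products = ((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    return sum(products) / (len(a) - 1)

zx, zy = z_scores(x), z_scores(y)

# Covariance of the z-scores ...
r_from_z = covariance(zx, zy)
# ... versus the usual formula: covariance divided by the product of the SDs.
r_direct = covariance(x, y) / (statistics.stdev(x) * statistics.stdev(y))

print("variance of z-scores:", round(statistics.variance(zx), 6))
print("r from z-scores:     ", round(r_from_z, 6))
print("r from raw scores:   ", round(r_direct, 6))
```

The two correlations should agree up to floating-point rounding, and the variance of the z-scores should come out as 1.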

Understand that the default is to factor analyze the correlation matrix and that means that your variables are standardized before analysis, with a variance of 1. So, that is the total amount of variance we are trying to explain for each variable.

Therefore, the total amount of variance to be explained in a matrix will equal the number of variables. If you have 10 variables, the total amount of variance to be explained is 10.

If you’ve ever looked at a correlation matrix you will have noticed that all of the diagonal elements are 1. The correlation of an item with itself is 1.

What percentage of the variance in an item is explained by itself? It should be obvious that it is 100%. If I know your age, for example, I can predict your age with 100% accuracy. Duh.
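A quick way to see both points (the diagonal of 1s and the total variance equal to the number of variables) is to build a correlation matrix and add up its diagonal. This is a sketch assuming NumPy is available; the 500 cases and 10 variables are random made-up data, not anything from a real study.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 10))    # 500 made-up cases, 10 made-up variables

# rowvar=False tells NumPy that each column is a variable.
R = np.corrcoef(data, rowvar=False)  # 10 x 10 correlation matrix

diagonal = np.diag(R)                # each item correlated with itself
total_variance = np.trace(R)         # sum of the diagonal

print("diagonal:", np.round(diagonal, 6))
print("total variance to explain:", round(float(total_variance), 6))
```

With 10 standardized variables the diagonal sums to 10, which is the total variance the factors have to account for.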

An eigenvalue is the total amount of variance in the variables in the dataset explained by the common factor. (Mathematically, it’s the sum of the squared factor loadings. If you are interested in that, you can come to my class at WUSS on Wednesday morning. Or, possibly, I’ll blog about it next week.)
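The "sum of the squared factor loadings" statement can be checked numerically. The sketch below assumes NumPy and uses a principal components style extraction, where the loadings are the eigenvectors of the correlation matrix scaled by the square root of the eigenvalues. That is an assumption made for illustration: other extraction methods (principal axis, maximum likelihood) change the diagonal or the estimation and will give somewhat different numbers. The data are simulated, with six made-up items sharing one common factor.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cases, n_items = 1000, 6

# Simulated items: each one is 0.7 * (one shared factor) + its own noise.
common = rng.normal(size=(n_cases, 1))
data = 0.7 * common + rng.normal(size=(n_cases, n_items))
R = np.corrcoef(data, rowvar=False)

# eigh is for symmetric matrices and returns eigenvalues in ascending order,
# so reorder to put the largest first.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Principal-component loadings: eigenvector column times sqrt(eigenvalue).
loadings = eigenvectors * np.sqrt(eigenvalues)

# Sum of squared loadings down each column (one column per factor).
sum_sq_loadings = (loadings ** 2).sum(axis=0)

print("eigenvalues:            ", np.round(eigenvalues, 3))
print("sum of squared loadings:", np.round(sum_sq_loadings, 3))
print("sum of eigenvalues:     ", round(float(eigenvalues.sum()), 6))
```

Under this extraction the two printed rows match, and the eigenvalues add up to the number of items, which is the total variance in the correlation matrix.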

Now, if a factor has an eigenvalue of 1, it is pretty useless. That is because the whole purpose of factor analysis is to replace your 20 or 50 items that each explain a variance of 1 (their own variance) with a few common factors. A factor with an eigenvalue of 1 doesn’t explain any more variance than a single item.

Let’s say you have 24 items and your first factor has an eigenvalue of 6. Is that good? Yes, because that means that a single factor explains as much of the variance in the matrix of data as six items. If you could get four orthogonal factors, each with an eigenvalue close to 6, then you would have explained nearly 100% of the variance in your 24 items with just 4 factors.
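The arithmetic here is just the eigenvalue divided by the number of items. A small sketch in plain Python, where the four eigenvalues "close to 6" are hypothetical numbers picked for illustration:

```python
n_items = 24
total_variance = n_items          # each standardized item contributes 1

# One factor with an eigenvalue of 6, as in the example above.
first_eigenvalue = 6.0
first_proportion = first_eigenvalue / total_variance

# Hypothetical eigenvalues for four orthogonal factors, each close to 6.
eigenvalues = [6.0, 5.9, 5.8, 5.7]
proportions = [ev / total_variance for ev in eigenvalues]
cumulative = sum(proportions)

print(f"first factor explains {first_proportion:.1%} of the variance")
print(f"four factors explain  {cumulative:.1%} of the variance")
```

Because the factors are orthogonal, their proportions of variance can simply be added.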

Think about correlation matrices again. It’s not often you see an EXACTLY zero correlation. You’ll find correlations of .08, .03, .12 just by chance. Who knows, maybe the same person had the highest score on both sticks of bubble gum chewed and number of asses kicked (R.I.P. Rowdy Roddy Piper). That doesn’t mean those two variables really have that much in common. This is why we look at statistical significance and how likely something is to occur by chance.

How do you tell if a factor has a higher degree of common variance (that is, a higher eigenvalue) than would be expected by chance? One way is the scree plot. You look at the eigenvalue for each factor and see where it drops off.
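To get a feel for what "drops off" looks like, here is a rough sketch (assuming NumPy) that lists eigenvalues in scree order for two simulated datasets: one where 12 made-up items are driven by two factors, and one that is pure noise. The noise comparison is my own addition to show what chance-level eigenvalues look like; the sample size, loadings of 0.8, and the helper name `sorted_eigenvalues` are all assumptions for illustration.

```python
import numpy as np

def sorted_eigenvalues(data):
    """Eigenvalues of the correlation matrix, largest first (scree order)."""
    R = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(R))[::-1]

rng = np.random.default_rng(2)
n_cases, n_items = 500, 12

# Pure noise: every item is independent, so any correlation is chance.
noise = rng.normal(size=(n_cases, n_items))

# Structured data: items 0-5 share one made-up factor, items 6-11 another.
factors = rng.normal(size=(n_cases, 2))
pattern = np.zeros((2, n_items))
pattern[0, :6] = 0.8
pattern[1, 6:] = 0.8
structured = factors @ pattern + rng.normal(size=(n_cases, n_items))

noise_eigs = sorted_eigenvalues(noise)
structured_eigs = sorted_eigenvalues(structured)

# Size of the drop from each factor to the next one.
drops = structured_eigs[:-1] - structured_eigs[1:]
biggest_drop_after = int(np.argmax(drops)) + 1   # 1-based factor number

for number, (s, c) in enumerate(zip(structured_eigs, noise_eigs), start=1):
    print(f"factor {number:2d}   structured {s:5.2f}   noise {c:5.2f}")
print("largest drop comes after factor", biggest_drop_after)
```

In the noise data the eigenvalues hover around 1, with the first few a bit above 1 purely by chance. In the structured data the first two stand well clear of that and then the values fall away, which is the elbow you look for on the plot.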

I would write more about this, but my family is urging me to leave for a barbecue for Maria’s birthday, so you will have to wait until tomorrow for a more detailed explanation of scree plots and why they are called that.



4 Comments

David Pearson says:

October 2, 2017 at 5:24 pm

Very thorough, concise, easily understood explanation of eigenvalue. Thank you. Intending to use FactoMineR, an R package, to do MCA on a set of data with 4500 variables; most columns only have a small subset of those variables. Not certain that FactoMineR can provide discernible results, we’ll see. If you’re interested, I’ll share results with you.
Again, thanks for an interesting post, hope your cook-out went well.
Dr Pulkit Pandey says:

March 31, 2021 at 12:38 pm

thank you so much. i was struggling to understand it. this totally cleared it for me
