Parallel Analysis Criterion Simplified?

By AnnMaria De Mars, October 23, 2014

Am I missing something here? All of the macros I have seen for the parallel analysis criterion for factor analysis look pretty complicated, but, unless I am missing something, it is a simple deal.

The presumption is this:

There isn’t a number like a t-value or F-value to use to test if an eigenvalue is significant. However, it makes sense that the eigenvalue should be larger than if you factor analyzed a set of random data.

Random data is, well, random, so it’s possible you might have gotten a really large or really small eigenvalue the one time you analyzed the random data. So, what you want to do is analyze a set of random data with the same number of variables and the same number of observations a whole bunch of times.

Horn, back in 1965, proposed that the eigenvalue should be higher than the average eigenvalue you get from analyzing sets of random data. Now, people are suggesting it should be higher than the eigenvalue you get 95% of the time you analyze random data (which kind of makes sense to me).

Either way, it seems simple. Here is what I did and it seems right so I am not clear why other macros I see are much more complicated. Please chime in if you see what I’m missing.

Generate a set of random data with N variables and Y observations.
Factor analyze it and keep the eigenvalues.
Repeat 500 times.
Combine the 500 datasets (each will have only 1 record with N variables).
Find the 95th percentile of each eigenvalue.

%macro para(numvars,numreps) ;
  /* numvars = number of variables ; numreps = number of observations in each random data set */
  %DO k = 1 %TO 500 ;
    data A;
      array nums {&numvars} a1-a&numvars ;
      do i = 1 to &numreps;
        do j = 1 to &numvars ;
          nums{j} = rand("Normal") ;
          if j < 2 then nums{j} = round(100*nums{j}) ;
          else nums{j} = round(nums{j}) ;
        end ;
        drop i j ;
        output;
      end;
    run ;

    /* Factor the random data and keep only the row of eigenvalues */
    proc factor data= a outstat = a&k noprint;
      var a1-a&numvars ;
    run ;
    data a&k ;
      set a&k ;
      if trim(_type_) = "EIGENVAL" ;
    run ;
  %END ;
%mend ;

%para(30,1000) ;

data all ;
  set a1-a500 ;
run ;

proc univariate data= all noprint ;
  var a1-a30 ;
  output out = eigvals pctlpts = 95 pctlpre = pa1-pa30;
run ;

*** You do not need the transpose but I just find it easier to read ;
proc transpose data= eigvals out=eigsig ;
run ;
Title "95th Percentile of Eigenvalues " ;
proc print data = eigsig ;
run ;
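
The step that follows from the presumption above is to line up the eigenvalues from your real data against these percentiles. Here is a minimal sketch of that comparison. It assumes a hypothetical data set named mydata with 30 variables named x1-x30 (neither name comes from the program above), and it assumes eigsig holds one row per eigenvalue, in order, with the percentile in COL1.

```sas
/* Sketch only: mydata and x1-x30 are hypothetical names for your real data */
proc factor data=mydata outstat=realstat noprint;
  var x1-x30;
run;

/* Keep the row of eigenvalues, as was done for the random data */
data realeig;
  set realstat;
  if trim(_type_) = "EIGENVAL";
run;

/* One row per eigenvalue, to line up with eigsig */
proc transpose data=realeig out=realt(rename=(col1=observed));
  var x1-x30;
run;

/* One-to-one merge by row order: first eigenvalue with first percentile, and so on */
data compare;
  merge realt(keep=observed)
        eigsig(keep=col1 rename=(col1=random95));
  exceeds = (observed > random95);  /* 1 if larger than the random-data percentile */
run;

proc print data=compare;
run;
```

A common way to read the result is to keep factors from the top down until the first row where exceeds is 0.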

It runs fine and I have puzzled and puzzled over why a more complicated program would be necessary. I ran it 500 times with 1,000 observations and 30 variables and it took less than a minute on a remote desktop with 4GB RAM. Yes, I do see the possibility that if you had a much larger data set that you would want to optimize the speed in some way. Other than that, though, I can’t see why it needs to be any more complicated than this.

If you wanted to change the percentile, say, to 50, you would just change the 95 above. If you wanted to change the method from, say, Principal Components Analysis (the default, with communality of 1) to something else, you could just do that in the PROC FACTOR step above.
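
As an illustration of swapping the criterion, here is a sketch that produces both Horn's average and the 95th percentile in one pass. It assumes the data set all from the program above already exists. PROC MEANS is swapped in for PROC UNIVARIATE here only because its AUTONAME option names the output variables for you; the output names eigcrit and eigcrit_t are made up for this example.

```sas
/* Sketch only: assumes the data set all (500 rows of eigenvalues a1-a30) already exists */
proc means data=all noprint;
  var a1-a30;
  /* AUTONAME creates names such as a1_Mean and a1_P95 for each variable */
  output out=eigcrit mean= p95= / autoname;
run;

/* Drop the bookkeeping variables PROC MEANS adds, then transpose for easier reading */
proc transpose data=eigcrit(drop=_type_ _freq_) out=eigcrit_t;
run;

proc print data=eigcrit_t;
run;
```

To change the method instead, an option such as priors=smc can be added to the PROC FACTOR statement inside the macro, which uses squared multiple correlations as the prior communality estimates in place of 1.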

The above assumes a normal distribution of your variables, but if that was not the case, you could change that in the RAND function above.
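
For instance, if your real variables were 5-point rating items, the random data could be drawn from a table of category probabilities. This is a sketch only: the probabilities, the seed, and the sizes (30 variables, 1,000 observations) are made-up values for illustration, not something taken from real data.

```sas
/* Sketch only: one random data set of 5-point items instead of normal scores */
data A;
  call streaminit(2014);          /* arbitrary seed so the run can be repeated */
  array nums {30} a1-a30;
  do i = 1 to 1000;
    do j = 1 to 30;
      /* Returns 1 to 5 with the listed probabilities (they sum to 1) */
      nums{j} = rand("Table", 0.1, 0.2, 0.4, 0.2, 0.1);
    end;
    output;
  end;
  drop i j;
run;
```

Inside the macro, the same RAND call would take the place of the rand("Normal") line and the rounding that follows it.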

As I said, I am puzzled. Suggestions to resolve my puzzlement are welcome.


4 Comments

Abby Paden says:

December 18, 2016 at 11:27 am

I think a lot of the code out there in the ether is terribly optimized and unnecessarily complex. It’s as if some of the authors are trying to win an Obfuscated Code Challenge.

I will be using your PA program further into my degree.

AnnMaria says:

January 4, 2017 at 1:05 am

Ha ha ha, you win my best comment of the week award!

GM Jackson says:

January 4, 2017 at 3:57 pm

In my opinion, finding a simpler, faster, more efficient way is always better. If something seems overly complicated, it’s usually because the creator of that complicated mess doesn’t want anyone to figure out how it works. It’s easier to monopolize a given task if you are the only one who knows how to do it. Once you simplify it, practically anyone can do it.

AnnMaria says:

January 4, 2017 at 7:37 pm

I’m the opposite – if practically anyone can do it then I’m freed up to find a new, more interesting challenge.
