Being able to find SPSS in the start menu does not qualify you to run a multinomial logistic regression.

This is the kind of comment that statisticians find funny and that leaves everyone else scratching their heads. The point is that it's not that difficult to get output for some fairly complex statistical procedures; the hard part is knowing what the output means.

Let’s start with the confirmatory factor analysis I mentioned in my last post. Once you get past the standard preamble (the message that your model terminated successfully, the number of variables and factors), you see this:

Chi-Square Test of Model Fit

Value                              8.707
Degrees of Freedom                 8
P-Value                           0.3676

The null hypothesis is that there is no difference between the pattern observed in these data and the model specified. So, unlike many cases where you are hoping to reject the null hypothesis, here I certainly do NOT want to reject the hypothesis that the model fits. Since the p-value above (.3676) is well above .05, we fail to reject that hypothesis, and the model is acceptable.
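If you want to check the reported p-value yourself, it can be reproduced with a few lines of Python. This is a sketch using only the standard library; it relies on the closed-form chi-square upper tail that holds for even degrees of freedom (for general df you would reach for something like scipy.stats.chi2.sf):

```python
import math

def chi2_sf_even_df(x, df):
    """P(X >= x) for a chi-square variable with EVEN df.
    For even df the upper tail has the closed form
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    """
    assert df % 2 == 0, "this closed form only holds for even df"
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))

# Value and Degrees of Freedom from the Mplus output above
p = chi2_sf_even_df(8.707, 8)
print(round(p, 4))  # 0.3676, matching the reported P-Value
```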

Another measure of goodness of fit is the root mean square error of approximation (RMSEA).

RMSEA (Root Mean Square Error Of Approximation)

Estimate                           0.011
90 Percent C.I.                    0.000  0.046
Probability RMSEA <= .05           0.973

An acceptable model should have an RMSEA less than .05. You can see above that the estimate for RMSEA is .011, the 90 percent confidence interval is 0 – .046 and the probability that the population RMSEA is less than .05 is 97.3%. Again, consistent with our chi-square, the model appears to fit.
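To see where that point estimate comes from, here is a minimal sketch of the usual RMSEA formula. The sample size is not shown in this output excerpt, so the n = 500 below is purely hypothetical, chosen only to illustrate that a chi-square barely above its degrees of freedom yields an RMSEA well under .05:

```python
import math

def rmsea(chi_sq, df, n):
    """Point estimate of RMSEA from the model chi-square.
    Standard formula: sqrt(max(chi2 - df, 0) / (df * (n - 1))).
    (Some software uses n rather than n - 1 in the denominator.)
    """
    return math.sqrt(max(chi_sq - df, 0.0) / (df * (n - 1)))

# chi-square and df from the output above; n = 500 is a made-up sample size
print(round(rmsea(8.707, 8, 500), 3))  # 0.013, comfortably below .05
```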
MODEL RESULTS

                                           Two-Tailed
                Estimate       S.E.  Est./S.E.    P-Value

F1       BY
Q1F1               1.000      0.000    999.000    999.000
Q2F1               1.828      0.267      6.833      0.000
Q3F1               1.697      0.235      7.231      0.000

F2       BY
Q1F2               1.000      0.000    999.000    999.000
Q2F2               1.438      0.291      4.943      0.000
Q3F2               1.085      0.191      5.687      0.000

Here are the unstandardized estimates. By default, the first variable for each factor is constrained to a value of 1, so there is no real standard error or p-value for it; it isn’t really an estimate, it was fixed. Let’s look at the other two loadings on each factor. Since they are unstandardized, the more useful measure for us is the estimate divided by its standard error, for example 1.828/0.267. This is done for us in the column under Est./S.E., where it comes out to 6.833 (slightly different from the ratio of the rounded printed values, about 6.85, because Mplus computes it from unrounded numbers). You interpret these values like any z-score, with 1.96 as the two-tailed critical value, and you can see in the last column that all of my variables loaded on the hypothesized factor with a p-value much less than .05.
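A quick sketch of that Wald z test applied to the Q2F1 loading, using only the standard library (the erfc identity below is the usual way to get a normal two-tailed p-value without scipy):

```python
import math

def loading_z_test(estimate, se):
    """Wald z statistic for a loading and its two-tailed p-value.
    Two-tailed p = 2 * P(Z >= |z|) = erfc(|z| / sqrt(2)).
    """
    z = estimate / se
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_tailed

# Q2F1 estimate and S.E. from the table above; the ratio of these rounded
# values (about 6.85) differs slightly from the printed 6.833, which Mplus
# computes from full-precision numbers.
z, p = loading_z_test(1.828, 0.267)
print(round(z, 2), z > 1.96, p < 0.05)  # 6.85 True True
```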

The next thing I look at is the residual variances. At this point my only concern is that I *not* have a residual variance that is negative. A negative variance makes no sense because (among other reasons) a variance is a sum of squares, and squares cannot be negative. Equivalently, when a residual variance is negative, the communality is greater than 1, meaning you have explained over 100% of the variance in that variable by its relation to the latent construct, which also makes no sense. These are referred to as Heywood cases and are explained beautifully here (even though the linked documentation is from SAS, it applies to any confirmatory factor analysis).
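Here is a tiny illustration of that check, under the simplifying assumption of a standardized solution where each indicator loads on a single factor (so the communality is just the squared standardized loading):

```python
def residual_variance(std_loading):
    """For a standardized solution with one factor per indicator:
    communality = loading**2, residual variance = 1 - communality."""
    return 1.0 - std_loading ** 2

def is_heywood(std_loading):
    """A residual variance below zero (communality above 1) is a Heywood case."""
    return residual_variance(std_loading) < 0

print(is_heywood(0.69))  # False: residual variance is 1 - 0.4761 = 0.5239
print(is_heywood(1.04))  # True: a standardized loading above 1 implies
                         # a negative residual variance
```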

The final thing I want to look at, for right now anyway, is the R-squared values.

R-SQUARE

Observed                                        Two-Tailed
Variable        Estimate       S.E.  Est./S.E.    P-Value

Q1F1               0.142      0.032      4.473      0.000
Q2F1               0.475      0.065      7.256      0.000
Q3F1               0.438      0.061      7.123      0.000
Q1F2               0.174      0.045      3.883      0.000
Q2F2               0.376      0.078      4.827      0.000
Q3F2               0.179      0.044      4.057      0.000

You can see that the R-square values are pretty decent overall. These are interpreted just like any other R-square values. I didn’t show the standardized factor loadings here, but take my word for it that the R-square values are the standardized loadings squared. So this is the variance in Q1F1, for example, explained by Factor 1.
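Because squaring loses the sign, only the magnitude of each standardized loading can be recovered from the R-SQUARE table, but as a sketch (the dictionary below simply re-keys the estimates printed above):

```python
import math

# R-square estimates from the R-SQUARE table above
r_squares = {"Q1F1": 0.142, "Q2F1": 0.475, "Q3F1": 0.438,
             "Q1F2": 0.174, "Q2F2": 0.376, "Q3F2": 0.179}

# Implied magnitude of each standardized loading: sqrt of R-square.
# The sign is not recoverable this way, though all loadings here are
# positive per the unstandardized table.
implied = {var: round(math.sqrt(r2), 3) for var, r2 in r_squares.items()}
print(implied["Q2F1"])  # 0.689
```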

I started this whole thing working with Mplus to do a factor analysis, and overall I’d have to call it a pretty painless experience.


Comments

3 Responses to “Interpreting Confirmatory Factor Analysis Output from Mplus”

  1. SOMIA on April 4th, 2017 3:11 pm

    Is it possible to have good overall model fit indices, e.g., CFI 0.96 and RMSEA 0.04–0.07, but some items having non-significant loadings, while R-square is significant for all of them?

  2. Ansh on September 8th, 2017 9:17 am

    Thanks for the beautiful explanation. Just to confirm whether I have understood correctly: when judging the model fit, p > 0.05 is desirable because of the way the null hypothesis is framed; however, when looking at a factor loading’s estimate (say, the estimate for Q2F1), a p-value < 0.05 is desirable, the same as in most regression analyses.
    Am I correct?

  3. Miriam on January 15th, 2018 10:46 am

    If my chi-square test of model fit is significant (so I failed to reject the Ho) but my RMSEA is over 0.05 and my estimates are acceptable, does that mean the model is OK? or because my chi-square results are not good then I cannot accept the model?

    Thank you.
