# Random Rambling on Structural Equation Models

Sometimes people talk about path analysis models, confirmatory factor analysis and/or exploratory factor analysis as separate and distinct techniques from structural equation modeling (SEM). That is rather like talking about Dogo Argentinos as different from dogs when in fact they are a TYPE of dog (picture of dogo attached for those wondering).

Similarly, path analysis and factor analysis (whether exploratory or confirmatory) are all types of SEM. When I first took a course in SEM (yes, in the 1980s) most people I knew, when they spoke of structural equation models, were referring to more complex models that combined a measurement model of latent variables with hypothesized paths among them, but those aren’t the ONLY types of models in SEM.

Glad we cleared that up.

AMOS, which is what I happen to be using today, does not use pairwise deletion, listwise deletion or data imputation to handle missing data. When AMOS computes maximum likelihood estimates in the presence of missing data, it assumes that the data are MISSING AT RANDOM.
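For anyone who wants to see what "maximum likelihood without deletion or imputation" can look like, here is a sketch of the casewise log-likelihood usually called full-information maximum likelihood. It assumes multivariate normal data, and the notation is an assumption made for illustration rather than anything taken from the AMOS documentation: $x_i$ holds the values actually observed for case $i$, $p_i$ is how many values that is, and $\mu_i(\theta)$ and $\Sigma_i(\theta)$ are the parts of the model-implied mean vector and covariance matrix that correspond to those observed variables.

$$
\ell(\theta) = -\frac{1}{2}\sum_{i=1}^{N}\left[\, p_i \log(2\pi) + \log\lvert\Sigma_i(\theta)\rvert + \bigl(x_i - \mu_i(\theta)\bigr)^{\top}\Sigma_i(\theta)^{-1}\bigl(x_i - \mu_i(\theta)\bigr) \right]
$$

Each case contributes to the sum using whatever variables it has, so nobody gets thrown out and nothing gets filled in.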

Just because life isn’t complicated enough, there are three categories to worry about: Missing Completely at Random, Missing at Random and Not Missing at Random. There is a really nice post about these on the onbiostatistics blog. Missing Completely at Random (MCAR) means that whether a value is missing is not related to the values of any variables in the study. For example, in an analysis of academic achievement, if the subjects lost to follow-up were lost at random, the data would meet the MCAR criterion.

However, with the work we do with 7 Generation Games, for example, that’s not usually the case. In general, students in larger communities are more likely to be lost to follow-up just because they can change classes within schools or change schools altogether and thus move out of our experimental group. In the smaller towns, there is only one school and only one fifth-grade classroom in that school, so for the student to be lost to follow-up, he or she would have to move out of town. So …. missing data is related to the size of the town one lives in. It’s not missing COMPLETELY at random.

BUT …. if the academic achievement of the children with missing data is no different from that of the children whose data are not missing (once we take town size into account), then we can say that the data are Missing at Random. Whether a value is missing or not is unrelated to the value we are missing. It’s not as if they run you out of town because your child is none too bright or beg you to stay because you have the best speller in the third grade. If that WERE the case, then our data would be Not Missing at Random.
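To make the three categories concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: the dropout probabilities, the score distribution, and the names `simulate` and `summarize` are assumptions, not 7 Generation Games data or anything AMOS does. Scores are drawn independently of town size, which keeps the comparison simple.

```python
import math
import random


def simulate(n=20000, seed=0):
    """Create fake students with a town size, a score, and three 'lost' flags."""
    rng = random.Random(seed)
    students = []
    for _ in range(n):
        big_town = rng.random() < 0.5
        score = rng.gauss(50, 10)  # assumption: unrelated to town size
        # Hypothetical NMAR rule: the lower the score, the likelier the dropout.
        p_nmar = 0.4 / (1 + math.exp((score - 50) / 5))
        students.append({
            "big_town": big_town,
            "score": score,
            "MCAR": rng.random() < 0.20,  # same chance for everyone
            "MAR": rng.random() < (0.35 if big_town else 0.05),  # town size only
            "NMAR": rng.random() < p_nmar,  # depends on the score itself
        })
    return students


def summarize(students, mechanism):
    """Compare dropout rates by town size and scores of lost vs. retained students."""
    big = [s for s in students if s["big_town"]]
    small = [s for s in students if not s["big_town"]]
    lost = [s["score"] for s in students if s[mechanism]]
    kept = [s["score"] for s in students if not s[mechanism]]
    return {
        "lost_rate_big_town": sum(s[mechanism] for s in big) / len(big),
        "lost_rate_small_town": sum(s[mechanism] for s in small) / len(small),
        "score_gap_lost_minus_kept": sum(lost) / len(lost) - sum(kept) / len(kept),
    }


students = simulate()
results = {m: summarize(students, m) for m in ("MCAR", "MAR", "NMAR")}
for mechanism, summary in results.items():
    print(mechanism, {k: round(v, 2) for k, v in summary.items()})
```

Under these assumptions, the MCAR rows should show about the same dropout rate in big and small towns, the MAR rows should show very different dropout rates but nearly the same average score for lost and retained students, and the NMAR rows should show lost students scoring noticeably lower. In real data, of course, you never get to see the scores of the students you lost, which is exactly why the last case is the hard one.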

Tune in again tomorrow because I’m in an SEM mood.