### Jan

#### 8

# Livebinders: 20-day blogging challenge, day two

January 8, 2014 | 1 Comment

Today I’m on day two of the 20-day blogging challenge, the brainchild of Kelly Hines and a great way to find new, interesting bloggers. The prompt for day two was to share an organizational tip from your classroom, one thing that works for you.

The latest tool I’ve been using is livebinders. Remember when you were in college and had a binder full of notes, handouts from the professor, maybe even copies of tests to study for the final? Well, livebinders appears to be designed more for clipping websites and including media from the web, but personally I am using it to create binders for teaching statistics. I’ve just started with one but I’m sure this will eventually split off into several binders.

I’m always writing notes to myself but I have them everywhere – I used Google Notebook until they got rid of it, I use Evernote, and I’ve got notepads on my laptop, desktop, iPad and phone, and even paper notebooks around the place. I even have a PadsX program The Invisible Developer wrote years ago just for me (yes, he loves me).

Still, I’m thinking livebinders is going to be really useful for me to organize all of these notes into one spot.

Why do I want to do that, you might ask?

Well, statistics is a big field, and I have taught a lot of it, from advanced multivariate statistics to psychometrics to biostatistics and a lot of special topics courses. It seems to me that we often assume students have a solid grasp of certain concepts, such as variance or standardization, when I’m sure many of them do not. As I read books and articles, I’m trying to note what these assumptions are. My next step is to have pages in the binders where students can get a fuller explanation of, say, what a confidence interval really means.

Right now, I feel that universities are trying to cut costs by combining information into fewer and fewer courses. We say that students learned Analysis of Variance in a course, but did they really?

The basic statistics sequence when I was in graduate school started with a descriptive statistics class (I tested out of that), which ended with a brief introduction to hypothesis testing and a discussion of z-scores, t-tests and correlation. The inferential statistics course reviewed hypothesis testing, t-tests and correlation, then focused on regression and ANOVA. The multivariate statistics course covered techniques like cluster analysis, canonical correlation and discriminant function analysis. Psychometric statistics covered factor analysis and various types of reliability and validity. These four courses were the BASICS, what everyone in graduate school took. (People like me who specialized in applied statistics took a bunch more classes on top of that.) Oh, yes, and each class came with a three-hour computer lab AFTER the three-hour lecture, to teach you enough programming so you could do the analyses yourself.

Now, many textbooks try to include all of this in one course, which is just a joke and ends with students concluding that they “are just not very good at math”.

I can’t change the curriculum, but what I at least can do is provide some type of resource where, every time a student feels he or she needs to back up and understand some concept, there is an explanation of that concept.

I plan to have this done by the time I teach Data Mining in August.

Suggestions for what to include are welcome.

### Jan

#### 6

I came across this really interesting post on the 20-Day Blogging Challenge for teachers. I’m not sure how likely I am to be able to finish it in January since it is already the sixth and January is a really busy month for me, but we will see.

The first prompt is “Tell about a favorite book to share or teach. Provide at least one example of a cross-curricular lesson.”

One book that I like and have been reading lately is the IBM SPSS Amos 22 User’s Guide, by James Arbuckle. Unlike most documentation, it isn’t just which statements to use when. It gives a good discussion of structural equation modeling from the very basics. (Here’s a link to a free download of the guide for Amos 21. It’s pretty much the same.) The nice thing about it is, if you have Amos installed, it comes with the data that is used in the examples so you can compare your results to the book.

For no reason, I just decided to see how close the covariance estimates you get with Amos are to the actual covariances. I ran the correlation procedure in SPSS and requested covariances using one of the Amos example data sets. Then I ran the same analysis in Amos. The estimates were all really close but not identical to the actual values. For example, the covariance of recall1 and recall2 was 2.622 and the estimated covariance was 2.556.
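One possible explanation for a gap of this size, offered here as an assumption and not something checked against Amos or SPSS settings, is the divisor: an unbiased sample covariance divides by N − 1, while a maximum likelihood estimate divides by N. The two numbers quoted above have a ratio of about 0.975, which is what (N − 1)/N equals when N = 40. The sketch below uses made-up scores (not the Amos example data) and a hypothetical helper named `covariance` to show the two versions side by side.

```python
def covariance(x, y, ddof):
    """Covariance of two equal-length sequences.

    ddof=1 divides by N - 1 (unbiased sample covariance).
    ddof=0 divides by N (maximum likelihood estimate under normality).
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n <= ddof:
        raise ValueError("need more observations than ddof")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cross_products / (n - ddof)


# Made-up scores for illustration only; these are NOT the Amos example data.
recall_a = [8, 11, 9, 14, 10, 12, 7, 13]
recall_b = [9, 12, 8, 15, 11, 11, 8, 14]

n = len(recall_a)
unbiased = covariance(recall_a, recall_b, ddof=1)
ml = covariance(recall_a, recall_b, ddof=0)

print("unbiased (N - 1):", unbiased)
print("maximum likelihood (N):", ml)
print("ratio ml / unbiased:", ml / unbiased, "vs (N - 1) / N:", (n - 1) / n)
```

The maximum likelihood value is always the unbiased value multiplied by (N − 1)/N, so the two get closer as the sample grows.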

As far as a cross-curricular lesson – I think this might be useful if I had a chance to discuss maximum likelihood methods versus ordinary least squares. I just finished teaching a course in biostatistics and even though we did discuss logistic regression and I had a few students use logistic regression for their analysis projects, we did not have nearly enough time to delve into it in depth. I’m teaching a data mining course in August, but it is going to be using SAS Enterprise Miner, so while the concepts in the book might apply in some instances – he covers a lot of territory – it won’t be the same software.

As I was reading this book, though, I was thinking about the diversity of students in almost every class that I have ever taught. It would be fun to teach a course in SEM, but I know that some students are still struggling with the concept of variance. So … my decision for the day is to start this week on some short instructional videos that can supplement the limited class time that we have. I think I’m going to start with the very basics – what is variance and what is covariance.

### Jan

#### 5

# New Year’s Non-Resolutions

January 5, 2014 | 1 Comment

This isn’t the first year that I’ve resolved to be the same, so even my resolving to stay the same is staying the same.

I noticed I’ve gotten a lot better with javascript over the last couple of years, a little better with PHP and SQL. My resolution is to keep doing that. As with everything in life, I’d like to get better faster, but I don’t expect that to change either.

I used to read some book on programming before I got out of bed in the morning. Lately, I’ve been reading the New York Times instead. I think I’ll go back to reading programming books because the news doesn’t change much. Some people are shooting at some other people in a place I could not find on the map. Country X is doing better than the U.S. in math. Country Y had a natural disaster and the people who didn’t die are now homeless and hungry. The Republicans are against health care and the Democrats are against the Republicans. I’d be better off reading about new javascript libraries or modeling techniques.

Speaking of modeling techniques, I’ve been back and forth with SAS over the past couple of years. The new on-demand offerings hold some promise. Teaching biostatistics I got to use some procedures I hadn’t used in a while, or at all. On the other hand, I did almost no macro programming in SAS this year. Just didn’t come up, which is kind of weird. I’m teaching a data mining class in August, so I’ll finally be using Enterprise Miner. I’m looking forward to that. I’m also intrigued by the new model selection procedures and I may just be able to apply those to a project I’m working on now, which I’m looking forward to as well.

I have two small contracts to finish this month, and three others that run out this year. I keep saying I’m going to do less work but it seems as if whenever I cut back on one thing, something else increases. If there were anything I could make NOT the same, it would be that I’d find it easier to relax. It’s not that I don’t have plenty of opportunities – I could walk on the beach or hike in the mountains. The problem is that whenever I’m not working I keep thinking about that training video I need to make or programming interactive questions for the pretest or the next game activity. Though I have gotten better at turning down contracts, I still take on more than I should. The most helpful person with that has been The Spoiled One, who today suggested we walk down and try out a new vegan Thai restaurant on Main Street. (To my amazement, it was excellent.) Tomorrow, the two of us are heading to Disneyland – The Invisible Developer bought me an annual pass for Christmas.

We applied to Co.lab for our newest start-up, 7 Generation Games, so if we get in there, that will be awesome and we’ll be relocating to Silicon Valley for a few months. If not, I have another source in mind for supplemental funding that I’ll apply to. I always have a Plan B and a Plan C and a Plan D and I don’t expect that to change, either.

My resolution is to make this year pretty much the same as last year, only better. Which might sound a bit boring, but looking back on it, I realize 2013 was pretty damn awesome. (If you’re curious, you can read my Christmas letter here, on my personal blog, which prompted my mother to call me and say that she thought a woman as educated as me should use better language. Obviously, my mother seldom reads my blog – or talks to me, for that matter.) New resolution: call Mom more often.

### Jan

#### 2

# Random Rambling on Structural Equation Models

January 2, 2014 | Leave a Comment

Sometimes people talk about path analysis models, confirmatory factor analysis and/or exploratory factor analysis as separate and distinct techniques from structural equation modeling (SEM). That is rather like talking about Dogo Argentinos as different from dogs when in fact they are a TYPE of dog (picture of dogo attached for those wondering).

Similarly, path analysis and factor analysis (whether exploratory or confirmatory) are all types of SEM. When I first took a course in SEM (yes, in the 1980s) most people I knew, when they spoke of structural equation models, were referring to more complex models that combined a measurement model of latent variables with hypothesized paths among them, but those aren’t the ONLY types of models in SEM.

Glad we cleared that up.

AMOS, which is what I happen to be using today, does not use pairwise deletion, listwise deletion or data imputation. In computing maximum likelihood estimates in the presence of missing data, AMOS assumes that the data are MISSING AT RANDOM.

Just because life isn’t complicated enough, there are three categories to worry about: Missing Completely at Random, Missing at Random and Not Missing at Random. There is a really nice post about these on the onbiostatistics blog. Missing Completely at Random means that the data being missing is not related to any values of any variables in the study. For example, in doing an analysis of academic achievement, if the subjects lost to follow-up are lost at random, the data would meet the MCAR criterion. However, with the work we do with 7 Generation Games, for example, that’s not usually the case. In general, students in larger communities are more likely to be lost to follow-up just because they can change classes within schools or change schools altogether and thus move out of our experimental group. In the smaller towns, there is only one school and only one fifth-grade classroom in that school, so for the student to be lost to follow-up, he or she would have to move out of town. So …. missing data is related to the size of the town one lives in. It’s not missing COMPLETELY at random.

BUT …. if, once town size is taken into account, the academic achievement of the children with missing data is no different from that of the children for whom data are not missing, then we can say that the data are Missing at Random. The missingness or not-missingness is not related to the value of the data we are missing. It’s not as if they run you out of town because your child is none too bright or beg you to stay because you have the best speller in the third grade. If that WERE the case, then our data would be Not Missing at Random.
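The three categories can be illustrated with a small simulation. This is only a sketch under stated assumptions: the scores, the dropout probabilities and the function name `simulate` are all invented for illustration, and achievement is generated independently of town size to match the example above. It compares the mean score of all students with the mean of the students who were not lost to follow-up.

```python
import random


def simulate(mechanism, n=20000, seed=1):
    """Return (mean of all scores, mean of observed scores).

    All numbers here are hypothetical. Scores are drawn independently of
    town size, mirroring the example in the text.
    """
    rng = random.Random(seed)
    all_scores = []
    observed_scores = []
    for _ in range(n):
        big_town = rng.random() < 0.5
        score = rng.gauss(50, 10)
        if mechanism == "MCAR":
            # Dropout unrelated to any variable in the study.
            p_missing = 0.20
        elif mechanism == "MAR":
            # Dropout depends on an observed variable (town size) only.
            p_missing = 0.35 if big_town else 0.05
        elif mechanism == "NMAR":
            # Dropout depends on the very value that goes missing.
            p_missing = 0.35 if score < 50 else 0.05
        else:
            raise ValueError("unknown mechanism: " + mechanism)
        all_scores.append(score)
        if rng.random() >= p_missing:
            observed_scores.append(score)
    full_mean = sum(all_scores) / len(all_scores)
    observed_mean = sum(observed_scores) / len(observed_scores)
    return full_mean, observed_mean


results = {m: simulate(m) for m in ("MCAR", "MAR", "NMAR")}
for mechanism, (full_mean, observed_mean) in results.items():
    print(mechanism, round(full_mean, 2), round(observed_mean, 2))
```

Under the first two mechanisms the observed mean stays close to the full mean, while under NMAR the low scorers drop out more often and the observed mean drifts upward.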

Tune in again tomorrow because I’m in an SEM mood.

### Jan

#### 1

# Dude, where’s my estimates? Illegal path? A forgetful person’s guide to AMOS

January 1, 2014 | 14 Comments

I don’t use AMOS for structural equation modeling all that often and every time I do I have to look up all of the steps again.

1. Install SPSS and AMOS. Fortunately, it seems to work on Windows 8. Yay! You can either open AMOS by double-clicking on it or you can open it directly from the ANALYZE menu in SPSS.

2. Go to FILE > DATA FILES, click on FILENAME and then go to wherever the SPSS file is saved. If you haven’t opened the file from SPSS and want to look at it to be sure you have the right data, click on the View Data tab, which opens SPSS and the data file.

3. Click on the RECTANGLE (top left corner) and draw a box for each observed variable.

4. Double-click on each box to give it a variable name and label

5. Click on the single arrow to draw paths, the double arrow to draw covariances

6. Include an other term for error variance

7. Set the regression parameter of one of the paths to 1

8. Click on View > Analysis Properties and select Output. If you don’t do this, you won’t get much output and you will be disappointed. At a minimum here select standardized estimates, but you probably want squared multiple correlations and maybe some other stuff too.

9. Select Calculate Estimates

At this point, you may get the dreaded error … Path is not of a legal form.

10. Here is what you need to do – save your file. The AMOS manual says you should be prompted to save your file, but I wasn’t (neither on Windows 7 nor on Windows 8). However, saving the file solved the problem.

My assumption is that AMOS writes output to a path relative to where your AMOS file is saved and if you haven’t saved the file, it causes this error.

So, hurray, hurray, it runs and you are looking at the exact same model you were a minute ago. Where are the estimates?

11. Click the SECOND button in the top middle pane and change-O presto, your estimates appear on the path diagram. You can also select TEXT OUTPUT under the VIEW menu for some tables.
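Step 7 above fixes one path to 1. The post does not say which path or why, so the following is a sketch under an assumption: that the path in question runs from the error term to the variable it points at. In that case the path weight and the error variance cannot both be estimated, because many combinations of the two imply exactly the same variance for the outcome. The function name and all numbers below are hypothetical.

```python
def implied_variance_of_y(b, var_x, c, var_e):
    """Variance of y implied by y = b*x + c*e, with x and e uncorrelated."""
    return b ** 2 * var_x + c ** 2 * var_e


# Hypothetical regression weight and predictor variance.
b = 0.5
var_x = 9.0

# Three different (error path weight, error variance) pairs.
weight_fixed_to_one = implied_variance_of_y(b, var_x, c=1.0, var_e=4.0)
weight_of_two = implied_variance_of_y(b, var_x, c=2.0, var_e=1.0)
weight_of_half = implied_variance_of_y(b, var_x, c=0.5, var_e=16.0)

print(weight_fixed_to_one, weight_of_two, weight_of_half)
```

All three pairs give the same implied variance, so the data cannot tell them apart. Fixing the weight to 1 leaves only the error variance to estimate.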

I’ll finish up this project and several months from now when I’m using AMOS again I’ll be glad I wrote this post.
