Today I’m on day two of the 20-day blogging challenge, the brainchild of Kelly Hines and a great way to find new, interesting bloggers. The second day prompt was to share an organizational tip from your classroom, one thing that works for you.
The latest tool I’ve been using is livebinders. Remember when you were in college having a binder full of notes, handouts from the professor, maybe even copies of tests to study for the final? Well, livebinders appears to be designed more for clipping websites and including media from the web but personally I am using it to create binders for teaching statistics. I’ve just started with one but I’m sure this will eventually split off into several binders.
I’m always writing notes to myself but I have them everywhere – I used Google notebook until they got rid of that, evernote, I’ve got notepads on my laptop, desktop, iPad, phone and even paper notebooks around the place. I even have a PadsX program The Invisible Developer wrote years ago just for me (yes, he loves me).
Still, I’m thinking livebinders is going to be really useful for me to organize all of these notes into one spot.
Why do I want to do that, you might ask?
Well, statistics is a big field, and I have taught a lot of it, from advanced multivariate statistics to psychometrics to biostatistics and a lot of special topics courses. It seems to me that we often assume students have a solid grasp of certain concepts, such as variance or standardization, when I’m sure many of them do not. As I read books and articles, I’m trying to note what these assumptions are. My next step is to have pages in the binders where students can get greater explanation of, say, what does a confidence interval really mean. Right now, I feel that universities are trying to cut costs by combining information into fewer and fewer courses. We say that students learned Analysis of Variance in a course, but did they really? The basic statistics I took in graduate school consisted of a descriptive statistics class (I tested out of that). It ended with a brief introduction to hypothesis testing and a discussion of z-scores, t-tests and correlation. The inferential statistics course reviewed hypothesis testing, t-tests and correlation, then focused on regression and ANOVA. The multivariate statistics course covered techniques like cluster analysis, canonical correlation and discriminant function analysis. Psychometric statistics covered factor analysis and various types of reliability and validity. These four courses were the BASICS, what everyone in graduate school took. (People like me who specialized in applied statistics took a bunch more classes on top of that.) Oh, yes, and each class came with a three-hour computer lab AFTER the three-hour lecture, to teach you enough programming so you could do the analyses yourself. Now, many textbooks try to include all of this in one course, which is just a joke, and ends up with students concluding that they “are just not very good at math”.
I can’t change the curriculum, but what I at least can do is provide some type of resource where every time a student feels he or she needs to back up and understand some concept, there is an explanation of that something.
I plan to have this done by the time I teach Data Mining in August.
Suggestions for what to include are welcome.
I don’t use AMOS for structural equation modeling all that often and every time I do I have to look up all of the steps again.
1. Install SPSS and AMOS. Fortunately, it seems to work on Windows 8. Yay! You can either open AMOS by double-clicking on it or you can open it directly from the ANALYZE menu in SPSS.
2. Go to FILE > DATA FILES > Click on FILENAME and then go to wherever the SPSS file is saved. If you haven’t opened the file from SPSS and want to look at it to be sure you have the right data, click on the View Data tab, which opens SPSS and the data file.
3. Click on the RECTANGLE (top left corner) and draw a box for each observed variable.
4. Double-click on each box to give it a variable name and label
5. Click on the single arrow to draw paths, the double arrow to draw covariances
6. Include another term for error variance
7. Set the regression parameter of one of the paths to 1
8. Click on View > Analysis Properties and select Output. If you don’t do this, you won’t get much output and you will be disappointed. At a minimum here select standardized estimates, but you probably want squared multiple correlations and maybe some other stuff too.
9. Select Calculate Estimates
At this point, you may get the dreaded error … Path is not of a legal form.
10. Here is what you need to do – save your file. The AMOS manual says you should be prompted to save your file, but I wasn’t (neither on Windows 7 nor on Windows 8). However, saving the file solved the problem.
My assumption is that AMOS writes output to a path relative to where your AMOS file is saved and if you haven’t saved the file, it causes this error.
So, hurray, hurray, it runs and you are looking at the exact same model you were a minute ago. Where are the estimates?
11. Click the SECOND button in the top middle pane and change-O presto, your estimates appear on the path diagram. You can also select TEXT OUTPUT under the VIEW menu for some tables.
I’ll finish up this project and several months from now when I’m using AMOS again I’ll be glad I wrote this post.
I was reading the powerpoints that came with a textbook, you know, in the instructor’s packet, and I was already thinking this book was a little more focused on computation over comprehension for my liking when I came to the following learning objective:
“Compute an Analysis of Variance by hand.”
Are you fucking kidding me? I have given this a lot of thought and I have come to the conclusion, “Just, no”.
You know why? Because this is the year 2013 and we have computers. Now, I’m not saying you cannot compute an ANOVA by hand if that makes you happy. I’m also not saying you should be like my friend from graduate school who answered the question on her comps
“What is the multiple R-squared and how do you get it?”
“The multiple R-squared is the square of the multiple R and the computer gives it to you.”
I can tease her about this now because she passed her exams the second time around and earned tenure over a decade ago. Contrary to what you think at moments like that, not only WILL you live it down, you will go on to laugh about it.
There will be those who say, “What if your computer doesn’t work?” In that case, I think I’d have more pressing issues on my mind, like getting my computer to work. For one thing, I’m going to assume that you are not just finding sums of squares due to your complete absence of a social life but rather are part of some organization that has an interest in sums of squares, and also probably has more than one piece of hardware. In my case, if one computer doesn’t work, I have two more in my office and four more upstairs. Of course, one each is currently occupied by The Spoiled One and The Invisible Developer, but I’m pretty certain if it came right down to it, I could wrestle a computer away from almost anyone in this group and that includes the dog. (She’s a Dogo Argentino, in case you wondered.)
Take tonight, for example. I am very, very annoyed because my class is using the SAS Web Editor and for some unknown reason the site has been down for the past 10 hours. Apparently, SAS has concluded that no one would ever do homework late at night or on weekends so there is no point in having the On-Demand for Academics available.
I do have SAS on my desktop, but that would involve switching over to boot camp. I also have SPSS but again, that would require restarting in Windows which I don’t feel like doing because I’m in the middle of writing a lecture. I installed Office 2010 on my laptop, was dismayed to find that there is no longer a data analysis tool pack for the Mac – yes, I do know it quit shipping with VBA at 2008 – and the third-party stat pack doesn’t do much.
So, what is the conclusion? Well, I guess I’ll see if the SAS Web Editor is up tomorrow. If not, I’ll finish the class that ends this month and go on to finally learn R. I thought the Web Editor was a great idea but you can’t run a program in the cloud that goes down for 14 hours and no one in your organization seems to notice. One of the reasons I have stuck with SAS is that they do have really cool statistical procedures, their model selection procedures are a neat idea and there is generally an enormous legacy of good stuff. I thought perhaps by moving to a web-based model SAS could recover some of the market share it has been losing, maybe even have both something students could use while in school and a product they could use once they graduated by paying a monthly use fee like Adobe has for its Creative Suite.
Contrast this with pair.com which we use for things like email, our MySQL databases and running our PHP scripts. I love pair. They have 24/7 support and not by some person reading out of a manual, but a person who can actually help you. Downtime on pair over the last several years (that we’ve noticed) hasn’t been more than two hours, total, and when we called them, they were already aware of it and able to fix it in under 30 minutes.
In fact, we’re already migrating away from SAS and for small clients that can’t afford a SAS license and require basic statistics, writing their applications in PHP and MySQL.
There are two points here.
First, nowhere in this situation did I think,
“You know what I need to do? I need to start computing statistics by hand, using a pencil and a piece of paper, like I did when I was in graduate school in 1978.”
Second, using SAS is becoming as laborious as computing statistics by hand. Yes, it’s great if you have it installed on your desktop (and that is often a whole kettle of fish in itself), but that is often thousands of dollars per seat. The Web Editor is a great idea but if it isn’t available, it’s not so great.
Here are your choices – use something that’s thousands of dollars, use something that’s free but doesn’t always work when you need it or use something that’s free and you can download on your desktop. I don’t know that I’m ready to give up on SAS completely yet but I have to admit that I see why so many universities have gone to R.
This month, I’m teaching biostatistics for National University, and so far I am really enjoying it. There is just a really minor problem, though. While I received a copy of the textbook, I did not receive a copy of the instructor’s manual with answers to the homework problems. Since I am going to grade 20 people based on whatever I get, I need to be 100% correct in everything and it is taking up my time to compute cumulative incidence for the population, cumulative incidence for people with hypertension, population attributable risk - and I am busy.
So, check this out, and all of you epidemiologists, I am sure this is old hat to you …. I had a table that gave me the number of people who were and were not hypertensive and whether or not they had a stroke in the five years they were followed. I wanted cumulative incidence for those with hypertension, those without and the population attributable risk.
And here we go …..
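DATA stroke ;
INPUT Event_E Count_E Event_NE Count_NE;
DATALINES;
18 252 46 998
;
/* stat=risk is the option described below; method=indirect(af) is an assumption, added to request attributable fraction estimates */
proc stdrate data=stroke stat=risk method=indirect(af);
population event=Event_E total=Count_E;
reference event=Event_NE total=Count_NE;
run;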
All I need to do is create a data set where I give the number of exposed people (in this case, people who had hypertension) who had the event (a stroke, in my example) and the total number of exposed people. Then, the number not exposed (that is, not hypertensive) who had the event, and the total number not exposed.
I just invoke PROC STDRATE, giving it the name of my dataset and specifying that I wanted risk as the statistic.
In my POPULATION statement, I specify that for the population of interest, people with hypertension, the number who had the event was found in the variable Event_E and the total number was in Count_E .
In my REFERENCE statement, I give the number who had the event and the total number for people who were not exposed to the risk factor.
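For anyone who wants to sanity-check the SAS output, the same quantities can be worked out from the four counts with plain arithmetic. This is only a sketch in Python, not what PROC STDRATE does internally. It assumes one common textbook definition of population attributable risk (population risk minus risk in the unexposed), and the function and variable names are made up for illustration.

```python
# Counts from the stroke table in the post.
event_e, count_e = 18, 252     # hypertensive: strokes, total followed
event_ne, count_ne = 46, 998   # not hypertensive: strokes, total followed


def cumulative_incidence(events, total):
    """Proportion of a group that had the event during follow-up."""
    return events / total


ci_exposed = cumulative_incidence(event_e, count_e)
ci_unexposed = cumulative_incidence(event_ne, count_ne)
ci_population = cumulative_incidence(event_e + event_ne, count_e + count_ne)

# Assumed definition: population attributable risk is the excess risk in the
# whole population compared with the unexposed group.
par = ci_population - ci_unexposed

# Population attributable fraction: that excess as a share of population risk.
paf = par / ci_population

print(f"Cumulative incidence, hypertensive:     {ci_exposed:.4f}")
print(f"Cumulative incidence, not hypertensive: {ci_unexposed:.4f}")
print(f"Cumulative incidence, population:       {ci_population:.4f}")
print(f"Population attributable risk:           {par:.4f}")
print(f"Population attributable fraction:       {paf:.4f}")
```

If the numbers from the procedure and the numbers from this arithmetic disagree, the first thing to check is which definition of attributable risk each one is using.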
A lot of start-ups we know are not as fortunate as we are and they are looking for developers and having trouble finding them. Some have even tried to poach The Invisible Developer away from me, but they have found it impossible to compete with my offer of paying him six figures, letting him work at home in his underwear and having sex with him. (Every time I say this, The Spoiled One puts her fingers in her ears and chants, “La la la, I can’t hear you!” )
If you are looking for a software developer of your very own, here are a few suggestions from me and additions from The I.D. Stop thinking about what YOU need and start thinking what you might offer. Yes, you might be able to go on some random website and find someone willing to code for minimum wage. I guarantee you that the best people (isn’t that who you want by your side to change the world?) are not there. They already are working on other projects.
1. Pay decent money: I don’t care WHAT stupid article you read that said technical people are motivated by more than salary. Yes, there is a threshold. If you came in and offered us each a million dollars to work for you today doing something like creating a replacement for SQL or a new operating system, neither of us would be interested. The key point is, though, we are already making enough money to live by the beach and shop at Bloomingdales with The Spoiled One. If people can’t pay their bills on what you are paying they will either:
- Quit your project for one that does pay enough that they can afford housing, food, clothes and Chardonnay, or
- Take another job to pay the bills and work on yours in their spare time, which will be very limited.
2. Have interesting work: The definition of ‘interesting’ is a personal one but anyone who is really good got that way because they were continually learning. In selling your potential developers, talk about how they will have the opportunity to choose the language, IDE, libraries, hardware, etc. they use to develop. Talk about the new things they could learn. Certainly, there will be some limits. At the moment we don’t develop for Linux, although we’d love to, because it’s not compatible with Unity. Both the I.D. and I have left six-figure jobs for other jobs because they weren’t fun. Note that I did not say we left to work for free. Fun only matters after the rent and kids’ tuition are paid (see #1).
3. Have perks: Like interesting, this is a personal definition. For some people, it is having flexible hours so they can spend time with their children. For others, it might be telecommuting. At The Julia Group, we know we can’t match Microsoft or Google in salaries and other financial incentives. We can offer you the flexibility to set your own hours, work from home, maybe buy you the exact hardware and software to your specifications.
4. Address a need the developer is passionate about: It seems most start-ups looking for a developer start here but I don’t know a lot of developers who do. This isn’t to say that I don’t know some great people who would like to have an impact on the world, but they first would like to pay the rent (Maslow’s hierarchy, anyone?). I know developers who are passionate about climate change, education, inequality – but really not all that many. I mean, they do care about those things but they aren’t any more likely to quit their day jobs and devote their lives to them than the guy who runs the car lot down the street from me. Your mileage may vary. I’m sure people I know personally are not a representative, random sample.
The Invisible Developer added this:
5. Hang out where the developers hang out: If you are looking for someone to create iphone apps, there are iphone developer forums. Lurk there and see who is asking beginner questions and who is answering them. In many forums, people will post if they are available and looking for work. If you’ve read a number of their posts, you might have an idea if you want to contact them or not.
6. Learn to code or at least a little bit about coding: I’m not saying you need to create your own operating system from scratch, but you ought to know the difference between a jpeg file and a website design, have some idea about how long it should take to code a web form (not very) versus a really good 3-D adventure game (the rest of your natural life – just kidding, sort of).
I was wrong.
Getting the pages in the game to all look alike was one of those tasks I put off until later, and maybe we would just hire someone to do it. Well, guess what, later has arrived. So, I spent a day reading Stylin’ with CSS (which rocks, by the way). The main motivating factor was that I had to do some test questions for our game that match up to the type of items on the new computerized exams testing the Common Core curriculum. This means that I needed things draggable and droppable – no problem with jQuery – but I also needed them laid out very specifically on a number line. I could have done this with the canvas tag, but really, CSS proved the perfect solution.
Not only was I able to use margins, relative positioning and float to get my objects on the page to show up exactly how I wanted them, but I was also able to do it so that I am pretty sure it will look the same in most browsers on most computers and not just my lovely cinema display using Firefox.
On top of this, I learned about the acronym tag which I could not believe I did not know existed before now. (Yes, I know it is an HTML tag but it was in a book on CSS I happened to be reading.)
In short, you do this
<acronym title="What you want to show up when you hover">The thing you hover over</acronym>
In our games, we use many words from the students’ tribal language. For example, for the game we are going to be piloting on the Turtle Mountain reservation this spring,
Nookomis says …
How was it possible I did not know this? Because I had the stupid idea that CSS was a woman thing and if you want to make money in life and be taken seriously you hire someone at a low salary to do the things that women do and you concentrate in other areas.
I was wrong about CSS and that was the second thing involving stereotypes about women that I was completely wrong about this week. The other one was a book on women in fitness. You can read about that here.
I am old. I remember punched cards, COBOL, dumb terminals and having to walk over to the computer center and load tapes on to the drive if I wanted to use large data sets – large back then meaning 100,000 records or more with a few hundred variables. We thought that was pretty big data.
So …. when I went to the Western Users of SAS Software conference this year, I was struck by the fact that I seemed to be about the median age. There were A LOT of people older than me. Most of the younger people were the student scholarship winners and junior professional award winners.
This does not bode well for SAS, and it made me a bit sad because, as I said in a prior post, the model selection procedures were cool and, from a statistical perspective, there is a lot of good stuff from SAS.
I used to go to the user group meetings and they would give you a book (yes, on paper, children) that had macros written by SAS users. I think that was the first time I saw the parallel analysis criterion code for factor analysis – a macro I used in my dissertation and in one of the first articles I published.
Tonight, I was looking for a way to do power analysis for a repeated measures ANCOVA and I could not find it for SAS, neither using PROC POWER, PROC GLMPOWER nor any user-written macros. It may exist – I looked several other places as well, found a paper on how to do it using SPSS syntax (although that code did not work!) and someone else wrote a procedure in R that I didn’t try.
SAS used to be the place for the cutting edge. What happened?
One reason is that everyone used to use either SAS or SPSS at universities and that isn’t the case any more. A second is that SAS is really expensive, so universities who do not have a license aren’t inclined to get one.
This all sounds like the death knell is tolling for SAS and it is just a matter of time until it follows COBOL and Blackberry as one of those things that people ask, “Why are you using that?”
I think there is still some possibility for SAS to turn things around – although whether they will or not remains to be seen.
The smartest thing SAS has done in years is to come out with SAS On-Demand for Academics. This makes SAS free for university students and professors. It’s perfect for on-line courses because you can upload your data to the class website and all of your students can access it.
Now the next thing SAS needs to do is start making that available at a reasonable cost once students graduate. Instead of charging them thousands of dollars a year for a license, they can charge $50 a month like Adobe does for its design package or Google does for its apps. (Yes, Google apps for business are cheaper than $50 a month but they don’t do all that much.)
New graduates aren’t going to pay several thousand dollars for a license because they don’t have that kind of money. They might shell out $50 plus occasional extra charges to access some high performance computing capabilities.
SAS already has millions of lines of code and tens of thousands of pages of good documentation. It’s some good stuff.
Think about this – years ago, the Mac was considered a better computer than Windows but over-priced. Many people thought Apple would go under. Instead, they came out with the iPhone and the iPad and they are wildly successful.
The Web Editor and other cloud products could become the SAS version of the iPad.
Here’s to hoping they don’t fuck it up.
It’s also not just so you can get your own original, signed illustration of the difference between ordinary least squares and maximum entropy methods from Don from SAS,
I was very pleasantly surprised to learn more than I expected at WUSS this year. I was aware of the GLMSELECT procedure available to select the best-fitting model, but I have not actually used it. Funda Gunes, from SAS, gave a great talk on model selection methods. To summarize the last hour – you create 1,000 or so bootstrapped samples, then run models with each of those and select the average coefficient estimates from the 1,000 models. This is the best model not in the stepwise regression sense of giving you the highest explained variance, but as in most likely to correctly reflect the population values. That is a GROSS over-simplification but I highly recommend if you have any interest in model selection techniques, you download and read her paper which should be available from the conference proceedings, which will be published on the WUSS site eventually.
A second good paper on model selection was by Scott Leslie, pretty much the polar opposite on the technical side from Funda’s, where he showed a series of ROC curves to illustrate the gradual (or sometimes substantial) improvement in a model as new predictors were added. He ended with a discussion of what might be better predictors of adherence to a prescribed medication regimen and how would you get that data.
In Kechen Zhao’s presentation, I learned about using PROC GENMOD to compare four different model types – logistic, log-binomial, Poisson and modified Poisson. He discussed relative risk as a variable of interest versus odds ratios, and the fact that logistic regression in particular can produce substantially different estimates than the other models. This is worth a whole post in itself that I will try to get to next week.
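To make the "bootstrap, fit, average" idea concrete, here is a minimal Python sketch. It is not the GLMSELECT procedure and it leaves out the model selection step that happens inside each bootstrap sample. It only resamples a simulated data set with replacement, fits a one-predictor least-squares line to each resample, and averages the coefficients. The data, the true values (intercept 2, slope 3) and all function names are invented for the illustration.

```python
import random


def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_average(xs, ys, n_boot=1000, seed=0):
    """Average the OLS coefficients over n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    fits = []
    while len(fits) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip a degenerate resample with no variation in x
        by = [ys[i] for i in idx]
        fits.append(ols_fit(bx, by))
    avg_intercept = sum(a for a, _ in fits) / n_boot
    avg_slope = sum(b for _, b in fits) / n_boot
    return avg_intercept, avg_slope


# Simulated data (an assumption for the example): y = 2 + 3x + noise.
data_rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2.0 + 3.0 * x + data_rng.gauss(0, 1) for x in xs]

avg_intercept, avg_slope = bootstrap_average(xs, ys)
print(f"Averaged intercept: {avg_intercept:.3f}")
print(f"Averaged slope:     {avg_slope:.3f}")
```

In the real thing, each resample could end up with a different set of predictors, which is where averaging starts to matter more than it does in this one-predictor toy.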
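A quick way to see why odds ratios and relative risks can tell different stories is to compute both from a 2x2 table. The Python sketch below uses two hypothetical tables that are not from the presentation, one where the outcome is common and one where it is rare. It is plain arithmetic, not a fitted logistic or log-binomial model.

```python
def relative_risk(a, b, c, d):
    """a, b: exposed with / without the outcome; c, d: unexposed with / without."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio for the same 2x2 table."""
    return (a * d) / (b * c)


# Hypothetical counts: outcome is common (60% vs 30% risk).
common = (60, 40, 30, 70)
# Hypothetical counts: outcome is rare (0.6% vs 0.3% risk).
rare = (6, 994, 3, 997)

rr_common, or_common = relative_risk(*common), odds_ratio(*common)
rr_rare, or_rare = relative_risk(*rare), odds_ratio(*rare)

print(f"Common outcome: RR = {rr_common:.2f}, OR = {or_common:.2f}")
print(f"Rare outcome:   RR = {rr_rare:.2f}, OR = {or_rare:.2f}")
```

In both tables the exposed group has twice the risk, but the odds ratio only stays close to 2 when the outcome is rare. With the common outcome it is noticeably larger, which is the kind of gap that matters when relative risk is what you actually care about.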
As added icing on the cake, in a session by Marie Bowman-Davis I learned about a public use data set, the California Health Interview Survey. (I did not know these data were available for public use and they are obviously a great resource for teaching.)
Despite all of these good things, I left the conference a bit concerned about the future of SAS – the average age of attendees at the conference was probably over 50. More about why that is and why that’s a problem later, since this post is already long enough and I have actual work to do.
If you have a problem, here are the 3 most likely suspects:
1. You’re a professor and uploading a file. Connect using FTP. It seems like every program I have on my computer is set at SFTP. Change that.
2. Trying to log into the web editor with your email address. You have a user name. It is not an email address. It doesn’t have any @xxx.edu. It’s just something like bozoclown.
3. You have a data set for your class, you are sure it was uploaded, but you can’t find it. Even though your data was successfully uploaded to the class directory, it is not going to show up under libraries unless your program includes a LIBNAME statement. Then, you need to run that program. Now you should see it under LIBRARIES.
If you’re interested in learning more about the SAS Web editor,
Here is a bit about uploading files and a caveat on working with open data at the last minute
Here is where you find what your LIBNAME is. If you’re a professor, be sure to send your students an email with this information if you want them to access data you upload for your class.
Here is some more about using PROC CPORT/CIMPORT if you are uploading data in an older SAS format
And those are my tips for early morning in Las Vegas. Now I’m headed downstairs to teach a class on categorical data analysis and attend WUSS13. If you’re here, be sure to say hi.
If you read yesterday’s post (and how could you not!) you were treated to my scintillating tale of my experience with Microsoft customer support where I called five people and got five different answers to the question about upgrading to Windows 8.1 Pro. Although two people did give me the same answer, one person gave me two different answers, so it cancelled out.
I said I would let you know which story turned out to be true once I got home and opened the package from Microsoft ….
And the answer is ….. NONE OF THE ABOVE !
When I got home, there was a box with no DVD despite the fact that the website and the receipt both said DVD – English. The box contained a product key and correct instructions.
Go to the Windows start screen
Type in Add features to Windows 8.1. Then click SETTINGS.
Click Add features to Windows 8.1 then click I already have a product key
Enter the product key and click next.
There is no DVD.