A large part of my day is spent playing with new software and trying to break it. Yes, there are actually grown-ups who get paid to do this for a living.
I find it hard to believe myself.
The theory, which actually works well, is that whenever someone has a question about something he or she wants to do, no matter how esoteric, I will have tried it at some point, based on my general philosophy of life which is, “What the hell… let’s see what happens.”
My inappropriately named desktop, since it is actually under my desk, runs Mac OS 10.6 and has five virtual machines with Vista, Windows 7 (32 & 64 bit), XP and Ubuntu. There is a supercomputer over my head that I can tap into from here directly that also runs SAS and Stata. So, why would I need JMP?
Besides, what really annoyed me about all the JMP events I went to (an N of 3) was that they were all about “look at these pretty pictures we got with JMP” and nothing on how to do it. Finally, I went to one at SAS Global Forum which was by Wayne Levin of Predictum and was excellent. (Full disclosure: I probably wouldn’t recognize Wayne Levin again if I tripped over him. I only know the name because it is on a handout on my desk, which has not been cleaned since I got back from SGF, and he’s never given me so much as a jelly bean. It was still excellent.)
JMP is one of the many things that have been lying around here for the last couple of years that I’d look at every now and then, and think maybe I should do something with this. Lately, three things occurred to me.
1. It runs on a Mac, thus sparing me the 30 seconds of opening a virtual machine, time that could then be used for such extremely important tasks as getting jelly beans out of my drawer.
2. It makes pictures, which fits well into my current interest in visual data analysis.
3. It gives me an answer for people who call up and say,
“SAS doesn’t run on a Mac? What the hell am I supposed to do now?”
I am actually married to one of those people who don’t believe anyone should buy software unless it AT LEAST runs on a Mac and preferably Linux, too. Learning JMP turned out to be less trouble than finding another husband as good as the one I already have, so I decided to go with that.
I had a dataset that I had downloaded from ICPSR and done lots of work on in SAS. I was working on a project with someone who only uses JMP, so I saved the dataset as a JMP file. We were working on a project to predict who would enlist in the military. I had a sample of more than 2,500 high school sophomores who had been asked their plans after graduation. In JMP, I selected ANALYZE from the main menu and then DISTRIBUTION. I moved the two variables into the Y column and clicked OK.
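For anyone following along without JMP, the heart of what DISTRIBUTION reports for a categorical variable like these is just the percentage of people giving each answer. Here is a minimal Python sketch of that calculation. The function name and the ten toy responses are made up for illustration and are not the ICPSR data. I am also assuming the answers are coded 1 through 4, with 3 = “probably will” and 4 = “definitely will”.

```python
from collections import Counter


def percent_distribution(values):
    """Return {category: percent of non-missing responses}, ordered by category."""
    observed = [v for v in values if v is not None]  # drop missing answers
    total = len(observed)
    if total == 0:
        return {}
    counts = Counter(observed)
    return {category: 100.0 * n / total for category, n in sorted(counts.items())}


# Hypothetical responses, NOT the real data.
# Assumed coding: 3 = "probably will", 4 = "definitely will".
military_plans = [1, 1, 1, 2, 2, 3, 4, 1, 2, 1]

for category, pct in percent_distribution(military_plans).items():
    print(f"{category}: {pct:.0f}%")
```

This is only the table behind the picture. JMP draws the histogram for you, which is rather the point of using it.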
JMP TIP ---> NOTICE THE ARROWS --->
Those little red arrows next to almost everything do stuff. For example, when the results window first came up, I didn’t like the looks of it. No, it wasn’t rolling its eyes at me. It had the histogram vertically oriented and a table of Quantiles I had no interest in. Grey arrows expand and contract things. Red arrows give you options. If a grey arrow is pointing down and you click it, it hides what is underneath. Conversely, if it is pointing sideways it has hidden stuff underneath and you can click it to expand and see what that is. So, I got rid of the quantiles.
Clicking on the red arrow next to each variable gives a whole list of options and some options of the options. I clicked HISTOGRAM OPTIONS and then I clicked on VERTICAL, which had been selected by default, to turn it off. Then I selected SHOW PERCENTS. Here is my first picture and my first conclusion. People are a bunch of liars.
Curt Gilroy was cited in the Army Times and has the impressive title of Director of Accessions for the Pentagon (which does not, despite what may have been implied by Sister Marion in the seventh grade, have anything to do with the Virgin Mary going to heaven. That was the Ascension, or the Assumption. Either way, it definitely did not involve the Pentagon.)
Anyway, Gilroy says that 12% of military eligible youth show an interest in military service. So, if we put together the 4% who said they “definitely will” (=4) and the 9% who said they “probably will” (=3) join the armed services after high school, we get 13%, which sounds about right.
However, 89% say that they definitely or probably will go to a four-year college. Uh, no. First of all, the percentage of high school freshmen who will graduate is only 73% according to the National Center for Education Statistics, and of those only 69% will enroll in a four-year school. So, .73 * .69 = .504, or 50.4%, and even given that some of the high school dropout has already occurred by the spring of tenth grade, uh, how about no, 89% of you are not going to four-year schools I am sorry to say.
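If you want to check my arithmetic, here it is as a few lines of Python. The two rates and the 89% are the figures quoted above. The variable names are just mine.

```python
graduation_rate = 0.73       # share of freshmen who graduate high school
four_year_enrollment = 0.69  # share of those graduates who enroll in a four-year school
stated_plans = 0.89          # share of sophomores who say they definitely or probably will go

# Share of freshmen expected to end up at a four-year school
expected_share = graduation_rate * four_year_enrollment
print(f"Expected: {expected_share:.1%}")  # Expected: 50.4%

# Gap between what students say and what the rates imply, in percentage points
gap_points = (stated_plans - expected_share) * 100
print(f"Gap: {gap_points:.1f} points")
```

Multiplying the two rates treats the 69% as applying to everyone who graduates, which is exactly how the figures are described above.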
I think race is a factor in military service. The data I used included race as 1 = African-American, 2 = White, 3 = Everyone else. I thought that third category didn’t really make much sense for analysis. So, I created a new variable, African-American, which was 1 if race = 1 and 0 if race = 2 or 3. Here is how:
Select COLS then NEW COLUMN. In the pop-up window, give it a name and then select FORMULA under column properties.
In the functions list, select CONDITIONAL and pick IF.
A formula box will pop up and it should be pretty obvious. You can just click on RACE to have it moved into your formula, then type = and put a 1 in the first box and a 1 in the second box for
If RACE = 1 then the new variable = 1.
Put a 0 in the else clause so that race = 2 or 3 gets a 0.
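For those who think better in code than in pop-up windows, the same recode can be sketched in Python. This is an illustration of the logic, not what JMP runs under the hood. The function name is made up, and returning None for a missing or unexpected race code is my assumption, since nothing above says how missing values should be handled.

```python
def african_american_flag(race):
    """Recode race (1 = African-American, 2 = White, 3 = everyone else) to 1/0."""
    if race == 1:
        return 1
    if race in (2, 3):
        return 0
    return None  # assumption: leave missing or unexpected codes as missing


# Hypothetical race codes, including one missing value
races = [1, 2, 3, 1, None]
flags = [african_american_flag(r) for r in races]
print(flags)  # [1, 0, 0, 1, None]
```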
Next, I can go to ANALYZE, MODELING, PARTITION and click on SPLIT a few times and I get my decision tree. It’s a start. I still think race should factor in there and I think the reason it doesn’t is that “garbage category” of three for everyone else – Asian, Native American, people who didn’t say. My hypothesis is that if I change that, race will become a factor.
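If you are wondering what one click on SPLIT is doing in spirit, here is a rough Python sketch of choosing a single split. To be clear about the assumptions: I am using Gini impurity purely for illustration, and it is not meant to reproduce the criterion JMP’s Partition platform actually uses. The toy rows, the variable names (“male” and “enlist” included) and the function names are all made up and have nothing to do with the real data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_binary_split(rows, predictors, target):
    """Return (predictor, weighted impurity) for the 0/1 predictor with the purest split."""
    best = None
    for name in predictors:
        left = [r[target] for r in rows if r[name] == 0]
        right = [r[target] for r in rows if r[name] == 1]
        # Impurity of the two branches, weighted by how many rows land in each
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or weighted < best[1]:
            best = (name, weighted)
    return best


# Hypothetical toy rows, NOT the ICPSR data
rows = [
    {"male": 1, "african_american": 0, "enlist": 1},
    {"male": 1, "african_american": 1, "enlist": 1},
    {"male": 1, "african_american": 0, "enlist": 0},
    {"male": 0, "african_american": 1, "enlist": 0},
    {"male": 0, "african_american": 0, "enlist": 0},
    {"male": 0, "african_american": 1, "enlist": 0},
]

chosen, impurity = best_binary_split(rows, ["male", "african_american"], "enlist")
print(chosen)
```

In this toy example the split lands on the made-up “male” variable because it separates the groups more cleanly than the race flag does. A tree repeats that same search inside each branch every time you split again.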
So, what would I do with JMP? I guess since I should have left for home an hour ago, the answer is “get immersed in questions I’m interested in and lose track of time.”
Essentially, the same thing I do every day.