“But look, you found the notice didn’t you?”
“Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard’.”
It sounds as if Douglas Adams, author of The Hitchhiker’s Guide to the Galaxy, started his career out as a programmer. There are some very useful pieces of information that might as well be guarded by a leopard. As a public service, here they are in plain view, guarded only by a couple of relatively harmless meerkats.
SPSS uninstall, whether it wants to or not.
This tip from the Mac OS X Tips blog was written referring to SPSS 16, but people still have the same problem and this fix still works with SPSS 17 and SPSS 18 (going incognito as PASW 18).
I am going to assume that either:
A. You tried to do the right thing and uninstalled using the uninstaller. If you haven’t tried that yet, do.
For SPSS 17, look in
/Applications/SPSSInc/Statistics17
There is a folder ‘_uninst’. Double-click ‘uninstall.jar’ to uninstall SPSS 17.
For PASW 18, look in
/Applications/SPSSInc/PASW18 Statistics/Uninstall PASW 18 Statistics
Double-click on the file Uninstall PASW Statistics 18. It is the little icon that is inexplicably a red ball with a plane flying around it.
That should work. Things are not always as they should be. Let’s say you tried that and it did not work. OR
B. You are now saying, “Oops” because you did not realize that there was an uninstall file and you just dragged some SPSS-related files to the trash.
In either event, it didn’t work. Don’t worry. Do this:
Delete the SPSSInc folder if you didn’t already. Usually it is found under Applications.
Delete anything else you find with SPSS or PASW in the name. Then do the following:
To resolve this issue if SPSS is not removed through the uninstall process, please go into the ‘Users’ folder on the main operating system’s hard drive. In there you should find a house with your login name on it, and you should go into this item. In this house (folder), please trash the following:
~/InstallShield (directory, remove the whole thing)
~/Library/Preferences/com.spss.spss for mac.plist
Empty the trash.
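If you would rather check than hunt, here is a minimal Python sketch that reports which of the locations listed above still exist. It only looks; it deletes nothing. The function name find_spss_leftovers and its parameters are made up for illustration, and it assumes the SPSSInc folder sits under /Applications as described above.

```python
from pathlib import Path


def find_spss_leftovers(home, applications="/Applications"):
    """Return the checklist paths that still exist. Nothing is deleted."""
    home = Path(home)
    # These are the three locations named in the steps above.
    candidates = [
        Path(applications) / "SPSSInc",
        home / "InstallShield",
        home / "Library" / "Preferences" / "com.spss.spss for mac.plist",
    ]
    return [path for path in candidates if path.exists()]


if __name__ == "__main__":
    for leftover in find_spss_leftovers(Path.home()):
        print("Still there:", leftover)
```

An empty result means the three items on the list are gone. It does not search for other files with SPSS or PASW in the name.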
Now, go ahead and install SPSS 18. When I did this with PASW/SPSS 18 it DID give me a message saying that a version of SPSS was installed and please uninstall it before continuing. I ignored that message and continued anyway and SPSS 18 installed fine.
Other useful bits of information:
SPSS 17 and later do NOT work on non-Intel-based Macs. If you have an older PowerPC Mac, SPSS 16 is the latest version that will run on it. The good news is that it runs perfectly fine and there is no real noticeable difference between 16 and 17 anyway.
SPSS 17 works on Mac OS 10.4 and higher.
PASW 18/ SPSS 18 works on Mac OS 10.5 and higher. If you have 10.4 either upgrade or use SPSS 17.
Last week, a couple of really sharp cookies from JMP were on campus giving a presentation, and their academic program manager, Curt Hinrichs, commented that what is really needed is a course on statistical thinking. I think he is absolutely on to something.
I mentioned in my last post how there is a debate over whether fluid intelligence really declines with age or whether older people are just less inclined to put effort into a pointless task. This happened to me the other day when someone stopped by with a series of very complicated equations with the log of this and the log of that squared. Here is how our conversation went:
“Why do you have years of experience squared as an independent variable? Do you think that average earnings increase up to a certain number of years of experience and then have a negative relationship after that? Why do you think that would happen?”
“Oh, well we just have years of experience squared to check for a quadratic effect.”
“Do we really need to have a reason?”
“I think so. I would.”
“We want to test for the impact of an increase in education. So we would like to get the predicted value for population earnings if everyone who has less than a high school education graduated from high school.”
“That’s easy enough. You have a regression equation. You have a predicted value for people who have 12 years of education. Write some code to replace the predicted value for all those people who have less than 12 years to that value. Get the totals. Compare the two.”
“But we would have to decrease everyone’s years of experience.”
“Because if a person is in school for two more years, they would work for two less years.”
“Not at all. The unemployment rate is high for high school dropouts. Particularly youth. You are assuming that a person who drops out of school in the tenth grade will be employed continuously for the next two years.”
“But you know that is wrong.” (In fact, I looked it up: according to the Bureau of Labor Statistics, the unemployment rate for recent high school dropouts was 32.9%, not 0%.)
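The “replace, total, compare” recipe from that conversation can be sketched in a few lines of Python. The coefficients below are invented purely for illustration (a real analysis would use the fitted regression with all of its predictors), and predicted_earnings and compare_totals are hypothetical names.

```python
# Hypothetical coefficients. The real ones would come from the fitted regression.
INTERCEPT = 5000.0
PER_YEAR_OF_EDUCATION = 2000.0


def predicted_earnings(years_of_education):
    """Predicted earnings from a one-predictor linear equation."""
    return INTERCEPT + PER_YEAR_OF_EDUCATION * years_of_education


def compare_totals(education_years):
    """Return (baseline total, total if everyone below 12 years had 12)."""
    baseline = sum(predicted_earnings(y) for y in education_years)
    high_school_value = predicted_earnings(12)
    counterfactual = sum(
        high_school_value if y < 12 else predicted_earnings(y)
        for y in education_years
    )
    return baseline, counterfactual


sample = [10, 11, 12, 16]  # made-up years of education for four people
baseline, counterfactual = compare_totals(sample)
print("Baseline total:", baseline)
print("Counterfactual total:", counterfactual)
print("Difference:", counterfactual - baseline)
```

Only the people with less than 12 years change. Everyone else keeps the predicted value they already had.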
I was not much help to this person in solving the equations because I just could not see a lot of point to it. I try to help anyone that comes to see me, and they usually go away pretty happy. I am also very happy they come because it gives me a chance to steer them in the right direction. Here is another one:
“I am predicting the success of start-up firms but I have eliminated those that ended up making more than $10 million per year, and also those that weren’t in business five years later.”
“Because I am using income as my dependent variable and if I include the companies that made zero or over $10 million my data would be really skewed. Some of them made over $100 million.”
“But don’t you think that most investors really want to be able to predict which companies will make them a LOT of money? And if they apply your equation to predict which companies in which to invest, these won’t even be in your sample.”
“Yes, that is correct.”
“Have you thought about applying your equation to those companies and seeing if it is accurate? Maybe you want your dependent variable to be something different. Perhaps it could be categorical, something like the company failed, it was in business five years later with earnings less than $1,000,000 a year or more than $1,000,000 a year. I don’t know if that is exactly what you want to do, but I do think you need to revise your design.”
“Because investors want to predict which companies will produce very high returns and which will cause them to lose their investment. I am not sure your equations will do that.”
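To make the categorical idea concrete, here is a minimal Python sketch that recodes each firm into one of the three outcomes suggested in the conversation, so that failures and very high earners stay in the sample. The category labels, the function name and the example firms are all made up, and putting a firm with exactly $1,000,000 in the top group is my assumption.

```python
from collections import Counter

# Cutoff taken from the conversation above. The labels are invented.
HIGH_EARNINGS_CUTOFF = 1_000_000


def outcome_category(in_business_after_five_years, annual_earnings):
    """Recode a firm into failed / under $1M / $1M or more."""
    if not in_business_after_five_years:
        return "failed"
    if annual_earnings < HIGH_EARNINGS_CUTOFF:
        return "under $1M"
    return "$1M or more"


# Made-up firms: (still in business after five years, annual earnings)
firms = [(False, 0), (True, 250_000), (True, 120_000_000), (True, 900_000)]
counts = Counter(outcome_category(alive, earnings) for alive, earnings in firms)
print(counts)
```

With the outcome coded this way, the skew in the earnings figures no longer forces anyone out of the data.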
And yet another one …
“Does this output look correct to you?”
“Because your R-squared is .99. That means you have explained 99% of the variance.”
“Are you familiar with my field?”
“I am familiar with reality. You have dummy-coded county. That gives you about 3,000 independent variables right there. Altogether, you have about 4,997 independent variables and a sample of 5,000. Also, you have one variable, MALE, that is coded 1 if the subject is male and 0 if not. You have a second variable, FEMALE, that is coded 1 if female and 0 if not.”
“What should I do?”
“Read this article on multicollinearity. Drop the FEMALE variable from your equation. Take a look at your counties and see if there is some way you can subdivide these that make sense in terms of your field, for example, rural persistent poverty counties, urban persistent poverty counties, urban middle income, urban upper income – whatever makes sense in the context of your study. Is there even a reason for having county as a variable? Think about that. Come back next week and we will rewrite your program.”
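The MALE/FEMALE problem is easy to see with a toy example. This is a minimal sketch in plain Python, assuming a made-up five-person sample and a model with an intercept: MALE + FEMALE always equals the intercept column, so the design matrix has fewer independent columns than it has columns. The matrix_rank helper is written here for illustration and is not taken from any statistics package.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix by Gaussian elimination with exact fractions."""
    m = [[Fraction(value) for value in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry here.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Clear this column in every row below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


sexes = ["M", "F", "F", "M", "F"]  # made-up sample
# Columns: intercept, MALE, FEMALE
with_both = [[1, int(s == "M"), int(s == "F")] for s in sexes]
# Columns: intercept, MALE
without_female = [[1, int(s == "M")] for s in sexes]

print(len(with_both[0]), "columns, rank", matrix_rank(with_both))
print(len(without_female[0]), "columns, rank", matrix_rank(without_female))
```

When the rank is lower than the number of columns, the regression cannot give a unique coefficient for every variable. Dropping FEMALE makes the two numbers match.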
I give all of these people credit for sensing something was not quite right and stopping by to talk with me. They come from different universities, all well respected, and they are all very intelligent people. Some of them have been taught mathematics very well. That is a good thing. Their equations are elegant and correct, as far as it goes.
Dr. F.N. (Florence Nightingale) David was chair of the statistics department at the University of California, Riverside. She left the university about the time I started my doctoral program there, but I remember one of my professors telling this story.
“_____ did the defense of his dissertation and F.N. David was the outside member on his committee. He gave a very good description of all of the analyses he had done and why his hypotheses were supported. She was really the statistical expert and we all turned to her when the candidate was out of the room and asked her opinion. She took her cigar out of her mouth and said, ‘That young man believes his numbers too much.’ “
Maybe it is true that the more things change the more they stay the same.
Somewhere in all of this we maybe have to get back to the idea of science we had when we were ten years old. Of not being so wedded to our ideas but just thinking, wondering about stuff, mucking around and seeing what works.
Two or three lifetimes ago, I was an Associate Professor at a small, liberal arts college, teaching, among other things, lifespan developmental psychology because, well, somebody needed to teach it and I had published several articles on assessment of families and other semi-related issues. One debate in the field, I learned, was how much of the purported decline in “fluid intelligence” was real and how much was just a function of the unwillingness of older people to solve pointless problems to please a researcher. I call this “the crotchety effect”.
I have been experiencing that effect a lot lately. I am reading a great book entitled This Will Change Everything, a product of edge.org, which is best described as a website full of people who think about things besides next quarter’s stock prices, whether their clothes are springtime fresh, what’s for dinner and who is president of the country club. Several of the chapters were the expected emerging science/technology. We’ll be able to know our own DNA so we can predict if we will get Alzheimer’s at 65 or ovarian cancer at 40, we’ll have robots that we can love, or as close a facsimile as most marriages anyway, we’ll have genetically engineered children to be faster, smarter, stronger. A chapter I read this morning really struck me. The author, Frank Tipler, said something like this:
What will change everything is the enormous decline in the number of people going into science and mathematics. Stephen Hawking and the other less well-known but equally brilliant and creative scientists will soon retire and I do not see anyone coming up to replace them.
The author goes on to note that this decline is even greater than has been realized within America because such a growing proportion of our graduate programs are filled with students from other countries. Where I disagree with this chapter is the assertion that this is due to the ability of graduates entering finance to make three times the salary the author was receiving as a tenured professor. I don’t think too many people in previous generations entered science or mathematics because they expected it to pay better than any alternative. In fact, I think the reverse is true. I see many more people now, especially those from outside the U.S., who are entering these fields precisely because they expect to make a very comfortable living. Thirty-five years ago, when I was a college freshman, most of my friends in engineering or computer science majored in that because it interested them.
I had enough courses in business and economics to major in either field but I had business on my transcript because I thought it was more likely to get me a job. Statistics, economics and computer science classes I took just because I thought they were interesting, for the same reason I listen to National Public Radio or read the New York Times, and devoted about as much serious study to them as an undergraduate. HOWEVER, after I graduated, I never gave another thought to net present value, or whatever it was I was supposed to have learned in finance or industrial psychology, while I continued to read every day about statistics, programming and systems design.
In this country, we have a great faith in the ability of science and technology to “fix things”, everything from the mortgage crisis to AIDS. Yet, few people want to be making or maintaining that science and technology. Even the people that say they do only want to do so tangentially. I was quite disappointed today looking at the National Science Foundation SMETE Digital Library. There is some good stuff, for example, MIT Open Courseware is a gift to the nation and God bless ’em. Honestly, though, it is over and above what the average undergraduate and many graduate students can comprehend on their own. That doesn’t mean it isn’t great for those above average students in motivation or intellect. On the other hand, a lot of what I found in the digital library was fluff – links like “What famous mathematicians had birthdays today?” or “Biographies of famous women mathematicians.”
Okay. Stop right there damn it! You know what would help my beautiful little girl have a better chance of being a mathematician? If Mommy sits down with her and says,
“You know what a normal distribution is, my angel? You don’t? Well, it looks like this, with a few people very far below average, most of the people are average, and a few people way above average. Then, there are other distributions. This one is skewed. You can measure how different something is from normal. You can look at it, or you can use numbers … “
This is exactly what I did tonight, drawing a histogram on a white board that, for some unknown reason, was in the middle of the living room floor when I walked in.
Why has the quality of our students declined so precipitously? I’m really not sure but I think a big part of the reason is that our demands on them have dropped dramatically. My mother, my high school teachers, my siblings and my professors as an undergraduate could all line up to be first to tell you that I was not a particularly good student. In college, I went to more parties than I did classes. I studied mathematical statistics, Fortran, BASIC programming, Differential and Integral Calculus – because those were – you’re not going to believe this – ways to MEET MY GENERAL EDUCATION REQUIREMENTS. Yes, your random, relatively unmotivated college freshman whose main priority was which frat party to attend on Friday night was expected to prove the central limit theorem, find a derivative and write a subroutine just because those were the sorts of things a well-rounded educated person was expected to know.
So, I took classes in high school, like matrix algebra, analytic geometry and Calculus because it is what was offered and my teachers taught me far better than I deserved. You could take general math but then all of your friends made fun of you because you were stupid.
When I went to college, I had some of that stuff all over again and, with a little effort, kept the B average necessary to keep my scholarship and continue going to frat parties. I have to admit it wasn’t until I was in my mid-twenties, working on my second masters and Ph.D. that I became really motivated to learn statistics, and I haven’t looked back.
Three really, really key points here, though, are:
A. I was prepared to learn by having been literally force-fed mathematics throughout school from middle school all the way to graduate school. No one, from Sister Marion, who taught me math in sixth grade, to Phyllis, my matrix algebra teacher at Logos High School, to Chris, my high school Calculus teacher, to Dr. Spitznagel who taught the statistics course I skipped every Friday as an undergraduate gave a damn if I wanted to learn it or not.
B. All of those people I mentioned knew math really, really well. I had the opportunity to learn from people who had every expectation in the world that I could and WOULD learn whatever it was they were teaching. If I had suggested getting extra credit for writing a paper on a famous woman mathematician they would have thought I had lost my mind.
C. Because of this, when I finally got interested in statistics, learning it was not insurmountable. Because I had matrix algebra in high school and in regional economics and urban economics in college, when I needed to learn the normal equations, I knew how to invert and transpose a matrix.
Things are different now. I have taught in North Dakota and California, and have friends at universities all over the country. We all see the same thing. Even at the doctoral level at the best universities, students do not have the same level of numeracy that your average, unmotivated party-going undergraduate did thirty-five years ago.
Unless we reverse it, this will change everything. And not in the good way.
A few days ago, I tried installing Enterprise Miner for, I think, the third time. The first time, I could not get it to work, saw we needed something called a planned install for which I needed a plan file which I was to get from my SAS administrator, who happens to be me and I did not have one. Since I did not really need Enterprise Miner, I went ahead, figured out how to do the basic install and got busy with other things.
The second time, I really was interested in data mining and thought it might be nice to try to figure this out. I looked up a few documents on it but still did not have a plan file, even when I asked myself very nicely if I could have one. I found some things on standard plan files, tried downloading some stuff but it didn’t work and I got busy.
In between, I was copied on a couple of emails from someone else on campus complaining that they had never actually gotten it to work, but since it wasn’t directed to me, I was busy and the person writing the emails is an extremely experienced SAS programmer who I figured could get by fine without any help from me, I didn’t give it much thought.
Lately, though, three related events piqued my interest. First, a notice came across my desk that the department paying for Enterprise Miner was canceling the license. Second and third, two people in two completely different departments asked about using Enterprise Miner. So, having a little bit of time, I decided I would try to install it.
1. I could not install it from the DVDs we distribute for installation. I call the helpful people at SAS Tech support and they tell me that I need to install it from a SAS software depot. Unfortunately (long story I will skip) we no longer have a SAS software depot and cannot download it again.
2. We want SAS 9.2 Maintenance release 2 anyway, so super-nice people in SAS contracts help me get a new download order and I download that. I decide to hedge my bets by installing this on the most stable computer I have, Windows Vista 32-bit, plain vanilla as they come. This is not a machine used for testing, it is one I actually do my work on. Yes, that was stupid of me.
3. I create a software depot and start to do a planned install, select what appears to be the appropriate plan file which is now included as a choice. Everything seems to be fine until the 10th step where it gives me a message about an error with the Object Spawner. I decide to go ahead with the install anyway. After a few hours of downloading and installing, I have Enterprise Miner on my computer but it doesn’t work. Doesn’t work as in I can’t create a project.
4. There was a long interlude in here with me reading numerous documents and two and a half hours with an extremely kind and patient person from SAS Tech Support named Heidi Johnson, who deserves a raise for not screaming. In some extremely bizarre way, when I semi-installed SAS Enterprise Miner and related SAS stuff on my computer it revoked my administrator privileges, so a lot of the reasonable suggestions made by Heidi have not worked. Also, SAS 9.2, which I need to do work, no longer works on my computer, as in, when I try to import a file from Excel, for example, it gives me an error message.
After all of this, my first thought was to either:
a) Delete everything with the word SAS in it off of my computer, or
b) Completely delete the virtual machine.
In an uncharacteristic burst of maturity, though, I realized that it would be difficult to be the SAS administrator on campus without using SAS, that although we do support Stata and SPSS also, dropping SAS because I was annoyed was probably not a justifiable decision to people who might ask me to justify it, besides which I had three questions that came in my inbox while I was on the phone regarding SAS programs. I already answered two of them. I suppose I could just tell anyone who calls tomorrow “I’m sorry we don’t support SAS on Thursdays. Call back on Friday.”
Unfortunately, I cannot uninstall the (non-working) software from my computer because I don’t have administrator privileges due to the software I cannot uninstall. That part, at least, I fixed. If for some unexplainable reason you have an urge to install Enterprise Miner and get stuck, here is what you can do.
Restart your computer in safe mode. Create a new account. Call it Administrator1. Give it administrator privileges. Uninstall SAS, including Enterprise Miner.
Now, when you start your computer again and log in as yourself you will once again have administrative privileges.
You may find that your uninstall did not completely uninstall SAS and when you try to reinstall it, you get some kind of error. At this point, you need to show your computer who is in charge.
1. Go to Add/Remove Programs and remove everything with the word “SAS” in it.
2. Search on your computer and find if there is anything left with the word SAS in it. In the Program Files almost everything was still there in a folder labeled SAS. Apparently the uninstall in Windows did not uninstall. Move all that stuff to the recycle bin and empty the recycle bin.
After this, I re-installed SAS TS2M2 and it seemed to work. I am not 100% sure because as I was leaving the install just finished, without errors, but when I tried to import an Excel file nothing happened. It may be that I didn’t wait long enough.
In all of this, my daughter, the perfect and patient Jennifer, had been sitting in the lobby downstairs for the past half hour waiting for me to give her a ride home. I realized that I had not fed my fish, which by this time was swimming against the side of the tank, trying to attract my attention.
So, I fed my betta fish, Beta, along with Type I and Type II, the frogs in the other tank, answered a couple of questions on multicollinearity and how to code an infile statement to skip over hundreds of “header” lines, and headed out the door.
After spending so much time trying unsuccessfully to install Enterprise Miner (I will skip the previous problems with 9.1.3) I have a backlog of questions to answer on everything from the significance of increases in chi-square to how Stata processes large datasets (short answer: inefficiently).
I will only be in two days next week, as I am presenting at a conference in Minneapolis. The topic is analysis of ethics, an interesting enough subject to almost make me forget that I am going to be in Minneapolis in February. Almost.
As for Enterprise Miner, I asked Justin The Hardware Guy to see if he could find me a computer running Windows XP since maybe it is that I have a virtual machine. Maybe it is that I am using Vista. Hell, maybe Enterprise Miner was designed by UCLA fans. All I know is that some people somewhere have it working, just no one here. He actually did have one computer running XP but it only had 512 MB of RAM so I can’t imagine that would work.
In a few weeks, I may try again, when I have caught up again, analyzed ethics data, found a Windows XP machine with a minimum of 1 GB of RAM and thawed out, and when Heidi can see my name pop up on the caller ID again without being tempted to feign her own death to get out of talking to me. (I have actually done that. It helps that few people expect their statistician to be a Latina grandmother – Dr. De Mars? No, he just left. You probably passed him on the way in – really old guy, white-haired, balding, pot belly, yeah, that was him.)
Too bad for Heidi I know what she sounds like now. As for the people who asked me about Enterprise Miner, I will give them my honest opinion. In the last week, I have spent more time with this thing than my husband and he gives me money, helps raise the children, does the laundry and has sex with me. Unless they expect Enterprise Miner to do more for them than that, it probably isn’t worth the effort. But, if I ever find out differently, I will let them know.
The other day I needed to have a job run daily. My two efforts at solving this problem failed. These were:
1. Ask the six people hanging around for no particular reason how to do it. All vaguely remembered learning that at some point and had forgotten specifically how to do it.
2. Use Google to see if I could find exactly how to do it in five minutes or less, which was the amount of time I was willing to stick around after the end of an annoying day.
There are man pages but I have never been able to read a man page without wanting to find the people who wrote it and slap them.
crontab – maintain crontab files for individual users (ISC Cron V4.1)
crontab [-u user] file
crontab [-u user] [-l | -r | -e] [-i] [-s]
Crontab is the program used to install, deinstall or list the tables used to drive the cron(8) daemon in ISC Cron. Each user can have their own crontab, and though these are files in /var/spool/, they are not intended to be edited directly. For SELinux in mls mode can be even more crontabs – for each range. For more see selinux(8).
If the cron.allow file exists, then you must be listed therein in order to be allowed to use this command. If the cron.allow file does not exist but the cron.deny file does exist, then you must not be listed in the cron.deny file in order to use this command. If neither of these files exists, only the super user will be allowed to use this command.
-u It specifies the name of the user whose crontab is to be tweaked. If this option is not given, crontab examines “your” crontab, i.e., the crontab of the person executing the command. Note that su(8) can confuse crontab and that if you are running inside of su(8) you should always use the -u option for safety’s sake.
The first form of this command is used to install a new crontab from some named file or standard input if the pseudo-filename “-” is given.
I mean, seriously, if that kind of thing is clear to you, you were raised by aliens. Joe, the sysadmin who was standing here five minutes ago, disagrees. You know what that means? It means Joe is wrong.
What a cron job is & how to see what cron jobs you have
A cron job is simply a job that runs at a certain time. A crontab is a table listing all the cron jobs.
To see the list of cron jobs in your account, log in and type
crontab -l
You should see something that looks like this:
00 08 * * * /usr/joeblow/sas/default/bin/sas dir1/dir2/joesdayjob.sas
30 07 * * 1 /usr/joeblow/sas/default/bin/sas dir1/sub1/joesweekend.sas
31 07 1 * * /usr/usc/sas/default/bin/sas dir2/sub1/monthlything.sas
20 08 * * * mail -s "Daily job status" email@example.com < dir1/dir2/things.txt
There are six fields in a cron job. The first one is minutes, the second is hours. Hours are on a 24-hour clock. The third field is day of the month, the fourth is month and the fifth is day of the week. Day of the week is 1 = Monday through 7 = Sunday (0 also means Sunday). The last field is the command you want executed.
So, the first job above runs at 8 a.m. every day. It runs a job called joesdayjob.sas.
The second job runs at 7:30 a.m. every Monday. It runs a job called joesweekend.sas.
The third job runs at 7:31 a.m. on the first day of every month. It runs a job called monthlything.sas.
The final job runs at 8:20 a.m. each day. It mails a file to me that is created at the end of the daily job just to tell me how frigging awesome I am. At the end of the daily job, SAS creates a file called things.txt. When everything is working it will send a message that says:
You are just unbelievably awesome.
If the job does not run successfully, I will get an error message and be sad, also, not awesome.
To change the recipient of the message, because, say, I go on vacation or get hit by a truck or something, a person could just change the email in the last line of the crontab.
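For anyone who wants the field layout spelled out, here is a minimal Python sketch that splits one crontab line into the five time fields and the command. It is only an illustration of the layout described above, not a real cron parser: it assumes a plain job line, with no comments, environment settings or shortcuts, and the names FIELD_NAMES and parse_cron_line are made up.

```python
FIELD_NAMES = ("minute", "hour", "day_of_month", "month", "day_of_week")


def parse_cron_line(line):
    """Split one crontab job line into its five time fields plus the command."""
    # Split on whitespace at most five times so the command stays in one piece.
    parts = line.strip().split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five time fields followed by a command")
    entry = dict(zip(FIELD_NAMES, parts[:5]))
    entry["command"] = parts[5]
    return entry


example = "30 07 * * 1 /usr/joeblow/sas/default/bin/sas dir1/sub1/joesweekend.sas"
for name, value in parse_cron_line(example).items():
    print(name, "=", value)
```

Run on the second job in the listing, this shows minute 30, hour 07, any day of the month, any month, and day of the week 1, which is Monday.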
How to edit your cron job
Please don’t do this unless you know what you are doing because if you delete the crontab, which is the table of cron jobs, and they all stop running, if some of them were my jobs I would be most annoyed.
To edit your crontab, type
crontab -e
This will bring up the emacs editor. Add your job at the bottom. That way, if it has an error, the other jobs should run and only yours will be messed up, and so you have only caused trouble for yourself and not other people.
Save the file. That’s it.
Now, next time I google this I will find it.
I have long suspected that the main role of other people on this earth is to annoy me. Take FIPS codes.
FIPS stands for “Federal Information Processing Standard”. Does no one know the meaning of the word “standard”? A FIPS code is a useful thing. For counties it is a FIVE digit code. Got that, five?
Large numbers of people have apparently NOT gotten that. It would be immensely convenient to have FIPS codes defined the same everywhere. However, they are not. Recently, someone came by who was working with datasets some of which have state and county codes defined as numeric, so 01 = 1, 013 = 13 and she needed these merged with another file that had FIPS codes defined as string.
Just to be more annoying the length of the numeric FIPS codes was defined as 11. I don’t normally write SPSS syntax; I think SAS is preferable for data management. However, as a consultant my job isn’t to tell people what software to use but rather to help them use whatever they have chosen. So, here is one solution for the FIPS problem. FIRST change STATE and COUNTY from numeric to string, then run this code.
* Pad the state code to two digits.
STRING stfips (A13).
IF (LENGTH(LTRIM(STATE)) = 2) stfips=LTRIM(STATE).
IF (LENGTH(LTRIM(STATE)) = 1) stfips=CONCAT('0',LTRIM(STATE)).
* Pad the county code to three digits.
STRING ctyfips (A14).
IF (LENGTH(LTRIM(county)) = 3) ctyfips=LTRIM(county).
IF (LENGTH(LTRIM(county)) = 2) ctyfips=CONCAT('0',LTRIM(county)).
IF (LENGTH(LTRIM(county)) = 1) ctyfips=CONCAT('00',LTRIM(county)).
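STRING fips (A5).
* Assumed final step: join the padded pieces into the five-digit code.
COMPUTE fips=CONCAT(RTRIM(stfips),RTRIM(ctyfips)).
EXECUTE.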
Here are several problems fixed – the county and state FIPS codes had leading blanks. How annoying. LTRIM takes care of those. The LENGTH function is used with the LTRIM function to determine if there is only 1 digit for state code. If so, a 0 is added to the front. Otherwise, the state fips code is just the existing two digits entered. Similarly, if the trimmed county fips is one digit, two zeroes are added, if it is two digits, one zero is added and if it is three digits nothing is added.
This is a really simple, common problem and I thought, using my new method of using Google to find solutions already written, I would find a snippet of code easily so I would not have to exert any effort. Unfortunately, no. I could not expect some very nice person to fill in FIPS codes by hand for 3,000 counties times eight categories. That would just be wrong. So, now she can open the 50 state files, run this code for each one and FIPS is once again standard.
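For anyone who lands here from Google and is not using SPSS, the same zero-padding idea can be sketched in Python. This is a minimal illustration and not a translation of the syntax above: the function name county_fips is made up, and it assumes the state and county codes arrive as whole numbers or as strings that may have leading blanks.

```python
def county_fips(state, county):
    """Build a five-digit county FIPS string from state and county codes."""
    # Strip stray blanks, then pad with leading zeros: 2 digits + 3 digits.
    state_part = str(state).strip().zfill(2)
    county_part = str(county).strip().zfill(3)
    if len(state_part) != 2 or len(county_part) != 3:
        raise ValueError("state must fit in 2 digits and county in 3")
    return state_part + county_part


print(county_fips(1, 13))
print(county_fips("          6", " 37"))
```

The length check is there so that an oversized code fails loudly instead of quietly producing a six-digit “FIPS” code.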
My work here is done. I am heading to The Galley, known as the place to get the best steaks and best martinis in Santa Monica.