Every now and then I post a mistake I made using either statistical software or statistics. Students often get discouraged, feeling that they make so many mistakes they will never get it all right. No one gets it all right all the time.
Obvious mistake of the day ….
I was making minor cosmetic changes on a production job on a test computer and none of the data for the current week were being selected. This is bad. It kept reading only 20 observations. I reviewed the code for a subsetting IF statement or other logic that would delete all but 20 observations. Nope, looked fine. It was supposed to read the records in the last week:
If rec_day > today() - 8 then do ;
Using the SAS function today() and then doing a bunch of stuff.
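I ran a DATA _NULL_ step and put the value of today() to my SAS log
Data _null_ ;
set odd_data ;
/* PUT writes variables, not function calls, so store today() first */
today_val = today() ;
put "Record Day = " rec_day ;
put "Today = " today_val date9. ;
run ;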
Only 20 records were printed and sure enough, none of them were within the last week. The value of today() was exactly what it should be.
Then, it dawned on me in one of those moments where you slap your head and can’t believe you missed it. Earlier in the day, I had been running another job trying to format some output to be exactly right. Since I didn’t want pages of output, I had set
options obs = 20 ;
I had just closed the old program, opened the new one and kept on going.
My code was fine; the option was still in effect. I re-set
options obs = max ;
and life was good again.
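A quick check before running anything important is to ask SAS what OBS= is currently set to. The following is a minimal sketch, assuming an interactive session where options from an earlier program may still be in effect:

```sas
/* Write the current value of the OBS= system option to the log */
proc options option=obs ;
run ;

/* Same check from open code, using the GETOPTION function */
%put NOTE: OBS is currently %sysfunc(getoption(obs)) ;

/* Reset so that every observation is read */
options obs = max ;
```

Putting the reset at the top of a production program means it does not matter what was left over from the last job.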
I have been asked several times by students in my classes if I would consider writing a blog on statistics and statistical programming. Apparently, blogs are “in” with people younger than me, as witness daughter #2 at left who has just informed me that people do not use the term “in” any more. Whatever.
Giving it some thought, I could see three advantages to a blog on statistics, statistical software and common errors.
- Even after twenty-five years of experience, I am still making mistakes every day, so there will be no shortage of topics.
- It may help to dispel the myth that some people cling to that math, statistics and computer programming are something that you are either good at or not, the sole domain of those whose brains work differently than the normal people. On the contrary, I would say that these are areas where I am very good and where I learn every day. These two are related.
- Speaking of the whole learning thing, it is possible that one could learn from reading a blog on other people’s mistakes. In fact, I have now added to my infinite to-do list, “Find blogs of other people’s mistakes.”
SAS on UNIX – a lesson about how there is no place like home
A little background – the university where I work as a consultant has one of the top high performance computing centers in the world. Way cool. Everyone who has an account has a home directory where their personal information is stored – login files, etc. You can run small programs there and save small files. (I define small as anything under 20,000 records or so.) If you are working on a major project you may have something like the entire Medicaid records database which would take up a huge amount of disk space. Rarely would you work on a project like that by yourself and it doesn’t make sense for everyone to have a copy of some enormous file in their home directory, so you would have a project directory where your data are stored and shared with other people.
Yesterday, I kept trying to log in to my HPCC account and I got a message saying “explicit kill or server shutdown”. I thought perhaps there was something wrong with my Windows machine. Let’s face it, there’s always something wrong with Windows. I did all of the usual things – closed the XWin program, restarted the computer. I tried logging on to my account using two other computers and I still could not log on.
I tried using Fetch on my Mac to upload a file. The little Fetch dog kept running and running but nothing happened. No error message, just a continually running little Fetch dog.
Since it wasn’t my computer – I had tried logging on with three computers using three different operating systems – it must be my account. I logged in with another account no problem. Hmmm … definitely my account.
I logged in using PuTTY on my Windows machine thinking perhaps something had gone wrong with the XWin settings. I tried editing a file using Pico and it would not let me save it, saying “Disk quota exceeded”. This really made no sense, since the quota for my project directory had been doubled to 200 GB a day or so earlier after another anomaly: I had tried to copy a file and received the message “disk quota exceeded”.
I had looked at the files in my project directory and everything seemed fine. However, I asked the kind folks at HPCC to increase my quota to 200 GB and they did. Still, I did not think I had over 50 GB in my project directory.
Having managed to log in with PuTTY, I searched around my home directory, did an ls and found that at some point I had inadvertently saved a very large file to my home directory instead of my project directory, by putting my home directory in the LIBNAME statement. This had greatly exceeded my personal disk space quota of 100 MB. So … I moved the file from my home directory to my project directory, giving myself gonzo amounts of space again, and life was good.
LIBNAME libref "~/somename" ;
saves things to your home directory. I know that. I must have just been in a hurry.
Since I had just enough space in my home directory for the dataset to completely fill it up, my SAS job ran fine without errors and I was locked out of my account and over quota for a day.
LIBNAME libref "/projectdirectory/subdirectory" ;
saves it in your project directory.
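One way to catch this before a big job runs is to have SAS report where a libref actually points. This is a minimal sketch; the libref name mylib is hypothetical and the path is the placeholder from above, so substitute your own:

```sas
/* mylib and the path are placeholders for illustration */
libname mylib "/projectdirectory/subdirectory" ;

/* Print the libref's engine and physical path to the log */
libname mylib list ;

/* Or resolve just the physical path with the PATHNAME function */
%put NOTE: mylib points to %sysfunc(pathname(mylib)) ;
```

If the path in the log starts with your home directory, large output datasets will count against your personal quota.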
Beware the ~ !!!!