You know those giveaways you get at conferences? My favorite one I ever saw, which I did not get because they had run out, was a magic wand. It was wand-shaped, with sparkles floating inside. They had a vase full of them with a note saying that this was the magic wand some people seemed to want you to wave to make all of the bugs disappear.
It did not have sparkles, much to my disappointment, but SAS Enterprise Guide actually made all of my data problems disappear today and I was happy.
Here is my problem: I downloaded a dataset from ICPSR (Inter-university Consortium for Political and Social Research) that had hundreds of variables, each of which had a user-defined format and a name like V12345. I did not want hundreds of variables. I actually only wanted (I thought) 21.
So, first I did this in SAS 9.2, which read in the .stc file using PROC CIMPORT and kept me from getting format errors, since I had set the NOFMTERR option.
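* Point a library at the download folder and a fileref at the transport file ;
libname in "E:\DS0008" ;
filename readit 'E:\DS0008\25422-0008-Data.stc' ;
* Do not stop on the missing user-defined formats ;
options nofmterr ;
* Restore the data set from the transport file into the library ;
proc cimport infile = readit library = in ;
run ;
* Keep only the 21 variables of interest ;
data in.iom ;
set in.da25422p8 ;
keep caseid v4259 - v4267 v4240 v4253 v4254 v4255 v4241 v4116 - v4121 ;
run ;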
BUT … I still wanted to rename all of these variables and change the formats. I closed SAS and opened up Enterprise Guide.
Under EDIT, I turned off PROTECT DATA. Then, for each of the variables, I right-clicked on the column (actually ctrl-click, since I was using a Mac) and selected Properties. This was very efficient for me because I was not actually sure these were the exact variables I wanted, and when I saw the labels I could delete some right then. I changed the names, labels and formats.
I did not have to do a proc contents, write a drop statement for the variables I didn’t want, a rename statement for the variables I wanted to rename, a label statement for the variables I wanted to relabel and then an attrib statement or some other method of changing the format.
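For comparison, here is a rough sketch of what the code route might have looked like for a single variable. The new names, the label text, the output data set name and the choice of which variable to drop are all made up for illustration; nothing here says what V4240 or V4255 actually measure.

```sas
/* Sketch only: iom_clean, parent_var and the label text are hypothetical */
data in.iom_clean ;
  set in.iom ;
  /* drop a variable you decided you did not want after all */
  drop v4255 ;
  /* relabel, using the old name (the rename only takes effect on output) */
  label v4240 = "Hypothetical new label" ;
  /* a FORMAT statement with no format name removes the user-defined format */
  format v4240 ;
  /* rename on the way out */
  rename v4240 = parent_var ;
run ;
```

Multiply that by 21 variables, plus a PROC CONTENTS first to see what the labels were, and the point-and-click approach starts to look pretty good.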
Then, I opened a code window and wrote a few lines for all numeric variables to have the -9 value changed to . so it was Missing and didn’t throw off my calculations.
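Those few lines were along these lines. This is a reconstruction of the general technique, not a copy of what I typed, and it assumes -9 is the only missing-value code in the numeric variables:

```sas
data in.iom ;
  set in.iom ;
  /* the array picks up every numeric variable that exists at this point */
  array nums {*} _numeric_ ;
  do i = 1 to dim(nums) ;
    /* recode the -9 missing-value code to a true SAS missing value */
    if nums{i} = -9 then nums{i} = . ;
  end ;
  /* the loop counter is created after the array, so it is not in it; drop it */
  drop i ;
run ;
```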
Because I was very curious, I selected TASKS > CHARACTERIZE DATA to take a look at what I had. It was kind of sad, really. These data are from a longitudinal study of youth, and the particular variables I had were from their senior year of high school. The sad part was the great disparity between the percentage of students who said they expected to go to a four-year college and the percentage who actually will go. Because this was the interesting part, I went to TASKS > MULTIVARIATE > CORRELATIONS (yeah, I wouldn’t have put correlations there, either, but whatever). In short, mother’s and father’s education both relate significantly to every positive educational outcome you can imagine, but mother’s education matters more.
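If you would rather type than click, the correlations task corresponds roughly to PROC CORR. The variable names below are hypothetical stand-ins for whatever you renamed your columns to, so swap in your own:

```sas
/* Sketch only: mom_educ, dad_educ and the outcome names are hypothetical */
proc corr data = in.iom ;
  /* outcomes go in VAR, the parent education measures in WITH */
  var expect_college grades ;
  with mom_educ dad_educ ;
run ;
```

Using WITH gives a compact table with the parent variables as rows and the outcomes as columns, instead of the full square matrix.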
I right-clicked on the dataset in the Process Flow window and picked Export, to export it to Excel, since the person I am working with on this project does not have SAS on her computer.
Okay, it’s past 1 a.m. and even though it is supercool that I was able to at least look at my data somewhat tonight, I need to go to bed so I can get up tomorrow and work to buy the world’s most spoiled twelve-year-old what she decides she wants next. Today it was a 32GB iPhone, but the fact that my sixth-grader has in her pocket more computing power than existed in the world when her grandparents were twelve, well, that’s another story.