Anyone who uses SAS (or doesn’t) probably has their own reasons. I have a few, but a major one is the ease of importing just about any type of data.
Mo’ clients, Mo’ problems
There are multiple types of consultants. I’m the type who is, literally, all over the map. I’ve been in five countries this year and I think 11 states plus the District of Columbia, but I might have left off a couple. I said 9 in a post on a different blog where I occasionally write about my life and judo, but then I remembered I’d been in Texas for SAS Global Forum where I gave a talk on biostatistics and also in New Mexico speaking on transition from school to work for tribal youth with disabilities.
What that means is that I work with a wide range of organizations and their data is not all in the same format.
If you work with a wide range of clients, ease of data import matters
If you’re a consultant who works consistently with one client, data formats may not be your biggest issue. You probably wrote a program to read in that data, no matter what messy format it was in, and you’re good to go. In my case, though, every dataset and every project is different.
All the data, all the time
In the previous post, I mentioned reading in the IPEDS data, which is a relatively small public data set (around 7,000 x 60). Fantastically, that came with a SAS program so all I needed to do was upload the raw data file and change the INFILE statement.
- A couple of years ago, I wrote about getting data from PHPMyAdmin into SAS Enterprise Guide just by doing a lot of pointing and clicking.
- Earlier this year, I mentioned reading JSON data that had been saved as a string in an SQL database into SAS. All it took was a DATA step and liberal use of the INDEX function.
- SAS also works well for reading in extremely large data sets and older data sets created with really old versions of SAS. I’ve even used it once to import a very large Excel file which the customer had omitted to tell me was in Korean. I haven’t had to use PROC CPORT for a while, so I’m hoping for work that requires it at some point so I can test it out again.
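To give a flavor of the DATA step plus INDEX approach mentioned in the list above, here is a minimal sketch. It is not the program from that earlier post; the JSON rows, the key name `score` and the data set name `work.parsed` are all made up for illustration, and it assumes the value being pulled out is a plain number.

```sas
/* Hypothetical example: each row is a JSON string, as it might
   arrive from a text column in an SQL table. */
data work.parsed (keep=json score);
    infile datalines truncover;
    length json $200 key $12 rest $200;
    input json $char200.;

    key = '"score":';                      /* the key we are looking for */
    pos = index(json, strip(key));         /* 0 when the key is absent   */

    if pos > 0 then do;
        /* everything after the key, starting at the value */
        rest = substr(json, pos + length(key));
        /* the value ends at the next comma or closing brace */
        stop = findc(rest, ',}');
        if stop > 1 then
            score = input(substr(rest, 1, stop - 1), ?? best32.);
    end;
datalines;
{"user":"a01","score":17,"level":3}
{"user":"a02","level":1}
{"user":"a03","score":4}
;
run;

proc print data=work.parsed;
run;
```

Rows without the key simply end up with a missing `score`. This kind of string hunting is only reasonable for flat, predictable JSON; nested objects or quoted values containing commas would need more care.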
Proc import does not a consultant make
Maybe when you were a student you imported your data sets with a PROC IMPORT step. This isn’t terrible. You should use this procedure when you can. However, you’re going to need to go several steps further.
Even worse, if you’ve been getting your data by simply using the LIBNAME statement your professor provided you or doing some pointy-clicky thing with SAS Studio or Enterprise Guide (or SPSS), you have a lot to learn.
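As a rough sketch of what "several steps further" than PROC IMPORT can look like, compare letting SAS guess the structure of a CSV file with spelling it out yourself in a DATA step. The file path, data set names and columns (`id`, `visit_date`, `score`) are hypothetical, and the sketch assumes a comma-delimited file with a header row and dates written as yyyy-mm-dd.

```sas
/* Option 1: let PROC IMPORT guess column names and types. */
proc import datafile='/home/myuser/data/client_file.csv'   /* hypothetical path */
    out=work.client_auto
    dbms=csv
    replace;
    getnames=yes;        /* first row holds the variable names      */
    guessingrows=max;    /* scan every row before deciding on types */
run;

/* Option 2: read the same file in a DATA step, stating exactly
   how each column should be read. */
data work.client_manual;
    infile '/home/myuser/data/client_file.csv'
        dsd dlm=',' firstobs=2 truncover;
    input id         :$10.        /* keep IDs as character           */
          visit_date :yymmdd10.   /* assumes dates like 2016-05-31   */
          score;                  /* numeric                         */
    format visit_date date9.;
run;
```

The second version is more typing, but it is the one you can adapt when a client's file has IDs with leading zeros, odd date formats or stray delimiters that PROC IMPORT guesses wrong.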
Every year, I have graduate students who tell me they are going to become consultants. More often than not, I shake my head and think,
“You have no idea what you are getting into.” – Me
If you are going to be working as a statistical consultant for a variety of clients, you are going to spend far more of your time in the DATA step than in PROC LOGISTIC or PROC GLIMMIX.
It’s not just a matter of data formatting or missing data, but of creating the data you need that isn’t there. What do I mean by that? Ha ha, that is a future blog post that I may write next time I’m on a plane somewhere and have a spare moment. Probably tomorrow.