A few years ago, I was at the Western Uses of SAS Software conference and renowned statistician Stanley Azen played the piano and sang at the closing ceremony.
Briefly, very briefly, I considered beginning my presentation on 10 SAS steps to an annual report, by writing a song, These are a few of my favorite PROCs, and then singing it to the tune of “These are a few of my favorite things.”
This plan was dismissed a nanosecond later when I was reminded by The Perfect Jennifer that my singing bears an uncanny resemblance to the sound Beijing the cat used to make in the middle of the night when fighting with the cat next door.
To make up for my disappointment over my lack of musical rendition, I decided to do a few posts on my favorite PROCS, in no particular order. Today’s contestant is … drum roll please ….
Whenever possible, I try reading in the data using the IMPORT procedure, because, it is very simple and my goal in programming is not impress people with my brilliance – it is to get the job done with maximum efficiency and minimum effort.
As can be seen in the example below, there is no need to declare variable lengths, type or names. Only three statements are required.
PROC IMPORT OUT= work.studentsf DATAFILE= "/courses/u_mine.edu1/wuss14/Fish_students.csv" DBMS = CSV REPLACE;
GETNAMES = YES ;
This PROC IMPORT statement gives the location of the data file, specifies that its format is csv (comma separate values), the output file name is studentsf, in the work directory and that if the file specified in the OUT= option already exists, I want it to be replaced.
The second statement will cause SAS to get the variable names from the first row in the file. Since the variable names are in the first row of the file, the data begins in row 2.
Limitations of PROC IMPORT
As handy as it can be, PROC IMPORT has its limitations. Three we ran into in this project are:
- Excel files cannot be uploaded via FTP to the SAS server, so , no PROC IMPORT with Excel if you are using the SAS Web Editor,
- If the data that you want to import is a type that SAS does not support, PROC IMPORT attempts to convert the data, but that does not always work.
- For delimited files, the first 20 rows are used to determine the variable attributes. You can give a higher value for the number of rows scanned using the GUESSINGROWS statement, but you may have no idea what that higher value should be. For example, the first 300 rows may all have numbers and then the class that was records 301-324 has entered their grade as “4th” instead of the number 4.
Although PROC IMPORT is the first thing I always try, one of my pet peeves about instructors and textbooks is when it is the only thing they teach. It’s smart to try the simplest solution first. It’s dumb not to have a back up plan for the instances when that doesn’t work.
For more on what to do in those cases, you can come to WUSS in San Jose. Just a reminder – regular registration closes August 25. After that date, you’ll have to register on site.