I was reminded today how useful a SAS log can be, even when it doesn’t give you any errors.
I’m analyzing data from a study on educational technology in rural schools. The first step is to concatenate 10 different data sets. I want to keep the source of the data, that is, which data set each observation came from, so that if there are issues with these data (outliers, etc.), I can more easily pinpoint where they occurred.
I used the IN= option for each data set when I read them in and then some IF statements to assign a source.
DATA mydata.all_users18 ;
SET sl_pre_users18 (in=slp )
aztech_pre_clean (in=azp )
mydata.fl_students18 (in=fls )
After I run the data step, I see that 425 observations do not have a value for “source”. How would you spot the error?
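One way to see how many observations are missing a value is a frequency table that includes missing values. This is only a minimal sketch, assuming the output data set is mydata.all_users18 and the variable is named source, as in the data step above. It is not necessarily the check I ran.

```sas
/* Count observations for each value of source.          */
/* The MISSING option includes blank values in the table, */
/* so unassigned observations show up as their own row.   */
PROC FREQ DATA=mydata.all_users18 ;
  TABLES source / MISSING ;
RUN ;
```

In the situation described here, the row with a blank value for source would have a frequency of 425.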
Of course, there is more than one way, but I thought the simplest thing was to search in the SAS log and see which of the data sets had exactly 425 observations. Yep. There it is. Took me 2 seconds to find.
147 PROC IMPORT DATAFILE=REFFILE
149 OUT=WORK.MC_bilinguaL_students18 replace;
NOTE: The import data set has 425 observations and 2 variables.
So, I looked at the code again and sure enough, I had misspelled “source” in the last statement:
IF slp THEN source = "Spirit Pre" ;
else if azp then source = "Az Pre" ;
else if fls then source = "Fish Studn";
else if mcb then sourc = "M.Camp.Bil" ;
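A typo like this also leaves a trace in the data set itself. SAS creates any variable that is assigned a value in a DATA step, so the misspelled name ends up as an extra variable in the output. Here is a minimal sketch of checking for that, again assuming the output data set is mydata.all_users18. It is one more way to spot the problem, not the way I found it.

```sas
/* List the variables in the concatenated data set.          */
/* VARNUM prints them in creation order rather than           */
/* alphabetically, so a stray variable created late in the    */
/* data step appears near the end of the list.                */
PROC CONTENTS DATA=mydata.all_users18 VARNUM ;
RUN ;
```

An unexpected variable named sourc in that list would point straight at the misspelled assignment.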
You might think I could have just read through the code, and you are right, but there were a lot of lines of code. In this case, I could immediately identify that the problem had something to do with that specific data set, which significantly reduced the code I needed to look at. I started with the last place that data set was referenced, intending to work backward. Fortunately for me, the typo was in that very last place.
The fact is, you will probably spend as much time debugging code as you do writing it. The log and logic are your friends. Also, no matter how long you have been programming, you will still make typos.