view from plane windowLittle known fact (because, seriously, how would you know) , I write a lot of code while sitting on a plane and I can’t always connect to the Internet.

NOT ALL QUOTATION MARKS ARE CREATED EQUAL

Sometimes, when I copy and paste my code into SAS Studio, it doesn’t work.

if compress(q23) = “3/4” then q23 = “.75” ;

Just so you know, this does not work because some programs like Word, or even TextEdit on the Mac will replace quotes with some swirly shit (see above) that SAS and other languages don’t read as quotes.

This article from the University of Michigan gives some hints on how to prevent or fix this problem.

How to tell if your quotations are a problem


SAS Studio is color-coded.
Note that the first two lines have the values shown in purple.

color coded

The next two lines don’t. If you look closely, those are the evil curly quotes. If you realize this, you can tell at a glance if there is a problem with your code.

Getting rid of text

Okay, I replaced the evil curly quotes, but I still have a problem. The questions are things like,

“What is the area of this shape in square feet”, and let’s say the answer is 240 .

Students answer all kinds of variations of that, like :

  • 240 square feet
  • The answer is 240
  • 240 sq ft

All of these answers are correct but if I just compared them to 240,  they would not be equal and be marked wrong. Enter the COMPRESS function.

q3 = compress(Q3,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','l');

The above statement will remove all alphabetic characters from the answer and return just numbers.

The COMPRESS function has three parts –

  • the source, which is the variable you want modified, in this case, Q3,
  • the characters you want added or removed (the default is removal),
  • an optional modifier

In my case, I used the modifier ‘l’  – that is a lower-case L, not a number 1 – because I wanted all of those letters removed if they were lower-case, too. So, I don’t have to type all of the letters of the alphabet twice.

Getting rid of special characters

You can also use the COMPRESS function to get rid of special characters. Say the question is “If tickets are normally $100 and tickets are 50% off, how much does it cost Cassandra for a ticket to the Dead Fleas concert?” Students will enter answers like, $50 or 50.   To get rid of the $, simply do this:

q1 = compress(q1, ‘$’) ;

When I’m not teaching statistics or writing about SAS, I’m making video games. We’re doing a Kickstarter campaign to make our bilingual games available everywhere and if you backed us, that would be AMAZING ! Plus you will get cool prizes.

I read a comment on line saying SAS probably would not disappear as an option for statistical analysis because “it’s good when you need to do a lot of data manipulation”.

I wonder what world those people live in that data comes all cleanly packaged and whether they have unicorns there.

Back on Planet Earth, I have a data set that has multiple records for the same date for the same students.  For some reason, the data were being sent at the end of each screen at one site, instead of at the end of the test. So, the data look like this:

kat123 4 5 18 11   2017-04-23 17:39:26

kat123 4 5 18 11   42 17 8 0 1 2017-04-23 17:41:12

and so on.

The students also took a post-test, months later, so …

I need the last record for each date, but my data has date and time

You might think doing

testday= datepart(date_entered);

would work and it would except for the fact that

My date is saved as a character format! What do I do?

You can read some suggestions here in SAS communities

https://communities.sas.com/t5/Base-SAS-Programming/how-to-convert-char-var-to-sas-date/td-p/45067

I could not find

2016-02-03 19:41:26

and I spent a good hour trying different methods to get this to work. I will spare you the details and maybe I could have gotten some method to work (no, whatever you are considering, I probably already tried). However, this occurred to me …

Do you really need to change it to a date format?

In this case, I was not doing any calculations with the date value, I simply needed the day part as a unique value.

I could just use the first 10 characters like this

day_of_test = substr(date_entered,1,10) ;

If you figured this out in the first sentence or two you are probably laughing by now (shut up).  Yes, it doesn’t matter if it is formatted as a date or not. So, that is what I did.  After creating a variable that is just the day of the test, I sorted by username, day of test and date entered (which included the time value). Then, I read in the data using the BY statement in the Data step so there would be  last. variable created that is whether or not this is the last record with that value in the BY group.  I output the last record for each day by using a subsetting IF statement.

Data fixdata ;
set mydata.aztech_pre ;

*** CREATE day_of_test variable as characters 1-10  ;
day_of_test = substr(date_entered,1,10) ;

*** SORT by username, day of test and date entered (including time);
proc sort data=fixdata;
by username day_of_test date_entered ;

*** DATA step that only saves last record ;
Data mydata.aztech_pre ;
set fixdata ;

***  BY statement to define that the data is by username and day_of_test ;
*** NOTE:  If you didn’t do the PROC sort first, this won’t work. For shame! ;
by username day_of_test ;

***
if last.day_of_test  ;
run;

So, that worked perfectly. I included my missteps because it is easy when you are a newbie to believe that everyone is smarter than you and never makes bonehead mistakes. Not so. We all make them all of the time. The important thing is, figuring it out in the end. Sometimes the easy way is not so obvious.

Or, maybe it is and I’m a bonehead. Either way, it worked. Now on to step 2.

 

When I am not writing about SAS, I’m making games that teach math, social studies and language.

Check them out.

screen shots from our games

A while back, I wrote a post on getting your Excel data into SAS Studio the quick and easy way. However,  I hear you saying,

What about ME? What about MY needs? What if I don’t want my data written to the working directory? What if my file has the names at the top and I want to keep those names?

First of all, open a program file and run some code that assigns the LIBNAME to the directory where you want your data stored. It should look like this but whatever is in the quotation marks should be where your data are stored.

LIBNAME mydata “/courses/d1234566789” ;
run;

Second, upload your Excel File

sasexcel1

Under FILES, select the folder where you would like your data stored. Click on the UPLOAD FILES button (the arrow pointing up at the top of the screen) and then click CHOOSE FILES to go to where the file is stored on your computer. Select that file, click the button on the pop-up window that says UPLOAD. Now you have your Excel file, uploaded but you want a SAS file.sasExcel2

Go under TASKS and UTILITIES and click the arrow to select UTILITIES and then select IMPORT DATA.

 

sasExcel3

On the right, you’ll see this big window that says DRAG AND DROP YOUR FILE HERE.

file list

In the left pane, open the FILES directory and go to where you saved your Excel file. Drag it into the window. Once you’ve done that, this wi If you stopped here, you would have the file written to the working directory, and named import.

import option

If you want to change that, click the button that says CHANGE.

changing default name in boxes

This pops up. Don’t see the directory you want? Did you run the LIBNAME statement at the very beginning of this post to assign a library reference to that directory? For shame! You think I just make this stuff up? Go back and do it now.

Okay, should you be concerned that your library name is greyed out? No, you should not. That just means you cannot change the name of your library reference here. If you wanted to change that library name from “mydata” to “yourdata” you’d have to do it in the LIBNAME statement.

Type the name you want for the data set. Do not forget to click SAVE or you may as well have skipped this step.

Click the little running guy at the top of the window.

Before you go, notice that SAS also generates code for you. If, like me, you anticipate that your data may change and you may need to do this again, you can copy and paste the code generated by SAS and save it in a program file. Run it again to recreate your data set. How likely is that to happen?  Well, it happened to me today when I inadvertently (that’s a synonym for “stupidly”, right?) wrote over this exact data set.

/* Generated Code (IMPORT) */
/* Source File: az_pretest.xlsx */
/* Source Path: /home/annmaria.demars/data_analysis_examples/data2017 */
/* Code generated on: 7/31/17, 6:09 PM */

%web_drop_table(MYDATA.aztech_pre);
FILENAME REFFILE ‘/home/annmaria.demars/data_analysis_examples/data2017/az_pretest.xlsx’;

PROC IMPORT DATAFILE=REFFILE
DBMS=XLSX
OUT=MYDATA.aztech_pre;
GETNAMES=YES;
RUN;

PROC CONTENTS DATA=MYDATA.aztech_pre; RUN;
%web_open_table(MYDATA.aztech_pre);
run;

Okay, there you go. With a few clicks, your Excel file is accessible in SAS Studio as a SAS data set and you have a copy of the code that did it.

Next post we’ll start whipping that data into shape.

When I am not writing about SAS, I’m making games that teach math, social studies and language.

Check them out.

screen shots from our games