Been working with SAS a lot lately to get the data in shape for a new client. Here’s one of the problems I ran across.

We want to get the average pay rate for people who got jobs as a result of our client’s services. However, when one of the people they served did not get a job, the pay rate is listed as “no”. When the person did get a job it’s listed as something like $11.00/hr .

Obviously, this is a character variable, and equally obviously, what I would like to do is:

  1. Change the “no” values to missing.
  2. Remove the “$”, “/” and “hr”
  3. Change the variable type from character to numeric

Why not just do a couple of Replace commands in Excel?

We have a contract with this client for the fiscal year and we expect, like almost all of our clients, that this contract will be renewed and we’ll work with them for years to come.  I’ll have to do that replace every month, unless I get them to change the way they enter their data and I think that is bad service to ask our clients to change for my convenience. They’re paying us to make life easier for them, not the other way around.

Here’s how easy this is to fix

Data services ;
set in.report7_11 ;
attrib jobpay length = 8  ;
jobpay = compress(earnings,'$/hrno') ;

The ATTRIB statement creates a new variable that is numeric. The COMPRESS function strips all of these characters out of the value for the earnings variable, ‘$/hrno’ /

Done.

Comments

One Response to “Stripping data the easy way in SAS”

  1. Stan on July 21st, 2011 3:04 pm

    That could still give you a nasty surprise if other characters pop up. I prefer to use the K modifier so compress keeps what’s in quotes instead of deletes:

    jobpay= compress(earnings,’1234567890.’,’K’);

Leave a Reply