
Web Editor May Save SAS from Going the Way of COBOL

I am old. I remember punched cards, COBOL, dumb terminals and having to walk over to the computer center and load tapes onto the drive if I wanted to use large data sets – large back then meaning 100,000 records or more with a few hundred variables. We thought that was pretty big data.

By the time I finished graduate school in 1990, almost everyone I knew who still programmed using COBOL was over 40. They had learned it in college, or picked it up somewhere along the line and stuck with it. I didn’t know anyone who was learning COBOL. It was pretty clearly on the way out. Java, C++, PHP, Perl and JavaScript were all taking up the attention of the cool kids on the block. SAS was a relatively new, cool thing if you were into statistics, while BMDP was on the way out. BMDP – that was another thing no one under 40 seemed to use.

So …. when I went to the Western Users of SAS Software conference this year, I was struck by the fact that I seemed to be about the median age. There were A LOT of people older than me. Most of the younger people were the student scholarship winners and junior professional award winners.

This does not bode well for SAS, and it made me a bit sad because, as I said in a prior post, the model selection procedures were cool. From a statistical perspective, there is a lot of good stuff from SAS.

I used to go to the user group meetings and they would give you a book (yes, on paper, children) that had macros written by SAS users. I think that was the first time I saw the parallel analysis criterion code for factor analysis – a macro I used in my dissertation and in one of the first articles I published.

Tonight, I was looking for a way to do power analysis for a repeated measures ANCOVA and I could not find one for SAS, whether using PROC POWER, PROC GLMPOWER or any user-written macros. It may exist – I looked in several other places as well, found a paper on how to do it using SPSS syntax (although that code did not work!) and found that someone else wrote a procedure in R that I didn’t try.

SAS used to be the place for the cutting edge. What happened?

One reason is that everyone used to use either SAS or SPSS at universities and that isn’t the case any more. A second is that SAS is really expensive, so universities that do not have a license aren’t inclined to get one.

This all sounds like the death knell is tolling for SAS and it is just a matter of time until it follows COBOL and BlackBerry as one of those things about which people ask, “Why are you using that?”

I think there is still some possibility for SAS to turn things around – although whether they will or not remains to be seen.

The smartest thing SAS has done in years is to come out with SAS OnDemand for Academics. This makes SAS free for university students and professors. It’s perfect for online courses because you can upload your data to the class website and all of your students can access it.

Now the next thing SAS needs to do is start making that available at a reasonable cost once students graduate. Instead of charging them thousands of dollars a year for a license, they can charge $50 a month like Adobe does for its design package or Google does for its apps. (Yes, Google apps for business are cheaper than $50 a month but they don’t do all that much.)

New graduates aren’t going to pay several thousand dollars for a license because they don’t have that kind of money. They might shell out $50 plus occasional extra charges to access some high performance computing capabilities.

SAS already has millions of lines of code and tens of thousands of pages of good documentation. It’s some good stuff.

Think about this – years ago, the Mac was considered a better computer than Windows but over-priced. Many people thought Apple would go under. Instead, they came out with the iPhone and the iPad and they are wildly successful.

The Web Editor and other cloud products could become the SAS version of the iPad.

Here’s to hoping they don’t fuck it up.

 

11 Comments

  1. I’m studying statistics and almost no one likes SAS here. The only reason to learn it is because a few old teachers make it compulsory in class, and because it is still (but for how long) an asset to get a job.

    I absolutely *hated* programming in SAS. In fact, when I graduate I would prefer to be jobless for a few additional months than take a job where I have to be near this thing. It really would make me miserable.

  2. I love programming. I hated SAS because of its card-punching syntax, because of the peculiarities of accessing and manipulating data, because of the overall rigidity of the language. I especially did not like that it was impossible to interface my favorite code editor with the SAS engine. Of course I also don’t like the fact that it’s expensive; science should be free and based on free stuff.

    The biggest problem with SAS is the interactivity. I think slicing and looking at the data and fitted models should be made as flexible and as easy as possible. In that respect, SAS gives me the impression of riding a pair of old wooden skis.

    I like R with Hadley Wickham’s libraries to manipulate and plot the data. I also like Python but it still lacks specialized functions in many statistical areas.

  3. I forgot: If I remember correctly, generating data was not so straightforward in SAS. It’s important to generate data from models of arbitrary complexity and nature, to get a feel of what clean data looks like, and to generate simulation-based confidence intervals and so on.

  4. What yop might not appreciate is the ease of data manipulation and the management strategies offered with SAS. Also, SAS does quite a bit to validate its procedures and methods and to update them for stability, something terribly lacking in R.

    I won’t address the naivete of yop’s quote, “science should be free and based on free stuff.”

    I love using SAS for most tasks, but rely on R for graphics and quick analyses when I don’t have access to SAS and don’t need a reliable record to repeat the analyses.

    As long as SAS is a corporate and FDA favorite, I don’t see SAS losing its appeal in pharma and business. I do see, year after year, SAS’s incredibly poor marketing to colleges, professors, and the student base. This alone may sink SAS to the destinies shared by Polaroid, Blockbuster, and BMDP.

    What could sink SAS is if R Project starts to demand more accountability from package developers in documentation, stability, version testing and updating. A more organized documentation system, reflecting such accountability, would benefit R and R users.

    However, I can’t agree with RR’s Mom here. A $50/month access fee is way too much for a new graduate.

    Schools that do not offer SAS limit their graduates from many lucrative opportunities in the marketplace. Sadly, it is the ambitious student that learns SAS and R in college, in addition to knowing some Perl and Python. This combination is technically formidable and very marketable. Thanks.

  5. It’s not naïveté.

    For a recent example, look at the latest-generation Bayesian sampler Stan. Andrew Gelman gave it away, for free, sources included (and even personalized support; the mailing list is amazing). He chose a BSD license so that commercial companies can freely include it in their software. Look at R: it’s free, sources included, with high quality packages like lme4. The preferred writing tool in technical scientific disciplines is LaTeX, which is free, sources included. More generally, most programming languages and frameworks are free, sources included. You don’t have to pay an expensive yearly license to code in Python! How is that naïveté? Support is expensive, use is free. Beyond software, the current state of the publication system is a recurrent subject of controversy. What is the point of making publication consortiums rich? Universities provide the labor, and in return they get a paywall. That’s not sustainable. It will take time but I think in less than 20 years scientific publications will be freely available on the Internet. Some people just have a problem when they hear the word “free”. That is naïveté.

    Otherwise I agree with R’s lacking documentation and the variable stability of R packages. Maybe the future of R in business is with dedicated companies providing support and bugfixes, as in the Linux world?

  6. I find it very odd that those in Academia feel like only one language should be taught based on personal bias. And these biases are passed on to students who are learning programming for the first time and will regurgitate back what they have been told. I am constantly told by students that they don’t like SAS because it has bad graphics (seriously?), or that you cannot program simulations easily in SAS (have you noticed the similarities between IML and R?), or that they can’t run that R package in SAS (yes you can, and you can take the results and make even more amazing graphics with SAS!). We live in a world where being bilingual in spoken language is a huge plus. Wouldn’t it be the same with statistical programming? Why would we want to limit our students by only teaching them one option? The student who has the ability to jump between languages and offset the weaknesses of one with the strengths of the other is the person that I would want to hire.
    How does this relate to this post? I hope that SAS is reading it because everything that AnnMaria and others have said is true. SAS is seriously losing out in Academia. You would be hard pressed to find a pure statistics program that is even teaching SAS these days. Do I agree with this? Absolutely not, for the reasons that I state above. What this will create, though, is a bunch of recent grads out looking for jobs who will either be limited in what they can find or, more likely, will slowly shift the world of statistical programming in industry away from SAS to what they learned in school.

  7. One of the arguments for R is that it is free, but I find that argument hard to accept coming from universities with $50K tuition and billion dollar endowments. They aren’t spending their money on hardware, software, libraries or computer labs to justify that tuition, I guarantee. SAS OnDemand is also free and works MUCH better than the original Enterprise Guide version.

    Another argument for R is that “real statisticians” are doing their work in it and that is what you use if you are going to develop new statistics. Whether that is true or not is a moot point. For the students I teach, who are interested in careers in management, biostatistics, marketing – mostly in the corporate sector – they have no interest in developing new statistics. Their goal is to get a job when they graduate.

    I completely agree with the need to be bilingual and jump between languages. Instead of picking up R, I decided to learn JavaScript, CSS, PHP & HTML. No, they don’t do at all the same things, but the point is expanding one’s repertoire is a good thing, however you choose to do it.

  8. I think the best thing universities could do is have graduate stat classes that show you how to do things in multiple packages. Show students the multiple ways to run different analyses in various packages. They also need to have classes around dealing with data. Not statistics, but boring (but important) stuff like how to pull from databases, various file types, manipulating variables, merging, etc. I know some people have advocated that computing languages be part of all research graduate programs.

    At the end of the day, like retirement investing and social justice, diversity is the key. You can probably make it as just a SAS or R programmer, but you do much better when you can be relatively fluent in multiple languages.

    I was taught statistics in classes using SAS, did my research in psychology using SPSS (because that’s what the department used), and then taught myself R because I wanted to do advanced psychometrics and SAS & SPSS seem not to want to update their base packages to include functions that are approaching 30 and 40 years old. My SAS is rusty since the department I’m in has wedded itself to SPSS, but I’ve found I split most of my duties up between R, SPSS, and a little mixture of Python, SQL, and VBA.

    I do give SPSS credit, though I guess it’s kind of cheating, for adapting their syntax engine to run R code. But in the future, I think the business model of SPSS and SAS may have to change. With the growing use of R, plus the emerging SciPy community, I really wonder how long those companies can function selling such expensive software. SAS may do some things better than R, but at least in R I have a program I can try different analyses with. I wonder if those execs thank the stars every day for all the legacy code out there.

  9. I agree with doing things in multiple packages BUT (there’s always a but) you need to give enough time to learn multiple packages and you need to find professors who can teach using multiple packages.

    Because universities are unreasonably expensive, most students want to take the minimal number of courses in the minimal amount of time and graduate. (Public universities seem to be less of a problem in this regard, but I have limited experience there so I don’t really know.)

    I’ve taught using SPSS, SAS and Stata depending on the department preference. I’ve offered students the option to use whatever package they know for analyses and provided assistance to them on whatever it is. I can do that because I only teach one class a year and I schedule it when I have time to devote to it. I could not do that if I were a full-time faculty member, nor if I were teaching 4 or 5 courses a year plus another full-time job (as many adjuncts are).

    As for me, most of my day is SAS, JavaScript, SPSS with occasional PHP and SQL. Rarely touch Python and haven’t thought of VBA in a couple of years until you mentioned it. Still, your point is absolutely well taken. The more diversity in your experience, the better your options in the job market and the better your ability to solve problems in general.
