I was reading this article in the Wall Street Journal on SAS software and CEO Jim Goodnight said,
“We’re producing so many new products. [They're all] in the funnel and we’ve got to get them in production and it’s taking us longer to get them out the door…. A lot of the problems are testing issues. It’s taking too long to solve all the problems. Every piece of software has its bugs. And with more and more products, we’re struggling with compatibility to make sure the next release is easy to migrate to for certain customers. With the sheer number of software solutions we have, it’s making it harder and harder to develop and test them all.”
Which got me wondering why SAS does not take advantage of its considerable user base. Another article, from CBR Business Intelligence, claims that SAS has 45,000 customer sites. (Not that I’m stalking y’all or anything.) I have no idea what the average number of users is at each site, but let’s try a really low estimate of around 22. I say really low because I know that many of those sites are universities and corporations that have hundreds of users, and it would take a heck of a lot of small shops to bring the average down to 22, but let’s go with that. So, this gives them around a million users.
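For anyone who wants to check the back-of-the-envelope math, here is a minimal sketch. The 45,000 figure is the one quoted above; the 22 users per site is purely my guess, not a measured value.

```python
# Rough estimate of the SAS user base.
customer_sites = 45_000   # figure quoted from the CBR Business Intelligence article
users_per_site = 22       # deliberately low guess, not a measured value

estimated_users = customer_sites * users_per_site
print(f"{estimated_users:,}")  # 990,000, i.e. roughly a million
```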
You would think that somewhere in those million users you would have people who would be happy to dive in and break things (also known as testing). One benefit of open source software is that you have people trying things all the time. When they come up with problems, they create fixes. Open source software SOMETIMES has disadvantages too, such as lack of documentation and lack of support, though that’s not always true. For large open source projects like Linux, the documentation is massive, as is the support within the user community. There can also be legal issues if a company is marketing something and it has someone else’s code included.
However, there is a hybrid option that can happen on two fronts:
1. User testing
When I was at USC and we would get new versions of any software or operating system, there were several people (including me) who had the immediate reaction of,
“Let’s try to break it!”
We’d install it on every configuration of virtual machine, hardware and operating system. We’d try to use enormous files and analyze huge matrices of dependent and independent variables. We’d compare results from different applications. Ostensibly we did this as part of our job so that when anyone asked us a question we had a better answer than,
“Your guess is as good as mine.”
There was also the aspect of, as one of my co-workers said,
“They pay me to play with computers all day. How cool is that?”
The fact is, when you create a piece of software of a significant degree of complexity (and SAS is certainly on the far right of the bell curve), it is damn near impossible to test every possible permutation:
“What if I run SAS on Linux running on VMware on a MacBook to read a 100GB dataset that was originally created by SAS on Windows in China using a double-byte character set and then exported using PROC COPY?”
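To get a feel for how fast those permutations pile up, here is a rough sketch that counts the combinations of a handful of test dimensions. Every list below is a hypothetical illustration built from the question above, not SAS’s actual test matrix, and real dimensions would have far more options.

```python
from itertools import product

# Hypothetical test dimensions, for illustration only.
dimensions = {
    "host_os": ["Windows", "Linux", "macOS"],
    "virtualization": ["none", "VMware"],
    "source_os": ["Windows", "Linux", "macOS"],
    "encoding": ["single-byte", "double-byte", "UTF-8"],
    "dataset_size": ["1 MB", "1 GB", "100 GB"],
    "transfer": ["PROC COPY", "other"],
}

# Every combination of one value from each dimension is one test configuration.
configurations = list(product(*dimensions.values()))
print(len(configurations))  # 3 * 2 * 3 * 3 * 3 * 2 = 324

# Adding one more dimension with four options multiplies the total by four.
dimensions["release"] = ["release A", "release B", "release C", "release D"]
expanded = list(product(*dimensions.values()))
print(len(expanded))  # 324 * 4 = 1296
```

Even with only two or three values per dimension, the count is in the hundreds, and every new dimension multiplies it again.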
With a large enough user base, SAS could make copies of work in progress available for testing which would allow identification of problems. I can see many reasons that users would be happy to do this:
- Consultants would see this as an opportunity to get ahead of the curve, using the newest software before it was available to the general public.
- Students could use this chance to learn more about the latest software.
- Just for the hell of it. (This is my motivation for most things.)
Fixing those bugs would still have to be done in-house to allow for quality control, documentation and liability issues. Hence the hybrid part. I’d be interested to see how having a large number of users turning the software inside and out could supplement whatever is done internally to minimize the bugs in software when it is released (and just accept the fact that there will always be bugs).
Someone suggested that SAS may be worried about damaging its reputation if it lets out software with lots of bugs in it. I don’t think that would occur if they were very upfront and said,
“Look, this is a work in progress. Run it through the paces and see what you think.”
as opposed to doing what some companies do and essentially releasing their beta version.
And, of course, no matter what you do or say, some people will complain and criticize because some people are just stupid that way.
“What! I can’t believe you did not see the importance of including a Serbo-Croatian to Mandarin translation function! And you say you have a comprehensive set of character functions, you fools!”
2. User-written macros
Back in the 1980s, there used to be a book (yes, an actual physical, phone-book type of book) of SAS user-written macros. Over the years, of course, many of those have morphed into SAS procedures. I remember learning SPSS because SAS didn’t do loglinear models, so this was quite some time ago, and there is not as much need now.
Of course, there are a zillion packages for R, which is true open source, but Stata, which is not, also has a host of procedures that are written by users. Raynald Levesque’s website has 140 macros for SPSS. SAS macros exist in diverse places: individual websites, SUGI/SGF and local user group proceedings.
With the data.gov initiative alone I think there is a lot of extensibility of SAS that is going untapped.
Lately, I’ve been pondering if I should be picking up some other language, partly just because it is good to learn new stuff, but also because it seems as if SAS is a bit behind on getting involved with some of these opportunities like with open government.
All the same points could be made about SPSS, but they at least have the excuse that there are not as many users, not as many people writing syntax versus pointing-and-clicking, and they have been bought by IBM, so who knows what direction they are taking.
It’s just puzzling to me why, with such a strong user community, SAS is overlooking so much of the potential for being faster, better, smarter.