# I was wrong not to teach my statistics students programming

It’s been a good day. I had to drag myself away from PhpStorm to write this blog post. I used phpMyAdmin to create the tables I needed, then wrote a few scripts in PHP to connect, insert records and execute queries. I used Dreamweaver and TextWrangler for the HTML. In a day or so, I’ll probably add some JavaScript for error-checking, prompting and hints.

Although I took BASIC, FORTRAN and yes, even COBOL, back in the dark ages, most of my programming for several years was limited to SAS in every flavor – on Windows, Unix, Mac (when they had that), then the new SAS On-Demand. I used SAS macro language quite a bit, a huge variety of statistical techniques, SAS Enterprise Guide, SAS Enterprise Miner a little bit and SAS/GRAPH (which I hate) even less.

When I started graduate school, I already knew SAS because I had been working as an industrial engineer and we used it for analyses. As a graduate teaching assistant, I taught three-hour computer labs that students were required to take in addition to their lectures. Yes, a three-credit statistics course required SIX hours of class time, three hours in the lab and three in the lecture. On top of that, you were actually expected to read the book and do the homework problems, which could take from one to four or more additional hours, depending on how much previous mathematics/statistics you had. SPSS and SAS existed, but only as code. Students had to learn programming because that was the only way to get their results.

For reasons beyond one blog post, universities have been cutting back on requirements to accommodate working students, so now students get less for their money – fewer requirements, fewer hours in class. I teach statistics primarily to students who don’t want to learn it – people getting doctorates in education, psychology, business. Add programming on top of that and I (erroneously) thought it was a bit much, given that most of my students work full time, drive an hour or two to get to class and back, and have additional classes on top of that. Many of my students are in professional and managerial positions, working over 40 hours a week.

Because they didn’t really want to do programming and they could use pointy-clicky options like SAS Enterprise Guide and SPSS, I went with that. Even then, many of the students had problems because the whole statistical analysis process was new.

Still, I was wrong. The work I did today (and for the last couple of years) was VASTLY easier because I had used SAS. I understand the concept of loops, arrays, macro variables, user-defined functions (a.k.a. macros), and a very large number of varied types of functions – text processing ones like changing to upper case, trimming blanks, finding substrings, and mathematical functions to find minimum, maximum, etc. I know about IF and ELSE statements and, thanks to PROC SQL, about SELECT, INSERT and CREATE statements.
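Here is a rough sketch of how those same ideas look in yet another language. I chose Python only for illustration (it is not one of the languages I mentioned), and the table, names and scores are made up. It is not code from any of my projects, just a small demonstration that a loop, an IF/ELSE, a user-defined function, a few text functions, min/max and the three SQL statements show up in much the same shape wherever you go.

```python
import sqlite3

# Hypothetical sample data: (name with stray blanks, exam score)
raw_records = [("  ana ", 88), ("Ben  ", 72), (" chloe", 95)]


def clean_name(name):
    """User-defined function: trim blanks and change to upper case."""
    return name.strip().upper()


# An in-memory database, so nothing is written to disk
conn = sqlite3.connect(":memory:")

# CREATE: the same kind of statement you would write in PROC SQL
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")

# Loop over the records and INSERT each one
for name, score in raw_records:
    conn.execute(
        "INSERT INTO scores (name, score) VALUES (?, ?)",
        (clean_name(name), score),
    )

# SELECT the rows back, sorted by name
rows = conn.execute("SELECT name, score FROM scores ORDER BY name").fetchall()

# Mathematical functions: minimum and maximum
scores = [score for _, score in rows]
lowest, highest = min(scores), max(scores)

# IF / ELSE inside a loop (the cutoff of 80 is arbitrary)
labels = {}
for name, score in rows:
    if score >= 80:
        labels[name] = "high"
    else:
        labels[name] = "low"

# Finding a substring in the third row's name
has_substring = "HLO" in rows[2][0]

conn.close()
```

Anyone who has written a DATA step and a PROC SQL query could read this and guess what nearly every line does, which is exactly the point.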

I could go on and on, but the point is that whether I picked up Ruby or JavaScript or PHP, all of which I did in the past couple of years, it was made immensely easier because of my prior experience. A Ph.D. is not a vocational certificate, even though it has turned into that for many people and institutions. A college degree of ANY type ought to prepare you not just for the jobs that are available now, but also for the jobs that haven’t been invented yet. Learning any programming language would give my students a greater range of job opportunities in the future. Also, the whole idea of data analysis with a computer application is new to them anyway, so maybe learning SAS will be a bit more uncomfortable, but it is very definitely not beyond their capabilities.

If I had a Ph.D. in Educational Psychology without the ability to program, my options for supporting my children when my husband fell ill and later died would have been much more restricted than they were. Financially, having learned programming has been a huge benefit in my life and that of my children. In many projects, even when I wasn’t doing much of the coding myself, understanding programming has made me much more effective in design and management. On top of it, I love my work.

So … starting this year, I am back to teaching programming.

For an amusing discussion of whether to teach SPSS or SAS or R, check out the Lovestats blog.

Totally agree. I think it’s sinful that some universities teach stats with STATA or Minitab so they can teach stats without worrying about teaching how to write a LIBNAME statement or DATA step. STATA is not an employable skill. Best thing I got out of one year in an MPH program was that alongside the stats course, they had a statistical programming course (in addition to stat lab). So you learned SAS (and a couple of other languages) while you learned statistics. Stat prof taught stat, programming prof taught programming.

The free languages can be really liberating, because you can give your script to anyone and tell them to run it again on their computer. Or you can bring up your old work, even after you leave school. Also, because they are newer than SAS (which is older than C!), they have true functions, classes, and all that stuff computer scientists invented after 1966. And R graphs are much easier than SAS graphs.

Interesting. I’m taking stats independently, so currently it’s SPSS and some Minitab. But having worked (and still working) as a programmer, and looking at a lot of what is being written about data analytics at the moment, I’m pretty convinced that it’s in my interest to add R and Python to my own mix. There’s a lot to be said for having to build the solution from scratch sometimes – it allows you more latitude. Plus, as you note, it is a transferable skill.