It’s been 15-20 years since I was last a member of the American Statistical Association. I read an article in their journals occasionally but not much of it is relevant to me. I work with clients who are designing surveys, analyzing messy data and evaluating programs. They do research but it is not in a laboratory or with accommodating undergraduates seeking ten points extra credit. It’s more often with people with substance abuse issues or learning difficulties who are not too excited about research, the researchers or anything related. My clients are not interested in some esoteric statistical technique that three people in the world will ever use or simulations using perfect data that demonstrate method X has a standard error of .337 while method Y has a standard error of .436.
As for me, I find some of that mildly entertaining, but, in my view, most academic journal articles are written by very intelligent people who are reinforced at every level of their education and career for writing that is deliberately obscure. I remember a professor joking (I hope he was joking!) when a student said he had to read the professor’s latest article twice before he understood it:
“Good! That means I can count it twice on my c.v.”
All of that being said, one might wonder how I found myself on a rainy Monday afternoon driving from Santa Monica to Irvine, on the 405, at rush hour. Generally, this is the sort of thing people don’t undertake voluntarily unless God is appearing at the Irvine amphitheater, or, at the very least, the Grateful Dead.
Actually, it was neither. It was a meeting of the Southern California chapter of the American Statistical Association. Brenda Osuna, the senior statistical consultant at USC, had sent me an email asking if I was going. I wasn’t planning on it, but I took a look at the invitation and saw that the speaker was the new ASA president, Bob Rodriguez, who is also some mucky-muck statistical something at SAS. So, that sounded a little interesting. Then I saw that the topic was big data and business analytics. That looked a lot more interesting, and not the typical ASA journal thing, so I was intrigued. Intrigued enough to pay $189 to join ASA again and drive down there.
Was it worth it? Yes. I’d say so. Here is a summary of my tweets during the talk. Yes, this is a lazy way to do a post for the day, but since I took several hours away from business to attend, there is work that needs to be done. Here are my notes; comments from me are in italics.
Bob Rodriguez, ASA President, speaking at the University of California, Irvine, on Business analytics and big data: What statisticians need to succeed
- Rodriguez advises students on job interviews to ask the interviewer about the types of data they have and the problems they have with data. He says whatever their industry, once people get started talking on that topic it is hard to get them to stop.
- Graduate programs confer 2,200 statistics degrees annually but there is a need for 160,000 more people with expertise in advanced analytics, data mining and statistical analysis. Where are we going to get those people? (I think Rodriguez is right. There is a shortage of statisticians. Business is through the roof for our company. He didn’t ask, but I do wonder, WHY are our own programs not recruiting and graduating more people? You can’t tell me we only have about 2k smart people out of every graduating class of college seniors.)
- Rodriguez gives the example of optimizing store markdowns as a statistical problem. (I am really impressed because, on first mention, I feel myself starting to snore, but …) … he casts it as a problem with big data: tens of thousands of items, hundreds of millions of individual transactions, and mixed effects with stores as a random effect. And I start getting fascinated. THIS is the type of teaching we need in statistics!
- What is big data? Big data is different from “not-so-big” data in volume, velocity and variety – AND it is increasing on all three dimensions.
- One way to attract new people to the field is to show them the number of really interesting, challenging problems in statistics/analytics/programming. (He couldn’t be more right here. This is why I need 30 hours in a day.)
- (In response to a question from the audience on how to handle big data …) Three possibilities. One is distributed processing across multiple computers/nodes. A second is to co-locate the analyses with the data (in-database computing). The third is to rewrite internal SAS procedures, for example, so that they do not depend on the SAS supervisor. Rodriguez is leading a group that is hard at work on that.
- High demand areas for graduates are design & analysis of survey data, computation & large data computing and econometrics.
- Young statisticians need to be able to present the relevance of their work and to write concisely and clearly. (I agree, but I think the preferences of academic journals in general, including those from ASA, and of most professors are complicit in seeing to it that this has NOT happened up to this point. That’s a rant for another day.)
- 70% of enterprise data is unstructured: images, email, documents. This is why SAS is investing so much in text mining.
- (Comment from someone from the UCI Medical School, I think: we find very few statisticians who can write a good statistics section of a grant or an article.)
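For readers who want to see what the first of those three big-data options (distributed processing across nodes) can look like in miniature, here is a rough sketch in Python. It is not anything Rodriguez showed and not how SAS does it. The names `Partial`, `summarize`, `merge` and `combined_stats` are made up for illustration, and the "nodes" are just lists in memory. The idea is that each node reduces its own slice of the data to a tiny summary, and only the summaries travel over the network to be combined (here with the standard pairwise formula for merging means and variances).

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable, Sequence, Tuple


@dataclass(frozen=True)
class Partial:
    """What one (hypothetical) node sends back: count, mean, sum of squared deviations."""
    n: int
    mean: float
    m2: float


def summarize(chunk: Sequence[float]) -> Partial:
    """Runs on a single node: reduce its slice of the data to a small summary."""
    n = len(chunk)
    if n == 0:
        return Partial(0, 0.0, 0.0)
    mean = sum(chunk) / n
    m2 = sum((x - mean) ** 2 for x in chunk)
    return Partial(n, mean, m2)


def merge(a: Partial, b: Partial) -> Partial:
    """Combine two node summaries without revisiting the raw data."""
    n = a.n + b.n
    if n == 0:
        return Partial(0, 0.0, 0.0)
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta ** 2 * a.n * b.n / n
    return Partial(n, mean, m2)


def combined_stats(chunks: Iterable[Sequence[float]]) -> Tuple[int, float, float]:
    """Return (n, mean, sample variance) for data split across several chunks."""
    total = reduce(merge, (summarize(c) for c in chunks), Partial(0, 0.0, 0.0))
    variance = total.m2 / (total.n - 1) if total.n > 1 else float("nan")
    return total.n, total.mean, variance


if __name__ == "__main__":
    # Three pretend nodes, each holding a slice of the transactions.
    nodes = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0, 7.0, 8.0, 9.0]]
    print(combined_stats(nodes))
```

The same pattern also hints at the second option: if `summarize` runs inside the database where the rows already live, only three numbers per partition ever have to leave it.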
In summary – it seems like the American Statistical Association has changed quite a bit since the last time I paid attention to it, 20 years ago or so. I’m sure these changes have been in the works for a long time, but it is like any organization or company: once you decide it doesn’t meet your needs, you quit paying attention to it unless something comes up – in this case the invitation from Brenda.
I’m also going to the Joint Statistical Meetings for the first time ever. I even got roped into being a discussant on a panel. (Brenda, again.) JSM is another thing I have been aware of forever, but there were always plenty of other conferences to attend and present at. SAS Global Forum, the American Educational Research Association, SPSS Directions, the National Council on Family Relations (I could go through my c.v. and come up with a couple of dozen places I’ve presented over the years, and another dozen conferences I attended) always seemed more relevant to my business and the research I was doing than JSM.
Since I just re-joined ASA after *decades*, I would say that their efforts to broaden their appeal worked with me. So, yeah, if you dismissed them years ago as irrelevant, maybe you want to take a new look. From what I can see, it’s not your father’s American Statistical Association any more. Worth checking out.