The Emperor’s New Statistics

I had the pleasure of attending a lecture Rand Wilcox gave on the state of research. He was far more amusing than I expected from a statistician (perhaps this reflects low self-esteem on my part). He made the very valid point that all statisticians learn early in their careers that the general linear model makes certain assumptions, like a normal distribution, measurement without error (give me a break!) and homoscedasticity. In fact, there is a very well-written summary in an electronic journal article entitled Four Assumptions of Multiple Regression That Researchers Should Always Test. (The authors, being far less given to snarky comments than I am, did not add the parenthetical, “But you never do, do you?”) That was one of Wilcox’s points: SO often, analyses are conducted by people who never test even the simplest assumptions.

My favorite comment, though, was

“Anyone who thinks they know all of statistics is certifiably insane.”

This is becoming more and more true. I remember chugging along happily in graduate school doing two- and three-way ANOVAs. Then, all of a sudden, if you were going to do an ANOVA with, say, ten schools and compare the impact of whole language versus phonics, you had to do a mixed model and specify curriculum type as a fixed effect and school as a random effect. If you did a regular two-way Analysis of Variance, it was WRONG (beatings with bamboo sticks for you). If you switched from a two-way fixed effects model to a mixed model in this case, it was more correct. However, did your results turn out dramatically different? Well, actually, no. Slightly different.

Over the years, I have seen the number of statistical software procedures grow dramatically, from those written by Stata users to SPSS add-ons to whole categories of SAS procedures, e.g. Bayesian. What I have NOT seen is a practical increase in the usefulness of our predictions.

From terrorist attacks to volcano eruptions to financial market crises to mortgage prices to unemployment rates, our predictions are so-so in the short term (they often amount to no more than “pretty much like now”, since all the predictors are pretty much like now), not very helpful at all in the long run, and most effective when viewed in reverse. For example, Mashable tells me that if, instead of paying $3,000 for a G4 Powerbook back in 2002, I had invested that money in Apple stock, I would now have $94,000. Am I the only one who is thinking,

“This prediction would have been a lot more useful in 2002?”

Another thing I have NOT noticed is more understanding of statistics by the general public (unless the words “more” and “understanding” mean the exact opposite of what I think they mean). This commentary by Bill Maher uses the tea partiers as an example, but would apply to just about any group in America (and a whole lot of the rest of the world, too). Maher points out the complete impossibility of cutting taxes, maintaining services and reducing the deficit all at the same time. He notes that Americans want to cut spending rather than increase taxes but when asked what they want to cut spending on, their usual answer is “Nothing”.

Let’s talk about cutting federal spending: 14% of the budget goes to Medicare, 20% to Social Security and 7% to veterans and federal retirees. To be blunt, 41% of the budget is going to old people, of whom we are getting more as the population as a whole ages. Another 6% is going to interest payments on our national debt, which we can’t exactly decide not to pay, and another 20% is defense spending. So, now we are up to 67%, or two-thirds of the budget. (These statistics are from the Center on Budget and Policy Priorities.) And yet, every time I turn on the radio or television, I am bombarded with commentators telling me that the problem is government “pork” and welfare, and that we need smaller government. And yet, again, those same people are not arguing that we should decrease Social Security, Medicare or defense spending. Just how much do they think government spending can be reduced by cutting the 2% that is spent on scientific and medical research? (Answer: At most, 2%. It wasn’t a trick question.)
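For anyone who would like to check my addition, here is a minimal sketch in Python that adds up the shares quoted above. The percentages are the ones in this post. The variable and function names are made up for illustration, and the "everything else" figure is simply 100 minus the running total, not a number taken from the budget itself.

```python
# Shares of federal spending (percent), as quoted in the post.
shares = {
    "Medicare": 14,
    "Social Security": 20,
    "Veterans and federal retirees": 7,
    "Interest on the national debt": 6,
    "Defense": 20,
}
research_share = 2  # scientific and medical research, as quoted in the post


def running_totals(items):
    """Return (category, cumulative percent) pairs in insertion order."""
    total = 0
    result = []
    for name, pct in items.items():
        total += pct
        result.append((name, total))
    return result


totals = running_totals(shares)
committed = totals[-1][1]    # share taken by the five categories above
remaining = 100 - committed  # assumption: everything else, by subtraction

for name, cumulative in totals:
    print(f"{name:32s} running total: {cumulative}%")
print(f"Everything else: {remaining}%")
print(f"Most you can save by cutting research: {research_share}%")
```

The running total after the first three categories is the "old people" figure, and the last line is the answer to the not-a-trick question.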

As statisticians, we are getting better and better at impressing each other with how smart we are. Maybe we are even getting better at impressing the general public, when they think about us at all.

Many years ago, I decided that my role as a teacher was not to leave the class impressed with how much _I_ know but to leave them knowing and understanding more themselves.

I’m not sure we’ve made progress in that direction.
