# 6 things about statistics everyone must know

*What must everyone know and why must everyone know that?*

This was one of the two central questions in Dr. J.T. Dillon’s course on Curriculum and Instruction, which I thought was going to be a colossal waste of time. After all, I was going to be a statistician, so why would I need a course on curriculum, the most boring topic on earth? As with most of the courses I thought would be a waste of time and which the university in its infinite wisdom required I take, they were right and I was wrong. (Why would someone who was planning a career as a professor need to know something about teaching? Gee, I can’t imagine!)

Usually when I am asked if I’d be available to teach a course, I say, “No”, partly because it doesn’t pay that well but mostly because I usually am NOT available unless you ask at least six months in advance. Well, they did. I haven’t taught a graduate statistics course in a few years, so I’m really looking forward to it. Most of these doctoral students will be writing a dissertation and then, for the rest of their careers, reading research and making decisions based on their evaluation of that research. None of them are mathematics or statistics majors and none are planning to do a lot of scientific research themselves. The question, then, is what do they need to learn?

1. Descriptive statistics, distributions and data visualization – In analyzing their own data they need to get a feel for it. They need to understand what the average person is like, the variance among their population, and identify the outliers.

2. Correlation and regression – A basic part of understanding and applying statistics is knowing about relationships: how you measure them and how you interpret them.

3. Group mean differences – Yes, mathematically you can couch this as a regression problem and they should probably understand a little about that. They definitely need to know how to compare groups.

4. Hypothesis testing – Understand the difference between statistical significance and practical significance. This is usually an intuitive concept when teaching physicians because they are familiar with the idea of something being clinically significant. Other professions don’t always get this so easily. To understand statistical significance, you need to know something about probability. To understand probability, it helps to know something about combinations and permutations.

5. How to analyze categorical data – Sometimes the world really does fit in neat little boxes. You have never had cancer, you have cancer now or you are in remission. You’re married or you’re not (and no, that weekend in Cabo doesn’t count, unless you were visited by the Holy Ghost). For those times you need to understand chi-square and logistic regression. Maybe a little bit of odds ratios.

6. How to use some kind of software to compute results – It would be nice to have minions to do this for you. If you have the budget you can hire someone like me. Sadly, “graduate student” is one of the lower paid occupations, so it behooves the students to learn how to do at least some of the analysis themselves.
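To make a few of these topics concrete, here is a minimal Python sketch using only the standard library. It touches on descriptive statistics, correlation, counting combinations, and a 2x2 categorical table. All of the data are invented for illustration, and the helper names (`pearson_r`, `chi_square_2x2`, `odds_ratio`) are hypothetical, not taken from any particular package. The chi-square shown is the plain Pearson statistic with no continuity correction.

```python
import math
import statistics

# Hypothetical data, invented purely for illustration.
hours_studied = [2, 4, 6, 8, 10]
exam_score = [55, 60, 70, 75, 90]

# Topic 1: descriptive statistics (center and spread).
mean_score = statistics.mean(exam_score)
sd_score = statistics.stdev(exam_score)  # sample standard deviation


# Topic 2: Pearson correlation between two variables.
def pearson_r(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(hours_studied, exam_score)

# Topic 4: combinations, e.g. ways to choose 3 people out of 10.
n_committees = math.comb(10, 3)


# Topic 5: a 2x2 table laid out as
#              outcome yes   outcome no
#   group 1         a             b
#   group 2         c             d
def chi_square_2x2(a, b, c, d):
    # Pearson chi-square for a 2x2 table, no continuity correction.
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def odds_ratio(a, b, c, d):
    # Odds of the outcome in group 1 divided by the odds in group 2.
    return (a * d) / (b * c)


chi2 = chi_square_2x2(30, 70, 15, 85)
oratio = odds_ratio(30, 70, 15, 85)

print(f"mean={mean_score}, sd={sd_score:.2f}, r={r:.3f}")
print(f"committees={n_committees}, chi2={chi2:.2f}, odds ratio={oratio:.2f}")
```

In practice students would get these numbers from a statistics package; the value of writing them out once is seeing that each statistic is a short, inspectable formula.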

So, there’s my syllabus. I still don’t have a textbook. I’m considering The Statistical Sleuth, supplemented by some other resources.

Anyone else have any ideas for topics, exercises or readings, please dive in.

I’m really looking forward to this semester. It’s going to be great!

The two things I had to teach myself/learn the hard way were Logistic Regression and how to look at Residuals. You probably won’t find a basic textbook that has those in it (the Gravetter and Wallnau book I teach out of doesn’t). However, a lot of dissertation questions use logistic regression, and even just a basic awareness of residuals will point them in the right direction. I teach SPSS in my class, and there are a few decent books out there on that topic as well. (Green and Salkind or Mallery or Yocky are all ones I’ve looked at and thought would be useful.) There are a couple of books out there that attempt to teach Statistics and SPSS at the same time, but they tend not to do either as well as I wanted.
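For anyone who has not looked at residuals before, here is a small Python sketch of the idea, assuming a simple least-squares line fit by hand. The data are made up (deliberately curved), and `fit_line` and `residuals` are hypothetical helper names.

```python
import statistics


def fit_line(x, y):
    # Ordinary least-squares slope and intercept for one predictor.
    mx, my = statistics.mean(x), statistics.mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return slope, intercept


def residuals(x, y):
    # Residual = observed value minus the value predicted by the fitted line.
    slope, intercept = fit_line(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


# Hypothetical data with a curved relationship (y = x squared).
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

res = residuals(x, y)
print(res)
```

With these numbers the residuals are positive at both ends and negative in the middle. That kind of systematic pattern is the hint that a straight line is the wrong model, which is exactly what a basic awareness of residuals buys you.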

Speaking of residuals

http://www.thejuliagroup.com/blog/?p=1523

(-:

Yes, I know people who use the Green and Salkind book. It’s pretty good.

I have to credit SAS’ Anne Milley (@annemilley) for telling me about “The Lady Tasting Tea”, more a why than a how book. Supplemental reading? http://sww.sas.com/gobot/564

Hammer in the variance concept; it doesn’t sink in easily. It’s a good exercise to sketch different distributions and let students figure out which has greater/lesser variance.
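A quick way to check answers to that kind of exercise is to compute the variance directly. This is only a sketch with two invented samples that share the same mean but differ in spread.

```python
import statistics

# Two hypothetical samples with the same mean (50) but different spread.
tight = [48, 49, 50, 51, 52]
wide = [30, 40, 50, 60, 70]

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
var_tight = statistics.variance(tight)
var_wide = statistics.variance(wide)

print(f"tight: mean={statistics.mean(tight)}, variance={var_tight}")
print(f"wide:  mean={statistics.mean(wide)}, variance={var_wide}")
```

The means are identical, so the only thing separating the two samples is how far the values sit from that mean, which is the point the sketching exercise is trying to get across.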

ANOVA, SPSS, and graphic knowledge never hurt! A good basic book is Research Methods for Business Students, fifth edition, by M. Saunders, P. Lewis, and A. Thornhill.

Pearson has a good book for aiding in graphic knowledge and SPSS called Statistical Persuasion.