Interesting thoughts from JSM

By AnnMaria De Mars, July 30, 2012

Several interesting random thoughts from JSM:

In a session by Freeda Cooner entitled “Bayesian statistics are powerful, not magical”, ways in which Bayes results could be slanted (one hopes unwittingly) were discussed. One point worth repeating is that the validity of the results assumes accurate priors. Kind of obvious, no? Yet the question is, where did you get your prior probabilities? Did you base them on studies of this drug in adults when your current study is of children? Did you base them on studies of a “similar” drug when this is a study of a new drug?

As I said, when you think about this point, it is kind of obvious but I suspect people don’t think about it often enough.
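To make the point about priors concrete, here is a minimal sketch using a conjugate Beta-Binomial model. Everything in it is an assumption for illustration: the counts are invented, not taken from any study mentioned at JSM, and `posterior_mean` is just a helper name I made up.

```python
def posterior_mean(prior_a, prior_b, successes, failures):
    """Posterior mean of a response rate under a Beta(prior_a, prior_b) prior."""
    post_a = prior_a + successes
    post_b = prior_b + failures
    return post_a / (post_a + post_b)

# Hypothetical data: 6 of 20 children respond to the drug.
successes, failures = 6, 14

# Prior borrowed from (made-up) adult studies: about 70% response,
# carrying the weight of 50 earlier patients.
adult_based = posterior_mean(35, 15, successes, failures)

# Flat Beta(1, 1) prior that claims almost no earlier knowledge.
flat = posterior_mean(1, 1, successes, failures)

print(f"adult-based prior: {adult_based:.3f}")
print(f"flat prior:        {flat:.3f}")
```

Same data, very different answers. If the adult studies do not really describe children, the first estimate is mostly a statement about adults.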

A second interesting point was made by Milo Schield about “causal heterogeneity”. When we test a new treatment, we like to think that those who survive do so because of the treatment (“saved”) and those who die do so as a result of the failure of the treatment (“killed”). That is, we act as if there are only two categories. In reality, he says, there are four groups. In addition to the “saved” and “killed” groups, there are those who would have lived regardless (“immune”) and those who would have died regardless (“doomed”).
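Here is a small sketch of why the four groups matter. It assumes a potential-outcomes reading that is mine, not necessarily Schield's exact wording: “saved” patients live only if treated, “killed” patients live only if untreated, “immune” always live and “doomed” always die. The counts per 100 patients are invented.

```python
def observed_survival(saved, killed, immune, doomed):
    """Return (treated, untreated) survival rates for a given mix of the four groups."""
    total = saved + killed + immune + doomed
    treated = (saved + immune) / total      # saved and immune live when treated
    untreated = (killed + immune) / total   # killed and immune live when untreated
    return treated, untreated

# Two hypothetical populations, given as counts per 100 patients.
mix_a = observed_survival(saved=20, killed=0, immune=50, doomed=30)
mix_b = observed_survival(saved=40, killed=20, immune=30, doomed=10)

print("mix A (treated, untreated):", mix_a)
print("mix B (treated, untreated):", mix_b)
```

Both mixes produce the same survival rates in the treated and untreated arms, even though in the second one the treatment harms a fifth of the patients. The arm-level comparison only identifies the difference between the “saved” and “killed” shares, not the shares themselves.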

Another point by Schield was that although we always say that correlation does not mean causation, we almost always give examples of confounding variables. We say, for example, that although ice cream sales go up along with violent crime, eating ice cream doesn’t cause you to go after your neighbor with a baseball bat, unless perhaps your neighbor is spied eating ice cream off of your spouse. However, when we look at probability in terms of p-values, we are really spending most of our introductory statistics courses testing whether or not observed relationships are a coincidence, so we should emphasize guarding against coincidence more than confounding.
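A quick way to show students what “coincidence” means is to simulate it. The sketch below is my own illustration, not something presented in the session: it draws pairs of completely unrelated variables and counts how often their correlation clears the usual two-tailed .05 cutoff for ten observations (|r| > 0.632). The function names, the seed and the number of simulated studies are arbitrary choices.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def chance_hit_rate(n_obs=10, n_studies=2000, cutoff=0.632, seed=1):
    """Share of simulated studies where unrelated variables look 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        xs = [rng.gauss(0, 1) for _ in range(n_obs)]
        ys = [rng.gauss(0, 1) for _ in range(n_obs)]  # independent of xs by construction
        if abs(pearson_r(xs, ys)) > cutoff:
            hits += 1
    return hits / n_studies

rate = chance_hit_rate()
print(f"'significant' correlations found by pure chance: {rate:.1%}")
```

The rate should land near 5%, with no confounder anywhere in sight. That is the thing a p-value guards against, and it is a different problem from ice cream and baseball bats.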

Personally, I think I do talk about this a lot, so if you do not, feel shame.

Another really interesting idea came from Chris Fonnenbeck. He was discussing putting his code up on github and I thought,

“Why don’t I do that? Why don’t other people in our company?”

I hate to admit that we had just never taken the time to do it, for which I now feel guilt because I do look for code on github occasionally and, more often, just browse it looking for interesting ideas, or for the hell of it.

Speaking of @fonnenbeck, I met both him and @randomjohn from twitter tonight. I feel smarter just having been around them.

4 Comments

Josh M says:

July 30, 2012 at 11:16 pm

Do you spend much/any time on Monte Carlo methods? One of the more amusing things I’ve done in stats/probability is to implement Monte Carlo solutions for most of “Fifty Challenging Problems in Probability” (by Mosteller). His solutions are involved, beautiful, nuanced applications of basic probability. My solutions are a couple dozen ugly lines of Ruby. Yet, with enough iterations*, they converge to many decimal places.

It seems like this is something that we don’t drag enough students through, since it’s often a more effective/efficient solution for much of the “statistics” work we do in industry (I’m in machine learning/“data science”).

But, I may only be mentioning this because we just hired a guy with a fresh Master’s degree, and I’ve had to drag him kicking and screaming into just writing simulations, instead of spending hours at the whiteboard.

* How many iterations are needed to get a certain level of precision? Well, that’s just a matter of running a lot of runs of simulations, and seeing how they settle out! It’s dice rolls all the way down.
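For readers who have not tried this, here is a minimal sketch of the approach, in Python rather than Ruby. The birthday problem is used only because it is familiar and has an exact answer to check against; the function names and the seed are my own choices, not anything from the comment above.

```python
import random
from math import sqrt

def shared_birthday(group_size, rng):
    """Simulate one room; True if at least two people share a birthday."""
    seen = set()
    for _ in range(group_size):
        day = rng.randrange(365)
        if day in seen:
            return True
        seen.add(day)
    return False

def estimate(group_size, trials, seed=0):
    """Monte Carlo estimate of P(shared birthday) and its standard error."""
    rng = random.Random(seed)
    hits = sum(shared_birthday(group_size, rng) for _ in range(trials))
    p_hat = hits / trials
    std_error = sqrt(p_hat * (1 - p_hat) / trials)
    return p_hat, std_error

def exact(group_size):
    """Closed-form answer, used here only to check the simulation."""
    p_distinct = 1.0
    for k in range(group_size):
        p_distinct *= (365 - k) / 365
    return 1 - p_distinct

for trials in (1_000, 10_000, 100_000):
    p_hat, se = estimate(23, trials)
    print(f"{trials:>7} trials: {p_hat:.4f} +/- {se:.4f} (exact {exact(23):.4f})")
```

The standard error shrinks with the square root of the number of trials, so each extra decimal place of precision costs roughly 100 times as many runs. That gives a rough answer to the footnote's question without needing simulations of simulations.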

Annmaria says:

August 1, 2012 at 10:43 pm

That does sound really fun, but sadly it is one of the many things I don’t have time for at the moment.

John Johnson says:

August 4, 2012 at 2:32 pm

Awwww, thank you! I wish you could have been at Rick Wicklin’s roundtable!

A couple of random thoughts:

1. Many intro stats classes would be better off if shown (and allowed to discuss) this:
http://dilbert.com/strips/comic/2011-11-28/

2. The addition of “immune” and “doomed” groups is part of a discussion by Angrist and Rubin (principal stratification) that appears to be catching on, but that I don’t quite understand well enough. It’s related to the notion of causality by Pearl (in one camp) and Rubin (in another, hopefully reconcilable with Pearl’s) that I’ve really had to study for a couple of different applications (cancer vaccines in one area, and observational research in another) and still don’t understand well enough.

John Johnson says:

August 4, 2012 at 2:38 pm

One more thought — the thing I like about Bayesian statistics is that you can characterize how bullheaded you have to be to reject a study’s results. The smaller the variance on your prior, the more you believe it, and the harder the likelihood has to work to overcome it. On the flip side, if you put a huge variance on your prior, you don’t formally show faith in it, and the posterior looks more like the likelihood. There is this interesting theory that characterizes priors in terms of an effective sample size they represent. If you don’t have true prior data or some justification for your prior’s effective sample size, you are placing too much faith in it. It’s not a perfect system, but a useful way of recasting priors in a language that makes you (hopefully) stop and think about it.
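The effective sample size idea is easy to see in the simplest conjugate case. The sketch below assumes a normal prior on a mean and normally distributed data with a known variance `sigma2`; that setup, the function name and all the numbers are assumptions chosen for illustration, not something from the comment above.

```python
def normal_posterior(prior_mean, prior_var, data_mean, n, sigma2):
    """Posterior for a normal mean with known data variance sigma2.

    Returns (prior effective sample size, posterior mean, posterior variance).
    """
    prior_ess = sigma2 / prior_var  # the prior counts as this many observations
    post_mean = (prior_ess * prior_mean + n * data_mean) / (prior_ess + n)
    post_var = sigma2 / (prior_ess + n)
    return prior_ess, post_mean, post_var

# Hypothetical study: 25 patients, observed mean effect of 5, data variance 100.
n, data_mean, sigma2 = 25, 5.0, 100.0

# Bullheaded prior: centered on "no effect" with a small variance.
tight = normal_posterior(prior_mean=0.0, prior_var=1.0,
                         data_mean=data_mean, n=n, sigma2=sigma2)

# Open-minded prior: same center, huge variance.
vague = normal_posterior(prior_mean=0.0, prior_var=10_000.0,
                         data_mean=data_mean, n=n, sigma2=sigma2)

print("tight prior (ESS, posterior mean, posterior var):", tight)
print("vague prior (ESS, posterior mean, posterior var):", vague)
```

In this made-up case the tight prior acts like 100 earlier patients, so 25 real ones only move the estimate from 0 to 1. The vague prior acts like a hundredth of a patient, and the posterior mean sits almost on top of the observed 5. Asking “do I really have 100 patients' worth of evidence for this prior?” is the stop-and-think moment.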
