When acceptance is really rejection: Death by Green Pants

By AnnMaria De Mars, June 12, 2009

The model is non-significant, therefore my theory is supported.

Huh?

Just when you thought it was safe to get back into statistics… It took you two years of graduate school but now you have it down. P-value low = good, relationship detected, publication, tenure, Abercrombie & Fitch models at your feet.

P-value = high, no relationship, no publications, no money, dating the creepy guy next door.

Enter Hosmer to screw things up.

There are a whole bunch of reasons you might want to do a logistic regression (no, I’m serious). If you want to predict a categorical dependent variable like death, drop-out or watching Afghan Star. If you were going to do a propensity score match you would start with logistic regression. If you plain can’t think of anything else to do with your evenings.

The first thing would be to see whether your dependent variable has a relationship with your grouping variable, or whether you really are wasting your time. Okay, now that is settled, you have found that people seen in hospitals with Intensive Care Units are more likely to die than those seen at other hospitals.

You also want to see if the variables on which they differ have anything to do with the outcome. For example, I ran an analysis where I coded their favorite colors of pants – blue, brown, white, black or green pants (seriously, who buys green pants?). People who went into intensive care were more likely to own green pants. To test if this is significant, I run a logistic regression with death as the outcome variable and pants color as the predictor.

In SPSS you go to ANALYZE > REGRESSION > BINARY LOGISTIC

So, the Hosmer and Lemeshow Test is statistically significant with a chi-square of 349.06, df = 4 and p < .001. Is that exciting? Do I immediately publish an article on “The American Apparel Effect” and how poor fashion taste is dangerous to your health?

Not so fast. You see, Hosmer & Lemeshow tests the Goodness of Fit of the model predictions to the observed data. If you reject the hypothesis that your model fits the data, that is bad!
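For the curious, here is roughly what that test is doing under the hood. This is a minimal Python sketch, not what SPSS actually runs: the function name `hosmer_lemeshow`, the group count and the toy data are all made up for illustration. It sorts cases by predicted probability, splits them into groups, and compares observed with expected counts of deaths and survivors in each group. The degrees of freedom are the number of groups minus two.

```python
def hosmer_lemeshow(y_true, p_hat, groups=10):
    """Return (chi_square, df) for a Hosmer-Lemeshow style goodness-of-fit check.

    y_true: list of 0/1 outcomes (1 = died)
    p_hat:  list of predicted probabilities from the logistic model
    Assumes no group has an expected count of exactly 0 (that would divide by zero).
    """
    pairs = sorted(zip(p_hat, y_true))  # order cases by predicted risk
    n = len(pairs)
    chi_square = 0.0
    used = 0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected_events = sum(p for p, _ in chunk)
        observed_events = sum(y for _, y in chunk)
        expected_non = size - expected_events
        observed_non = size - observed_events
        # Pearson-style contribution from events and non-events in this group
        chi_square += (observed_events - expected_events) ** 2 / expected_events
        chi_square += (observed_non - expected_non) ** 2 / expected_non
        used += 1
    return chi_square, used - 2


# Hypothetical toy data: three risk levels, ten people each.
p_hat = [0.2] * 10 + [0.5] * 10 + [0.8] * 10

# Outcomes that match the predictions: 2, 5 and 8 deaths per group.
y_good = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
# Outcomes that contradict the predictions: 8, 5 and 2 deaths per group.
y_bad = [1] * 8 + [0] * 2 + [1] * 5 + [0] * 5 + [1] * 2 + [0] * 8

chi2_good, df_good = hosmer_lemeshow(y_good, p_hat, groups=3)
chi2_bad, df_bad = hosmer_lemeshow(y_bad, p_hat, groups=3)
print(chi2_good, df_good)
print(chi2_bad, df_bad)
```

A chi-square near zero means the predictions track the data. A big one, like the second call produces, means the model is missing badly, which is exactly the situation where the test comes out significant.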

In my next logistic regression, I used age over 65 as a dichotomous variable. My second variable was the Dr. MechOth scale. Dr. MechOth (not her real name) was a friend of mine when I was a young Assistant Professor who occasionally hung out in bars. Dr. MechOth rated all men on a 1 to 3 scale, where 1 = “Yes”, 2 = “Maybe if I was drunk” and 3 = “I couldn’t get drunk enough”.

The results of the Hosmer & Lemeshow test, shown below, with chi-square = 4.52, df = 3, p > .20, are non-significant, so we cannot reject the hypothesis that the model fits the data. The fit is acceptable, although it could be better.

[Figure: SPSS output for the Hosmer and Lemeshow test]
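If you want to check those p-values yourself without SPSS, the chi-square tail probability for whole-number degrees of freedom can be computed with nothing but Python's standard library. The helper name `chi2_sf` is my own invention; the two statistics fed into it are the ones reported in this post.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) times a finite sum of y**k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= y / k
            total += term
        return math.exp(-y) * total
    # Odd df: start from the df = 1 case and add one term per extra two df
    total = math.erfc(math.sqrt(y))
    term = math.sqrt(y) * math.exp(-y) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= y / (k + 0.5)
    return total


# Green pants model: chi-square = 349.06, df = 4
p_green_pants = chi2_sf(349.06, 4)
# Age and Dr. MechOth model: chi-square = 4.52, df = 3
p_age_mechoth = chi2_sf(4.52, 3)

print(p_green_pants)  # far below .001, so the fit is rejected
print(p_age_mechoth)  # a little above .20, so the fit is not rejected
```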

Does this mean that in logistic regression high p-values are always a good thing? Nope, that would be too easy for you to remember. In fact, no sooner have we inverted our understanding of p-values than it is time to do it again. When interpreting the COEFFICIENTS, a low p-value is a good thing. So, which of Dr. MechOth’s groups one is in, and being really, really old, are both related to the probability of death.

[Figure: SPSS output for the logistic regression coefficients]
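For the coefficients, the usual check is a Wald test: divide the coefficient by its standard error, square it, and compare the result to a chi-square with one degree of freedom. Here is a small sketch of that arithmetic. The coefficient and standard error below are hypothetical numbers I picked for illustration, not values from my output, and `wald_test` is just a name I made up.

```python
import math


def wald_test(b, se):
    """Return (wald, p_value, odds_ratio) for one logistic regression coefficient."""
    wald = (b / se) ** 2
    # Upper tail of a chi-square with 1 df, same as a two-sided z test
    p_value = math.erfc(math.sqrt(wald / 2.0))
    odds_ratio = math.exp(b)
    return wald, p_value, odds_ratio


# Hypothetical coefficient and standard error, for illustration only
wald, p_value, odds_ratio = wald_test(b=0.9, se=0.3)
print(wald, p_value, odds_ratio)
```

Here a LOW p-value is the good news: it says the predictor is related to the outcome, and the odds ratio tells you by how much the odds change for each one-unit step on that predictor.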

Sadly, my original hypothesis about death by green pants is not supported and all I have discovered is that if you are really, really old and no one would go home from a bar with you if you were the last person on earth, you are more likely to keel over dead from natural causes or suicide, whichever comes first, than hot, young people.

I do not think I will be winning the Nobel Prize for Medicine any time soon. I wonder if that guy next door likes Cup-A-Noodle soup.

2 Comments

Dave Houg says:

July 15, 2009 at 10:07 am

Which is cause and which is effect is still a tough call even after finding a great correlation.

Or even IF there is a cause and effect. If (pulled out of thin air) 97% of traffic accident victims had eaten a pickle the week before, would that tell you anything? Of course it might tell you it was summertime, but that is about it.

PS Statistically speaking: half of all brain surgeons are below average in skill & competence!!!
