
Logistic regression in pictures: Part 3

By AnnMaria De Mars, August 7, 2013

This is the third and last part of my attempt to explain logistic regression in pictures. You can see a picture of odds ratios here, and a picture of two charts of predicted probabilities, to compare models, here.

If people only know one chart associated with logistic regression, it is usually the ROC chart, though many of them cannot tell you what ROC stands for (not that it really matters) or how to interpret the chart – which kind of does matter, because it’s useful.

ROC is an abbreviation for receiver operating characteristic (I told you it didn’t matter). The ROC curve is a plot of

SENSITIVITY – the true positive rate: of the people who actually died, the percentage we predicted would die, and

SPECIFICITY – the true negative rate: of the people who did NOT die, the percentage we said would not die.
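To make those two definitions concrete, here is a minimal Python sketch. The ten outcomes and predictions are made up for illustration (they are not the Kaiser-Permanente data), and the function name is my own.

```python
def sensitivity_specificity(actual, predicted):
    """Both arguments are sequences of 1 (died) and 0 (did not die)."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    false_neg = sum(1 for a, p in pairs if a == 1 and p == 0)
    true_neg = sum(1 for a, p in pairs if a == 0 and p == 0)
    false_pos = sum(1 for a, p in pairs if a == 0 and p == 1)
    # Sensitivity: share of the people who died that we predicted would die
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of the people who did not die that we said would not die
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Hypothetical data: 4 people died, 6 did not
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(actual, predicted)
print(sens, round(spec, 3))  # 0.75 0.833
```

We caught 3 of the 4 deaths (sensitivity of .75) and were right about 5 of the 6 survivors (specificity of about .83).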

We actually plot (1 – specificity) on the X axis by sensitivity on the Y axis. If we predicted no one would die, our rate of true negatives would be 100%. Since we predicted nobody would die, we would be exactly right for all of the people who didn’t die. 1 – 1.0 = 0, so we’d be at 0 on the X axis.

On the other hand, we’d have zero sensitivity. Since we predicted no one would die, we would have zero true positives, so we’d also be at 0 on the Y axis. That puts us at the bottom left corner of the chart.

At the other extreme, if we predicted everyone would die, we would have 100% true positives and 0 true negatives. Since 1 – 0 = 1, that would be at the upper right corner of the chart.
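You can check the two extremes with a few lines of Python. This is only a sketch on seven made-up people, and `roc_point` is a hypothetical helper name, not something from any package.

```python
def roc_point(actual, predicted):
    """Return (1 - specificity, sensitivity) for 0/1 outcomes and predictions."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    true_neg = sum(1 for a, p in pairs if a == 0 and p == 0)
    n_died = sum(actual)
    n_lived = len(actual) - n_died
    sensitivity = true_pos / n_died
    specificity = true_neg / n_lived
    return (1 - specificity, sensitivity)

# Hypothetical outcomes: 3 people died, 4 did not
actual = [1, 1, 1, 0, 0, 0, 0]
nobody = [0] * len(actual)      # predict that no one dies
everybody = [1] * len(actual)   # predict that everyone dies

print(roc_point(actual, nobody))     # (0.0, 0.0), bottom left corner
print(roc_point(actual, everybody))  # (1.0, 1.0), upper right corner
```

Every ROC curve starts at one of these corners and ends at the other. What differs from model to model is the path in between.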

The straight diagonal line is what we would get without any predictor variables, if we just randomly guessed whether a person would live or die. The top left corner, where we have correctly predicted all of our positives and all of our negatives, is what we would get in a perfect model.

The more that curve is bowed toward the top left and away from the straight line, the better our model.
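If you are wondering where the points on the curve come from, here is one way to sketch it in Python. The model gives each person a predicted probability of dying. We slide a cutoff from high to low, call everyone at or above the cutoff a predicted death, and plot one point per cutoff. The eight outcomes and probabilities below are invented, and the function names are my own. The area under the curve is not something I used in these posts, but it is a common one-number summary of how bowed the curve is: about .5 for random guessing and 1.0 for a perfect model.

```python
def roc_points(actual, probs):
    """Return (1 - specificity, sensitivity) pairs, one per cutoff."""
    n_died = sum(actual)
    n_lived = len(actual) - n_died
    # Start above every score (nobody predicted to die), then lower the cutoff
    cutoffs = [float("inf")] + sorted(set(probs), reverse=True)
    points = []
    for cutoff in cutoffs:
        true_pos = sum(1 for a, p in zip(actual, probs) if a == 1 and p >= cutoff)
        false_pos = sum(1 for a, p in zip(actual, probs) if a == 0 and p >= cutoff)
        # false_pos / n_lived is the same thing as 1 - specificity
        points.append((false_pos / n_lived, true_pos / n_died))
    return points

def area_under_curve(points):
    """Trapezoid rule over consecutive points on the curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical outcomes (1 = died) and predicted probabilities of dying
actual = [1, 1, 0, 1, 0, 0, 1, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

points = roc_points(actual, probs)
auc = area_under_curve(points)
print(points[0], points[-1])  # (0.0, 0.0) (1.0, 1.0)
print(auc)                    # 0.75
```

An area of .75 sits halfway between guessing and perfection, which is what "bowed toward the top left, but with room for improvement" looks like as a number.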

Let’s take a look at our actual curve from the Kaiser-Permanente data, where we used gender, age, number of emergency room visits and nursing home residence (yes or no) to predict whether or not a person would die within the next nine years.

From this, we can conclude that our model is substantially better than random guessing, a conclusion that is consistent with what we saw in our previous charts. We can also see that there is definitely room for improvement. Perhaps future research could improve prediction by including behavioral risk indicators such as amount of alcohol and tobacco usage, as well as socioeconomic status and diagnosis of chronic illness.

So, there you have it – logistic regression in three blog posts and four pictures.

