Before you even THINK about propensity score matching

By AnnMaria De Mars | August 27, 2012 | April 26, 2017

Propensity score matching has had a huge rise in popularity over the past few years. That isn’t a terrible thing, but in my not so humble opinion, many people are jumping on the bandwagon without thinking through whether this is what they really need to do.

The idea is quite simple – you have two groups which are non-equivalent, say, people who attend a support group to quit being douchebags and people who don’t. At the end of the group term, you want to test for a decline in douchebaggery.

However, you believe that people who don’t attend the groups are likely different from those who do in the first place: bigger douchebags, younger, and, it goes without saying, more likely to be male.

The very, very important key phrase in that sentence is YOU BELIEVE.

Before you ever run a propensity score matching program, you should test that belief and see if your groups really ARE different. If not, you can stop right now. You’d think doing a few ANOVAs, t-tests or cross-tabs in advance would be common sense. Let me tell you something: common sense suffers from false advertising. It’s not common at all.
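As a rough sketch of what that advance check might look like, here is a small Python example using only the standard library. The ages are made up for illustration, and the helper names (`standardized_mean_difference`, `welch_t`) are hypothetical, not from any particular package. It computes a standardized mean difference and a Welch t statistic for one covariate; in practice you would repeat this (or a cross-tab, for categorical variables) for every variable you believe differs.

```python
import math
import statistics

def standardized_mean_difference(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt((var_a + var_b) / 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

def welch_t(group_a, group_b):
    """Welch's t statistic (unequal variances). No p-value is computed here."""
    se = math.sqrt(statistics.variance(group_a) / len(group_a)
                   + statistics.variance(group_b) / len(group_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se

# Hypothetical ages, invented purely for illustration.
attended = [34, 41, 29, 38, 45, 36, 40, 33]
did_not_attend = [35, 39, 31, 37, 44, 34, 42, 32]

smd = standardized_mean_difference(attended, did_not_attend)
t = welch_t(attended, did_not_attend)
print(f"SMD = {smd:.2f}, Welch t = {t:.2f}")
```

With numbers like these, the two groups are nearly identical on age, which is exactly the situation where you can stop right now. A commonly cited rule of thumb treats an absolute standardized difference below about 0.1 as negligible, but that cutoff is a convention, not a law.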

Even if there are differences between the groups, they may not matter unless they are related to your dependent variable, in this case, the Unreliable Measure of Douchebaggedness.

Say, for example, that you find that your subjects in the support group are more likely to eat grapefruits for breakfast, live on even-numbered streets and own a parrot. Even though I’d be a little suspicious of anyone who gets up early enough to eat breakfast, if it turns out that none of those variables are related to how big of a douchebag you are, there is no point in doing a propensity score match.
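One simple way to look at this, sketched below in Python with the standard library only, is to correlate each candidate covariate with the dependent variable. All of the data here are invented, and `pearson_r` is a hypothetical helper written out by hand. A covariate that barely correlates with the outcome (parrot ownership, in this made-up data) is a poor reason to match, while one that tracks the outcome closely (age, here) is worth worrying about.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data for eight subjects, invented purely for illustration.
age = [22, 25, 31, 35, 40, 44, 52, 58]
owns_parrot = [1, 0, 0, 1, 1, 0, 0, 1]      # 1 = owns a parrot
douchebaggery = [9, 8, 8, 6, 5, 5, 3, 2]    # the dependent variable

r_age = pearson_r(age, douchebaggery)
r_parrot = pearson_r(owns_parrot, douchebaggery)
print(f"age vs. outcome:    r = {r_age:.2f}")
print(f"parrot vs. outcome: r = {r_parrot:.2f}")
```

With a sample this tiny the numbers mean nothing on their own; the point is only the shape of the check.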

Finally, and perhaps most obvious and most frequently overlooked, if your dependent variable is not measured reliably, no amount of statistical hocus-pocus is going to make anything predict it. (Short explanation – an unreliable measure is one that has a large proportion of error variance. Error variance is, by definition, random. Random error is not going to be related to anything. Imagine that every student just colored in the bubbles in the test at random. Now imagine trying to predict the test scores with any variable. Not happening. I think all students SHOULD color in the test sheets at random. I did once. The school psychologist told me I was mentally retarded. She was wrong.)
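The bubble-coloring thought experiment is easy to simulate. The sketch below is a toy model under stated assumptions, not an analysis of real test data: a hypothetical 40-item, four-option test, 500 simulated students, and a made-up "ability" number that drives the chance of a correct answer only for students who actually try. The helper names are hypothetical.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation, written out so only the standard library is needed."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

rng = random.Random(2012)  # fixed seed so the sketch is repeatable
N_STUDENTS = 500
N_ITEMS = 40               # hypothetical 40-item, four-option test

# Hypothetical "true ability" between 0 and 1 for each student.
ability = [rng.random() for _ in range(N_STUDENTS)]

def take_test(p_correct):
    """Count of correct answers when each item is right with probability p_correct."""
    return sum(rng.random() < p_correct for _ in range(N_ITEMS))

# Students who try: the chance of a correct answer rises with ability.
careful_scores = [take_test(0.25 + 0.7 * a) for a in ability]
# Students who color in bubbles at random: 1-in-4 on every item, ability is irrelevant.
random_scores = [take_test(0.25) for _ in ability]

r_careful = pearson_r(ability, careful_scores)
r_random = pearson_r(ability, random_scores)
print(f"ability vs. careful scores: r = {r_careful:.2f}")
print(f"ability vs. random scores:  r = {r_random:.2f}")
```

Under these assumptions, ability should predict the careful scores well and the random scores essentially not at all, which is the whole problem with an unreliable dependent variable.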

and AFTER you do propensity score matching (or anything else) …

Even after all of this, sometimes it still doesn’t work. A few years ago, I had a client who had a really logical theory and a well-designed study, yet when we ran the analyses every which way, none of the data supported their hypotheses.

At the end of it all, the client asked me what else we could do, and I said

“There isn’t anything else we can do that I would recommend. You know, sometimes the theory is just wrong.”

It reminds me of the title of a good presentation I went to at the Joint Statistical Meetings earlier this month,

“Bayesian statistics are powerful but they’re not magical”

I think that could be applied to just about any kind of statistical technique. I wish I had said it first.

4 Comments

Peter Flom says:

August 29, 2012 at 6:25 am

Nice post, but I thought the real advantage of propensity score matching was to combine the effects of a bunch of variables on which the groups likely vary into one score, thus saving a lot of degrees of freedom in the regression (of whatever type) you are doing.

It can also make the output from a regression simpler, if you aren’t interested in all those covariates.

But propensity score matching has problems if the groups are really different – that is, if there isn’t much overlap in their scores on the covariates. I saw this happen in one study of the effects of job training. The two groups had almost no overlap on education, and no overlap at all on joblessness. Propensity score analysis gave ridiculous results.

draypresct says:

August 29, 2012 at 11:54 am

Aside from the situation Peter mentions (saving degrees of freedom when your dataset is extremely limited), you shouldn’t expect propensity scores to do anything that a normal regression model controlling for the same variables wouldn’t do. If you have unmeasured confounding factors, both types of analysis are going to be biased.

If you’re dealing with a client who needs to be convinced that propensity scores have no magic powers, you might be interested in “Propensity scores: help or hype?” by Winkelmayer et al. (Nephrol Dial Transplant 2004) as a reference.

Pingback: A Beginner’s Guide to Propensity Score Matching : AnnMaria's Blog

Harold Medows says:

June 17, 2017 at 11:16 pm

I do consider all the concepts you’ve presented on your post. They’re very convincing and will definitely work. Still, the posts are very brief for starters. May you please extend them a little from next time? Thanks for the post.
