Lately I have been on a roll looking at relatively less common statistical techniques: proportional hazards, survival analysis, etc.
In keeping with that, I have been taking a look at propensity score matching, fondly known as PSM by, well, by no one actually.
The problem to be solved ….
Think about some of these comparisons:
- Hospitals with special burn, cardiac or neonatal units versus general hospitals
- Public schools versus parochial, private or charter schools
- People who watch TV > 40 hours weekly versus those surfing the Internet > 40 hours
In all of these cases, and probably a lot more you can think of, there are very likely differences in certain “outcome” variables, whether it be survival in the case of hospital patients, academic achievement of students or annual income of TV versus Internet users. However, all of these comparisons also begin with groups who are already different.
For example …
You have two groups, say people who are treated at a hospital with a specialized unit for terminally ill patients and patients from another hospital without any such specialized unit. Your outcome variable of interest is whether the patient lived or died.
The simplest way to test this is a chi-square test. You compare the percentage of people who survived at St. George of Money Hospital versus Heart of Despair County Hospital. There is a problem with that, though. A simple comparison will almost always show WORSE outcomes for hospitals with special units for patients who are terminally ill, seriously burned, born extremely premature, etc. The reason is probably obvious – if you get sicker patients, they are less likely to live. If your interest is in knowing whether having a specialized unit increases your chances of survival, you would want to compare similar groups.
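To make that naive comparison concrete, here is a minimal sketch of the chi-square on a 2 x 2 table, done by hand in Python rather than in SPSS. The counts are made up purely for illustration, and the function name `chi_square_2x2` is my own. It computes the plain Pearson chi-square with no continuity correction.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b],
     [c, d]]  with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts, NOT real data: (survived, died) at each hospital.
survived_money, died_money = 120, 80        # hospital with the special unit
survived_despair, died_despair = 150, 50    # hospital without one

chi2, p = chi_square_2x2(survived_money, died_money,
                         survived_despair, died_despair)
rate_money = survived_money / (survived_money + died_money)
rate_despair = survived_despair / (survived_despair + died_despair)

print(f"Survival rate, special unit: {rate_money:.0%}")
print(f"Survival rate, no special unit: {rate_despair:.0%}")
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

With these invented numbers the hospital with the special unit looks worse, which is exactly the trap: the test says the groups differ in survival, not why.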
It isn’t as simple as just controlling for severity of condition, though. There are other variables. For example, people who are better educated, who have private insurance and who live in urban areas may all be more likely to be patients at more “elite” hospitals. Some of those factors may be related to survival as well. What we’d really like is to compare a group of people from St. Money’s that is similar to patients from Despair.
In short, certain types of people have a greater propensity to be admitted to one type of place than the other.
Enter propensity score matching — to the sounds of trumpets and wearing a cape.
In fact, the first step is to do a logistic regression analysis, and I will admit that it is not strictly necessary to wear a cape while doing so, but it would probably be more comfortable than this business suit from Filene’s that I am wearing.
Using SPSS, go to the ANALYZE menu, select REGRESSION, then select BINARY LOGISTIC. Your dependent variable will be the hospital to which the patient was admitted. Covariates are the variables, such as education, severity of illness and insurance, that you want to control. For variables that are categorical, e.g., insurance, which could be private, public (a.k.a. Medi-Cal, if it hasn’t disappeared in the latest round of state budget cuts) or none, click on the CATEGORICAL button and move those over to the “Categorical Covariates” window.
Here’s the really important part: click on SAVE and, under Predicted Values, check PROBABILITIES. That saved predicted probability of being admitted to the hospital coded as 1 is your propensity score.
This is what you are going to match on. Hence the name.
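If you would like to see what that step is doing under the hood, here is a rough sketch in plain Python rather than SPSS. Everything in it is an assumption for illustration: the patients are simulated, the variable names (`severity`, `education`, `private`, `public`) are mine, and the model is fit with simple gradient ascent instead of whatever estimation routine SPSS uses. The point is only that the "propensity score" is the predicted probability from a logistic regression of hospital on the covariates.

```python
import math
import random

random.seed(0)

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Simulated (hypothetical) patients. hospital = 1 means the hospital
# with the specialized unit, 0 means the one without.
patients = []
for _ in range(500):
    severity = random.gauss(0, 1)    # standardized severity of illness
    education = random.gauss(0, 1)   # standardized years of education
    insurance = random.choice(["private", "public", "none"])
    # Dummy-code the categorical covariate, with "none" as the reference.
    private = 1.0 if insurance == "private" else 0.0
    public = 1.0 if insurance == "public" else 0.0
    true_logit = -0.5 + 1.0 * severity + 0.5 * education + 1.0 * private + 0.3 * public
    hospital = 1 if random.random() < sigmoid(true_logit) else 0
    # The leading 1.0 is the intercept term.
    patients.append(([1.0, severity, education, private, public], hospital))

def fit_logistic(rows, steps=1500, learning_rate=0.5):
    """Fit logistic regression coefficients by gradient ascent on the
    average log-likelihood."""
    k = len(rows[0][0])
    beta = [0.0] * k
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * k
        for x, y in rows:
            p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            for j in range(k):
                grad[j] += (y - p) * x[j]
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad)]
    return beta

beta = fit_logistic(patients)

# The propensity score: each patient's predicted probability of being
# admitted to the hospital coded 1, given his or her covariates.
propensity = [sigmoid(sum(b * xi for b, xi in zip(beta, x)))
              for x, _ in patients]

for (x, hospital), score in list(zip(patients, propensity))[:5]:
    print(f"hospital={hospital}  propensity={score:.3f}")
```

Each patient ends up with one number between 0 and 1, no matter how many covariates went into the model, and that single number is what the matching is done on.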
This is step one. I would say it gets easier after this point – but it doesn’t.