# Phi coefficients, odds ratios and the F-word

February 4, 2009 | 1 Comment

Yes, I am the F-word – a feminist. I was at a faculty meeting this weekend and one of the presenters began by saying, pointing to a colleague in the audience,

*“I am sure Dr. Y knows more about this than me.”*

Several times in her presentation on analysis of assessment data she would pause and make comments such as,

*“Well, I am not very good at statistics, but this is pretty easy to understand.”*

I was a bit annoyed at her self-deprecating manner. I wanted to walk up to her and say,

*“You understand this perfectly well and I know Dr. Y, who is very smart and competent, but no more so than you.”*

Even more annoying was another presenter, also a woman, also very competent, who gave a very good presentation on assessment. Near the end of it, she said,

*“You don’t have to use numbers. For those of you who don’t do math, you can put your students in categories as having exceeded criterion, met criterion or failed. You can just put it in bullet points.”*

**For those of you who don’t do math …. ????**

**What the hell?** This is a university faculty meeting; 99% of the people in the room have graduate degrees and at least three-fourths of them have Ph.D.’s.

Since when has it become acceptable to not be competent, particularly in math??? Would that same presenter have started a sentence with,

*“For those of you who can’t read, I have recorded this presentation as a podcast?”*

There may be some people who can’t read because they are visually impaired or have a learning disability, but we consider this a disability, not a lifestyle choice.

This particular department is overwhelmingly female, and I could not help but wonder if the same sort of statements would be made in a predominantly male department? In my admittedly non-random and non-representative experience, the answer is, “No.”

So, first of all, for all of you women (and men), who say you aren’t good at math – cut it out! That’s a lot of nonsense that some people are naturally good at math and some aren’t. It’s a lot like swimming. You aren’t born knowing how to swim and, yes, very few people will become Olympic swimmers, but the vast majority of people can learn to dive in a pool and swim a few laps. It just takes time and effort to practice.

Let’s start with the phi coefficient. I blatantly stole this table from the Children’s Mercy Hospital website because I thought it was very well-explained and easy to understand – until I realized that it wasn’t and I only understood it because I already knew exactly how to calculate a phi coefficient. However, not one to let any act of larceny go to waste, I used it anyway.

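The formula for Phi, with the four cells of a 2×2 table labeled a and b across the top row and c and d across the bottom row, is

$$\phi = \frac{ad - bc}{\sqrt{(a+b)(c+d)(a+c)(b+d)}}$$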

Notice that Phi compares the product of the diagonal cells (a*d) to the product of the off-diagonal cells (b*c). The denominator is an adjustment that ensures that Phi is always between -1 and +1.

Let me explain this a little better. We have two categorical variables: gender, coded 1 = female, 2 = male, and “Did you eat today?”, coded 0 = no, 1 = yes.

In our table below, you can see that there is zero correlation between gender and if you ate today, as males and females are both equally likely to have had something to eat.

| Gender \ Ate today? | NO | YES | TOTAL |
|---|---|---|---|
| Female | 10 | 90 | 100 |
| Male | 10 | 90 | 100 |
| Total | 20 | 180 | 200 |

When we subtract (10*90) – (10*90), the numbers are obviously the same, so we get zero. There is zero relationship. In the formula above, a, b, c & d are the numbers in each cell.
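To make the arithmetic concrete, here is a minimal Python sketch of the calculation. The function name `phi_coefficient` and the a, b, c, d layout (top row, then bottom row) are my own labeling for illustration, not something taken from the Children’s Mercy Hospital page, and the sketch assumes no row or column total is zero.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table laid out as [[a, b], [c, d]].

    Assumes every row total and column total is greater than zero.
    """
    # Diagonal product minus off-diagonal product
    numerator = a * d - b * c
    # Square root of the product of the two row totals and two column totals
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# "Ate today?" table: Female (10 no, 90 yes), Male (10 no, 90 yes)
phi_ate = phi_coefficient(10, 90, 10, 90)
print(phi_ate)  # 0.0
```

The denominator here is the square root of 100*100*20*180, which is 6,000, but it doesn’t matter because zero divided by anything is still zero.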

So, we have mathematically shown that there is no relationship between gender and whether one eats or not. Let’s try another question, “Did you do the dishes?” This time, we get the following results:

| Gender \ Washed Dishes? | NO | YES | TOTAL |
|---|---|---|---|
| Female | 10 | 90 | 100 |
| Male | 90 | 10 | 100 |
| Total | 100 | 100 | 200 |

Let’s look at the phi coefficient again.

10*10 – 90*90 = 100 – 8100 = -8,000

For the denominator, we multiply the two row totals and the two column totals: 100*100*100*100 = 100,000,000 and the square root of that is 10,000

So, our phi coefficient is -8,000/10,000 or -.80. That is a pretty high correlation, considering that the coefficient ranges from -1 to +1. A negative coefficient means that those who are lower on one variable (1 = female, 2 = male) are more likely to be higher on the other variable (0 = did not do the dishes, 1 = washed dishes).
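If the part about “lower on one variable, higher on the other” seems slippery, it may help to know that phi for a 2×2 table is the same number you get from an ordinary Pearson correlation on the two coded variables. The sketch below rebuilds the 200 made-up respondents from the dishes table and computes the correlation by hand. The helper `pearson_r` and the variable names are my own, for illustration only.

```python
import math

# Rebuild the 200 made-up respondents from the "Washed Dishes?" table.
# gender: 1 = female, 2 = male; washed: 0 = no, 1 = yes
cells = {(1, 0): 10, (1, 1): 90, (2, 0): 90, (2, 1): 10}
gender, washed = [], []
for (g, w), count in cells.items():
    gender += [g] * count
    washed += [w] * count

def pearson_r(x, y):
    """Plain Pearson correlation, written out so nothing is hidden."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)

r = pearson_r(gender, washed)
print(round(r, 2))  # -0.8

# Recode gender as 1 = male, 2 = female and only the sign changes
r_recoded = pearson_r([3 - g for g in gender], washed)
print(round(r_recoded, 2))  # 0.8
```

The sign depends entirely on which group got the higher code, so the size of the coefficient tells you how strong the relationship is and the coding tells you which direction to read it.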

So, our conclusion is that, while women are no more likely to eat each day than men, they are significantly more likely to do the dishes, and I have data that I just made up to prove it. My daughter, Maria, tells me that any married woman knows that without the need for statistics.

Why did I just go into this in such detail, all about one coefficient? Because I think a big part of the reason that many people don’t learn math is that there are so many assumptions that we can “just skip over this”. In fact, the reason I liked the Mercy Hospital site is that it did not start out with

$$\phi = \frac{n_{11}n_{00} - n_{10}n_{01}}{\sqrt{n_{1+}\,n_{0+}\,n_{+0}\,n_{+1}}}$$

and assume that everyone knew what marginal distributions and array subscripts meant, because, I can guarantee you, they don’t.

Sheila Tobias wrote a really interesting book about teaching and learning science, the title of which is “They’re not dumb, they’re different”.

Maybe, but I guarantee you that part of the problem is that they’re not clairvoyant. No one was born knowing that n10 means the number in the cell where the row value = 1 and the column value = 0. It doesn’t help that at other times that same cell would be represented as n11, meaning the first row and first column.
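Here is a small sketch of those two ways of pointing at the same cell, using the dishes table. The dictionary names are hypothetical, chosen only to show the difference between subscripts that mean coded values and subscripts that mean positions.

```python
# "Washed Dishes?" table from the post, stored two ways.

# 1) Keyed by the coded values: counts_by_value[gender][washed]
#    gender: 1 = female, 2 = male; washed: 0 = no, 1 = yes
counts_by_value = {1: {0: 10, 1: 90}, 2: {0: 90, 1: 10}}

# 2) Keyed by position: counts_by_position[row][column], counting from 1
counts_by_position = {1: {1: 10, 2: 90}, 2: {1: 90, 2: 10}}

# "n10" in value-based notation: females (1) who did not wash dishes (0)
n_10 = counts_by_value[1][0]
# "n11" in position-based notation: first row, first column
n_11 = counts_by_position[1][1]
print(n_10, n_11, n_10 == n_11)  # 10 10 True

# A marginal total such as "n1+" is just a row added up across its columns
n_1_plus = sum(counts_by_value[1].values())
print(n_1_plus)  # 100
```

Same cell, same 10 people, two different subscripts. Nothing about the number changed, only the convention for naming it.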

If you can make that switch in your mind easily, it is no doubt because you, like me, have looked at thousands of matrices and had that notation explained to you so long ago that, like learning to swim, you probably can’t even remember it. The secret to being good at math is the same as being good at swimming – practice!

Completely random fact – in my misspent youth, I was the first American to win the world championships in judo. If you type judo blog into Google, the first of 3,000,000+ pages that comes up is mine. And my most recent judo blog was on outliers and practice. It is rather unusual when the two halves of my split personality come together.

As to odds ratios, I have more to say about those, but it is 1:30 a.m. and I have to get up in 7 1/2 hours to go to work, so that will have to wait until another day.

# Comments

1 Comment so far

You have written about measuring the relationship between binary variables more than once, and have mentioned different summaries of this relationship. I am curious as to your thoughts on the use of entropy (or measures based on entropy, such as the uncertainty coefficient), as these seem unpopular in “traditional” statistics.

On an unrelated note: “For those of you who don’t do math …. ???? What the hell?” Yes!!! I could not agree more with your comments on that subject.