It must be that time of year because I was asked to speak at two different schools in downtown Los Angeles this week, one elementary school and one middle school.  The Perfect Jennifer probably won the coolest teacher award for getting her younger sister, a world champion in mixed martial arts and the subject of a made-for-TV movie this summer, to come talk for career day.


 

However, after the mobs of autograph seekers had departed, there were still plenty of questions for the old mom, just as there were at the elementary school in MacArthur Park (yes, the same one of disco song and gang fame).

Here are some of my favorite questions and the answers that I gave.

Q. Were you always a math genius?

I was not a particularly good student. I got in trouble a lot for fighting and I wasn’t all THAT interested in school. I think I started being interested in math when I was in the sixth grade just because the math teacher (Sister Marion) was really nice and some of my other teachers were really mean. I mean, really mean, like throwing stuff at me. It’s true, I was an annoying child, but still. Since I liked her, I liked her class, so I studied harder for it and did better.

Q. Is your mother proud of you?

Yes, I believe she is. I’ve gotten a lot of education, started a company that does good work, been a teacher and been able to take care of my children well, so I would say, yes, she is proud of me.

Q. What do you dislike about your job?

I really had to think about this one and for a long time I could not think of anything. Then, The Perfect Jennifer reminded me that sometimes I have to go to North Dakota in the winter. That is the one thing I don’t like about my job, when I have to go somewhere it is really cold because I hate cold weather.

Q. What was your Plan B?

I had to think about that, too, for a while. I finally said that I really like being a statistician and the work that I do and if it doesn’t work out, if the grant that I’m working on now doesn’t get funded, if my game I’m working on now doesn’t sell then I think I will just try again. It’s like my daughter Ronda (who spoke earlier in the morning) said. Someone asked her in an interview once,

“You’ve won every match so far in your career with the arm bar in the first round. What are you going to do if you try the arm bar on someone one day and it doesn’t work?”

She replied,

“Well, I guess in that case, I’d probably try again.”

(In fact, if you saw her last match, that is exactly what she did.) So, I said, I think my Plan B would be to try again to succeed as a statistician.

Q. What do you like about your job?

Everything. I like traveling. I like working with really smart, nice people, which is all I work with any more, because if they are jerks, I just turn down the contract and don’t work with them. I like the fact that every project is something new: some days it’s seeing if a program works, some days it’s trying to catch fraud, other days it is teaching a class. I like the fact that I don’t have to get up before 10 o’clock in the morning.

Finally I told them,

If you don’t remember anything else I said or that anyone else said today, remember this, because it took me a long time to figure it out. Don’t EVER believe that other people are smarter than you, that they have some special kind of math brain that they can get it and you can’t, that everyone knows more than you. If they do know more than you it is just because they worked at it longer and harder and if you work long enough and hard enough you will get to the same place. Don’t believe you need to be a certain race or age or look a certain way to start a technology company and be successful. It just is not true. I used to think that way, that people who are really good at math were not people like me; certainly none of the math professors I had in college or people I saw on television talking about starting companies looked like me. None of that matters. Now I write the sort of things that I could not imagine even understanding when I was young and I toss it off like it’s nothing and it IS nothing because I’ve been doing it for twenty years. Math, martial arts, programming – anything – you just bang away at it and you get it eventually. Why do you think they call it hacking?

Last week I wrote a bit about how to get an exploratory factor analysis using Mplus. The question now is, what does that output MEAN?

First, you just get some information on the programming statements or defaults that produced your output:

INPUT READING TERMINATED NORMALLY

Exploratory Factor Analysis ;

SUMMARY OF ANALYSIS
Number of groups                                                 1
Number of observations                                         730

Number of dependent variables                                    6
Number of independent variables                                  0
Number of continuous latent variables                          0

Observed dependent variables

Continuous
Q1F1        Q2F1        Q3F1        Q1F2        Q2F2        Q3F2

Estimator                                                       ML
Rotation                                                    GEOMIN
Row standardization                                    CORRELATION
Type of rotation                                           OBLIQUE

This tells us we are analyzing all of the data as one group, and not, for example, running separate analyses for males and females. We have 730 records and six variables, all of which are continuous and listed above. The maximum likelihood method (ML) of estimation is used, along with the default rotation, GEOMIN, which is an oblique method; that is, it allows the factors to be correlated.

Here we have a list of our eigenvalues

RESULTS FOR EXPLORATORY FACTOR ANALYSIS

EIGENVALUES FOR SAMPLE CORRELATION MATRIX
           1             2             3             4             5
    ________      ________      ________      ________      ________
       1.866         1.262         0.866         0.750         0.716

EIGENVALUES FOR SAMPLE CORRELATION MATRIX
           6
    ________
       0.539
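A common rule of thumb (often called the Kaiser rule) keeps the factors whose eigenvalue is greater than 1. Here is a minimal sketch in Python, not Mplus, that applies that rule to the six eigenvalues printed above. The variable names are mine, made up for illustration.

```python
# Eigenvalues of the sample correlation matrix, copied from the Mplus output.
eigenvalues = [1.866, 1.262, 0.866, 0.750, 0.716, 0.539]

# Eigenvalue-greater-than-one rule: count the eigenvalues above 1.
n_factors_kaiser = sum(1 for ev in eigenvalues if ev > 1.0)

# Eigenvalues of a correlation matrix add up to the number of variables,
# so each one divided by the total is its share of the variance.
total = sum(eigenvalues)
proportions = [ev / total for ev in eigenvalues]

print("Factors with eigenvalue > 1:", n_factors_kaiser)
print("Share of variance:", [round(p, 3) for p in proportions])
```

The total comes out to about 6, the number of variables, which is a quick check that the values were copied correctly.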

In this case, you could go ahead with the eigenvalue greater than one rule, but let’s take a look at a couple of other statistics. First, we have the results from the one factor solution.  Here we have the chi-square testing the goodness of fit of the model

Chi-Square Test of Model Fit

Value                             96.228
Degrees of Freedom                     9
P-Value                           0.0000

We want this test to be non-significant because our null hypothesis is there is no difference between the observed data and our hypothesized one-factor model. This null is soundly rejected.

Let’s take a look at the Chi-square for our two-factor solution
Chi-Square Test of Model Fit

Value                              3.016
Degrees of Freedom                  4
P-Value                           0.5552

You can clearly see that the chi-square is much smaller and non-significant.
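If you want to check those p-values yourself, here is a rough sketch in Python using only the standard library. The function name chi2_sf is my own and this is not how Mplus computes anything internally; it simply uses the closed-form upper-tail probability of the chi-square distribution for whole-number degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum of x^j / (1*3*...*(2j+1))
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total

# Chi-square values and degrees of freedom from the two Mplus solutions.
p_one_factor = chi2_sf(96.228, 9)
p_two_factor = chi2_sf(3.016, 4)
print(f"one factor: p = {p_one_factor:.4f}")
print(f"two factor: p = {p_two_factor:.4f}")
```

The one-factor p-value is far below .05 and the two-factor value agrees with the 0.5552 in the output to about three decimal places (the printed chi-square is itself rounded).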

Let’s take a look at two other tests. The Root Mean Square Error of Approximation (RMSEA) for the one-factor solution is .115, as shown below. We would like to see an RMSEA less than .05 which is clearly not the case here.

RMSEA (Root Mean Square Error Of Approximation)

Estimate                           0.115
90 Percent C.I.                    0.095  0.137
Probability RMSEA <= .05           0.000

For the two factor solution, our RMSEA rounds to zero, as shown below

RMSEA (Root Mean Square Error Of Approximation)

Estimate                           0.000
90 Percent C.I.                    0.000  0.049
Probability RMSEA <= .05           0.954

Clearly, we are liking the two-factor solution here, yes? The eigenvalue > 1 rule (which should not be TOO emphasized) points there, as does the model fit chi-square and the RMSEA.
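The RMSEA estimates quoted above can be reproduced from the chi-square, its degrees of freedom and the sample size. This is a sketch in Python assuming one common formula, sqrt(max(chi-square - df, 0) / (df * N)); some programs divide by N - 1 instead, and I am assuming rather than quoting what Mplus does. The function name rmsea is mine.

```python
import math

def rmsea(chi_square, df, n):
    """RMSEA point estimate; set to zero when chi-square is below its df."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * n))

# Values from the one-factor and two-factor solutions, N = 730.
one_factor = rmsea(96.228, 9, 730)
two_factor = rmsea(3.016, 4, 730)

print(f"one factor RMSEA: {one_factor:.3f}")
print(f"two factor RMSEA: {two_factor:.3f}")
```

For the two-factor solution the chi-square is smaller than its degrees of freedom, which is why the estimate is zero rather than merely rounding to zero under this formula.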

In their course on factor analysis, Muthen & Muthen give this very nice example of a table comparing different factor solutions using the data

[Image: table comparing factor solutions (Mplus_EFAmodel_selection)]

They also like the scree plot, which I do, too. I also agree with them that one should never blindly follow some rule but rather have some theory or expectation about how the factors should fall out. I also agree with them in looking at multiple indicators, for example, scree plot, chi-square, RMSEA and eigenvalues.

Previously, I discussed how to do a confirmatory factor analysis with Mplus. What if you aren’t sure what variables should load on what factor? Then you are doing an exploratory factor analysis. Really, you should probably do the exploratory factor analysis first unless you have some very large body of research behind you saying that there should be X number of factors and these exact variables should load on them. If you’re analyzing the Wechsler Intelligence Scale, you probably could skip the exploratory step. For everyone else …. here is how you do an exploratory factor analysis with Mplus.

TITLE : Exploratory Factor Analysis ;
DATA:  FILE IS 'values.dat' ;
VARIABLE: NAMES ARE q1f1 q2f1 q3f1 q1f2 q2f2 q3f2 ;
ANALYSIS: TYPE = EFA 1 3 ;
ESTIMATOR = ML ;

When no rotation is specified using the ROTATION option of the ANALYSIS command, the default oblique GEOMIN rotation is used.

I explained the first three statements earlier this week.

The fourth statement is new. Like the other statements, you need to follow the ANALYSIS key word with a colon and end each statement in the command (or if you are familiar with SAS, think of it as a procedure) with a semi-colon.

TYPE = EFA 1 3 ;

Requests an exploratory factor analysis with a 1 factor solution, 2-factor solution and 3-factor solution.  Of course, depending upon your own study, you can request whatever solutions you want. This is really useful because often in an exploratory study you aren’t quite sure of the number of factors. Maybe it is two or maybe three will work better. Mplus gives you a really simple way to request multiple solutions and compare them. I’ll talk more about that in the next post.
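One way to see how far you can push the number of factors is to count degrees of freedom. The standard formula for a maximum likelihood exploratory factor model with p variables and m factors is ((p - m)^2 - (p + m)) / 2. The sketch below is Python, not Mplus, and the function name is invented for illustration.

```python
def efa_degrees_of_freedom(n_variables, n_factors):
    """Degrees of freedom for an exploratory factor model: ((p - m)^2 - (p + m)) / 2."""
    p, m = n_variables, n_factors
    return ((p - m) ** 2 - (p + m)) // 2

# The six variables in this example with the 1-, 2- and 3-factor solutions.
for m in (1, 2, 3):
    print(m, "factor(s):", efa_degrees_of_freedom(6, m), "degrees of freedom")
```

With six variables, the three-factor solution has zero degrees of freedom, so there is nothing left over to test its fit.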

ESTIMATOR = ML ;

requests maximum likelihood estimation.

If you are interested in factor analysis at all, there is a really good video on the Mplus site. Far more of it discusses exploratory and confirmatory factor analysis – methods, goodness of fit tests, equations, interpretation of factor matrix – than Mplus, which as you can see, is pretty easy, so even if you are using some other software the video is definitely worth checking out.

 

 

Being able to find SPSS in the start menu does not qualify you to run a multinomial logistic regression.

This is the kind of comment statisticians find funny that leaves other people scratching their heads. The point is that it’s not that difficult to get output for some fairly complex statistical procedures.

Let’s start with the confirmatory factor analysis I mentioned in my last post. Once you get past the standard stuff that tells you that your model terminated successfully, the number of variables and factors, you see this:

Chi-Square Test of Model Fit

Value                              8.707
Degrees of Freedom                 8
P-Value                           0.3676

The null hypothesis is that there is no difference between the patterns observed in these data and the model specified. So, unlike many cases where you are hoping to reject the null hypothesis, in this case I certainly do NOT want to reject the hypothesis that this is a good fit. As you can see from my chi-square value above, this model is acceptable.
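The 8 degrees of freedom can be counted by hand. The sketch below (Python, with names I made up) assumes the usual setup for this kind of model: the first loading on each factor is fixed at 1, the two factors are allowed to correlate, and the means are left out of the count because they add as many parameters as data points.

```python
n_observed = 6

# Unique variances and covariances among six observed variables.
n_moments = n_observed * (n_observed + 1) // 2

# Free parameters in the two-factor model (assumed defaults, see above).
free_parameters = {
    "factor loadings (first one per factor fixed at 1)": 4,
    "residual variances": 6,
    "factor variances": 2,
    "factor covariance": 1,
}

degrees_of_freedom = n_moments - sum(free_parameters.values())
print(n_moments, "moments -", sum(free_parameters.values()), "parameters =", degrees_of_freedom)
```

That matches the degrees of freedom reported with the chi-square test.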

Another measure of goodness of fit is the root mean square error of approximation (RMSEA).

RMSEA (Root Mean Square Error Of Approximation)

Estimate                           0.011
90 Percent C.I.                    0.000  0.046
Probability RMSEA <= .05           0.973

An acceptable model should have an RMSEA less than .05. You can see above that the estimate for RMSEA is .011, the 90 percent confidence interval is 0 – .046 and the probability that the population RMSEA is less than .05 is 97.3%. Again, consistent with our chi-square, the model appears to fit.

                                                Two-Tailed
                Estimate       S.E.  Est./S.E.    P-Value

F1       BY
Q1F1               1.000      0.000    999.000    999.000
Q2F1               1.828      0.267      6.833      0.000
Q3F1               1.697      0.235      7.231      0.000

F2       BY
Q1F2               1.000      0.000    999.000    999.000
Q2F2               1.438      0.291      4.943      0.000
Q3F2               1.085      0.191      5.687      0.000

Here are the unstandardized estimates. By default the loading of the first variable on each factor is fixed at 1, so, of course, there is no real standard error, test statistic or p-value for it (Mplus prints 999.000 as a placeholder). It isn’t really an estimate; it was set. Let’s look at the other two. Since they are unstandardized, the more useful measure for us is the estimate divided by its standard error, for example 1.828 / .267. This is done for us in the column under Est./S.E. and in that case comes out to 6.833 (dividing the rounded numbers printed here gives a slightly different value, presumably because Mplus works from the unrounded ones). You interpret these values in the same way as any z-score, with 1.96 as the critical value, and you can see in the last column that all of my variables loaded on the hypothesized factor with a p-value much less than .05.
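Here is a small sketch of that arithmetic in Python. The function name is mine, and the two-tailed p-value comes from the standard normal distribution via math.erfc. Because it divides the rounded numbers printed in the output, the ratios differ a little from the Est./S.E. column.

```python
import math

def z_and_two_tailed_p(estimate, standard_error):
    """Estimate divided by its standard error, and the two-tailed normal p-value."""
    z = estimate / standard_error
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Freely estimated loadings (estimate, S.E.) copied from the output.
loadings = {
    "Q2F1": (1.828, 0.267),
    "Q3F1": (1.697, 0.235),
    "Q2F2": (1.438, 0.291),
    "Q3F2": (1.085, 0.191),
}

results = {name: z_and_two_tailed_p(est, se) for name, (est, se) in loadings.items()}
for name, (z, p) in results.items():
    print(f"{name}: z = {z:.3f}, p = {p:.6f}, exceeds 1.96: {abs(z) > 1.96}")
```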

The next thing I look at is the residual variances. At this point my only concern is that I *not* have a residual variance that is negative. It makes no sense to have a negative variance because (among other reasons) variance is based on a sum of squares and squares cannot be negative. Also, when a residual variance is negative, the communality is greater than 1, meaning you have explained over 100% of the variance in this variable by its relation to the latent construct. This also makes no sense. These are referred to as Heywood cases and explained beautifully here (even though the linked documentation is from SAS it applies to any confirmatory factor analysis).

The final thing I want to look at, for right now, anyway, is the R-squared

R-SQUARE

Observed                                        Two-Tailed
Variable        Estimate       S.E.  Est./S.E.    P-Value

Q1F1               0.142      0.032      4.473      0.000
Q2F1               0.475      0.065      7.256      0.000
Q3F1               0.438      0.061      7.123      0.000
Q1F2               0.174      0.045      3.883      0.000
Q2F2               0.376      0.078      4.827      0.000
Q3F2               0.179      0.044      4.057      0.000

You can see that the R-square is pretty decent overall. These are interpreted just like any other R-square values. I didn’t show the standardized factor loadings here but just take my word for it that the R-squared values are the standardized loadings squared. So this is the proportion of variance in q1f1, for example, explained by factor 1.
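Since each R-square is the squared standardized loading, you can work backwards from the table to the size of the loadings. This is a sketch in Python; the results are only what the printed R-square values imply (sign unknown, rounding included), not numbers taken from the Mplus output.

```python
import math

# R-square estimates copied from the table above.
r_square = {
    "Q1F1": 0.142, "Q2F1": 0.475, "Q3F1": 0.438,
    "Q1F2": 0.174, "Q2F2": 0.376, "Q3F2": 0.179,
}

# The square root gives the absolute value of the implied standardized loading.
implied_loading = {name: math.sqrt(value) for name, value in r_square.items()}

for name, loading in implied_loading.items():
    print(f"{name}: implied |standardized loading| = {loading:.3f}")
```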

I started this whole thing working with Mplus to do a factor analysis and overall, I’d have to call it a pretty painless experience.

 

Someone had a question about factor analysis with Mplus and even though it is not a piece of software I work with normally, we aim to please at The Julia Group, so I downloaded the demo version and away I went.

It truly was, as my granddaughter says, easy-peasy lemon squeezie.

You might not think so, because the first thing you are confronted with is pretty much a blank window like this

[Image: screen shot of editor]

For people who are used to Excel, SPSS, SAS Enterprise Guide or other friendly GUIs, this might be a bit off-putting. However, doing a confirmatory factor analysis was this easy.

1. Create a .dat file from the original file. The file was in a SAS format and I did not have SAS on the laptop I was working on (I’m in Cambridge, MA at the moment). What I did was

  • Opened the file in SPSS by selecting READ TEXT DATA from the FILE menu and then selecting SAS as the format
  • Ran an SPSS command from the syntax window to output a tab-delimited file with no header, which was the type of input Mplus would expect.

2. Type in this program to do a two-factor solution with the first three variables loading on the first factor and the next three loading on the second factor.

TITLE : Confirmatory Factor Analysis ;
DATA:  FILE IS '/Users/annmaria/Documents/mplustest/values.dat' ;
VARIABLE: NAMES ARE q1f1 q2f1 q3f1 q1f2 q2f2 q3f2 ;
MODEL: f1 BY q1f1 q2f1 q3f1 ;
f2 BY q1f2 q2f2 q3f2 ;
OUTPUT: standardized ;

3. Click the RUN button.

That is really all there was to it.

Okay, well that is easy if you knew what to type so let me explain a few things. If you know SAS or SPSS this will be easy.

Each of those things that I put in all capitals is a command in Mplus, analogous to a DATA or PROC step in SAS and a command in SPSS. They don’t need to be in all caps, I just did that for ease for the reader. They DO need to be followed by a colon and then end the statement in a semi-colon.

Title – pretty obvious, gives your output a title.

DATA: FILE IS – gives the path to locate your data. If your file is in the same directory as your program, you don’t need a fully qualified path and can just call it 'values.dat'

VARIABLE: NAMES ARE

Give the names of your variables. You can specify a format but if you do not, Mplus assumes they are in free format, which is the same as what SAS refers to as list format.  You might want to note that if you are using the demo version you can only have a maximum of 6 dependent and 2 independent variables.

MODEL:  This is my model (duh) and I am modeling two factors. The first factor I creatively named f1 and it is represented BY (notice the BY in the command) variables also creatively named q1f1 q2f1 and q3f1.

Similarly, I have a second factor named f2, represented by q1f2, q2f2 and q3f2.

I added an OUTPUT statement with a standardized option because I wanted (surprise) standardized estimates. That statement is not required but as you’ll see in my next post on interpreting factor analysis data, you do want it.

I am intrigued by Mplus. It sort of assumes you have close to perfectly cleaned up data because I wouldn’t want to be doing a lot of data management with it, but for doing some relatively complex models  – factor analysis, path analysis, structural equation modeling – it looks pretty cool.

 

Here is the scenario …

1. The researcher had a return rate more than triple the average return rate for emailed surveys in general. This return rate was despite apparently not having any of the features related to higher return rates – incentives for completion, an advance postcard or email explaining the survey and incentives, email or phone follow-ups to non-respondents. A 2005 study conducted with 1,500 subjects from the same population in the same state on a similar topic had a return rate less than one-third of what this researcher claims, even though the 2005 researchers used incentives AND advance notice AND, in four out of five sites, emailed or called non-respondents.

2. The completion rate for each item was 97-98%. From the first item asking about the topic to the 50th, there was no drop out in responses. There appeared to be almost no drop out of the study, over 98% finished it. There was zero correlation between where the item was on the survey and how many people completed it because virtually every one of over 750 people completed every question.

3. Responses came in three types – from Group 0, Group 1 and Group 2. Over half of the subjects – nearly 400 – came from the same location, and there were times when as many as 9 respondents would start the survey at the exact same minute of the same day. In one group, surveys were ONLY started between 10 and 11 in the morning or between 3 and 7:15 pm.  (Well, 5% did come in between 7:15 and 9). None of the 141 surveys in that group came in at any other time.

4. The typical audience member quoted is verbatim the same in both this report and another report from a study at a different site completed two years ago.

5.  In the report, there is an almost complete absence of detail on sampling method, nothing on return rate, nothing on how the online survey was distributed, nothing about missing data, incentives or follow-up. There is minimal discussion about possible bias in the sample. Return rate had to be computed based on the raw data and a report on the number of students surveyed. Similarly, missing data percentage was computed from the raw data.

When someone questioned this, it was stated that,

“The data were reviewed by a statistician who said that he saw no problems.”

Your comments and opinions are eagerly awaited.


Trying this live blogging from SAS Global Forum again.
The title kind of says it: PROC QUANTLIFE, new procedure in SAS 9.3
Why DO we need a new procedure for survival analysis?

======
Survival analysis used to analyze time-to-event data
already had PROCs LIFETEST, LIFEREG & PHREG
========
Lifereg is fine if you have IID errors – but what if you don’t? Enter quantile regression, possibly wearing a cape #Sasgf13 #noCape
=========
Qy(tau) is the tau-th quantile of a random variable Y, e.g. Qy(.25) is the 25th percentile
==========
Quantile regression – can have same slope & different intercept for each value given for tau
Quantile regression, option 2 can have different slopes for each value of tau #Sasgf13
=============

Cumulative distribution function is the inverse of the quantile function #Sasgf13

QUANTLIFE example shows covariate that has negative effect for those with short life but positive effect for those with longer life #Sasgf13

Interested in survival analysis when covariates have non-linear relationship to time to event? Check the QUANTLIFE procedure paper #Sasgf13
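To make the note above about the quantile function and the cumulative distribution function being inverses of each other concrete, here is a small sketch in Python. It uses an exponential survival time with a made-up rate purely as an illustration; neither the distribution nor the numbers come from the talk.

```python
import math

RATE = 0.5  # hypothetical event rate, chosen only for illustration

def cdf(t, rate=RATE):
    """F(t): probability the event has happened by time t."""
    return 1.0 - math.exp(-rate * t)

def quantile(tau, rate=RATE):
    """Q(tau): the time by which a proportion tau of events have happened."""
    return -math.log(1.0 - tau) / rate

# Applying the CDF to a quantile gives back tau.
for tau in (0.25, 0.50, 0.75):
    t = quantile(tau)
    print(f"tau = {tau:.2f}  Q(tau) = {t:.3f}  F(Q(tau)) = {cdf(t):.2f}")
```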

I’m always a bit bemused when people refer to me as a “SAS expert”. I don’t think of myself as an expert at anything except, perhaps, bricolage, a word whose existence I would not even be aware of if not for sascommunity.org.

Merriam-Webster defines it as:

: construction (as of a sculpture or a structure of ideas) achieved by using whatever comes to hand; also : something constructed in this way
Origin of BRICOLAGE
French, from bricoler to putter about

Often, I think there are probably much more elegant ways of doing things but mine gets done very quickly which my clients appreciate since they pay me by  the hour.


One example of bricolage that comes to mind is my frequent use of PROC SUMMARY for output as input.

Take this recent problem.

Bricolage #1

A researcher had three measurements on each subject. The dependent variable was the mean of these measures. They were all taken at the same time, so this wasn’t a repeated measures type of design.  All I needed was to get the mean. Data were entered like this.

ID  Group  Measure
01  0001   .47
01  0001   .46
01  0001   .46
02  0001   .49
02  0001   .48

I could have created a couple of variables using the LAG function, but really, I found this to be much quicker.

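* Sort first so the BY statement in PROC SUMMARY will work ;
PROC SORT DATA = myfile ;
BY ID GROUP ;
RUN ;

* No statistics are named, so the output has one row per default statistic, labeled in _STAT_ ;
PROC SUMMARY DATA = myfile ;
BY ID GROUP ;
VAR measure ;
OUTPUT OUT = myfilefix / AUTONAME ;
RUN ;

* Keep only the row with the mean for each subject ;
DATA myfilefix ;
SET myfilefix ;
WHERE _STAT_ = "MEAN" ;
DROP _STAT_ _FREQ_ _TYPE_ ;
RUN ;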

See what I mean about bricolage? I’m sure a real expert would have used some DROP option on the dataset in the PROC SUMMARY or something and not needed the DATA step to only keep the variables for ID, and the mean of the measure variable, and some other option to only compute the mean statistics but since this took me all of five minutes and it was not a large data set to worry about sorting and time taken by extra steps, I didn’t bother. The reason for including the group variable in the sort and proc summary as well as the id variable, even though it is obvious that the same individual will always be in the same group, is simply so the group variable was carried along and saved in the file output by PROC SUMMARY. I’m sure a real SAS expert would have a less kluge-y way of doing that, also.
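For readers who don’t use SAS, here is the same idea sketched in Python with only the standard library: group the rows by ID and group, then take the mean of the measure. The rows are the five shown in the listing above, with the group code written as 0001 on the assumption that the letter o’s in that listing stand for zeros. The variable names are mine.

```python
from collections import defaultdict
from statistics import mean

# (ID, group, measure) rows from the listing above.
rows = [
    ("01", "0001", 0.47),
    ("01", "0001", 0.46),
    ("01", "0001", 0.46),
    ("02", "0001", 0.49),
    ("02", "0001", 0.48),
]

# Collect the measures for each subject; carrying the group in the key
# plays the same role as including GROUP in the BY statement.
grouped = defaultdict(list)
for subject_id, group, measure in rows:
    grouped[(subject_id, group)].append(measure)

subject_means = {key: mean(values) for key, values in grouped.items()}

for (subject_id, group), value in subject_means.items():
    print(subject_id, group, round(value, 4))
```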

Bricolage #2

The researcher had individual subject data in one file and the group data in another file. For example, a record of all students and a record of data for the classroom. We want to merge these two files. In this case, however, the student file had a student id variable and the class file had a classroom id. Fortunately, there were 10 students selected from each class, so that students with id numbers 1-10 were in class 1, id numbers 11 – 20 were in class number 2, and so on.

I could have done a PROC FORMAT or maybe a DO loop. What I did was this:

class = INT((student - 1)/10) + 1 ;

Why subtract 1? Suppose I just took INT(student/10) + 1. For students 1 through 9, the SAS function INT takes the integer part of 1/10 to 9/10, which is 0, and if I add 1 to that, I get class = 1.

What about student number 10? Well, he or she will end up with 10/10 = 1, add a 1 and that student will be in class 2. Not correct.

If I subtract one from every student number first, it works out and sorts them exactly correctly. I’m sure a real expert would know some function that does exactly that, but hey, it was one line and gave me the exact results I wanted.
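Here is the same one-liner sketched in Python, next to the version without the subtraction so you can see where student 10 goes wrong. Integer division (//) plays the role of INT for these positive numbers. The function names are made up for illustration.

```python
def class_for_student(student_id, students_per_class=10):
    """Subtract 1 first so that IDs 1-10 land in class 1, 11-20 in class 2, and so on."""
    return (student_id - 1) // students_per_class + 1

def class_without_subtracting(student_id, students_per_class=10):
    """The tempting version that puts student 10 in the wrong class."""
    return student_id // students_per_class + 1

for student_id in (1, 9, 10, 11, 20, 21):
    print(student_id, class_for_student(student_id), class_without_subtracting(student_id))
```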

See what I mean? Bricolage.

I will concede that there are times when I am working with an enormous dataset when I need to do things as efficiently as possible, or when I am teaching and I need to show the exact “best” method.

There are other times, though, when I just slap something together and call it a day.

A friend of mine, also a consultant, who would definitely consider himself a SAS expert, said, disapprovingly,

“I would never do that.”

He went on to ask me if my clients were satisfied with the way I work.

I told him that sure, I have lots of the same clients for a decade or more. Often, with simple problems like the ones above, it takes me so little time to get the correct results, I just knock it out in a few minutes and I don’t even charge them, which makes them particularly satisfied. That’s the part where he REALLY gasped and said,

“I would NEVER do that!”

“… we may then define intellect in general as the power of good response from the point of view of truth or fact.” - Thorndike, 1921

Edward Tufte impresses me. His books on visual data show him as possessing in copious amounts that very rare commodity – truly original thoughts. So, when he tweeted the other day that this paper by Hill was “probably the best paper ever about making causal inference about human behavior”, of course I had to read it.

This got me to wondering about how we know something is true and led me to another thing I have learned in (almost) 55 years.

As I responded to a commenter on my blog the other day,

#15: Just because you believe something passionately doesn’t make it true.

Sometimes it might. If you believe passionately that you can earn a Ph.D., win a gold medal in the world judo championships, run a marathon or swim the English Channel, perhaps you can make it true. However, no matter how much you like your cigarettes, no matter how strongly you believe that tobacco is good for you because it is “natural”, the data are not on your side.


When a correlation between two characteristics is observed, it is common for people who don’t want that relationship to exist to object,

“Correlation doesn’t prove causation”

That is completely true. That is also not the same thing as correlation being unrelated to causation. Correlation can provide SUPPORT for a hypothesis of causation, although it is true that it cannot provide proof. In other words, to borrow Thorndike’s phrase, we can have more confidence that some beliefs are good, more so than others, from the point of view of truth or fact. Statisticians even quantify that degree of confidence in something called a confidence interval.

In his paper, Hill discusses the strength of the association found. If the death rate of the population of people in an area with a very high rate of air pollution is 14 times higher than in another area with a low rate of air pollution, then we have more confidence of a possible causal relationship than if it is 1.14 times higher.

He also discusses replication and consistency. If you can find one or two studies on a topic that support your belief, that doesn’t make it true. There is a lot more in Hill’s article. It’s both short and brilliant. You should read it.

My dissertation advisor, the late Dr. Richard Eyman, gave me  a lot of profound advice. One piece of it was

When the results don’t come out the way you expect, check everything over again. Make sure your measures are reliable and valid. Check for outliers and re-run your analyses without them. Go over everything again and look for threats to the validity of your design – the treatment was administered as you expected, the tests were administered according to the standard procedures. Run your study again and see if you get the same results. And when your results do come out the way you expect – DO THE EXACT SAME THING!

Just because you believe it, doesn’t make it true.

For lies (and data) about anchor babies, click here.

For #14 of the things I have learned in (almost) 55 years, click here. I’m trying to get to 55 by the time I turn 55 in August. I believe I can do it!

 

There has been far more heat than light surrounding the current controversy over whether a transgender (male to female) fighter should be allowed to compete in mixed martial arts in the women’s division.

This article on The Verge said that opponents of Ms. Fox’s competing “are not supported by the current science”, citing the fact that the International Olympic Committee allows transgender athletes to compete under certain conditions – a set number of years post-surgery and hormonal therapy.

Since mixed martial arts is not an Olympic sport, whatever science this decision was based on most likely did not include any studies involving mixed martial arts. I say most likely because, although I asked on my other (personal) blog, where I write about sports a lot, for citations of this supposedly voluminous scientific literature, no one provided me any relevant references and I did not uncover any searching the National Library of Medicine database. A few people did send me references to articles on hormonal therapy but none of these even discussed the issue of sports participation. Their focus instead was on the possible side effects, e.g. cancer or other ill effects, for people on various hormone regimens.

The most reasoned discussion on this topic I have read is on Dr. Rosi Sexton’s blog, and I agree with her three main points:

  1. There is not much at all in the way of data documenting whether or not Ms. Fox has an advantage competing in martial arts. Expert opinion is split on this issue.
  2. No one has a right to compete in mixed martial arts. There are all kinds of qualifications – you have to be a certain weight and gender to compete in a division. You can’t be pregnant.
  3. Mixed martial arts are different from, say, canoeing, because you are trying to do bodily harm to your opponent. Unlike in many sports, a Type I error – rejecting a true null hypothesis – is likely to cause harm to others.

Read her blog. It’s good. Personally, I want to address a couple of other points. On Twitter, Shelly Summers made the comment that it was difficult to dispute the claim that Fallon Fox (or any transgender fighter) has an advantage unless it is spelled out exactly what her advantage is supposed to be. Now here I’m on firmer ground at least because we can couch this in terms of an equation. Logistic regression would be best, with win or loss as the dependent variable. The question then is what are the independent variables. This might seem a straightforward question but it’s not. Let’s start with what should be obvious.

1. Different sports have different requirements for success. Males don’t seem to have an advantage in equestrian – the sport is mixed gender in the Olympics. We have separate shooting events for men and women, but I don’t know why. Winning  the marathon requires more endurance. Winning a gold medal in weight-lifting requires more strength. Women tend to do better in long-distance swimming. Men are better at football.  Height is an advantage in some sports (basketball, volleyball) and not in others. I could go on, but you get the point, I hope, which is that you cannot generalize about “sports”.

2. Even in sports like football, baseball and basketball, where millions of dollars are spent on data analysis, the predictions are far from perfect.

3. There are several variables that predict athletic success, including in mixed martial arts.

I know a lot more about judo than mixed martial arts specifically, but there is some overlap, so let’s look at what you certainly need:

  • Physical strength. While technique will beat strength, if other things are equal, the match goes to the person with more strength. It’s like using steroids – it doesn’t guarantee you the win, but it gives you an edge. People have argued that the hormonal levels of transgender females are no different than those of women born female. A more relevant test would be measures of physical strength. Given two people of the same weight engaged in the same strength and conditioning program, would there be a difference in measures of strength like the maximum weight in the bench press or dead lift, the maximum number of repetitions at a given weight, recovery after a given rest period, etc.? I don’t know. It might also be hard to assess this if people knew they were being studied, because either person might consciously or subconsciously lift less and affect the results. This is why researchers favor double-blind studies, where neither those collecting the data nor those giving it know if they are in the experimental or control group. The only way I see you could do this, though, is through deception, e.g., telling both parties you were assessing the effect of a specific hormone regimen. Regardless, as far as I know, these types of data are not available from well-controlled (or really, any) studies.
  • Endurance. A judo competition is several matches of four minutes each, usually with a rest period of 10 to 60 minutes in between. Mixed martial arts matches are either three or five five-minute rounds, with short rests in between. An indirect measure might be something like lung capacity, but a more direct measure would include things like resting heart rate immediately after each round. Physiological measures are not my specialization, but I cannot imagine any way in which direct measures would not be preferable to indirect ones. Again, this is an area where I am not aware of any research, certainly not for MMA specifically.
  • Reach. I can guarantee this from having fought for years – and having won a world championship, I fought at a pretty high level – competitors who have a much longer reach have an advantage.
  • Psychological factors. If you have watched many combat sports at all, you have seen those matches where someone should not have won and yet they did. This is something every top athlete has, that absolute refusal to lose.
  • Speed. If you can beat your opponent to the punch every time (figuratively as well as literally) you will win.
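To make the idea of "couching this in terms of an equation" concrete, here is a rough sketch of what such a logistic regression might look like. The variable names and the coefficients are my own illustrative labels, not something estimated from any data, and the indicator for transgender status is an assumption about how one might frame the question.

```latex
\log\frac{P(\text{win})}{1 - P(\text{win})}
  = \beta_0
  + \beta_1\,\text{strength}
  + \beta_2\,\text{endurance}
  + \beta_3\,\text{reach}
  + \beta_4\,\text{psych}
  + \beta_5\,\text{speed}
  + \beta_6\,\text{transgender}
```

Under this framing, the question of an advantage becomes whether $\beta_6$ differs from zero once strength, endurance, reach, psychological factors and speed are accounted for, or whether any difference shows up in those measured variables themselves. Either way, it requires actual measurements on each of them.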

Have there been studies comparing transgender female mixed martial artists and other female mixed martial artists on these characteristics? I’m almost certain not. If we don’t have any evidence that Ms. Fox does NOT have an advantage, it would make sense to agree with Dr. Sexton that it is best to err on the side of the safety of the other women in the division and disallow her competition.

One thing did immediately strike me when I heard about this story that made me say, wait a minute. Fallon Fox is 37 years old. How many professional competitors in women’s mixed martial arts are 37 or older? I looked up the women’s rankings in the Fight Matrix for the 145 and 135 lb divisions. I added Ms. Fox into the mix and, to be fair, I also added Peggy Morgan, the woman who would have been her opponent in her next bout, except that she has announced she refuses to fight Ms. Fox. While MMA Junkie lists Ms. Fox’s age as 43, other sources list it as 37, so I used the lower age. To give one more data point, I added Marina Shafir, who just won her fight tonight and is in the same division as Ms. Fox. Since Marina is only 24, I calculated the results with and without her. Still significant.

Here is the age distribution for those 30 women.

[Chart: fighters by age]

There is exactly one woman older than Fallon Fox among those competitors: Hitomi Akano. Ms. Akano lost her last two fights. ** NOTE: CORRECTED ON 3/27/2013, see comment below.

The average age of the other 29 fighters was 29.21. Using a z-test with this as the population value and the population standard deviation of 3.9 gives a z-value of 1.99. If we were conducting a one-tailed test of the hypothesis that Fox is significantly older (based on an assumption that a transgender female would have an advantage and be competitive at a later age), we would reject the null hypothesis, as the critical value of a one-tailed test is 1.64. Even if we were to use the more rigorous two-tailed test and say our alternate hypothesis is that being transgender could be an advantage or disadvantage, we’d still reject the null hypothesis, as the z-value is greater than the critical value of 1.96.
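For anyone who wants to check the arithmetic, here is a minimal sketch in Python using only the rounded figures quoted above (age 37, mean 29.21, standard deviation 3.9). The variable names are mine. With these rounded inputs the z-value comes out at about 2.00 rather than 1.99, presumably because the original calculation used unrounded values.

```python
from statistics import NormalDist

# Figures quoted in the post (rounded); variable names are illustrative.
fox_age = 37       # the lower of the two reported ages
mean_age = 29.21   # mean age of the other 29 fighters
sd_age = 3.9       # standard deviation treated as the population value

# z-score: how many standard deviations Fox's age is above the mean
z = (fox_age - mean_age) / sd_age

std_normal = NormalDist()  # standard normal: mean 0, sd 1

# Critical values at alpha = 0.05
one_tailed_critical = std_normal.inv_cdf(0.95)    # about 1.645
two_tailed_critical = std_normal.inv_cdf(0.975)   # about 1.960

# Corresponding p-values
p_one_tailed = 1 - std_normal.cdf(z)
p_two_tailed = 2 * p_one_tailed

print(f"z = {z:.2f}")
print(f"one-tailed: critical = {one_tailed_critical:.3f}, p = {p_one_tailed:.4f}")
print(f"two-tailed: critical = {two_tailed_critical:.3f}, p = {p_two_tailed:.4f}")
print("reject null (one-tailed):", z > one_tailed_critical)
print("reject null (two-tailed):", z > two_tailed_critical)
```

The z-value exceeds both critical values, though only barely in the two-tailed case, which is why the result should be read as slight evidence from a small sample rather than anything stronger.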

At most we can say there is slight evidence for an advantage, and that based on a small amount of data.

We can note that in this sample, as Ms. Akano has not won a fight since she was 36, there is exactly one person with a winning record after age 36, and that is Ms. Fox. There are nine fighters in this sample who have a winning record of 100%. The other eight of those fighters range in age from 24 to 33, with an average age of 28.9 years.

What we can say from this admittedly small sample of data is that Ms. Fox appears to be winning decisively at an age that is significantly older than the average female competitor in or near her weight division. At least as far as the age at which she is successful in competition, Ms. Fox DOES appear to be significantly different than a sample of mixed martial arts fighters who were born female.  Could this be because she is more determined, trains harder, wants it more or just has an amazing coaching team? Yes, it could. Could it be because we only have a small sample which could be non-representative of women mixed martial arts fighters? Yes, it could. I’d be happy to do a large, well-controlled study with lots of variables, but it turns out that I have to get back to doing the analyses for which people pay me money.

Anyone else is welcome to find their own data, list their sources and post it or publish it wherever they like. Please give a link in the comments if you do. What I did was use the data that was available to me and actually look at female mixed martial artists and performance. What I did not do was consider data on hormones, law, a hypothesized set of data that somebody must have had somewhere before making a policy, or someone’s opinion on what people should or should not do in their private lives.
