A recent tweet about mixed martial arts decisions set me to thinking about probability. @Fight_ghost tweeted that a TV commentator made no sense when she said that she thought a fighter should have won by split, not unanimous decision. Others on twitter agreed with him that was a stupid comment, and asked did she think judges should say the other fighter only 2/3 won or what.
I thought it did make sense in statistical terms. Think of it this way:
The “true score” of the population in this case is the mean of what an infinite number of judges would rate a fighter’s performance. Of course, there is going to be variation around that mean. Some judges may tend to weight take downs a tiny bit more. Judges vary in their definition of a significant strike. Some judges are just going to be clueless or inattentive and give a score that is far from accurate. On the average, though, these balance out and the mean of all of those infinite judges’ scores should be the true score. Let’s say our fighter, Bob, had a true score of 27. The most common score we should see a judge give him is 27, but a 26 or 28 would not be totally unexpected. Given that the standard deviation of fight scores is low, we would be surprised to see him given a score of 25 or 29 and completely floored if he received a 24 or a 30.
Let’s say we have a second fighter, Fred. His true score is 29. The most common score we should see for him is a 29, but again, a 28 or a 30 would not be unexpected because there is variation in our sample of judges.
Here is the point … when fighters are far apart in the true score of their performance, judges should very seldom have a difference of opinion in who won. Even when Bob is scored high, for him, at 28 and Fred is scored his average of 29, Fred still wins. Let’s say the standard deviation of judge’s scores is 1. I believe it is really lower than that and I do know that the winner of a round has to get 10 points, but for ease of computation, just go with me.
For Bob to win, he must be rated at least two standard deviations above his true score (which occurs 2.5% of the time) and Fred must be rated below his true score, which occurs half the time. Since the scores for Bob and Fred are independent probabilities the probability of BOTH of these events happening is .025 x .5 = .0125
The other way for Bob to win is if Fred scores two standard deviations below his true score, which will occur 2.5% of the time AND for Bob to score above his true score. Again, the combined probability is .0125. SO …. only 2.5% of the time (.0125 + .0125) would Bob win. Since judges’ scores are independent, the probability of any one scoring it for him, causing a split decision is .025 + .025 + .025 = 7.5%
(If all three judges scored it for Bob, that would be a very, very low probability of .o25 * .025 * .025 because, again, the judges scores are assumed independent of one another. In only 0.063% of the cases would this occur. We should probably subtract that and the probability of two of them scoring it for Bob to be exact, but I have to finish grading papers tonight so we’ll just acknowledge that it is not exactly 7.5% and move on.)
Let’s go back to the fight that actually happened. I didn’t see it so I am going to take some people’s word that it was a close fight. They might be lying but let’s assume not.
In this case, Bob, who has a true score of 27, is not fighting Fred, but rather, Ignatz, who has a true score of 27.3 (with three judges, he’d get a 27, 27, 28 score). There is great overlap in Bob and Ignatz’s scores. To outscore Ignatz’s average score, Bob would need a score of 27.4 – well, a z-score of .4 occurs about 35% of the time. Half of the time Ignatz is going to score 27.3 or lower so the probability of him both having an average or below score AND Bob having a 27.4 or high score is .5 *.35 or .175. So 17.5% of the time, a judge would give Bob a higher score. Since there are three judges, the probability of ONE of them giving him a higher score would be .175 + .175 + .175 = 52.5%
There is also the small probability that it could go unanimous the other way, but that’s not really pertinent to our argument.
The point is simply this … if two fighters’ true scores are close, it is much less likely that you will see a unanimous decision than if their true scores are really far apart. The closer they are, the more that statement holds. So, no, it is not a stupid comment to say that you believe someone warranted a split decision rather than a unanimous decision. It may simply mean that you think the fighters’ were so close that you were surprised there was not any variance in favor of the only slightly better fighter.
Really, I think most people would find that a reasonable statement.
Extra credit points:
Give one reason why the Central Limit Theorem does not apply in the above scenario.
Answer this question:
Does the fact that the distribution of errors is necessarily non-symmetric in Fred’s case (cannot score above 30) negate the application of the Central Limit Theorem?