The goal is to conduct a series of t-tests of the null hypothesis that the scores a judge assigns to their countrymen's dives do not differ from the scores they assign to other divers' dives. We calculated a metric that measures how a judge's score for a particular dive differs from the panel mean for that dive. For each judge, we will compare this metric for dives performed by divers from the judge's home country against dives performed by all other divers.
We will loop over all the judges and run a two-sample t-test for each one, using Welch's form of the test, which assumes normally distributed data but does not assume equal variances between the two groups.
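The loop can be sketched as follows. This is a minimal illustration, not the exact code used here: the data frame `scores` and its columns `judge`, `match` (whether the diver shares the judge's country), and `diff` (judge's score minus the panel mean) are hypothetical stand-ins, filled with simulated values.

```r
# Hypothetical example data: two judges, each scoring countrymen and others.
set.seed(1)
scores <- data.frame(
  judge = rep(c("ALT Walter", "MENA Jesus"), each = 40),
  match = rep(c(TRUE, FALSE), times = 40),  # diver from judge's country?
  diff  = rnorm(80)                         # score minus panel mean
)

# For each judge, compare diff for countrymen vs. other divers with
# Welch's two-sample t-test (var.equal = FALSE is the default).
results <- do.call(rbind, lapply(split(scores, scores$judge), function(d) {
  tt <- t.test(diff ~ match, data = d, var.equal = FALSE)
  data.frame(judge = d$judge[1], pvalue = tt$p.value, row.names = NULL)
}))
results
```

The formula interface `diff ~ match` splits each judge's metric into the two groups, and `var.equal = FALSE` selects the Welch variant described above.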
## judges pvalue
## 1 WANG Facheng 0.989655993691901
## 2 MENA Jesus 0.000262738981201337
## 3 ZAITSEV Oleg 4.62390325993386e-06
## 4 McFARLAND Steve 0.000219535771251978
## 5 ALT Walter 2.93772682243184e-06
## 6 BARNETT Madeleine 0.000381120502870843
## 7 BOOTHROYD Sydney 0.0204961042925179
## 8 RUIZ-PEDREGUERA Rolando 0.0191200611932846
## 9 CRUZ Julia 0.00179308771262052
## 10 BOYS Beverley 0.0289269502603285
## 11 BOUSSARD Michel 0.121683011822911
## 12 BURK Hans-Peter 0.0095337659932683
## 13 XU Yiming 0.00236207070454411
## 14 SEAMAN Kathy 0.111057727179812
## 15 GEISSBUHLER Michael 0.03290497237863
## 16 HUBER Peter 0.0575318481056923
## 17 CALDERON Felix 0.0759961039010729
We can see from the table that some judges are clearly biased. Moreover, the plots below start to highlight the direction of this bias. Note that certain judges scored only a handful of their countrymen, so the results for those judges are not robust.
The evidence supporting Steve McFarland's bias is certainly mounting. Other judges, such as Jesus Mena, Oleg Zaitsev, Walter Alt, and Madeleine Barnett, also stand out. That said, these results could change under different assumptions; keep in mind the normality assumption, and the unequal-variance form of the test, that we adopted at the start.