Hypothetical Unfair Result

cvdwightw · Post by **cvdwightw** » Mon Apr 28, 2008 5:30 pm

The following does not actually reflect any known situation. I drew this up Sunday morning and worked on it slightly more on the plane ride back Sunday night to see if it was indeed possible that such a situation might exist.

Consider the following two teams in separate seven-team, six-game brackets. Each team finishes 3-3, and the statistical tiebreaker for one spot is based upon bonus conversion.

Team A:
Round 1: W, 165-120, 8 tossups, 1 negs, Bonus conversion 11.25
Round 2: W, 220-130, 12 tossups, 0 negs, Bonus conversion 8.33
Round 3: L, 150-220, 6 tossups, 2 negs, Bonus conversion 16.67
Round 4: L, 130-330, 5 tossups, 0 negs, Bonus conversion 16.00
Round 5: L, 145-280, 7 tossups, 1 negs, Bonus conversion 11.43
Round 7: W, 260-65, 13 tossups, 2 negs, Bonus conversion 10.77

Team B:
Round 1: L, 130-180, 7 tossups, 2 negs, Bonus conversion 10.00
Round 2: L, 85-320, 5 tossups, 1 negs, Bonus conversion 8.00
Round 3: W, 230-160, 9 tossups, 0 negs, Bonus conversion 15.56
Round 4: W, 235-70, 11 tossups, 3 negs, Bonus conversion 12.73
Round 5: L, 205-210, 10 tossups, 1 negs, Bonus conversion 11.00
Round 7: W, 150-135, 8 tossups, 2 negs, Bonus conversion 10.00

You will notice that Team A holds a slight points per game advantage over Team B (178.3 to 172.5), has played a theoretically tougher schedule (190.8 points against to 179.2), has converted more total tossups than Team B (51 to 50), and has a higher bonus conversion than Team B in EVERY ROUND (the two have the same bye round, so effects of a bye are irrelevant). Yet TEAM B wins the Bonus Conversion tiebreaker, 11.60 to 11.57. In other words, Team A's overall profile is slightly better, Team A has outperformed Team B on the bonuses in every single round, and yet Team B wins the bonus conversion tie-breaker.
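The arithmetic here can be checked with a short script. What follows is only a sketch: it assumes 10 points per tossup and -5 per neg (an assumption, though it is consistent with every score line listed), and the function names are invented for illustration.

```python
# Per-round lines as (points scored, tossups, negs), copied from the first example.
# Assumption: 10 points per tossup and -5 per neg.
TEAM_A = [(165, 8, 1), (220, 12, 0), (150, 6, 2), (130, 5, 0), (145, 7, 1), (260, 13, 2)]
TEAM_B = [(130, 7, 2), (85, 5, 1), (230, 9, 0), (235, 11, 3), (205, 10, 1), (150, 8, 2)]


def bonus_points(score, tossups, negs):
    """Bonus points implied by one score line under the assumed scoring."""
    return score - 10 * tossups + 5 * negs


def per_round_conversion(rounds):
    """Bonus points per tossup answered, one value per round."""
    return [bonus_points(s, t, n) / t for s, t, n in rounds]


def overall_conversion(rounds):
    """Total bonus points divided by total tossups (the tiebreaker figure)."""
    total_bonus = sum(bonus_points(s, t, n) for s, t, n in rounds)
    total_tossups = sum(t for _, t, _ in rounds)
    return total_bonus / total_tossups


a_rounds = per_round_conversion(TEAM_A)
b_rounds = per_round_conversion(TEAM_B)
a_wins_every_round = all(a > b for a, b in zip(a_rounds, b_rounds))

print(a_wins_every_round)  # True
print(f"{overall_conversion(TEAM_A):.2f} {overall_conversion(TEAM_B):.2f}")  # 11.57 11.60
```

The overall figure divides total bonus points by total tossups, so rounds with many tossups count for more than rounds with few. Team A's two best rounds came on only 6 and 5 tossups, which is why the per-round lead does not carry over.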

This works for bonus conversions on the high end as well:

Team A
Round 1: W, 275-160, 10 tossups, 1 negs, Bonus conversion 18.00
Round 2: W, 300-120, 12 tossups, 0 negs, Bonus conversion 15.00
Round 3: W, 230-225, 8 tossups, 0 negs, Bonus conversion 18.75
Round 5: L, 115-355, 4 tossups, 1 negs, Bonus conversion 20.00
Round 6: W, 260-220, 9 tossups, 2 negs, Bonus conversion 20.00
Round 7: W, 335-95, 13 tossups, 3 negs, Bonus conversion 16.92

Team B
Round 1: W, 250-205, 9 tossups, 0 negs, Bonus conversion 17.78
Round 2: L, 110-375, 5 tossups, 2 negs, Bonus conversion 14.00
Round 3: W, 305-140, 11 tossups, 1 negs, Bonus conversion 18.18
Round 5: W, 310-120, 11 tossups, 2 negs, Bonus conversion 19.09
Round 6: W, 340-175, 12 tossups, 2 negs, Bonus conversion 19.17
Round 7: W, 210-150, 8 tossups, 0 negs, Bonus conversion 16.25

Team A and Team B have the exact same tossup line of 56-7. Team A has once again outperformed Team B on bonus conversion in every round (each team has a Round 4 bye, so bye effects are irrelevant). Team B wins the bonus conversion tiebreaker, 17.86 to 17.68.
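One way to see why the overall figure can flip: it is a tossup-weighted mean of the per-round figures, not a simple average of them. As a sketch, write $t_r$ for the tossups answered in round $r$ and $b_r$ for that round's bonus conversion (these symbols are introduced here only for illustration):

$$
\mathrm{BC} = \frac{\sum_r t_r\, b_r}{\sum_r t_r},
\qquad
\mathrm{BC}_A = \frac{990}{56} \approx 17.68,
\qquad
\mathrm{BC}_B = \frac{1000}{56} \approx 17.86
$$

Team A's two 20.00 rounds carry weights of only 4 and 9 tossups, while Team B's 19.09 and 19.17 rounds carry weights of 11 and 12.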

Based on these mathematical possibilities, I conjecture that the inherent variation in difficulty between packets necessarily forces bonus conversion to be dependent on the schedule assigned to each team. Suppose that in the round with the statistically easiest packet of the tournament Dartmouth A is playing Brown and a team of comparable strength, let's say Harvard A, is playing a bunch of inexperienced freshmen. Dartmouth will get some tossups against Brown, but on average probably no more than 6 or 7. Meanwhile, Harvard would probably get at least 15. Bonus conversions would probably be near 20 for both teams. Now let's say the next round is the statistically hardest one in the tournament. Harvard gets something like 5 questions against Maryland and Dartmouth cleans up on some inexperienced freshmen. Bonus conversion for both teams is around 12. If both Harvard and Dartmouth play similarly to each other over the rest of the tournament, then Dartmouth has been penalized for playing a weak opponent on a tough packet while Harvard has similarly benefited from playing a weak opponent on an easy packet. If the schedules were reversed, it might be Harvard being penalized by the combination of packet and opponent rather than Dartmouth.
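The two-round Dartmouth/Harvard scenario can be put into numbers. This is a rough sketch only: the 20 and 12 points-per-bonus figures and the tossup counts of 7, 15 and 5 come from the paragraph above, but the 15 tossups for Dartmouth "cleaning up" on the hard packet is an assumed figure.

```python
def overall_conversion(rounds):
    """rounds: list of (tossups_answered, points_per_bonus) pairs."""
    bonus_points = sum(t * ppb for t, ppb in rounds)
    tossups = sum(t for t, _ in rounds)
    return bonus_points / tossups


EASY_PPB, HARD_PPB = 20, 12  # per-bonus conversion on the easy and hard packets

# Dartmouth: strong opponent on the easy packet, weak opponent on the hard one.
# The 15 here is an assumption standing in for "cleans up".
dartmouth = [(7, EASY_PPB), (15, HARD_PPB)]
# Harvard: weak opponent on the easy packet, strong opponent on the hard one.
harvard = [(15, EASY_PPB), (5, HARD_PPB)]

print(f"Dartmouth {overall_conversion(dartmouth):.2f}")  # Dartmouth 14.55
print(f"Harvard   {overall_conversion(harvard):.2f}")    # Harvard   18.00
```

Both teams convert identically on each packet, yet the schedule alone opens a gap of about 3.5 points per bonus over these two rounds.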

It seems that a team benefits more from a strong bracket when the mean overall bonus conversion is lower than the median bonus conversion, and from a weak bracket when the mean overall bonus conversion is higher than the median bonus conversion. What this means is that a tournament whose bonuses skew to the hard side (more packets on the more difficult end of the "average" difficulty than on the easier end) will, over time, favor teams in a weaker bracket, while a tournament whose bonuses skew easy will, over time, favor teams in a stronger bracket. The expected bonus conversion will be the same; however, we do not care about the expected bonus conversion. We do not care about how much higher one team's bonus conversion is than the other's; we simply care whether or not that team's bonus conversion is higher.

When the bonuses skew easy, a team in a bracket with many weak teams will be more likely to be far above its average when it is above its average expected conversion. This is, in turn, balanced out by many more trials slightly below the expected bonus conversion. A team in a bracket with fewer weak teams, however, is more likely to be "just above" the expected bonus conversion. This leads to the team in the tougher bracket having a higher bonus conversion somewhere above 50 percent of the time, depending on the bracket strength and how steeply the tournament skews. Conversely, a team in a bracket with many weak teams will be more likely to be far below its average when it is below its expected conversion, and this is balanced out by many more trials slightly above the expected bonus conversion. A team in a bracket with fewer weak teams is more likely to be just below the expected bonus conversion. This leads to the team in the weaker bracket having a higher bonus conversion somewhere above 50 percent of the time.

In conclusion, I have shown that there exist multiple unlikely scenarios in which a team can score an equal or greater number of tossups than another team, outperform that team on bonus conversion in every round, and still lose the bonus conversion tiebreaker. I have also conjectured that bonus conversion is indeed dependent upon a team's schedule and that it is not completely independent of the overall strength of that schedule. Based on the shape of distributions, a tournament with more packets below the average bonus difficulty will, over the long run, benefit teams in weaker brackets while a tournament with more packets above the average bonus difficulty will, over the long run, benefit teams in stronger brackets.

I hope to perform more simulations and see whether the data does in fact support the theory. Many people argue that bonus conversion is the most fair tiebreaker because it is independent of the strength of the bracket. I claim that there exist conditions where this does not hold and hope to determine what those conditions are and how much bonus conversion is affected under those conditions.

NoahMinkCHS · Post by **NoahMinkCHS** » Mon Apr 28, 2008 6:22 pm

Interesting. I, myself, have never been a huge fan of bonus conversion as a sole tiebreaker.

That said, of course, ANY statistical tiebreaker is going to have its flaws. I'm willing to accept the currently prevailing notion that BC is less-flawed than other methods, and am curious to see how common the problem posed in this hypothetical is. (Probably "not at all", although I'm sure less-extreme cases can be observed.) I'd also be curious to see if BC "fails" more or less often than other forms of tiebreakers.

Wall of Ham · Post by **Wall of Ham** » Mon Apr 28, 2008 6:25 pm

Isn't this Simpson's Paradox?

theMoMA · Post by **theMoMA** » Mon Apr 28, 2008 6:28 pm

NoahMinkCHS wrote:I'm willing to accept the currently prevailing notion that BC is less-flawed than other methods, and am curious to see how common the problem posed in this hypothetical is. (Probably "not at all", although I'm sure less-extreme cases can be observed.) I'd also be curious to see if BC "fails" more or less often than other forms of tiebreakers.

In the case of ACF Nationals, the result that most parallels this situation is that UCI slipped into the second bracket because we had a higher bonus conversion in the prelims. However, unlike the above result, we out-PPB'd UCI on four of the six packets, including the packet where they had a 7.27 conversion on 11 tossups (we had 13.33 on 6 tossups in a loss to Stanford).

It's unfortunate that UCI slipped because of this, because I would have enjoyed finally getting to play them. But although PPB tiebreakers could lead to unfairness in the above scenario, I don't think that this particular result was unfair.

cvdwightw · Post by **cvdwightw** » Mon Apr 28, 2008 7:06 pm

I'm pretty sure that this is not common at all, and I agree with Andrew that, having seen the statistics, the ultimate result of rebracketing at ACF Nationals was completely fair.

The standard argument is that BC more adequately corrects for imbalanced brackets than PPG. PPG takes in a measure of tossup performance as well as bonus performance, and therefore theoretically is more subject to variation based on bracket strength. Since only one team can answer the bonus, bonus conversion is a way to measure how much one team knows versus another.

However, not all teams get the same bonuses and all teams get more bonuses in some rounds than others due to tossup performance. What I am interested in is the effect of the combination of bracket strength and the distribution of expected bonus conversion on a team's overall bonus conversion.

To give a relevant example, we just ran up against a packet in round 2 where we couldn't buy 20 points on a bonus. It happens. From an overall bonus conversion paradigm, we would have been better off playing a stronger team that would have prevented us from getting as many tossups; thus, the contribution of that round to our overall bonus conversion would have been less. On the other hand, we clearly benefited from the round 4 bonuses, and would have been better off playing a weaker team than Dartmouth; Minnesota similarly benefited from the bonuses in that packet, and would have been better off playing a team where they could have gotten more than 5 tossups.

A team's bonus conversion is affected by its tossup performance such that we can be x% confident that the team's true bonus conversion (the conversion it would get if it heard every bonus in the tournament) is within a certain interval. Large deviations from the average bonus conversion are more likely to yield a larger range of the interval than small deviations, and if the packet difficulty skews one way or the other then that confidence interval will reflect it.

Matt Weiner · Post by **Matt Weiner** » Mon Apr 28, 2008 7:12 pm

Wall of Ham wrote:Isn't this Simpson's Paradox?

Looks like it to me.

Kyle · Post by **Kyle** » Mon Apr 28, 2008 7:36 pm

The ICT and ACF Nationals produced somewhat opposite results in the sense that the former finished Harvard-Dartmouth-Minnesota and the latter Minnesota-Dartmouth-Harvard. At the ICT, the team that got an advantage in the final (us) got it because of our record from the prelims. At ACF Nationals, the two teams that got to make the finals (not us) got there because of their records in the prelims. In both cases, I think that the winner was the best team on that day in that format, so I have no complaints, but it does sort of bother me that the prelims matter so much more than the playoffs in determining the undergraduate champion. We were joking at the ICT because we went 5-0 on the first day and could have lost nine consecutive games on the second day and still won the undergrad championship (in the end we went 2-7 on the second day). This is too bad because we would have played intense, interesting games against the other top undergraduate teams, but we ended up playing meaningless* (for us) games against the top overall teams instead (at the ICT) and having already lost our chance at the undergrad title before lunch (at ACF). I realize that the undergrad titles are afterthoughts in planning the brackets, but I don't like winning or losing the undergrad title without getting a chance to play all the top undergrad teams.

(*meaningless in the sense of not affecting our position in the undergrad standings — but meaningful in the sense that every game against Jerry Vinokurov is for me a deeply meaningful experience)

evilmonkey · Post by **evilmonkey** » Mon Apr 28, 2008 10:36 pm

cvdwightw wrote:The standard argument is that BC more adequately corrects for imbalanced brackets than PPG. PPG takes in a measure of tossup performance as well as bonus performance, and therefore theoretically is more subject to variation based on bracket strength. Since only one team can answer the bonus, bonus conversion is a way to measure how much one team knows versus another.

However, not all teams get the same bonuses and all teams get more bonuses in some rounds than others due to tossup performance. What I am interested in is the effect of the combination of bracket strength and the distribution of expected bonus conversion on a team's overall bonus conversion.

I suppose a more accurate way of tiebreaking using bonus conversion would be to only take into account bonuses that both teams answered. This would eliminate a team getting lucky with bonuses as compared to another. Of course, this would also increase the number of things a scorekeeper would need to keep track of (unless you synchronized TU/B: read Bonus 1 with Tossup 1, Bonus 20 with Tossup 20, no matter how many tossups had been converted). Also, it would increase the time needed to resolve ties by about 10-15 minutes. However, if there are serious concerns about the flaws of BC, this may be the way to go.
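A minimal sketch of the shared-bonus idea, assuming bonuses are synchronized with tossups so that each bonus can be identified by packet and number. The data layout, function name, and sample values are all hypothetical.

```python
def common_bonus_conversion(team_x, team_y):
    """Compare two teams only on bonuses that both of them heard.

    Each argument maps (packet, bonus_number) -> points earned on that bonus.
    Returns (x_conversion, y_conversion, shared_count), or None if the teams
    heard no bonuses in common.
    """
    shared = team_x.keys() & team_y.keys()
    if not shared:
        return None
    x_conv = sum(team_x[k] for k in shared) / len(shared)
    y_conv = sum(team_y[k] for k in shared) / len(shared)
    return x_conv, y_conv, len(shared)


# Hypothetical sample data: keys are (packet, bonus number).
team_x = {(1, 1): 30, (1, 2): 10, (1, 5): 20, (2, 3): 0}
team_y = {(1, 1): 10, (1, 5): 20, (2, 3): 10, (2, 4): 30}

x_conv, y_conv, shared_count = common_bonus_conversion(team_x, team_y)
print(f"{x_conv:.2f} vs {y_conv:.2f} on {shared_count} shared bonuses")
```

Only three of the bonuses in this toy data are shared, which hints at the small-sample concern: the comparison gets fairer per bonus but rests on far fewer bonuses.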

vandyhawk · Post by **vandyhawk** » Mon Apr 28, 2008 11:11 pm

This issue is a complicated one. I don't think any result besides playing off the tie is ideal in every situation. Even playing off a tie isn't perfect, though, as the same two closely matched teams could easily split even like a 10 game series. I feel bad for Irvine that they tied for 1st in our prelim bracket but did not make the top playoff bracket. If I had actually been able to pull out any reasonable number of tossup answers in round 2, which according to the round report was the hardest round bonus-wise, they may have had an overall higher ppb. I understand that the editors announced a definite method of choosing the 7th top bracket participant, and I doubt they had considered a scenario as convoluted as the one that transpired, where a team in a 3 way tie for 3rd place got in ahead of a team in a 3 way tie for 1st place. As laid out, I don't think they had any other option than what was done. PPB may be the most un-biased way of breaking ties, especially in light of different prelim bracket strengths (the 2nd prelim bracket was significantly weaker than the other two, by the way), but bonuses aren't the only part of the game. You have to answer tossups to get bonuses first. Take Harvard B for example - they had about the same bonus conversion I did in the prelims (just under 10), but they hardly answered any tossups and therefore did significantly worse than I did. Anyway, this is kind of a convoluted way of saying that I think we need to have some sort of clause that prevents a T-3 team from beating out a T-1 team, especially when their bonus conversions are quite close and subject to the whims of how bonuses fall your way or not. This has nothing to do with the merits of MN or Irvine, or subsequent playoff performance, just general thoughts.

theMoMA · Post by **theMoMA** » Mon Apr 28, 2008 11:42 pm

I'm not sure if that would be entirely fair, Matt. Consider that in the real scenario, we had up-and-down harder opponents to get tossups against than UCI, yet still managed to convert just three fewer tossups (in addition to our fairly significantly superior bonus conversion). They may have been a T-1 team, but who's to say that's any more valuable than being the best T-3 team in a more difficult bracket?

Let me clarify that I do think that the result that happened was the fairest resolution in this year's scenario, but I don't think that straight PPB tiebreakers will always yield that result.

vandyhawk · Post by **vandyhawk** » Tue Apr 29, 2008 12:56 am

theMoMA wrote:I'm not sure if that would be entirely fair, Matt. Consider that in the real scenario, we had up-and-down harder opponents to get tossups against than UCI, yet still managed to convert just three fewer tossups (in addition to our fairly significantly superior bonus conversion). They may have been a T-1 team, but who's to say that's any more valuable than being the best T-3 team in a more difficult bracket?

Let me clarify that I do think that the result that happened was the fairest resolution in this year's scenario, but I don't think that straight PPB tiebreakers will always yield that result.

Yeah, I don't think any way of resolving would really be fair in everyone's eyes, other than playing a match. It's an inherent flaw in the 3 brackets of 7 format I guess. Not that I think any other format would've been better, but it leads to making choices based on a limited sample size that may or may not be ideal given all the circumstances, especially the fact of playing a completely different schedule, and having byes in different rounds.

setht · Post by **setht** » Tue Apr 29, 2008 11:39 am

Dwight's right that PPB is not a perfect reflection of team knowledge, especially in a situation where teams don't all play the same packets and bonus difficulty can vary from packet to packet. One attempt at a quick fix might be to compare a given team's PPB for a given packet with the average PPB of all other teams on that packet, then take the average (over rounds) of that number. This should get rid of the problem of having a team drop to low bonus conversion by earning lots of bonuses in a packet full of tough bonuses, but it would be more susceptible to random fluctuations, since now we're looking at team performance on something like 10 bonuses at a time--hitting a couple bonuses in areas that the team knows really well or doesn't know at all would skew things much more than it does in the total PPB statistic.
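The packet-relative comparison could be sketched roughly as follows. It assumes "compare" means a simple difference (a ratio would be another reading), and the function name and sample numbers are hypothetical.

```python
from statistics import mean


def relative_ppb(ppb_by_packet, team):
    """Average, over packets, of a team's PPB minus the mean PPB of all
    other teams on the same packet.

    ppb_by_packet maps packet -> {team_name: ppb}. Packets the team did not
    play (byes) are skipped. Returns None if nothing can be compared.
    """
    diffs = []
    for ppbs in ppb_by_packet.values():
        if team not in ppbs:
            continue  # bye round for this team
        others = [ppb for name, ppb in ppbs.items() if name != team]
        if others:
            diffs.append(ppbs[team] - mean(others))
    return mean(diffs) if diffs else None


# Hypothetical data: packet 1 is easy, packet 2 is hard.
ppb_by_packet = {
    1: {"X": 18.0, "Y": 16.0, "Z": 14.0},
    2: {"X": 10.0, "Y": 11.0, "Z": 8.0},
}

for name in ("X", "Y", "Z"):
    print(name, relative_ppb(ppb_by_packet, name))
```

Each packet counts equally here regardless of how many bonuses the team heard on it, which removes the tossup weighting but also means a round with very few bonuses can swing the result.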

I think Bryce's proposal is effectively impossible to implement and also susceptible to issues of small statistics.

In the end, I think PPB is the best simple statistic available. It'll be interesting to see if Dwight can come up with any feasible modifications or alternatives that work better.

-Seth

Down and out in Quintana Roo · Post by **Down and out in Quintana Roo** » Tue Apr 29, 2008 12:17 pm

Just want to say that this was a really really interesting read with so many theoreticals and hypotheticals.

There are no perfect solutions to this, but the best thing is that we're talking about it, trying to figure out what is best to do. That's what matters.
