Update to Computer Rankings

Schweizerkas
Lulu
Posts: 83
Joined: Sat May 12, 2007 1:01 am
Location: Stanford, CA

Update to Computer Rankings

Post by Schweizerkas »

I've finally had some time to update my quizbowl computer rankings. You can find them, as usual,
here.

Here's what I've done since my last post about these rankings:

1) I've improved the SQBS text parser, so it's better at recognizing names. For example, it now
automatically realizes that "Jerry from Brown" and "Jerry Vinokurov from Brown" are the same person.
You'll see that the rankings now include 134 players, as opposed to ~60 players previously.

2) I've added stats from some more tournaments.

3) I've actually changed the calculation method a bit. The old Markov-chain method had this unpleasant
feature that people who played more games ended up getting higher ratings, which I had previously tried to artificially
fix by dividing ratings by the number of opponents a player had faced. However, I've come across a
different calculation method that automatically takes care of this (i.e., the rankings from this method are not
biased by how many games a player has played in). I still calculate head-to-head results, and derive the
probability that Player A is better than Player B, in the exact same manner as before. But now, instead of using
a Markov-chain method to calculate a steady-state ranking, I create a system of N simultaneous differential
equations (N = number of players being ranked), and calculate a stationary solution to this system of equations.
It's qualitatively the same idea as before, just using differential equations instead of Markov chains. The
method is based on the ranking system described here.
This method produces slightly different rankings than the old method, but they're similar.

4) I've investigated the issue others brought up about the rankings being biased towards people who play solo, or
with weak teammates. I've reached the following conclusions:
A) If Player A is better than Player B, then on average Player A will outscore Player B in a head-to-head
matchup, REGARDLESS of whether Player A's or Player B's teammates score more points in that match. I think the
explanation for this is that Player A's teammates' buzzes steal points from Player B just as much as they steal
from Player A.
B) Despite this, the rankings still have a bias towards players with weaker teammates. I think this is the
explanation: let's say both Player A and Player B are equally good, and they're both top-10 players. Let's
also say A has really good teammates, while B plays solo. Now, if A plays B, on average they'll roughly split
their head-to-head matches (while A's team will of course almost always beat B's team). So you might think, OK,
that means the rankings aren't biased. However, the bias creeps in when A and B play against bad teams. B will
destroy the bad teams, and since B is playing solo, he'll rack up huge head-to-head wins against all the bad
teams, averaging 15 tossups against these teams. Meanwhile, when player A plays those teams, he'll average maybe
4 tossups against them. Yes, he'll usually win the head-to-head matchups versus all these bad opponents, but it
will only be by margins of 4-2 instead of 15-2. And even if we were to ignore "margin-of-victory", when A is only
averaging 4 tossups/game (because of his awesome teammates), it's pretty easy for A's score in some games to
fluctuate to 0, meaning A could conceivably flat-out lose a head-to-head matchup versus a bad opponent on a bad team.
B of course would never lose a head-to-head matchup to a bad player on a bad team.
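
To make the argument in (B) concrete, here is a minimal simulation sketch. It is not the ranking code itself. The 20-tossup game length and the per-tossup probabilities are assumptions chosen so that A averages about 4 tossups, B averages about 15, and the weak opponent averages about 2, matching the numbers above.

```python
import random

TOSSUPS_PER_GAME = 20  # assumed game length


def simulate(p_star, p_weak, games=20000, seed=0):
    """Simulate head-to-head results between a star and one weak opponent.

    p_star / p_weak are assumed per-tossup probabilities that the star or the
    weak opponent gets the tossup; otherwise a teammate gets it or it goes dead.
    Returns (average margin, fraction of games the star is outscored).
    """
    rng = random.Random(seed)
    losses = 0
    total_margin = 0
    for _ in range(games):
        star = weak = 0
        for _ in range(TOSSUPS_PER_GAME):
            roll = rng.random()
            if roll < p_star:
                star += 1
            elif roll < p_star + p_weak:
                weak += 1
        total_margin += star - weak
        losses += star < weak  # star flat-out loses the head-to-head matchup
    return total_margin / games, losses / games


# Player A: strong teammates, so only about 4 tossups per game (0.20 * 20).
a_margin, a_loss = simulate(p_star=0.20, p_weak=0.10)
# Player B: playing solo, about 15 tossups per game (0.75 * 20).
b_margin, b_loss = simulate(p_star=0.75, p_weak=0.10)

print(f"A: average margin {a_margin:.2f}, outscored in {a_loss:.1%} of games")
print(f"B: average margin {b_margin:.2f}, outscored in {b_loss:.1%} of games")
```

Under these assumptions the expected margins are 2 and 13 tossups by construction, and A gets outscored by the weak opponent in a noticeable share of games while B essentially never does, even though A and B are equally good.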

So, anyway, I'll concede that these rankings have some bias towards players who play solo, but I hope people still
enjoy them. Even with this bias, they still look pretty decent to me. I know where the bias lies, and I might
try fixing it sometime, but I doubt I'll do it anytime soon, since I suspect figuring out exactly how much of a
bias correction needs to be applied will be tricky.

Things you could do to help:
i. Let me know any tournaments that aren't in the database yet. Pointing me to the SQBS files would be great.
I'm aware that Chicago Open and VCU Open aren't in there yet; I just need to get around to it. Eventually, I'd
like to do separate rankings for each school year, so we can go and look back at who were the top players in 2005
(I'm not sure yet how many years you can go back before there aren't enough tournaments with SQBS results).
ii. If you're not in the rankings, first check how many of the tournaments in the database you've played in. I
require a player to have played at least 16 games, so if you've played in fewer than 2 tournaments in the database,
then that's probably the reason you're not listed. If you have played at least 16 games in tournaments in the
database, and you're still not listed, let me know what tournaments you played in (and what name you used in
those tournaments, if it's not obvious), and I'll try to figure out what's wrong.
iii. If you share a first name with another player on your team, it's entirely possible I've swapped your stats
somewhere (I'm talking to you, Steven Katz/LaRue), so if you care about such things, you can double-check your stats.
iv. If anybody wants to break the anonymity of all those mystery MIT players, let me know...
Brian
Stanford University

Frater Taciturnus
Auron
Posts: 2463
Joined: Mon Dec 12, 2005 1:26 pm
Location: Richmond, VA

Re: Update to Computer Rankings

Post by Frater Taciturnus »

hope this helps a bit:

Titanomachy (Maryland):
http://www.studentorg.umd.edu/maqt/2007 ... dings.html

Minnesota Undergrad Tournament:
http://limozeen.googlepages.com/MUT2008_standings.html

Minnesota Undergrad Tournament (VCU Mirror):
http://www.hsquizbowl.org/jbchp/jbchp_standings.html

MCMNT 2008:
https://netfiles.uiuc.edu/dtaylor4/MCMN ... niq=prqpwm

FICHTE:
http://www.studentorg.umd.edu/maqt/2008 ... dings.html


Note:
-The "George Berry" who played for the CNU/JSR team at EFT as "Kearney" (W&M), W&M for ACF Fall (VCU) and Titanomachy (UMD), and J. Sargeant Reynolds at MUT and FICHTE are the same person.
Last edited by Frater Taciturnus on Fri Aug 22, 2008 4:23 pm, edited 1 time in total.
George Berry
--------------
J. Sargeant Reynolds CC 2008, 2009, 2014
Virginia Commonwealth 2010, 2011, 2012, 2013,
Douglas Freeman 2005, 2006, 2007

theMoMA
Forums Staff: Administrator
Posts: 5784
Joined: Mon Oct 23, 2006 2:00 am

Re: Update to Computer Rankings

Post by theMoMA »

I'm not so sure about the efficacy of this ranking. It seems to do a fine job of showing who scores more points than other people, but it seems to me that this isn't very revolutionary, cool, or useful. Raw, context-free points seem to be a poor measure of actual player value. You will have a hard time convincing me that Jason Keller is a more valuable player than Eric Mukherjee. In this way, I don't really think this program passes the "laugh test."

Would it be hard to rework this to spit out results based on PATH or some kind of stat that adjusts for teammate prowess, just for comparison's sake?
Andrew Hart
Minnesota alum

cvdwightw
Auron
Posts: 3446
Joined: Tue May 13, 2003 12:46 am
Location: Southern CA

Re: Update to Computer Rankings

Post by cvdwightw »

Yeah, I think Brian already hit the nail on the head with his assessment of the major problems. For instance, just using an example from my own stats, it looks like I get the same outcome for a good game in which I was outplayed by #8 Matt Keller (70-90) as I do for a game in which I laid an egg against #40 Bruce Arthur (0-20). Essentially, I think this system penalizes players for having bad games against bad and average opponents more than it rewards players for having good games against the best opponents.

The idea of rewarding players for having good games against great opponents, while not severely penalizing players for bad games against bad opponents, was the major upside of the Litvak slope statistic. The major problem of that statistic was that it could be artificially inflated by "tanking" against bad teams. I see this as kind of the opposite of that statistic - if one is trying to maximize one's ranking, there's not much incentive to play well against the best players, while there's every incentive not to tank against other opponents.

Oh, also, the rankings on the homepage don't match the rankings used when I click on a player's name (e.g. I'm #8 if someone plays me, but I'm #20 in the rankings). I don't know how much of a difference that makes.
Dwight Wynne
socalquizbowl.org
UC Irvine 2008-2013; UCLA 2004-2007; Capistrano Valley High School 2000-2003

"It's a competition, but it's not a sport. On a scale, if football is a 10, then rowing would be a two. One would be Quiz Bowl." --Matt Birk on rowing, SI On Campus, 10/21/03

"If you were my teammate, I would have tossed your ass out the door so fast you'd be emitting Cerenkov radiation, but I'm not classy like Dwight." --Jerry

grapesmoker
Sin
Posts: 6368
Joined: Sat Oct 25, 2003 5:23 pm
Location: NYC

Re: Update to Computer Rankings

Post by grapesmoker »

I'd like to note that Jason Keller is a glaring outlier in these rankings. He shows up high on the list because he scores a lot of points against weak teams, a situation that statistically speaking occurs far more often than scoring few points against good teams (since there are many more weak teams than strong teams). Most of the rankings, while maybe not exactly the same as I would have chosen if I were picking a team, are pretty close to the truth. It seems to me that beyond the top 10 players, many of the distinctions between players become relatively subjective depending on what niches you are trying to fill. Nevertheless, I think it's cool that Brian has put this together; it's always fun to play with numbers, as long as we remember that what matters in the end is who wins.
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
code ape, loud voice, general nuissance

No Rules Westbrook
Auron
Posts: 1232
Joined: Mon Nov 22, 2004 1:04 pm

Re: Update to Computer Rankings

Post by No Rules Westbrook »

Yeah, I think there is a clear bias in these rankings leaning towards either: (1) players who typically play in weak regions or who play against a lot of weak teams, or (2) players who typically play solo or with fairly weak teammates.

I think these biases can be seen in the ranking of players like Matt Keller (who is certainly a skilled player, but often played in a weak region where he could be counted on to dominate the stats), Matt Alford, and most obviously Jason Keller.
Ryan Westbrook, no affiliation whatsoever.

I am pure energy...and as ancient as the cosmos. Feeble creatures, GO!

Left here since birth...forgotten in the river of time...I've had an eternity to...ponder the meaning of things...and now I have an answer!

cvdwightw
Auron
Posts: 3446
Joined: Tue May 13, 2003 12:46 am
Location: Southern CA

Re: Update to Computer Rankings

Post by cvdwightw »

No Rules Westbrook wrote:Yeah, I think there is a clear bias in these rankings leaning towards either: (1) players who typically play in weak regions or who play against a lot of weak teams, or (2) players who typically play solo or with fairly weak teammates.
I think as more national tournament data is entered, we'll see less of a bias toward (1), at least, if I'm correctly assuming that a head-to-head victory over a weak opponent is worth less than a victory over a strong opponent. There's definitely a bias toward (2), and this has historically been one of the problems with any kind of quizbowl ranking stat.
Dwight Wynne

theMoMA
Forums Staff: Administrator
Posts: 5784
Joined: Mon Oct 23, 2006 2:00 am

Re: Update to Computer Rankings

Post by theMoMA »

This is why I wonder if using PATH might not be a good solution to this problem. I think PATH manages to correctly value players who play with good teammates in a way that pretty much no other stat does.

The current problem with the system is that it says that a player who goes 5-3 against a team of four players who each go 3-0 is the best player in the room. Basically, the formula rewards things that aren't really conducive to winning games. So you do get a pretty nice ranking of who's expected to score more points, but who doesn't mostly know this from looking at various individual stats, anyway?
Andrew Hart
Minnesota alum

vandyhawk
Tidus
Posts: 584
Joined: Sat Dec 13, 2003 3:42 am
Location: Seattle

Re: Update to Computer Rankings

Post by vandyhawk »

No Rules Westbrook wrote:Yeah, I think there is a clear bias in these rankings leaning towards either: (1) players who typically play in weak regions or who play against a lot of weak teams, or (2) players who typically play solo or with fairly weak teammates.

I think these biases can be seen in the ranking of players like Matt Keller (who is certainly a skilled player, but often played in a weak region where he could be counted on to dominate the stats), Matt Alford, and most obviously Jason Keller.
While Ryan's comments certainly have some truth to them (and I obviously take no offense), there is actually only one tournament in the list of those included where I played in the southeast, and even then, I had Paul with me. Playing solo at Nats this year certainly didn't hurt my standing though... To help with including some southeastern tournaments, here are links to stats from tourneys we've hosted:

Commodore Classic / William Wirt (March '06): http://studentorgs.vanderbilt.edu/colle ... dings.html

ACF Fall 06: http://studentorgs.vanderbilt.edu/colle ... dings.html

EFT2: http://studentorgs.vanderbilt.edu/colle ... dings.html

ACF Nationals 07: http://studentorgs.vanderbilt.edu/colle ... dings.html
(Note here - if you want just prelims, get rid of the "playoff" in the URL. Also, Sargon = Paul Gauthier)

It seems Reg's '08 is already included, all like 5 collegiate people who actually attended at our site.
Matt Keller
Vanderbilt (alum)
ACF editor (emeritus)
NAQT editor (emeritus)

Schweizerkas
Lulu
Posts: 83
Joined: Sat May 12, 2007 1:01 am
Location: Stanford, CA

Re: Update to Computer Rankings

Post by Schweizerkas »

cvdwightw wrote:For instance, just using an example from my own stats, it looks like I get the same outcome for a good game in which I was outplayed by #8 Matt Keller (70-90) as I do for a game in which I laid an egg against #40 Bruce Arthur (0-20).
This is false. You would indeed get the same outcome for a (70-90) game versus Matt Keller as a (0-20) game versus Matt Keller, which is probably not ideally what should happen. However, a close game versus Matt would yield a better rating than a close game versus Bruce. This rating system certainly accounts for opponent strength, which is one of its best features.
cvdwightw wrote: Oh, also, the rankings on the homepage don't match the rankings used when I click on a player's name (e.g. I'm #8 if someone plays me, but I'm #20 in the rankings). I don't know how much of a difference that makes.
Thanks for pointing that out. It's fixed now (the rankings on the homepage were the correct ones).

Andrew, I'll try calculating PATH stats for all the tournaments in the database. I'll probably do this early next week.

So that people can better understand some of the motivation behind this rating system, I'd like to explain what I think one of its best features is: its ability to compare quizbowlers from separate regions who never play each other. The problem in making comparisons between regions is obviously that some regions are significantly stronger than others. Of course lots of ratings or statistics have some kind of "strength of schedule" factor built in, but many (most?) of these rating systems have a flawed approach to calculating opponent strength. Take college basketball's RPI as a(n) (in)famous example. The RPI adds in some points for a team's opponents' winning percentage, and the opponents' opponents' winning percentage. The problem is, what if you and local teams in your region are mainly just playing against each other? No matter how good or bad the teams in your region are, somebody's got to be winning the games. You can have an isolated region that's really good, and another isolated region that's really bad, and the RPI and other rating systems won't be able to tell them apart.
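
The isolated-region problem can be shown with a small sketch. It assumes the commonly cited RPI weights (25% own winning percentage, 50% opponents' winning percentage, 25% opponents' opponents' winning percentage), two hypothetical four-team regions that never play each other, and the simplification that the stronger team always wins. The team names and strength numbers are invented for illustration.

```python
from itertools import combinations


def round_robin(strengths):
    """Return (winner, loser) pairs; assumes the stronger team always wins."""
    return [(a, b) if strengths[a] > strengths[b] else (b, a)
            for a, b in combinations(strengths, 2)]


def opponents(team, games):
    """List every opponent the team faced, one entry per game."""
    return [w if l == team else l for w, l in games if team in (w, l)]


def win_pct(team, games, exclude=None):
    """Winning percentage, optionally ignoring games against `exclude`."""
    played = [g for g in games if team in g and exclude not in g]
    return sum(g[0] == team for g in played) / len(played)


def owp(team, games):
    """Average winning percentage of the team's opponents."""
    opps = opponents(team, games)
    return sum(win_pct(o, games, exclude=team) for o in opps) / len(opps)


def rpi(team, games):
    """RPI-style rating: 0.25 * WP + 0.50 * OWP + 0.25 * OOWP."""
    opps = opponents(team, games)
    oowp = sum(owp(o, games) for o in opps) / len(opps)
    return 0.25 * win_pct(team, games) + 0.50 * owp(team, games) + 0.25 * oowp


# Hypothetical regions: every "S" team is far stronger than every "W" team.
strong = {"S1": 400, "S2": 300, "S3": 200, "S4": 100}
weak = {"W1": 40, "W2": 30, "W3": 20, "W4": 10}

strong_rpi = sorted(round(rpi(t, round_robin(strong)), 9) for t in strong)
weak_rpi = sorted(round(rpi(t, round_robin(weak)), 9) for t in weak)

print(strong_rpi)
print(weak_rpi)
print("Identical ratings:", strong_rpi == weak_rpi)
```

Because each region only plays itself, the two sets of ratings come out identical even though one region is ten times stronger by construction, which is the failure described above.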

My QB rating system, however, avoids these problems by not looking at some overall, average "strength of schedule", but instead keeping track of exactly what opponents a quizbowler faces, and how that quizbowler performs against each of those opponents. Let's say we want to compare Bozo, who only plays quizbowl in the southern circuit, to Dumbo, who only plays in the midwest circuit, and let's say they both average 20 ppg (and have the same PATH, and everything). If we want to compare them, some rating systems might just look at Bozo's and Dumbo's opponents, figure out which set of opponents had a higher average ppg (or PATH, or whatever), and declare the winner that way. However, this is obviously a flawed approach, since in theory two mostly-isolated circuits could contain players with similar ppg but vastly different talent levels. Instead, my rating system says (very roughly speaking): "Bozo played Matt Keller head-to-head, and from that game, I think Bozo's 30% as good as Matt, and Matt played Bruce at ACF Nats, and on that basis, I think Matt's twice as good as Bruce, and Bruce played Dumbo at EFT, and on the basis of that match, I think Bruce is 20% better than Dumbo, and therefore I think Dumbo is better than Bozo." Of course, it's more complicated than that, since there are probably multiple "paths" connecting Bozo and Dumbo, and this rating system is also trying to simultaneously rank over 100 different players. But basically, that's what this huge system of differential equations is trying to do: to compare quizbowlers across the country by connecting them via common opponents, sort of like 6-degrees-of-separation.
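
As a rough worked example of the single chain quoted above (not the actual system of differential equations, which handles many paths and all players at once), the pairwise estimates can be multiplied along the path. The dictionary layout and the `chain_ratio` helper are assumptions made up for this sketch.

```python
# Pairwise estimates from the example, stored as strength(x) / strength(y).
ratios = {
    ("Bozo", "Matt"): 0.30,   # Bozo is 30% as good as Matt
    ("Matt", "Bruce"): 2.00,  # Matt is twice as good as Bruce
    ("Bruce", "Dumbo"): 1.20, # Bruce is 20% better than Dumbo
}


def chain_ratio(path, ratios):
    """Multiply pairwise strength ratios along a path of common opponents."""
    result = 1.0
    for x, y in zip(path, path[1:]):
        result *= ratios[(x, y)]
    return result


bozo_vs_dumbo = chain_ratio(["Bozo", "Matt", "Bruce", "Dumbo"], ratios)

# 0.30 * 2.00 * 1.20 = 0.72, so along this path Bozo is about 72% as good
# as Dumbo, i.e. Dumbo is rated the better player.
print(f"Bozo / Dumbo = {bozo_vs_dumbo:.2f}")
print("Better player:", "Bozo" if bozo_vs_dumbo > 1 else "Dumbo")
```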
Brian
Stanford University

Schweizerkas
Lulu
Posts: 83
Joined: Sat May 12, 2007 1:01 am
Location: Stanford, CA

Re: Update to Computer Rankings

Post by Schweizerkas »

I've added PATH statistics to the website now. If you click on the "PATH" link at the top of the column, you can see every player sorted by PATH.
May Batman Prevail? wrote: -The "George Berry" who played for the CNU/JSR team at EFT as "Kearney" (W&M), W&M for ACF Fall (VCU) and Titanomachy (UMD), and J. Sargeant Reynolds at MUT and FICHTE are the same person.
Thanks. All your games should be in the database now.
Brian
Stanford University

Mechanical Beasts
Banned Cheater
Posts: 5673
Joined: Thu Jun 08, 2006 10:50 pm

Re: Update to Computer Rankings

Post by Mechanical Beasts »

I'm assuming I'm not on the list, while Brian Young (my teammate from ACF Nationals) is, because I haven't played enough listed tournaments (i.e. only ACF Nationals)?
Andrew Watkins

Captain Sinico
Auron
Posts: 2859
Joined: Sun Sep 21, 2003 1:46 pm
Location: Champaign, Illinois

Re: Update to Computer Rankings

Post by Captain Sinico »

Can you use PATH rather than points to determine game outcomes? I think that's what people want.

MaS
Mike Sorice
Coach, Centennial High School of Champaign, IL (2014-) & Team Illinois (2016-2018)
Alumnus, Illinois ABT (2000-2002; 2003-2009) & Fenwick Scholastic Bowl (1999-2000)
ACF
IHSSBCA
PACE
