Rules for Human / Computer Quiz Bowl Matches


Post by ezubaric »

Last night, we had our first exhibition match between a computer that can play quiz bowl and a team of former Jeopardy champions. It was a lot of fun, but ultimately inconclusive. The game ended in a tie.

At the risk of hubris, I hope that this will become a more common phenomenon in the future. That said, our first exhibition match had some rough edges and issues. We did lots of things wrong, and I'd like to use this opportunity to review both what we did suboptimally and how I think things should be done in the future.

This was a very new process, and I think there are still lots of wrinkles to be ironed out. We made design choices that seemed reasonable at the time, but which I think made the game somewhat unfair to one side or the other.

And to add cuteness to my hubris, I'm proposing these be called the "Bradbury rules" as an homage to the "Queensberry rules" and to how this whole discussion is rather sci-fi.

Below I give my reflections and recommendations. I'd really like people to chime in with their thoughts. What do people think? Are there issues I overlooked?

1. The process of reading questions

What we did: As I read the questions, I pressed a button to reveal each word. The second I pressed a button, the computer could use that word to offer an answer. This was effectively instantaneous for the computer (for various engineering reasons that are unimportant for this conversation).

Why this was wrong: If you watch the video (which I will upload once I'm done traveling and have real Internet), you can spot the exact second when I realize that we made a bad design decision. I'm reading a question where the answer is William Makepeace Thackeray.

I've just read the word "Amelia", and I realize that if I were playing on this packet, my logic would be that if the next word is "Sedley", I'll buzz in with Thackeray. However, the system has a huge advantage. The second I hit the button, the computer can answer: before I am even able to say the word, it will have processed the information and used it, and the humans will not have heard it yet.

What we should do in the future: Quiz bowl is historically a spoken medium, and we need to accept and appreciate that there is inherent latency in that process. My proposal would be that the computer only gets access to a word when the *following* word is available for the moderator to read.
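A minimal sketch of that delayed-release rule, assuming the question is already split into words. The function name `delayed_release` and the `None` marker for "nothing left to read" are illustrative choices, not part of our actual system.

```python
from typing import Iterator, List, Optional, Tuple


def delayed_release(words: List[str]) -> Iterator[Tuple[Optional[str], List[str]]]:
    """Yield (word the moderator may now read, words released to the computer)."""
    for i, word in enumerate(words):
        # Revealing word i to the moderator releases only words 0..i-1,
        # i.e. the words that have already been spoken aloud.
        yield word, words[:i]
    # One final step once the last word has been spoken: release everything.
    yield None, list(words)


if __name__ == "__main__":
    for to_read, released in delayed_release(["Amelia", "Sedley"]):
        print(to_read, released)
```

With this rule, the computer only learns "Sedley" after the moderator has had the chance to say it, so the humans and the computer hear the clue at roughly the same time.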

2. Seeing what the computer is thinking.

What we did: During our demo, we were showing what the computer was thinking; i.e., the output from the guesser and the associated scores.

Why this was wrong: The computer built up an early lead, but the humans quickly figured out that if the computer was in the right ballpark of answers, it would likely answer soon. However, if the current guesses were completely outlandish, the humans could cool their heels and be more conservative. This let them rally; if they had this strategy from the start, they probably would have won.

What we should do in the future: The output is valuable for spectators and scientists, but should not be visible to opponents.
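One way to picture this is a display filter keyed on who is looking. The sketch below is hypothetical: the audience names, the `(answer, score)` tuple shape, and the function `visible_guesses` are assumptions made for illustration.

```python
from typing import List, Tuple

Guess = Tuple[str, float]  # (answer, score); a hypothetical shape for guesser output

AUDIENCES_WITH_FULL_VIEW = {"spectator", "scientist"}


def visible_guesses(audience: str, guesses: List[Guess], has_buzzed: bool) -> List[Guess]:
    """Return the part of the guesser output that this audience may see."""
    if audience in AUDIENCES_WITH_FULL_VIEW:
        return list(guesses)
    # Opponents see nothing until the computer buzzes, and then only the
    # answer it actually gave (assumed here to be the top-scoring guess).
    if has_buzzed and guesses:
        return [max(guesses, key=lambda guess: guess[1])]
    return []
```

Before a buzz, `visible_guesses("opponent", ...)` returns an empty list, so the human team can neither time its buzzes off the computer's confidence nor rule out answers the computer is considering.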

3. Asymmetric rules.

What we did: We stated that the computer could not protest, so R declared that the humans could not protest either. We also had a bug in the interaction protocol that made the end of questions problematic. So we encouraged the humans to buzz in when they were confident of the answer.

Why this was bad: I think this was mostly okay in this match, except that the human team once buzzed early on "Aethelred the Unready" after a computer neg with "Genghis Khan". This is easily fixable, and was mostly shoddy programming on my part.

What we should do: The normal rules of quiz bowl should be followed as closely as possible and be applied equally to both teams. If the computer can't protest or be prompted, tough luck.

4. Question delivery / reproducibility

What we did: Logistically, we were in a tight spot because NAQT had its normal production crunch and we needed the questions in specific, electronic format.

So, NAQT sequestered the staffers for the two rounds our questions were used. We got the questions early via e-mail so we could: fix Unicode issues (we need to switch to Python 3), associate questions with Wikipedia pages (so we can know whether our answer is correct or not), and format them as a CSV file.
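For concreteness, here is a rough Python 3 sketch of those three steps. It is not our actual preprocessing script: the column names, the NFKC normalization choice, and the `answer_to_page` lookup table are all assumptions for illustration.

```python
import csv
import io
import unicodedata
from typing import Dict, Iterable, Tuple


def clean(text: str) -> str:
    """Normalize Unicode (NFKC) and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).split())


def questions_to_csv(questions: Iterable[Tuple[str, str]],
                     answer_to_page: Dict[str, str]) -> str:
    """Write (question text, answer) pairs as CSV with a Wikipedia page column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "text", "answer", "wikipedia_page"])
    for question_id, (text, answer) in enumerate(questions):
        answer = clean(answer)
        # An empty page means the answer could not be linked and needs a human look.
        writer.writerow([question_id, clean(text), answer, answer_to_page.get(answer, "")])
    return buffer.getvalue()
```

Leaving the page column empty for unmatched answers, rather than guessing, keeps the linking step auditable.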

Why this was bad: I think it was okay this time, but it assumed a level of trust on both sides that, while fine for a friendly match like this, sets a bad precedent.

We're also open sourcing our code so folks can verify that we did what we said we would. However, the match isn't completely reproducible because (I) in the last minutes some parameter settings got lost to frantic modifications combined with multiple computers / files and poor record keeping, and (II) NAQT won't let us distribute their data (rightly so, as that's a significant income stream for them).
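On the record-keeping point, one cheap safeguard would be to snapshot the exact parameter settings right before a match and store a fingerprint alongside the results. This is a sketch of the idea only; the function name and the example parameter names are hypothetical, not QANTA's real settings.

```python
import hashlib
import json
from typing import Dict, Tuple


def snapshot_parameters(params: Dict[str, object]) -> Tuple[str, str]:
    """Return (canonical JSON text, SHA-256 fingerprint) for a parameter set."""
    # sort_keys makes the text independent of the order settings were added in.
    text = json.dumps(params, sort_keys=True, indent=2)
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return text, digest


if __name__ == "__main__":
    # Hypothetical parameter names, for illustration only.
    text, digest = snapshot_parameters({"buzz_threshold": 0.6, "max_guesses": 5})
    print(digest)
    print(text)
```

Saving the JSON text with the match logs and quoting the fingerprint in any write-up lets someone check later that a rerun used the same settings.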

One nice benefit of open sourcing our code is that there will be more people ballparking performance using only open-domain inputs so we can see how much the NAQT data help. If they don’t help that much, it might be better for openness to only use publicly available data.

What we should do: I would propose one of two models for "official" human-computer quiz bowl tournaments in the future. One requires more human effort / coordination and the other requires more engineering effort.

A) Escrow: The computer teams must submit their code and data to the tournament organizers, who will use machines with minimum hardware requirements. This requires better documentation and code robustness than we currently have with QANTA.

B) Server / Client: Participants will agree on a format for machine-readable questions. Participants bring their systems to the site and connect to a tournament computer. That computer provides words one at a time via an agreed upon communication protocol (direct TCP/IP would allow for non-local computers to participate).
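To make option (B) concrete, here is a toy sketch of a word-at-a-time protocol over TCP. No format has been agreed on yet, so everything about the wire format is an assumption: one JSON object per line, the field names `position`, `word`, `buzz`, and `guess`, and the lock-step rule that the player replies after every word.

```python
import json
import socket
import threading
from typing import Callable, List, Optional, Tuple


def serve_question(listener: socket.socket, words: List[str], result: dict) -> None:
    """Tournament side: send one word per line and wait for the player's reply."""
    conn, _ = listener.accept()
    with conn, conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        for position, word in enumerate(words):
            writer.write(json.dumps({"position": position, "word": word}) + "\n")
            writer.flush()
            line = reader.readline()
            if not line:  # player disconnected
                return
            reply = json.loads(line)
            if reply["buzz"]:
                result.update(position=position, guess=reply["guess"])
                return


def play(address: Tuple[str, int],
         decide: Callable[[List[str]], Optional[str]]) -> Optional[str]:
    """Player side: read words, reply after each one, stop after buzzing."""
    seen: List[str] = []
    with socket.create_connection(address) as sock, \
            sock.makefile("r", encoding="utf-8") as reader, \
            sock.makefile("w", encoding="utf-8") as writer:
        for line in reader:
            seen.append(json.loads(line)["word"])
            guess = decide(seen)
            writer.write(json.dumps({"buzz": guess is not None, "guess": guess}) + "\n")
            writer.flush()
            if guess is not None:
                return guess
    return None


def demo() -> dict:
    """Run one question between a local server thread and a toy player."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick
    result: dict = {}
    words = "This author created Amelia Sedley".split()
    server = threading.Thread(target=serve_question, args=(listener, words, result))
    server.start()

    def decide(seen: List[str]) -> Optional[str]:
        # Toy player: buzz only once the giveaway name has been sent.
        return "William Makepeace Thackeray" if "Sedley" in seen else None

    play(listener.getsockname(), decide)
    server.join()
    listener.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Because the server controls when each word goes out, it is also the natural place to enforce timing rules, and a remote player only needs the address to take part.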

My preference would be for (B), since that would allow for computer-only tournaments to happen virtually. I hope to do something like that next summer, assuming I can get participants to show up.

5. Have a framework for tiebreakers.

What we did: We had no idea there could be a tie. We thought it would be decisive one way or another.

Why this was bad: I think this is self-explanatory.

What we should do: Have a framework for tie-breakers, preferably following established rules.

TLDR: Future human-computer matches should:

1. Only provide input to computers after it has been completely spoken by the moderator.
2. Not allow human opponents to see the computer's thought process.
3. Apply rules consistently to both sides.
4. Use either an escrow or client/server model to preserve question security.
5. Have a framework for tiebreakers.

Edit: Missing negation (thanks Jonah!)
Last edited by ezubaric on Sun May 31, 2015 7:35 pm, edited 1 time in total.
Jordan Boyd-Graber
UMD (College Park, MD), Faculty Advisor 2018-present
UC Boulder, Founder / Faculty Advisor 2014-2017
UMD (College Park, MD), Faculty Advisor 2010-2014
Princeton, Player 2004-2009
Caltech (Pasadena, CA), Player / President 2000-2004
Ark Math & Science (Hot Springs, AR), Player 1998-2000
Monticello High School, Player 1997-1998

Human-Computer Question Answering:
http://qanta.org/

Re: Rules for Human / Computer Quiz Bowl Matches

Post by jonah »

ezubaric wrote:What we should do in the future: quiz bowl is historically a spoken medium, and we need to accept and appreciate that there is inherent latency in that process. My proposal would be that the computer only gets access to a word when the *following* word is available for the moderator to read.
Could the computer use voice recognition to get the words? That seems more realistic, especially given the reality that moderator mispronunciations etc. happen and screw players up (so the computer should get screwed too).
ezubaric wrote:What we should do in the future: The output is valuable for spectators and scientists, but should be visible to opponents.
"should not be visible", right?

As I mentioned to you right after the match, the humans also have the advantage of knowing that any displayed answer the computer is thinking about will be wrong. So if a human was also thinking of that answer, s/he can eliminate it based on the computer showing it.
Jonah Greenthal
National Academic Quiz Tournaments

Re: Rules for Human / Computer Quiz Bowl Matches

Post by Excelsior (smack) »

jonah wrote:Could the computer use voice recognition to get the words? That seems more realistic, especially given the reality that moderator mispronunciations etc. happen and screw players up (so the computer should get screwed too).
I don't think commercial voice recognition technology is at a point where it'll have good success on processing quizbowl-speech (fast, lots of foreign words and polysyllabic science words, etc.). (At least, that's the impression I get from my limited experience with Dragon.)
Ashvin Srivatsa
Corporate drone '?? | Yale University '14 | Sycamore High School (OH) '10

Re: Rules for Human / Computer Quiz Bowl Matches

Post by jonah »

I share that suspicion, but I thought we were talking about idealism rather than reality.

Re: Rules for Human / Computer Quiz Bowl Matches

Post by ezubaric »

There are things you could do. There's a kind of speech recognition called forced Viterbi decoding (also known as forced alignment) where you know what's going to be said, you just don't know when (they do this for captioning movies where you have the script). That could probably be used here if the speech recognizer sat on the server and sent each word electronically to the players as it was spoken.
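A toy illustration of the idea, with heavy caveats: a real system scores audio frames with an acoustic model, whereas this sketch pretends each "frame" is a noisy text token and uses string similarity as a stand-in score. The function name `forced_align` and the frame representation are assumptions for illustration only. The dynamic program is the standard Viterbi recursion over a left-to-right chain in which each frame either stays on the current script word or advances to the next one.

```python
import math
from difflib import SequenceMatcher
from typing import List


def forced_align(script: List[str], heard: List[str]) -> List[int]:
    """Return, for each script word, the index of the first frame aligned to it."""
    n, t = len(script), len(heard)
    if n == 0 or t < n:
        raise ValueError("need at least one frame per script word")

    def emit(frame: str, word: str) -> float:
        # Stand-in for an acoustic model: log of string similarity.
        similarity = SequenceMatcher(None, frame.lower(), word.lower()).ratio()
        return math.log(similarity + 1e-6)

    neg_inf = float("-inf")
    best = [[neg_inf] * n for _ in range(t)]
    advanced = [[False] * n for _ in range(t)]
    best[0][0] = emit(heard[0], script[0])  # alignment must start on the first word
    for i in range(1, t):
        for j in range(min(i + 1, n)):
            stay = best[i - 1][j]
            move = best[i - 1][j - 1] if j > 0 else neg_inf
            advanced[i][j] = move > stay
            best[i][j] = max(stay, move) + emit(heard[i], script[j])

    # Backtrace from the last word at the last frame to find word start frames.
    starts = [0] * n
    j = n - 1
    for i in range(t - 1, 0, -1):
        if advanced[i][j]:
            starts[j] = i
            j -= 1
    return starts


if __name__ == "__main__":
    print(forced_align(["amelia", "sedley"], ["amelia", "amelia", "sedly", "sedley"]))
```

The server could release each word to the computer players once the alignment says the moderator has moved past it, which would give the computer the same latency the humans experience.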