Correcting errors in the set

Important Bird Area
Forums Staff: Administrator
Posts: 5604
Joined: Thu Aug 28, 2003 3:33 pm
Location: San Francisco Bay Area

Correcting errors in the set

Post by Important Bird Area »

I just went through and fixed the list of known errors in the SCT set before we send it to George Mason and Chipola. To my knowledge I caught up with everything reported in this forum and in the #sct channel Saturday night. Something else that's a problem that hasn't been noted yet? Here's your thread to keep from inflicting those troubles on mid-Atlantic and CC players. Thanks to everyone who sent in comments over the last week.
Jeff Hoppes
President, Northern California Quiz Bowl Alliance
former HSQB Chief Admin (2012-13)
VP for Communication and history subject editor, NAQT
Editor emeritus, ACF

"I wish to make some kind of joke about Jeff's love of birds, but I always fear he'll turn them on me Hitchcock-style." -Fred

Sen. Estes Kefauver (D-TN)
Chairman of Anti-Music Mafia Committee
Posts: 5640
Joined: Wed Jul 26, 2006 11:46 pm
Location: Columbia, MO

Re: Correcting errors in the set

Post by Sen. Estes Kefauver (D-TN) »

I am concerned (not for myself, but rather for everyone else out of fairness's sake) about what kinds of changes are being made. If there are factual errors in the set, or any other changes that might make it easier for teams playing the new set to score points, doesn't this punish everyone who played the old sets?
Charlie Dees, North Kansas City HS '08
"I won't say more because I know some of you parse everything I say." - Jeremy Gibbs

"At one TJ tournament the neg prize was the Hampshire College ultimate frisbee team (nude) calender featuring one Evan Silberman. In retrospect that could have been a disaster." - Harry White

MicroEStudent
Rikku
Posts: 462
Joined: Tue Feb 17, 2009 11:20 pm

Re: Correcting errors in the set

Post by MicroEStudent »

Katamari Damacy wrote:I am concerned (not for myself, but rather for everyone else out of fairness's sake) about what kinds of changes are being made. If there are factual errors in the set, or any other changes that might make it easier for teams playing the new set to score points, doesn't this punish everyone who played the old sets?
I brought this up in another thread:
bt_green_warbler wrote:
Old Man of the Mountain wrote:Just a thought, but is editing questions for clarity or any other reason a good idea when there is another site still to go? Is there a chance that these edits allow a team in Region 4 to answer additional questions correctly, thus improving their chances for a wild card bid? I know that this is potentially a small consideration, but I think that all sites should hear the same exact questions even if there are errors.
Obviously the number of changes that we're making is relatively small; we believe the impact on answerability would be tiny. I for one consider "deliberately using questions with known errors in them" to be bad quizbowl.
Nathaniel Kane
RIT '09, '11 (BS Microelectronic Engineering, MS Microelectronic Engineering)

Sen. Estes Kefauver (D-TN)
Chairman of Anti-Music Mafia Committee
Posts: 5640
Joined: Wed Jul 26, 2006 11:46 pm
Location: Columbia, MO

Re: Correcting errors in the set

Post by Sen. Estes Kefauver (D-TN) »

OK, but what is defined as too small a change to affect proper D-value rankings? Is there going to be a very slight correction or something if it appears that maybe these changes do more than expected?
Charlie Dees, North Kansas City HS '08
"I won't say more because I know some of you parse everything I say." - Jeremy Gibbs

"At one TJ tournament the neg prize was the Hampshire College ultimate frisbee team (nude) calender featuring one Evan Silberman. In retrospect that could have been a disaster." - Harry White

The Friar
Wakka
Posts: 158
Joined: Fri Jul 10, 2009 2:39 pm

Re: Correcting errors in the set

Post by The Friar »

As a side note, under an algorithm with an explicit adjustment for field strength, the effect on D-values of changing questions between sites may be larger than expected due to positive feedback: teams get credit not only for scoring more points, but for doing so against teams who themselves scored more points, than they would have on the set played at another site.

Nonetheless, this problem would be handled even worse in something like FRIAR, which estimates the difficulty of each individual question assuming pretty strongly that the question itself is not easier at one site than another. Hey, it seemed a reasonable modeling strategy at the time.

EDIT: spelling
Gordon Arsenoff
Rochester '06
WUStL '14 (really)

Developer of WUStL Updates Statistics Live!

cvdwightw
Auron
Posts: 3446
Joined: Tue May 13, 2003 12:46 am
Location: Southern CA

Re: Correcting errors in the set

Post by cvdwightw »

The Friar wrote:As a side note, under an algorithm with an explicit adjustment for field strength, the effect on D-values of changing questions between sites may be larger than expected due to positive feedback: teams get credit not only for scoring more points, but for doing so against teams who themselves scored more points, than they would have on the set played at another site.

Nonetheless, this problem would be handled even worse in something like FRIAR, which estimates the difficulty of each individual question assuming pretty strongly that the question itself is not easier at one site than another. Hey, it seemed a reasonable modeling strategy at the time.
You're absolutely right. Any reasonable S-value model is going to operate on the assumption that "the questions played at every site are not substantially different from the questions played at any other site." A tournament-level model would produce a positive feedback loop that would disproportionately favor teams on the revised packet set; a question-level model would indeed fare even worse due to the actual difficulty of the question at each site being not at all correlated with the estimated overall difficulty.
Dwight Wynne
socalquizbowl.org
UC Irvine 2008-2013; UCLA 2004-2007; Capistrano Valley High School 2000-2003

"It's a competition, but it's not a sport. On a scale, if football is a 10, then rowing would be a two. One would be Quiz Bowl." --Matt Birk on rowing, SI On Campus, 10/21/03

"If you were my teammate, I would have tossed your ass out the door so fast you'd be emitting Cerenkov radiation, but I'm not classy like Dwight." --Jerry

Maxwell Sniffingwell
Auron
Posts: 2162
Joined: Sun Feb 12, 2006 3:22 pm
Location: Des Moines, IA

Re: Correcting errors in the set

Post by Maxwell Sniffingwell »

As a player on a team that made it off of the wait list last year (and might do that again this year), I'd strongly, strongly prefer that the Region 4 players play on the exact same set as was used at all of the other sites.
Greg Peterson

Northwestern University '18
Lawrence University '11
Maine South HS '07

"a decent player" - Mike Cheyne

Matt Weiner
Sin
Posts: 8413
Joined: Fri Apr 11, 2003 8:34 pm
Location: Richmond, VA

Re: Correcting errors in the set

Post by Matt Weiner »

cornfused wrote:As a player on a team that made it off of the wait list last year (and might do that again this year), I'd strongly, strongly prefer that the Region 4 players play on the exact same set as was used at all of the other sites.
As a person who is not given to completely insane ideas, I strongly urge NAQT to not knowingly send out incorrect or bad questions for the sake of making sure that the privilege of finishing last at the ICT is handed out accurately. In the exceedingly unlikely situation that the last spot or spots of qualification come down to a tie or a D-value change dependent on the number of altered questions, that involves both Mid-Atlantic and non-Mid-Atlantic teams, it would be appropriate to investigate exactly which teams got those questions at each site and see if it made the difference and adjust bids accordingly; since this isn't going to happen, I'll again request that known errors be fixed as opposed to the unbelievably shortsighted idea of intentionally not doing so. The interests of the Mid-Atlantic SCT being a good tournament far overweigh the very, very remote possibility that teams 32 and 33 on the invite list might be flipped.
Matt Weiner
Founder of hsquizbowl.org

Important Bird Area
Forums Staff: Administrator
Posts: 5604
Joined: Thu Aug 28, 2003 3:33 pm
Location: San Francisco Bay Area

Re: Correcting errors in the set

Post by Important Bird Area »

For the record, we have already made the relevant changes to the set. NAQT agrees with Matt that knowingly using incorrect or bad questions is bad quizbowl. (The magnitude of replacement here, by the way, is something like 2/1 per tournament. We're convinced that the D-value impact of that is less than other sources of friction we don't try to account for.)
Jeff Hoppes
President, Northern California Quiz Bowl Alliance
former HSQB Chief Admin (2012-13)
VP for Communication and history subject editor, NAQT
Editor emeritus, ACF

"I wish to make some kind of joke about Jeff's love of birds, but I always fear he'll turn them on me Hitchcock-style." -Fred

stevebahnaman
Wakka
Posts: 106
Joined: Thu Sep 17, 2009 1:25 pm

Re: Correcting errors in the set

Post by stevebahnaman »

I am not an amazingly relevant voice but I wanted someone other than Matt and Jeff to say that these errors should obviously have been corrected.
Steve Bahnaman, Campbell University
NC Wesleyan College, Librarian and Quiz Bowl Advisor/Coach 2009-2011
Emory Academic Team, 1999-2004
Pretty trashy

Maxwell Sniffingwell
Auron
Posts: 2162
Joined: Sun Feb 12, 2006 3:22 pm
Location: Des Moines, IA

Re: Correcting errors in the set

Post by Maxwell Sniffingwell »

Well, as long as the discussion is still going,
Matt Weiner wrote:
cornfused wrote:As a player on a team that made it off of the wait list last year (and might do that again this year), I'd strongly, strongly prefer that the Region 4 players play on the exact same set as was used at all of the other sites.
As a person who is not given to completely insane ideas...
This seems a bit strong. Both of the stats people who have posted in this thread said the same thing: this kind of thing can make a feedback loop. At an ACF Regionals-style tournament with multiple sites but no connection between them, that's one thing. But I think that having multiple qualifying tournaments with different sets of questions is more of a "bad quizbowl" trope than having a set with errors in it.


And four Region 4 teams are on the cusp of the tourney field. I want to make it very clear that I'm accusing VCU/Virginia/Pitt/CMU of having an unfair advantage - I agree with Jeff's explanation of "the changes are too small to matter." But there are in fact four teams that benefitted from the corrections and will likely be among the last 5 or so teams into the field.


Now, I may be wrong, but I really don't agree with Matt that it's an "insane idea" to have the 14 SCTs playing the same set.

And as for the argument that the changes were tiny, that works in both directions: tiny errors only cause tiny feedback loops, but they also only cause tiny problems with the tournament.
Greg Peterson

Northwestern University '18
Lawrence University '11
Maine South HS '07

"a decent player" - Mike Cheyne

Sen. Estes Kefauver (D-TN)
Chairman of Anti-Music Mafia Committee
Posts: 5640
Joined: Wed Jul 26, 2006 11:46 pm
Location: Columbia, MO

Re: Correcting errors in the set

Post by Sen. Estes Kefauver (D-TN) »

I mean, if there are tiny grammatical errors that are fixed or whatever, I think it is kind of insane to think whatever feedback loops they cause can really make a difference we should care about (especially considering that some moderators will occasionally slightly change the grammar of a question they read if they go quickly). If it's shown that the basic clue content or answer selection of some questions was altered in ways that clearly make it easier for someone at the UMD site to answer questions than someone at every other site, then I think we should explore some way to handle that, but if Jeff is right and all they changed were just cosmetics, I'm inclined to say that nobody at Maryland's site was able to gain an advantage in their D-value.
Charlie Dees, North Kansas City HS '08
"I won't say more because I know some of you parse everything I say." - Jeremy Gibbs

"At one TJ tournament the neg prize was the Hampshire College ultimate frisbee team (nude) calender featuring one Evan Silberman. In retrospect that could have been a disaster." - Harry White

Maxwell Sniffingwell
Auron
Posts: 2162
Joined: Sun Feb 12, 2006 3:22 pm
Location: Des Moines, IA

Re: Correcting errors in the set

Post by Maxwell Sniffingwell »

Katamari Damacy wrote:if Jeff is right and all they changed were just cosmetics, I'm inclined to say that nobody at Maryland's site was able to gain an advantage in their D-value.
I'll take this moment to note that yeah, I agree with that.
Greg Peterson

Northwestern University '18
Lawrence University '11
Maine South HS '07

"a decent player" - Mike Cheyne

Important Bird Area
Forums Staff: Administrator
Posts: 5604
Joined: Thu Aug 28, 2003 3:33 pm
Location: San Francisco Bay Area

Re: Correcting errors in the set

Post by Important Bird Area »

cornfused wrote:And four Region 4 teams are on the cusp of the tourney field. I want to make it very clear that I'm accusing VCU/Virginia/Pitt/CMU of having an unfair advantage
Here's the change list.

DI set:
replaced the Izaak Walton tossup (spoiled in the irc)
replaced the thirteen at table tossup (multiple reports of hoses above and beyond the bad answer selection)
replaced the Asch bonus (repeat)
antibiotic valinomycin "targets" mitochondria rather than "inhibits" them
in the permutation tossup, specified the wording of "this mathematical operation" and named the L-C tensor.
in the p-n junction bonus part, replaced "semiconductor transistor" with "bipolar junction transistor"
in the "beta minus decay" question, ruled that "beta decay" was outright acceptable rather than promptable
clarified underlining in the Seattle Supersonics tossup (such that, per usual policy, either is acceptable rather than requiring both name and city)
for Canadian Pacific Railway, added prompt on equivalents such as "Canadian trans-continental railway"
in the Cellini salt cellar tossup, clarified pronoun from "its" to "this work's"
clarified wording of mulberry tossup to improve accuracy re: Sanjuro.
fixed the display output in the CS tossup on "stack"

DII set:

removed the Harry Reid repeat
fixed PG on Cardinal Mindszenty

Note that there are several changes here that will have near-zero impact on answerability. Also note that there are a number of prominent changes in which leaving the set unaltered would confer a strong advantage on the later sectional (such as the tossup that players from that site heard in the irc, or the bonus that repeated information found in a previous tossup).

In short: I'm quite convinced the D-value impact of these changes is absolutely minimal. It is very likely less than the effect of factors we don't even try to account for (such as: adjusting for difficulty variation between packets as different teams have different bye rounds).
Jeff Hoppes
President, Northern California Quiz Bowl Alliance
former HSQB Chief Admin (2012-13)
VP for Communication and history subject editor, NAQT
Editor emeritus, ACF

"I wish to make some kind of joke about Jeff's love of birds, but I always fear he'll turn them on me Hitchcock-style." -Fred

stevebahnaman
Wakka
Posts: 106
Joined: Thu Sep 17, 2009 1:25 pm

Re: Correcting errors in the set

Post by stevebahnaman »

cornfused wrote: I think that having multiple qualifying tournaments with different sets of questions is more of a "bad quizbowl" trope than having a set with errors in it.
Because NAQT uses a clock and teams can play different numbers of rounds at each site, every tournament site already uses "different sets of questions." Getting to 22 tossups in a round versus 24 doesn't just increase statistical accuracy by increasing sample size, it gives the teams that heard those questions a differently-distributed packet. The changes here (and presumably any changes they would ever make to things like answer underlining) are much, much less likely to affect D-values than changes like "a reader throws out a tossup because he read the answer inadvertently," "a reader mangles the living crap out of a word," or "a reader is going so damn fast that nobody hears this clue." Certainly "a reader doesn't read tossups with hard words in them," which happened, is even worse than this.

I'm assuming you oppose the use of the clock (I support it), because otherwise your argument doesn't really work well for NAQT sets.
Steve Bahnaman, Campbell University
NC Wesleyan College, Librarian and Quiz Bowl Advisor/Coach 2009-2011
Emory Academic Team, 1999-2004
Pretty trashy

grapesmoker
Sin
Posts: 6368
Joined: Sat Oct 25, 2003 5:23 pm
Location: NYC

Re: Correcting errors in the set

Post by grapesmoker »

stevebahnaman wrote:I'm assuming you oppose the use of the clock (I support it), because otherwise your argument doesn't really work well for NAQT sets.
I think I can safely both oppose the use of a clock as a general principle and support the idea that tournaments should take place on error-free sets. These things are not in conflict with each other.
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
code ape, loud voice, general nuissance

setht
Auron
Posts: 1186
Joined: Mon Oct 18, 2004 2:41 pm
Location: Columbus, Ohio

Re: Correcting errors in the set

Post by setht »

grapesmoker wrote:
stevebahnaman wrote:I'm assuming you oppose the use of the clock (I support it), because otherwise your argument doesn't really work well for NAQT sets.
I think I can safely both oppose the use of a clock as a general principle and support the idea that tournaments should take place on error-free sets. These things are not in conflict with each other.
I think you've misread Steve's point--my reading is that Steve is responding to Greg (and Charlie and Nathaniel and possibly other people who did not want NAQT to correct known errors in the set) by pointing out that a timed qualifying tournament like SCT necessarily has teams qualifying on different question sets, and that the differences in questions heard almost certainly make more of a difference to D-values than a small number of fixes to problematic questions. In other words, not fixing some problematic questions might make sense for an untimed qualifying tournament, but not for a timed qualifying tournament.

I think you and I and Steve (and Jeff and Matt and others) are all in agreement that NAQT did the right thing in fixing the small number of known errors.

-Seth
Seth Teitler
Formerly UC Berkeley and U. Chicago
President and Chief Editor, NAQT
Emeritus member, ACF

stevebahnaman
Wakka
Posts: 106
Joined: Thu Sep 17, 2009 1:25 pm

Re: Correcting errors in the set

Post by stevebahnaman »

grapesmoker wrote:
stevebahnaman wrote:I'm assuming you oppose the use of the clock (I support it), because otherwise your argument doesn't really work well for NAQT sets.
I think I can safely both oppose the use of a clock as a general principle and support the idea that tournaments should take place on error-free sets. These things are not in conflict with each other.
That is absolutely true. I just want to make sure that the person above also does, as opposed to taking issue with something relatively minor -- compared to a bunch of other stuff that makes D-values not perfectly comparable -- that is inherent in the NAQT format as it presently stands.

EDIT: Also, yeah, I wanted error-free sets, not identical sets, so...
Steve Bahnaman, Campbell University
NC Wesleyan College, Librarian and Quiz Bowl Advisor/Coach 2009-2011
Emory Academic Team, 1999-2004
Pretty trashy

Sen. Estes Kefauver (D-TN)
Chairman of Anti-Music Mafia Committee
Posts: 5640
Joined: Wed Jul 26, 2006 11:46 pm
Location: Columbia, MO

Re: Correcting errors in the set

Post by Sen. Estes Kefauver (D-TN) »

I do not want sets to be left with errors in them. In my opinion there should be some very small correction to the D-value for teams that played the set with a chunk of questions that were harder to answer than what the final product was, and not that we should just make the mid-Atlantic SCT an even worse experience for those teams.
Charlie Dees, North Kansas City HS '08
"I won't say more because I know some of you parse everything I say." - Jeremy Gibbs

"At one TJ tournament the neg prize was the Hampshire College ultimate frisbee team (nude) calender featuring one Evan Silberman. In retrospect that could have been a disaster." - Harry White

Maxwell Sniffingwell
Auron
Posts: 2162
Joined: Sun Feb 12, 2006 3:22 pm
Location: Des Moines, IA

Re: Correcting errors in the set

Post by Maxwell Sniffingwell »

Jeremy Gibbs Freesy Does It wrote:I do not want sets to be left with errors in them. In my opinion there should be some very small correction to the D-value for teams that played the set with a chunk of questions that were harder to answer than what the final product was, and not that we should just make the mid-Atlantic SCT an even worse experience for those teams.
This is what I support - I think that we should be playing the same set across sites in every way that we can control, but if we could correct for changes while allowing the Region 4 SCT to be as good as possible, that would be my preferred option.




HOLY CRAP.

I just reread my post calling out the Region 4 teams by name. I omitted the most important word of that sentence, which I must have deleted in the attempt to italicize it. What I meant to say was: "I want to make it very clear that I'm not accusing VCU/Virginia/Pitt/CMU of having an unfair advantage."

I'll leave the original post as is because I'm pretty sure I'd get tempbanned if I changed it now, but let me apologize to the four teams I mentioned for what is clearly a typo - if you read the second clause of that sentence, it's pretty apparent that I meant to include the "not." Again, sorry for the confusion there.
Greg Peterson

Northwestern University '18
Lawrence University '11
Maine South HS '07

"a decent player" - Mike Cheyne

stevebahnaman
Wakka
Posts: 106
Joined: Thu Sep 17, 2009 1:25 pm

Re: Correcting errors in the set

Post by stevebahnaman »

Jeremy Gibbs Freesy Does It wrote:I do not want sets to be left with errors in them. In my opinion there should be some very small correction to the D-value for teams that played the set with a chunk of questions that were harder to answer than what the final product was, and not that we should just make the mid-Atlantic SCT an even worse experience for those teams.
Who would determine what that number is, and what formula should they use to determine it?

Should similar adjustments be made for teams that heard 19 tossups in round 1 when tossups 20-23 were easier? How about teams whose reader chose not to read questions with hard words in them?

Which of these do you think has a larger impact on D-value?
Steve Bahnaman, Campbell University
NC Wesleyan College, Librarian and Quiz Bowl Advisor/Coach 2009-2011
Emory Academic Team, 1999-2004
Pretty trashy

The Friar
Wakka
Posts: 158
Joined: Fri Jul 10, 2009 2:39 pm

Re: Correcting errors in the set

Post by The Friar »

stevebahnaman wrote:
Jeremy Gibbs Freesy Does It wrote:I do not want sets to be left with errors in them. In my opinion there should be some very small correction to the D-value for teams that played the set with a chunk of questions that were harder to answer than what the final product was, and not that we should just make the mid-Atlantic SCT an even worse experience for those teams.
Who would determine what that number is, and what formula should they use to determine it?

Should similar adjustments be made for teams that heard 19 tossups in round 1 when tossups 20-23 were easier? How about teams whose reader chose not to read questions with hard words in them?

Which of these do you think has a larger impact on D-value?
It'd be nearly impossible to determine how big a correction would be needed without modeling individual question difficulties (at which point we don't need to make the correction anyhow), and I agree that the kind of things you mention, Steve, are likely to necessitate a bigger correction (although later tossups, per the thread on position within packet of bad tossups, are likely to be harder rather than easier). This is why I've baldly stated in another thread that D-value and the clock are incompatible.

Given the changes noted above, I'd be inclined not to make a correction at all. When the Great Model-Off happens, if FRIAR is set up to treat different versions of a question as different questions, we can learn something (although we might not have enough data points to learn much) about the difference in difficulty between the original and corrected versions.
Gordon Arsenoff
Rochester '06
WUStL '14 (really)

Developer of WUStL Updates Statistics Live!
