What is the purpose of a first line?

AKKOLADE
Sin
Posts: 15773
Joined: Thu Apr 24, 2003 8:08 am

What is the purpose of a first line?

Post by AKKOLADE »

I was looking at the buzz point data for this year's ACF Regionals and noticed something about the first lines of the tossups. Specifically, that they got very, very few buzzes (whether conversions or negs) on their first sentences.

Image

(It is of course important to state the obvious that sentences are of varying lengths, etc. etc. Also, no buzz point data appeared for the Bellafonte question)

Approximately 50-60% of tossups are drawing no buzzes on their first sentences across their 55-60 rooms of play. 70-80% are drawing zero or one buzz across those rooms. Perhaps this will change as I look at more data.

Obviously, buzzer races in the first sentence aren't desirable and are, in fact, bad - probably worse than dead clues. But if less than 1% of the field are buzzing on the first line of the vast majority of questions, are we just wasting everyone's time?
Fred Morlan
University of Kentucky CoP, 2017
International Quiz Bowl Tournaments, CEO, co-owner
former PACE member, president, etc.
former hsqbrank manager, former NAQT writer & subject editor, former hsqb Administrator/Chief Administrator
heterodyne
Rikku
Posts: 427
Joined: Tue Jun 26, 2012 9:47 am

Re: What is the purpose of a first line?

Post by heterodyne »

In my experience, many (in fact, likely the majority) of buzzes that are "on the first sentence" or "on the first clue" have a buzzpoint somewhere in the second sentence or later, because of parsing time/cautiousness/etc. In addition to the many arguments given elsewhere on these forums for the contextualizing and constructive work done by leadins, we should be cautious about saying that nobody buzzed on them from looking only at buzzpoints.
Alston [Montgomery] Boyd
Bloomington High School '15
UChicago '19
UChicago Divinity '21
they
Auks Ran Ova
Forums Staff: Chief Administrator
Posts: 4295
Joined: Sun Apr 30, 2006 10:28 pm
Location: Minneapolis

Re: What is the purpose of a first line?

Post by Auks Ran Ova »

heterodyne wrote: Tue Mar 05, 2019 6:01 pm In my experience, many (in fact, likely the majority) of buzzes that are "on the first sentence" or "on the first clue" have a buzzpoint somewhere in the second sentence or later, because of parsing time/cautiousness/etc. In addition to the many arguments given elsewhere on these forums for the contextualizing and constructive work done by leadins, we should be cautious about saying that nobody buzzed on them from looking only at buzzpoints.
I heartily endorse this event or product.
Rob Carson
University of Minnesota '11, MCTC '??, BHSU forever
Member, ACF
Member emeritus, PACE
Writer and Editor, NAQT
Auroni
Auron
Posts: 3145
Joined: Thu Nov 15, 2007 6:23 pm

Re: What is the purpose of a first line?

Post by Auroni »

Maybe if people knew more they would buzz on more first lines
Auroni Gupta (she/her)
Sima Guang Hater
Auron
Posts: 1957
Joined: Mon Feb 05, 2007 1:43 pm
Location: Nashville, TN

Re: What is the purpose of a first line?

Post by Sima Guang Hater »

AKKOLADE wrote: Tue Mar 05, 2019 5:45 pm Obviously, buzzer races in the first sentence aren't desirable and are, in fact, bad - probably worse than dead clues. But if less than 1% of the field are buzzing on the first line of the vast majority of questions, are we just wasting everyone's time?
Fred's asked an excellent question here. As a writer, I try to use leadins for educational purposes (to introduce interesting, relevant, or humorous clues that I feel people should know) and context purposes (cluing knowledgeable people into the type of answer or general area we're going for).

That being said, if our leadins are such that no one's buzzing on them, I'm willing to listen to a statistical argument that they should be made easier. In an ideal world, I would want there to be no races on leadins, but I would want a handful of people in the field buzzing on them.
Eric Mukherjee, MD PhD
Brown 2009, Penn Med 2018
Instructor/Attending Physician/Postdoctoral Fellow, Vanderbilt University Medical Center
Coach, University School of Nashville

“The next generation will always surpass the previous one. It’s one of the never-ending cycles in life.”
Support the Stevens-Johnson Syndrome Foundation
AKKOLADE
Sin
Posts: 15773
Joined: Thu Apr 24, 2003 8:08 am

Re: What is the purpose of a first line?

Post by AKKOLADE »

Auroni wrote: Tue Mar 05, 2019 7:17 pm Maybe if people knew more they would buzz on more first lines
got eeeeeeem
heterodyne wrote: Tue Mar 05, 2019 6:01 pm In my experience, many (in fact, likely the majority) of buzzes that are "on the first sentence" or "on the first clue" have a buzzpoint somewhere in the second sentence or later, because of parsing time/cautiousness/etc. In addition to the many arguments given elsewhere on these forums for the contextualizing and constructive work done by leadins, we should be cautious about saying that nobody buzzed on them from looking only at buzzpoints.
This is a good point. I can expand to looking at the first two sentences.
Fred Morlan
University of Kentucky CoP, 2017
International Quiz Bowl Tournaments, CEO, co-owner
former PACE member, president, etc.
former hsqbrank manager, former NAQT writer & subject editor, former hsqb Administrator/Chief Administrator
theMoMA
Forums Staff: Administrator
Posts: 5993
Joined: Mon Oct 23, 2006 2:00 am

Re: What is the purpose of a first line?

Post by theMoMA »

It may be helpful to read this thread from the 2018 ACF Regionals discussion.

I will gauchely quote myself on the subject (in the context of comparing the hardest clues of tossups to the hard parts of bonuses):
theMoMA wrote: The lead-in/hard clues serve a slightly different function than the hard parts of bonuses. Whereas a bonus part is pointless to write if literally no team stands a chance to get it, the lead-in/hard clues do play a major role in the tossup, even if no one buzzes on them. These clues set the table for the tossup: they introduce the player to the topic and give a knowledgeable player a chance to buzz on difficult clues, but even if no player knows the clues, the savvier players can still separate themselves from the less savvy by contextualizing the lead-in to narrow down the possible range of answers, leading to a confident buzz later in the question. It's also important to remember another key distinction between lead-ins/hard clues and hard parts of bonuses, namely that the penalty for guessing at a bonus part is nothing, while the penalty of guessing on a lead-in/hard clue is severe. I think it stands to reason that lots more early buzzes would occur if players got a free guess on the first two lines, and that, given quizbowl players' well-honed skills at contextualizing information, many of those buzzes would be successful, much as many good teams' guesses at hard parts succeed.

***

Finally, the hard part to a bonus is designed to stretch a team's knowledge base and see whether they have truly mastered the possible answer space for a particular theme or topic while introducing the teams to interesting clues and connections. There are so many possible hard parts for any given topic or theme that no team can be guaranteed to convert the hard part, so converting them requires a lot of skill and a bit of luck. The lead-ins/hard clues of tossups operate similarly--they are designed to stretch a team's knowledge and introduce interesting clues and connections. Like with middle clues above, there are some slight differences: lead-ins don't need to be converted to serve their function, and in fact are less likely to be converted than hard parts because teams have a strong incentive to be risk averse when buzzing on a tossup.

The effect of making lead-ins more buzzable would be similar to the effect of removing hard parts from the game. It would flatten the playing field, reduce the advantages of the most knowledgeable players, and cut down on the space for introducing new and interesting material. I think it would also be very difficult to do this in a way that didn't frequently err on the too-easy side; I see the lead-ins/hard clues as a courtesy to the most knowledgeable players, allowing them to get one or two pieces of information that the writer/editor judges to be quite difficult before the beginning of the material that's come up before and/or that players are effectively "on notice" that they should know. If you start with that material right away, you've got a tournament like ACF Fall, which is both quantitatively and qualitatively different to play. A tournament like Fall still tends to result in the best team winning, but the marginal space in the tossups and bonuses that separate the best from the rest is much smaller, and thus, the resolving power of those questions is lower.
Andrew Hart
Minnesota alum
setht
Auron
Posts: 1205
Joined: Mon Oct 18, 2004 2:41 pm
Location: Columbus, Ohio

Re: What is the purpose of a first line?

Post by setht »

First off, thanks to Fred for starting this discussion—this is an issue I've been hoping someone would tackle since the advent of Ophirstats. And thanks to Ophir for his outstanding work on providing detailed buzz point and conversion data, of course! Whether or not people wind up deciding that changes in our approach to crafting first lines are merited, I hope this discussion will help get the ball rolling on scrutinizing other aspects of question writing and editing.

I think ideally people would propose some targets for buzz distributions (and maybe discuss and come to some consensus targets), then inspect the buzz point data and draw conclusions accordingly. (This would help us keep ourselves honest—not shifting the goalposts to better align with the data.) In this post from last year's ACF Regionals detailed stats discussion thread, Will Alston suggested f(x) = x^2 (where f is the [cumulative] fraction of buzzes by point x, and x is the percentage of tossup text read) as a target buzz point distribution. I think I would suggest something between f(x) = x^2 and f(x) = x so there are a few more early buzzes and middle buzzes. And now that I'm thinking about this some more, there should probably be some adjustment to allow for the fact that not every single tossup will get 100% conversion by the end. Maybe we need to make some adjustment to account for early clues that generate buzzes a little bit later in the question (possibly causing some buzzes on first-sentence clues to come early in the second sentence). And maybe we just want to ignore targets for the first sentence—because we really really really want to avoid first-sentence buzzer races; because first-sentence clues are really about "setting the table" (to borrow Andrew's phrase) rather than generating some percentage of buzzes; or for any other reasons. But whatever, Will's suggestion seems like a broadly plausible target, and it's easy to use as a guide to contextualizing the data. And I'll go ahead and use f(x) = x as a very generous counterpoint. (This is my own bias here; maybe other people feel that Will's suggestion is already generous, and we should be thinking about alternate targets that push more buzzes to later in tossups.)
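The family of targets described above can be written as a power curve scaled by the final conversion rate. This is only a sketch: the name `target_cdf`, the exponent parameter `k`, and the `conversion` ceiling are assumptions made for illustration. The thread itself only names the k = 2 and k = 1 cases and mentions that some allowance for less-than-100% conversion is probably needed.

```python
def target_cdf(x, k=2.0, conversion=1.0):
    """Target cumulative fraction of rooms with a buzz by point x.

    x          -- fraction of the tossup text read so far, from 0 to 1
    k          -- exponent: 2 gives f(x) = x^2, 1 gives f(x) = x
    conversion -- assumed fraction of rooms that convert the tossup by the end
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be between 0 and 1")
    return conversion * x ** k


# Compare a few exponents at 15%, 50%, and 100% of the text,
# with an illustrative (not measured) 90% final conversion rate.
for k in (1.0, 1.5, 2.0):
    row = [round(target_cdf(x, k, conversion=0.9), 3) for x in (0.15, 0.5, 1.0)]
    print(k, row)
```

Any exponent between 1 and 2 gives a curve that sits between the two named targets, which is one way to read "something between f(x) = x^2 and f(x) = x."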

After a comprehensive survey of 7 tossups from 2019 ACF Regionals, I concluded that the average tossup length is about 6-7 sentences, so the first sentence covers about 15% of the tossup text. Will's guideline would then suggest that about 2% of buzzes should come in the first sentence, with another 7% coming in the second sentence. The f(x) = x guideline would suggest that about 15% of buzzes should come in the first sentence, with another 15% coming in the second sentence. Looking at Fred's data, and assuming that the "3+ buzzes" entries correspond to exactly 3 buzzes, I count 20 first-line buzzes out of 1100 tossup-rooms in round 1, 19 out of 1200 in round 2, and 11 out of 1159 in round 3. The corresponding percentages are 1.8%, 1.6%, and 0.95%. If we consider buzz delay due to clue processing, I think those numbers are reasonably in line with Will's guideline. They are clearly well below the f(x) = x guideline.
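The arithmetic in this paragraph can be reproduced with a short script. It is a sketch that uses only the figures quoted above (sentence ends at roughly 15% and 30% of the text, and the three buzz counts); the function name `sentence_shares` is made up for the example.

```python
def sentence_shares(sentence_ends, k):
    """Share of buzzes each sentence should draw under f(x) = x**k.

    sentence_ends -- cumulative fraction of text read at the end of each sentence
    """
    shares = []
    previous = 0.0
    for end in sentence_ends:
        cumulative = end ** k
        shares.append(cumulative - previous)
        previous = cumulative
    return shares


# First sentence ends ~15% of the way in, second ~30% (figures from the post).
ends = [0.15, 0.30]
quadratic = sentence_shares(ends, k=2)   # roughly 2% then 7%
linear = sentence_shares(ends, k=1)      # 15% then 15%

# First-line buzzes and tossup-rooms for rounds 1-3, as counted in the post.
observed = {1: (20, 1100), 2: (19, 1200), 3: (11, 1159)}
rates = {rnd: buzzes / rooms for rnd, (buzzes, rooms) in observed.items()}

for label, shares in (("x^2", quadratic), ("x", linear)):
    print(label, [f"{s:.1%}" for s in shares])
for rnd, rate in rates.items():
    print(f"round {rnd}: {rate:.2%}")
```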

If the goal is to avoid first-sentence buzzer races, something like 20% of buzzes coming in the first line is a naive upper limit. That way about 4% of rooms would have two buzzes in the first sentence, or somewhat less than 1 tossup per game. That limit should be shifted down a ways: since good teams grab more of the first-line buzzes, we would expect that a match between two good teams would have noticeably more than 4% of tossups getting two first-sentence buzzes. I would guess that about 10% of buzzes coming in the first sentence would be sufficient here, but it really depends on how many expected instances of "two buzzes in the first sentence" we're willing to tolerate, how often "two buzzes in the first sentence" translates to "buzzer race in the first sentence," etc. Note that this is "actual buzzes in the first sentence," not "buzzes in the second sentence that were sparked by clues from the first sentence that took a while to process." This suggests that first-sentence clues could be a good bit more generous without causing early buzzer race problems. (Assuming we're content with "very very few buzzer races in the first sentence" as our standard.)
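The "20% gives about 4%" estimate above follows from treating the two teams as independent. A minimal sketch of that naive model is below; the 20-tossup game length and the 0.35 figure for strong teams are assumptions for illustration, not numbers from the data.

```python
TOSSUPS_PER_GAME = 20  # assumed packet length


def double_buzz_rate(p_team_a, p_team_b=None):
    """Chance that both teams buzz in the first sentence of one tossup,
    assuming the two teams buzz independently of each other."""
    if p_team_b is None:
        p_team_b = p_team_a
    return p_team_a * p_team_b


naive = double_buzz_rate(0.20)            # two average teams
per_game = naive * TOSSUPS_PER_GAME

# Two strong teams that each take a larger share of first-line buzzes
# (0.35 is an illustrative number, not a measured one).
strong = double_buzz_rate(0.35)

print(f"average teams: {naive:.1%} of tossups, {per_game:.1f} per game")
print(f"strong teams:  {strong:.1%} of tossups")
```

Because the rate grows with the product of the two teams' probabilities, matches between strong teams are where a generous first line would show up as races first, which is the reason given above for shifting the limit down.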

With all that said, I think the data on second-line buzzes is crucial here. I look forward to seeing the fruits of Fred's labor.
Seth Teitler
Formerly UC Berkeley and U. Chicago
President of NAQT
Emeritus member of ACF
AKKOLADE
Sin
Posts: 15773
Joined: Thu Apr 24, 2003 8:08 am

Re: What is the purpose of a first line?

Post by AKKOLADE »

I've gone through round one and added more data - first & second sentence buzz #s and word counts, plus percentages for both. I wanted to make sure people thought this info was actually going to be useful before I kept working on this.
Attachments
acf regs first sentences.xlsx
Fred Morlan
University of Kentucky CoP, 2017
International Quiz Bowl Tournaments, CEO, co-owner
former PACE member, president, etc.
former hsqbrank manager, former NAQT writer & subject editor, former hsqb Administrator/Chief Administrator
setht
Auron
Posts: 1205
Joined: Mon Oct 18, 2004 2:41 pm
Location: Columbus, Ohio

Re: What is the purpose of a first line?

Post by setht »

AKKOLADE wrote: Wed Mar 06, 2019 9:04 am I've gone through round one and added more data - first & second sentence buzz #s and word counts, plus percentages for both. I wanted to make sure people thought this info was actually going to be useful before I kept working on this.
Fred, do your second-sentence buzz data include the first-sentence buzzes? For instance, you list tossup 1 with 1 first-sentence buzz and 3 second-sentence buzzes. Does this mean that there were 4 buzzes by the end of the second sentence, with 1 buzz in the first sentence and 3 in the second? Or does it mean there were 3 buzzes by the end of the second sentence: 1 in the first sentence and 2 in the second?

Also, I grabbed the raw tossup data, and it looked to me like there were 2 correct buzzes by the end of the second sentence: John Wheatley at word 16, and Nathan Fredman at word 38. There was also an early neg at word 38; is it possible you're counting negs in your buzz numbers?
Seth Teitler
Formerly UC Berkeley and U. Chicago
President of NAQT
Emeritus member of ACF
AKKOLADE
Sin
Posts: 15773
Joined: Thu Apr 24, 2003 8:08 am

Re: What is the purpose of a first line?

Post by AKKOLADE »

setht wrote: Wed Mar 06, 2019 10:50 am
AKKOLADE wrote: Wed Mar 06, 2019 9:04 am I've gone through round one and added more data - first & second sentence buzz #s and word counts, plus percentages for both. I wanted to make sure people thought this info was actually going to be useful before I kept working on this.
Or does it mean there were 3 buzzes by the end of the second sentence: 1 in the first sentence and 2 in the second?

Also, I grabbed the raw tossup data, and it looked to me like there were 2 correct buzzes by the end of the second sentence: John Wheatley at word 16, and Nathan Fredman at word 38. There was also an early neg at word 38; is it possible you're counting negs in your buzz numbers?
It means the definition I quoted. I am including negs currently, since I think the real point we should be discussing is if clues are drawing buzzes of any type. A question drawing an unusually high number of early buzzes which are all negs would suffer from a different problem than what I'm looking at here.
Fred Morlan
University of Kentucky CoP, 2017
International Quiz Bowl Tournaments, CEO, co-owner
former PACE member, president, etc.
former hsqbrank manager, former NAQT writer & subject editor, former hsqb Administrator/Chief Administrator
setht
Auron
Posts: 1205
Joined: Mon Oct 18, 2004 2:41 pm
Location: Columbus, Ohio

Re: What is the purpose of a first line?

Post by setht »

AKKOLADE wrote: Wed Mar 06, 2019 10:57 am I am including negs currently, since I think the real point we should be discussing is if clues are drawing buzzes of any type. A question drawing an unusually high number of early buzzes which are all negs would suffer from a different problem than what I'm looking at here.
Ah, interesting. I'm used to focusing entirely on the distribution of correct buzzes.

I went ahead and calculated the average percentage of (correct and wrong) buzzes by the end of the first sentence and by the end of the second sentence; the numbers are 2.7% and 7.9% for all 20 tossups. I also calculated the same averages excluding tossup 4, which Fred marked as an outlier. Those numbers are 1.6% and 6.0%.

I also calculated the value of f(x) = x^2, where f is the (target) cumulative fraction of correct answers by point x in the question, and x is the word percentage at the end of the first sentence and at the end of the second sentence. The average value of f at the end of the first sentence is 3.3%, and the average value at the end of the second sentence is 11.3%. (This is for all 20 tossups; redoing the calculation while excluding the two tossups marked as outliers for second sentence word percentage changes f(second sentence) to 11.2%.)

Given that the buzz percentages include negs, I think the empirical conversion numbers look a bit low overall. (Especially if we exclude tossup 4.) But again, I don't know whether other people think f(x) = x^2 is a reasonable target for correct buzz distribution, or if people agree with me that there should be somewhat more early buzzes (in which case the mismatch with the empirical data is worse), or if people think the target distribution should have fewer early buzzes. (If we assume that many of the second-sentence buzzes should be attributed to first-sentence clues, we might guess that something like 4-5% of the tossups got a "first-sentence-clue buzz" [with most of those being delayed buzzes that came in the second sentence].)

Looking at individual questions: tossup 4 is an outlier with high early conversion. I grabbed the raw tossup data, and it looks to me like none of the buzzes in the first 2 sentences were negs. From the reported numbers, we'd expect that 3 out of 55 rooms had 2 buzzes in the first line. (And another 2 rooms with 2 buzzes in the second line.) I'm guessing that is considered too many, but again, it depends on our tolerance for early buzzer races and how spread out the buzzes were within the early sentences.

On the flip side, there are 5 tossups that had 0 buzzes (correct + negs) in the first 2 sentences, and another 2 that had 0 buzzes in the first sentence and 1 buzz apiece in the second sentence. It is not the case that these 7 tossups have abnormally short early sentences: on average, the end of the second sentence comes 32.7% of the way through these tossups, vs. the end of the second sentence coming 33.3% of the way through the full 20-tossup sample. Regardless of the details of any target distribution for correct buzzes people wind up thinking is appropriate, I would imagine that these 7 tossups would be seen as skewing too hard. (And there are another 5 tossups that had 0-1 buzzes in the first sentence, and 2-3 buzzes by the end of the second sentence. That's a buzz rate of 3.6-5.5% by the end of the second sentence, which seems noticeably low to me—it's well below the value of x^2, for instance. But if we attribute most buzzes in the second sentence to players processing and then buzzing on clues from the first sentence, then these 5 tossups might be fine.)
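The comparison in this paragraph can be checked with a few lines. The sketch below uses the 55 rooms and the 33.3% average second-sentence endpoint quoted above; the function names are made up for the example. Squaring the average endpoint gives about 11.1%, slightly below the 11.3% quoted earlier, which was an average of per-tossup squares.

```python
ROOMS = 55  # rooms with buzz point data for round 1, per the post


def buzz_rate(buzzes, rooms=ROOMS):
    """Fraction of rooms with a buzz by a given point."""
    return buzzes / rooms


def quadratic_target(fraction_read):
    """Target cumulative buzz fraction under f(x) = x^2."""
    return fraction_read ** 2


# End of the second sentence falls about 33.3% of the way in on average.
target = quadratic_target(0.333)

for buzzes in (0, 1, 2, 3):
    rate = buzz_rate(buzzes)
    print(f"{buzzes} buzzes: {rate:.1%} observed vs {target:.1%} target")
```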

All this suggests that the writers and editors might want to look at tossup 4 with an eye towards "this tossup should have had somewhat harder early clues," and look at tossups 2, 6, 8, 10, 13, 18, and 19 (and perhaps also tossups 1, 5, 7, 11, and 14) with an eye towards "these tossups should have had somewhat easier early clues." (And in fact, buzz point data for later clues in these tossups might immediately suggest some clues that would have worked better as first- and second-sentence material. Or there might be a big cliff right after the second sentence, rather than that ideal 3-10% buzz rate clue—or whatever people come up with as a target for early clue buzz rates.)

Writers and editors who want to improve their craft can also check how their questions went over, and (in the case of having multiple too-hard questions) start trying to figure out if there's some common pattern to the too-hard questions: are these the work of one writer? One editor? Are they clustered in some chunk of the distribution? Are they mostly common links or wacky answers? Mostly answers that haven't come up much in recent sets? Mostly answers that have come up a lot in recent sets (so writers/editors felt obliged to dig deep for new material)? Some of these patterns might hold across packets in the set, which would be a strong signal some tinkering is in order. Maybe an editor needs to ease up across the board, or in a specific category or two. Maybe the field is struggling to get early buzzes on science questions, even across many different writers and multiple editors, and we should ease up there. Maybe really new answers need to start easier than we imagine. Etc.

Looking things over some more, I see that tossup 17 had 4 buzzes in the first sentence and another 4 in the second sentence—probably fine, but perhaps this one stayed a bit flatter than we'd like across the first 2 sentences. And tossup 20 had 3 buzzes in the first sentence and 1 in the second; perhaps that second sentence deserves some scrutiny.


Getting back to Fred's original question: I think it's up to ACF and the circuit to decide what role(s) the first line of a tossup in an (m)ACF tournament is supposed to fill. For NAQT, I think the length limits mean we have to have a minimal correct buzz rate goal that is higher than 3% by the end of the first sentence. (I'm sure we don't always hit our mark, but we certainly can't pick clues with a lower rate as a goal.)

In particular, I think it would be valid to decide that the first sentence is meant to provide valuable context for knowledgeable players, present interesting information, and make sure first-sentence buzzer races almost never happen—and it doesn't matter whether there are any correct buzzes. But it's worth discussing and coming to an informed decision, then making sure everyone's clear on the goal. There's a big difference to how writers and editors should approach "find one sentence's worth of interesting, difficult material; it has to be at least hard enough to avoid buzzer races, but there's no upper limit on how hard you can go" vs. "aim for 5% correct buzzes by the end of the first sentence."
Seth Teitler
Formerly UC Berkeley and U. Chicago
President of NAQT
Emeritus member of ACF
ryanrosenberg
Auron
Posts: 1890
Joined: Thu May 05, 2011 5:48 pm
Location: Palo Alto, California

Re: What is the purpose of a first line?

Post by ryanrosenberg »

setht wrote: Wed Mar 06, 2019 12:33 am I think ideally people would propose some targets for buzz distributions (and maybe discuss and come to some consensus targets), then inspect the buzz point data and draw conclusions accordingly. (This would help us keep ourselves honest—not shifting the goalposts to better align with the data.) In this post from last year's ACF Regionals detailed stats discussion thread, Will Alston suggested f(x) = x^2 (where f is the [cumulative] fraction of buzzes by point x, and x is the percentage of tossup text read) as a target buzz point distribution. I think I would suggest something between f(x) = x^2 and f(x) = x so there are a few more early buzzes and middle buzzes. And now that I'm thinking about this some more, there should probably be some adjustment to allow for the fact that not every single tossup will get 100% conversion by the end. Maybe we need to make some adjustment to account for early clues that generate buzzes a little bit later in the question (possibly causing some buzzes on first-sentence clues to come early in the second sentence). And maybe we just want to ignore targets for the first sentence—because we really really really want to avoid first-sentence buzzer races; because first-sentence clues are really about "setting the table" (to borrow Andrew's phrase) rather than generating some percentage of buzzes; or for any other reasons. But whatever, Will's suggestion seems like a broadly plausible target, and it's easy to use as a guide to contextualizing the data. And I'll go ahead and use f(x) = x as a very generous counterpoint. (This is my own bias here; maybe other people feel that Will's suggestion is already generous, and we should be thinking about alternate targets that push more buzzes to later in tossups.)
This is a really interesting point, and I decided to take a graphical look at how this year's Regs compared to the hypothetical buzz distributions in Seth's post. The graph below compares the cumulative distribution of buzzes (both gets and negs) in Regs to hypothetical y = x, y = x^2, and y = x^3 distributions.

Image

As you can see, the distribution of buzzes at Regionals closely mirrors the y = x^2 distribution, but shifted about 7.5% of the question back (approximately 9-10 words depending on tossup length). This holds until towards the end of the question, at which point I'm inclined to think gameplay takes over, since players where the other team has already negged will not be buzzing.
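For readers without the image, the "shifted x^2" description can be sketched numerically. The 7.5% lag comes from the post; the function names and the 120 and 135 word tossup lengths are assumptions for illustration only.

```python
def lagged_quadratic(x, lag=0.075):
    """y = x^2 shifted later by `lag`, floored at zero before the lag elapses."""
    return max(0.0, x - lag) ** 2


def lag_in_words(tossup_words, lag=0.075):
    """Convert the fractional lag into a word count for a tossup of given length."""
    return tossup_words * lag


for x in (0.15, 0.30, 0.50, 0.75):
    print(f"x={x:.2f}  x^2={x**2:.3f}  lagged={lagged_quadratic(x):.3f}")

# 120 and 135 words are illustrative tossup lengths, not measured ones.
print([round(lag_in_words(n), 1) for n in (120, 135)])
```

Note that a pure shift cannot describe the whole curve: the lagged version never reaches 100% by the end of the question, which fits the observation that the match breaks down late in the tossup.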
Ryan Rosenberg
North Carolina '16
ACF
theMoMA
Forums Staff: Administrator
Posts: 5993
Joined: Mon Oct 23, 2006 2:00 am

Re: What is the purpose of a first line?

Post by theMoMA »

That's cool. Based on absolutely nothing, I'd venture that this shows that buzzes (at least at EFT) generally do follow the x^2 distribution with an average of 2-3 seconds of processing lag. (I just timed myself reading a few lines of question text and, depending on how long the words are, it takes about 2-3 seconds to read 9-10 words.) I guess that's something to consider in this discussion; even if you pause to process a clue for only a couple seconds, a lot of question text gets read in the interim.

If it's possible to slice the data this way, it would be interesting to see the cumulative buzz distribution for only "gets that do not come after a neg."
Andrew Hart
Minnesota alum
setht
Auron
Posts: 1205
Joined: Mon Oct 18, 2004 2:41 pm
Location: Columbus, Ohio

Re: What is the purpose of a first line?

Post by setht »

ryanrosenberg wrote: Wed Mar 06, 2019 3:24 pm I decided to take a graphical look at how this year's Regs compared to the hypothetical buzz distributions in Seth's post. The graph below compares the cumulative distribution of buzzes (both gets and negs) in Regs to hypothetical y = x, y = x^2, and y = x^3 distributions.

Image

As you can see, the distribution of buzzes at Regionals closely mirrors the y = x^2 distribution, but shifted about 7.5% of the question back (approximately 9-10 words depending on tossup length). This holds until towards the end of the question, at which point I'm inclined to think gameplay takes over, since players where the other team has already negged will not be buzzing.
Thanks Ryan, this graphic is very interesting.

It seems reasonable (or at least plausible) to me that the distribution of buzzes on early clues is pretty much "y = x^2, but with a ~9-10 word delay while players process hard clues." But I would interpret the middle and late parts of the distribution a bit differently—my impression has been that most buzzes on middle/late clues are not of the "I recognized something 9-10 words ago but it took me a while to pull it" variety; they're much closer to "when I hear clue X, I immediately mash my buzzer and say Y." So I would argue that this graphic indicates that the middle and late clues really were harder than a "y = x^2" distribution. (There's probably still some lag on average between hearing a crucial middle/late clue and buzzing in correctly, but I would imagine that lag is less than the lag for processing early clues.)

It might be interesting to generate a similar CDF that ignores buzz point data whenever there's a neg, and CDFs looking only at the tossups in the bottom quarter or third or half of correct buzz percentage by the 30% mark or the 50% mark. From looking at Fred's data for packet 1, there were a good number of questions that were quite hard for the first two sentences. Do those questions quickly catch up to the overall CDF, or generally stay harder throughout? (I.e. is the issue confined to a few early clues, or are these questions where every clue is a sentence or two harder than it should be?)
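A neg-free CDF of the kind suggested above could be computed along these lines. This is a sketch under stated assumptions: the record layout `(room, tossup, fraction_read, is_correct)` and the function name are hypothetical, and the sample rows are invented solely to show the call. They are not Regionals data.

```python
from collections import defaultdict


def clean_buzz_cdf(buzzes, total_tossup_rooms, grid):
    """Cumulative fraction of tossup-rooms with a correct buzz by each grid point,
    dropping every tossup-room in which anyone negged.

    buzzes -- iterable of (room, tossup, fraction_read, is_correct) tuples
    """
    by_room = defaultdict(list)
    for room, tossup, fraction_read, is_correct in buzzes:
        by_room[(room, tossup)].append((fraction_read, is_correct))

    # Keep the buzz point of each tossup-room that had a get and no neg.
    clean_points = []
    negged = 0
    for events in by_room.values():
        if any(not ok for _, ok in events):
            negged += 1
            continue
        clean_points.append(min(point for point, _ in events))

    denominator = total_tossup_rooms - negged
    if denominator <= 0:
        raise ValueError("no neg-free tossup-rooms to measure")
    return [sum(p <= x for p in clean_points) / denominator for x in grid]


# Made-up records purely to show the call; not real data.
sample = [
    ("A", 1, 0.20, True),
    ("B", 1, 0.35, False),   # neg, so room B's tossup 1 is dropped entirely
    ("B", 1, 1.00, True),
    ("C", 1, 0.60, True),
]
# Four tossup-rooms in total; the fourth (room D) went dead with no buzzes.
result = clean_buzz_cdf(sample, total_tossup_rooms=4, grid=[0.25, 0.5, 0.75, 1.0])
print(result)
```

Restricting `buzzes` to the hardest quarter or half of tossups before calling the function would give the subset CDFs mentioned in the same paragraph.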
Seth Teitler
Formerly UC Berkeley and U. Chicago
President of NAQT
Emeritus member of ACF
Jack
Lulu
Posts: 91
Joined: Thu Sep 28, 2017 5:07 pm

Re: What is the purpose of a first line?

Post by Jack »

This may not be specifically germane to the discussion at hand, but seeing the graph immediately made me think of Lorenz curves and Gini coefficients. Now, don't get me wrong, I've never edited a quiz bowl question in my life, and anything I've ever written has probably been terrible, but I wonder if you could potentially create a similar metric to the Gini coefficient for buzz distributions which could be used as a way to gauge difficulty of sets, or the relative difficulty the editors have designed for a set.

You could compare to the y=x line (much like Gini) as the baseline (since anything less steep would probably mean the set is terrible and imply it is nonpyramidal) and measure the area compared to the area above a regression line for buzzes like Ryan made. Perhaps you could use this to compare the relative difficulty of the same set across years (or break it down further for a category, like science), or to compare the buzzpoint distributions of categories in the same tournament (though obviously the amount of data is lacking for this since this is relatively new). It might make sense that ACF Nationals has a much higher "index" than ACF Fall, and in theory it could help editors get a sense of how their final product was received compared to other sets. Again, I'm not sure if this type of data even exists in order to make these comparisons, but it seems like a cool idea to mathematically express "xxx set felt like people were not getting science until late in questions compared to other questions, this year's science was too hard." Maybe this idea isn't that useful since average buzzpoints already exist. I don't know. Just popped into my head.
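One way to make this idea concrete is to borrow the Gini formula directly: take twice the area between the y = x line and the cumulative buzz curve. The sketch below is an assumption about how such an index might be defined, not an established quizbowl statistic; `buzz_index` is a made-up name, and the curve is assumed to be normalized so that it ends at 1.

```python
def buzz_index(xs, cdf):
    """Gini-style index: twice the area between y = x and the buzz CDF.

    xs  -- increasing fractions of the tossup read, starting at 0 and ending at 1
    cdf -- cumulative fraction of buzzes at each point in xs
    """
    if len(xs) != len(cdf) or len(xs) < 2:
        raise ValueError("xs and cdf must have the same length of at least 2")
    area_under_cdf = 0.0
    for i in range(1, len(xs)):
        width = xs[i] - xs[i - 1]
        area_under_cdf += width * (cdf[i] + cdf[i - 1]) / 2  # trapezoid rule
    return 1.0 - 2.0 * area_under_cdf


# Evaluate the index on the three reference curves from the graph.
grid = [i / 100 for i in range(101)]
indices = {}
for name, power in (("y = x", 1), ("y = x^2", 2), ("y = x^3", 3)):
    indices[name] = buzz_index(grid, [x ** power for x in grid])
    print(name, round(indices[name], 3))
```

Under this definition y = x scores about 0, y = x^2 about 1/3, and y = x^3 about 1/2, so a higher value means buzzes are concentrated later in the question.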
Jack
Bermudian Springs HS
Princeton University '21
Carlos Be
Wakka
Posts: 216
Joined: Sun Jun 25, 2017 11:34 pm

Re: What is the purpose of a first line?

Post by Carlos Be »

A short anecdote: because of this discussion, I went back to look at some first-lines and I noticed in particular this lead-in for the "Volsunga saga":
A poem about the “Lovers” of a character from this work appears in the third part of The Earthly Paradise by William Morris
After a bit of research I found that this lead-in is inaccurate. Morris's "Lovers of Gudrun" is actually based on a character from the Laxdaela saga that shares her name with the character from the Volsunga saga. Apparently, no one noticed this. (If someone did notice this and I didn't notice the noticing just ignore this post.) Obviously, one example does not mean anything, but the fact that out of the first five lead-ins I looked at one happened to be false and no one noticed suggests that there is a reasonably significant proportion of lead-ins that no one is engaging with, whether by buzzing or by getting context.

More questions would have to be analyzed to make a conclusion, but based on the buzz-stats in this thread and conversations with other players I'll hypothesize that many lead-ins could be removed without significant effects.
Justine French
she/her
Smuttynose Island
Forums Staff: Moderator
Posts: 614
Joined: Wed Oct 21, 2009 9:07 pm

Re: What is the purpose of a first line?

Post by Smuttynose Island »

jacke wrote: Wed Mar 06, 2019 8:34 pm This may not be specifically germane to the discussion at hand, but seeing the graph immediately made me think of Lorenz curves and Gini coefficients. Now, don't get me wrong, I've never edited a quiz bowl question in my life, and anything I've ever written has probably been terrible, but I wonder if you could potentially create a similar metric to the Gini coefficient for buzz distributions which could be used as a way to gauge difficulty of sets, or the relative difficulty the editors have designed for a set.

You could compare to the y=x line (much like Gini) as the baseline (since anything less steep would probably mean the set is terrible and imply it is nonpyramidal) and measure the area compared to the area above a regression line for buzzes like Ryan made. Perhaps you could use this to compare the relative difficulty of the same set across years (or break it down further for a category, like science), or to compare the buzzpoint distributions of categories in the same tournament (though obviously the amount of data is lacking for this since this is relatively new). It might make sense that ACF Nationals has a much higher "index" than ACF Fall, and in theory it could help editors get a sense of how their final product was received compared to other sets. Again, I'm not sure if this type of data even exists in order to make these comparisons, but it seems like a cool idea to mathematically express "xxx set felt like people were not getting science until late in questions compared to other questions, this year's science was too hard." Maybe this idea isn't that useful since average buzzpoints already exist. I don't know. Just popped into my head.
I don't know how the Gini coefficient is computed, but the area under the cumulative distribution for the "average" TU at a tournament gives you a good way to compare the relative difficulty of sets [modulo the fact that your fields are different]. Given fields of roughly equal strength, a harder tournament will have a smaller area under the curve as more buzzes occur later in the TUs. An easier tournament will have a higher area under the curve as more TUs are answered earlier.
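As a rough illustration of that comparison, here is a minimal Python sketch. It assumes each buzz is recorded as the fraction of the tossup that had been read (0 = first word, 1 = last word) and that the curve is normalized by the number of buzzes. The function names and the sample numbers are made up for illustration and are not taken from the Regionals data.

```python
from bisect import bisect_right


def empirical_cdf(buzz_positions, t):
    """Fraction of buzzes at or before position t (0 = start, 1 = end of tossup)."""
    ordered = sorted(buzz_positions)
    return bisect_right(ordered, t) / len(ordered)


def area_under_cdf(buzz_positions):
    """Exact area under the empirical CDF on [0, 1].

    Each buzz at position p contributes a step of height 1/n that lasts
    from p to 1, so the total area is the mean of (1 - p).
    """
    return sum(1.0 - p for p in buzz_positions) / len(buzz_positions)


# Hypothetical buzz positions for two sets played by similar fields.
early_field = [0.2, 0.4, 0.5, 0.7]
late_field = [0.6, 0.8, 0.9, 1.0]

print(area_under_cdf(early_field))  # larger area: buzzes come earlier
print(area_under_cdf(late_field))   # smaller area: buzzes come later
```

Under these assumptions the area is just one minus the mean buzz position, so this measure carries the same information as an average buzzpoint.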
Daniel Hothem
TJHSST '11 | UVA '15 | Oregon '??
"You are the stuff of legends" - Chris Manners
https://sites.google.com/site/academicc ... ubuva/home
Jack
Lulu
Posts: 91
Joined: Thu Sep 28, 2017 5:07 pm

Re: What is the purpose of a first line?

Post by Jack »

You could compute it with area under the curve too. The Gini coefficient is the area above divided by the total. I was just thinking it would work better this way because then a "harder" tournament would have a higher number.
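A sketch of that Gini-style version, assuming the baseline is the y = x line and the index is the area between y = x and the buzz CDF divided by the area under y = x. The function name and the trapezoid-rule integration are illustrative choices, and the three curves are idealized shapes rather than real tournament data.

```python
def gini_style_index(cdf, steps=1000):
    """(area between y = x and cdf) / (area under y = x), both on [0, 1].

    `cdf` is any function mapping a position in [0, 1] to the cumulative
    share of buzzes by that position. 0 means the curve sits on y = x;
    values closer to 1 mean buzzes are concentrated at the end.
    """
    h = 1.0 / steps
    # Trapezoid rule for the area under the supplied CDF.
    area = sum((cdf(i * h) + cdf((i + 1) * h)) / 2 * h for i in range(steps))
    baseline_area = 0.5  # area under y = x on [0, 1]
    return (baseline_area - area) / baseline_area


linear = gini_style_index(lambda x: x)          # the y = x baseline
quadratic = gini_style_index(lambda x: x ** 2)  # roughly 1/3
cubic = gini_style_index(lambda x: x ** 3)      # roughly 1/2

print(linear, quadratic, cubic)
```

With this definition the index equals 1 minus twice the area under the CDF, so a set where buzzes come later gets the higher number.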
Jack
Bermudian Springs HS
Princeton University '21
Skepticism and Animal Feed
Auron
Posts: 3238
Joined: Sat Oct 30, 2004 11:47 pm
Location: Arlington, VA

Re: What is the purpose of a first line?

Post by Skepticism and Animal Feed »

First of all, let me say how jealous I am that writers and editors today have this wonderful data, which in my day was the province of pie-in-the-sky futurists. Truly quizbowl keeps getting better and better and it's fun to watch this happening from the retirement home.

Second of all, I want to talk about what lead-in clues meant to me when I was writing quizbowl questions. Yes, obviously it provided context, dropped a pronoun, stated the type of thing that was being asked, etc., but more personally it was the part of the tossup I found most rewarding and enjoyable to write. One of the reasons I wrote questions was the sheer joy of sharing entertaining or new information with people who I knew would appreciate it, and the lead-in did this more than any other part of the tossup. Sometimes I would discover an amazing, obscure fact about something famous, and then rush to my computer to write a tossup solely so I could share this fact with the quizbowl community. I would get giddy thinking of how my new clue would be received by the quizbowl community, especially by its more physically animated members. Eventually, as I wrote more and more questions, I came to view the process of writing a lead-in clue, and then seeing smiles on the faces of people who heard my lead-in clue, as my compensation for the otherwise pretty mundane task of writing a tossup. I would have probably written fewer things if I could not have the first sentence as my playground (though I'd write much faster, because the lead-in often took the most time to research because I wanted a fresh, interesting clue). And when people actually buzzed on one of those lead-in clues, I was very happy and they were often much more overtly happy. Everyone remembers that time they first-lined a tossup on something they love, and the rarity of it probably helps make it memorable. There's also a sense of pride that comes from introducing a particular clue to quizbowl.

There's decades of quizbowl theory, best practices about how to write tossups, treatises about what makes a question fair or knowledge real, etc., but at the core of quizbowl there are a bunch of people with a primal love for knowledge who enjoy learning new things and are willing to get up on a Saturday morning and go to Providence, RI just because they might learn a new thing from doing so. In 2010, when SQBS was all we had, I would have argued that even a lead-in clue that nobody buzzed on could still be considered a success if it taught people things and thereby brought joy to their hearts.

But despite all of that, when I look at these first-sentence conversion stats, my gut reaction is "oh god, we've been doing it wrong". We've been wasting time hunting down these obscure clues, we have been wasting time forcing people to listen to them. I don't know what my ideal distribution of buzzes is, I just know that this ain't it, and it certainly seems that a large chunk of the field is not engaging with the early clues much at all. If I were still writing tossups today, I would go back and make a conscious effort to make lead-ins a bit less obscure. I don't know how I'd do that without also leading to buzzer races on the first clue, but it might involve something like being more selective about which sources I use for a clue. I.e., try to find the most obscure thing in a famous source, rather than pulling a clue out of a dusty old book that most likely very few people have read. I've never had Andrew Hart's supernatural ability to will a certain get distribution into existence and easily hit numerical conversion targets so it'd probably involve a lot of trial and error...and hey look the advanced stats are now available to allow writers to do exactly that kind of trial and error.
Bruce
Harvard '10 / UChicago '07 / Roycemore School '04
ACF Member emeritus
My guide to using Wikipedia as a question source
Smuttynose Island
Forums Staff: Moderator
Posts: 614
Joined: Wed Oct 21, 2009 9:07 pm

Re: What is the purpose of a first line?

Post by Smuttynose Island »

setht wrote: Wed Mar 06, 2019 6:20 pm It might be interesting to generate a similar CDF that ignores buzz point data whenever there's a neg
Ophir's raw data conveniently lists if a TU was gotten as a "bounceback." I went ahead and computed the cdf for the "average" ACF Regionals 2019 TU that was not a bounceback. This cdf only includes gets that were not gotten after a neg. Below is a plot of this cdf against Will Alston's theoretical y = x^2 as well as the less desirable y = x^3.

Image

I will add one word of caution - There's always a non-negligible chance that I messed up somewhere. If anyone wants to double check my work, they can find my code on github. My username is dhoth.
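For anyone double checking, the calculation described above can be sketched roughly as follows. This is not the code on github, just a minimal Python illustration. The record fields (`position`, `correct`, `bounceback`) and the sample buzzes are assumptions made for the example, not the layout of Ophir's raw data.

```python
def no_bounceback_cdf(buzzes, grid):
    """Share of non-bounceback gets at or before each grid position.

    Normalized by the number of non-bounceback gets, so the curve
    always ends at 1.
    """
    positions = [
        b["position"] for b in buzzes
        if b["correct"] and not b["bounceback"]
    ]
    if not positions:
        raise ValueError("no non-bounceback gets to build a CDF from")
    n = len(positions)
    return [sum(1 for p in positions if p <= t) / n for t in grid]


# Hypothetical buzzes on one tossup across six rooms.
buzzes = [
    {"position": 0.35, "correct": True, "bounceback": False},
    {"position": 0.60, "correct": True, "bounceback": False},
    {"position": 0.75, "correct": False, "bounceback": False},  # a neg
    {"position": 1.00, "correct": True, "bounceback": True},    # gotten after a neg
    {"position": 0.90, "correct": True, "bounceback": False},
    {"position": 0.95, "correct": True, "bounceback": False},
]

grid = [0.25, 0.5, 0.75, 1.0]
observed = no_bounceback_cdf(buzzes, grid)
target = [t ** 2 for t in grid]  # the theoretical y = x^2 curve

for t, obs, want in zip(grid, observed, target):
    print(t, obs, want)
```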


EDIT: Uploaded the wrong image.
Daniel Hothem
TJHSST '11 | UVA '15 | Oregon '??
"You are the stuff of legends" - Chris Manners
https://sites.google.com/site/academicc ... ubuva/home
setht
Auron
Posts: 1205
Joined: Mon Oct 18, 2004 2:41 pm
Location: Columbus, Ohio

Re: What is the purpose of a first line?

Post by setht »

Smuttynose Island wrote: Thu Mar 07, 2019 3:29 am Ophir's raw data conveniently lists if a TU was gotten as a "bounceback." I went ahead and computed the cdf for the "average" ACF Regionals 2019 TU that was not a bounceback. This cdf only includes gets that were not gotten after a neg. Below is a plot of this cdf against Will Alston's theoretical y = x^2 as well as the less desirable y = x^3.

Image

I will add one word of caution - There's always a non-negligible chance that I messed up somewhere. If anyone wants to double check my work, they can find my code on github. My username is dhoth.


EDIT: Uploaded the wrong image.
This is great, thanks Daniel.

Comparing with Ryan's "gets + negs" CDF, I actually don't see much difference—Daniel's no-negs CDF gets a bit closer to the y = x^2 curve in the late middle section of tossups, but to my eyes the main difference is right at the end, where Ryan's CDF jumps up to a bit more than 1. Ryan, are you counting all buzzes (gets and negs) but normalizing by gets alone, or something?

It's nice to see that whether we frame things in terms of "do the clues in the first X percent of tossups get people to buzz" or "do the clues in the first X percent of tossups get people to buzz correctly," the relevant data look very similar. (At the scale of the whole set; presumably smaller sets of data such as individual tossups, categories, packets, etc. might show differences between the distributions of all buzzes and correct buzzes.)
Seth Teitler
Formerly UC Berkeley and U. Chicago
President of NAQT
Emeritus member of ACF
setht
Auron
Posts: 1205
Joined: Mon Oct 18, 2004 2:41 pm
Location: Columbus, Ohio

Re: What is the purpose of a first line?

Post by setht »

justinfrench1728 wrote: Wed Mar 06, 2019 9:10 pm A short anecdote: because of this discussion, I went back to look at some first-lines and I noticed in particular this lead-in for the "Volsunga saga":
A poem about the “Lovers” of a character from this work appears in the third part of The Earthly Paradise by William Morris
After a bit of research I found that this lead-in is inaccurate. Morris's "Lovers of Gudrun" is actually based on a character from the Laxdaela saga that shares her name with the character from the Volsunga saga. Apparently, no one noticed this. (If someone did notice this and I didn't notice the noticing just ignore this post.) Obviously, one example does not mean anything, but the fact that out of the first five lead-ins I looked at one happened to be false and no one noticed suggests that there is a reasonably significant proportion of lead-ins that no one is engaging with, whether by buzzing or by getting context.

More questions would have to be analyzed to make a conclusion, but based on the buzz-stats in this thread and conversations with other players I'll hypothesize that many lead-ins could be removed without significant effects.
I'd be pretty leery of anecdata like this—it could be that someone did notice this issue, but didn't post here, for instance. And getting 1 hit in a sample of 5 is not a good basis for drawing conclusions.

Thankfully, we have much better options for testing hypotheses like "many lead-ins could be removed without significant effects." For instance, I think it might be instructive to look at all the tossups with 0 gets in the first sentence, and see how many also have 0 gets in the second sentence. (Or 1 get, or 2 gets, up to however many gets we consider a reasonable minimum number for the end of the second sentence. Again, thanks to Fred's processed data, we can already say that the first packet had 5 tossups with 0 gets by the end of the second sentence, and another 2 with 1 get by the end of the second sentence.) Or you could make that "how many tossups have 0 gets within 15 words after the first sentence," on the assumption that 15 words generously covers possible delays in processing useful first-sentence clues.
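A minimal sketch of that check in Python, assuming each tossup record stores the word index where its first sentence ends and the word indices of its correct buzzes. The field names, the helper names, and the sample tossups are hypothetical and would need to be mapped onto the real buzz point data.

```python
def gets_through(word_index, get_words):
    """Number of correct buzzes at or before the given word index."""
    return sum(1 for w in get_words if w <= word_index)


def quiet_leadin_counts(tossups, grace_words=15, max_gets=0):
    """Count tossups with 0 gets in the first sentence, and how many of
    those still have at most `max_gets` gets `grace_words` words later."""
    quiet_first = [
        t for t in tossups
        if gets_through(t["first_sentence_end"], t["get_words"]) == 0
    ]
    still_quiet = [
        t for t in quiet_first
        if gets_through(t["first_sentence_end"] + grace_words,
                        t["get_words"]) <= max_gets
    ]
    return len(quiet_first), len(still_quiet)


# Hypothetical tossups: word index ending sentence one, word index of each get.
tossups = [
    {"first_sentence_end": 25, "get_words": [30, 60, 90]},  # delayed first-line buzz
    {"first_sentence_end": 20, "get_words": [50, 70]},      # nothing early at all
    {"first_sentence_end": 22, "get_words": [18, 40]},      # buzz inside sentence one
]

print(quiet_leadin_counts(tossups))               # strict: 0 gets after the grace window
print(quiet_leadin_counts(tossups, max_gets=1))   # looser: allow 1 get
```

The `max_gets` parameter stands in for "0 gets, or 1 get, or 2 gets", and `grace_words` for the 15-word allowance for processing time.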
Seth Teitler
Formerly UC Berkeley and U. Chicago
President of NAQT
Emeritus member of ACF
ryanrosenberg
Auron
Posts: 1890
Joined: Thu May 05, 2011 5:48 pm
Location: Palo Alto, California

Re: What is the purpose of a first line?

Post by ryanrosenberg »

Smuttynose Island wrote: Thu Mar 07, 2019 3:29 am I will add one word of caution - There's always a non-negligible chance that I messed up somewhere. If anyone wants to double check my work, they can find my code on github. My username is dhoth.
I'm getting different results -- it looks like you're setting the maximum possible gets on a tossup to the number of gets that actually happened, rather than the number of rooms it was heard in (which would be 100% conversion).
setht wrote: Thu Mar 07, 2019 9:59 am Ryan, are you counting all buzzes (gets and negs) but normalizing by gets alone, or something?
Yes, that's correct. I had been doing that to conform to Fred's initial idea that incorrect buzzes still reflect players engaging with the clue. However, I think it's instructive to look at both -- if I get some time this afternoon I'll look at making some graphs comparing the different ways of analyzing a buzz distribution proffered in this thread, as well as perhaps for some tournaments with different distributions (EFT, CMST).
Ryan Rosenberg
North Carolina '16
ACF
Smuttynose Island
Forums Staff: Moderator
Posts: 614
Joined: Wed Oct 21, 2009 9:07 pm

Re: What is the purpose of a first line?

Post by Smuttynose Island »

ryanrosenberg wrote: Thu Mar 07, 2019 1:29 pm
Smuttynose Island wrote: Thu Mar 07, 2019 3:29 am I will add one word of caution - There's always a non-negligible chance that I messed up somewhere. If anyone wants to double check my work, they can find my code on github. My username is dhoth.
I'm getting different results -- it looks like you're setting the maximum possible gets on a tossup to the number of gets that actually happened, rather than the number of rooms it was heard in (which would be 100% conversion).
That's correct. I'm going to switch to talking about pdfs for a moment, since I think it is easier to frame questions about them. When constructing the pdf, I tried to answer the following question - "If a player correctly answers a TU without another player negging, how likely were they to buzz in at the n% mark?" Because of this, my sample consists of all TUs that players correctly answered without another player negging. The estimate for "If a player correctly answered a TU, how likely were they to buzz in at the n% mark" is then "number of questions gotten at the n% mark / total sample size." Using the total number of rooms should give you a pdf whose area under the curve is less than 1 as you are counting TUs that went dead or were gotten after a neg? Is this wrong?

If you wanted to incorporate dead TUs, I believe that the best way to do it is to record them separately. Perhaps by marking their buzz point as 1.001 or something.
Daniel Hothem
TJHSST '11 | UVA '15 | Oregon '??
"You are the stuff of legends" - Chris Manners
https://sites.google.com/site/academicc ... ubuva/home
ryanrosenberg
Auron
Posts: 1890
Joined: Thu May 05, 2011 5:48 pm
Location: Palo Alto, California

Re: What is the purpose of a first line?

Post by ryanrosenberg »

Smuttynose Island wrote: Thu Mar 07, 2019 2:31 pm Using the total number of rooms should give you a pdf whose area under the curve is less than 1 as you are counting TUs that went dead or were gotten after a neg? Is this wrong?

If you wanted to incorporate dead TUs, I believe that the best way to do it is to record them separately. Perhaps by marking their buzz point as 1.001 or something.
Yes, the area under the curve would be the conversion percentage (minus the neg percentage if you're excluding bouncebacks). That seems fine to me as a way to measure convertibility of a tournament as well as distribution. One could imagine a tournament with a reasonable buzz distribution but low conversion, e.g. ACF Nationals 2015, which had relatively normal scores in the top brackets but record-setting low scores in the bottom bracket.
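To make the two normalizations concrete, here is a small Python sketch. It assumes buzz positions are fractions of the tossup read and bins them into a simple histogram-style pdf. The helper names, the bin count, and the numbers (6 gets across 8 rooms) are invented for illustration.

```python
def binned_density(get_positions, denominator, bins=10):
    """Histogram-style pdf of get positions on [0, 1].

    `denominator` is either the number of gets (area comes out to 1)
    or the number of rooms (area comes out to the conversion rate).
    """
    counts = [0] * bins
    for p in get_positions:
        index = min(int(p * bins), bins - 1)  # a buzz at 1.0 goes in the last bin
        counts[index] += 1
    width = 1.0 / bins
    return [c / denominator / width for c in counts]


def area(density):
    """Area under a binned density whose bins evenly cover [0, 1]."""
    width = 1.0 / len(density)
    return sum(d * width for d in density)


# Hypothetical tossup: heard in 8 rooms, converted in 6, dead in 2.
gets = [0.3, 0.5, 0.55, 0.8, 0.9, 1.0]
rooms = 8

by_gets = binned_density(gets, len(gets))
by_rooms = binned_density(gets, rooms)

print(area(by_gets))   # about 1.0
print(area(by_rooms))  # about 0.75, the conversion rate
```

Both curves have the same shape. Dividing by rooms only rescales the curve so that its area also reports how often the tossup was converted.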
Ryan Rosenberg
North Carolina '16
ACF
theMoMA
Forums Staff: Administrator
Posts: 5993
Joined: Mon Oct 23, 2006 2:00 am

Re: What is the purpose of a first line?

Post by theMoMA »

Contra Bruce, I don't think these buzz distributions are off base; they look basically how I'd expect and want them to look, as both an editor and player. To me, quizbowl is a game that mixes fast-paced and extremely competitive gameplay with brief moments of respite and strategic contemplation. While I occasionally enjoy playing a tournament such as ACRONYM in which every question threatens to be converted on the first clue, I don't think I would enjoy a higher-level academic event in which I had to be 100% focused on achieving a millisecond buzzer advantage on every sentence of every tossup. I appreciate how editors are frequently able to find interesting, deep clues for familiar material that set the table for the rest of the question (and occasionally reward me or someone else for knowing those clues).

I do agree with Bruce and others in this thread who have pointed out that some lead-ins are obscure to the point of uselessness. I suspect that some of these are a result of an answer that is simply too hard; if you write on a Regionals answerline at Fall, not only are fewer players going to know it at the end, but fewer players are going to know all of the clues going up the line; for sufficiently difficult answer lines, no one in the field might know more than a clue or two about the subject, turning most of the tossup's material into dead air. But that's not really a problem with the lead-in so much as with answer selection. (Another answer selection issue that can result in a useless lead-in is picking an answer line before investigating whether there are enough buzzable clues to support it, pressing on with writing even when it turns out there are not, and being forced into using some obscure lead-in with 41 Google results as a consequence. Again, this isn't really a lead-in issue.) It's also possible that some lead-ins are obfuscated into unbuzzability by writers who use de-specifying language like "a certain character" out of fear that the clue will otherwise be too easy; that's fairly straightforward to fix by just picking a clue that is difficult enough so you can describe it in all its particulars and proper nouns without fear.

I'm now going to perch on my highest armchair and postulate about a method of tossup construction that can result in useless lead-ins of the kind that Bruce seems to lament. I'll contrast this method with what I consider a better one and walk through how I wrote a particular tossup that I think offers a good contrast. (I'm flattered at Bruce's description of my difficulty-assessing abilities, but there's nothing supernatural about it; after a lot of trial and error, I've come up with some good methods for writing questions that are mostly clue-dense, interesting, and gettable, and I will share some of those below.)

In the method of tossup construction that will serve as my bogeyman in this discussion, the writer views the tossup as a hodgepodge of clues that all happen to point to the same answer line. The function of the lead-in in such questions is not to set the table for the clues that follow, or to introduce the little story that the question is telling, but simply to take up the space after the tossup number and be less famous or buzzable than the sentence that follows. Writers occasionally try to spice up questions of this construction by writing a complex answer line or finding an amusing anecdote as a lead-in, but this can itself be a problem. (I'll set aside issues related to answer lines for this discussion and focus on the practice of finding an amusing lead-in to add panache to an otherwise-uninspired question.)

The issue with this writing method is that there is no teleological framework by which the writer can judge the obscurity and relative difficulty of any given clue. For instance, there are almost infinitely many ways to write tossups on basic answers like, say, "rivers." An interesting way would use that answer line as a way to ask about, say, a body of study within earth science, cultural studies, or religion. The most boring possible way to do so is haphazardly to pick three or four rivers from around the world, guess at how famous they are relative to each other and order them accordingly, find a random clue for each, and call it a geography tossup. This would be boring even if the first clue was something amusing like "Richard Nixon once threw 18 Big Mac wrappers into one of these geographic features while raving about the communist threat." (Although, if you managed to find five straight clues as amusing as that entirely made up anecdote, and they were gettable and arranged properly, I would concede that you'd written a good tossup; clue quality can sometimes overwhelm a lack of structure.)

The issue is that, if you found a funny clue along those lines, you might be tempted to use it as a flavorful lead-in for an otherwise bland question regardless of how many people actually know it. Relatedly, such a clue does not fit within any external body of knowledge about "rivers," and thus the writer has no guidance on how relatively famous any given clue is to another, or any signposts for figuring out if the clue is well known to anyone else who is interested in a particular topic. (This is because a tossup on four haphazardly chosen "rivers" isn't even really on a topic at all!) In short, the hodgepodge-style tossup is a receptive shell for amusing lead-ins but its lack of structure does not help the writer think about how famous those lead-ins are relative to the other clues or whether they're even known at all.

The Gallant to this story's Goofus is what I would call a "coherent" tossup. By coherent, I mean only that the writer considers the tossup to be more than a collection of clues that are isolated from one another but all point to the same answer; instead, the writer comes up with a set of clues that all relate to both the answer line and to each other to test on some coherent body of knowledge related to the subject of the question.

I'll offer an example by working through how I wrote a particular tossup for EFT.

Whenever I set out to write a question, I first have to think of something to write about. I don't keep a notebook for this, but I read quite a bit and watch movies and listen to podcasts. When I sit down to write, I think about the category for a bit. If I can't remember a good idea from something I've learned recently, I think of a random topic, and then I go to Wikipedia or Google and start drilling into that topic until I think of a good idea for a question. Importantly, when I say "good idea," I don't mean "good answer line"; I mainly think about the conceit of the question, or the basic body of knowledge that it will test, and then I think of the various answer lines that I could write to test for that same idea and choose the one that I think will be the easiest for players to comprehend and convert. (I also often rewrite questions if I figure out a better way to ask about the same body of knowledge, or if a better idea comes up in my research for a particular question.)

In the case of this particular EFT question, I'd just been reading through some of Richard Hofstadter, and I remembered seeing (but not reading) his essay about John C. Calhoun. It struck me that a tossup on Calhoun's political writings would be a good way to ask about a fairly basic American history topic (Calhoun and the politics of antebellum America) in a coherent, interesting, and gettable way. In this case, it seemed like the best answer line for the topic was Calhoun himself, but I was open to changing that, or refocusing the question entirely, if interesting new avenues opened up during the research and writing.

(Contrast this with the hodgepodge method of writing on Calhoun, which would take three or four pieces of information related to the man and his life and arrange them in perceived order of difficulty. Such a tossup could very loosely cohere around Calhoun's biography, but although its clues would all point to Calhoun, they would not necessarily have any relation to each other, and there would be less of an external structure for difficulty comparisons.)

Once I have a particular conceit or theme in mind, I start looking for the basic topics that should be in the question. Usually, I have a good idea about what two of those clues are going to be (the giveaway and the piece of information that inspired me to write the question in the first place). Then I'll often head over to Wikipedia or Google (often Google Scholar) to look around and make sure that the topic I'm thinking of is actually a coherent "thing" and not some random association in my head. (At higher difficulty levels, literature reviews and PhD theses are especially good for this, and can often provide the skeleton for an entire tossup or bonus.) I do occasionally write tossups themed around random associations in my head, but in those cases, I take special care to make sure that my clues are all highly interesting and worth the detour into vanity.

In this case, I was certain that Calhoun's political writings were a coherent topic, and I thought that I'd use the Hofstadter essay and the South Carolina Exposition and Protest as an early clue and late clue, respectively. Although I dimly recalled that Calhoun wrote a longer work of political theory, I wanted to make sure that there was something notable to put in the middle. And yes, I remembered correctly, and the book was called A Disquisition on Government.

After (or sometimes during the process of) researching possible clues, I set them down in rough question form. I find it helpful to write the giveaway first, and build the question backward from there, because it prevents the late-middle clues from becoming too cramped and the lead-in from becoming too expansive. And then I tinker a lot with the wording to ensure that I'm hitting my length target, phrasing everything clearly and unambiguously, and arranging each sentence pyramidally (sentences have to be internally phrased in pyramidal fashion, and not just organized that way relative to each other!).

Here, I wanted the giveaway to say that Calhoun advocated nullification and was from antebellum South Carolina. I knew that the clue about his tariff opposition would flow into that giveaway, and I figured that adding something about the nullification crisis would also, despite being slightly off the "written works" theme, be a good immediately-pre-FTP clue, and would in any event be a good way to show how he put his philosophy into practice. I thus phrased the last two sentences: "This man and another politician from the same state, (*) Robert Hayne, offered toasts praising state liberty during an 1832 crisis that this man helped precipitate while serving as vice president. An 'exposition and protest' against the “Tariff of Abominations” was written by, for 10 points, what antebellum politician, an advocate of nullification from South Carolina?"

Atop those late-middle and giveaway clues, I placed two middle clues: the earlier one about Calhoun's Disquisition, and the later one describing the Exposition and Protest. In the various overview sources I saw, including Wikipedia, the Disquisition seemed to be the kind of thing that could be name dropped in power, and its most important concept seemed to be that of the "concurrent majority" to resolve questions of federalism. And I thought the Exposition and Protest was important enough to warrant a descriptive clue that was substantively evocative and also provided some context about authorship and date. So I wrote the middle two sentences as "This man proposed resolving federalism questions via the concept of 'concurrent majority' in his posthumously published treatise A Disquisition on Government. This man wrote that the 'right of judging' is an 'essential attribute of sovereignty' in a document that he authored anonymously in 1828."

Finally, I was pretty convinced that Hofstadter's take on Calhoun was known to at least a few players (it is an influential essay, as far as I know), and that it would serve as an interesting contextual clue (Hofstadter is an American historian, the "master class" is a slaveocracy term, Calhoun was notable for both his politics and his political theory) for players who hadn't heard of it. So I wrote the first clue as "Richard Hofstadter claimed that this man was 'probably the last American statesman to do any primary political thinking' in an essay stating that this man foreshadowed and inverted communist ideas and calling him the 'Marx of the Master Class.'" The set editors rightly thought that this was too long and cut the words "stating that this man foreshadowed and inverted communist ideas," but left the rest of the text in this clue and the rest of the question unchanged.

The question was converted in all rooms (no surprise, Calhoun is a middle-school American history staple), powered in 17% of rooms, and negged in 10% of rooms. (The averages for the tournament were 90% conversion, 19% power, 22% neg.)

The buzz distribution looked like this

About 3 buzzes near the end of the first sentence, a trickle of powers and negs up to the power mark, and then fairly steady conversion to the end. (I suspect that the plateau near the end was on the "exposition and protest" clue right before FTP.)

I wouldn't claim that this is the perfect tossup, but I think it shows the strength of picking an easy answer line and using it to clue a coherent body of knowledge. The buzz distribution is fairly steady, all of the clues generated at least a couple buzzes, and everyone got it at the end.

Because I knew that the literature surrounding questions of state and federal power, including Calhoun's writings specifically, was a very relevant topic in American history, it was very easy for me to arrange the clues in the order of their notability within that body of knowledge. And because knowing the structure of that body of knowledge allowed me to assess the relative importance of Richard Hofstadter's essay within it, I was able to pick a lead-in clue that was both interesting and useful to players. With the aid of experience, all this took me about 20 minutes, most of which I spent reading the Hofstadter essay.

Had I simply selected a few haphazard clues from Calhoun's life and the scholarship surrounding it, I would not have had those advantages, and I may have picked an amusing but unbuzzable clue to sit atop the question. (I googled around for something funny but obscure about Calhoun that I might have used as a lead-in in a past life, and although he seems to have been a rather dour and boring person, I did find an anonymously reported "anecdote of the Civil War" in which Calhoun had a dream that George Washington told him to look at his hand, and when he did, the black spot of Benedict Arnold appeared on it. It has fewer than 100 Google results.)
Andrew Hart
Minnesota alum
ryanrosenberg
Auron
Posts: 1890
Joined: Thu May 05, 2011 5:48 pm
Location: Palo Alto, California

Re: What is the purpose of a first line?

Post by ryanrosenberg »

ryanrosenberg wrote: Thu Mar 07, 2019 1:29 pm Yes, that's correct. I had been doing that to conform to Fred's initial idea that incorrect buzzes still reflect players engaging with the clue. However, I think it's instructive to look at both -- if I get some time this afternoon I'll look at making some graphs comparing the different ways of analyzing a buzz distribution proffered in this thread, as well as perhaps for some tournaments with different distributions (EFT, CMST).
I took a look at the distributions of correct buzzes (so excluding negs, but not bouncebacks) for the other tournaments for which I had data: EFT, CMST, and Regionals the past two years. I know that there are other tournaments that have these detailed stats but don't have them handy; if someone could point me to them I'd be happy to include them.

[Image: distributions of correct buzzes for EFT, CMST, and the past two years of Regionals]

Note that this should not be interpreted to mean "2018 Regionals was harder than CMST". The CMST field was significantly stronger than the Regionals field (and not just at the top, the low-finishing teams at CMST would likely finish in the middle of a Regionals field). However, the teams playing CMST got the questions earlier, on average, than did the teams playing Regionals.
Ryan Rosenberg
North Carolina '16
ACF
ThisIsMyUsername
Auron
Posts: 1005
Joined: Wed Jul 15, 2009 11:36 am
Location: New York, NY

Re: What is the purpose of a first line?

Post by ThisIsMyUsername »

I will surprise no one by saying that I disagree with Andrew's claims about what he calls "coherent" and "hodgepodge" tossups. (I find these terms already unfairly loaded. I will call these "themed" and "unthemed.") Like him, I want to leave aside the aesthetic questions, and focus on the practical matters of clue choice and playability. I have a longer document on this subject that I've been meaning to post for a while now, but I would need plentiful time to revise it, and probably won't have that time until after Nats. For now, I'll voice my basic objections.

Here is the claim that I find particularly baffling (and much of Andrew's post seems to spring from this assumption):
The issue with this writing method is that there is no teleological framework by which the writer can judge the obscurity and relative difficulty of any given clue.
(1) Why "teleological framework"? The unthemed approach is much more teleological than the themed one, because it cares only about the telos of how many people buzz. Caring about "coherence" is an extra (procedural) concern--one that doesn't need to conflict with the central telos of a structurally sound tossup, but (as I will argue) can.

(2) Why on earth would theming a tossup make it any easier or harder to determine the difficulty of a clue? The difficulty of a clue is synonymous with how many people can buzz on it. No matter whether the tossup is themed or not, you have to know that piece of information in order to place it in the correct location in a tossup. If you don't know that already, "coherence" can't help you. If you do know that, you know where to place the clue even before you decide on the other clues in the tossup.

Let us take the case of Charles Dickens. A Christmas Carol, Great Expectations, and A Tale of Two Cities can all be tossed up at regular difficulty. There is nothing more or less "coherent" about interlacing any of these works with each other, in any particular combination--such a tossup is unthemed unless you focus on a particular type of scene (which may not be viable). If I am able to choose useful lead-ins for a tossup on any of these individual works, then I can also choose a useful lead-in for an unthemed Dickens tossup: in fact, it could be the same exact clue. If A Christmas Carol and Great Expectations are both taken to be of approximately the same difficulty, then an unthemed Dickens tossup could be written like so: the first sentence from what would be a tossup on one of those two works, followed by the second sentence from what would be a tossup on one of those works, etc.

Now, a logical objection to this would be that people don't buzz on clues in a vacuum. Context matters. And by abstracting (e.g.) my second sentence from the hypothetical Great Expectations tossup and placing it in the same spot in a Dickens tossup, I remove some of the continuity that would make it more buzzable. This is true, but the "coherent" writer faces the opposite problem: as the themed tossup goes on, it becomes more fraudable due to the continuity. So, the two approaches have their own problems: unthemed tossups can become hard to follow, and themed tossups can become too easy to fraud.

(3) Making claims about the advantages of tossup-coherence is particularly bizarre in a conversation that is primarily about first-line clues. You can't know from hearing the first sentence whether the rest of the tossup is going to be themed. And, as I've already shown (and this is perhaps the biggest takeaway here): any clue that could be picked by someone writing a themed tossup could also be picked by someone writing an unthemed tossup, but the converse is not true.

(4) A lot of valuable clues are purged if one consistently takes a "coherent" approach. What happens if an artist most known for their landscapes paints a couple of portraits that could be good lead-in material? What happens if a philosopher most known for their work on ethics writes a couple of essays on metaphysics? Should we not use these as lead-ins, because they muddle the supposed "coherence"? For many creators (of literature, art, philosophy, thought), you can't explore their body of work properly unless you're willing to write some unthemed tossups. Most creators do not produce internally coherent bodies of work.

(5) Contra Andrew Hart, I think themed tossups are more prey to structural problems than unthemed tossups are. When you are writing an unthemed tossup, you are controlling for one factor only: buzzability. When you are writing a themed tossup, you are controlling for two: buzzability and thematic cohesiveness. The latter is an additional constraint, and controlling for it on top of controlling for buzzability is harder, not easier. In order for a themed tossup to work, you have to either (a) know going into the research phase that there is going to be enough difficulty-appropriate information for the theme to slot into each place in the tossup; or (b) be very willing to abandon the theme if you discover that certain structural roles cannot be satisfied while clinging to the theme.

My conclusion here is not that unthemed tossups are superior to themed ones, or vice versa. But the grounds for preferring one type to the other a priori--rather than evaluating examples of each type individually--are purely aesthetic. If you know what you're doing, you should be able to write/edit both types well. Both unthemed and themed tossups have a necessary place in quizbowl if we're going to pursue interesting facts worth hearing about, and keep things creative: the former because not all of the knowledge worth knowing and rewarding can be forced into a "coherent" narrative; the latter because when such narratives are viable, they can reveal interesting connections between facts that an unthemed tossup would not.
John Lawrence
Yale University '12
King's College London '13
University of Chicago '20

“I am not absentminded. It is the presence of mind that makes me unaware of everything else.” - G.K. Chesterton
vinteuil
Auron
Posts: 1454
Joined: Sun Oct 23, 2011 12:31 pm

Re: What is the purpose of a first line?

Post by vinteuil »

As a very vocal advocate of themed tossups, I would like to 100% cosign with John's post. Put in all the good clues that fit the theme and then fill the rest of the tossup with other good clues instead of trying to force it. (This is as somebody who has rightly been called out for doing the latter before.)

Andrew himself is supporting such an approach when he describes using late clues only thinly related to Calhoun's writings to ensure good second-half playability.

On the other hand, such an approach isn't abandoning the theme; it's simply moving the tossup a little bit away from the "themed" pole of a themed-unthemed continuum. I personally prefer questions that are "as themed as reasonable," since I think it usually improves playability (when transparency isn't a problem): a tossup on Hardy from his poetry draws from a headspace that is (I think, for most people) more well-defined and "contiguous" than one that, say, alternates clues about poetry and novels.

However, I don't think this playability bonus is particularly big, so this really is more of a matter of aesthetic preference. I will note (credit to Alston for bringing this up) that the classic defense that themed tossups "make new/interesting connections" isn't what I have in mind with this preference; while I admire the creativity of tossups like "'nothing' in Shakespeare," my (and I think Andrew's) default when thinking about "themes" or "coherence" is more along the lines of "tossing up a country with clues [mostly] from one time period," "tossing up an artist [mostly] from works in one genre," "tossing up a basic quantity like "charge" with clues [mostly] from a single scientific subfield," etc. etc.
Jacob R., ex-Chicago
John Ketzkorn
Wakka
Posts: 197
Joined: Tue Oct 08, 2013 2:54 pm

Re: What is the purpose of a first line?

Post by John Ketzkorn »

ThisIsMyUsername wrote: Thu Mar 07, 2019 7:27 pm

(5) Contra Andrew Hart, I think themed tossups are more prey to structural problems than unthemed tossups are. When you are writing an unthemed tossup, you are controlling for one factor only: buzzability. When you are writing a themed tossup, you are controlling for two: buzzability and thematic cohesiveness. The latter is an additional constraint, and controlling for it on top of controlling for buzzability is harder, not easier. In order for a themed tossup to work, you have to either (a) know going into the research phase that there is going to be enough difficulty-appropriate information for the theme to slot into each place in the tossup; or (b) be very willing to abandon the theme if you discover that certain structural roles cannot be satisfied while clinging to the theme.
To add further evidence to John's point here -- I find that many writers (especially novices) who've been told that they have the option to theme a tossup (or believe themed tossups are better than unthemed tossups) will let their buzz distribution suffer for the sake of keeping the tossup themed. Now, more experienced writers may know to pick a theme that's easy to write a buzz distribution around, but this certainly plagues amateur writers. Here's my own correspondence with John over the Karl Moor tossup I submitted to CO 2016 that used the "theme" of quotes from the play (which was an especially terrible idea given the work isn't written in English).
My philosophy on literature questions is basically this: you want to produce a good distribution of buzzes across the question. In the beginning of the tossup, you should be distinguishing between kinds of "real knowledge." One can define "real knowledge" very restrictively to mean solely "having read the books in question," or one can define it more broadly to mean any sort of understanding of a book's contents that goes beyond mere buzzpoint character names / titles. At some point in most tossups, though, in order to produce a good distribution of buzzes one cannot continue to test only real knowledge and must begin distinguishing between levels of non-"real" knowledge: packet knowledge, basic facts.
...
The Karl Moor tossup makes the latter error, I think, but was easy to fix on those grounds: it spends too much time on real knowledge. Why are you still using quotations in the last couple of sentences in this tossup?

Obviously, not everyone makes this mistake, but unthemed tossups are a perfectly reasonable approach to writing tossups with a good buzz distribution and should certainly be the default. An unthemed tossup's first line will still (hopefully) make sense with respect to the answerline. In the hypothetical Charles Dickens tossup, if the first line is a quote, scene or critique that's very Dickensian -- when someone buzzes later in the tossup, they'll hopefully feel more confident in their buzz based on that first line.

This brings me to a point I was tentatively considering bringing up here, which is that first lines don't *have* to be buzzable. A first line can simply provide appropriate context and serve as a means for players to narrow down answer lines. They can also be memorable / funny stories that are worthy of passing on -- not everything in quiz bowl has to be so serious that we can't find out about great stories (like the time George III shook hands with a tree that he thought was the King of Prussia -- which makes *complete* sense with respect to an answerline on George III).

My final point being said -- there is good merit to capping the length of tossups so we don't end up with egregiously long tossups that stretch a tournament into the night. Editors / writers should hopefully use the discussion here to consider cutting first lines of the "same difficulty" as their second line -- especially if the first line provides little in the way of context or as a means of narrowing down the number of answers.
Michael Etzkorn
Illinois Mathematics and Science Academy '16
UIUC '21
Cheynem
Sin
Posts: 7219
Joined: Tue May 11, 2004 11:19 am
Location: Grand Rapids, Michigan

Re: What is the purpose of a first line?

Post by Cheynem »

I think the idea of "themed" and "unthemed" is primarily an aesthetic one that actually might be better for a different thread. I will say, in terms of talking about first lines, I would suspect sometimes, as Andrew says, that "theming" a tossup might make the first line appear more relevant and provide more context (while also being easier to judge difficulty), but sometimes I suspect the converse might also happen.

To use the Calhoun example, that's a very good tossup--I applaud the aesthetic conventions in constructing the question. And indeed, Andrew is correct--the theme of the tossup made the clue ordering pretty simple. In particular, for higher difficulties, that could be pretty useful. But I am uncertain if all aspects of Calhoun's life nicely lend themselves to such themed ordering of clues--would a tossup focusing on Calhoun the VP have such an inherently important lead-in or would the search for a "themed" lead-in result in what we're trying to avoid--something too hard or banal? (To be clear, there is presumably a good lead-in for Calhoun the VP, and I assume Andrew or me or whoever would not choose to theme a question that would be inherently difficult to construct in that way.) I also suspect that some themed tossups are in fact too easy to construct the clue order--the lead-ins are so obviously harder and less famous that they become unbuzzable or rarely buzzed on (this is a common analysis of tossups on "events," for example, some of the most themed tossups around).

I also think John is correct, though--if you have a strong knowledge of the clues involved and the topic at hand, theme or no theme, you should have a pretty good understanding of how to order the clues. Furthermore, I would gently note that while a themed tossup is wonderful at times from a "that's clever!"/aesthetic viewpoint or even sometimes from an academic viewpoint (such as Andrew's Calhoun tossup), theming a tossup doesn't mean you're going to reward how people learn the material. A legally focused scholar would presumably look at Calhoun in that manner, but I, a more generalist historian, learn lots of different facets of Calhoun's life. Indeed, when I taught survey courses, pointing out how Calhoun keeps reappearing like some evil Forrest Gump (as a War Hawk, as VP, the tariff issue, etc.) amused some students (or perhaps it was the fact that his hair gets worse). Sometimes, unthemed tossups, if done right, do a good job of showing the many facets of a topic and how they interact, which could also presumably result in good, solid lead-ins (or bad, of course, if done wrongly).
Mike Cheyne
Formerly U of Minnesota

"You killed HSAPQ"--Matt Bollinger
theMoMA
Forums Staff: Administrator
Posts: 5993
Joined: Mon Oct 23, 2006 2:00 am

Re: What is the purpose of a first line?

Post by theMoMA »

I wasn't clear enough at the outset of my rather long post that I wasn't advocating the one true method to write a tossup.

I didn't intend to suggest that rigorous themes were inherently better than questions that are more mixed in construction, which is why I tried to avoid the naming convention of "themed" and "unthemed," and I didn't intend to suggest that this was the only way to write a tossup. Rather, I wanted to walk through a method of tossup construction that I use occasionally (certainly not all the time) that provides a nice way to assess the relative notability of the lead-in compared to the other clues. I used the term "coherent" because coherence is the specific purpose that having an interrelated set of clues serves in the context of this discussion.

When discussing "hodgepodge" tossups, I was using that somewhat insulting term, not for every tossup that fails to adhere slavishly to some abstract notion of "coherence," but to describe an especially haphazard amalgamation of clues that have little relation to one another and are chosen seemingly at random (which, in my experience, was especially prevalent during the late 2000s/early 2010s, although it lives on today). I specifically didn't mean a good tossup; I meant one that falls short because the method of construction seems to be that the writer basically picked four rivers at random and then found a random clue about each. The point was not to denigrate a wide swath of good and interesting tossups that happen to be clued in a way that is more or less arbitrary but still hits the target--I write plenty of tossups like that myself--but to identify a particular kind of malformed tossup that is a receptive shell for the "funny or interesting but totally unbuzzable" lead-in of the kind that Bruce discussed in his post.

The rest of my post walks through the way I wrote a tossup whose lead-in I was fairly certain would be interesting and helpful to players for a very specific and repeatable reason. I shared this because I don't think I've shown my work for almost any of the questions I've written, and specifically, I think it could be helpful as a way of thinking about whether lead-ins are sufficiently grounded in what players know to be buzzable. One way to ensure you know that a lead-in is notable is to pick an appropriate answer for the tossup and build it coherently from the ground up so that you know pretty confidently where the lead-in sits relative to the other clues, both in the question and in the broader body of knowledge you're testing.

But that's just one way of doing it. You can, of course, rely on your intuition and understanding of what other players know and hit upon an appropriate difficulty structure based on feel and a little research. I write this way all the time, probably a majority of the time. I'm also old and have played and edited a ton of tournaments, and it wouldn't be helpful for me to say to someone who hasn't that they should just be able to figure out if a lead-in is helpful or not in the abstract. At the end of the day, I agree with John and Mike that there are many methods for writing a good question, and more specifically a good lead-in, and that each has its potential pitfalls as well, and I apologize if I implied otherwise above. I just wanted to talk about a way of thinking about lead-ins as grounded in the framework of a coherent tossup that may not have occurred to others and that may be helpful, especially to writers who are developing their feel and want to have a more objective way to think about the ordering of their clues.
Andrew Hart
Minnesota alum
AKKOLADE
Sin
Posts: 15773
Joined: Thu Apr 24, 2003 8:08 am

Re: What is the purpose of a first line?

Post by AKKOLADE »

Can we get this thread split into the original purpose and this new subject of how tightly to theme a question?
Fred Morlan
University of Kentucky CoP, 2017
International Quiz Bowl Tournaments, CEO, co-owner
former PACE member, president, etc.
former hsqbrank manager, former NAQT writer & subject editor, former hsqb Administrator/Chief Administrator
Auroni
Auron
Posts: 3145
Joined: Thu Nov 15, 2007 6:23 pm

Re: What is the purpose of a first line?

Post by Auroni »

To seriously address the original topic of the thread: Nobody is intentionally trying to create an unbuzzable leadin; they're making a best guess as to what might be knowable by an expert in the subject or by someone who has read widely, but that hasn't been touched upon by previous questions. If we eliminate that, and restrict quizbowl questions solely to the province of what's come up before, then we make this a worse game for all of the reasons outlined in this thread.
Auroni Gupta (she/her)
Smuttynose Island
Forums Staff: Moderator
Posts: 614
Joined: Wed Oct 21, 2009 9:07 pm

Re: What is the purpose of a first line?

Post by Smuttynose Island »

setht wrote: Wed Mar 06, 2019 6:20 pm It seems reasonable (or at least plausible) to me that the distribution of buzzes on early clues is pretty much "y = x^2, but with a ~9-10 word delay while players process hard clues." But I would interpret the middle and late parts of the distribution a bit differently—my impression has been that most buzzes on middle/late clues are not of the "I recognized something 9-10 words ago but it took me a while to pull it" variety; they're much closer to "when I hear clue X, I immediately mash my buzzer and say Y." So I would argue that this graphic indicates that the middle and late clues really were harder than a "y = x^2" distribution. (There's probably still some lag on average between hearing a crucial middle/late clue and buzzing in correctly, but I would imagine that lag is less than the lag for processing early clues.)
I think that this is an interesting and worthwhile problem to try to quantify, especially if we care about the pragmatic value of first line clues. It's worthwhile because if we take y = x^2 as the ideal distribution for the number of buzzes that a clue generates, we need to do a little bit of work to make our real world data useful. Of course, this is because our real world data only tells us where someone buzzed, not what clue they buzzed on. If we can approximate how often a clue causes a delayed buzz, then we can modify Will's distribution appropriately to come up with an ideal distribution for the actual distribution of buzzes.
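
To make the "y = x^2 with a delay" reading concrete, here is a minimal Python sketch. It assumes every buzz lags recognition by the same fixed number of words, which is a simplification of what Seth describes; the function names, the 120-word tossup length, and the 10-word lag are all hypothetical choices for illustration.

```python
def ideal_cumulative(x: float) -> float:
    """Ideal share of players who have recognized a clue by position x in [0, 1]."""
    x = min(max(x, 0.0), 1.0)  # clamp to the length of the tossup
    return x ** 2


def observed_cumulative(word: int, total_words: int, delay_words: int) -> float:
    """Share of buzzes recorded by `word` if each buzz trails recognition by `delay_words`."""
    recognized_at = word - delay_words
    if recognized_at <= 0:
        return 0.0  # nobody has had time to process a clue yet
    return ideal_cumulative(recognized_at / total_words)


if __name__ == "__main__":
    total_words, delay_words = 120, 10  # hypothetical tossup length and lag
    for word in (10, 20, 60, 120):
        print(word, round(observed_cumulative(word, total_words, delay_words), 3))
```

A uniform lag just shifts the whole curve to the right. Seth's point is that the lag probably shrinks for later clues, which is what a position-dependent correction would have to capture.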

Here's my attempt at solving this problem. Let's have P(x) denote the proportion of instantaneous buzzes generated by the x-th "word" of a tossup, with x measured as the fraction of the tossup read so far. Reality dictates that P(0) = 0 and P(1) = 1. Starting with Seth's assumption that early clues produce fewer instantaneous buzzes than middle clues, we want P(x) to be a strictly increasing continuous function that connects these two points. P(x) = x is a good candidate, but I don't think it really captures Seth's point that clues in the last two sentences almost always lead to instantaneous buzzes. Other monomials are going to have similar problems.

Instead what we probably want is some sort of manipulated version of the logistic function (or wacky polynomial). Here are two possible candidates: [Image: plot of the two candidate curves, one red and one green]

To compute the red line, I assumed that 50% of buzzes at the end of the first line (x = .15) were instantaneous and that 95% of buzzes in the middle of the second to last line (x = .8) were instantaneous. To compute the green line, I assumed that 50% of buzzes at the end of the first line (x = .15) were instantaneous and that 90% of buzzes in the middle of the third line (x = .4) were instantaneous.
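
For anyone who wants to play with curves like these, here is a minimal Python sketch that fits a plain two-parameter logistic through the pairs of points quoted above. This is an assumption about the functional form, not a reconstruction of how the plotted curves were actually computed; `fit_logistic`, `red`, and `green` are names made up for this example.

```python
import math


def logit(p: float) -> float:
    """Inverse of the logistic function, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def fit_logistic(x_mid: float, x_hi: float, p_hi: float):
    """Return P(x) = 1 / (1 + exp(-k (x - x_mid))) with P(x_mid) = 0.5 and P(x_hi) = p_hi."""
    k = logit(p_hi) / (x_hi - x_mid)  # steepness fixed by the second point

    def P(x: float) -> float:
        return 1.0 / (1.0 + math.exp(-k * (x - x_mid)))

    return P


# Assumed anchor points, taken from the percentages in the post.
red = fit_logistic(0.15, 0.8, 0.95)
green = fit_logistic(0.15, 0.4, 0.90)

if __name__ == "__main__":
    for x in (0.0, 0.15, 0.4, 0.8, 1.0):
        print(f"x={x:.2f}  red={red(x):.3f}  green={green(x):.3f}")
```

Note that an unmodified logistic never reaches exactly 0 or 1, so it only approximates the P(0) = 0 and P(1) = 1 conditions, and with the midpoint at x = .15 it sits well above 0 at the start of the tossup. Some further manipulation would be needed to pin down the endpoints.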
Daniel Hothem
TJHSST '11 | UVA '15 | Oregon '??
"You are the stuff of legends" - Chris Manners
https://sites.google.com/site/academicc ... ubuva/home
Stained Diviner
Auron
Posts: 5085
Joined: Sun Jun 13, 2004 6:08 am
Location: Chicagoland

Re: What is the purpose of a first line?

Post by Stained Diviner »

What is the purpose of a first line?

There are several purposes, many of which have been noted above.

That being said, the #1 purpose is to give the person in the match with the most knowledge on the topic a chance to buzz in. If there is data showing that a significant number of first sentences are not fulfilling that purpose in any matches, then that is important and useful data.
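
As a rough illustration of how one might check that from buzz point data, here is a small Python sketch. The data layout (a list of buzz word positions per tossup plus the word index where the first sentence ends) is a hypothetical format, not the actual format of any tournament's stats, and the `grace_words` parameter is an assumed way to account for the point raised earlier in the thread that players often buzz a few words after the clue they recognized.

```python
def first_sentence_buzz_rate(buzz_words, first_sentence_end, grace_words=0):
    """Fraction of rooms whose buzz landed within the first sentence (plus a grace window)."""
    if not buzz_words:
        return 0.0
    cutoff = first_sentence_end + grace_words
    early = sum(1 for word in buzz_words if word <= cutoff)
    return early / len(buzz_words)


def share_without_first_sentence_buzz(tossups, grace_words=0):
    """Fraction of tossups that drew no first-sentence buzz in any room."""
    if not tossups:
        return 0.0
    silent = sum(
        1
        for t in tossups
        if first_sentence_buzz_rate(t["buzz_words"], t["first_sentence_end"], grace_words) == 0.0
    )
    return silent / len(tossups)


if __name__ == "__main__":
    # Made-up example data: one entry per tossup, one buzz position per room.
    sample = [
        {"buzz_words": [30, 45, 12], "first_sentence_end": 20},
        {"buzz_words": [50, 60], "first_sentence_end": 20},
    ]
    print(share_without_first_sentence_buzz(sample))
```

Whether a result from something like this counts as "significant" is a judgment call that the code can't settle.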
David Reinstein
Head Writer and Editor for Scobol Solo, Masonics, and IESA; TD for Scobol Solo and Reinstein Varsity; IHSSBCA Board Member; IHSSBCA Chair (2004-2014); PACE President (2016-2018)