2017 EFT: Thanks and General Discussion

naan/steak-holding toll
Auron
Posts: 2517
Joined: Mon Feb 28, 2011 11:53 pm
Location: New York, NY

2017 EFT: Thanks and General Discussion

Post by naan/steak-holding toll »

Hey everyone,

Thanks for playing EFT - I hope you all enjoyed it! Thanks to Richard for stepping aboard to edit literature again and taking on biology as well, and to James for handling the rest of the science. Thanks to all of our writers as well - Alex Fregeau, Sameen Belal, Vasa Clarke, Eric Xu, Jack Mehr, and Lawrence Simon (the latter four all from the University of Virginia).

Please offer general comments in this thread.

I'll start with some personal thoughts - I think this year's set didn't initially come together as well for the first mirrors at Yale and Florida, but our end product ended up being of similar quality to last year's - with a few improvements to boot (easier philosophy, fewer early title-drops in literature, more diversity in history writing styles across categories). I could attribute this to many different reasons, not least of which were the substantial real-life commitments of the head editors, but the one I think can really be learned from is the following: Have a variety of skill levels at your playtest mirrors, and ask some teams to break up if you need to achieve this. This year we didn't have a good ability to test for things that were too hard because every team in the field was scoring above 20 PPB.
Will Alston
Dartmouth College '16
Columbia Business School '21
naan/steak-holding toll
Auron
Posts: 2517
Joined: Mon Feb 28, 2011 11:53 pm
Location: New York, NY

Re: Thanks and General Discussion

Post by naan/steak-holding toll »

Also - I'm going to go ahead and put out a call for editors for next year's set. I think this tournament plays a valuable role each fall and would hate to see it go away. That said, given my experience with time crunches this year, I don't think it makes sense for me to serve as head editor going forward - or at least, not until I plan to be back in school again. I would like to still contribute in the future, ideally as a subject editor and advisor.

If you'd be interested in head editing the tournament and you think you'll be able to put in the work needed to control difficulty, write fresh questions, and keep high standards - please contact me privately. Thanks!
Last edited by naan/steak-holding toll on Mon Oct 09, 2017 4:59 pm, edited 1 time in total.
Will Alston
Dartmouth College '16
Columbia Business School '21
What do you do with a dead chemist?
Lulu
Posts: 26
Joined: Sun Nov 22, 2015 3:27 pm
Location: UK

Re: Thanks and General Discussion

Post by What do you do with a dead chemist? »

Generally this tournament seemed to play well at the UK site, albeit skewing harder than I'd anticipated based on the announced difficulty, particularly in the sciences, which seemed to have very low power counts compared to other categories and appreciably worse bonus conversion (although that might just have been my room). I only have two dodgy questions which I'll bring up in the specific question discussion thread shortly in addition to a couple of other minor quibbles.

Overall, the set's focus on easy answerlines with deep cluing seemed to be well executed, although I recall a couple of incidents of people buzzing and saying that they thought it could be one of two things, but they suspected one was too hard to come up as an answerline in this tournament.

I was moderating rather than playing, so it was slightly annoying to find that a number of questions had grammatical mistakes, which led to me having to pause my reading to try and interpret what the question was trying to say (although this was only a small number of questions and I don't currently have access to the questions to get some examples).
Christopher Stern
Oxford 2014-18
Real life 2018-?
naan/steak-holding toll
Auron
Posts: 2517
Joined: Mon Feb 28, 2011 11:53 pm
Location: New York, NY

Re: Thanks and General Discussion

Post by naan/steak-holding toll »

I'm interested to hear folks' thoughts on the difficulty. Generally, it seems to me that bonus conversions and power rates are mostly what we'd anticipated - we made revisions to a number of middle and easy parts and softened some of the poetry and drama tossups after the Yale site. However, power rates do seem lower for some of the more experienced quizbowlers (though not all, as shown by Itamar's excellent performance). If anything, this strikes me as positive, because it offers a challenge on the early clues for new players - but some less experienced players were also cracking double digits in the power count.

I think science has lower power rates and slightly lower bonus conversions than other areas across all sites. This strikes me as fairly normal - the sciences are not as easy to get points from remembering previous packets without putting some effort into studying, and it's generally recognized that it's easier to improve at most of the humanities. Some powers could possibly be more generous, but from the numbers it doesn't look like much is too out of whack.
Will Alston
Dartmouth College '16
Columbia Business School '21
CPiGuy
Auron
Posts: 1072
Joined: Wed Nov 16, 2016 8:19 pm
Location: Ames, Iowa

Re: Thanks and General Discussion

Post by CPiGuy »

This tournament was a lot of fun to play. There was a lot of fun and interesting material presented without compromising accessibility, and there were almost no questions that I actively disliked. Thanks to all the writers and editors for putting this set together.
Periplus of the Erythraean Sea wrote:I'm interested to hear folks' thoughts on the difficulty. Generally, it seems to me that bonus conversions and power rates are mostly what we'd anticipated - we made revisions to a number of middle and easy parts and softened some of the poetry and drama tossups after the Yale site. However, power rates do seem lower for some of the more experienced quizbowlers (though not all, as shown by Itamar's excellent performance). If anything, this strikes me as positive, because it offers a challenge on the early clues for new players - but some less experienced players were also cracking double digits in the power count.
I think the bonuses were pretty consistent in difficulty - my team got 10 or 20 on most of them, and I don't really think there were very many where there didn't seem to be an easy part, or there seemed to be two hard parts. I do think that the power count might have been deflated somewhat -- I think Chicago A got five powers playing against us, and most of our rounds saw zero or one power per round. The tournament did seem to be pretty consistent with where power was handed out, though, so I don't think questions being somewhat more challenging to power is inherently bad.
Conor Thompson (he/it)
Bangor High School '16
University of Michigan '20
Iowa State University '25
Tournament Format Database
Ciorwrong
Tidus
Posts: 696
Joined: Fri Dec 20, 2013 8:24 pm

Re: Thanks and General Discussion

Post by Ciorwrong »

CPiGuy wrote:I think the bonuses were pretty consistent in difficulty - my team got 10 or 20 on most of them, and I don't really think there were very many where there didn't seem to be an easy part, or there seemed to be two hard parts. I do think that the power count might have been deflated somewhat -- I think Chicago A got five powers playing against us, and most of our rounds saw zero or one power per round. The tournament did seem to be pretty consistent with where power was handed out, though, so I don't think questions being somewhat more challenging to power is inherently bad.
Saying "seemed pretty consistent with where power was handed out" is a pretty empirically false claim as you can see by the stats. Some tossups were almost never powered and some were powered a ton. It's possible that power was in the same spot in the question (e.g. 4 lines or 60 words in), but this doesn't mean much when it's a lot easier to find good clues for some answerlines than others. I don't think power marking is ever really a big deal in a set and I think bitching about it is a waste of time, but when you have new or roundabout answerlines, typically you will get depressed power numbers. I think I also saw that the power numbers for the most important work in 18th-century philosophy, The Critique of Pure Reason, were low. That's fine and I thought the clues in that tossup were really good even though I chickened out and didn't buzz as early as I had wanted to.

Saying bonuses were "pretty consistent in difficulty" may feel correct, but again, we can only see how accurate this claim is by comparing variability in the stats to other tournaments. I thought this tournament was fine, but I just wanted to illustrate that the stats are super helpful with proving or disproving claims like these.
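One rough way to make that comparison concrete is to look at the spread of average points earned per bonus in each set. The sketch below is only an illustration: the numbers are made up rather than taken from any real stats report, and `bonus_spread` is a hypothetical helper name.

```python
from statistics import mean, pstdev


def bonus_spread(points_per_bonus):
    """Return (mean, population standard deviation) of points earned per bonus."""
    return mean(points_per_bonus), pstdev(points_per_bonus)


# Hypothetical per-bonus averages on the usual 0-30 scale; NOT real EFT data.
this_set = [14.0, 16.5, 15.0, 13.5, 17.0, 15.5]
other_set = [8.0, 22.0, 12.5, 19.0, 10.0, 21.5]

for name, data in (("this set", this_set), ("other set", other_set)):
    avg, sd = bonus_spread(data)
    # A smaller sd means bonuses were closer to each other in difficulty.
    print(f"{name}: mean {avg:.2f}, sd {sd:.2f}")
```

Two sets can have nearly the same mean conversion while one has a much larger standard deviation, and that second number is what "consistent in difficulty" is really about.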
Harris Bunker
Grosse Pointe North High School '15
Michigan State University '19
UC San Diego Economics 2019 -

at least semi-retired
naan/steak-holding toll
Auron
Posts: 2517
Joined: Mon Feb 28, 2011 11:53 pm
Location: New York, NY

Re: Thanks and General Discussion

Post by naan/steak-holding toll »

Progcon wrote:
Saying "seemed pretty consistent with where power was handed out" is a pretty empirically false claim as you can see by the stats. Some tossups were almost never powered and some were powered a ton. It's possible that power was in the same spot in the question (e.g. 4 lines or 60 words in), but this doesn't mean much when it's a lot easier to find good clues for some answerlines than others. I don't think power marking is ever really a big deal in a set and I think bitching about it is a waste of time, but when you have new or roundabout answerlines, typically you will get depressed power numbers. I think I also saw that the power numbers for the most important work in 18th-century philosophy, The Critique of Pure Reason, were low. That's fine and I thought the clues in that tossup were really good even though I chickened out and didn't buzz as early as I had wanted to.

Saying bonuses were "pretty consistent in difficulty" may feel correct, but again, we can only see how accurate this claim is by comparing variability in the stats to other tournaments. I thought this tournament was fine, but I just wanted to illustrate that the stats are super helpful with proving or disproving claims like these.
I think that, if powers per tossup follows an approximately normal distribution and this is consistent across category areas, one can generally conclude that the editors had a fairly consistent power-marking philosophy. Essentially, a power mark is a demarcation that combines two things - 1) where the editor thinks people will start to buzz a lot more frequently, and 2) where the editor thinks people "deserve" a power. If this guessing is being done with a consistent "philosophy" in mind, then the variance should be mainly attributable to other sources, such as:

1) Some inconsistency in cluing, in some cases attributable to some clues coming up more times than others
2) Sometimes, the field just knows more about one topic than the other. To take one example, the tossup on Crime and Punishment hasn't been altered much, but it had a lot more powers at the Yale site than at some other sites.

Assuming that editors are philosophically consistent with how they make their demarcations, and not making radical mis-assessments of the knowledge pool in any consistent manner, then I stipulate that variance in power rates is probably going to roughly follow a normal distribution, and this should be further true when you break down to the large category level (subcategories have a smaller sample size).

Across most of our sites, there doesn't seem to be much radical variance in power rates and bonus conversions across categories and writers - aside from James, who wrote mainly science. When there is, it usually seems that some relatively easy-to-guess qualitative reason (the Canada and Stanford EFT sites both had a lot of strong fine arts players in attendance, the UK teams aren't great at US history, Myers/Chris Ray/Jason Zhou were all beating up on history at Michigan) is behind things. This suggests to me that we produced a tournament that was relatively even over the distribution.
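For anyone who wants to poke at the "approximately normal within each category" idea, here is a minimal sketch of one crude check: the mean, the standard deviation, and the share of tossups within one standard deviation of the mean (roughly 68% for a normal distribution). The power counts and the `summarize` helper are hypothetical, not drawn from the actual EFT stats, and with samples this small the check is only suggestive.

```python
from statistics import mean, pstdev


def summarize(powers):
    """Return (mean, sd, share of tossups within one sd of the mean)."""
    mu, sd = mean(powers), pstdev(powers)
    within = sum(1 for p in powers if abs(p - mu) <= sd) / len(powers)
    return mu, sd, within


# Hypothetical number of powers recorded on each tossup, grouped by category.
powers_by_category = {
    "history": [3, 5, 4, 6, 4, 2, 5, 4],
    "science": [1, 2, 0, 3, 2, 1, 2, 1],
}

for category, powers in powers_by_category.items():
    mu, sd, within = summarize(powers)
    print(f"{category}: mean {mu:.2f}, sd {sd:.2f}, within 1 sd {within:.0%}")
```

If one category's mean sits far from the others while its spread looks similar, that points to a difference in the field or the material rather than to inconsistent power marking within the category.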
Last edited by naan/steak-holding toll on Mon Oct 23, 2017 6:00 pm, edited 2 times in total.
Will Alston
Dartmouth College '16
Columbia Business School '21
Ciorwrong
Tidus
Posts: 696
Joined: Fri Dec 20, 2013 8:24 pm

Re: Thanks and General Discussion

Post by Ciorwrong »

Periplus of the Erythraean Sea wrote:
I think that, if powers per tossup follows an approximately normal distribution and this is consistent across category areas, one can generally conclude that the editors had a fairly consistent power-marking philosophy. Essentially, a power mark is a demarcation that combines two things - 1) where the editor thinks people will start to buzz a lot more frequently, and 2) where the editor thinks people "deserve" a power. If this guessing is being done with a consistent "philosophy" in mind, then the variance should be mainly attributable to other sources, such as:

1) Some inconsistency in cluing, in some cases attributable to some clues coming up more times than others
2) Sometimes, the field just knows more about one topic than the other. To take one example, the tossup on Crime and Punishment hasn't been altered much, but it had a lot more powers at the Yale site than at some other sites.

Assuming that editors are philosophically consistent with how they make their demarcations, and not making radical mis-assessments of the knowledge pool in any consistent manner, then I stipulate that variance in power rates is probably going to roughly follow a normal distribution, and this should be further true when you break down to the large category level (subcategories have a smaller sample size).

Across most of our sites, there doesn't seem to be much radical variance in power rates and bonus conversions across categories and writers - aside from James, who wrote mainly science. When there is, it usually seems there is some explicable reason (the Canada and Stanford EFT sites both had a lot of strong fine arts players in attendance, the UK teams aren't great at US history). This suggests to me that we produced a tournament that was relatively even over the distribution.
I don't disagree with anything you said, but I just thought the response I was replying to was not very rigorous. In the era of stats and power numbers, I think we can stop saying a set "feels" a certain way when we can just go and see from the stats if that's the case. You are probably right on the normality assumption, but the variance grows for the smaller categories as you mention.
Harris Bunker
Grosse Pointe North High School '15
Michigan State University '19
UC San Diego Economics 2019 -

at least semi-retired
naan/steak-holding toll
Auron
Posts: 2517
Joined: Mon Feb 28, 2011 11:53 pm
Location: New York, NY

Re: Thanks and General Discussion

Post by naan/steak-holding toll »

I generally agree that motivated reasoning, evidence-less complaints, etc. are usually defeated by hard numbers. However, even looking at the hard numbers, you can only go so far because - again - some fields just know more about some topics / areas than others.

An example of this: We had an in-team debate about changing an easy part on Maxwell's equations. One person brought up the fact that one site had around a 50% conversion for that bonus part, which goes a bit beyond variance, as clearly more than 10% of teams did not know that, etc. In the end, we decided that we couldn't go on that alone - Maxwell's equations there serves mainly as a test of "did you ever cover basic E&M in physics class." We concluded that, despite the numbers, the bonus part should stay because

1) Maxwell's equations, in the estimation of every team member who had taken physics in high school, was a fine easy part
2) If we were to logically apply this numbers-based reasoning to everything, we'd more likely end up with "find your ass" easy parts every time there was an anomaly than actually fixing systemic problems with how the bonuses evaluate knowledge; some subjective standards ultimately have to be applied. Indeed, this is part of what makes writing good quizbowl tournaments so difficult, and why the fact that most tournaments these days are quite good is a real testament to the accumulation of human capital within the active writing population.

As with many things, a careful balance must be struck between quantitative and qualitative analysis.
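To put a rough number on the "beyond variance" point in the Maxwell's equations example, here is a small sketch using a binomial tail probability. The site size (10 teams hearing the bonus) and the 90% target conversion rate for an easy part are assumptions chosen for illustration, not figures from the stats, and `prob_at_most` is a hypothetical helper.

```python
from math import comb


def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): at most k of n teams convert the part."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Assumed for illustration: 10 teams heard the bonus, 5 converted the easy part,
# and an easy part is "supposed" to be converted about 90% of the time.
teams_hearing = 10
teams_converting = 5
target_rate = 0.9

chance = prob_at_most(teams_converting, teams_hearing, target_rate)
print(f"Chance of {teams_converting} or fewer conversions by luck alone: {chance:.4f}")
```

A small probability here says the low conversion is unlikely to be pure chance, but it can't say whether the part is too hard or that site's field just skipped physics, which is where the qualitative judgment comes in.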
Will Alston
Dartmouth College '16
Columbia Business School '21