Concern for HS Regs+ Sets this Season

Krik? Krik?! KRIIIIK!!! · Tue Feb 28, 2023 10:31 pm

EmilianoZapata wrote: ↑Tue Feb 28, 2023 3:52 pm Hello everyone!
I am excited to announce the release of SCURVY (Scholastic Competition of Utterly Ridiculous Vapid Yahoos), a Regs+ set ready for mirroring in April 2023.

Hi Aiden - congrats to you and the Hoover team for putting together this set! I love to see new writers come into their own and produce something new for the community to enjoy. I wanted to make this thread to address some specific concerns regarding your set and the other Regs+ sets planned for this spring.

As of today, I see five tournaments including yours at Regs+ difficulty or otherwise marketed as "Nationals preparation."

- SCURVY
- Maryland Spring Classic
- DMA
- Harvard "Fall" Tournament
- Prison Bowl

Based on the announcement threads, I see one tournament using any of these sets thus far - DMA at Charter Cup at Yale (2/25). I imagine these sets will have housesites and at least a few other sites pop up, though. I am concerned that four sets of similar difficulty are all competing with each other over a span of about two and a half months that also includes Spring Break, AP testing, finals, and other end-of-the-year chaos, let alone Nationals preparation. That means these sets - which have taken a lot of hard work to put together - will not be mirrored much, and where they are, only a small number of dedicated teams will play them. That equals a pretty poor return on investment in my opinion.

I think it's also problematic that, disregarding NAQT, there will be seven Regs+ housewrites (Scottie, DART, SCURVY, DMA, Maryland Spring Classic, Prison Bowl, and Harvard "Fall") compared to four advertised Regs (IQBT #1 and #2, KICKOFF, and SHOW-ME) and four novice sets (SCOP Novice, KICKOFF Novice, SANDS, and MAKEMAKE). The vast majority of high school quizbowlers fall in the range of teams that can do OK at Regs difficulty, but then struggle to get past last-lining TUs and 10'ing bonuses at Regs+. I get that writers don't want to write another Mark Twain TU for Regs and want some more freedom to explore more difficult topics...but I think part of the fun of Regs sets is exploring those things with early clues or bonus ideas - that was my philosophy with SHOW-ME anyways. Frankly, I don't think quizbowl at large needs 6 Regs+ sets in a year.

And again, if clubs want to housewrite Regs+, more power to you. But the issues end up being:

- Lack of good, Regs tournaments for the majority of teams across the country
- Smaller fields and lesser sites which in turn leads to very poor pay
- A bigger disparity between the "good" teams that can take on Regs+ and then everyone else.

Coming back to SCURVY itself, what other projects have members of your team worked on? Regs+ is a difficult place to start with in terms of writing. How much is Eric Mukherjee involved? Have you or Midhun head-edited sets before? I ask because a Regs+ set announced only a month before its start date with an almost completely new team of writers would make me nervous about the quality as a tournament director.

A few years ago, I had to take the leap of faith when it came to head-editing, and I learned a lot. But I really recommend starting at Regs difficulty. Its difficulty, topics, and clues are very standard, which makes it less vulnerable to overshooting difficulty.

My advice would be to delay this set so it gets played through next year and to turn down its difficulty to Regs. That shouldn't be that hard, though: it mostly means tweaking answerlines or bringing down the difficulty of middle and hard parts in bonuses. I would be happy to help answer any questions you had about set logistics, and encourage others to share what they think in this thread.

Note: Edited to Add Prison Bowl, SANDS, and MAKEMAKE as well as housesites.

Subotai the Valiant, Final Dog of War · Tue Feb 28, 2023 11:43 pm

On my end, I plan on mirroring DMA for the upcoming fall in regions that have yet to play it, as I would of course like as many teams as possible to actually play the set considering the amount of effort I put into it. I'm not sure if there are any further tournaments of this difficulty being planned (I heard there was a potential DMA mirror in Illinois, but that may not occur).

My decision to write Regs+ was one partially forced by the existing questions I had (questions I had written for a potential BHSAT, which is always Regs+) and by the fact that ultimately, if I was going to do one final, major high school quiz bowl writing project as an individual in my last year before graduating, I wanted to fully explore the breadth and depth of topics that I wanted to ask.

Semi-related note: If anyone reading this is concerned about the quality or completeness of DMA, I'm around 90% complete with the packets for the Yale mirror on 3/11 and have received high praise about the creativity and thoughtfulness of the questions I've playtested.

(Also, Hunter is writing and running Prison Bowl in addition to the sets mentioned above)

EmilianoZapata · Wed Mar 01, 2023 12:44 am

Thank you so much for voicing your concerns surrounding our set and community at large! We genuinely appreciate the thought and rationale behind your comments, so we’ll try to respond to them as best as possible below:

Coming back to SCURVY itself, what other projects have members of your team worked on?

While we did have limited previous experience writing tossups such as writing for QWIZ, this is definitely a new experience for us. So, we’ve sought to approach it with an open mind and have learned (and will continue to learn) along the way.

Have you or Midhun head-edited sets before?

No.

How much is Eric Mukherjee involved?

Eric has agreed to look over the questions, but does not have an editing role

Frankly, I don't think quizbowl at large needs 6 Regs+ sets in a year.

And again, if clubs want to housewrite Regs+, more power to you. But the issues end up being:

- Lack of good, Regs tournaments for the majority of teams across the country
- Smaller fields and lesser sites which in turn leads to very poor pay
- A bigger disparity between the "good" teams that can take on Regs+ and then everyone else.

Your comments surrounding the saturation of Regs+ sets in our community are valid, but as we consider the unique value of regular content, it feels misguided to “disregard NAQT” in our consideration of the prevalence of various set difficulties at tournaments. Given how ubiquitous NAQT’s questions are to the tournaments attended by the vast majority of teams, we feel that providing unique, more expansive content serves to add variety and balance to the existing question base.

Our original decision to write Regs+ was rooted in a concern that teams which are only ever exposed to regular set content are restricted in their scope, awareness, and/or appreciation of quizbowl. Thus, we do plan to stick to the current setup of our set, though we will certainly consider the unique value of regular sets as we look to the future.

I ask because a Regs+ set announced only a month before its start date with an almost completely new team of writers would make me nervous about the quality as a tournament director.

For the sake of clarity, our team has been steadily working on the set since early November. The set is almost complete, with much of the remaining work involving editing and content selection. In addition, we're hoping to playtest SCURVY before release to determine relative difficulty and performance.

--

Ultimately, we do not plan on producing a sub-par product, and we’re working hard and with intention to ensure that we avoid many of the common pitfalls you’ve described. With that being said, we understand that overcoming such challenges involves being open to feedback and criticism, especially from experienced writers such as yourself, so we’d like to thank you again for your post and hopefully continue to communicate as to how we can produce the most valuable and accessible content for our community at large!

Krik? Krik?! KRIIIIK!!! · Wed Mar 01, 2023 11:23 am

That's great to hear how far along the set is and that you all have a plan for playtesting and editing. I wanted to bring up the experience factor because of how much I struggled my first time head-editing and how really important it is to get a lot of experienced eyes on the set. I would definitely push that public playtesting portion as much as possible.

EmilianoZapata wrote: ↑Wed Mar 01, 2023 12:44 am Your comments surrounding the saturation of Regs+ sets in our community are valid, but as we consider the unique value of regular content, it feels misguided to “disregard NAQT” in our consideration of the prevalence of various set difficulties at tournaments.

I think it's important to recognize that there are fundamental differences between NAQT sets and traditional housewrites, including distribution and length of questions - not to mention content: the writers at NAQT write differently than your specific team or any other team would. NAQT really is the backbone for most circuits around the country because of how many accessible novice and Regs sets they provide.

My point is more so that I don't think that NAQT should be the only producer of HS Regs questions whereas the majority of HS housewrites (now that I added Prison Bowl) are Regs+. I discuss this more in the last paragraph of this post.

EmilianoZapata wrote: ↑Wed Mar 01, 2023 12:44 am Given how ubiquitous NAQT’s questions are to the tournaments attended by the vast majority of teams, we feel that providing unique, more expansive content serves to add variety and balance to the existing question base.

As I said in my post, I think you can very easily explore new content or introduce new things at a Regs difficulty. Ways to do this include:
- Theming tossups
- Early clues in tossups
- Bonus lead-ins
- Bonus hard parts
- Bonus themes
You can still be creative and find a way to make fun stuff work without going to a higher difficulty.

EmilianoZapata wrote: ↑Wed Mar 01, 2023 12:44 am Our original decision to write Regs+ was rooted in a concern that teams which are only ever exposed to regular set content are restricted in their scope, awareness, and/or appreciation of quizbowl. Thus, we do plan to stick to the current setup of our set, though we will certainly consider the unique value of regular sets as we look to the future.

I agree that perhaps the best aspect of housewrites is that they have more freedom to go into new or unexplored topics or insert their own flair on the subject. You do not need to be Regs+ to do this.

Let's say you have two players, Mike and Molly, at a Regs tournament. Mike is a senior and has been playing since freshman year. It's Molly's first tournament, on the other hand. There's a TU on Leonardo da Vinci. Mike firstlines it because he remembers the clue from a bonus last year. Molly buzzes at the giveaway in her room because she knows about the Mona Lisa. Clearly two different skill levels. This should not be interpreted as a need to write more Regs+ sets so that Mike has more of a challenge on Leonardo da Vinci questions. Sure, it may reward players who scale higher, but it punishes those who haven't heard of current early and middle clues, let alone bumping the difficulty up. The tenth time someone has heard a tossup on Leonardo da Vinci is the first time that someone else has ever heard a question on that topic. Good players deserve to buzz early on content, and I don't think the difficulty needs to be made harder to try and get out of Regs.

I'm in Missouri, whose hundreds of active teams vary from bottom bracket teams to those that will clobber some people at Nationals. We have really struggled with sets this year because the vast majority of teams will struggle with Regs+. If you had 48 teams at a tournament with #1 being the clear best and #48 being the clear worst, the top 8 or 12 teams would scale up to varying degrees on Regs+...but the rest of the teams are going to struggle.

We can have good competition on Regs sets. We can have new and interesting content in Regs sets. I think more teams are going to want to come back and will get an appreciation for quizbowl from a Regs tournament where they can score about 250 points per game and end 3-7 rather than a Regs+ tournament where they don't break 100 points and don't win a game all day.

Krik? Krik?! KRIIIIK!!! · Sun Apr 30, 2023 12:00 am

I'd like to applaud the Maryland Spring team for postponing their set to avoid Regs+ overlap this season. I wish the best of luck to the SITH team as well.

I'm again frustrated that for every Regs set that gets announced, there is another Regs+ set announced. With no exaggeration, this is killing the high school quizbowl community. The difficulty creep is crushing middle-of-the-ground teams and widening the gap between the average and the "good." The high number of Regs+ sets are all targeting a market ten times smaller than that of a Regs set. The amount of time, work, and energy that goes into these sets is going to be consistently undercut by each other.

As a state organizer of tournaments, I am having trouble finding sets that are appropriate for our events and won't exceed the difficulty that regional teams can handle, would enjoy, and would thrive on.

For that reason, I'm putting together an HS Regs set for next year. No upconversions, no frills, none of it. If you're in a developing region for quizbowl or are tight on cash, I'll give it to you for free or at a discount to play. If you're interested in writing or editing, reach out to me. Expect a post this week.

Cheynem · Sun Apr 30, 2023 12:32 am

I would like to applaud Ganon's post here. I think the creation of HS Regs+ sets is based entirely out of good intentions, but I suspect a lot of their creators really do not understand just how difficult and inaccessible such sets can be for a ton of regions, circuits, and teams.

quizbowllee · Thu May 04, 2023 10:30 am

I'm happy to see this conversation taking place. Also, thank you, Ganon, for producing a Reg-level set for next year. I'm hoping that there will be several more available. I know that the head of our local league has struggled over the past several years to find accessible questions for our league matches. West Point is the only team in the county that plays regularly outside of local events. Reg-level sets already give a lot of the teams a hard time. Regs+ is out of the question. I - like many coaches around the country - am desperately trying to get more teams to increase their involvement. But, when local matches end in scores like 80-50, the kids aren't having fun and are turned off to competition. The circuit needs a consistent flow of Regs and Novice level sets in order to survive.

Subotai the Valiant, Final Dog of War · Thu May 04, 2023 11:45 pm

As someone who's head edited both a highly successful regs set and multiple regs+ sets, I'll step in and say two things, one of which is considerably more controversial than the other:

1. There definitely should be more regs sets than regs+ sets, and there are probably too many regs+ sets for the upcoming season. But
2. As of 2023, for whatever reason, housewrite difficulty has ON THE WHOLE gone lower than IS set difficulty. In fact, the standard regs+ difficulty is not appreciably harder than IS sets anymore, and the IS to housewrite regs+ difficulty jump is the lowest it has ever been. This is supported by the excellent adjustments made by Groger Ranks.

I'm not sure of the reason for point 2 having occurred; it may be that IS sets have had a slow difficulty creep, or that NAQT being less "canonical" affects how consistently middle to top teams are able to maximize their points. But the fact remains. IS sets were about 1 PPB or more harder than regs housewrites the past year. DART, the primary regs+ set of the year (and indeed, the only one for much of the year), was only 0.23 PPB harder than the hardest IS set. That means a team on average is getting only one fewer bonus part every 14 full bonuses heard on DART than IS-217. From a quick glance, this seems supported in bonus stats from lower brackets as well, and the tossup answerlines in DART don't strike me as substantially harder than what would be found in an IS set; there may be a higher concentration of harder answerlines, but very little would be completely out of place.

In fact, I'll make the controversial point that while there are too many regs+ sets for the upcoming season, the ones that are being written are probably doing the worst of both worlds: writing at a stated difficulty that will turn off most teams, while not actually being the best difficulty for the top teams they're trying to reach and challenge. DMA was the first regs+ set to be unambiguously much harder than IS in the past two seasons. In my opinion, ideally, there should be one or two sets of "hard regs+" difficulty each year (instead of zero), one or two regs+ sets as currently defined (up to four sets total above regs), and everything else regs.

To me, there are two key questions to consider, which maybe other people can answer:

Empirically, and especially from people are IS sets getting harder than ideal for their intended purpose? I'm surprised that the gap between IS and regs+ has shrunk so much over the years since I played in high school, and I doubt it's entirely from regs+ writers decreasing their difficulty.
Will housewrites continue to keep their current vision of what "regs" and "regs+" difficulty is, and should we believe that they shall continue to be so well difficulty-controlled? Do teams want to keep playing this difficulty of regs+, or would teams prefer one or two high school sets to exist that are closer simulations of nationals?

[Personal addendum: I'd be happy to help out and write/edit for one of the regs sets next season once I see that I have the time to do so. Feel free to reach out to me about that in the fall.]

Stained Diviner · Fri May 05, 2023 7:34 am

Daniel makes some good points. One thing that might help is if Groger posted a link to their adjustments whenever they posted a link to their rankings so that those adjustments get seen by more TDs. It probably would also help if they had an explanation of their adjustments on the same page as the adjustments. Groger is doing great work, and it would be good if more TDs were looking at the information Groger generates when TDs are deciding whether to host a tournament and which set to use.

dpeelen · Mon May 08, 2023 5:10 pm

Krik? Krik?! KRIIIIK!!! wrote: ↑Sun Apr 30, 2023 12:00 am I wish the best of luck to the SITH team as well.

Hello all, I figured with the mention of SITH I could mention a few things. First off, thank you Ganon for bringing up this point. I don't check or post on the forums much and all of what you have been saying has been a concern of mine for a while now.

A large part of the reason I want to make this post is because I have agreed to edit literature and fine arts for SITH. With SITH being one of two regular-difficulty housewrites announced so far (I believe, not counting KICKOFF), there are a few things I want to be transparent about while also collecting feedback from more experienced members of the community.

Again I do not post often on here so for people who do not know, I wrote for NEWT two years ago, as well as SHOW-ME and DART III this past year primarily for literature and fine arts. This is, however, my first formal editing experience and I want to make sure things are done right. I know there were concerns with SCURVY's editors which hurt its credibility (for the record, I thought SCURVY was well-written beyond a few simple, fixable mistakes), so I don't want my position as an editor to have the same impact for SITH especially if it is one of the few regs sets this year.

With that in mind, I want to make sure that quality of the set, at least in my categories, hits a proper difficulty. Right now I am basing my idea of a regs difficulty off of LONE STAR or CALISTO 2 from years prior. Either way, I figure since regular-difficulty sets may be scarce for the 2023-2024 season I should ensure people know what is happening with SITH and are ok with it. Additionally, if people want to be a second reference on questions for SITH, feel free to reach out to me. I am sure Ganon's concerns are shared by several members of the community and as a result I do not want SITH to fail to fill the hole of regs sets for next year.

You can contact me at dapeelen [at] gmail [dot] com, or on Discord at fugard#4534 with any questions or concerns, thank you!

Krik? Krik?! KRIIIIK!!! · Mon May 08, 2023 6:11 pm

dpeelen wrote: ↑Mon May 08, 2023 5:10 pm Again I do not post often on here so for people who do not know, I wrote for NEWT two years ago, as well as SHOW-ME and DART III this past year primarily for literature and fine arts. This is, however, my first formal editing experience and I want to make sure things are done right. I know there were concerns with SCURVY's editors which hurt its credibility (for the record, I thought SCURVY was well-written beyond a few simple, fixable mistakes), so I don't want my position as an editor to have the same impact for SITH especially if it is one of the few regs sets this year.

A thought:

A Very Good Moivie wrote: In many ways, the work of a critic is easy. We risk very little, yet enjoy a position over those who offer up their work and their selves to our judgment. We thrive on negative criticism, which is fun to write and to read. But the bitter truth we critics must face, is that in the grand scheme of things, the average piece of junk is probably more meaningful than our criticism designating it so.

I think in general, quizbowl could afford to be a lot more grateful to the people who make it happen. In this case, writing is a very thankless task. I write an HS Regs question in about 20-30 minutes on average, and then it has to be edited, playtested, edited again, proofread, and then edited again if there's any overlap elsewhere in the set! Throughout this process, writers are expected to choose good clues, make things clear to players, balance difficulty, and then proof for grammar on top of all things. It's a ton of work!

I applaud you, Danny, for stepping up and editing for the first time. If I wasn't clear enough in my first post, I'm very impressed with the SCURVY team for putting a set together from their club - I know I definitely couldn't do that when I was in HS. My only concern there was making sure their efforts would be rewarded by having their set widely played. I also think that for new writers, HS Regs is a good starting point to learn the basic chops of writing.

In general, for as much criticism and praise as questions will receive, at the end of the day, most quizbowl sets are lightyears ahead of what they were 20 - even 10 - years ago. I do what I do for quizbowl now because I wouldn't have had the fun I did in high school without people volunteering to write, staff, and direct tournaments.

Everyone has to edit for the first time. Mine was ACF Fall 2019. I remember distinctly someone buzzing in on the first clue and negging because I structured the sentence poorly, and since then, I've always taken a lot more care with clue order to avoid situations like that. My first time head-editing was MCMT, and I really struggled with managing the set. Nobody starts writing or editing and is perfect: like anything, it takes a lot of time and dedication and experience to get confident and grow as an author.

My advice to you and anyone else interested in writing is this: be open to feedback and find a team around you who can offer that feedback in a positive way. Not every question you write is going to be or start perfect, but that's what editing is for. Sometimes, you'll get loud voices criticizing your questions...but just remember that they aren't putting in a fraction of the work you are to make this activity playable for thousands of students across the country. Good luck!

Santa Claus · Mon May 08, 2023 6:48 pm

Subotai the Valiant, Final Dog of War wrote: ↑Thu May 04, 2023 11:45 pm 2. As of 2023, for whatever reason, housewrite difficulty has ON THE WHOLE gone lower than IS set difficulty. In fact, the standard regs+ difficulty is not appreciably harder than IS sets anymore, and the IS to housewrite regs+ difficulty jump is the lowest it has ever been. This is supported by the excellent adjustments made by Groger Ranks.

I'm not sure of the reason for point 2 having occurred; it may be that IS sets have had a slow difficulty creep, or that NAQT being less "canonical" affects how consistently middle to top teams are able to maximize their points. But the fact remains. IS sets were about 1 PPB or more harder than regs housewrites the past year. DART, the primary regs+ set of the year (and indeed, the only one for much of the year), was only 0.23 PPB harder than the hardest IS set. That means a team on average is getting only one fewer bonus part every 14 full bonuses heard on DART than IS-217. From a quick glance, this seems supported in bonus stats from lower brackets as well, and the tossup answerlines in DART don't strike me as substantially harder than what would be found in an IS set; there may be a higher concentration of harder answerlines, but very little would be completely out of place.

[...]

Empirically, and especially from people are IS sets getting harder than ideal for their intended purpose? I'm surprised that the gap between IS and regs+ has shrunk so much over the years since I played in high school, and I doubt it's entirely from regs+ writers decreasing their difficulty.

I found half of Daniel's post very reasonable - in particular, the idea that harder sets often are too difficult for most of the field and yet still too easy for the top teams. I think that the audience of such sets is perhaps not being adequately catered to.

I found that the other half (regarding the relative difficulty of IS sets and regs/regs+ sets) disagreed with my intuitions - I've excerpted the sections on this topic. Granted, I haven't read an IS set in some years, but even if I had I don't think I have an eye for high school difficulty anymore so I opted to put together some numbers instead.

I aggregated the PPB numbers from IS sets played this season (IS-213, IS-215, IS-217, and IS-219) using the NAQT statistics page. It felt like there were separate points being made about regs+ and regs difficulty housewrites, so I chose DART III and 2022 KICKOFF to represent each then manually scraped all the stats for every mirror that I could find - it ended up being roughly seven and thirteen sites respectively.

Data
Let's start with some graphs - forgive their quality, I made them in Google Sheets:
Here are the PPBs for teams that played the housewrites (DART and KICKOFF). I also included the power percentages because I already compiled the data, but I didn't subject them to any scrutiny beyond the surface level.

Here are the PPBs (and power percentages) for teams that played IS sets this season. Some of these may technically have also been played last season and some still might be used next season - I don't think it's particularly relevant to the discussion.

Here are the specific numbers, as well as (crude) box plots:

          min   1st q    median  3rd q    max
DART III  5.33  9.27     13.08   18       26.15
KICKOFF   5     12.64    16.82   21.6025  26.84
IS-213    3.08  10.43    13.73   16.93    24.76
IS-215    0     9.09     12.4    16.025   25.41
IS-217    0     9.115    12.31   15.99    24.77
IS-219    0     9.5425   12.535  15.385   24.81
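
For anyone who wants to reproduce a table like this outside of Google Sheets, here is a minimal Python sketch. It is not how the numbers above were produced; it assumes the "inclusive" quartile method (which, as far as I know, is what a spreadsheet QUARTILE function uses), and the function name and sample values are hypothetical.

```python
from statistics import quantiles

def five_number_summary(ppbs):
    """Return (min, Q1, median, Q3, max) for a list of team PPB values."""
    if len(ppbs) < 2:
        raise ValueError("need at least two teams")
    # method="inclusive" treats the data as the whole population, so the
    # quartiles are interpolated between the actual min and max.
    q1, med, q3 = quantiles(ppbs, n=4, method="inclusive")
    return (min(ppbs), q1, med, q3, max(ppbs))

# Hypothetical PPB values, not taken from the compiled stats.
sample_ppbs = [5.0, 9.0, 13.0, 18.0, 26.0]
print(five_number_summary(sample_ppbs))
```

A different quartile convention would shift the 1st q and 3rd q columns slightly, which matters little here but is worth knowing when comparing against someone else's numbers.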

Some observations:

1. The drastically larger audiences of the IS sets have produced a much more bell-like curve than the housewrites, which appear bimodal.
2. The raw PPB data of DART III is on par with the IS sets, with the 3rd quartile, max, and min all being about a point higher.
3. The raw median PPB of KICKOFF is much higher than the IS sets, suggesting it is much easier.
4. There is about 1 PPB of variation between different IS sets that were played this season, with IS-213 clearly playing easier than the others based on the median, quartiles, and even minimum.

Point 1 agrees with my intuition on both the size and makeup of the audiences of IS sets vs. housewrites. It, along with points 2 and 3, makes me think that there is a strong selection bias in who chose to play the housewrites. Indeed, this appears to be the case:

Here I have used two criteria to divide the teams that played each set: 1) whether the team is in the top 100 in the most recent Groger ranking (as a proxy for both involvement in the community and overall skill level) and 2) whether the team was the "A" team (to distinguish Wayzata A from Wayzata C). Teams were then divided using these criteria into three categories:

- A teams from schools on the top 100 list
- B+ teams from schools on the top 100 list
- teams from schools not on the top 100 list (everyone else)

I checked both of these using some rudimentary regular expressions - my check for the first criterion was very restrictive (meaning that there are certainly teams in the top 100 which I didn't catch in my analysis), but the second was simple enough. For instance, I definitely didn't catch any B teams in the top 100, but I'd say the overall error rate was <5% so it really doesn't make a huge difference in the analysis.
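
To give a sense of what a check like this could look like, here is a rough Python sketch. It is an illustration only, not the expressions actually used: the school set, the pattern, the category labels, and the rule that a team with no letter counts as the "A" team are all assumptions.

```python
import re

# Hypothetical stand-in for the schools in a top-100 ranking.
TOP_100_SCHOOLS = {"Wayzata", "Hoover"}

# A single trailing capital letter is treated as the team designation,
# e.g. "Wayzata C" -> school "Wayzata", letter "C".
TEAM_LETTER = re.compile(r"^(?P<school>.+?)\s+(?P<letter>[A-Z])$")

def classify(team_name):
    """Sort a team into one of the three categories described above."""
    name = team_name.strip()
    match = TEAM_LETTER.match(name)
    school = match.group("school") if match else name
    # Assumption: a school that entered one team without a letter is its A team.
    letter = match.group("letter") if match else "A"
    if school not in TOP_100_SCHOOLS:
        return "not top 100"
    return "top 100 A" if letter == "A" else "top 100 B+"

for team in ["Wayzata A", "Wayzata C", "West Point"]:
    print(team, "->", classify(team))
```

An exact-match lookup like this is restrictive in the same way described above: any school whose name is spelled differently in the stats than in the ranking falls into "everyone else".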

The major takeaway from this graph is two-fold:

- There are many, many fewer teams playing housewrites than IS sets (about a quarter to a third on average)
- At least half of all "top 100 teams" play each set (not necessarily the same ones, but that's probably irrelevant), meaning they comprise the majority (just under 70%) of the playerbase of housewrites

Experiment
At this stage I would like to propose the following hypothesis: The small gap in PPB between regs+ sets and IS sets (or housewrite sets and IS sets) can be largely explained by the significantly higher strength of the average regs+ (or housewrite) field. In other words, teams playing housewrites experience self-selection bias.

Unfortunately, I am not a statistimagician, nor do I have any knowledge of statistizardry. Instead, I have decided to participate in the time-honored tradition of "doing stuff and seeing what happens". My procedure:

1. I calculated the number of "top 100" teams that would have to not play to have the same fraction of "top 100" teams as a given IS set - call that n.
2. I randomly sampled 1-n teams from the set of "top 100" teams that did play to simulate this weaker field.
3. I then recalculated the median PPB.
4. I repeated this six times in total and looked at the median-of-medians to see what the effect was.

My idea is that this would simulate a scenario where teams perform equivalently well but the national field has different characteristics, producing different aggregate statistics. I have literally no idea if there is any theoretical backing here to this method.
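
The procedure above can be sketched in Python roughly as follows. This is one reading of the steps rather than the actual spreadsheet work: it assumes that n "top 100" teams are dropped at random in each trial, and the function names, the seed, and the way n is solved for are all hypothetical.

```python
import random
from statistics import median

def teams_to_remove(n_top, n_other, target_fraction):
    """How many top-100 teams to drop so they make up target_fraction of the field.

    Solves (n_top - n) / (n_top - n + n_other) = target_fraction for n.
    Assumes 0 <= target_fraction < 1.
    """
    keep = target_fraction * n_other / (1 - target_fraction)
    return max(0, round(n_top - keep))

def median_of_medians(top_ppbs, other_ppbs, n_remove, trials=6, seed=0):
    """Repeatedly drop n_remove random top-100 teams and recompute the median PPB."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    medians = []
    for _ in range(trials):
        # Keep a random subset of the top-100 teams; everyone else stays.
        kept = rng.sample(top_ppbs, len(top_ppbs) - n_remove)
        medians.append(median(kept + other_ppbs))
    return median(medians), medians

# Hypothetical field: 70 top-100 teams and 30 others, thinned until
# top-100 teams are a quarter of the field.
print(teams_to_remove(70, 30, 0.25))
```

Six trials is a small number, so the median-of-medians will move around from run to run; raising `trials` is a cheap way to see how stable the result is.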

Anyways, here are the results - there were 88 teams removed from DART III and 100 removed from KICKOFF:

I also performed the converse experiment (removing 208 non-"top 100" teams from the stats for an IS set to replicate a housewrite):

          base   1      2      3      4      5      6      m-o-m  
DART III  13.08  12.00  12.23  12.23  11.91  12.00  12.23  12.11
KICKOFF   16.82  15.61  15.34  15.61  15.61  16.02  15.61  15.61
IS-213    13.73  15.42  15.29  15.81  14.95  15.29  15.97  15.35
IS-215    12.4

Some observations:

Removing "top 100" teams did not have any impact on the bottom quartile of housewrite stats - likewise, removing non-"top 100" teams did not impact the top quartile of IS-213. This makes sense, since the designation of "top 100" is directly determined from performance, but it is also a reminder that there is also self-selection in the non-“top 100” teams.
Removing "top 100" teams lowered the PPB of DART III from being "solidly in the middle of the IS sets" to "a full PPB harder for the median team that played". I don't think that this is sufficient to "prove" my hypothesis, considering that this is an incomplete correction for the selection bias in playing housewrites, but I do think it's a useful reminder of the magnitude that it can have.
It appears that my hypothesis was not sufficient to explain the higher performance of teams on KICKOFF - it seems reasonable to say that it was substantially easier than many of the IS sets. There are, of course, many other factors.
Removing non-"top 100" teams from the field of an IS set increased the median PPB by roughly 1.6 points (13.73 to 15.35 on IS-213) - again, a reminder of the effect that a different audience can have on a set's perceived difficulty.
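For anyone who wants to check the arithmetic, here is a short Python sketch that recomputes the median-of-medians column and the shift from the base value, using only the numbers in the table above. The IS-215 row is incomplete, so it is left out; the variable and function names are just illustrative.

```python
import statistics

# (base median PPB, six trial medians), copied from the table above.
trials = {
    "DART III": (13.08, [12.00, 12.23, 12.23, 11.91, 12.00, 12.23]),
    "KICKOFF": (16.82, [15.61, 15.34, 15.61, 15.61, 16.02, 15.61]),
    "IS-213": (13.73, [15.42, 15.29, 15.81, 14.95, 15.29, 15.97]),
}


def summarize(base, medians):
    """Return (median of the trial medians, shift relative to the base median)."""
    mom = statistics.median(medians)
    return mom, mom - base


for name, (base, medians) in trials.items():
    mom, shift = summarize(base, medians)
    print(f"{name:9s} base={base:.2f}  m-o-m={mom:.3f}  shift={shift:+.2f}")
```

The shifts come out to roughly -1.0 PPB for DART III, -1.2 for KICKOFF, and +1.6 for IS-213.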

Conclusion
As stated in the beginning, I did this analysis because I didn't think that some of the broad statements that Daniel made would hold up to scrutiny. Honestly, I don't think this did a very good job of disproving them, but our understanding of difficulty is very imperfect, and I hope this toy example and the raw stats underlying it will be useful for calibrating our internal conceptions in the future.

This experiment doesn't account for a lot of things that I personally think are significant: distributional differences, precise consideration of who did and didn't play specific sets, the role that powers have in conceptions of difficulty, actual statistical methods, the effects of small sample sizes, the flaws in my analysis, and various others.

Here's a link to the stats I compiled: link. They are a mess - let me know if you can't tell what something is.

I would like to end with a plea that we stop using "Groger adjustments" as a meaningful quantifier of difficulty. While the math involved in deriving the numbers from tournament results is certainly sound and I think the actual ranking produced is very solid, the actual formula for adjustments is not based on anything other than intuition (Here's the doc: link). If someone wants to argue otherwise, be my guest.

EDIT: I misread (and also misthought) how the adjustments work; the formulas I was thinking of are used for computing the “score”. I do not think this significantly detracts from my point that there is much fixation on numbers produced by this one model without much introspection.

oriley · Tue May 09, 2023 9:36 am

Some brief notes and opinions which may be of use.

1. As an editor for KICKOFF: a couple of local leagues which ran KICKOFF have not uploaded their stats yet. This may be skewing your data a bit, particularly when you remove top 100 teams.

2. While KICKOFF is the best approximation of what we think regs should be—we wrote it, after all—most other regs housewrites have leaned into difficulty creep and such things a bit more than we did.

3. As the coach of a high school team for most of this year: the contention that IS sets have simply become more difficult at a quicker rate than most housewrites is a point I agree with. If you go back and read IS-190 or so, the difference between now and even a couple of years ago is easy to see. The problem is this difference is easily observable in IS-A sets as well—see the PPB distribution for any novice tournament from this year that used an A set. I’m sure A sets will course-correct at some point, but to be clear, I don’t think this trend is a good thing.

4. NAQT is more willing to give out 0s and dead tossups than most housewrites. NAQT also sees fewer 30s, partly due to cross-category commonlinks. This skews attempting to visualize data a bit.

5. Housewrite directors are worse at getting people to play their mirrors. NAQT is the institution in most states, so teams will come back to X or Y tournament using IS-215 year after year. “2022 KICKOFF” is not an institution; teams have to be told what that is.

6. There is a significant difference in difficulty between something like KICKOFF or SHOW-ME, and something like Saturnalia. Regular-plus is essentially three tiny target difficulties crushed into one label: generalizing all of them is very difficult. As a tournament director, what I’m looking for is a set that will differentiate the teams in my tournament based on knowledge without too many annoying buzzer races. Regs+ does that for pretty much everyone outside the top three or four teams in the country. Similar conversations to these are veering toward “regs+ is too easy” territory, and I don’t agree with that claim. “Regs+” is an attempt to bridge the gap between regular and nationals difficulty. It’s okay if some of them are like HFT 2019 and some of them are like DMA.

When we say there’s a “widening gap” perceived between new teams and not-new teams, or elite teams and not-elite teams, there are two questions we need to answer. First, are there enough sets at lower difficulties to keep these teams playing quizbowl? Second, how does a team bridge the gap?

Nothing I said here should be that profound or anything, but hopefully this helps contextualize things.

Santa Claus · Tue May 09, 2023 12:02 pm

oriley wrote: ↑Tue May 09, 2023 9:36 am 3. As the coach of a high school team for most of this year: the contention that IS sets have simply become more difficult at a quicker rate than most housewrites is a point I agree with. If you go back and read IS-190 or so, the difference between now and even a couple of years ago is easy to see. The problem is this difference is easily observable in IS-A sets as well—see the PPB distribution for any novice tournament from this year that used an A set. I’m sure A sets will course-correct at some point, but to be clear, I don’t think this trend is a good thing.

Emphasis my own.

Here are the median PPBs for every team that played an IS-A set in the last five years.

2022-23
220A - 14.51 n=210
218A - 14.12 n=446
216A - 14.19 n=523
214A - 14.38 n=448
212A - 14.55 n=582
       14.38 n=2209

2021-22
211A - 15.47 n=246
209A - 14.78 n=226
207A - 14.50 n=522
205A - 13.15 n=334
203A - 14.38 n=433
       14.50 n=1761

2020-2021
202A - 16.75 n=128
200A - 16.00 n=361
198A - 15.42 n=365
196A - 14.58 n=248
194A - 16.43 n=354
       16.00 n=1456

2019-2020
193A - 17.02 n=129
191A - 15.43 n=283
189A - 16.00 n=541
187A - 14.65 n=560
185A - 15.76 n=586
       15.76 n=2099

2018-2019
184A - 14.16 n=304
182A - 13.74 n=516
180A - 15.19 n=691
178A - 15.16 n=686
176A - 15.26 n=576
       15.16 n=2773

Some observations:

The median IS-A set this year played comparably to the median set last year.
Year to year, performance on the median IS-A set was steadily climbing before dropping significantly in the 2021-22 season.
The peak performance on IS-A sets coincided with the lowest number of teams playing, during a period of online-only play.
The number of people playing IS-A sets took a noticeable dip in the tail end of the 2019-2020 season and has slowly been recovering.
The median IS-A set from these last two years has teams performing roughly 1 PPB worse than pre-pandemic.
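As a cross-check on the table, the yearly summary rows appear to be the median of the five per-set medians together with the sum of the n values. That is an inference from the numbers rather than something stated outright, and a median pooled over every individual team could come out differently. A small Python sketch (names are illustrative) that reproduces the summary rows under that reading:

```python
import statistics

# (median PPB, number of teams) per IS-A set, copied from the table above.
is_a = {
    "2022-23": [(14.51, 210), (14.12, 446), (14.19, 523), (14.38, 448), (14.55, 582)],
    "2021-22": [(15.47, 246), (14.78, 226), (14.50, 522), (13.15, 334), (14.38, 433)],
    "2020-21": [(16.75, 128), (16.00, 361), (15.42, 365), (14.58, 248), (16.43, 354)],
    "2019-20": [(17.02, 129), (15.43, 283), (16.00, 541), (14.65, 560), (15.76, 586)],
    "2018-19": [(14.16, 304), (13.74, 516), (15.19, 691), (15.16, 686), (15.26, 576)],
}


def yearly_summary(rows):
    """Median of the per-set medians, plus the total number of team entries."""
    median_of_sets = statistics.median([ppb for ppb, _ in rows])
    total_n = sum(n for _, n in rows)
    return median_of_sets, total_n


for season, rows in is_a.items():
    med, total = yearly_summary(rows)
    print(f"{season}: {med:.2f} n={total}")
```

Because this weights a 128-team set the same as a 586-team set, it describes "the median set" in each season rather than "the median team".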

In principle, IS-A sets have a different audience every year. This isn't exactly true (many of the top teams at any given IS-A set have played several before) but for the bulk of the field it's the case. This means you can't control for field strength when considering the effect of packet difficulty on something like PPB - how could you differentiate between a set that was 5% harder on average (which is more or less the scale of these PPB differences) and a cohort of freshmen that were 5% worse? If there were some sort of large event that shifted the population of players significantly, how could you distinguish between the effects of that change and any sort of intentional difficulty change?

Some other factors:

During the pandemic, only the most dedicated teams continued playing sets.
Top teams may play IS-A sets less frequently now (someone would have to check).
Difficulty relative to housewrites may have shifted due to changes within the "top 100" community that serves as both the main writers and main players of housewritten sets. I briefly mentioned this in my previous post, but I'll state more explicitly here that people (particularly high schoolers) significantly discount how big of an impact it has that NAQT writers write NAQT sets and high schoolers (and recent grads) write housewrites.

I don't want to act like it's impossible that IS-A sets have gotten harder. But I'd rather try to base this off of stats rather than the anecdata that so frequently is bandied about as truth. So often I've heard players on top 100 teams saying things like "That IS-A lead-in was impossible!" and my brain fills in an implicit "but I still knew it" or "but I wasn't playing, since I'm not a novice" - let's not take our experiences as representative.

Separately, I think we should be a bit more reasonable about the standards we’re using to evaluate sets. A good player not being able to power every tossup in a novice set (or, less hyperbolically, not being able to power specific tossups) does not mean it's become too hard, and it certainly does not mean it has become inappropriate for novices to start out on (which is what I would characterize as their primary purpose). The changes in overall difficulty are on the order of a single PPB - that’s one bonus part less per round for an average team. While unfortunate, this strikes me as less “the beginning of the end” and more “the beginning of the beginning of the end”.

The same argument was made for IS sets. The situation is certainly different and I haven't crunched the numbers, so I'll refrain from significant speculation. I'll just say that I think many of my arguments apply there as well.

Important Bird Area · Tue May 09, 2023 2:47 pm

Thank you for posting these stats, Kevin- there's much here to think about, and I'll share these with NAQT's other editors. I may or may not have more to say later from NAQT's perspective (but anything substantive will wait until after MSNCT).

Subotai the Valiant, Final Dog of War · Tue May 09, 2023 3:00 pm

I would like to end with a plea that we stop using "Groger adjustments" as a meaningful quantifier of difficulty. While the math involved in deriving the numbers from tournament results is certainly sound and I think the actual ranking produced is very solid, the actual formula for adjustments is not based on anything other than intuition (Here's the doc: link). If someone wants to argue otherwise, be my guest.

EDIT: I misread (and also misthought) how the adjustments work; the formulas I was thinking of are used for computing the “score”. I do not think this significantly detracts from my point that there is much fixation on numbers produced by this one model without much introspection.

I'll read through the detailed stats when I get a chance, but I certainly agree with this point; I've questioned linear PPB adjustments for a long time because there's an upper bound on PPB. I used them because they were the easiest and most objective metric I had on hand, and the past two years of adjustments had begun to look quite different from previous years'.
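To make the upper-bound issue concrete, here is a toy Python sketch. The +3 adjustment and the 28.5 PPB team are hypothetical numbers, and the logit-scale version is only one possible bounded alternative shown for contrast; it is not how the Groger adjustments are computed.

```python
import math

MAX_PPB = 30.0  # a bonus is worth at most 30 points


def linear_adjust(ppb, delta):
    """Additive adjustment: simple, but it can push a strong team past 30."""
    return ppb + delta


def logit_adjust(ppb, delta_logit):
    """Shift on the log-odds scale of ppb / 30, so the result stays inside (0, 30).

    Only defined for 0 < ppb < 30.
    """
    p = ppb / MAX_PPB
    z = math.log(p / (1 - p)) + delta_logit
    return MAX_PPB / (1 + math.exp(-z))


strong_team = 28.5  # hypothetical PPB on a hard set
print(linear_adjust(strong_team, 3.0))   # exceeds the 30-point ceiling
print(logit_adjust(strong_team, 1.0))    # stays below 30
```

The same additive shift that looks reasonable for a 12 PPB team gives an impossible value near the ceiling, which is the core of the objection to linear adjustments.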

I'm curious whether, if in some way we can tell that indeed NAQT and housewrite difficulty haven't actually diverged, there's any way of understanding why the adjustments have grown less reflective of true difficulty for non-elite teams over recent years. Trends in housewrite/NAQT writing styles have presumably stayed roughly the same over this span.

I also genuinely don't see the point in the dominant regs+ style being a small step above regs (some tournaments at this difficulty would be fine, of course, but it's not a good standard). It's simply not sufficiently hard to distinguish between top teams in many regions if they were to actually all play at a given site. Barrington got 26+ PPB and 12 powers per game on DART III against pretty solid competition, which are not numbers that bode well for matches between multiple teams of that caliber. Maybe the decrease in number of "premier" regional tournaments since when I played has something to do with top-100 teams not caring as much about playing a particular set over another, since no set is hard enough compared to the others to warrant becoming a de facto premier tournament.

Meanwhile, the label of "regs+" likely makes many median teams shy away from playing these sets and many hosts shy away from using them.

1992 in spaceflight · Tue May 09, 2023 5:05 pm

oriley wrote: ↑Tue May 09, 2023 9:36 am 5. Housewrite directors are worse at getting people to play their mirrors. NAQT is the institution in most states, so teams will come back to X or Y tournament using IS-215 year after year. “2022 KICKOFF” is not an institution; teams have to be told what that is.

This is a very important point. Housewrite logistics directors (whether they be the head editor or someone else) need to make sure they spread their net wide, if they want as many mirrors as possible. This means putting in the work to get email addresses of other people who make things happen in their circuit. If you need to, write down things that people tell you about other circuits!

I would just like to add in some data about sets in Missouri. KICKOFF, SHOW-ME, the two IQBT sets and most NAQT IS and A-sets we had access to were mirrored multiple times across the state with decent field sizes at most of them. We want to be able to give our money to quizbowl sets; please write difficulty-appropriate sets so that way we can, to paraphrase Fry, give you our money.

The Quizbowl Resource Center
