## Statistics Questions

### Statistics Questions

As someone who knows a lot more about statistics than pure math, but does not have any significant question-writing experience, I want to ask about statistics' role in higher-level quizbowl. I remember NAQT-written Scholastic Bowl questions containing a significant amount of prob/stats content at around the level of AP Statistics, but have noticed that college questions greatly favor pure math questions in their science subdistributions over statistics. Just out of curiosity, are there reasons for this?

### Re: Statistics Questions

I'd guess that if this is true it's because of the greater breadth of askable topics in college math combined with the lesser place of mathematics in college distributions, especially compared to state formats like VHSL Scholastic Bowl. Math questions in college can more easily ask about relatively advanced topics like abstract algebra, analysis, and numerical methods that collegiate math majors and others are likely to study, which may squeeze out other topics like statistics that have a greater place at lower difficulties because there are fewer askable fields there. Moreover, I think VHSL packets usually have two separate math tossups, while a college set with a lot of math will have either a math tossup or a bonus in each packet, so there are fewer chances for statistics or any one field of math to show up.

### Re: Statistics Questions

> DavidB256 wrote (Thu Apr 09, 2020 10:26 pm):
>
> As someone who knows a lot more about statistics than pure math, but does not have any significant question-writing experience, I want to ask about statistics' role in higher-level quizbowl. I remember NAQT-written Scholastic Bowl questions containing a significant amount of prob/stats content at around the level of AP Statistics, but have noticed that college questions greatly favor pure math questions in their science subdistributions over statistics. Just out of curiosity, are there reasons for this?

The VHSL has an ancient distribution that the powers that be refuse to change. The computational math questions are designed to test your ability to quickly crunch out a basic computation. The VHSL is markedly different from most other forms of quizbowl, especially in its high prevalence of computational math. With a lot of high school math questions that aren't computational, you get stuck with Abel-Ruffini getting clued for the 50th time or questions on "pi" or "the square root of 2." When you get to college, you're able to ask about more advanced topics that you simply can't in high school.

To clear up what Eric said, there are 7 math questions per 55-question packet. There are 4 computational questions in the directed round, a replacement computational question on the last page of the packet, and 2 questions that are supposed to be about math in the rest of the packet but, due to the limited answer space of A-level packets, often end up being math-adjacent.

### Re: Statistics Questions

While college quizbowl has a relatively low and reasonable number of questions on statistical theory, I personally think it's pretty lacking in questions concerning applied statistics and statistics-adjacent fields such as machine learning, which many people use in both academia and the modern economy. Some recent tournaments such as 2019 Lederberg have sought to rectify this by including more of these applied questions in a "Data Science / Applied Math" distribution.

### Re: Statistics Questions

> naan/steak-holding toll wrote (Fri Apr 10, 2020 1:22 am):
>
> While college quizbowl has a relatively low and reasonable number of questions on statistical theory, I personally think quizbowl overall is pretty lacking in questions concerning applied statistics and statistics-adjacent fields such as machine learning, which many people use in both academia and the modern economy. Some recent tournaments such as 2019 Lederberg have sought to rectify this by including more of these applied questions in a "Data Science / Applied Math" distribution.

I agree. I've written a lot of stats for tournaments like Spartan Housewrite and Michigan Winter, and I've written some machine learning and applied math things, but they are hard for people to write without experience. I think a good trend is that computer science questions are asking about more multi-disciplinary and applied things rather than "this programming language" or "this data structure." The issue with statistics in quizbowl is twofold.

1. People conflate probability and statistics when they are different fields. Probability is concerned with sigma-algebras, and its basis is measure-theoretic. Statistics has to do with trying to describe and account for trends in data. Probability is in some sense more basic than statistics and is certainly closer to pure math at its basis. Unfortunately, people often have very rudimentary or poor training in probability that doesn't extend further than plug-and-chug methods for applying the CLT or whatever. There is an incredibly rich literature here, and I always recommend Rosenthal's book as a good starting point for rigorous probability.

2. Likewise, for statistics, you have lots of subfields like time series, causal inference, Bayesian methods, and semi- and nonparametrics with rich literatures that could come up a lot more often. We should move beyond questions on, like, "this distribution" to asking about applications in the above fields. It's okay if someone answers a statistics question based off knowledge learned in an epidemiology or econometrics or environmental modeling class. Statistics is used in all of those fields, just like how game theory is used in ecology and behavioral science as well as, obviously, microeconomics.

I hope I provided some ideas of how to expand the askable amount of applied mathematics. This is more for college difficulty, but even in high school there are lots of problems and methods to ask about aside from calculating the mean or whatever. AP Stats is pretty bad, but there are a lot of statistical tests that are worth knowing.

One additional problem in statistics is that the names on Wikipedia might not be the names used in a particular textbook. Try to describe the test using words like "testing for validity of instruments" before trying to drop a name like "oh this is a Wald test." Many statistics are asymptotically equivalent (e.g. F and chi-square), so naming gets confusing, especially if a particular professor or book did not note a particular nuance or naming convention. I can recommend books here if people are interested in consulting better sources than Wikipedia.
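
As one concrete illustration of the asymptotic equivalence mentioned above: if F follows an F distribution with q numerator and n denominator degrees of freedom, then q times F converges in distribution to a chi-square with q degrees of freedom as n grows. The Python sketch below is only an illustration under stated assumptions: the function names, the choice of q = 3, the seed, and the simulation size are all made up for this example, and the cutoff 7.815 is the usual table value for the 95th percentile of a chi-square with 3 degrees of freedom.

```python
import random

# Table value (an input assumption): 95th percentile of chi-square with 3 df.
CHI2_95_3DF = 7.815


def chi2_draw(rng, df):
    """Chi-square(df) is a Gamma distribution with shape df/2 and scale 2."""
    return rng.gammavariate(df / 2.0, 2.0)


def f_draw(rng, q, n):
    """F(q, n) is a ratio of independent chi-squares, each divided by its df."""
    return (chi2_draw(rng, q) / q) / (chi2_draw(rng, n) / n)


def tail_rate(q, n, cutoff, reps=20000, seed=0):
    """Share of simulated q * F(q, n) draws that exceed the chi-square cutoff."""
    rng = random.Random(seed)
    exceed = sum(q * f_draw(rng, q, n) > cutoff for _ in range(reps))
    return exceed / reps


if __name__ == "__main__":
    # Hypothetical denominator degrees of freedom, from tiny to large.
    for n in (5, 50, 5000):
        print(n, round(tail_rate(3, n, CHI2_95_3DF), 3))
```

If the approximation is behaving, the printed rate should sit well above 0.05 for small n and drift toward 0.05 as n grows, which is why the two statistics can give near-identical answers in large samples even when a textbook names only one of them.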

### Re: Statistics Questions

That would be much appreciated!

### Re: Statistics Questions

I think the main problem is that 1/1 other science is not large enough to ask all of the categories that get put in it. Statistics has to compete for space with pure math, astronomy, computer science, and the Earth sciences. As a writer, I've found 1.5/1.5 sufficient for lower levels, but that has the obvious caveat of "where does the additional .5/.5 come from?"

### Re: Statistics Questions

> Carlos Be wrote (Fri Apr 10, 2020 9:05 pm):
>
> I think the main problem is that 1/1 other science is not large enough to ask all of the categories that get put in it. Statistics has to compete for space with pure math, astronomy, computer science, and the Earth sciences. As a writer, I've found 1.5/1.5 sufficient for lower levels, but that has the obvious caveat of "where does the additional .5/.5 come from?"

This may be rehashing an old argument, but I think it's worth considering whether we should try to squeeze earth science and astronomy in with the rest of the natural sciences (e.g. grouping earth sci with bio and astro with physics) instead of putting them in other science. This would accomplish two things: 1) it would ensure that every packet has 1/1 non-natural sci, allowing for stats and other applied topics to get more exposure, and 2) it would lower the chance of similarly-themed topics being tossed up in the same packet. Speaking for myself only, I get pretty bummed when I can't hear a math/CS/applied tossup because its place has been taken up by ES/astro, despite those two subjects having fewer college majors than the rest of the other science, and I imagine many other players feel the same way. A lot of times, I see a physics tossup in the same packet as an astro tossup based heavily on astrophysics, and it feels like the packet is double dipping into physics majors' wheelhouses. Moving ES/astro out of other science would hurt chem/physics/bio players, but it makes sense in my mind for a tossup on mineral chemistry to take the spot of a chemical element tossup rather than an algebra tossup, or a tossup on ocean science to replace ecology rather than programming.

I guess my main point is that earth science and astronomy are more closely related to chem, physics, and bio than they are to math and CS, and we should rethink how we are grouping these subjects together to ensure that everything is being represented properly.

### Re: Statistics Questions

> Vinjance wrote (Tue Apr 14, 2020 4:26 pm):
>
> Speaking for myself only, I get pretty bummed when I can't hear a math/CS/applied tossup because its place has been taken up by ES/astro, despite those two subjects having fewer college majors than the rest of the other science, and I imagine many other players feel the same way.

I agree that it's worth considering how the space for "other science" is allocated, but this is a terrible argument. If we based a quizbowl distribution on the things people major in, business and psychology would both be more common than literature or fine arts.

### Re: Statistics Questions

> Wartortullian wrote (Tue Apr 14, 2020 5:59 pm):
>
> > Vinjance wrote (Tue Apr 14, 2020 4:26 pm):
> >
> > Speaking for myself only, I get pretty bummed when I can't hear a math/CS/applied tossup because its place has been taken up by ES/astro, despite those two subjects having fewer college majors than the rest of the other science, and I imagine many other players feel the same way.
>
> I agree that it's worth considering how the space for "other science" is allocated, but this is a terrible argument. If we based a quizbowl distribution on the things people major in, business and psychology would both be more common than literature or fine arts.

Given that the science canon is much more closely tied to the classroom than any other part of the canon, I think it's not completely unreasonable to have this be at least something of a factor in how much space a discipline gets.

### Re: Statistics Questions

> Wartortullian wrote (Tue Apr 14, 2020 5:59 pm):
>
> > Vinjance wrote (Tue Apr 14, 2020 4:26 pm):
> >
> > Speaking for myself only, I get pretty bummed when I can't hear a math/CS/applied tossup because its place has been taken up by ES/astro, despite those two subjects having fewer college majors than the rest of the other science, and I imagine many other players feel the same way.
>
> I agree that it's worth considering how the space for "other science" is allocated, but this is a terrible argument. If we based a quizbowl distribution on the things people major in, business and psychology would both be more common than literature or fine arts.

The reason I mentioned the number of people majoring is to illustrate why math/CS/applied should be a guaranteed tossup in any given packet. Right now, these subjects aren't a guaranteed tossup in every packet because earth science and astronomy, two natural sciences, are being grouped with all the non-natural sciences instead of the rest of the natural sciences like physics and chemistry. I realize that set-wide distributions stray far from what people actually major in, but I don't really understand why this is a terrible argument.

### Re: Statistics Questions

> Vinjance wrote (Fri Apr 10, 2020 9:05 pm):
>
> I guess my main point is that earth science and astronomy are more closely related to chem, physics, and bio than they are to math and CS, and we should rethink how we are grouping these subjects together to ensure that everything is being represented properly.

I think grouping math alongside computer science, and separate from "natural" science, is problematic. Most math has very little to do with computer science, and a lot of computer science has very little to do with math (e.g. programming). Math certainly has at least as much to do with physics as it does with computer science. In general, every field of science intersects with other fields; bio shares a lot with chem and stats, chem shares a lot with bio and physics, and the Earth sciences (plural) intersect heavily with biology, chemistry, physics, math, statistics, computing, astronomy, and probably other things. Additionally, grouping math along with CS may cause editors to make math more CS-like.

Although I disagree with your solution, I think your comment brings up an important point: treating "other science" as a single category is misleading and bad. Tournaments should specify their subdistribution for other science in their main announcement. Additionally, when assigning editors, each category in other science should be treated as its own subject. (Perhaps the same person edits multiple subjects, if they are qualified; the point is that tournaments shouldn't look for a single "other science" editor.)

> Vinjance wrote (Tue Apr 14, 2020 7:02 pm):
>
> The reason I mentioned the number of people majoring is to illustrate why math/CS/applied should be a guaranteed tossup in any given packet. Right now, these subjects aren't a guaranteed tossup in every packet because earth science and astronomy, two natural sciences, are being grouped with all the non-natural sciences instead of the rest of the natural sciences like physics and chemistry. I realize that set-wide distributions stray far from what people actually major in, but I don't really understand why this is a terrible argument.

How does the number of CS majors have any effect on whether "non-natural science" should be treated as a bloc? I don't follow.

### Re: Statistics Questions

This thread makes me think about how I write questions.

When I think of topics for a math question, I usually default to picking one of the major subfields: analysis, algebra, topology, geometry, combinatorics/discrete, or something like that. For some reason, "statistical theory" (e.g., the Kolmogorov-Smirnov test, or omitted variable bias, or the Neyman-Pearson Lemma) never shows up. At best, I'll get to something about random variables (things converging to other things, expectation as a linear functional, consistent or efficient estimators, etc.) But that's looking at them the way an analyst would, not a statistician.

Maybe it's just a bias in what I know. But I'd wager it's a kind of implicit community norm in quizbowl about what fits in a math tossup, in the same way that religion vs. myth is a community norm. I think we have some notion of what "serious math" is, and it's informed by what college math departments teach (usually statistics would be an economics course, in my undergrad).
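
For readers who have not met it, the one-sample Kolmogorov-Smirnov statistic mentioned above is the largest gap between the empirical CDF of a sample and a reference CDF. Below is a minimal Python sketch of that definition; the function names and the Uniform(0, 1) reference distribution are assumptions chosen purely for illustration, and it computes only the statistic, not a p-value.

```python
def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic D_n = sup_x |F_n(x) - F(x)|."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        fx = cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x, so the largest
        # gap can occur just below or exactly at each sorted data point.
        d = max(d, (i + 1) / n - fx, fx - i / n)
    return d


def uniform_cdf(x):
    """CDF of the Uniform(0, 1) distribution (hypothetical reference)."""
    return min(max(x, 0.0), 1.0)


if __name__ == "__main__":
    # Made-up sample, used only to exercise the function.
    print(ks_statistic([0.1, 0.4, 0.7], uniform_cdf))
```

Checking both sides of each jump is the one detail that is easy to get wrong: comparing the reference CDF only against the post-jump value of the empirical CDF can understate the gap.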

### Re: Statistics Questions

> ArnavS wrote (Sat Apr 18, 2020 1:25 am):
>
> This thread makes me think about how I write questions.
>
> When I think of topics for a math question, I usually default to picking one of the major subfields: analysis, algebra, topology, geometry, combinatorics/discrete, or something like that. For some reason, "statistical theory" (e.g., the Kolmogorov-Smirnov test, or omitted variable bias, or the Neyman-Pearson Lemma) never shows up. At best, I'll get to something about random variables (things converging to other things, expectation as a linear functional, consistent or efficient estimators, etc.) But that's looking at them the way an analyst would, not a statistician.
>
> Maybe it's just a bias in what I know. But I'd wager it's a kind of implicit community norm in quizbowl about what fits in a math tossup, in the same way that religion vs. myth is a community norm. I think we have some notion of what "serious math" is, and it's informed by what college math departments teach (usually statistics would be an economics course, in my undergrad).

I've clued both Kolmogorov-Smirnov and convergence in distribution in questions... In fact, the K-S test was an MWT hard part.

### Re: Statistics Questions

Maybe it's just me, then. But what were the answerlines for those?

