Algorithm for generating quiz bowl questions.

recfreq · Post by **recfreq** » Thu Feb 16, 2006 2:53 am

I don't know how many people'd be down for discussing this, but here it goes. Given the recent dismantling of some pretty poor questions, I've been wondering how well a computer can generate quiz bowl questions (namely, TUs). Here's my humble proposal for such a program.

1. The program'd have access to all the Encyclopedia Britannica articles. You could rely on some other source, but for the sake of simplicity, let's say we use the Britannica articles, say from the DVD software. It'd look up the article on the subject of interest as desired by the TD. Let's say the topic is Margaret Mead. The program'd look up the word "Mead" in the Margaret Mead article, and take that sentence to be the 1st sentence of the TU, replacing the "Mead" with a pronoun. Constraints like taking the 2nd appearance of "Mead" to avoid an easy 1st clue and making sure that there're no pronouns before "Mead" (else take the next appearance of "Mead") could also apply.

2. Next, randomly obtain two or three sentences from the article, the last being say, from the end of the 1st paragraph (in the intro). You could also do "she wrote" then randomly list the bibliography.

3. We can let the user specify that a certain phrase has to come up in the question, if it hasn't already been incorporated; for Mead this might be "Coming of Age in Samoa." If it's not there, look for it in Britannica and insert the sentence (although it might be an awkward sentence), subject to constraints.

4. Finally, add in "FTP, name this" and the 1st sentence from Benet's, which in this case will be "American anthropologist," but you could imagine better sources for this.

I think this has more or less all you really need in the question, if you'll excuse the disjointed nature of the middle portions and a possibly lackluster finish (the good thing is that Britannica often doesn't let go of the pronoun once it starts using it). I'd be interested in seeing how it'd fare against CBI and NAQT in a double blind experiment involving unsuspecting moderators and players. No doubt they'll find the questions odd or perhaps bad, but just how bad, I'd like to know.
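A minimal Python sketch of steps 1-4 above, for anyone who wants to try it. Everything named here (`generate_tossup`, `split_sentences`, the parameters) is made up for illustration, and it assumes the article is already available as plain text. Two simplifications: "2nd appearance" is taken to mean the second sentence that names the answer, and the random middle clues are drawn only from sentences that don't name the answer.

```python
import random
import re

# Pronouns that disqualify a lead-in if they appear before the name (step 1).
PRONOUNS = re.compile(r"\b(he|she|it|his|her|its|they|their)\b", re.IGNORECASE)


def split_sentences(text):
    """Naive splitter; it will mis-split abbreviations such as 'Ph.D.'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def generate_tossup(article, names, pronoun, giveaway,
                    required=None, n_middle=2, rng=None):
    """Assemble a tossup string from plain article text.

    names: forms of the answer to hide, longest first,
           e.g. ("Margaret Mead", "Mead").
    """
    rng = rng or random.Random()
    sentences = split_sentences(article)
    name_re = re.compile("|".join(re.escape(n) for n in names))

    def mask(sentence):
        """Swap every form of the name for the pronoun, then recapitalize."""
        masked = name_re.sub(pronoun, sentence)
        return masked[0].upper() + masked[1:]

    # Step 1: skip the first sentence that names the answer, then take the
    # next one with no pronoun in front of the name.
    named = [i for i, s in enumerate(sentences) if name_re.search(s)]
    lead = next(
        (i for i in named[1:]
         if not PRONOUNS.search(name_re.split(sentences[i], maxsplit=1)[0])),
        None,
    )
    if lead is None:
        raise ValueError("no usable lead-in sentence")
    clues = [mask(sentences[lead])]

    # Step 2: a few random sentences that do not name the answer.
    pool = [s for s in sentences if not name_re.search(s)]
    clues += rng.sample(pool, min(n_middle, len(pool)))

    # Step 3: force in the required phrase if no clue has it yet.
    if required and not any(required in c for c in clues):
        hit = next((s for s in sentences if required in s), None)
        if hit is not None:
            clues.append(mask(hit))

    # Step 4: the giveaway.
    clues.append(f"FTP, name this {giveaway}.")
    return " ".join(clues)
```

Calling it with something like `generate_tossup(text, ("Margaret Mead", "Mead"), "she", "American anthropologist", required="Coming of Age in Samoa")` returns one tossup string. The output is only as good as the sentence splitter, which is deliberately crude here.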

BTW here's a question I generated by the random picking algorithm above (my source is Britannica--should be available online, note that I skipped the 1st mention of "Mead" as an optimization--used the 2nd for the 1st clue); quotes are from the text (the last from Benet's, unquoted means the computer added it); works are used by finding the italicized items at the end, hence the weirdness of the Benedict biography clue:

She "received an M.A. in 1924 and a Ph.D. in 1929."  "Her contributions to science received special recognition when, at the age of 72, she was elected to the presidency of the American Association for the Advancement of Science."  She wrote "Ruth Benedict, Culture and Commitment," and "Letters from the Field."  "In 1925, during the first of her many field trips to the South Seas, she gathered material for the first of her 23 books, Coming of Age in Samoa, a perennial best-seller and a characteristic example of her reliance on observation rather than statistics for data."  FTP name this "American anthropologist."

Is it really that bad? I'm pretty sure some of the M. Mead questions I've seen aren't that much better. Of course, extending this to science may have issues, and if someone wants to tackle the problem of bonuses, feel free.

recfreq · Post by **recfreq** » Thu Feb 16, 2006 3:07 am

The algorithm (awkwardly) applied to "Hastings, Battle of" (you could easily imagine more optimizations); I guess I had to implicitly supply the 1st pronoun as well (still, the pronoun rule is broken); again the last line is from Benet's:

"Harold, learning of his landing on about October 2, hurried southward and by October 13, was approaching" this battle "with about 7,000 men."  "William therefore threw in his cavalry, which was so badly mauled by the English infantry, wielding two-handed battle-axes, that it fled."  FTP name this "battle occurring in 1066 near Senlac in Sussex, England, where Harold II died defending his claim to the English throne against the Norman William the Conqueror."

MLafer · Post by **MLafer** » Thu Feb 16, 2006 3:09 am

Is it really that bad?

yes

Leo Wolpert · Post by **Leo Wolpert** » Thu Feb 16, 2006 3:10 am

Both of those questions are bad enough to make Richard Reid blush.

recfreq · Post by **recfreq** » Thu Feb 16, 2006 3:36 am

Ok, it's pretty bad; here's something worse. (It might be a bit better if it can "know" when to pick, say, the 6th or 7th appearance of the namesake instead of the 2nd, but what the heck; also the more you put into it, the better, I guess.) Just for fun, here's a quick comparison, w/o being entirely fair to NAQT, since I picked the topic, but here goes:

NAQT question on Peter the Great:

This man built his namesake summer home for the sole purpose of outshining Versailles [vur-SYE] with seven parks, four water cascades, an avenue of 64 fountains, and 37 gilded statues. Its most famous fountain, the Samson, commemorates his (*) 1709 victory over the Swedes. For 10 points--name this ruler who built his summer home on the Gulf of Finland while tsar of Russia.

Random pick algorithm using Wikipedia and (last sentence) Benet's (note how funny this is, but that happens to be the 2nd appearance of Peter in the article!) (I had to supply "person," and required "Poltava."):

This person "then ruled alone until 1724, whenceforth he ruled jointly with his wife, Catherine I."  "Sheremetyev also investigated the possibility of future joint ventures with the Knights, including action against the Turks and the possibility of a future Russian naval base."  "He was accompanied throughout his journey by his mistress, the Finnish girl Afrosina."  "In the summer of 1709, they nevertheless resumed their efforts to capture Ukraine, culminating in the Battle of Poltava on 27 June."  FTP name this "tsar of Russia."

Same algorithm and constraints using Enc Brit (all parens are removed):

This person "was the son of Tsar Alexis by his second wife, Natalya Kirillovna Naryshkina."  "On the one hand, these Azov campaigns could be seen as fulfilling Russia's commitments, undertaken during Sophia's regency, to the anti-Turkish 'Holy League' of 1684."  "By the Russo-Turkish Peace of Constantinople he retained possession of Azov."  "He subsequently took part in the siege that led to the Russian capture of Narva and in the battles of Lesnaya and of Poltava."  FTP name this "tsar of Russia."

I guess if you're lucky, your question will be ok compared to NAQT. Improvements are welcomed. Of course, you could also just tell us what a stupid idea this is. Fun for all.

Leo Wolpert · Post by **Leo Wolpert** » Thu Feb 16, 2006 3:49 am

recfreq wrote:Of course, you could also just tell us what a stupid idea this is.

I think I'll choose this course of action. But not without noting how sad it is that NAQT (at its worst) is on par with computer-generated plagiarism bowl.

Edit: Quizbowl questions in general, at their worst, are on par with computer-generated plagiarism bowl. I feel like the non-Zeke-edited Terrapin packets resembled this kind of crap, though I may be misremembering them. So, it's not like this is a problem exclusive to NAQT, but the length restrictions seem to make it a bit more prevalent.

NotBhan · Post by **NotBhan** » Thu Feb 16, 2006 4:06 am

It's not a bad idea for an experiment, or for a thought-experiment for lazy folks like myself, but I don't see any way that you're going to generate a well-structured question without some kind of human interaction with the material. Given that there's no standard (or even predominant) structure for reference articles, I don't think a random clue search within an article is going to foster a good structure.

One obvious approach would be assigning some kind of difficulty ranking to each clue in (say) a Britannica article, then using some kind of generator to create a pyramidal question. Sounds like fun. Something fancier might be a purely algorithmic approach that attempts to generate difficulty rankings based on how often two items appear together in a collection of reference materials (e.g. "Coming of Age in Samoa" will appear more frequently with Mead than "Letters from the Field"). In some theoretical way that might make sense, but in practice it seems like it would be highly unreliable. Were it possible to modify such an algorithm to make it reliable, the modification would take far more work than just writing questions would.
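For what it's worth, the co-occurrence idea can be sketched in a few lines of Python. This is only an illustration: `rank_clues` is a made-up name, the "collection of reference materials" is assumed to be a list of plain-text documents, and matching is plain case-sensitive substring search, which is exactly the kind of shortcut that would make it unreliable in practice.

```python
def rank_clues(clues, answer, corpus):
    """Order clues hardest-first for a pyramidal question.

    A clue's score is the number of documents in which it appears together
    with the answer; rarer pairings are assumed to be harder clues.
    """
    def cooccurrences(clue):
        return sum(1 for doc in corpus if clue in doc and answer in doc)

    # sorted() is ascending, so the least frequent pairing comes first.
    return sorted(clues, key=cooccurrences)


# Toy corpus, invented purely for this example.
corpus = [
    "Mead wrote Coming of Age in Samoa.",
    "Coming of Age in Samoa made Mead a household name.",
    "Mead also published Letters from the Field.",
    "Samoa lies in the South Pacific.",
]
ranked = rank_clues(
    ["Coming of Age in Samoa", "Letters from the Field"], "Mead", corpus
)
print(ranked)
```

On this toy corpus "Letters from the Field" pairs with Mead once and "Coming of Age in Samoa" twice, so the former is ranked as the harder, earlier clue.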

Next, even if you generate a proper structure, you'd still likely need human interaction with the material to edit the language into a smooth flow, to avoid too-difficult-to-pronounce names when possible, &c, &c. And even then, the questions may well be dreadfully monotonous if every tossup has the same essential form. It's a nice idea, but, FWIW, I don't see it producing questions that meet modern quizbowl standards of structure and style without an enormous input of human effort.

--Raj Dhuwalia

P.S. Might be useful for generating practice and/or study material, though. Or maybe not.

AuguryMarch · Post by **AuguryMarch** » Thu Feb 16, 2006 4:20 am

Boy do I already regret posting this but,

Yeah yeah. I had this idea about 6 years ago. But in my version, instead of using reference sources, which everyone agrees are inconsistent, it would use old tournaments. Split up every tossup into clues where difficulty is numbered and then use it to generate random tossups. Of course, people remember wording, and no questions that use narrative work, but otherwise yeah. After spending a week working on a database, I came up with a startling conclusion:

If, rather than spending my time working on such a project, I would instead read questions and try and learn the material therein, I would become a better player at actual quizbowl tournaments.

But hey, that's just me.

pblessman · Post by **pblessman** » Thu Feb 16, 2006 12:19 pm

I had the same idea as AuguryMarch:

1. Generate a list of hints divided by difficulty (Levels I-IV, or whatever).
2. For each hint, generate 3-5 phrasings/bridges.
3. Randomly generate question.
4. Have a human editor clean it up.

If you had only five hints at each of the IV difficulty levels, with three variations each depending on wording, you could generate over 50,000 unique TUs, and this number would approach infinity once you throw in the human editor element.
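The count can be checked with a short Python sketch. It assumes each tossup uses exactly one hint from each level, in a fixed level order, and that level 1 is the hardest and is read first; none of that is spelled out in the steps above. The names `build_bank`, `count_tossups` and `random_tossup` are invented for the example, and the hint text is placeholder data.

```python
import random


def build_bank(levels=4, hints_per_level=5, phrasings=3):
    """Placeholder hint bank: bank[level] is a list of hints, each hint a
    list of alternative phrasings."""
    return {
        level: [
            [f"level {level}, hint {h}, phrasing {p}"
             for p in range(1, phrasings + 1)]
            for h in range(1, hints_per_level + 1)
        ]
        for level in range(1, levels + 1)
    }


def count_tossups(bank):
    """Number of distinct tossups when one phrased hint is used per level."""
    total = 1
    for hints in bank.values():
        total *= sum(len(phrasings) for phrasings in hints)
    return total


def random_tossup(bank, rng=None):
    """Pick one hint and one phrasing per level, level 1 first."""
    rng = rng or random.Random()
    return [rng.choice(rng.choice(bank[level])) for level in sorted(bank)]


bank = build_bank()
print(count_tossups(bank))  # 15 options at each of 4 levels: 15**4 = 50625
print(random_tossup(bank))
```

With five hints and three phrasings there are 15 options per level, so 15^4 = 50,625 combinations, which matches "over 50,000".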

Properly edited, this could be a source of quality questions, but even without the human editor, this could be a practically infinite supply of practice questions.

I, personally, like this idea, because it would allow schools to be able to play (or practice) quiz bowl any time they want to, as there would be no limiting factor in terms of question availability.

Oh, one thing I forgot: To compile a toss-up packet, the computer would have to select questions based on a distribution requirement and a frequency list, with a human editor being able to throw in some tweaks.
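A rough Python sketch of that packet step, under some assumptions: each question is a dict with hypothetical `category`, `answer` and `text` keys, the distribution requirement is a category-to-count mapping, and the frequency list is an answer-to-weight mapping where a higher weight means the answer should come up more often. `compile_packet` is an invented name, and the human editor's tweaks are left out.

```python
import random


def compile_packet(pool, distribution, frequency=None, rng=None):
    """Draw questions so each category meets its quota.

    Within a category, questions are drawn without replacement, weighted by
    the frequency entry for their answer (default weight 1). Weights are
    assumed to be positive.
    """
    rng = rng or random.Random()
    frequency = frequency or {}
    packet = []
    for category, quota in distribution.items():
        candidates = [q for q in pool if q["category"] == category]
        if len(candidates) < quota:
            raise ValueError(f"not enough questions for {category!r}")
        for _ in range(quota):
            weights = [frequency.get(q["answer"], 1) for q in candidates]
            pick = rng.choices(candidates, weights=weights, k=1)[0]
            packet.append(pick)
            candidates.remove(pick)
    # Shuffle so that categories are not grouped together in the packet.
    rng.shuffle(packet)
    return packet


# Placeholder pool, invented for the example.
pool = [
    {"category": "social science", "answer": "Margaret Mead", "text": "..."},
    {"category": "social science", "answer": "Ruth Benedict", "text": "..."},
    {"category": "history", "answer": "Battle of Hastings", "text": "..."},
    {"category": "history", "answer": "Peter the Great", "text": "..."},
    {"category": "history", "answer": "Battle of Poltava", "text": "..."},
]
packet = compile_packet(
    pool,
    distribution={"social science": 1, "history": 2},
    frequency={"Peter the Great": 3},
    rng=random.Random(0),
)
print([q["answer"] for q in packet])
```

A real version would also want to avoid repeating an answer within a packet or across recent packets, which this sketch does not attempt.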
