QBDB 3: QBDB with a Vengeance

The scariest thing of all is Protobowl
grapesmoker
Sin
Posts: 6360
Joined: Sat Oct 25, 2003 5:23 pm
Location: Pittsburgh, PA

QBDB 3: QBDB with a Vengeance

Post by grapesmoker » Wed May 06, 2015 3:00 pm

Hello friends,

I'm very happy to announce the resurrection of my once-pioneering-then-defunct quizbowl question database, QBDB. Over the last several weeks, I undertook a wholesale rebuilding of the application, engineering it from the ground up using modern web frameworks. You can read more about it on the "About QBDB" tab or check out the Github repo. The punchline is that a whole lot of work has been done to make QBDB a) easy to use, b) easy to extend, and c) easy to update with new content.

Why another database? A few reasons. First, I'd always wanted to do QBDB "correctly" ever since the days when it was running in PHP with hand-coded SQL queries. Second, I've finally done the work to bring packet-parsing down from "a horrible experience not preferable to sticking needles in your eyeballs" to "performable with an irritation equivalent to a mosquito bite." The packet parser has its own Github repo so you can check that out if you care about how it works. Working on this project has also forced me to learn some frontend development to get Backbone working correctly and some worthwhile backend stuff to deal with search. The benefit of QBDB over other databases lies mostly in this ability to transform unstructured packet content into structured data, more or less automatically. This means that you can have much more in-depth queries (for example, querying by either tossup or bonus, querying by question text or answer, etc.). In addition, you can do things like generate statistics on the appearance rate of clues or answers and so on.
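
To make the structured-data point concrete, here is a minimal Python sketch of what searching by question type and field, and counting answer appearances, might look like. The record layout (`kind`, `text`, `answer`), the sample questions, and the function names are all assumptions for illustration, not QBDB's actual schema or code.

```python
from collections import Counter

# Hypothetical records; QBDB's real schema may differ.
QUESTIONS = [
    {"kind": "tossup", "text": "This author of Dubliners wrote Ulysses.", "answer": "James Joyce"},
    {"kind": "bonus", "text": "Name this Irish author of Finnegans Wake.", "answer": "James Joyce"},
    {"kind": "tossup", "text": "This physicist formulated general relativity.", "answer": "Albert Einstein"},
]


def search(questions, term, kinds=("tossup", "bonus"), fields=("text", "answer")):
    """Return questions of the given kinds whose chosen fields contain term."""
    needle = term.lower()
    return [
        q for q in questions
        if q["kind"] in kinds and any(needle in q[f].lower() for f in fields)
    ]


def answer_counts(questions):
    """Count how often each answer line appears."""
    return Counter(q["answer"] for q in questions)
```

Restricting `kinds` or `fields` narrows the search in the way described above, and this kind of filtering only works once the packets have been parsed into separate records.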

As usual with open source projects, I welcome any and all useful contributions. The best way to contribute to either the parser or QBDB is to fork it, make your changes, and issue a pull request. If you notice something wrong, feel free to open an issue on the project's Github page. If you want to contribute to the effort of adding packets, there's a bit in the FAQ about how to do that; contact me with any further questions.

Right now, QBDB is extremely minimalistic in what it does: namely, it will show you packets and tournaments stored in the db, and allow you to search it by either tossup or bonus (or both) and either question or answer text (or both). Those functions should work pretty well and form the core of what the application is for. Other functions may be added in the future, as I find time or inclination. I'll be adding packets every few days or so, starting with the most recent stuff on the HSQB archive and working my way back in time.

Hopefully this proves useful and/or enjoyable to people.
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
code ape, loud voice, general nuisance

Adventure Temple Trail
Auron
Posts: 2631
Joined: Tue Jul 15, 2008 9:52 pm

Re: QBDB 3: QBDB with a Vengeance

Post by Adventure Temple Trail » Wed May 06, 2015 3:31 pm

This looks like it will be a great thing.

I am curious if you, Aseem (AseemsDB), and Jacob (Quinterest) have any plans to coordinate efforts for the future / put your heads together to ensure that your separate projects are mutually beneficial, rather than reduplicating effort to accomplish the same results. (EDIT: It seems like AseemsDB is distinct enough in purpose and setup that it need not be subsumed into this thing, at least until this thing catches up in terms of packets stored. Curious what Aseem himself thinks, though.)
Matt J.
ex-Georgetown Day HS, ex-Yale
member emeritus, ACF

Sailing away on my copper boat

grapesmoker
Sin
Posts: 6360
Joined: Sat Oct 25, 2003 5:23 pm
Location: Pittsburgh, PA

Re: QBDB 3: QBDB with a Vengeance

Post by grapesmoker » Wed May 06, 2015 3:35 pm

Matthew J wrote:This looks like it will be a great thing.

I am curious if you, Aseem (AseemsDB), and Jacob (Quinterest) have any plans to coordinate efforts for the future / put your heads together to ensure that your separate projects are mutually beneficial, rather than reduplicating effort to accomplish the same results. (EDIT: It seems like AseemsDB is distinct enough in purpose and setup that it need not be subsumed into this thing, at least until this thing catches up in terms of packets stored. Curious what Aseem himself thinks, though.)
I don't know Aseem, but I've emailed with Jacob; I'd like to combine our efforts, and I think it should be a fairly straightforward undertaking. I checked out Aseem's db and it works on a somewhat different principle from mine, in that it indexes files directly. This makes it possible to search the entirety of HSQB without worrying about imports, but the problem is that it returns pretty much everything that matches with not much room for granularity.
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
code ape, loud voice, general nuisance

Mewto55555
Forums Staff: Administrator
Posts: 709
Joined: Sat Mar 13, 2010 9:27 pm

Re: QBDB 3: QBDB with a Vengeance

Post by Mewto55555 » Wed May 06, 2015 3:48 pm

This looks great! One small thing: when you search an answerline, it returns all the questions, with the name of the packet they were in, but not the tournament (e.g. it just says "Editors 1" instead of "Editors 1 -- Penn Bowl" or something). It'd be great if it could include the tournament name as well.
Max
formerly of Ladue, Chicago

grapesmoker
Sin
Posts: 6360
Joined: Sat Oct 25, 2003 5:23 pm
Location: Pittsburgh, PA

Re: QBDB 3: QBDB with a Vengeance

Post by grapesmoker » Wed May 06, 2015 3:57 pm

Mewto55555 wrote:This looks great! One small thing: when you search an answerline, it returns all the questions, with the name of the packet they were in, but not the tournament (e.g. it just says "Editors 1" instead of "Editors 1 -- Penn Bowl" or something). It'd be great if it could include the tournament name as well.
Yeah, I meant to add that but just forgot. It'll be fixed in the near future.

edit: fixed now. There will probably be some kind of sorting option added in the future.
Last edited by grapesmoker on Wed May 06, 2015 4:49 pm, edited 1 time in total.
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
code ape, loud voice, general nuisance

aseem.keyal
Lulu
Posts: 58
Joined: Sun Jan 13, 2013 2:01 pm

Re: QBDB 3: QBDB with a Vengeance

Post by aseem.keyal » Wed May 06, 2015 4:17 pm

grapesmoker wrote: I checked out Aseem's db and it works on a somewhat different principle from mine, in that it indexes files directly. This makes it possible to search the entirety of HSQB without worrying about imports, but the problem is that it returns pretty much everything that matches with not much room for granularity.
Yeah, this is exactly the difference. Unfortunately, until every set has been parsed and added to this database or Quinterest, there's gonna be a trade off between convenience/flexibility and the number of packets available to search. I'm also not too worried about reduplicating effort because my work on the database these days consists mostly of adding a set every now and then (which takes about 5 minutes). However, this database looks really exciting, specifically the potential difficulty/quality ratings for packets and the ability to have questions read to you.
Aseem Keyal
Westview HS 2014
UC Berkeley 2018

grapesmoker
Sin
Posts: 6360
Joined: Sat Oct 25, 2003 5:23 pm
Location: Pittsburgh, PA

Re: QBDB 3: QBDB with a Vengeance

Post by grapesmoker » Wed May 06, 2015 4:32 pm

aseem.keyal wrote:
grapesmoker wrote: I checked out Aseem's db and it works on a somewhat different principle from mine, in that it indexes files directly. This makes it possible to search the entirety of HSQB without worrying about imports, but the problem is that it returns pretty much everything that matches with not much room for granularity.
Yeah, this is exactly the difference. Unfortunately, until every set has been parsed and added to this database or Quinterest, there's gonna be a trade off between convenience/flexibility and the number of packets available to search. I'm also not too worried about reduplicating effort because my work on the database these days consists mostly of adding a set every now and then (which takes about 5 minutes). However, this database looks really exciting, specifically the potential difficulty/quality ratings for packets and the ability to have questions read to you.
The difficulty/quality stuff is going to require an implementation of an authentication system, which might take a bit of time.
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
code ape, loud voice, general nuisance

njsbling
Rikku
Posts: 314
Joined: Mon Jul 30, 2012 8:17 pm
Location: Carmichael, California

Re: QBDB 3: QBDB with a Vengeance

Post by njsbling » Wed May 06, 2015 9:57 pm

So what is the difference between this and Quinterest?
Nicholas Karas
Member, Northern California Quiz Bowl Alliance
Outreach Coordinator, National Academic Quiz Tournaments

grapesmoker
Sin
Posts: 6360
Joined: Sat Oct 25, 2003 5:23 pm
Location: Pittsburgh, PA

Re: QBDB 3: QBDB with a Vengeance

Post by grapesmoker » Wed May 06, 2015 10:49 pm

njsbling wrote:So what is the difference between this and Quinterest?
There are two major differences. The first is the import process. In Quinterest, this has to be carried out manually by cutting and pasting; I have a whole pipeline which does it about as near-automatically as it can be done. That's not a database feature per se, but it does make getting stuff into the db pretty easy; adding a new tournament takes, at most, about 10 minutes, and in good cases I've done it in about 2. It requires a minimal amount of preprocessing, mostly deleting extraneous non-question content from the packet. The other major difference is extensibility. Because of the way Quinterest is written (again, hand-coded PHP and SQL queries) any feature change is going to be very hard to implement. Since I'm leveraging established back- and front-end frameworks, adding new features in a consistent way is relatively easy. For example, I'm currently working on that "Read Me a Question" feature, and it's a pretty straightforward process that doesn't break anything else that works. It's not impossible to extend Quinterest, but it's a lot more work.

Those are the practical differences, and there are technical differences as well that most people probably don't care about. There may not be that much difference from the user side if all you care about is searching for questions. I want to make it easy for you to do that and other stuff as well, and I think this particular approach will allow me to do just that and give you a better product in the end.
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
code ape, loud voice, general nuisance

UlyssesInvictus
Tidus
Posts: 698
Joined: Thu Feb 10, 2011 7:38 pm

Re: QBDB 3: QBDB with a Vengeance

Post by UlyssesInvictus » Wed May 06, 2015 11:03 pm

grapesmoker wrote:The first is the import process. In Quinterest, this has to be carried out manually by cutting and pasting; I have a whole pipeline which does it about as near-automatically as it can be done. That's not a database feature per se, but it does make getting stuff into the db pretty easy; adding a new tournament takes, at most, about 10 minutes, and in good cases I've done it in about 2.
What do you think the feasibility for splitting this feature off is? Like as a python package that people can easily import to just say "get questions from doc" or something like that. I'm sure many other programs would appreciate being able to use this as a first step--for example, it would make training that machine learning program in the other thread much easier.

I took a look at your git, and I haven't really had time to parse the code, but I see that you're using an argparser--does this mean you already have it so that you can run this from command line, or is it only really for use in conjunction with the django right now?
Raynor Kuang
quizdb.org
Harvard 2017, TJHSST 2013
I wrote GRAPHIC and FILM

grapesmoker
Sin
Posts: 6360
Joined: Sat Oct 25, 2003 5:23 pm
Location: Pittsburgh, PA

Re: QBDB 3: QBDB with a Vengeance

Post by grapesmoker » Wed May 06, 2015 11:15 pm

UlyssesInvictus wrote:
grapesmoker wrote:The first is the import process. In Quinterest, this has to be carried out manually by cutting and pasting; I have a whole pipeline which does it about as near-automatically as it can be done. That's not a database feature per se, but it does make getting stuff into the db pretty easy; adding a new tournament takes, at most, about 10 minutes, and in good cases I've done it in about 2.
What do you think the feasibility for splitting this feature off is? Like as a python package that people can easily import to just say "get questions from doc" or something like that. I'm sure many other programs would appreciate being able to use this as a first step--for example, it would make training that machine learning program in the other thread much easier.

I took a look at your git, and I haven't really had time to parse the code, but I see that you're using an argparser--does this mean you already have it so that you can run this from command line, or is it only really for use in conjunction with the django right now?
That's right, it runs from the command line. The Github is linked to above, so anyone is welcome to fork/clone it and play around with it. It's not terribly user-friendly, but it will tell you when something breaks; at the end, it produces a single JSON file that contains the entire tournament data, and that file is then imported into the db by sending it to an import endpoint in the Django app. It's not organized as a separate module right now, but that's easily solvable. I'm quite interested in using the ML stuff for categorization so I'd definitely be happy for someone to pull whatever data I have and use it for that purpose.
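
The last step of that pipeline (one JSON file per tournament, sent to an import endpoint) could be sketched roughly as follows, using only the Python standard library. The endpoint URL and the function names are hypothetical, since the thread does not give them, and a real Django endpoint would likely also require authentication or a CSRF token that this sketch leaves out.

```python
import json
import urllib.request

# Hypothetical endpoint; the real URL of QBDB's import view is not given in this thread.
IMPORT_URL = "http://localhost:8000/import/"


def build_import_request(json_path, url=IMPORT_URL):
    """Read the parser's JSON output and wrap it in a POST request."""
    with open(json_path, encoding="utf-8") as f:
        tournament = json.load(f)  # fails loudly if the file is not valid JSON
    body = json.dumps(tournament).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

Building the request separately from sending it makes it possible to inspect the payload before anything touches the database.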
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
code ape, loud voice, general nuisance

grapesmoker
Sin
Posts: 6360
Joined: Sat Oct 25, 2003 5:23 pm
Location: Pittsburgh, PA

Re: QBDB 3: QBDB with a Vengeance

Post by grapesmoker » Mon May 11, 2015 6:15 pm

There's now an account system which allows you to register and rate tournaments, packets, and questions according to both difficulty and quality.
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
code ape, loud voice, general nuisance

grapesmoker
Sin
Posts: 6360
Joined: Sat Oct 25, 2003 5:23 pm
Location: Pittsburgh, PA

Re: QBDB 3: QBDB with a Vengeance

Post by grapesmoker » Wed Apr 05, 2017 4:28 pm

Hey all,

So QBDB has lain fallow for some time, mostly because I've had many other things on my plate. Recently I've had occasion to revisit old projects and try to bring them up to date. I've moved towards a more modern frontend build system with grunt and browserify, and I reworked the backend to do search better. A recent server upgrade by my host has allowed me to install elasticsearch, which does a much better job than my old backend did; as a result, you can finally search based on any combination of tossup, bonus, question text, and answer text. I've also made substantial improvements to my packet parser, which has enabled me to parse tournaments much faster than I used to be able to. As a consequence, I was able to pretty quickly go through most of the tournaments that were available in Word format over the last two years and add them to the list. There are currently 26 tournaments available to be fully browsed and searched, and I plan to add more in the coming weeks. I also plan to have the reader functioning in the near future, so that you can tell the site to read you a question.
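
For the curious, a search over "any combination of tossup, bonus, question text, and answer text" maps naturally onto an Elasticsearch `bool` query: a `multi_match` over the chosen text fields plus a `terms` filter on the question type. The sketch below only builds the query body as a plain dict. The field names `text`, `answer`, and `kind` are assumptions (with `kind` assumed to be indexed as a keyword), and this is not necessarily how QBDB constructs its queries.

```python
def build_search_body(term, kinds=("tossup", "bonus"), fields=("text", "answer")):
    """Build an Elasticsearch query body as a plain dict.

    `kinds` selects tossups, bonuses, or both; `fields` selects question
    text, answer text, or both. Field names are hypothetical.
    """
    return {
        "query": {
            "bool": {
                # Full-text match against whichever text fields were chosen.
                "must": [{"multi_match": {"query": term, "fields": list(fields)}}],
                # Exact-value filter on the question type; does not affect scoring.
                "filter": [{"terms": {"kind": list(kinds)}}],
            }
        }
    }
```

For example, `build_search_body("Joyce", kinds=("tossup",), fields=("answer",))` would describe a search restricted to tossup answer lines.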

Unfortunately the change in schema has made preserving the record of voting somewhat difficult. I still have the old database, so I can probably restore it from there if people care deeply; I don't think this feature was getting much use. Still, if people want me to, I'll try and bring back the old userbase. For now, registration and login are disabled but they'll come back.

Anyway, I hope this work proves useful to the community. More tournaments will be added going back through time, although I plan to focus exclusively on collegiate and open sets. If you're interested in helping with this effort, feel free to get in touch with me; there's always more work to be done.
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
code ape, loud voice, general nuisance

grapesmoker
Sin
Posts: 6360
Joined: Sat Oct 25, 2003 5:23 pm
Location: Pittsburgh, PA

Re: QBDB 3: QBDB with a Vengeance

Post by grapesmoker » Fri Apr 07, 2017 4:01 pm

I'm happy to announce that QBDB now sports a mostly-feature-complete reader of questions. You can click on "Read me a question," select your option, and go wild. There's an explanation of how it works in the FAQ (click "About QBDB" in the upper-left-hand part of the header). Enjoy.
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
code ape, loud voice, general nuisance

dhumphreys17
Wakka
Posts: 145
Joined: Sat May 23, 2015 3:16 pm

Re: QBDB 3: QBDB with a Vengeance

Post by dhumphreys17 » Fri Apr 07, 2017 7:37 pm

grapesmoker wrote:I'm happy to announce that QBDB now sports a mostly-feature-complete reader of questions. You can click on "Read me a question," select your option, and go wild. There's an explanation of how it works in the FAQ (click "About QBDB" in the upper-left-hand part of the header). Enjoy.
Question on reading speed: is there any way to speed it up? The 1-5 options seem to be (in this order) "slow", "also slow", "also also slow", "quite slow", and "very slow".
Devin James John Bartholomew Humphreys
Team Captain, Sacred Heart Academy High School (MI), Class of 2017
Michigan State University, anticipated Class of 2020

grapesmoker
Sin
Posts: 6360
Joined: Sat Oct 25, 2003 5:23 pm
Location: Pittsburgh, PA

Re: QBDB 3: QBDB with a Vengeance

Post by grapesmoker » Sat Apr 08, 2017 1:07 am

dhumphreys17 wrote:
grapesmoker wrote:I'm happy to announce that QBDB now sports a mostly-feature-complete reader of questions. You can click on "Read me a question," select your option, and go wild. There's an explanation of how it works in the FAQ (click "About QBDB" in the upper-left-hand part of the header). Enjoy.
Question on reading speed: is there any way to speed it up? The 1-5 options seem to be (in this order) "slow", "also slow", "also also slow", "quite slow", and "very slow".
Sorry, I might have bungled the math on that one, not to mention the logic of the speed change. Try it now and see if you like it better. It should be quite brisk at top speed.
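
For anyone wondering what the speed math might look like, here is one possible mapping from the 1-5 setting to a per-word delay, written as a Python sketch. The words-per-minute range and the function names are made up for illustration, and this is not the site's actual code; the point is only that the delay has to shrink as the level goes up.

```python
# Hypothetical range; the site's real speeds are not stated in this thread.
SLOWEST_WPM = 150
FASTEST_WPM = 450
LEVELS = 5


def words_per_minute(level):
    """Interpolate linearly from SLOWEST_WPM (level 1) to FASTEST_WPM (level 5)."""
    if not 1 <= level <= LEVELS:
        raise ValueError(f"level must be between 1 and {LEVELS}")
    step = (FASTEST_WPM - SLOWEST_WPM) / (LEVELS - 1)
    return SLOWEST_WPM + step * (level - 1)


def word_delay(level):
    """Seconds to wait before revealing the next word at this level."""
    return 60.0 / words_per_minute(level)
```

Converting to a delay by dividing, rather than scaling the delay directly with the level, is what keeps a higher setting from reading more slowly.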
Jerry Vinokurov
ex-LJHS, ex-Berkeley, ex-Brown, sorta-ex-CMU
code ape, loud voice, general nuisance
