D-Value Question

prodski
Lulu
Posts: 47
Joined: Sun Jan 22, 2012 10:46 am

D-Value Question

Post by prodski » Mon Feb 03, 2014 1:20 pm

After taking a look last week at how the D-value is compiled, I have a question: why is the strength of schedule (SOS) applied to a team's PPB value? A team could theoretically end up with a D-value of 0, no matter how strong they are. Suppose only Yale and Jefferson C end up in a DI sectional in New Haven (a late snowstorm prevents other teams from attending and we are on a mandated New England trip with a grad student). Yale beats us 10 times in a row 600-0, averaging 20 PPB. Yale's D-value for the sectional would be 0, no matter how many points they put up, and even if that was the highest PPB value in the nation: since the TPPTH (opponent) would be 0, Yale's SOS would be 0. This doesn't seem right to me. After plugging in some numbers, if Yale "let" Jefferson answer one question right each round (and we subsequently miss every bonus), Yale's D-value would go up from 0 to about 33.8 (assuming a TPPTH average of 10 across the country). So it would be better for Yale to let us score. Actually, even if I brought 10 teams up to New Haven that couldn't answer a question in the tournament, Yale would still end up with a D-value of 0, since Yale's opp points per tossup would be 0 within the sectional. Does the weak sectional need to be factored? Certainly. But should Yale go down to 0?
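
The zero-out is easy to reproduce with a minimal sketch. It assumes the simplified form of the formula posted later in this thread, D = 20 x (TPPTH * SOS + BHPTH * SOS * PPB) with SOS = (opponents' average TPPTH) / (national average TPPTH), plus 20 tossups per game, 10-point tossups with no powers, and no division correction factors. The function name `d_value` and its parameters are made up for illustration; this is not NAQT's actual calculation.

```python
def d_value(tppth, bhpth, ppb, opp_tppth, national_tppth=10.0):
    """Simplified D-value: 20 * (TPPTH*SOS + BHPTH*SOS*PPB). Illustrative only."""
    sos = opp_tppth / national_tppth  # strength of schedule (assumed definition)
    return 20 * (tppth * sos + bhpth * sos * ppb)


# Shutout: Yale answers 20 of 20 tossups at 20 PPB; the opponent scores nothing.
shutout = d_value(tppth=10.0, bhpth=1.0, ppb=20.0, opp_tppth=0.0)

# Yale "lets" the opponent answer one tossup per game (19 of 20 for Yale).
one_conceded = d_value(tppth=9.5, bhpth=0.95, ppb=20.0, opp_tppth=0.5)

print(shutout)       # zero, whatever Yale's own numbers are
print(one_conceded)  # positive, so conceding a tossup raises the D-value
```

Under these assumptions the second figure need not match the 33.8 quoted above, which depends on inputs the post does not spell out.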

I am not advocating for any drastic change to the D-value - I think it is a nice statistic, and SOS should be evaluated to compare teams across different regions. It seems to me that as long as there is a 2nd decent team in a sectional, the problem is eliminated, since that team will put up points and increase the TPPTH (opp) for the other team within that sectional. I think a limiting value of some type would be more appropriate - in the case of Jefferson vs. Yale, shouldn't there be a floor that Yale's D-value cannot go below, say the PPB x the average # of questions answered by an average team?

Before folks jump on me for Yale never playing Jefferson alone on DI questions, a statistic should work for ALL possible cases. This is obviously not a problem in established regions (like Florida or Alabama CC's, or anywhere east of the Midwest for 4-year schools). But in developing regions, this certainly could happen - especially with smaller fields and one decent team in the region. It almost happened at our sectional last weekend - had opponents of our A team missed a few more questions, the A team could have missed the CCCT field despite a really strong PPB (3rd overall).

Any input is appreciated, I may be missing something obvious, and this question may have been asked/answered when the D-value was initiated. My apologies to Yale for bringing them into the discussion, it was the best team I could find with the least number of letters to type.
Peter Rodski
Professor of Mathematics
CCCT Champions 2016
Jefferson Community & Technical College
Louisville, KY. 40272

at your pleasure
Auron
Posts: 1670
Joined: Sun Aug 03, 2008 7:56 pm

Re: D-Value Question

Post by at your pleasure » Mon Feb 03, 2014 2:46 pm

Combined Fields
If the divisions at a Sectional compete together (as a result of one or both divisions not having at least four teams), then an automatic invitation will also be given to the overall winner if there are at least four total teams at the Sectional.
So from the plain and unadorned reading of these rules, Yale would still get in because they earned an autobid by (presumably, if they play a perfect game every game, which is what you're postulating) winning the sectional. That puts them outside the purview of something like D-value: the point of calculating D-value is to figure out which of the teams that didn't win their sectional (that is, "decent second teams" as you put it) should be given bids. In which case we are well out of the realm of plausible perfect games and opponents who are buzzer rocks.
Douglas Graebner, Walt Whitman HS 10, Uchicago 14
"... imagination acts upon man as really as does gravitation, and may kill him as certainly as a dose of prussic acid."-Sir James Frazer,The Golden Bough

http://avorticistking.wordpress.com/

Tees-Exe Line
Tidus
Posts: 622
Joined: Mon Apr 12, 2010 5:02 pm

Re: D-Value Question

Post by Tees-Exe Line » Mon Feb 03, 2014 3:18 pm

prodski wrote:Before folks jump on me for Yale never playing Jefferson alone on DI questions, a statistic should work for ALL possible cases.
This is a puzzling statement, especially coming from a math professor. The whole point of statistics is to come up with a measure that reflects empirical reality, and it's in the nature of the project that the better a measure is at making distinctions at a fine level, the worse it will perform when reality produces radically divergent outcomes. Doug has already pointed out that your scenario is handled appropriately by the existing system. The point of the D-Value is to differentiate teams with a plausible claim to an ICT slot who probably didn't play one another.
Marshall I. Steinbaum

Oxford University (2002-2005)
University of Chicago (2008-2014)

Get in the elevator.

prodski
Lulu
Posts: 47
Joined: Sun Jan 22, 2012 10:46 am

Re: D-Value Question

Post by prodski » Mon Feb 03, 2014 3:21 pm

Point taken if there were 4 teams there. But wouldn't there still be a problem if there are only 3 teams with no qualifier? One only has to look at the upcoming small field in Arizona, or consider the potential bad weather for sectionals this weekend in the Midwest. I guess I just fundamentally have a problem with a team's D-value able to go to zero no matter how strong they are. And 4-year sectionals have the ability to combine fields, something CC's don't have the ability to do. If a new 4 year school wants to start in Colorado, but only can find one other team within 10 hours for their sectional, couldn't the potential problem arise? Or are they just left out of competing and told not to host? But thanks for the reply, although I wouldn't call my new players "buzzer rocks". They just weren't as quick to the buzzer as Yale, and my best player was late due to some unrelated issues in southern Conn I'd rather not discuss.
Peter Rodski
Professor of Mathematics
CCCT Champions 2016
Jefferson Community & Technical College
Louisville, KY. 40272

prodski
Lulu
Posts: 47
Joined: Sun Jan 22, 2012 10:46 am

Re: D-Value Question

Post by prodski » Mon Feb 03, 2014 3:38 pm

Tees-Exe Line wrote:
prodski wrote:Before folks jump on me for Yale never playing Jefferson alone on DI questions, a statistic should work for ALL possible cases.
This is a puzzling statement, especially coming from a math professor. The whole point of statistics is to come up with a measure that reflects empirical reality, and it's in the nature of the project that the better a measure is at making distinctions at a fine level, the worse it will perform when reality produces radically divergent outcomes. Doug has already pointed out that your scenario is handled appropriately by the existing system. The point of the D-Value is to differentiate teams with a plausible claim to an ICT slot who probably didn't play one another.
I know the statistic works for established fields, I think it is a great statistic, and I'm certainly not trying to overhaul it. I can however see a potential problem for smaller/weaker fields. I wouldn't advocate throwing out the statistic. My initial question is why PPB has a factor of sos attached to it? It is a pretty simple question. To me PPB is the best indicator of a team's strength, but has nothing to do with the strength of their sectional.
Peter Rodski
Professor of Mathematics
CCCT Champions 2016
Jefferson Community & Technical College
Louisville, KY. 40272

t-bar
Tidus
Posts: 633
Joined: Sun Jan 25, 2009 4:12 pm
Location: Cambridge, MA

Re: D-Value Question

Post by t-bar » Mon Feb 03, 2014 3:51 pm

prodski wrote:My initial question is why PPB has a factor of sos attached to it?
Because, as a factor in the D-value expression, it doesn't stand on its own--it's included in order to measure the expected total number of bonus points in a game, in which capacity it's multiplied by the number of bonuses heard. When you sweep away all the adjustments and multiplicative factors, D-value is effectively just a corrected version of PPG, and most of the points that teams earn are bonus points. The SOS correction is because if you were playing stronger or weaker opponents, you would expect to hear fewer or more bonuses, respectively. Raw, uncorrected PPB can also be a useful statistic on its own to compare teams, but that's not its purpose in the D-value formula.
Stephen Eltinge
TJHSST 2011 | MIT 2015 | Yale 20??
ACF member | PACE member | NAQT writer

prodski
Lulu
Posts: 47
Joined: Sun Jan 22, 2012 10:46 am

Re: D-Value Question

Post by prodski » Mon Feb 03, 2014 4:46 pm

Thanks Stephen, I think I understand that part of it, and certainly understand that you would get more bonus against a weaker field. I guess I would be more comfortable (I think) if that part of the formula incorporated PPB x average number of questions answered by the entire field, instead of the sectional itself - which is what you are saying that it in fact does. It looks like it tries to do that, by incorporating tppth avg, but not completely - the case of Jefferson vs. Yale exemplifying this - by calculating opponents ppth. It seems to me also that there is a point where you could exceed the maximum value - and letting the opponent score 10 is worth more than your team scoring 10. Of course, you wouldn't know this value without knowing the rest of the entire potential field. I'll go deeper, since I think you understand it - why are opponent's points (at your sectional) figured into the PPB part of the formula? Wouldn't a better indicator be to take the ppbh x tppth avg without the sos? This would in fact do what you are saying it does - compute the expected total number of bonus points a team would get. I'm still hung up on the limit of that D-value going to zero based on opponent's ppth - to me it shouldn't.
Peter Rodski
Professor of Mathematics
CCCT Champions 2016
Jefferson Community & Technical College
Louisville, KY. 40272

jonpin
Forums Staff: Moderator
Posts: 2035
Joined: Wed Feb 04, 2004 6:45 pm
Location: BCA NJ / WUSTL MO / Hackensack NJ

Re: D-Value Question

Post by jonpin » Mon Feb 03, 2014 6:55 pm

prodski wrote:Thanks Stephen, I think I understand that part of it, and certainly understand that you would get more bonus against a weaker field. I guess I would be more comfortable (I think) if that part of the formula incorporated PPB x average number of questions answered by the entire field, instead of the sectional itself - which is what you are saying that it in fact does. It looks like it tries to do that, by incorporating tppth avg, but not completely - the case of Jefferson vs. Yale exemplifying this - by calculating opponents ppth. It seems to me also that there is a point where you could exceed the maximum value - and letting the opponent score 10 is worth more than your team scoring 10. Of course, you wouldn't know this value without knowing the rest of the entire potential field. I'll go deeper, since I think you understand it - why are opponent's points (at your sectional) figured into the PPB part of the formula? Wouldn't a better indicator be to take the ppbh x tppth avg without the sos? This would in fact do what you are saying it does - compute the expected total number of bonus points a team would get. I'm still hung up on the limit of that D-value going to zero based on opponent's ppth - to me it shouldn't.
The SOS adjustment in the bonus portion of the calculation is not meant to adjust how many PPB the team gets, but how many bonus opportunities they get. If, on average, you answer 80% of all tossups, but you're playing a schedule whose strength is 70% of the nationwide average, a back-of-the-envelope calculation is that against the hypothetical average team you answer 80% * 0.7 = 56% of all tossups and thus get 56% * 20 = 11.2 bonuses.
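
The back-of-the-envelope arithmetic above can be sketched in a few lines. The variable names are invented for illustration, and the 20-tossup game length is the figure used in the post.

```python
TOSSUPS_PER_GAME = 20  # game length used in the post's arithmetic

tossup_rate = 0.80  # share of tossups answered against the actual schedule
sos = 0.70          # schedule strength relative to the nationwide average

# Expected share of tossups answered against a hypothetical average team.
adjusted_rate = tossup_rate * sos

# Each tossup answered earns one bonus opportunity.
expected_bonuses = adjusted_rate * TOSSUPS_PER_GAME

print(round(adjusted_rate, 2), round(expected_bonuses, 1))
```

Note that PPB never enters this step; only the number of bonuses heard is scaled.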
prodski wrote:Actually, even if I brought 10 teams up to New Haven that couldn't answer a question in the tournament, Yale would still end up with a D-value of 0, since Yale's opp points per tossup would be 0 within the sectional. Does the weak sectional need to be factored? Certainly. But should Yale go down to 0?
prodski wrote:Point taken if there were 4 teams there. But wouldn't there still be a problem if there are only 3 teams with no qualifier?
Only if the other two teams perpetually tied each other 0-0.

You've made a case for the SOS calculation to maybe not be straight-proportional, but there is a sound reason why it exists in both halves of the formula.
Jon Pinyan
Coach, Bergen County Academies (NJ); former player for BCA (2000-03) and WUSTL (2003-07)
HSQB forum mod, PACE member
Stat director for: NSC '13-'15, '17; ACF '14, '17, '19; NHBB '13-'15; NASAT '11

"A [...] wizard who controls the weather" - Jerry Vinokurov

prodski
Lulu
Posts: 47
Joined: Sun Jan 22, 2012 10:46 am

Re: D-Value Question

Post by prodski » Mon Feb 03, 2014 9:06 pm

I appreciate the response Jon, and I understand what you are saying when it comes to tossups. I also understand that the ppb value needs to be compared to an average team to get the hypothetical bonus opportunities against the entire field - I actually mentioned that a few times. What I don't understand is why your team's ppb is first hit with an sos value of the teams in your region. To me, it seems like the bonus part of the d-value should be independent of who is in your region - though not independent of all the teams in the field it is compared to. It is almost like there are two SOSs. There is the final SOS, which includes all teams, and the mid statistic which calculates ppth opponent, based on your sectional. I'm all for the local SOS factor being applied to the tossup value, since a team in a weak region would get fewer chances to answer against the entire field. But the PPB value I wouldn't think needs to be hit by the local sos, but rather by the final SOS including the entire field to adjust the value. If this were the case, Yale's D-value wouldn't go to 0 - since an average team wouldn't get 0 bonus opportunities. I guess I'm not convinced that what you are saying above is actually happening, because of the ppth-opponent value. To me it seems like your raw score is just adjusted in one fell swoop, both values (toss-up and bonus) being adjusted equally based on the strength of your region. I guess I'm not seeing why the PPTH - opponent is applied to the bonus part of the d-value - rather than comparing it to the SOS of the entire field and then adjusting it at that point. I'm probably not communicating this very well - I apologize.
Peter Rodski
Professor of Mathematics
CCCT Champions 2016
Jefferson Community & Technical College
Louisville, KY. 40272

prodski
Lulu
Posts: 47
Joined: Sun Jan 22, 2012 10:46 am

Re: D-Value Question

Post by prodski » Mon Feb 03, 2014 11:02 pm

I think I am confusing myself, so I am probably confusing everyone as well. Let me go back to the 2 team fields. Suppose Yale beats Jefferson C 600-10 all 10 rounds - we'll get off the 0. Yale's ppb is 20. Suppose in another field Jefferson A beats the same C team 20-10 in all 10 rounds. My A team ends with a ppb of 0. I believe both Yale and Jefferson A would have d-values very close to one another because of the terrible C team, irrespective of their PPB (clearly dominated by Yale). What would fluctuate is Jeff C's D-value, even though they are not changing. It would increase in the Yale field, decrease in the Jeff field, even if the PPB for the C team and everything else were the same. Shouldn't the C team's D-value be the same in either case? Or maybe it shouldn't? I'm probably over-thinking this. I suppose I could live with "it doesn't necessarily work for 2 team fields because of the variation in one team". Would that be correct? But if you threw a D, E, and F team in there you would have the same issue, no?
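
A rough sketch of this two-field comparison, assuming the simplified formula D = 20 x (TPPTH * SOS + BHPTH * SOS * PPB), a national average TPPTH of 10, 20 tossups per game, and 10-point tossups with no powers. The per-game lines (Yale answering 19 tossups, Jefferson A answering 2, Jefferson C answering 1 and missing its bonus) are assumptions chosen to approximate the scores above, and `d_value` is a made-up helper, not NAQT's calculation.

```python
NATIONAL_TPPTH = 10.0  # assumed national average
TOSSUPS_HEARD = 20     # assumed tossups per game


def d_value(tossups_answered, ppb, opp_tossups_answered):
    """Simplified D-value for a two-team field with 10-point tossups only."""
    tppth = 10 * tossups_answered / TOSSUPS_HEARD
    bhpth = tossups_answered / TOSSUPS_HEARD
    opp_tppth = 10 * opp_tossups_answered / TOSSUPS_HEARD
    sos = opp_tppth / NATIONAL_TPPTH
    return 20 * (tppth * sos + bhpth * sos * ppb)


# Field 1: Yale (19 tossups, 20 PPB) vs. Jefferson C (1 tossup, 0 PPB)
yale = d_value(19, 20.0, opp_tossups_answered=1)
c_vs_yale = d_value(1, 0.0, opp_tossups_answered=19)

# Field 2: Jefferson A (2 tossups, 0 PPB) vs. Jefferson C (1 tossup, 0 PPB)
jeff_a = d_value(2, 0.0, opp_tossups_answered=1)
c_vs_jeff_a = d_value(1, 0.0, opp_tossups_answered=2)

print(yale, jeff_a)            # the two winners
print(c_vs_yale, c_vs_jeff_a)  # the same C team in each field
```

Jefferson C's own line is identical in both fields, yet its D-value comes out higher in the Yale field, because the only input that changes is the opponent's tossup points.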
Peter Rodski
Professor of Mathematics
CCCT Champions 2016
Jefferson Community & Technical College
Louisville, KY. 40272

jonpin
Forums Staff: Moderator
Posts: 2035
Joined: Wed Feb 04, 2004 6:45 pm
Location: BCA NJ / WUSTL MO / Hackensack NJ

Re: D-Value Question

Post by jonpin » Mon Feb 03, 2014 11:44 pm

I'm afraid I no longer follow what you are trying to say. This is the kind of conversation it's much easier for me to engage in with a piece of paper rather than a keyboard, but here goes:
D value = 20 x ( Adjusted TPPTH + Adjusted BHPTH x Adjusted PPB) = 20 x ( TPPTH * SOS * DCT + [BHPTH * SOS * DCT] * [PPB * DCB] ).
If we disregard the two sets of questions (that is, we're assuming all teams are playing on correct questions), that simplifies as:
... = 20 x ( TPPTH * SOS * DCT + [BHPTH * SOS * DCT] * [PPB * DCB] ) = 20 x (TPPTH * SOS + BHPTH * SOS * PPB).
The blue term (TPPTH * SOS) represents the number of tossup points: the actual tossup points adjusted for the fact that easier opponents make it easier to score more points per tossup.
The green term (BHPTH * SOS) represents what I refer to as "tossup percentage": the percent of tossups answered, which is equal to the number of bonus opportunities. This also must be adjusted for the fact that easier opponents make it easier to answer more tossups correctly, and thus get more bonus opportunities.
The red term (PPB) represents points per bonus. This is not adjusted for schedule strength. The bonus term as a whole is adjusted in the green portion.
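
For anyone who would rather read this as code, here is a minimal sketch of the formula as written above. The names `TeamStats`, `strength_of_schedule`, and `d_value` are invented for illustration, the correction factors DCT and DCB default to 1 (everyone on the same question set), and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TeamStats:
    tppth: float  # tossup points per tossup heard
    bhpth: float  # bonuses heard per tossup heard
    ppb: float    # points per bonus


def strength_of_schedule(opp_tppth, national_tppth):
    # SOS as described in this thread: opponents' average TPPTH
    # divided by the national average TPPTH.
    return opp_tppth / national_tppth


def d_value(team, sos, dct=1.0, dcb=1.0):
    tossup_points = team.tppth * sos * dct        # tossup points (blue)
    bonus_opportunities = team.bhpth * sos * dct  # tossup percentage (green)
    adjusted_ppb = team.ppb * dcb                 # points per bonus (red), no SOS
    return 20 * (tossup_points + bonus_opportunities * adjusted_ppb)


# Hypothetical team and schedule, for illustration only.
team = TeamStats(tppth=8.0, bhpth=0.8, ppb=15.0)
sos = strength_of_schedule(opp_tppth=7.0, national_tppth=10.0)

adjusted = d_value(team, sos)
unadjusted = d_value(team, sos=1.0)
print(adjusted, unadjusted)
```

With both correction factors at 1, SOS factors out of the whole expression, so the adjusted value is just SOS times the unadjusted one. PPB is never scaled directly, but the bonus term still goes to zero whenever SOS does.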

It is reasonable to say that the SOS should be calculated differently from (opponents' average TPPTH) / (national average TPPTH); and it's reasonable to say that the two appearances of SOS in the formula maybe shouldn't be the same thing. Are you saying one of those two things, or something else entirely?
Jon Pinyan
Coach, Bergen County Academies (NJ); former player for BCA (2000-03) and WUSTL (2003-07)
HSQB forum mod, PACE member
Stat director for: NSC '13-'15, '17; ACF '14, '17, '19; NHBB '13-'15; NASAT '11

"A [...] wizard who controls the weather" - Jerry Vinokurov

Periplus of the Erythraean Sea
Auron
Posts: 2039
Joined: Mon Feb 28, 2011 11:53 pm
Location: New York, NY

Re: D-Value Question

Post by Periplus of the Erythraean Sea » Tue Feb 04, 2014 12:23 am

EDIT: Ignore my commentary.
Last edited by Periplus of the Erythraean Sea on Sun Feb 09, 2014 9:54 pm, edited 1 time in total.
Will Alston
Bethesda Chevy Chase HS '12, Dartmouth '16, Columbia Business School '21
NAQT Writer and Subject Editor

prodski
Lulu
Posts: 47
Joined: Sun Jan 22, 2012 10:46 am

Re: D-Value Question

Post by prodski » Tue Feb 04, 2014 12:44 am

Excellent Jon - I appreciate the color-coded formula. It is the 2nd SOS (in green) that I was having an issue with or was not understanding. I thought teams were getting hit twice, first when calculating the SOS from their sectional, then again when comparing it to the overall SOS. But they are, in fact, only getting hit once on it. I still feel like something isn't right though. If two teams in 2 different sectionals have two really different PPBs (say 10 and 30), then doesn't the final D-value depend almost entirely on the points put up by opponents in each team's own sectional? Couldn't the team with a 10 PPB eclipse the team with a 30 PPB based on their competition and how many points that competition puts up? And maybe they should? That is what I am not understanding. I'm trying to simplify the equation down to 2 teams, and maybe I shouldn't be. I guess my gut is telling me that the PPB SOS (in green) should be weighted differently than the toss-up SOS (in blue), mainly because the bonuses are independent of the other teams in one's sectional.
Peter Rodski
Professor of Mathematics
CCCT Champions 2016
Jefferson Community & Technical College
Louisville, KY. 40272

prodski
Lulu
Posts: 47
Joined: Sun Jan 22, 2012 10:46 am

Re: D-Value Question

Post by prodski » Thu Feb 06, 2014 1:45 pm

Ok, enough with the hypothetical cases. I painstakingly calculated numbers for our small regional by removing teams and matches. I first removed each team 1 at a time, then removed 2 teams to get our D-value based on a head-to-head field. A d-value should predict how a team would do against other competition, correct? Removing a team or two should not impact one's D-value tremendously, correct? Since if you add a weaker team you will score more, but your sos will go down. And other teams at the sectional would have similar results against the weaker team.

My A team started with a raw score of 316. After applying the statistic with the other 3 teams in the sectional we dropped to 186. Cut-off for CCCT was 137.6. As was pointed out earlier, we would have made the field regardless of our weak field since we had 4 teams. But what if a team or two dropped and there was no guaranteed bid?

Eliminating Jeff C from the field, our D-value drops to 174.74. Without Henderson we drop to 164.09. Without Jeff B we drop to 149.49. So although dangerously close, we would still hit that 20-24 slot out of 65 teams and qualify if only 1 team failed to show. In all cases our ppb is still top 5.

When I looked at head-to-head fields, it was truly disturbing, and what I thought would happen, did. If only Jeff A and Jeff B had shown up, Jeff A's D-value drops to 117.72. Against Jeff C we drop to 89.67. And if I didn't have a B or C team, and we only played Henderson all day, our D-value (computed from the 3 actual matches) would drop down to 50.69. We would have been ranked 59th out of 65 teams, yet had the 3rd highest ppb in the field at 17.027, with the highest being 17.30, and 4th place ppb being 14.53 (almost the top ppb and significantly ahead of 4th place). These are not hypothetical, but actual figures from our sectional. I am not assuming anything. We would have missed the field in all head-to-head match ups with what I consider a top 10 CC team (we've been top 5 last 2 years).

I fully understand the D-value now at least. The question I have now is why aren't opponent's ppb figured into the equation to show sos? I didn't realize that tppth (opp) stood for tossup points rather than total points. If Yale is going to hammer 19 questions out of 20 every match against my A team or C team head to head, wouldn't a better indicator of strength of field be to look at PPB of the opposition, rather than just the 10 toss-up points my A or C team get each game? The sos for Yale would be the same in either match up, although my A team is significantly better than the C team. I probably need to think about that some more.

My suggestion would be, at least for the CCCT field, for NAQT to at least have a committee of some type to consider small fields (without a qualifier if that happens) to prevent something like this from happening. As it stands now the field is filled according to the D-value without exception. I would hate to have the 3rd highest ppb and be ranked 59th out of 65 without any recourse.

And again, I like the D-value, I think it is a great statistic. It works for 99% of the fields. Actually, 100% up until now as far as I can tell. As long as there is a 2nd decent team in your sectional, it works. But it doesn't work for 2 team fields, if that ever happens, and is borderline for 3 team fields. And yes, my sectional was awful, Henderson had 2 new players who may have never practiced, and we were missing a few folks because of snow. I am trying to increase field size in our region, but it is not easy.
Peter Rodski
Professor of Mathematics
CCCT Champions 2016
Jefferson Community & Technical College
Louisville, KY. 40272
