Monday, 08 June 2009

"Confabulatory" rankings of college football players

So I stumbled on "The College Football Performance Awards" site. Its mission statement is "to provide the most scientifically rigorous conferments in college football. Recipients are selected exclusively based upon objective scientific rankings." The basic driver seems to be that the ballot system, whereby some names are picked, some folks vote, and somebody wins, is an inherently flawed way to select football players; indeed, it is, they argue, more like a "popularity contest."

I suppose there's some merit to that proposition. The idea is that there must be some better way to evaluate a player, particularly if a player wins an award because of his strong supporting cast as opposed to what he individually brings to the field. Brad Smith, a former Davidson and USC kicker who runs the site, also seems to have the laudable goal of including more mid-major program players in the final award mix. For example, in his rankings Rice quarterback Chase Clement finished higher overall than Tim Tebow (Colt McCoy finished #1). So these are generally worthy aims, but I still don't quite know what to make of all this.

First, the value of any so-called "objective metric" is in how good the algorithm is. On that score, despite journalists telling us that Mr. Smith's "methodology is all there on the website," I come to find out that it is not.

Smith tells us simply:

The goal of this research is to advance a sophisticated representation of college football; a just, refined, and elegant measurement of performance; a precise, objective, and scientifically reliable selection of deserving recipients; an inherently dispassionate, methodologically sound, and experimentally valid celebration of individual achievement.


But that's really it for explanation, just cool assurances that it is an elegant, sound, and valid "celebration." My favorite of course is his discussion of why rushing yards is inadequate, which I must paste in full:

Q: Is football performance analysis a form of scientific enquiry?

A: The question, "Who are the top performers in college football?" is an inherently empirical question. In other words, any attempt to answer this question trespasses overtly on the domain of science.

PERFORMANCE 101: ANALYZING RUSHING DATA

The college football player with the most rushing yards per game is sometimes referred to as the "rushing leader". This usage is misleading and, in some sense, even confabulatory. In reality, the rushing yards per game statistic is not very helpful in evaluating rushing performance and is a poor predictor of team success. For an example of this, consider running back A with 900 yards on 300 carries, B with 870 yards on 145 carries, C with 840 yards on 120 carries, and D with 800 yards on 80 carries. Further, assume that A, B, C, and D have all played the same number of games, and all other rushing variables are held constant. According to the rushing yards per game statistic, A is the rushing leader, B is second, C is third, and D is fourth. Yet, almost certainly, these rankings are inverted. After all, in this case, the discrepancies in rushing yards per game are fairly small, while there are significant differences in rushing yards per carry. To declare A the rushing leader merely based upon A's standing in rushing yards per game without careful review of other factors and considerations is at best -- a cursory and superficial analysis, and at worst -- a specious and obfuscatory one.


I know what is "obfuscatory," and it is not just the ballot system. (I also enjoy spelling "enquiry" with an "E"; he was a philosophy major so I guess he has to spell it the way David Hume did.)

But all this begs the question. He tells us that subjective views of a running back, or even a "scientific" review based on total yards, don't tell us much. This is of course all rather pedestrian, but he doesn't tell us what the next step is. Is it average yards per carry? Some mixture? He doesn't say. There is no explanation of his methodology.
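For what it's worth, the arithmetic in his quoted example is easy enough to check. Here is a minimal sketch in Python using only the four hypothetical backs from the quote. The names `backs` and `yards_per_carry` are mine, and to be clear this is not Smith's method (which he doesn't disclose), just the simplest per-carry alternative one might try:

```python
# The four hypothetical running backs from the quoted example:
# name -> (total rushing yards, carries). All played the same number of games.
backs = {"A": (900, 300), "B": (870, 145), "C": (840, 120), "D": (800, 80)}


def yards_per_carry(yards, carries):
    """Average yards gained per rushing attempt."""
    return yards / carries


# With equal games played, ranking by total yards is the same as
# ranking by yards per game.
by_total = sorted(backs, key=lambda name: backs[name][0], reverse=True)

# Ranking by yards per carry instead.
by_ypc = sorted(backs, key=lambda name: yards_per_carry(*backs[name]), reverse=True)

for name in by_ypc:
    yards, carries = backs[name]
    ypc = yards_per_carry(yards, carries)
    print(f"{name}: {yards} yards on {carries} carries, {ypc:.1f} per carry")
```

Sorted by yards per carry, the order flips to D, C, B, A (10.0, 7.0, 6.0, and 3.0 yards per carry), which is the inversion the quote describes. Whether yards per carry, or some blend of the two, is what actually drives his rankings is exactly what we aren't told.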

He does have an "academic review" section, but these fine folk don't really discuss his actual methods, and instead seem to comment only on the general idea that objective, statistics-based criteria for awards are inherently better than the ad hoc poll/ballot system currently in use. All quite possibly true, but merely stating that is not enough. (He also has a section titled "models," which I clicked on thinking it would tell me about his algorithms or the models he used to rank players. I was wrong, but it is likely worth clicking on anyway.)

This is significant because, contrary to what he seems to think, he's not the first guy to try to evaluate players based on the statistics. Football Outsiders has been trying to do this for years, and the Pro-Football Reference site is another notable site which has gone into great detail and has laid it out for the world to understand. These enquiries, along with many others, have been going on for some time, and are free from the ballot box problems he identifies.

But the other reason it is significant, in light of his apparent thought that he is the first to finally Rank All That Is Good in Football, is that we've learned a lot about how difficult it is to model and evaluate players because of the hard work and transparency of these other sites and books. It isn't easy. He claims to be able to extract the fact that Colt McCoy is better individually than Sam Bradford, or that Dez Bryant was better than Michael Crabtree; any differences in results were just based on teammates. Maybe so, but how can you be sure? And how do you apply that kind of analysis to teammates, or offensive line play, or even quarterbacks, whose job is to distribute the ball around while relying on other guys to protect, get open, make the right play, etc.? It's not that it can't be done, but it is silly to act like you're the first to have thought about these questions, or to convince journalists to write things like:

Smith says on his website: "Who are the top performers in college football?" is an inherently empirical question. In other words, any attempt to answer this question trespasses overtly on the domain of science.

Science.

There's college football's seven-letter word. It suggests computers, which suggests BCS, which will make some of you stop reading right here.


And Let The Light of Discovery Shine Down Upon Thee. The answer is that it's all a bit silly. This majestic quest to give awards based on elegant and objective science is a commendable goal, but Mount Everest hasn't been climbed yet, and others have been paving the way for some time.

The other thing that strikes me as bizarre is the focus on postseason awards. Why those? The linked article suggests an answer: that Mr. Smith's (perfectly acceptable) goal is to sell his ideas to the various decisionmakers who hand out the Doak Walker Award, the Unitas Quarterback Award, the Lou Groza, and the like. That's fine, but for all the arguments about how subjective the postseason awards are, it ignores the question of why they shouldn't be somewhat subjective.

What should be wholly objective is a coaching decision to start one player or another, or to recruit a guy, or an NFL team's decision to sign one as a free agent (marketing aside). That is 100% about getting the best players on the field to perform. (Though that analysis ignores the correlations that might exist among different groups of players, an idea studied in much more depth in basketball than in football.)

But with awards, why is it so bad if the Big Schools win? What are these awards? No one has ever sufficiently answered for me whether the Heisman trophy is a "most valuable player" award designed to go to the critical member of a great team without whom the team would fail, or whether it is simply the best individual player in the country, or alternatively (and this is not the same thing), the player who has put in the best performance.

The implicit premise of Smith's site is that it should go to the last of these, but I'm not certain that others would agree. Why shouldn't Danny Wuerffel win the Heisman when the Gators were rolling over people rather than Troy Davis, who was individually quite impressive with over 2,000 yards rushing? Would a supposedly "objective" result be any fairer, one that not only would be subject to the vagaries of the model (which we can't review), but also would discount Wuerffel's leadership, or his ability to get up and throw pass after pass after defender after defender slammed into him head first?

I'm not so convinced that all that is flatly irrelevant in the limited context of postseason awards. Is it a crime that we take all those "subjective" impressions into account? I think not, especially with little to no explanation of the supposedly grand "science" behind the endeavor.