Wednesday, November 12, 2008

Football, Luck, and Noise

I received a surprising amount of pushback via email regarding my last post about Texas Tech and the Hot Hand Theory. At first I was confused, but then I realized that many readers do not share a rather fundamental assumption I hold about football: an incredible amount of the game is determined by "luck." Now, when I say luck, I do not mean fluke events, or the ol' bounce a da ball, or things like that. What I mean is that almost any and every outcome in football is not set in stone, but rather, there is some probability that the outcome will be X, another probability that the outcome will be Y, and maybe even a chance that it will be Z.


Theological questions aside, I really think this is a rule of life and not just football. But the point is that at no point in a football game, be it the success of a play or even a determination of what the other side is actually doing, do you have fixed answers. Instead, you have probabilities, and even then your probabilities are merely estimates of the actual probabilities. So when I talk about "coolly flipping coins," I mean that everything is probabilistic. When Michael Jordan went to the free-throw line, no matter what any sportswriter tells you, he was never destined to make the shot, or destined to make the game-winner. Tiger Woods is never destined to hit the putt, and neither Tom Brady nor Peyton Manning was destined to win the Super Bowl or hit any particular pass.

Instead, it was merely "highly likely" that each was going to do those things, because each is very good at what they do. But at no point is anything determinate.

Indeed, one of the criticisms of my post was that the probabilities of offensive success increase dramatically because you gain more information as time goes on. But that argument doesn't hold water. If Michael Jordan can only max out his free-throw percentage to a point, then there is no way to max out offensive production in football when at every turn you have a human (or group of them) on the other side making choices that shift your probabilities. That is far too nebulous a cloud to assume certitude.

And any playcaller will tell you the same thing. As Norm Chow says, you are never quite sure what coverage they are in, but instead you take pieces of the field or pieces of the defensive front and attack those, and therein lies success. Mike Leach does not even require his guys to memorize coverages in the sense of "Hey they are in Cover 4!" Instead, they group them into things they can recognize and they probe areas. But at every stage, things are probabilistic. I've even discussed the notion that a purely random approach to offensive and defensive calls might be optimal.

When I made the point about the hot hand theory, part of it was about how you cannot always extrapolate how good an offense is versus a defense just because they scored on a drive, or even if they scored a lot in a half or game, because the standard deviation is too high. Some people argued that things would even out over the course of a game; I think that is sort of true, but I still think the variance is higher than they account for. But that's an empirical question we can solve later.
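To put a rough number on "the standard deviation is too high," here is a minimal sketch. It assumes, as a simplification, that every drive is an independent trial with the same chance of scoring; the 40% scoring rate and the drive counts are made-up figures for illustration, not measurements from any team.

```python
from math import sqrt

def scoring_rate_standard_error(p_score, n_drives):
    """Standard error of an observed scoring rate over n_drives independent drives.

    Assumes each drive scores with the same probability p_score (a simplification).
    """
    if n_drives <= 0:
        raise ValueError("n_drives must be positive")
    return sqrt(p_score * (1 - p_score) / n_drives)

# Hypothetical offense that truly scores on 40% of its drives.
for n_drives in (6, 12, 120):
    se = scoring_rate_standard_error(0.4, n_drives)
    print(f"{n_drives:3d} drives: observed scoring rate 0.40 +/- {se:.2f}")
```

With six drives the standard error is 0.20, so under these assumptions an offense that truly scores on 40% of its drives can easily look like a 20% or a 60% offense over a single half.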

But another (amazing) site, Advanced NFL Stats, made the point about the difficulty of extrapolating skill levels from even successful outcomes:


Consider a very simple example game. Assume both [Pittsburgh] and [Cleveland] each get 12 1st downs in a game against each other. PIT's 1st downs come as 6 separate bunches of 2 consecutive 1st downs followed by a punt. CLE's 1st downs come as 2 bunches of 6 consecutive 1st downs resulting in 2 TDs. CLE's remaining drives are all 3-and-outs followed by a solid punt. Each team performed equally well, but the random "bunching" of successful events gave CLE a 14-0 shutout.

The bunching effect doesn't have to be that extreme to make the difference in a game, but it illustrates my point. Natural and normal phenomena can conspire to overcome the difference between skill, talent, ability, strategy, and everything else that makes one team "better" than another.
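The bunching effect in the quoted example can be reproduced with a toy simulation. The sketch below is only an illustration under assumed numbers: both teams convert each series into a first down with the same probability (0.7 here, a made-up figure), each team gets 10 drives, and, as in the quote, six consecutive first downs produce a touchdown. None of the function names or parameters come from Advanced NFL Stats.

```python
import random

def simulate_drive(p_first_down, downs_for_td, rng):
    """Return 7 if the drive strings together enough consecutive first downs, else 0."""
    streak = 0
    while streak < downs_for_td:
        if rng.random() < p_first_down:
            streak += 1   # series converted, drive continues
        else:
            return 0      # series failed: punt
    return 7

def simulate_game(p_first_down, drives_per_team, downs_for_td, rng):
    """Score both teams using the SAME first-down probability for each."""
    score_a = sum(simulate_drive(p_first_down, downs_for_td, rng)
                  for _ in range(drives_per_team))
    score_b = sum(simulate_drive(p_first_down, downs_for_td, rng)
                  for _ in range(drives_per_team))
    return score_a, score_b

def share_of_decided_games(n_games=10_000, p_first_down=0.7,
                           drives_per_team=10, downs_for_td=6, seed=0):
    """Fraction of simulated games between identical teams that do not end tied."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    decided = 0
    for _ in range(n_games):
        score_a, score_b = simulate_game(p_first_down, drives_per_team,
                                         downs_for_td, rng)
        if score_a != score_b:
            decided += 1
    return decided / n_games

print(f"Games with a winner: {share_of_decided_games():.1%}")
```

The two teams here are identical by construction, yet most simulated games still end with one team ahead, purely because of how the first downs happen to bunch.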


And adding support for my argument about the high degree of variance, Advanced NFL Stats went on to try to nail down exactly how much in the way of outcomes can be attributed to skill versus luck in the NFL. You can read the details of the explanation there, and NFL teams obviously are closer in relative skill levels than most college teams, but the results are nevertheless striking:


...By comparing the two distributions, we can calculate that of the 160 season outcomes, only 78 of them differ from what we'd expect from a pure luck distribution. That's only 48%, which would suggest that in 52% of NFL games, luck is the deciding factor!
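For a sense of what a "pure luck distribution" looks like, here is a minimal sketch that treats every game as a fair coin flip. It assumes a 16-game regular season, ignores ties, and takes the 160 team-seasons from the quote; it does not reproduce the site's actual comparison against observed records, and the function names are my own.

```python
from math import comb

def pure_luck_win_distribution(games=16, p_win=0.5):
    """Probability of each win total (0..games) if every game is a coin flip."""
    return [comb(games, wins) * p_win**wins * (1 - p_win)**(games - wins)
            for wins in range(games + 1)]

def expected_team_counts(n_team_seasons=160, games=16):
    """Expected number of team-seasons at each win total under pure luck."""
    return [n_team_seasons * p for p in pure_luck_win_distribution(games)]

for wins, count in enumerate(expected_team_counts()):
    print(f"{wins:2d} wins: {count:6.2f} team-seasons expected by luck alone")
```

Even with no skill differences at all, this model spreads teams well away from 8-8: the standard deviation of a 16-flip season is two wins, so records like 10-6 and 6-10 are entirely ordinary under luck alone.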

There might yet be more to it than these calculations, but the point is that variance is high in outcomes in football games. This is not to say that skill is unimportant, but the lesson is instead that you cannot merely look to actual statistics and actual outcomes to determine who is the best. Football games are tests of ranges of probabilities put up against one another:

Will all eleven players execute their assignments; will the quarterback make the right reads; will the coaches accurately assess the opponent's schemes; will the sun shine in the receiver's eye; will the ball become sweaty where the ballcarrier holds it; will there be an injury on the play; and if these factors randomly cut 50/50, will they work in our favor enough times in a row to get us in field goal or touchdown range.

In other words, lots of football fans, players, and even coaches suffer from a Fooled by Randomness problem when they analyze the game. Football is more quantum mechanics than it is Newtonian physics (though with a splash of game theory). Yet the belief in absolute determinism is natural: we intuitively want results to be indicative of objective truths, and it is much less complex to analyze easy-to-observe statistics and outcomes than it is to try to estimate the underlying probabilities. But football doesn't always give us large enough sample sizes to believe that results are as instructive as we'd like. So, if we want real answers, we have to admit that there's lots of luck around.

(And if you're a fan of the Michigan Wolverines, this gives you an incredibly weak excuse: "It's all the result of bad luck!")