Offseason Stat Series: A Simple Model for College Football Strategy

A standard economic model of production highlights the success and failure of strategy in college football.

NCAA Football: Southern at Texas Christian Tim Heitman-USA TODAY Sports

Author’s note: This post gets very detailed. Feel free to skip the math at your leisure. If you’re just here for some takeaways, you can skip to the results section or browse the data here. If you’re just here for the Frogs, scroll down to the appendix, where I discuss TCU’s specific optimization problem and strategy in detail.

The application of analytics and advanced statistics to college football is enjoying a period of relative success, with advanced stats popping up on broadcasts and power rankings, and win probabilities dominating conversations about playoff rankings and team quality. While the field has gradually adopted the analytical mindset, current research lacks a robust theory of causal inference in college football. The bulk of research focuses on descriptive statistics, isolating observed values and comparing them using measures of context like garbage time, field position, and opponent adjustment. This body of analysis lays a working foundation for comparing and ranking on-field quality: good teams move the ball well and score often while preventing opponents from doing the same (and bad teams do the opposite). While current approaches capture some meaningful context of on-field results, the problem of situational intent and execution remains unsolved.

Among the many limitations expected of a nascent and evolving field of study, a key restriction in the college football literature is the confounding of play-calling strategy and on-field execution. Because only the on-field outcome of a play is realized (observed), existing analysis cannot account for the intent of play-callers or the distribution of potential outcomes on any play, given intent. In straightforward terms, according to the raw data, a tipped-pass scramble for a touchdown and a sixty-yard bomb register the same. That’s a problem. The first step in divining this intent/execution distinction is a meaningful vehicle for comparing strategy. The aim of this article is to formalize steps toward a more sophisticated analysis of college football strategy by applying a simple profit maximization problem to derive each team’s optimal choice of run and pass plays.

I. Model

Warning: Gory mathematical details ahead!

Evaluating a team’s strategy requires a standard of comparison. To that end, I’ll frame success on the offensive side of the ball as a production function. That is, offensive success is a result of your choice of inputs (run and pass) and your technology (rushing and passing success rates). I’ll use the standard Cobb-Douglas production function; if you’re unfamiliar, all you need to know is that the CD function stylizes output as a function of inputs in a particular way: Y = L^a * K^b, where Y is output, L is your first input (often labor), K is your second input (capital), and a and b are the technological effectiveness of each type of input.

The inputs of the college football Cobb-Douglas are rushing and passing (denoted as R and P) and the technologies are the rushing and passing success rates (denoted as r and p), so the football version reads Y = R^r * P^p. The output here is a bit abstract - it doesn’t translate directly to anything on the field yet. For now, it’s enough to know that the output measure is increasing in quality: higher output is better, plain and simple.

The advantage of the Cobb-Douglas is twofold: first, it gives us a format to translate inputs and choices into production, and second, it limits that production with diminishing returns to scale. (Diminishing returns to scale just means that you can’t overload the input you’re more efficient with forever and still increase your output, which translates to football nicely - even if you were the most efficient running team in the nation, you couldn’t run every play and expect the same level of success.)
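To make the diminishing-returns point concrete, here is a minimal Python sketch of the production function. It assumes inputs are measured as shares of plays between 0 and 1 (the post does not state its scale, so these outputs will not match the production figures reported later), and the team and its success rates are hypothetical.

```python
def cobb_douglas(rush_share, pass_share, rush_sr, pass_sr):
    """Cobb-Douglas output Y = R^r * P^p.

    rush_share, pass_share: play-calling shares R and P (assumed to sum to 1).
    rush_sr, pass_sr: rushing and passing success rates r and p.
    """
    return rush_share ** rush_sr * pass_share ** pass_sr


# Hypothetical team that is much better at running than passing.
rush_sr, pass_sr = 0.50, 0.35

# Lean harder and harder on the run and watch what happens to output.
for rush_share in (0.50, 0.60, 0.80, 0.95, 1.00):
    output = cobb_douglas(rush_share, 1.0 - rush_share, rush_sr, pass_sr)
    print(f"rush share {rush_share:.2f} -> output {output:.4f}")
```

Even for this run-first team, output falls once the rush share climbs well past 60%, and it drops to zero when the team runs on every play.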

In economics, each firm solves a profit maximization problem: firms want to produce as much as possible subject to the costs of each input. Here, I’m assuming the costs of a play are equal and therefore negligible; solving the model, we get two equations based on observed parameters.

  • Optimal rush rate: R = 1/(1 + p/r)
  • Optimal pass rate: P = 1-R
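One way to recover these two rates, offered as a sketch rather than the exact derivation behind the post: assume the run and pass shares sum to one (which the P = 1-R line implies) and that equal play costs let us drop the cost terms. Maximizing the log of output then gives:

```latex
\begin{align*}
\max_{R}\; \ln Y &= r \ln R + p \ln(1 - R) \\
\frac{d \ln Y}{dR} &= \frac{r}{R} - \frac{p}{1 - R} = 0 \\
r(1 - R) &= pR \\
R^{*} &= \frac{r}{r + p} = \frac{1}{1 + p/r}, \qquad P^{*} = 1 - R^{*} = \frac{p}{r + p}
\end{align*}
```

Under these assumptions, a team should call runs in proportion to its rushing success rate's share of the two success rates combined.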

Simple enough, right? Your optimal choice of run and pass mix is a function of your efficiency in each type. The rest of the post proceeds as follows: I’ll describe the actual, realized, on-field rushing and passing tendencies of each team, I’ll solve for the optimal production and rush/pass rates for each team, and then I’ll evaluate teams based on their actual distance from optimal production. In the appendix, I’ll analyze TCU’s strategy profile in detail to determine where the Frogs had unrealized gains.

II. Results

College Football Tendencies, Strategy, and Production

| Stat | Rush Success Rate | Rush Rate | Pass Success Rate | Pass Rate | Production |
| --- | --- | --- | --- | --- | --- |
| Mean | 43.59% | 52.18% | 41.90% | 47.82% | 20.23 |
| Median | 43.85% | 51.83% | 41.98% | 48.17% | 19.78 |
| St. Dev | 0.0494 | 0.0885 | 0.0541 | 0.0885 | 6.39 |
| Max | 58.60% | 88.70% | 56.04% | 72.30% | 47.43 |
| Min | 30.59% | 27.69% | 28.61% | 11.29% | 9.34 |

First, we start with the basics. Here are some summary statistics about team tendencies of rushing and passing. Note that these are filtered for garbage time.

Looking at the distribution of each team’s distance from its optimal production, we find what shouldn’t be a surprising fact: most coaches are pretty good at their jobs. If you look at the green line, you’ll see a normal density. We would expect, bounded at zero, about 75% of observations to be spread out according to that green line, within .01 units of 0. Instead, we see a large clustering just around zero: in fact, 89% of teams are within .005 units of their optimal production. That is what you would expect in reality, though - college football coaches are at the top of their field, and for the most part, should be expected to know how to work within their constraints.

Where this graph gets interesting is the skewness - that is, the clustering on the left and the pull to the right by a few observations. Six teams are outside of the .01 bucket: Navy, Georgia Southern, Washington State, Georgia Tech, Air Force, and Army. Notice a trend? Five of these are CFB’s option offenses, and Wazzu passes with a similar tenacity. In fact, the most egregious offenders when it comes to failing to optimize strategy are all staunch disciples of a certain offensive philosophy.

For the purposes of this analysis, I’m going to censor those six outlier teams; this model clearly fails to fit a coach with such normative commitments to style - you could quite easily argue that Mike Leach solves a different equation than the rest of the world. Pushing aside those tenacious schools, let’s examine how teams optimized. The linked spreadsheet accompanies the following analysis. Let’s start with extreme values (outliers aside):

TOP FIVE OFFENSIVE PRODUCTION TEAMS:

  1. Alabama 47.42
  2. Oklahoma 45.57
  3. Ohio 37.34
  4. Georgia 36.16
  5. Wisconsin 34.16

BOTTOM FIVE OFFENSIVE PRODUCTION TEAMS:

  1. Rutgers 9.34
  2. UTSA 10.25
  3. Akron 10.31
  4. Central Michigan 10.51
  5. Florida State 10.67

Note - to this point, I’ve left the idea of opponent adjustments aside. That would level the top of the rankings a bit, I believe, but the raw numbers pass the eye test, in my opinion. In the top five, we have three of the consensus best coaches of 2018 - Saban, Riley, and Smart - accompanied by two program masters - Solich and Chryst - who have been running their system for years, perfecting their identity. At the bottom of the list, we have five rudderless teams - three G5 teams without much to their name or history, the worst P5 school, and a P5 school who just hired yet another offensive coordinator.

Measuring total production is one way to compare offenses - who did the best is simply who got the most production out of their constraints (taking rush and pass success rate as given, which is a strong assumption to impose, but necessary for the initial analysis). The next step is to solve for each team’s optimal selection of run and pass plays and re-calculate their production if they had been optimizing. Then we can take the difference between actual and optimal to determine how well a team did in playing to its strengths.
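The optimize-then-compare step described above could be sketched in Python as follows. This is an illustration under assumptions, not the code behind the spreadsheet: shares are measured on a 0-to-1 scale (so the gaps will not match the magnitudes reported here), the team names and numbers are made up, and all function names are hypothetical.

```python
def cobb_douglas(rush_share, rush_sr, pass_sr):
    """Output Y = R^r * P^p with P = 1 - R (shares assumed to sum to 1)."""
    return rush_share ** rush_sr * (1.0 - rush_share) ** pass_sr


def optimal_rush_share(rush_sr, pass_sr):
    """Optimal rush rate from the model: R = 1 / (1 + p/r)."""
    return 1.0 / (1.0 + pass_sr / rush_sr)


def production_gap(rush_share, rush_sr, pass_sr):
    """Actual minus optimal production; zero at the optimum, negative otherwise."""
    actual = cobb_douglas(rush_share, rush_sr, pass_sr)
    optimal = cobb_douglas(optimal_rush_share(rush_sr, pass_sr), rush_sr, pass_sr)
    return actual - optimal


# Made-up teams: (actual rush share, rush success rate, pass success rate).
teams = {
    "Run-Heavy U": (0.75, 0.42, 0.45),
    "Balanced State": (0.50, 0.44, 0.42),
}

# Rank from the largest shortfall to the smallest.
gaps = {name: production_gap(*stats) for name, stats in teams.items()}
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name}: gap {gap:.4f}")
```

Taking actual minus optimal keeps the sign convention of the lists in this section, where a more negative number means more production left on the table.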

FIVE LARGEST DIFFERENCES BETWEEN OPTIMAL AND ACTUAL PRODUCTION

  1. Georgia -.811
  2. App State -.661
  3. Hawaii -.537
  4. Michigan -.486
  5. Purdue -.4322

FIVE SMALLEST DIFFERENCES BETWEEN OPTIMAL AND ACTUAL PRODUCTION

  1. Toledo (<.001)
  2. Miami (<.001)
  3. Colorado (<.001)
  4. Temple (<.001)
  5. Arkansas State (<.001)

The differences here need some explanation: they alone do not represent team quality, much less coach quality. Some teams aren’t on the first list (OU, Alabama, Ohio State, for example) because they have a high ceiling and they reached it. The difference doesn’t tell us how objectively good a team was compared to other teams; it just tells us how good a team was relative to what it could be.

Georgia is an odd case - in the top five of production, but with a much higher theoretical limit than their actual production. That could be a sign of things to come, or it could actually be an admonishment of UGA’s coaching staff. More on that in a minute. The other top teams include a Hawaii and Michigan staff too committed to the run and a Purdue and App State team too committed to the pass. Aside from Hawaii, though, you have a roster of some pretty great football coaches here, albeit defensively minded.

As for the optimizing teams - Toledo is a chronic over-performer, and Jason Candle a perpetual hot name in coaching searches, so it shouldn’t surprise us to see him performing so well. The same goes for Arkansas State, who has a reputation of getting more from less than their Sun Belt peers. Geoff Collins at Temple just accepted the GT job, aiming to fix their offense and transition away from the triple option. Miami and Colorado are unexpected - they both fired coaches whose teams worked well within their constraints - perhaps the issue at both schools was the constraints?

Finally, I evaluate coaches based on an implied coaching effect. I add a multiplicative “total factor productivity” term to the Cobb-Douglas production function, and then ask: if teams were in fact making optimal choices of run and pass plays, what value would the coaching effect have to take? In that way, I can determine how much a coach helped or hurt a team in theory.

TOP FIVE MOST BENEFICIAL COACHES:

  1. Toledo (<.001)
  2. Miami (<.001)
  3. Temple (<.001)
  4. Colorado (<.001)
  5. Arkansas State (<.001)
  ...
  7. TCU (.003)
  8. Alabama (.004)

FIVE MOST DETRIMENTAL COACHES:

  1. App State (.293)
  2. ECU (.261)
  3. Georgia (.218)
  4. Purdue (.218)
  5. Hawaii (.215)

Most of these teams are the same as the list above, although a few demonstrate that some coaches were more off on understanding their constraints than others. You can peruse the full list at the linked spreadsheet.

III. Conclusion

This analysis finds clear evidence that strategy choices play a substantial role in determining team quality and on-field performance. While that may not sound revelatory on the surface, this phenomenon has been heretofore undocumented. The results confirm two stylized facts about college football: 1) the choice of run-pass mix affects team quality, and 2) some teams are better than others at optimizing their talent.

Further research should continue in three directions. First, the binary distinction between run and pass plays grows more obsolete by the day in the era of hybrid offenses and run-pass options; a finer-grained categorization of plays would improve understanding of the tenuous balance between style and strategy. Next, the above analysis takes rushing success rate and passing success rate as given parameters. These of course depend on talent and on coaching, which might confound the production equation. Modeling endogenous technology begins a path down a deep rabbit hole, but as we move towards a fuller understanding of the on-field equation to solve, we will require more complex and detailed methods. Finally, the model can be built up by including defensive measures and opponent adjustments, thus creating a comprehensive ranking of teams.

IV. Appendix: TCU Strategy Evaluation

Many of us in the online TCU fan crowd have strong opinions about the offense; I’ve yelled for months now about how weak TCU’s pseudo-option has looked, and even pressed for them to consider a triple-option hybrid, or at least something different. TCU rushed 52.1% of the time, which ranks 69th nationally; their success rate was 41.6% overall (43.5% rushing, 69th; 39.5% passing, 90th).

Their Cobb-Douglas production was 18.03 (81st), which is right in line with their S&P+ Offensive ranking. TCU’s optimal production, though, was right at 18.03 as well, 82nd overall, which indicates their ceiling wasn’t very high to begin with. That suggests, according to this model, that TCU’s problem wasn’t so much one of play-calling as it was of execution. That, of course, is a limited statement - execution and success rate are obviously tied together - but, in the absence of a model to account for that change (forthcoming, maybe?), we have clearer insight into the source of TCU’s issues, and an effective vote of confidence in the offensive staff at TCU: given their constraints, the TCU OC and offensive coaches were optimizing better than most programs in the country.
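As a quick check on that conclusion, the optimal rush rate formula from the model section can be applied to the TCU figures quoted above. This is a minimal Python sketch; the variable names are mine, and it reproduces only the rush/pass mix, not the 18.03 production figure, whose scale the post does not spell out.

```python
# TCU figures quoted in the appendix.
rush_sr = 0.435            # rushing success rate (r)
pass_sr = 0.395            # passing success rate (p)
actual_rush_share = 0.521  # share of plays that were runs

# Optimal rush rate from the model: R = 1 / (1 + p/r) = r / (r + p).
optimal_rush_share = rush_sr / (rush_sr + pass_sr)

print(f"optimal rush share: {optimal_rush_share:.1%}")
print(f"actual rush share:  {actual_rush_share:.1%}")
```

The formula puts TCU’s optimal rush share at roughly 52.4%, within about a third of a percentage point of the 52.1% they actually called, which is consistent with actual and optimal production landing at the same 18.03.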