What predicts success rate best - offensive or defensive strength?
The game of football is the literal and figurative collision between two teams, each with an offense and a defense, each pursuing symmetric, mutually exclusive objectives: team one wants its offense to score and its defense to prevent scoring, while team two wants *its* offense to score and *its* defense to stop team one.
To what extent can a team control the outcome of any given college football game? Is defense more important than offense?
In this article, I’ll examine one statistic - success rate - and attempt to answer whether it is determined on a game-by-game basis more by the offense or the defense.
To answer this question, I pulled college football play-by-play data from the last three seasons using cfbscrapR. If you’d like to get the data for yourself, please avail yourself of my Introduction to College Football Analytics with R and cfbscrapR.
I filtered the data to regular-season games involving only FBS teams - the volatility of bowls adds noise to the data, as teams’ rosters, coaching staffs, and motivation are highly variable in bowl season. Additionally, I filtered out garbage time. I agonized a little bit over that decision, but ultimately dropped it, as I believe garbage-time stats overemphasize the magnitude of control an elite offense or an elite defense has, which would bias the estimates.
For each game in all three seasons, I calculated each team’s season success rate with that game held out, so the game being predicted never contributes to its own predictors.
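The hold-out step can be sketched as follows. This is a minimal Python illustration, not the R code behind the article. It assumes the season rate is pooled over plays (total successful plays divided by total plays in all other games); averaging per-game rates instead would be an equally plausible reading. The function name, field names, and numbers are all hypothetical.

```python
from collections import defaultdict


def holdout_season_rates(games):
    """Season success rate for each (team, game), excluding that game.

    games: list of dicts with keys team, game_id, successes, plays.
    Assumption: the rate is pooled over plays, not averaged over games.
    """
    # Season totals per team: [successful plays, total plays].
    totals = defaultdict(lambda: [0, 0])
    for g in games:
        totals[g["team"]][0] += g["successes"]
        totals[g["team"]][1] += g["plays"]

    rates = {}
    for g in games:
        season_successes, season_plays = totals[g["team"]]
        other_plays = season_plays - g["plays"]
        if other_plays > 0:  # skip teams with no other games on record
            other_successes = season_successes - g["successes"]
            rates[(g["team"], g["game_id"])] = other_successes / other_plays
    return rates


# Made-up numbers for a single team, purely for illustration.
toy_games = [
    {"team": "A", "game_id": 1, "successes": 30, "plays": 60},
    {"team": "A", "game_id": 2, "successes": 20, "plays": 50},
    {"team": "A", "game_id": 3, "successes": 28, "plays": 70},
]
toy_rates = holdout_season_rates(toy_games)
```

Team A had its best day in game 1 (30 of 60), so the held-out rate attached to that game is built only from games 2 and 3 and comes out lower than the full-season rate.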
For success rate, the data for a few games looks like this:
I used Ken Pomeroy’s very simple model to predict any statistic in a game:
stat = a*(offense) + b*(defense) + g*(site) + e
Where a is the influence of the offense on the game stat, b is the influence of the defense, g is the home field advantage, and e is some random error.
Each game shows up twice in the dataset, as we can try to predict each team’s offensive success rate in that game from its own season success rate and its opponent’s season defense. This gives us 4392 observations across the three seasons.
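Here is a rough Python sketch of that two-rows-per-game layout; 4392 observations correspond to 2196 games. The field names are hypothetical, and coding the site term as 1 for the home offense and 0 for the road offense is an assumption, since the article does not say how site or neutral fields were encoded.

```python
def game_to_rows(game, off_season_sr, def_season_sr):
    """Turn one game into two observations, one per offense.

    off_season_sr and def_season_sr map (team, game_id) to that team's
    held-out season success rate on offense and success rate allowed on
    defense. The 1/0 home coding is an assumption for illustration.
    """
    gid = game["game_id"]
    rows = []
    for offense, defense, is_home in (
        (game["home"], game["away"], 1),
        (game["away"], game["home"], 0),
    ):
        rows.append({
            "game_id": gid,
            "offense": offense,
            "defense": defense,
            "game_sr": game["game_sr"][offense],            # outcome
            "off_season_sr": off_season_sr[(offense, gid)],  # predictor
            "def_season_sr": def_season_sr[(defense, gid)],  # predictor
            "home": is_home,                                 # predictor
        })
    return rows


# Made-up values for one game between hypothetical teams A and B.
toy_game = {"game_id": 7, "home": "A", "away": "B",
            "game_sr": {"A": 0.46, "B": 0.38}}
toy_off = {("A", 7): 0.44, ("B", 7): 0.41}
toy_def = {("A", 7): 0.39, ("B", 7): 0.43}
toy_rows = game_to_rows(toy_game, toy_off, toy_def)
```

Each row pairs one offense's single-game success rate with its own season offense and the opposing defense's season number, which is what lets one regression estimate both influences at once.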
The data I’ve constructed looks like this:
Now, for the results. I ran an Ordinary Least Squares (OLS) regression of game success rate on offense and defense season success and home field advantage. Some might prefer a logistic regression here, as we’re dealing with percentages, but I prefer OLS - interpretation remains clear and errors are normally distributed, so no transformation of the data is necessary. Also, the results from logit and from OLS differ only in the far decimal places, so it really doesn’t matter.
The coefficient for season defense is larger than that for season offense (.72 compared to .69), indicating that defense has a little more control over individual game success rates, but that difference is quite small! If you wanted to guess an offense’s success in one game, the caliber of the opposing defense is actually a little bit more informative! Home field advantage plays a statistically significant role in determining success rates, but I worry that estimate is biased by good teams paying bad teams to come play games - meaning that the average opponent a team faces at home is weaker than the average opponent it faces on the road.
The R-squared for this regression is .2398, meaning that offensive and defensive season-long success rates, plus home field advantage, explain almost 24% of the variation in game success rate outcomes. We’ve not accounted for anything else like season-long strength of opponent or injuries or roster or even sequence of games (when games were played/time-series components), and we’ve still explained roughly a fourth of the variation in college football success rates.
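For readers who want to see the mechanics of that regression, here is a small dependency-free Python sketch that fits OLS through the normal equations and reports R-squared. It is not the code behind the article. Including an intercept is an assumption (the model as written has none, though most regression software adds one by default). The toy data are noise-free and generated from arbitrary coefficients, not the article's estimates.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Fit y on predictors plus an intercept (assumed).

    Returns (coefficients with the intercept first, R-squared).
    """
    X = [[1.0] + list(row) for row in predictors]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Toy grid of (offense season SR, defense season SR, home flag).
toy_X = [[o, d, h]
         for o in (0.35, 0.40, 0.45, 0.50)
         for d in (0.36, 0.42, 0.48)
         for h in (0, 1)]
# Arbitrary generating coefficients chosen only to check the fit.
toy_y = [-0.15 + 0.6 * o + 0.8 * d + 0.02 * h for o, d, h in toy_X]
toy_beta, toy_r2 = fit_ols(toy_X, toy_y)
```

Because the toy outcome has no noise, the fit should recover the generating coefficients almost exactly. On real play-by-play data the error term does most of the work, which is why a much lower R-squared is expected.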
Overall, the coefficients provide a nice story about defense being slightly more important in determining single game outcomes, a finding which shouldn’t revolutionize our thinking but does in fact add some context and challenge to current notions that offense is more important than defense.
In considering college football success rates, OLS regression suggests that defenses are slightly more important than offenses. This could be for a variety of reasons, including high concentration of talent, complexity of schemes, and even higher volatility of outcomes initiated by the offense. What does this finding give us in terms of insight into how the game of college football works? Offenses are more volatile than defenses, and so defensive quality affects offensive success rates more than offensive quality does.
I’ll be examining the relative influence of defense and offense more as the Offseason Stat Series continues. Until then, follow me on Twitter for stats, graphs, and college football conversation.