Author’s Note - This week, I’m using the column to clean up a couple of loose ends, as it’s an especially busy one for me. Next week, I’ll be back with a fresh, original research question, and we’ll start diving deep again.
As the summer continues, I’ll wander through some thoughts about football and analytics generally - “state of the game” kinds of things - and some more specific previews looking ahead to this fall, both in the Big 12 and around the NCAA. I’m always open to suggestions about college football data projects, so feel free to reach out in the comments.
Early July is an inevitably stupid period in the college football calendar; we’ve all wrung every last bit of intrigue from spring practice, and media days are still ahead of us. In an offseason such as this, absent any major scandal or coaching carousel, we are left with a heaping plate of nothing: a volley of meaningless interview questions and even more meaningless pontifications in response, coaches and media members alike just pining for the real thing, for college football to actually start.
I don’t want to spend the rest of my summer talking about Lincoln Riley. We’ve all made our arguments, and the fact of the matter is, as I mentioned last week, right or wrong, intrepid or unintelligible, Lincoln Riley’s throwaway comments regarding Georgia’s defense were exactly that - throwaway comments. So, we take them for what they are worth, which is not a lot. And, to be fair, the really troubling aspect of evaluating hypotheticals is what laymen call the butterfly effect - that history is non-ergodic: you cannot simply hold all else constant, change one key feature, and expect everything else to continue the same.
One of the main criticisms I received was about the inability of “statistics” to actually measure opponent quality, give an accurate picture, or provide meaningful insight. The comment section from last week’s piece does feature something rare on the internet: a disagreement that ended in better understanding. To that end, I want to communicate that statistics are not the be-all and end-all. A single statistic is no more meaningful than that one drive in the 4th quarter you saw of that one team, or the highlights you caught the morning after. Statistics give us a way to account for context that we cannot physically observe simultaneously. We have to think long and hard (don’t you dare) about the construction of those statistics, but statistics aren’t some magical voodoo - they are oft abused, but that far from invalidates them.
The above is a long-winded way to say that the reader and college football fan should familiarize themselves with the S&P+ analytics system. It’s not going away - terms like success rate, IsoPPP, and efficiency are the future of understanding college football. It is not complex, and it is far from perfect, but even this flawed metric helps inform our conclusions, test our theories, and shape the way we measure success in college football. You can find Bill’s primer here.
I have qualms with it, but it’s the best metric we have right now. In this space in the future, I plan to work with some numbers of my own - let’s talk about those, and about how we can improve analysis beyond mere observation.
Whew. Deep breath.
In the second half of this post, I want to follow up on turnovers. A couple of weeks ago, I compared turnovers in college football to batting average on balls in play, highlighting that the metric does contain a stochastic element, but that it may depend more on team control than previously anticipated. I then promised some TCU-specific fumbles conversation, and TCU-specific fumbles conversation I have.
Let’s start with a graph:
(Side note: I’m on my way to formally learning Python this summer, and am just about at the point where I can move from Stata over to Python completely. As a result, my graphs are about to get 100% shinier and cleaner. For now, though, we will stick with my 1980s-looking Stata graphs.)
Let’s wildly speculate on randomness.
Years when TCU has been “good”: 2009, 2010, 2014, 2015, 2017. The difference between 2009 and 2010 TCU in terms of turnovers is more than half a turnover per game - quite a stark difference. Some of that is due to better bounces, but I posit that most of that difference is due to returning quite a lot of production, and perhaps to a team more locked in, determined to redeem itself after blowing a golden opportunity to go undefeated in 2009. There, see? We have latent variables about motivation, past results, and player chemistry that all factor into turnover rate.
The same could be said for 2016 to 2017, and perhaps the converse is true for 2015 to 2016. The stochastic element comes in for years like 2014: that year’s offense had a below-average turnover performance for a Gary Patterson team over the last ten years. But if you compare that year’s average to the rolling three-year average (since TCU joined the Big 12), it was one of their better performances, given the context. That comparison also has to account for offensive coordinator changes and style, and of course for opponents.
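The rolling-average comparison above can be sketched in a few lines. A minimal illustration follows; the per-game margin figures are invented placeholders, not TCU’s actual numbers, and the three-season window simply mirrors the "rolling three year average" idea described here.

```python
# Hypothetical per-game turnover margins by season (illustrative only,
# NOT TCU's real figures).
margins = {
    2012: -0.4,
    2013: -0.2,
    2014: 0.1,
    2015: 0.5,
    2016: -0.1,
    2017: 0.6,
}

def vs_recent_baseline(margins, year, window=3):
    """Compare one season's turnover margin to the mean of the
    preceding `window` seasons. A positive result means the season
    beat its own recent baseline, even if it looks mediocre in a
    longer historical context."""
    prior = [margins[y] for y in range(year - window, year)]
    baseline = sum(prior) / len(prior)
    return margins[year] - baseline

# A season can be below a program's long-run average yet still clear
# its trailing three-year baseline - the distinction drawn above.
print(vs_recent_baseline(margins, 2015))
```

The point of structuring it this way is that the baseline shifts with recent context (new conference, new coordinator, new personnel) rather than anchoring every season to a single program-wide average.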
This is turning into a ramble, so I’ll cut it here to say: predicting single-game turnover margins is, of course, a shot in the dark, and not very helpful on the whole. But treating turnovers as pure random fluctuation may discard valuable information about a team’s quality and style.
Conclusion: When thinking about the pillars of college football success, turnovers cannot be ignored. A team’s role in determining those turnovers extends beyond luck, and a talent-based explanation applies to more teams than just the outliers.
tl;dr: Context matters in college football, and even off-field context can influence on-field things like turnovers.