As I am wont to do, I am going to spend today constructing a football metric based on baseball statistics. Today’s victim of theft from baseball is Weighted On-Base Average, or wOBA. The FanGraphs library tells us:
Weighted On-Base Average combines all the different aspects of hitting into one metric, weighting each of them in proportion to their actual run value. While batting average, on-base percentage, and slugging percentage fall short in accuracy and scope, wOBA measures and captures offensive value more accurately and comprehensively.
wOBA takes the basic idea of slugging percentage and enhances it. Slugging percentage, for the uninitiated, is a measure of relative impact based on total bases: singles are worth 1 point, doubles 2, triples 3, and home runs 4. The logical next question is, of course, whether a home run is really twice as good as a double, or four times as good as a single. wOBA resolves those inconsistencies by attributing an expected run value to each event and then weighting a batter’s performance by those expected run values. It is an upgrade over both On-Base Percentage (how often a player makes it to at least first base, factoring in walks) and slugging, because it accounts for all of those actions as well as the expected run value of each.
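To make the contrast concrete, here is a minimal sketch of the two approaches. The weights in `ILLUSTRATIVE_WEIGHTS` are round placeholder numbers chosen only for illustration; they are not the published FanGraphs run values, and the real wOBA also counts walks and hit-by-pitches and uses a different denominator. The function names and the stat line are hypothetical.

```python
def slugging(singles, doubles, triples, home_runs, at_bats):
    """Total bases per at-bat: every base counts the same."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


# Placeholder weights (assumption): NOT the real wOBA run values.
ILLUSTRATIVE_WEIGHTS = {"single": 0.9, "double": 1.25, "triple": 1.6, "home_run": 2.0}


def weighted_average(counts, weights, opportunities):
    """Weight each event by a run value instead of by bases gained."""
    return sum(weights[event] * n for event, n in counts.items()) / opportunities


# Hypothetical stat line.
counts = {"single": 100, "double": 30, "triple": 5, "home_run": 20}
slg = slugging(100, 30, 5, 20, at_bats=500)
weighted = weighted_average(counts, ILLUSTRATIVE_WEIGHTS, opportunities=500)
```

Under slugging a home run counts four times a single; under the placeholder weights it counts a little more than twice as much, which is the kind of correction wOBA is after.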
In football, we can do similar things. Bill Connelly, our friend and hero, has a measure of success called, aptly, success rate, which looks at how often a team gains the required yardage on a given down: 50 percent of the needed yards on first down, 70 percent on second down, and 100 percent on third and fourth downs. He pairs this metric with another calculation, isolated points per play (IsoPPP), which essentially looks at how much a play is worth when it is a successful play. He then, more or less, adds those together to derive his S&P+ metric, with some intermittent tinkering. From Football Outsiders:
Success Rate: A common Football Outsiders tool used to measure efficiency by determining whether every play of a given game was successful or not. The terms of success in college football: 50 percent of necessary yardage on first down, 70 percent on second down, and 100 percent on third and fourth down.
IsoPPP: An explosiveness measure derived from determining the equivalent point value of every yard line (based on the expected number of points an offense could expect to score from that yard line) and, therefore, every play of a given game. IsoPPP looks at only the per-play value of a team’s successful plays (as defined by the Success Rate definition above); its goal is to separate the explosiveness component from the efficiency component altogether.
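The two definitions above can be sketched in a few lines. This is an illustration under assumptions, not Bill's actual code: the `Play` record, its `points_added` field (standing in for the per-play point value), and the sample drive are all hypothetical.

```python
from typing import List, NamedTuple


class Play(NamedTuple):
    down: int            # 1 through 4
    distance: int        # yards needed for a first down
    yards_gained: int
    points_added: float  # assumed per-play point value (hypothetical field)


# Share of the needed yardage, in percent, that counts as a success.
REQUIRED_PCT = {1: 50, 2: 70, 3: 100, 4: 100}


def is_successful(play: Play) -> bool:
    # Integer arithmetic avoids float rounding at the 70 percent threshold.
    return play.yards_gained * 100 >= REQUIRED_PCT[play.down] * play.distance


def success_rate(plays: List[Play]) -> float:
    """Efficiency: share of plays that met the success threshold."""
    if not plays:
        return 0.0
    return sum(is_successful(p) for p in plays) / len(plays)


def iso_ppp(plays: List[Play]) -> float:
    """Explosiveness: average point value over successful plays only."""
    values = [p.points_added for p in plays if is_successful(p)]
    if not values:
        return 0.0
    return sum(values) / len(values)


# Hypothetical drive, made up for illustration.
drive = [
    Play(1, 10, 5, 0.4),   # 5 of 10 yards on first down: success
    Play(2, 5, 3, -0.1),   # 3 of 5 on second down is under 70 percent: failure
    Play(3, 2, 2, 0.8),    # converted third down: success
    Play(1, 10, 1, -0.2),  # 1 of 10 on first down: failure
]
```

Note that the two functions divide by different things: `success_rate` by all plays, `iso_ppp` by successful plays only.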
What do we know about adding fractions with different denominators, though? Bill has created, effectively, the football equivalent of OPS (on-base plus slugging): how often do you “do what you need to,” plus how effective is it when you do? On the surface, this statistic tracks pretty well with team quality, but, of course, it gets a little fuzzy and can create the illusion of multiplicative value when, in fact, there isn’t any.
In an effort to refine the concept of success in college football, this week I first offer a theory of an expected points model based on down, distance, and field position from the 2017 season, and then use those expected point values to create a Weighted Expected Success Rate, wxSR.
Briefly, I’ll discuss my methodology. My task list to generate this statistic is:
1. Filter out garbage time: I use Bill’s garbage time measurements, just because I don’t have a serious argument for changing them, and it helps to standardize comparison.
2. Create expected points for down, distance, and field position. I created “bins” for field position, starting with 0-10 and going in ten yard increments. I’m not 100% sure I trust these numbers, yet, but this is just a start. Here’s my final data for this attempt.
3. Calculate a change in expected points for each play. This is a little tricky, as I had to update the down, distance, and field position for each play based on the result (read: I used a ton of “if” statements). For example - in the Miami (OH) vs Akron game, Miami had a 1st and 10 from the second bin of field position (10 to 20 yards). The expected point value for that situation is 2.603211 points. Miami rushed for 1 yard, putting them at 2nd and 9 in the second bin, for an expected point value of 2.52291. The change in expected points is 2.52291 - 2.603211, or about -0.080.
4. After all of this, I now have what I need for the metric: total success times the expected point change, and total plays. The equation is:
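Taking the description in the last step at face value, one way to write the metric is the following. The symbols are notation introduced here as an assumption: $s_i$ is 1 if play $i$ is successful and 0 otherwise, $\Delta EP_i$ is the change in expected points on play $i$, and $N$ is the total number of non-garbage-time plays.

```latex
\mathrm{wxSR} = \frac{\sum_{i=1}^{N} s_i \, \Delta EP_i}{N}
```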
Simple enough, right?
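The steps above can be sketched roughly as follows. This is a simplified reading, not the actual implementation: the `Play` record, the `(down, distance, bin)` table key, the orientation of the field-position bins, and the function names are assumptions, and scoring plays and turnovers are left out. The only expected point values used are the two from the Miami (OH) example.

```python
from typing import Dict, List, NamedTuple, Tuple

# (down, distance to go, field-position bin); the key layout is an assumption.
State = Tuple[int, int, int]


class Play(NamedTuple):
    before: State     # situation at the snap
    after: State      # situation after the play
    successful: bool  # success-rate flag, computed elsewhere


def field_bin(yardline: float) -> int:
    """Ten-yard bins: 0-10 is bin 1, 10-20 is bin 2, up to bin 10."""
    return min(int(yardline // 10), 9) + 1


def ep_change(play: Play, ep_table: Dict[State, float]) -> float:
    """Expected points after the play minus expected points before it."""
    return ep_table[play.after] - ep_table[play.before]


def wxsr(plays: List[Play], ep_table: Dict[State, float]) -> float:
    """Sum of expected point changes on successful plays, over all plays."""
    if not plays:
        return 0.0
    weighted = sum(ep_change(p, ep_table) for p in plays if p.successful)
    return weighted / len(plays)


# The two expected point values quoted in the Miami (OH) example.
EP_TABLE = {(1, 10, 2): 2.603211, (2, 9, 2): 2.52291}

# A 1-yard run on 1st and 10 is not a success, so it adds nothing to the sum.
miami_run = Play(before=(1, 10, 2), after=(2, 9, 2), successful=False)
miami_delta = ep_change(miami_run, EP_TABLE)
```

Because unsuccessful plays still count in the denominator, a team cannot raise this number just by being explosive on a handful of plays.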
An alternate formulation that I am going to play with might entail calculating the expectation of success on any given play, and using that to work on the rates. There is some overlap, but for now, I believe this gets us where we need to go.
What I’ve outlined here is an advancement in how to think about team success, specifically joining the quantity of successful plays with the quality of successful plays. Next week, I’ll lay out the specifics of the data, and from there, compare to other analytical systems to see how meaningful differences can inform our thinking.
For now though, I’d love some feedback on the basic model and the expected points data.