Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is touchdown rate, and Christian McCaffrey is one of the high outliers in touchdown rate, then Christian McCaffrey goes into Group A, and may the fantasy gods show mercy on my predictions.
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of my predictions from 2019 and their final results, here's the list from 2018, and here's the list from 2017.
THE SCORECARD
In Week 2, I opened with a primer on what regression to the mean was, how it worked, and how we would use it to our advantage. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I talked about how the ability to convert yards into touchdowns was most certainly a skill, but it was a skill that operated within a fairly narrow and clearly-defined range, and any values outside of that range were probably just random noise and therefore due to regress. I predicted that high-yardage, low-touchdown receivers would outscore low-yardage, high-touchdown receivers going forward.
In Week 5, I talked about how historical patterns suggested we had just reached the informational tipping point, the time when performance to this point in the season carried as much predictive power as ADP. In general, I predicted that players whose early performance differed substantially from their ADP would tend to move toward a point between their early performance and their draft position, but no specific prediction was made.
In Week 6, I talked about simple ways to tell whether a statistic was especially likely to regress or not. No specific prediction was made.
In Week 7, I speculated that kickers were people, too, and lamented the fact that I'd never discussed them in this column before. To remedy that, I identified teams that were scoring "too many" field goals relative to touchdowns and "too many" touchdowns relative to field goals and predicted that scoring mix would regress and kickers from the latter teams would outperform kickers from the former going forward.
In Week 8, I noted that more-granular measures of performance tended to be more stable than less-granular measures and predicted that teams with a great point differential would win more games going forward than teams with an identical record, but substantially worse point differential.
In Week 9, I talked about the interesting role regression to the mean plays in dynasty, where the mere fact that a player is likely to regress sends signals that that player is probably quite good and worth rostering long-term, anyway. No specific prediction was made.
Statistic for regression | Performance before prediction | Performance since prediction | Weeks remaining |
---|---|---|---|
Yards per Carry | Group A had 3% more rushing yards per game | Group B has 36% more rushing yards per game | Success! |
Yard to Touchdown Ratio | Group A averaged 2% more fantasy points per game | Group B averages 40% more fantasy points per game | Success! |
TD to FG ratio | Group A averaged 20% more points per game | Group B averages 16% more points per game | 1 |
Wins vs. Points | Both groups had an identical win% | Group B has an 8% higher win% | 2 |
Another week down and the race between our field goal kickers continues to tighten. Or at least it appears to. Last week, Group B had a 24% advantage. This week, Group B only has a 16% advantage. Surely Group B's lead is getting smaller.
But that's not the case. In fact, Group B outscored Group A once again last week; Group B's lead only shrank because it outscored Group A by a smaller margin than it had the first two weeks, 9.0 points per game to 8.5 points per game. Additionally, Group A was already in a hole and only had a few chances to dig its way out, so any week that doesn't see progress toward that goal represents a squandered opportunity.
Prior to last week, both Group A and Group B had nine remaining games (because of byes), and Group A needed to outscore Group B by 15 points over those 9 games, or about 1.6 points per game. Now, Group A has five games to Group B's four games, but Group A needs to outscore Group B by a whopping 25 points; based on their usual performance, that's probably closer to 3.4 points per game. Group B's lead as a percentage might be shrinking, but the chances of Group A closing that lead are shrinking even faster.
As for our "wins vs. points" prediction... both Group A and Group B posted identical 4-3 records last week, which caused their race to tighten as well. The highlight (or, from the perspective of our prediction, the lowlight) was the Group A Saints dismantling the Group B Buccaneers in a 35-point blowout. Since our original prediction stipulated that Group B would need a winning percentage ten points higher, this marks the first time all season that any of our predictions have trailed. Which provides a good segue into today's topic.
Regression and Large Samples
One of the key features of regression to the mean is that outlier performances are significantly more likely over small samples. If I flip a coin that's weighted to land on heads 60% of the time, that means there's still a 40% chance it lands on tails. Given those odds, landing on tails wouldn't be very surprising at all. But if I flipped the same coin a million times, the odds of seeing tails come up more often than heads dwindle down to nothing.
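To put rough numbers on the weighted coin, here is a minimal Python sketch that computes the exact binomial probability of tails outnumbering heads. The function name `prob_tails_majority` and the sample sizes are my own choices for illustration; only the 60/40 weighting comes from the example above.

```python
import math

def prob_tails_majority(n_flips, p_heads=0.6):
    """Exact probability that tails strictly outnumber heads in n_flips flips."""
    p_tails = 1.0 - p_heads
    min_tails = n_flips // 2 + 1  # smallest tails count that beats heads
    total = 0.0
    for k in range(min_tails, n_flips + 1):
        # Binomial term computed in log space so large n does not overflow.
        log_term = (
            math.lgamma(n_flips + 1)
            - math.lgamma(k + 1)
            - math.lgamma(n_flips - k + 1)
            + k * math.log(p_tails)
            + (n_flips - k) * math.log(p_heads)
        )
        total += math.exp(log_term)
    return total

if __name__ == "__main__":
    # Sample sizes are arbitrary; odd counts avoid ties.
    for n in (1, 11, 101, 1001):
        p = prob_tails_majority(n)
        print(f"{n:>5} flips: P(more tails than heads) = {p:.6f}")
```

With a single flip the function returns the familiar 40%. By a thousand or so flips the probability is already vanishingly small, long before we get anywhere near a million.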
This idea that variance evens out over larger samples is one of the key insights in fantasy football. Why do top DFS players compete with so many different lineups every week? The answer is not, as is commonly believed, because it increases their expected return on investment. Indeed, every DFS player has a "best" lineup, a lineup that they think is most likely to win that week, and every other lineup that player submits actually decreases expected payout (because it's a worse lineup than the best lineup).
So why submit so many different lineups? Because outlier performances are significantly more likely over small samples. By using 20 lineups in a week, top players reduce the amount of money they'd be expected to win, but they also reduce the chances of a single injury or bad performance wiping out their entire bankroll, and that's a worthwhile trade.
(Of course, larger samples reduce variance in both directions. A DFS player who submits 20 lineups is far less likely to lose their entire bankroll, but they're far less likely to double it, too.)
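The trade-off can be sketched with a deliberately simple toy model in Python. Everything here is an assumption for illustration: the bankroll is split evenly across entries, each entry is an independent double-or-nothing bet, and every entry has the same win probability `p_win`. Real DFS lineups are correlated, pay out on a sliding scale, and (as noted above) get slightly worse as you add more of them, so treat this as a cartoon of the variance effect rather than a model of actual contests.

```python
import math

def spread_outcomes(n_entries, p_win=0.5):
    """Toy model: bankroll split evenly across n independent double-or-nothing entries.

    Returns (chance of losing everything, chance of doubling the bankroll,
    standard deviation of the final bankroll as a multiple of the starting one).
    """
    p_lose_everything = (1.0 - p_win) ** n_entries  # every entry loses
    p_double_bankroll = p_win ** n_entries          # every entry wins
    # Final bankroll multiple is 2 * (wins / n_entries); this is its standard deviation.
    std_multiple = 2.0 * math.sqrt(p_win * (1.0 - p_win) / n_entries)
    return p_lose_everything, p_double_bankroll, std_multiple

if __name__ == "__main__":
    for n in (1, 5, 20):
        lose, double, std = spread_outcomes(n)
        print(f"{n:>2} entries: lose all {lose:.6f}, double {double:.6f}, std {std:.3f}")
```

In this toy version the expected payout is the same no matter how many entries you use; only the spread changes. Both tails, total wipeout and total jackpot, shrink at the same rate as entries are added.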
Why is it that three weeks at the beginning of the year don't give us enough information to outperform ADP, but five weeks do? Because three weeks is too small of a sample for the outlier performances to have all washed out sufficiently, and five weeks is not.
This is why the preferred practice around here is to select groups of players or teams to compare in our predictions. If we selected a single player, we'd be wrong much more often (much like betting heads on a weighted coin will still lose 40% of the time). By bundling players into similar groups of five or ten, it becomes much more likely that we'll see the overall pattern emerge.
This is also why the preferred practice around here is to let predictions run for four weeks. I'd love to let them run for even longer, sometimes, but a clearly-defined endpoint is critical for accountability, to prevent me from just running the prediction until Group B pulls ahead and then immediately closing it and declaring it a success.
The fact that outliers are more common over smaller samples tends to manifest in our results over time, too. We usually see Group B take its biggest lead in the week or two after the prediction and then watch that lead shrink over the remaining weeks, for instance. At the same time, when Group B does trail, it likewise typically does so in the week or two immediately after the prediction before pulling back ahead in Weeks 3 and 4.
And this week's kicker prediction provided a great illustration of another consequence of the fact that outliers are more common over smaller samples: small leads over big samples can be more impressive than big leads over small samples.
For those of you who have been watching football for long enough, you probably remember the 2004 NFL season. The 2003 season closed with the New England Patriots beating the Indianapolis Colts 24-14 in a game that wasn't as close as the final score might suggest. The Colts complained that the Patriots' defensive backs were hitting receivers more than five yards beyond the line of scrimmage, which violated the rules as written, and the referees let it slide.
Over the offseason, the NFL's competition committee decided it would place a "point of emphasis" on ensuring officiating crews began calling contact downfield in line with the rules as written. NFL defenses adjusted by being less physical in coverage and passing offenses exploded, setting numerous records, headlined by Peyton Manning's own 49-touchdown season. After 2004, the NFL quietly dropped the point of emphasis, officiating crews went back to letting contact six or seven yards downfield slide, and offenses dropped off again.
In 2003, the league-wide average for yards per pass attempt was 6.6. In 2004, it spiked all the way to 7.1. In 2005, it fell back down to 6.8. (For context, yards per attempt so far this year is 7.4.)
That was it. Three-tenths, five-tenths of a yard per pass attempt, that was the difference between a stifling defensive environment and a wide-open offensive environment. When an offense dropped back to pass in 2004, the result was approximately 7.5% better than when it dropped back to pass in 2003.
On a player level, an extra 0.3-0.5 yards per attempt isn't a massive difference. So far this year, Kirk Cousins leads Aaron Rodgers by 0.9 yards per pass attempt. Nick Mullens leads Jimmy Garoppolo by 0.8 yards per attempt.
But that's a difference of half a yard on a couple hundred attempts. This was a difference of half a yard... over 16,354 attempts. Despite attempting 139 fewer passes, the league as a whole passed for an extra 5,169 yards in 2004. It basically conjured an entire 1984 Dan Marino out of thin air. That half yard per attempt was a massive change given the sheer number of attempts in question.
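One rough way to see why the smaller gap is the more impressive one is to measure each gap in standard errors. The Python sketch below rests on assumptions that are mine and not drawn from any data in this column: a standard deviation of 10 yards for a single pass attempt, 300 attempts per quarterback standing in for "a couple hundred", and equal sample sizes on both sides of each comparison. Different inputs will change the exact figures, though the direction should hold.

```python
import math

def gap_in_standard_errors(gap_yards, attempts_per_side, sd_per_attempt=10.0):
    """How many standard errors a yards-per-attempt gap represents.

    Assumes two samples of equal size and the same per-attempt standard
    deviation (sd_per_attempt is a made-up illustrative value).
    """
    # Standard error of the difference between two sample means.
    standard_error = sd_per_attempt * math.sqrt(2.0 / attempts_per_side)
    return gap_yards / standard_error

if __name__ == "__main__":
    # 0.9-yard gap between two quarterbacks, assumed 300 attempts each.
    player_gap = gap_in_standard_errors(0.9, 300)
    # 0.5-yard gap between league seasons of roughly 16,354 attempts each.
    league_gap = gap_in_standard_errors(0.5, 16354)
    print(f"Quarterback gap: {player_gap:.1f} standard errors")
    print(f"League-wide gap: {league_gap:.1f} standard errors")
```

Under these assumptions the 0.9-yard quarterback gap is only about one standard error, well within what noise alone could produce, while the half-yard league-wide gap is several standard errors wide.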
What does this mean for us? It means if we want to make a sure profit in fantasy football by betting on regression to the mean, we're going to need to place a lot of bets. Trading away one player or acquiring another simply because they have a profile that suggests regression is a positive move in expectation, but the range of possible outcomes is massive. It could work out really well, it could work out terribly. Weighted coins still flip tails sometimes.
It also means that the more bets we place on regression to the mean, the more our upside becomes capped. With larger samples, the odds of hitting big on every bet decline.
But as top DFS players know, when you have a genuine edge, it often makes sense to take the safer profits rather than gamble on striking it big while leaving yourself fully exposed to the consequences if the flip doesn't go your way.