# Regression Alert: Week 5 - Footballguys

Revisiting preseason expectations and seeing what we've learned.

Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.

For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.

In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.

Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is yards per target, and Antonio Brown is one of the high outliers in yards per target, then Antonio Brown goes into Group A and may the fantasy gods show mercy on my predictions.

Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of all my predictions from last year and how they fared.

# THE SCORECARD

In Week 2, I laid out our guiding principles for Regression Alert. No specific prediction was made.

In Week 3, I discussed why yards per carry is the least useful statistic and predicted that the rushers with the lowest yard-per-carry average to that point would outrush the rushers with the highest yard-per-carry average going forward.

In Week 4, I explained why touchdowns follow yards (but yards don't follow back) and predicted that the players with the fewest touchdowns per yard gained would outscore the players with the most touchdowns per yard gained going forward.

| Statistic For Regression | Performance Before Prediction | Performance Since Prediction | Weeks Remaining |
|---|---|---|---|
| Yards per Carry | Group A had 24% more rushing yards per game | Group B has 8% more rushing yards per game | 2 |
| Yards:Touchdown Ratio | Group A had 28% more fantasy points per game | Group B has 69% more fantasy points per game | 3 |

"Yards per carry is completely random", exhibit #103,962,411: in week 4, our "high-YPC" cohort averaged 4.219 yards per carry. Our "low-YPC" cohort averaged 4.224 yards per carry. Yards per carry is totally not actually a thing.

Also, if you were hoping for a dramatic result, our yard-to-touchdown ratio prediction last week certainly delivered one. Consider: over the first three weeks, our high-touchdown cohort scored 36 touchdowns in 39 player-games, or 0.92 touchdowns per game. Our low-touchdown cohort scored 4 touchdowns in 36 games, or 0.11 touchdowns per game.

In Week 4, 66% of "low-touchdown" players reached the end zone, and the group collectively scored 0.75 touchdowns per player game. Meanwhile, only 23% of "high-touchdown" players reached the end zone, and the group collectively scored 0.38 touchdowns per game, about half as many as Group B scored.

There's still a long way to go on both predictions, but the early returns show why I chose to focus attention here first.

# REVISITING PRESEASON EXPECTATIONS

In October of 2013, I wondered just how many weeks it took before early-season performance wasn't a fluke anymore. In "Revisiting Preseason Expectations", I looked back at the 2012 season and measured how well production in a player's first four games predicted production in his last 12 games. And since that number was meaningless without context, I also measured how well his preseason ADP predicted production in those same 12 games.

It was a fortuitous time to ask that question, as it turns out, because I discovered that after four weeks in 2012, preseason ADP still predicted performance going forward better than early-season production did.

This is the kind of surprising result that I love, but the thing about surprising results is that sometimes the reason they're surprising is really just because they're flukes. So in October of 2014, I revisited "Revisiting Preseason Expectations". This time I found that in the 2013 season, preseason ADP and week 1-4 performance held essentially identical predictive power for the rest of the season.

With two different results in two years, I decided to keep up my quest for a definitive answer about whether early-season results or preseason expectations were more predictive down the stretch. In October of 2015, I revisited my revisitation of "Revisiting Preseason Expectations". This time, I found that early-season performance held a slight predictive edge over preseason ADP.

With things still so inconclusive, in October of 2016, I decided to revisit my revisitation of the revisited "Revisiting Preseason Expectations". As in 2015, I found that early-season performance carried slightly more predictive power than preseason ADP.

To no one's surprise, I couldn't leave well enough alone in October 2017, once more revisiting the revisited revisitation of the revisited "Revisiting Preseason Expectations". This time I once again found that preseason ADP and early-season performance were roughly equally predictive, with a slight edge to preseason ADP.

And now, as you've probably guessed, it's time for an autumn tradition as sacred as turning off the lights and pretending I'm not home on October 31st. It's time for "Revisiting Preseason Expectations"! (Or, I guess technically for Revisiting Revisiting Revisiting Revisiting Revisiting Revisiting Preseason Expectations.)

# METHODOLOGY

If you've read the previous pieces, you have a rough idea of how this works, but here's a quick rundown of the methodology. I have compiled a list of the top 24 quarterbacks, 36 running backs, 48 wide receivers, and 24 tight ends by 2017 preseason ADP.

From that list, I have removed any player who missed more than one of his team’s first four games or more than two of his team’s last twelve games so that any fluctuations represent performance and not injury. As always, we’re looking by team games rather than by week, so players with an early bye aren't skewing the comparisons.

I’ve used PPR scoring for this exercise because that was easier for me to look up with the databases I had on hand. For the remaining players, I tracked where they ranked at their position over the first four games and over the final twelve games. Finally, I’ve calculated the correlation between preseason ADP and stretch performance, as well as the correlation between early performance and stretch performance.

Here's the data.

## QUARTERBACK

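| Player | ADP | Games 1-4 | Games 5-16 |
|---|---|---|---|
| Tom Brady | 2 | 1 | 9 |
| Drew Brees | 3 | 6 | 13 |
| Matt Ryan | 4 | 22 | 15 |
| Russell Wilson | 5 | 3 | 1 |
| Derek Carr | 6 | 23 | 21 |
| Marcus Mariota | 9 | 13 | 22 |
| Cam Newton | 10 | 15 | 2 |
| Kirk Cousins | 11 | 10 | 5 |
| Ben Roethlisberger | 12 | 21 | 8 |
| Philip Rivers | 13 | 20 | 3 |
| Dak Prescott | 14 | 5 | 14 |
| Matthew Stafford | 15 | 14 | 6 |
| Andy Dalton | 16 | 24 | 18 |
| Eli Manning | 17 | 12 | 28 |
| Tyrod Taylor | 20 | 18 | 16 |
| Jay Cutler | 21 | 30 | 27 |
| | 22 | 2 | 7 |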

The correlation between ADP and late-season performance was 0.252.
The correlation between early-season performance and late-season performance was 0.431.

## RUNNING BACK

| Player | ADP | Games 1-4 | Games 5-16 |
|---|---|---|---|
| Le'Veon Bell | 2 | 3 | 2 |
| LeSean McCoy | 3 | 15 | 6 |
| Devonta Freeman | 4 | 5 | 25 |
| Melvin Gordon III | 5 | 17 | 5 |
| Jay Ajayi | 7 | 33 | 35 |
| DeMarco Murray | 8 | 32 | 20 |
| Jordan Howard | 9 | 11 | 17 |
| Todd Gurley | 10 | 1 | 3 |
| Isaiah Crowell | 12 | 47 | 28 |
| Marshawn Lynch | 13 | 42 | 18 |
| Christian McCaffrey | 14 | 20 | 9 |
| Kareem Hunt | 15 | 2 | 8 |
| Lamar Miller | 17 | 14 | 21 |
| Carlos Hyde | 18 | 8 | 11 |
| Joe Mixon | 20 | 34 | 34 |
| C.J. Anderson | 21 | 12 | 30 |
| Ameer Abdullah | 23 | 22 | 52 |
| Mark Ingram II | 24 | 25 | 4 |
| Bilal Powell | 26 | 21 | 37 |
| Derrick Henry | 30 | 25 | 33 |
| LeGarrette Blount | 31 | 29 | 57 |
| Tevin Coleman | 32 | 19 | 29 |
| Frank Gore | 34 | 30 | 22 |

The correlation between ADP and late-season performance was 0.540.
The correlation between early-season performance and late-season performance was 0.447.

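## WIDE RECEIVER

| Player | ADP | Games 1-4 | Games 5-16 |
|---|---|---|---|
| Antonio Brown | 1 | 2 | 2 |
| Julio Jones | 2 | 27 | 5 |
| Mike Evans | 4 | 9 | 25 |
| A.J. Green | 5 | 4 | 18 |
| Jordy Nelson | 6 | 6 | 56 |
| | 7 | 5 | 8 |
| Brandin Cooks | 8 | 12 | 15 |
| Dez Bryant | 9 | 24 | 26 |
| Doug Baldwin | 11 | 15 | 13 |
| Keenan Allen | 12 | 10 | 3 |
| T.Y. Hilton | 13 | 23 | 32 |
| DeAndre Hopkins | 15 | 3 | 1 |
| Demaryius Thomas | 16 | 38 | 16 |
| Kelvin Benjamin | 17 | 52 | 44 |
| Alshon Jeffery | 18 | 19 | 23 |
| Martavis Bryant | 19 | 49 | 47 |
| Michael Crabtree | 20 | 29 | 35 |
| Davante Adams | 21 | 22 | 11 |
| Tyreek Hill | 22 | 8 | 10 |
| Golden Tate | 23 | 17 | 12 |
| Larry Fitzgerald | 24 | 7 | 6 |
| Stefon Diggs | 26 | 1 | 39 |
| Jamison Crowder | 28 | 88 | 24 |
| Jarvis Landry | 29 | 13 | 4 |
| Sammy Watkins | 32 | 31 | 45 |
| Eric Decker | 37 | 81 | 49 |
| DeSean Jackson | 38 | 35 | 50 |
| Randall Cobb | 39 | 34 | 38 |
| Tyrell Williams | 40 | 33 | 54 |
| Adam Thielen | 42 | 14 | 9 |
| Marvin Jones Jr | 44 | 59 | 7 |
| Rishard Matthews | 47 | 30 | 41 |
| | 48 | 51 | 34 |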

The correlation between ADP and late-season performance was 0.349.
The correlation between early-season performance and late-season performance was 0.412.

## TIGHT END

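| Player | ADP | Games 1-4 | Games 5-16 |
|---|---|---|---|
| Rob Gronkowski | 1 | 1 | 2 |
| Travis Kelce | 2 | 3 | 1 |
| Jimmy Graham | 4 | 18 | 3 |
| Kyle Rudolph | 5 | 23 | 6 |
| Zach Ertz | 6 | 2 | 4 |
| Delanie Walker | 7 | 7 | 8 |
| Eric Ebron | 8 | 24 | 9 |
| Jack Doyle | 9 | 15 | 5 |
| Marcedes Lewis | 10 | 19 | 28 |
| Josh Hill | 11 | 55 | 65 |
| Austin Hooper | 12 | 13 | 17 |
| Jason Witten | 13 | 6 | 11 |
| Evan Engram | 14 | 8 | 7 |
| David Njoku | 15 | 20 | 24 |
| Jesse James | 16 | 9 | 29 |
| Cameron Brate | 17 | 4 | 14 |
| Jared Cook | 20 | 10 | 12 |
| Antonio Gates | 21 | 28 | 30 |
| Dwayne Allen | 22 | 83 | 63 |
| Gerald Everett | 23 | 32 | 44 |
| Tyler Higbee | 24 | 42 | 35 |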

The correlation between ADP and late-season performance was 0.634.
The correlation between early-season performance and late-season performance was 0.857.
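For readers who want to check the math, here is a minimal sketch of how correlations like these can be computed. It assumes a plain Pearson correlation on the values exactly as listed (the article doesn't name the formula), and it uses the tight end rows from the table above. The names `TIGHT_ENDS` and `pearson` are just for illustration. Under that assumption the two numbers come out at roughly 0.63 and 0.86.

```python
from math import sqrt

# (ADP, rank over games 1-4, rank over games 5-16), copied from the
# tight end table in this article.
TIGHT_ENDS = [
    (1, 1, 2), (2, 3, 1), (4, 18, 3), (5, 23, 6), (6, 2, 4), (7, 7, 8),
    (8, 24, 9), (9, 15, 5), (10, 19, 28), (11, 55, 65), (12, 13, 17),
    (13, 6, 11), (14, 8, 7), (15, 20, 24), (16, 9, 29), (17, 4, 14),
    (20, 10, 12), (21, 28, 30), (22, 83, 63), (23, 32, 44), (24, 42, 35),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


adp = [row[0] for row in TIGHT_ENDS]
early = [row[1] for row in TIGHT_ENDS]
late = [row[2] for row in TIGHT_ENDS]

# Assumption: the article's figures are Pearson correlations on these values.
adp_vs_late = pearson(adp, late)
early_vs_late = pearson(early, late)

print(f"ADP vs. games 5-16:       {adp_vs_late:.3f}")
print(f"Games 1-4 vs. games 5-16: {early_vs_late:.3f}")
```

Swapping in the rows from any of the other position tables works the same way; player names aren't needed, only the three numeric columns.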

## OVERALL

Across all positions, the correlation between ADP and late-season performance was 0.456.
The correlation between early-season performance and late-season performance was 0.570.

After six years of running this article and with eight years of collected data, how do things stand? Here are the correlations at each position. (I've only run positional breakdowns for the past four years and the two-factor averages for the past two years, hence the shorter charts.)

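Quarterback

| Season | ADP | Early-Season | Avg of Both |
|---|---|---|---|
| 2014 | 0.422 | -0.019 | |
| 2015 | 0.260 | 0.215 | |
| 2016 | 0.200 | 0.404 | 0.367 |
| 2017 | 0.252 | 0.431 | 0.442 |
| Average | 0.284 | 0.258 | 0.405 |

Running Back

| Season | ADP | Early-Season | Avg of Both |
|---|---|---|---|
| 2014 | 0.568 | 0.472 | |
| 2015 | 0.309 | 0.644 | |
| 2016 | 0.597 | 0.768 | 0.821 |
| 2017 | 0.540 | 0.447 | 0.610 |
| Average | 0.503 | 0.583 | 0.715 |

Wide Receiver

| Season | ADP | Early-Season | Avg of Both |
|---|---|---|---|
| 2014 | 0.333 | 0.477 | |
| 2015 | 0.648 | 0.632 | |
| 2016 | 0.551 | 0.447 | 0.576 |
| 2017 | 0.349 | 0.412 | 0.443 |
| Average | 0.470 | 0.492 | 0.510 |

Tight End

| Season | ADP | Early-Season | Avg of Both |
|---|---|---|---|
| 2014 | -0.051 | 0.416 | |
| 2015 | 0.295 | 0.559 | |
| 2016 | 0.461 | 0.723 | 0.716 |
| 2017 | 0.634 | 0.857 | 0.891 |
| Average | 0.335 | 0.639 | 0.803 |

Overall

| Season | ADP | Early-Season | Avg of Both |
|---|---|---|---|
| 2010-2012 | 0.578 | 0.471 | |
| 2013 | 0.649 | 0.655 | |
| 2014 | 0.466 | 0.560 | |
| 2015 | 0.548 | 0.659 | |
| 2016 | 0.599 | 0.585 | 0.682 |
| 2017 | 0.456 | 0.570 | 0.608 |
| Average | 0.557 | 0.555 | 0.645 |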

At quarterback, running back, and wide receiver, two of the last four seasons have favored preseason ADP and two have favored early-season performance. Overall, I'd say the two factors are basically in perfect balance (or as close to perfect as you'll get from something of this nature).

At tight end, each of the last four years has favored early-season performance, which is enough for me to believe this might be a trend. This isn't to say that regression doesn't happen; players still tend to move in the direction of preseason ADP. It's just to say that the spot they finally settle in tends to be closer to early-season performance.

Overall, though, this year just reinforces my prior belief that four games' worth of stats gives us no more and no less information on a player than an offseason of study. If one person drafted a new team today straight from preseason ADP, and another drafted straight from current year-to-date rankings, both teams would probably do about equally well.

But the idea that it has to be either preseason ADP or early-season production is a false dichotomy. Most of us are closet Bayesians, which means we start with an opinion and update it with new evidence. In that case, we've reached the point of the season where we should give roughly equal weight to both factors.

Indeed, a simple average of preseason ADP and ranking through four games correlates with rest-of-year outcomes better than either factor alone, last year producing a robust 0.608 following a correlation of 0.682 in 2016. And I've demonstrated in the past that an award-winning projector like Bob Henry can outperform even that average, though for everything we do we'll probably never get much higher than correlations of 0.700.
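To make the "simple average" concrete, here is a rough sketch using the 2017 tight end rows listed earlier in this article. It assumes an equal-weight blend of ADP and games 1-4 rank, scored with a plain Pearson correlation (the article doesn't spell out the formula). The `weight_on_adp` parameter is a hypothetical knob added for illustration, not something from the original analysis. With the default 50/50 weighting, the result lands near 0.89, in line with the 2017 tight end figure in the "Avg of Both" column.

```python
from math import sqrt

# (ADP, rank over games 1-4, rank over games 5-16) for 2017 tight ends,
# copied from the table earlier in this article.
TIGHT_ENDS = [
    (1, 1, 2), (2, 3, 1), (4, 18, 3), (5, 23, 6), (6, 2, 4), (7, 7, 8),
    (8, 24, 9), (9, 15, 5), (10, 19, 28), (11, 55, 65), (12, 13, 17),
    (13, 6, 11), (14, 8, 7), (15, 20, 24), (16, 9, 29), (17, 4, 14),
    (20, 10, 12), (21, 28, 30), (22, 83, 63), (23, 32, 44), (24, 42, 35),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def blend(adp, early_rank, weight_on_adp=0.5):
    """Weighted average of ADP and early rank (0.5 is the simple average)."""
    return weight_on_adp * adp + (1 - weight_on_adp) * early_rank


late = [late_rank for _, _, late_rank in TIGHT_ENDS]
blended = [blend(adp, early) for adp, early, _ in TIGHT_ENDS]

blend_vs_late = pearson(blended, late)
print(f"Average of ADP and games 1-4 vs. games 5-16: {blend_vs_late:.3f}")
```

Setting `weight_on_adp` to 1.0 or 0.0 recovers the ADP-only and early-season-only correlations, which makes it easy to see how much the blend adds over either factor by itself.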

(As an aside, in the past I've seen a study similar to this that used points scored from the previous year instead of preseason ADP, and that study discovered that week three is the informational tipping point. This led to the quip that all of the hard work we put in during the offseason is basically to buy us one extra week before being wrong.)

This week's prediction is "early-season performances will tend to regress toward preseason ADP". This prediction isn't easily trackable, so I won't be adding it to the scorecard going forward. I have a hunch that I might just be revisiting this prediction sometime during October of 2019, though...
