Dynasty, in Theory: Where Analytics Fail

Analytics is a very powerful way to gain new insight into the world. Sometimes it goes wrong.

Analytics is hot. Analytics is in. Check out the trend in usage of the word “analytics” in books, and note that that trend-line ends at 2008, before Nate Silver became a national celebrity with his own website sponsored by ESPN.

So analytics is big right now. It’s on quite the winning streak, and it’s serving notice to everyone else that they had better not stand in its way. And for good reason: from Hollywood blockbusters to 24-hour news election forecasts, analytics seems to do no wrong.

Today, I want to talk about something I often see analytics do wrong.

A Defining of Terms

When I talk about analytics, I am speaking simply of the process of answering questions by gathering and analyzing data. Some analytics are going to be frightfully complicated. Others will be laughably simplistic. To me, it’s not the process itself that is important so much as the instinct to turn to statistics and data to answer empirical questions.

What Analytics Does

This article should not be considered an attack on the usefulness of analytics. That battle is over. The statisticians have won. There are books and movies about their power, and their practitioners liberally populate every single franchise in Major League Baseball.

The NFL has ever been conservative and resistant to change. It has also been using analytics for over five decades. A man named Virgil Carter, for instance, published a paper titled "Operations Research on Football" back in 1971 that calculated the expected value to an offense of facing 1st-and-10 at various positions on the field.

If that name sounds familiar, it should; in 1971, in between working towards his MBA, Virgil Carter was the starting quarterback for the Cincinnati Bengals and the man for whom the West Coast Offense was developed. His findings were gleefully embraced by his offensive coordinator, a young up-and-comer named Bill Walsh.

Even that 1971 date understates the origin of analytics in football. The Cowboys were using analytics to overhaul their scouting process in the ’60s en route to building a dynasty. They standardized all scouting measurements so that they could be fed through a computer for analysis, and in the process essentially created the modern scouting complex, including but not limited to the annual player combine.

A footnote in Carter’s paper leads us back further still, to a paper published in 1954 titled “The Application of Operations-Research Methods to Athletic Games”. That means anyone who says football analytics is in its infancy today is ignoring up to sixty years of history.

This early use of analytics was no secret, either. Here is a Sports Illustrated article from 1972 that openly talks about the analytics revolution Carter was pioneering. From the article:

Carter has set a literal new standard for the shopworn phrase "student of the game," for he is a computer analyst whose diligent research into football has led to findings that dispute some of the most sacrosanct coaching theories. He has taught at Xavier University and has given seminars for the Data Systems Division of the A.O. Smith Corp. of Milwaukee.

In that quest Reid shares a perspective with Carter, the analytical Mormon mathematician who started offering up football plays to a computer's peristalsis while he was working on his master's degree at Northwestern in 1970. Carter's wife Judy helped him by coding 8,373 plays from 56 games played during the first half of the 1969 season. On all 8,373 Carter kept track of 53 variables—time, down and distance, weather, playing surface, score and almost every other critical factor save which team puts its pants on two legs at a time. It added up to over 440,000 information tidbits that Carter then fed into the computer. The ensuing print-out, in so many numbers, said that a lot of football's sacred coaching bylaws were really so much bunk.

While the battles being waged over the applicability of analytics today seem new, Carter presaged many of them, too. Again, from the Sports Illustrated article:

Carter, however, is not likely to bend Paul Brown's ear with his statistical revelation in order to help in Cincinnati's struggle in the AFC Central Division. For one thing, his study is an illustration of quantitative analysis and therefore makes no allowance for specific individual talents. "If you wanted to use this," he said, "you'd have to tie in your personnel, and then you'd have to adjust it to your desires as a coach. You'd have to interpret it with respect to your own philosophy. A computer will never make coaching decisions. The idea is ridiculous. You're dealing with probability, and you can't assign a number to all probable events. You can't give a number to how players are going to react to their pregame meal. You can't program desire. That's why a coach has to adjust this stuff to his own team."

It’s true that many of the old guard have fought a delaying action against this analytics offensive (including Carter’s own head coach, Paul Brown, who famously and single-handedly kept Bill Walsh from receiving a well-deserved head coaching position for years by telling all within earshot that he was “too professorial”). But even there the statisticians are advancing.

Consider, for instance, Gary Kubiak, the new head coach of the Denver Broncos. He announced this summer that he had hired an analytics consultant who would communicate with him on the sidelines during the games to help him optimize his decision-making. Now, hiring someone and listening to him are two different things, but the results are already showing. Against Cleveland, Gary Kubiak kept his offense on the field in the first quarter to attempt a 4th-and-4 in “no man’s land”. During his nearly 8-year tenure with the Houston Texans, the famously conservative Kubiak only twice had his offense attempt to convert a 4th down of longer than 2 yards in the first quarter.

So analytics has been around for decades, and has been thoroughly incorporated into the fabric of football scouting, coaching, and play-calling. It has achieved this success because it works. Analytics has proven over time that it helps optimize everything from scouting to play design to game management.

What Analytics Doesn’t Do

Again, it should be clear by now that I am very much pro-analytics. In fact, from time to time I engage in some very rudimentary analytics myself. When I wanted to know how DeAndre Hopkins might handle an increase in targets, I checked the data for receivers who saw an increase in targets. (Highlighting the power of analytics, the data presaged the massive uptick in Hopkins' production that coincided with his greatly expanded role.) When I wanted to know how players aged, I checked the data to see how past players had aged.

Examining data is a powerful method for gaining a better understanding of the world around us. And many do a much, much better and more thorough job of it than I do. They have larger datasets, and more experience, and access to a more powerful suite of skills and tools to facilitate their probings. I’m pretty much limited to things like taking averages and occasionally executing a =CORREL() or two in Excel.

My sometime problem with analytics lies not with their application, which is often prudent and measured. Instead, it lies with what comes after, in the “drawing conclusions” stage.

Let’s give an example. Let’s say that I believe there exists some inherent trait that I’m going to call “X-factor”, and I hypothesize that X-factor explains which players break out and which do not. Why is Devonta Freeman suddenly unstoppable? Why did Cordarrelle Patterson fail to build upon his scorched-earth run to end 2013? The difference, I opine, is explained by their respective “X-factor”.

Now, let’s say I get a robust dataset of all players in history, containing things like age, measurables, and first-year stats. And I’m searching through that database for something that predicted which players broke out and which did not. And let’s say after untold time spent searching, I conclude that there’s nothing there. Nothing in my dataset correlated with future breakouts in any meaningful way. With this finding, one of four things can be true.

Possibility #1— X-factor does not exist.
Possibility #2— X-factor exists, but not where I was looking for it.
Possibility #3— X-factor exists where I was looking for it, but my methods were insufficient to detect it.
Possibility #4— X-factor exists, it is where I was looking for it, my methods were sufficient, but I screwed up the analysis.

The first possibility is relatively self-explanatory. If you go looking for something that does not exist, your search will obviously come up empty. So a search that comes up empty could be evidence that something does not exist.

The second possibility, however, is very important. I mentioned that my database contained age, measurables, and first-year stats. Let’s suppose for a second that it was possible to predict who would break out by looking at a player’s splits between the first half and the second half of the season. If that were true, I would not expect my test to discover it, because my database doesn’t contain such splits.

The third possibility is likewise very important. Let’s suppose that I found that players who ran a sub-4.40 forty-yard dash had a 50% chance of breaking out, which was the same rate as the population at large. I might conclude that a sub-4.40 forty did not predict breakouts. But perhaps players who ran a sub-4.40 and did 15 reps on the bench press had a 100% breakout rate, while players who ran a sub-4.40 and had fewer than 15 reps had a 0% breakout rate. In this case, by combining two variables we can get a perfect predictor of player breakouts, but I would never know unless I tested combinations of variables.
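The forty-and-bench example can be made concrete with a few lines of Python. This is only a toy sketch: the eight players, their measurables, and the `Player` and `breakout_rate` names are all invented for illustration, chosen so that the numbers match the hypothetical above.

```python
from collections import namedtuple

# Hypothetical toy data, invented purely for illustration.
Player = namedtuple("Player", ["forty", "bench_reps", "broke_out"])

players = [
    Player(4.35, 20, True),
    Player(4.38, 17, True),
    Player(4.36, 12, False),
    Player(4.39, 10, False),
    Player(4.55, 20, True),
    Player(4.60, 12, True),
    Player(4.52, 18, False),
    Player(4.58, 9, False),
]


def breakout_rate(group):
    """Share of players in the group who broke out."""
    group = list(group)
    return sum(p.broke_out for p in group) / len(group)


# Testing one variable at a time: speed alone looks useless.
fast = [p for p in players if p.forty < 4.40]
overall_rate = breakout_rate(players)
fast_rate = breakout_rate(fast)

# Testing the combination: speed plus bench reps separates the group perfectly.
fast_strong_rate = breakout_rate(p for p in fast if p.bench_reps >= 15)
fast_weak_rate = breakout_rate(p for p in fast if p.bench_reps < 15)

print(overall_rate, fast_rate)           # 0.5 0.5
print(fast_strong_rate, fast_weak_rate)  # 1.0 0.0
```

A test that only compares `fast_rate` to `overall_rate` finds nothing, even though the signal is sitting in the same dataset one split deeper.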

The fourth possibility is largely beyond the scope of my criticism today, but it’s important to at least note the possibility. Sometimes even really, really smart, talented people make mistakes in their analysis.

One of the most famous examples of analytics was a paper by Thomas Gilovich, Robert Vallone, and Amos Tversky that examined the play-by-play data of the 1980-1981 Philadelphia 76ers and concluded that the “hot hand” (the idea that shooters sometimes became “hot” and more likely to hit their shots for stretches) was a myth.

This paper was hugely influential to generations of researchers. According to a new study by Joshua Miller and Adam Sanjurjo, it was also flawed. The technical details of the flaw are pretty hard to understand at first (I have tried to explain them more simply here), but the upshot is they failed to account for a bias in their data; properly accounted for, the data would, in fact, have provided evidence of the “hot hand” in basketball.
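One way to get a feel for this kind of bias is to count it out by hand on a tiny case. The sketch below is my own toy illustration, not the analysis from either paper: it assumes a shooter who makes exactly half of his shots with no hot hand at all, lists every possible short sequence of makes and misses, and averages the "make rate on the shot right after a make" across those sequences. The function name `mean_rate_after_hit` is made up for this example.

```python
from fractions import Fraction
from itertools import product


def mean_rate_after_hit(n_shots):
    """Average, over all equally likely make/miss sequences of length n_shots,
    of the within-sequence make rate on shots that immediately follow a make.
    Sequences with no shot following a make are skipped, since the rate is
    undefined for them."""
    rates = []
    for seq in product((0, 1), repeat=n_shots):
        # Outcomes of the shots taken right after a made shot (1 = make).
        followups = [seq[i + 1] for i in range(n_shots - 1) if seq[i] == 1]
        if followups:
            rates.append(Fraction(sum(followups), len(followups)))
    return sum(rates, Fraction(0)) / len(rates)


print(mean_rate_after_hit(3))  # 5/12, not 1/2
```

Every individual shot here is a true 50/50 proposition, yet the average "after a make" rate for three-shot sequences comes out to 5/12. An analyst who expects 50% as the no-hot-hand baseline is using a baseline that is too high, so shooting that merely matches 50% after a make can actually be a sign of getting hot.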

For now, as I said, I’m setting aside this fourth possibility. Understanding and organizing data is very difficult, and occasionally even brilliant analysts will make mistakes. I’m mostly concerning myself with areas where the analysis was not mistaken and discussing the conclusions that we draw from them.

Drawing Conclusions From Inconclusive Data

I see a lot of quality analytics being performed in fantasy football right now, as smart, numerically-minded, passionate people are putting their skills to work. And they largely do great work, but occasionally I will see someone test a theory, find no evidence, and use this as proof of Possibility #1 without considering Possibility #2 or Possibility #3.

This has been bothering me for a while. I wrote last year about something similar, trying to reconcile a schism in how I thought about and experienced football. On the one hand, I knew that “momentum” did not exist; several studies that I thought were credible looked at the issue and couldn’t find any evidence of the phenomenon.

On the other hand, as a Denver Broncos fan, I distinctly remembered October 15th, 2012. On that night, the 2-3 Broncos traveled to play the division-leading San Diego Chargers. The Broncos had been struggling through the first stages of the season, and those struggles were magnified early as San Diego raced to a 24-0 halftime lead.

Everything that could go wrong did. Denver fumbled a kickoff, leading to an easy San Diego touchdown. Peyton Manning threw an interception that was returned for another touchdown. The Broncos defense, which had held up pretty well, buckled and gave up a third touchdown just before the half.

And then, in the second half, the script was flipped. Denver scored a touchdown on its first drive. On San Diego’s first drive, it crossed midfield and reached field goal range before Philip Rivers was sacked, fumbled, and the ball was returned for a second Denver touchdown. On the next drive, San Diego went three-and-out, punting it back to Denver, which drove the field for a third touchdown.

San Diego’s next drive ended in an interception, and Denver scored another touchdown to take the lead. San Diego’s next drive ended in an interception, too. And their drive after that ended in another interception, this one returned for another defensive score and a 35-24 lead for the Broncos. And San Diego’s final desperation drive was brought to an end with a sack-fumble and another takeaway for the Broncos.

In all, San Diego’s second-half drives ended as follows: fumble returned for touchdown, three-and-out, interception, interception, interception returned for touchdown, fumble. And Denver became the first team in history to trail by 24 points at halftime and still go on to win by double digits.

Now, a part of me understood that “momentum”, as a concept with any predictive weight, was bunkum. But I really had no other way to explain what had happened that night. I understood that if scoring was randomly distributed we should expect to see bizarre runs like that. But I also remember watching the Chargers play, and I remember watching the desperation as their lead dwindled and the mistakes seemed to multiply.
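The claim that pure randomness produces bizarre runs can be checked directly. The sketch below is a toy model, not a model of football: it treats each trial as an independent coin flip with a fixed success chance and computes the exact probability of seeing at least one streak of a given length. The function name `prob_run_at_least` and its parameters are invented for this illustration.

```python
def prob_run_at_least(n_trials, run_length, p=0.5):
    """Exact probability that n_trials independent trials, each succeeding
    with probability p, contain at least one streak of run_length
    consecutive successes."""
    # state[k] = probability of currently sitting on a streak of k successes
    # without having completed a full run yet.
    state = [0.0] * run_length
    state[0] = 1.0
    hit = 0.0  # probability that a full run has already occurred
    for _ in range(n_trials):
        nxt = [0.0] * run_length
        for k, mass in enumerate(state):
            nxt[0] += mass * (1 - p)      # a failure resets the streak
            if k + 1 == run_length:
                hit += mass * p           # a success completes the run
            else:
                nxt[k + 1] += mass * p    # a success extends the streak
        state = nxt
    return hit


# Chance of at least one streak of six in 200 fair coin flips.
print(prob_run_at_least(200, 6))
```

Raising `n_trials` while holding `run_length` fixed pushes the probability toward 1, which is the sense in which long enough stretches of random outcomes should be expected to contain strange-looking streaks.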

Last year when I addressed the topic, I tried to carve out a space for both beliefs to coexist. I identified some beliefs as “predictive”, designed to anticipate what is going to happen next. From a predictive standpoint, momentum is not a thing. On the other hand, I called some beliefs “descriptive”, telling a story about what happened in the past. Descriptively, Denver took the momentum from San Diego that night and never gave it back.

I rather liked the article (it was one of my favorites to write), but I feel like it represented me just starting to grapple with this unease that I’m still wrestling with today. And so now, a little bit more experienced and starting to wrap my arms around the concept more, I want to revisit the topic.

Studies have been done that failed to find evidence for momentum. I had done one myself, and I could find no evidence that teams that overcame a large deficit were more likely to win games than teams that overcame a small deficit. This is fine, insofar as it goes.

The problem is that the authors of some of those studies— and I very much count myself among their number— took that finding and concluded that Possibility #1 must be the explanation. Momentum must not exist.

Many of them failed to consider that momentum might exist, but not where they were looking. Or that it existed where they were looking, but they did not use the proper tools to find it.

In fact, I believe those latter two explanations should be treated as the more plausible of the three. How many players and coaches have discussed momentum as a real force? Who is better placed to tell us what impacts football games than the people who are actually playing the football games? If our method failed to verify their claims, we should assume that the fault is more likely with our method than with their claims.

An Example From Baseball

This new paradigm of mine was inspired largely by an article I recently read about the sabermetrics movement in baseball. Now, the last baseball game I watched was played during the Clinton administration. The last baseball game I played came under George Bush Sr. I would not consider myself a baseball fan.

Regardless, this piece by Matt Corbett, titled “Stop Thinking Like a GM; Start Thinking Like a Player”, is without question the greatest bit of writing I have read about analytics. Especially for a piece that contains very little actual analytics! I know that I often provide an abundance of links in my articles, and many might not click through on every one; I would strongly encourage you to click through and read that article. Or, at the least, bookmark it and return when you have time.

One story from the article really stood out to me, though. It seems that, in the early 2000s, sabermetrics found no evidence of the value of catcher defense, and so it declared that catcher defense had no value (a Possibility #1 claim). The article contains player blurbs from the Baseball Prospectus Annual from various years, largely mocking the very idea advanced by scouts that some catchers might be useful for their defense and advocating that all of those defensive catchers be replaced by players who were better hitters.

In 2011, enabled by better pitch tracking technology, a man named Mike Fast demonstrated statistically that some catchers were better able to catch balls on the edge of the strike zone in a way that encouraged the umpire to call them as strikes. This effect became known as “pitch framing”. And, lo and behold, the best “pitch framers” were the players whom old-school scouts had been praising for their defense for years.

The biggest eyebrow-raiser, though, came in a 2013 player blurb in the Baseball Prospectus Annual that suggested a defensive catcher “owed” Mike Fast for discovering this heretofore unknown defensive ability. The defensive catcher had been employed for years, though, and had made more money before Fast’s research than he had after.

The entire story is one of hubris from start to finish (obscuring some high-quality data analysis). Analytics-types couldn’t find something, so they rejected the idea it existed. When they finally did find it, they tried to claim credit for its discovery.

And really, the entire embarrassing affair could have been avoided if prior peddlers of analytics had been more measured in their conclusions. If they had realized that a failure to find something could mean it didn’t exist, or that they were looking in the wrong place, or that they needed a better microscope.

One Final Admonition

Analytics is a powerful tool, and it is exploding in popularity because it obtains results like we’ve never seen before. It is an exciting way for us to receive knowledge.

But it is not the only way for us to receive knowledge, and we must never lose sight of that fact. If players and coaches tell us that momentum is real, and analytics cannot find evidence of it, we must consider that the fault may lie with our use of analytics.

If players and coaches tell us that sometimes running backs need carries to establish a rhythm, and we find no evidence of backs getting better the more carries they get, we need to be cautious with how we present that data and realize that we have not disproven the concept of “getting into a rhythm”.

If basketball players and coaches tell us they sometimes have a hot hand and become more likely to make their shots, we should assume they know what they're talking about unless and until we have extremely compelling evidence to the contrary.

If baseball scouts and general managers tell us that catchers can be as important for their defensive work as they are for their offensive work, then we should be very careful about stepping out and calling that defensive work useless or nonexistent.

If our analytics tell us coaches should be going for it more on fourth down, and coaches keep refusing to do so, we must assume that perhaps they know something that we don’t. After all, coaches have been performing exceedingly complex optimization tasks since the dawn of professional football.

This isn’t to say that those coaches are necessarily right and the analytics are necessarily wrong (or, more accurately, insufficient). It’s simply to say that we should be much more open to the possibility.


