Understanding Expectation and Variance

When we project Keenan Allen to score 14.5 points in DraftKings’ scoring system, what does that mean?

It doesn't mean that we expect him to score precisely 14.5 points. That’s possible, but it’s very unlikely. Even if 14.5 is more likely than any other specific number, that exact outcome occupies an exceedingly small slice of probability space.

What it means in theory is that if you take each fantasy point total Allen could conceivably get, multiply it by the respective probability of getting that score, and add all of those products up, you'd get a sum of 14.5. (Using the same procedure, we'd project the roll of a six-sided die to produce a value of 3.5, because 1*(1/6) + 2*(1/6) + … + 6*(1/6) = 3.5. Even though the die lacks a side with 3.5 on it, 3.5 is a good projection in the sense that it would be the fair over/under at even odds.)
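To make the procedure concrete, here is a minimal Python sketch of that probability-weighted sum, applied to the die example. The function name expected_value and the list-of-pairs input format are our own choices for illustration, not anything a projection system actually uses.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Return the sum of value * probability over (value, probability) pairs."""
    total_probability = sum(p for _, p in outcomes)
    if total_probability != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(value * p for value, p in outcomes)


# A fair six-sided die: each face comes up with probability 1/6.
die = [(face, Fraction(1, 6)) for face in range(1, 7)]
die_projection = expected_value(die)

print(float(die_projection))  # 3.5
```

The same function would work for a player if we had a probability for every possible point total, which is exactly the information nobody has in practice.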

I say “in theory” because nobody actually does projections that way. If you consult this book’s chapter on projections, you won't see anyone estimating the probability that Keenan Allen will score 0.0 points, and then doing the same for 0.1 points, 0.2 points, and so on all the way up to 60+ points before doing some multiplication and addition to get a projection of 14.5 points.

Rather, 14.5 points represents a decent estimate of his points if the game goes the way we expect—if Allen catches an expected number of passes for an expected number of yards and touchdowns, based on all the factors outlined in the section of this book on Using Projections.

But we can reverse engineer that 14.5-point projection to tell us something about the implied distribution curve comprising all those other possibilities. If you know what a normal distribution is (sometimes called a “bell curve”), the distribution of probabilities implied by a player’s projection will share a number of characteristics with that. (A player’s distribution of point probabilities is not actually a normal curve. A normal curve is laterally symmetrical, but a player’s fantasy-point distribution will be a bit skewed because it extends further to the right than to the left, where it reaches a fairly hard wall at zero. If you want to nerd out, a player’s fantasy-point probability distribution is more like a gamma distribution than a normal distribution.)

For one thing, a player’s fantasy-point probability distribution will generally be unimodal, which is a fancy way of saying that it generally has just one peak. And that peak will generally be roughly equal to the projection itself.

So that means that while it is unlikely that Keenan Allen will score exactly 14.5 points, he is more likely to score 14.5 points than 15 or 16 or 17 points, or than 14 or 13 or 12 points. The further a given point total is from 14.5, the less likely that point total is to occur.

Different players, however, will have differently-shaped distributions even if they have the same projected point total.

In a given week, Keenan Allen and Kelvin Benjamin may both be projected to score 14.5 points. But Kelvin Benjamin’s distribution curve might be relatively tall and skinny while Keenan Allen’s is relatively short and fat. What that would mean is that while both players should score around 14.5 points on average, Benjamin is likely to score between 11 and 18 points, while Allen is likely to score between 8 and 21 points. While both players' projected point totals have the same expectation, Allen’s projection has a greater variance.
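If you'd like to see what two such curves look like in numbers, here is a rough Python simulation. It assumes each player's score follows a gamma distribution (the shape mentioned earlier as a closer fit than a bell curve), and the standard deviations of 3.5 and 6.5 points are made-up values chosen only to contrast a narrow distribution with a wide one. They are not estimates for any real player.

```python
import random
import statistics


def gamma_params(mean, sd):
    """Shape and scale of the gamma distribution with the given mean and sd."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale


def simulate_scores(mean, sd, n=50_000, seed=0):
    """Draw n simulated fantasy scores from the matching gamma distribution."""
    rng = random.Random(seed)
    shape, scale = gamma_params(mean, sd)
    return [rng.gammavariate(shape, scale) for _ in range(n)]


def share_between(scores, low, high):
    """Fraction of simulated scores that land in [low, high]."""
    return sum(low <= s <= high for s in scores) / len(scores)


# Same 14.5-point projection, different (hypothetical) spreads.
narrow = simulate_scores(14.5, 3.5)  # "tall and skinny" curve
wide = simulate_scores(14.5, 6.5)    # "short and fat" curve

for label, scores in (("narrow", narrow), ("wide", wide)):
    print(
        label,
        "mean:", round(statistics.fmean(scores), 1),
        "sd:", round(statistics.pstdev(scores), 1),
        "share in 11-18:", round(share_between(scores, 11, 18), 2),
    )
```

Both simulated players average about 14.5 points, but the narrow one lands between 11 and 18 points noticeably more often than the wide one does.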

Just as any individual player’s projected point total will have an expectation and variance, so will the total projection for any group of players. In fact, the group’s projected total will just be the sum of the individuals' totals. As long as none of the players are playing in the same game (so that their performances are essentially independent of each other), the same is true for variance: you find the group’s variance by summing the variances of the individuals.

Keep in mind that when multiple players are playing in the same game, the variance of the group cannot be calculated as simply. The group’s variance can be greater than or less than the sum of the individual players' variances, depending on how the performances of the individuals are correlated with each other.

For example, a quarterback’s performance and his primary receiver’s performance are positively correlated with each other, meaning that when one does well, the other will usually do well; and when one does poorly, the other will usually do poorly. In this situation, the variance of the two players as a group is greater than the sum of their individual variances.

By the same token, a quarterback’s performance is negatively correlated with that of the defense opposing him. To put it another way, when one does well, it is bad news for the other. When considering a quarterback and the defense opposing him as a group, the group’s variance will be less than the sum of the variances of the component players.
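The standard rule behind all three cases is Var(A + B) = Var(A) + Var(B) + 2 * corr(A, B) * sd(A) * sd(B). Here is a small Python sketch of it. The standard deviations (7 and 6 points) and the correlations (+0.4 and -0.3) are hypothetical numbers picked for illustration, not measured values for any quarterback, receiver, or defense.

```python
def pair_variance(sd_a, sd_b, correlation):
    """Variance of A + B given each standard deviation and their correlation."""
    if not -1 <= correlation <= 1:
        raise ValueError("correlation must be between -1 and 1")
    return sd_a ** 2 + sd_b ** 2 + 2 * correlation * sd_a * sd_b


# Hypothetical standard deviations, in fantasy points.
qb_sd, partner_sd = 7.0, 6.0

independent = pair_variance(qb_sd, partner_sd, 0.0)  # players in different games
stacked = pair_variance(qb_sd, partner_sd, 0.4)      # QB with his own receiver
opposed = pair_variance(qb_sd, partner_sd, -0.3)     # QB with the opposing defense

print("independent:", round(independent, 1))
print("stacked:    ", round(stacked, 1))
print("opposed:    ", round(opposed, 1))
```

With zero correlation the last term vanishes and we are back to simply adding the two variances. A positive correlation pushes the pair's variance above that sum, and a negative correlation pulls it below.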

Here’s something that’s true of variance across all of life’s uncertain activities: for the underdog, variance is friendly. It’s the only thing giving the underdog a chance to win. For the favorite, variance is the enemy. It’s what gives his opponents a chance to beat him.

How can we use that bit of wisdom in our DFS exploits? Consider the difference between cash games and tournaments.

In a cash game, let’s say that we think we'll have to score 150 fantasy points in order to finish in the money, and let’s say that we construct a lineup that is expected to score 166 points. That makes us the favorite! If our expectations are estimated correctly, we'll win more than half the time no matter what. And in fact, if it weren't for variance, we'd win every time. With zero variance and a correctly calculated expectation of 166 points, we'd score 166 points with 100% certainty—never more, never less—and automatically beat our goal of 150. Zero variance is impossible in fantasy football (unless you start only players who are inactive, which we don't recommend), but as long as your expectation is above the projected cutoff to finish in the money, less variance is always better than more variance.

In tournaments, on the other hand, your expectation will nearly always be out of the money. For example, let’s say that we think we'll have to score 180 points to cash in a particular tournament, but our best lineup is expected to score only 166 points. With zero variance in this case, we'd be toast. The only reason we have a chance to finish in the money is because of variance—because of the fact that sometimes we'll score well above 166 points, and sometimes we'll score well below 166 points. It’s the “above” part that we care about here. Even if our team scores only 166 points on average, with a high enough variance, we may score more than 180 points as often as 25% of the time. That will make us money if only 20% of the field gets paid.
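To put rough numbers on the cash-game and tournament cases, here is a Python sketch that treats a lineup's total score as normally distributed. That is a simplifying assumption (as noted earlier, real fantasy-point distributions are somewhat skewed), and the standard deviations of 10 and 20 points are hypothetical values chosen only to compare a low-variance lineup with a high-variance one.

```python
from statistics import NormalDist


def chance_to_beat(cutoff, expected_points, sd):
    """Probability of scoring above the cutoff under a normal approximation."""
    return 1 - NormalDist(mu=expected_points, sigma=sd).cdf(cutoff)


EXPECTED = 166  # lineup expectation used in the examples above

for sd in (10, 20):  # hypothetical low-variance and high-variance lineups
    cash = chance_to_beat(150, EXPECTED, sd)        # cash-game cutoff of 150
    tournament = chance_to_beat(180, EXPECTED, sd)  # tournament cutoff of 180
    print(f"sd={sd}: cash {cash:.1%}, tournament {tournament:.1%}")
```

Under these assumptions, doubling the standard deviation drops our chance of clearing the 150-point cash cutoff from roughly 94% to roughly 79%, while raising our chance of clearing the 180-point tournament cutoff from roughly 8% to roughly 24%.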

So we see that, in cash games, we want a high expectation with a low variance; and in tournaments, we want a high expectation with a high variance.