**Information on this site is collected from outside sources and/or is opinion and is offered "as is" without warranties of accuracy of any kind. Under no circumstances, and under no cause of action or legal theory, shall the owners, creators, associates or employees of this website be liable to you or any other person or entity for any direct, indirect, special, incidental, or consequential damages of any kind whatsoever. This information is not intended to be used for purposes of gambling, illegal or otherwise.**
In Part 1 of this series I discussed the prevalence of statistical models in ratings systems for team strengths in things like Chess, Halo, etc. In Part 2, I introduced a simple “Margin-of-Victory” style model for NFL game spreads, and I discussed how we “fit” the model, and then generated power-ratings with the model, using the results from this year up to week 13. I then used these to generate predictions for the Week 14 games. In Part 3, I discussed using normal distributions to model each team’s variation in performance.
In Part 4, we will look at how to model game outcomes and estimate win probabilities, both against the spread or straight up. I know that in Part 3 it may have been unclear why I spent so much time talking about the normal distributions and bean machines, and where all of that was going. This post should (I hope) clarify all that.
NOTE: I’m providing an Excel spreadsheet to accompany this post. You don’t need it to follow anything in this post. But I’ll be demonstrating some things in Excel, and it may be helpful—especially if you are a hands-on learner—to go ahead and download it here so you can follow along (and see exactly what I did) when I refer to it later.
UPDATE: Here is an updated version of the spreadsheet. It doesn't go along with the examples used in the post. However, I've updated it to correspond to a real-world scenario (tonight's game); it incorporates the home-field advantage and team ratings (for the Saints and Falcons) from Sagarin's "Pure-Point" model, the Vegas spread on the game, and more realistic variance estimates.
The title of this post (and the corresponding figures) gives a rough outline of the starting and endpoints of what we’ll be covering in the post: We will be starting with the simple Margin-of-Victory model that was introduced a couple posts back, and showing how this model can be used to estimate win probabilities for any matchup, either straight up, or against the spread.
The Story So Far
Let’s review everything we’ve covered so far as quickly as possible, just to refresh everyone’s memory.
This is the equation that I gave in Part 2, to describe our Margin-of-Victory model:
Points_HomeTeam – Points_AwayTeam = Rating_HomeTeam – Rating_AwayTeam + Homefield-Advantage + Error
Essentially everything you need to understand about the model can be captured by a couple simple graphs, which are shown below.
Note that to keep things from getting too messy, I’m only showing 4 of the 32 teams here (furthermore, these aren’t the true estimates of these teams’ ratings; I chose these values for illustrative purposes).
Team Power Ratings (top graph)
Our model assigns each team a rating. This rating is a single real-numbered value. Teams with positive ratings (Rating > 0) are better than an average team; teams with a negative rating are worse than an average team.
Ratings are on the same scale as points. What this means—in terms of understanding a game outcome—is that a team with a rating of +6 is expected to beat a team with a rating of +1 by 5 points. All team ratings are constrained so that the average rating is 0. This gives the team ratings some inherent meaning outside of specific matchups; a team with a +6 rating would be expected to beat an average team by 6 points on a neutral field.
So, in the image above, the distance between any pair of teams would give an estimate of the outcome of a game between the two teams (on a neutral field). And the Packers and 49ers are above-average teams, while the Seahawks and Vikings are below-average teams.
Our model also learns a homefield advantage, which is on the same scale as ratings and points, so can simply be added to expected game outcome, expressed in terms of: PointsHome – PointsAway. So if the value of homefield equals 3 points, a team rated +6 would be expected to beat a team with a +1 rating by 8 points at home, 5 points on a neutral field, and 2 points on the road.
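Although this series does its computations in Excel, the arithmetic above is simple enough to sketch in a few lines of code. Here is a minimal Python illustration (the function name and example ratings are mine, chosen to match the +6 vs. +1 example):

```python
def expected_margin(home_rating, away_rating, homefield=3.0):
    """Expected point differential (home minus away) under the model."""
    return home_rating - away_rating + homefield

# A +6 team hosting a +1 team, with a 3-point homefield advantage:
print(expected_margin(6, 1))     # wins by 8 at home
print(expected_margin(6, 1, 0))  # by 5 on a neutral field
print(expected_margin(1, 6))     # -2: the +1 team, now at home, loses by 2
```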
Team Performance as normal distributions (bottom graph)
To account for the fact that each team’s performance will vary across games, we model each team using a normal distribution. This is illustrated in the bottom plot of the figure above.
The mean (or center) of each team’s normal distribution is equivalent to its rating. For each team, I’ve shown the team rating using a dashed vertical line, and the team’s rating distribution (which captures the variability in the team’s performance across games) using a solid line; this solid line corresponds to a normal distribution.
This is how we use normal distributions to model team ratings:
When describing a normal distribution, we only need two parameters: a mean (μ) and a variance (σ²). With these parameters we can then define any normal distribution using the notation:
Normal(μ, σ²)
- The μ parameter is the mean of the normal distribution; its value is the location of the center or “peak” of the distribution. The mean for each team (μ_TEAM) is its team rating.
- The σ² parameter is the variance of the normal distribution; it captures how “spread out” the distribution is, or the amount of uncertainty in the distribution. The higher the variance, the more likely it is that the distribution will generate values that are further away from the mean. In terms of our model, higher variances would indicate that teams’ performances vary more from game to game.
If we increased the variance of the distributions in the figure above, there would be more of an overlap between the team distributions, and if we decreased the variance there would be more separation between the distributions.
In fact, if we started with the plot on the bottom and kept decreasing the variance of the normal distributions, the width of the distributions would shrink until eventually the bottom plot would become equivalent to the top plot. When the variance = 0, the only value with any probability is the mean (μ) itself—and this would no longer be a “normal distribution” but instead a Delta Function.
That is the heart of it. Now begin in the middle, and later learn the beginning; the end will take care of itself.
- Harlan Ellison
I realize that each part of this series has introduced more concepts, and if this sort of thing is new to you it may feel a bit overwhelming. So let me make a couple points before moving on:
I’ve been doing my best to (1) assume no background knowledge on the part of the reader, and (2) keep each post as self-contained as possible. But that doesn’t mean everything is going to instantly make sense in your head. In fact, I really wouldn’t expect it to (it took me a long time doing statistical modeling, before I really started to feel comfortable with this sort of thing). So I don’t expect everything to have totally clicked in your head yet.
That said, I think that the best thing to do is to push forward. Even if you only have the general gist of what we have covered so far, you can probably get a good feel for what we’ll be covering in this post: that is, how we can model game outcomes, and use this to understand win probabilities.
This is why I’ve put the Harlan Ellison quote above. We’re going to be covering stuff that draws on material from previous posts. But, you shouldn’t feel like you need a full handle on everything from the previous posts (or even that you need to have read them) to follow this post; learning doesn’t always happen in a totally linear fashion.
So let’s move forward, getting into the middle of things now. But don’t sweat the details if it isn’t all immediately clear; once you have the broader picture of how this all works, I’ll bet that earlier pieces that didn’t totally click for you will start to fall into place.
Here, we are going to make the key step in moving from modeling individual teams, to modeling game outcomes.
To make the ideas here more concrete, let’s use a single example of a game:
Example: Modeling the outcome of a single game
Suppose that we are modeling the outcome of a game between the 49ers and the Seahawks. In order to help define the game, we will say that the 49ers are the “home team”, but to keep things simple we will assume that there is no homefield advantage (i.e., we will set the value: Homefield-Advantage = 0 ).
What “modeling the outcome” of a game means is that we want to consider the relative probabilities of all of the different possible game outcomes, in terms of score differential (more explicitly: we want to model game outcomes using a probability distribution).
It takes two steps to get our probability distribution for the game outcome: (1) define the matchup, and the teams’ rating distributions, and (2) use these distributions to estimate a distribution of game-outcomes.
Step 1: Define our Game
What we start with is the normal distributions that describe each team. Normally, we would first “fit” the model (i.e., learn all the parameter values from data) and then use these values to define our game. I’ll be demonstrating how to fit the model in Excel in our next post; for now, I’m just going to assume some parameter values that will make the example as straightforward as possible.
So, for this example we will say that the 49ers have a rating of +4, and the Seahawks have a rating of -2.
We will also assume the following:
(1) All teams’ variances are equal to 4
(2) That homefield-advantage = 0.*
* Note that ignoring homefield-advantage is completely for the purpose of keeping the example as simple as possible; it is very easy (and important) to account for a homefield advantage, but here it would just be a nuisance. For example, if we did include homefield advantage and set its value equal to +3, the equivalent game could be defined by simply lowering the 49ers rating by 3 points. *
So, we have two teams. For each team we have a rating and a variance. These define a normal distribution for each team, which captures how their performance varies across games.
As a reminder, we can use the shorthand notation to describe these two teams’ normal distributions.
Rating_49ers ~ Normal(+4, 4)
Rating_Seahawks ~ Normal(-2, 4)
[Read as, e.g.: “The Seahawks rating is normally distributed, with a mean of -2 and a variance of 4”]
Now that we have defined our game we are ready to model the outcome of the game. Since the 49ers are our “home team”, this means that we want to estimate the following:
Points_49ers – Points_Seahawks = Game Outcome
We already know how to get a single value (a point estimate) of the game outcome. But now we want to think about this in terms of a probability distribution; we want to know the probabilities of all possible game outcomes.
The reason we need the probabilities of different game outcomes is that it allows us to estimate the probability of, e.g., winning a specific bet (e.g., the probability that the 49ers beat the spread).
Step 2: Model the game outcomes
Let’s quickly think about what it means to model the outcome of the game. The most intuitive way of thinking about this may be in terms of simulating game outcomes.
Step 2; Version 1: Simulating game-outcomes
Let’s imagine that we simulate a bunch of games, and look at the distribution of outcomes. That is: we will “sample” a value from each of the teams’ normal distributions (think of drawing a sample from a “bean machine” if this helps visualize it), and use these to get a bunch of simulated game outcomes.
Each sample from a team’s normal distribution gives us that team’s performance level (rating) for the specific game we are simulating. And to determine how these sampled ratings translate to point-differentials, we simply plug them into that original equation:
Points_HomeTeam – Points_AwayTeam = Rating_HomeTeam – Rating_AwayTeam
So suppose we start randomly sampling from both teams’ rating distributions, and computing the outcomes for each simulation. In the figure below, I’ve done just that:
After repeating this process for a while, we would have a bunch of samples of the outcomes that our model generates. We could then use all these samples to estimate different probabilities; for example, we can estimate the probability that the 49ers win by counting the proportion of our samples that have a positive value, or we can estimate the probability that the 49ers beat the spread by counting the proportion of samples with values greater than the spread.
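If it helps to see the sampling procedure concretely, here is a rough Python sketch of the simulation just described (the companion spreadsheet does the same thing in Excel; the parameter values are the ones assumed in this example):

```python
import random

def simulate_outcome(home_mean, away_mean, home_var, away_var):
    """Sample one game outcome: draw each team's performance from its
    rating distribution, then take the difference (home minus away)."""
    home_perf = random.gauss(home_mean, home_var ** 0.5)
    away_perf = random.gauss(away_mean, away_var ** 0.5)
    return home_perf - away_perf

random.seed(0)  # for reproducibility
outcomes = [simulate_outcome(4, -2, 4, 4) for _ in range(100_000)]

# Proportion of positive outcomes estimates P(49ers win);
# proportion above 4 estimates P(49ers beat a 4-point spread).
p_win = sum(o > 0 for o in outcomes) / len(outcomes)
p_cover = sum(o > 4 for o in outcomes) / len(outcomes)
print(round(p_win, 2), round(p_cover, 2))  # roughly 0.98 and 0.76
```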
Luckily, there is a much easier and more direct way to estimate the outcome probabilities: we can simply define the game outcome using a probability distribution.
Step 2; Version 2: Directly defining the probability distribution of the game outcome
Here, what we want to do is to define a probability distribution of game outcomes, just as we defined the distribution of team Ratings. In other words, we want a probability distribution that gives the relative probability of all possible game outcomes.
This happens to be particularly easy in our case—thanks to a convenient property of the normal distribution:
The difference between two samples generated from normal distributions is itself normally distributed!²
In other words, since each individual team’s rating is sampled from a normal distribution, and the game-outcome is simply the difference between the teams’ ratings, the game outcome can be modeled using a normal distribution.
Furthermore, the parameters of the normal distribution that defines the game outcome are extremely easy to compute directly from the parameters of the team distributions. These are the rules for computing the mean and variance of the game outcome from the team distributions:
- The mean (μ) of the game outcome = [mean of home team] – [mean of away team]
- The variance (σ²) of the game outcome = [variance of home team] + [variance of away team]
Put simply: the mean of the distribution of outcomes equals the difference of the team means. The variance of the game outcomes is the sum of the individual team variances (and in our case, since all team variances are equal, this is simply two times the team variance).
Applying these rules to our example game:
- μ_OUTCOME = μ_49ers – μ_Seahawks = 4 – (-2) = 6
- σ²_OUTCOME = σ²_49ers + σ²_Seahawks = 4 + 4 = 8
That’s all there is to it. We can now define the normal distribution for game outcomes:
Game Outcome ~ Normal(6, 8)
Remembering that game outcome is expressed in terms of point differential:
Game Outcome = Points_49ers – Points_Seahawks
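The two combination rules above are easy to sketch in code. Here is a minimal Python version using the standard library's NormalDist (my choice of tool for illustration; the post itself works in Excel):

```python
from statistics import NormalDist

def outcome_distribution(home, away):
    """Combine two independent team distributions into the game-outcome
    distribution: subtract the means, add the variances."""
    mean = home.mean - away.mean
    variance = home.stdev ** 2 + away.stdev ** 2
    return NormalDist(mean, variance ** 0.5)

niners = NormalDist(mu=4, sigma=4 ** 0.5)     # rating +4, variance 4
seahawks = NormalDist(mu=-2, sigma=4 ** 0.5)  # rating -2, variance 4

game = outcome_distribution(niners, seahawks)
print(game.mean, round(game.stdev ** 2))  # 6.0 8, i.e. Normal(6, 8)
```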
Example Game: Team Distributions and Game Outcome Probabilities
Now, let’s look at plots of the probability for the teams and the game outcomes, to see how this all comes together:
So, it should hopefully be pretty easy to see how we get from the first picture to the second. The mean of the game outcomes is equal to the 49ers’ average rating minus the Seahawks’ average rating. The variance of the game outcome is larger than the variance of the team distributions (2x larger, to be precise).
Since the notion of “variance” is less straightforward than the mean, here’s an example to help you understand why the game-outcome is going to have more variance than the individual teams.
Think of modeling each individual team’s performance as a single die. If this were the case, each team could only generate the six values between 1 and 6. Now imagine that we model the game’s outcome using the sum of the values of the two dice (this also works using the difference, but let’s use the sum since everyone is familiar with it). Obviously, there are more possible outcomes for the sum of the dice than for each individual die (11 rather than 6, since two dice can sum to anywhere from 2 to 12). This same principle is at work when we are working with the difference (or sum) of normal distributions.
In both cases, most of the time, you aren’t going to get an extreme value. The roll of two dice will on average sum to seven (which is simply the sum of the average value on each individual die), just as the difference between two Normals will on average be equal to the difference of the individual normal distributions’ means.
However, because we now have two randomly varying values, in rare cases, you will get extreme values from both distributions, which is what causes the increase in variance. With the sum of two dice rolls, you get extreme values (e.g., 2 or 12) when both dice are either high or low.
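The dice analogy is easy to check empirically. Here is a quick Python sketch of my own (not from the post) that simulates both cases and compares the variances:

```python
import random
from statistics import variance

random.seed(1)
one_die = [random.randint(1, 6) for _ in range(100_000)]
two_dice = [random.randint(1, 6) + random.randint(1, 6) for _ in range(100_000)]

# The variance of a single fair die is 35/12 (about 2.92);
# the sum of two independent dice has twice that variance (about 5.83).
print(round(variance(one_die), 2))
print(round(variance(two_dice), 2))
```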
Now, looking at the pictures above, you can see the equivalent situation for the normal distributions that would lead to extreme values. If the 49ers perform above their average, and the Seahawks perform below their average, we would get a game outcome off on the far right of the outcome distribution (say, greater than 12). On the other hand, if the 49ers performed well below average and the Seahawks performed well above average, the outcome would be on the far left end of the distribution (say, less than 0).
Game Outcome Distributions: Point Spreads and MoneyLines
Once we have defined the probability distribution that represents the game outcome in terms of point differential, it is straightforward to compute the probability of any outcome you are interested in.
For obvious reasons, the two probabilities that people are most interested in are (1) the probability that the outcome is greater or less than a specific value (i.e. the point-spread), and (2) the probability that each team will win/lose the game (corresponding to the “moneyline”).
To do this, let’s look again at the distribution of outcomes for our example game:
It’s fairly straightforward to think about the win probabilities here, since values greater than zero correspond to an outcome in which the 49ers win, and values less than zero correspond to outcomes in which the Seahawks win. In other words, the probability of, e.g., the 49ers winning equals the probability that this distribution generates a positive value.
It’s not much harder to think about how we could use this model to estimate the probability of beating the spread. For example, suppose that the spread of this game had the 49ers favored by 4. It should be clear that in that case we would want to pick the 49ers, since the mean of the distribution is actually 6. And the probability of beating the spread would be equal to the probability that this distribution generates a value greater than 4 (for a bet on the 49ers) or less than 4 (for a bet on the Seahawks).
So, the question now is: how do we compute these probabilities from the game outcome distribution?
The relationship between the spread and the moneyline
We can use this same idea of the distribution of game outcomes to understand the deep connection between the “moneyline odds” and the probability of “beating the spread”. If we had to estimate what the moneyline odds should be, we simply consider the probability of the outcome being less than or greater than 0. The probability of beating the spread is essentially the same thing, just instead of comparing this distribution to the value 0, we compare the distribution to the value of the spread.
To make this connection clear, let’s think about things in terms of our odds of winning a bet, or our “win-odds”. Imagine that all we care about is whether or not the game outcome is greater than or equal to the dark vertical line (currently at “0”). If we are making a money-line bet on the 49ers, this means that we are betting the 49ers to win straight-up, so the “win-odds” of this bet are equivalent to the probability of the distribution generating a value greater than 0.
Now, what if we are betting against the spread? To think about this, just imagine that we slide the dark vertical line either to the left or to the right. If the 49ers were favored by 6, we would slide the dark line over to the value +6. Our “win-odds” on a bet for the 49ers against the spread would then be equal to the odds of generating a value greater than 6 from the distribution of game-outcomes. This is equal to 50%, since the mean of this distribution is 6. And, since the house takes a cut of your winnings, this is obviously a bad bet (unless you make the bet against a friend, in which case it’s a fair bet, just not a good bet per se).
However, what if the 49ers were only favored by 4? To get a picture for how to think about this, imagine that we slide that dark vertical line to +4, which I’ve done for this picture:
From the picture, it’s clear that a bet on the 49ers against a spread of 4 would have a greater than 50% chance of winning here. But, just how good are our chances of winning that bet? In other words, how do we compute the probability of the outcome being greater than or less than a specific number?
Side-Note: What's a good bet?
It’s worth pointing out here that with the standard Vegas odds on a bet against the spread, you need to win 11 out of every 21 bets you make to break even (or about 52.4% of the time). To see this, imagine we were making a series of bets against the spread, in increments of $110:
The standard odds (in sportsbook terminology) of “-110” means that you win $100 on a bet of $110. So, if we lost 10 straight $110 bets, we’d be down a total of $1,100. To make this money back, we’d have to win 11 straight $110 bets, since we only get back $100 on each bet.
In terms of our model, this means that we need to have at least a .524 probability of the game outcome being on the side of the spread we pick, in order to break even.
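This break-even arithmetic generalizes to any American odds. A small Python sketch (the function is a hypothetical helper of my own, not part of the post's spreadsheet):

```python
def break_even_probability(american_odds):
    """Win probability needed to break even at the given American odds."""
    if american_odds < 0:
        risk, win = -american_odds, 100  # e.g. -110: risk $110 to win $100
    else:
        risk, win = 100, american_odds   # e.g. +150: risk $100 to win $150
    return risk / (risk + win)

print(round(break_even_probability(-110), 4))  # 0.5238, i.e. 11 wins per 21 bets
```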
Computing outcome probabilities
First, let’s think about the indirect method we discussed previously: simulation. We could generate a bunch of values from the distribution of game outcomes, and use the proportion of values that were less than or greater than zero to estimate each team’s win probability (or greater than or less than the spread to estimate your odds of beating the spread).
Although this is an indirect method, it is a totally valid estimation method. Furthermore, it is fairly easy, since we can generate tens of thousands of random samples from a normal distribution extremely fast on a computer. In the Excel file that serves as a companion to this post, I’ve implemented this exact simulation for our example game (note that there are three “sheets” in the Excel file; the sheet labeled “GAME_OUTCOME_SIMULATION” is the one I’ll be discussing here).
This file simulates the game we’ve been talking about 2500 times. For each game it samples a random value for each team’s rating from its rating distribution. It also computes the proportion of simulated games which result in (1) the 49ers winning outright, and (2) the 49ers beating a spread of 4.
You can adjust various parameter settings on the left to see how it affects the game simulations. The parameters that you can change without things getting all screwy are the mean and variance for either of the teams, and the game spread. Essentially, everything in the big table on the right deals with simulating the games (i.e., generating samples from the team distributions), and the parameters and outcome summaries are in the 4 tables on the left.
Since simulation is not the method I’m recommending, I’m not going to discuss all the details of that spreadsheet. However, I’ve done my best to label things so that it is fairly clear what everything is doing, if you wish to look at it.
The reason that I’m not recommending this method is that there is a much easier and more direct way to do this: we can directly compute the probabilities from the normal distribution of the game outcomes.
Look at the following images:
The top image corresponds to the “moneyline” betting scenario discussed above, and the bottom image corresponds to the betting “against the spread”. In each image, the area under the curve for the region of interest (shown in red for the 49ers, and blue for the Seahawks) corresponds to the win probabilities for those teams. So rather than simulating games, all we need to do is compute the area of these regions.
Ok, so I know that “calculating the area under the curve” tends to bring to mind bad memories in many people, myself included. So let’s make something clear: we will not be doing any calculus here (not directly anyway). No integrals, no derivatives.
Computing win probabilities directly from the normal distribution
NOTE: The second sheet of the Excel file, labeled “DIRECT_COMPUTATION” goes along with this section.
As I said, the way to directly compute the probabilities for any outcome of interest here is by computing the area under the curve for the region you are interested in.
So, for example, suppose that we want to compute the probability of the Seahawks beating the spread of 4. What we want to do in this case is compute the total probability of the game-outcome distribution generating a value smaller than 4 (i.e., the probability that Points_49ers – Points_Seahawks will be less than 4). The way we calculate this is by computing the area under the normal distribution of the game outcome, for the region to the left of 4 (corresponding to the blue region in the 2nd plot of the last figure).
The reason we don’t need calculus to do this is that the normal distribution is extremely common, so there are plenty of tools out there that will do this for you. And, in fact, if you’ve ever taken an intro to stats class you’ve probably looked up a “significance level” in a big table like this, which is just a giant table of values based on integrating a normal distribution. We will do something a little more sophisticated than looking at a table; we can compute the area under the curve of a region of a normal distribution in Excel, using the following command:
= NORMDIST(x, μ, σ, TRUE)
In English, this command essentially means: “compute the area under the curve of a normal distribution (with parameters μ and σ) in the region to the left of x”. If you have some familiarity (or faint memory) of calculus, this is equivalent to saying “integrate a normal (with parameters μ and σ) from negative infinity to x”.
In the language of Excel, you can think about the command I’ve given above as follows:
- The parameters μ and σ are the parameters of the normal distribution.**
- The x parameter is used to define the region of interest: if you set it to 0, it will give you the area in blue for the first plot above (the probability that the Seahawks win the game), and if you set it to 4, it will give you the blue region for the second plot (the probability of the Seahawks beating the spread).
- The “TRUE” argument is used to indicate that we wish to integrate the normal distribution. The only other valid value to use here is “FALSE”, which will give you the height of the normal distribution at that location (which really isn’t useful here except for plotting the distribution).
** Note that Excel uses σ and not σ² to define the normal distribution. The parameter σ is the “standard deviation” of the normal, which is just the square root of the variance σ². **
I’ve used this Excel command to directly compute all of the win-probabilities we have discussed for our example game on the second sheet of our Excel file. So in that file, you can see the exact commands I’ve used to compute any of the regions from the plots above.
As a single example of how to use this Excel command, enter the following in any cell of a spreadsheet to compute the probability that the Seahawks beat the spread in our example game:
= NORMDIST(4, 6, 2.83, TRUE)
where x=4 because we are interested in the region to the left of the spread of 4; 6 is the mean of the game-outcome distribution, and 2.83 is the standard deviation of this distribution (i.e. the square root of the variance, 8).
To compute the region of interest for the 49ers, you just need to subtract the Seahawks win probability from 1. This is because the total area under the normal distribution (or any probability distribution) is equal to 1. Alternatively, we know that one of the two teams has to win, so that:
Win-Probability_49ers + Win-Probability_Seahawks = 1
which can be re-written as:
Win-Probability_49ers = 1 – Win-Probability_Seahawks
In other words, subtracting the area of the blue region from “1” is equivalent to computing the area of the red region.
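If you want to check the NORMDIST arithmetic outside of Excel, the same computation can be done with Python's standard-library NormalDist (a sketch of my own; the post itself only uses Excel):

```python
from statistics import NormalDist

# Game-outcome distribution: mean 6, variance 8 (standard deviation ~2.83)
game = NormalDist(mu=6, sigma=8 ** 0.5)

# Area to the left of the spread (4) = P(Seahawks beat the spread),
# the same quantity NORMDIST(4, 6, 2.83, TRUE) computes in Excel.
p_seahawks_cover = game.cdf(4)
p_niners_cover = 1 - p_seahawks_cover  # the two regions sum to 1
print(round(p_seahawks_cover, 2), round(p_niners_cover, 2))  # 0.24 0.76
```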
To see all of the outcome probabilities (and the command used to compute each of them), just look at the table labeled “outcome probabilities” in the spreadsheet.
A brief summary, and look at what's ahead
So, to briefly sum things up:
- Starting with the rating distributions for two teams, we looked at how one can compute a probability distribution over possible game outcomes (both through simulation, and by directly computing the normal distribution over outcomes).
- We then looked at how you can use the probability distribution over game outcomes to compute each team’s win-odds, both against the spread or straight up (i.e., on a moneyline wager).
In the next post: I will give a tutorial on fitting this model using Excel. And I will show how we can then apply the concepts from this post to get actual estimates of win probabilities for the following week's games.
A Final Word Of Caution About Variance
For the sake of illustration, I’ve assumed that all team variances were equal to 4 throughout this entire post. But—and this is extremely important to understand—the actual variance of teams (or of game outcomes) is much larger: based on the model I fit in Part 2 of this series, the estimated variance of the distribution for game outcomes was about σ² = 140 (corresponding to a variance of 70 for each team if we assume all teams have equal variance).
And, keeping in mind that variance is a measure of uncertainty (or error), the upshot of all this is that there is much more uncertainty about the outcomes of actual NFL games than in the example I’ve used in this post.
To give a sense of how this increase in variance changes our estimates of outcome probabilities: in our example game the estimated 49ers win-probability drops from .983 to .694 on a money-line bet, and from .760 to .567 against the spread. You can see this for yourself by setting the values of the team variances to 70 on the Excel spreadsheet.
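To reproduce those numbers outside of the spreadsheet, here is a quick Python check (using the standard library's NormalDist; the helper function is my own sketch):

```python
from statistics import NormalDist

def win_probabilities(mean_diff, total_variance, spread):
    """Return (P(favorite wins outright), P(favorite beats the spread))."""
    game = NormalDist(mean_diff, total_variance ** 0.5)
    return 1 - game.cdf(0), 1 - game.cdf(spread)

# Toy variance from the example (4 per team, 8 total):
print([round(p, 3) for p in win_probabilities(6, 8, 4)])    # [0.983, 0.76]
# Realistic variance (70 per team, 140 total):
print([round(p, 3) for p in win_probabilities(6, 140, 4)])  # [0.694, 0.567]
```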
Or, to put this into pictures, here are more realistic versions of some of the figures we’ve looked at in this post (i.e. when we set team variances to 70).
1 There is a simple way to estimate the team variance from our model. But this is irrelevant to the concepts I cover in this post. In the next post (where we will be fitting the model in Excel) we'll also look at how to estimate the team variance in this model.
2 This is a somewhat rough description of this property. First, I phrased it to be specific to the fact that we are thinking about the difference, rather than the sum, of the scores (it applies to both). Secondly, for this property to hold, the two distributions need to be independent. What this basically means is that the two distributions do not interact. For example, in our model this means that the 49ers rating-distribution is the same no matter what team they are playing. Since this is precisely one of the assumptions in the model, I didn’t want to go into this issue in depth. Whether this assumption is a reasonable one is a wholly different issue (and something we can examine later), but it is not something we need to worry about for now. Note that, if you are unfamiliar with modeling, this assumption may feel very wrong, and in fact you probably have the right instinct. But, for now, just trust me when I say: although this assumption of independence is almost certainly incorrect, there are lots of good reasons to use it (one being that it is a way to deal with the fact that there is very little data for our model, due to the NFL season being so short).