Hello again!  It’s been a while since my last blog post.  This is going to be a shorter one than my last two, a mini-blog entry if you will.

Have you ever wondered how to calculate the probability of a team winning a game?  An easy way to estimate it is to look at the team’s winning percentage.  But the opponent also has a winning percentage.  Surely the New England Patriots, with, for example, a 0.750 winning percentage in a given year, have a lower probability of beating another team with a 0.750 winning percentage than of beating, say, the Browns.  So how do you incorporate BOTH teams’ winning percentages into an estimate of the probability of a team winning?  To get this estimate, the following formula is used:

P(A wins) = W_A × (1 − W_B) / [W_A × (1 − W_B) + W_B × (1 − W_A)]

Here W_A is your team’s winning percentage and W_B is the opposing team’s.  The probability of the opposing team winning is identical except that the numerator (the top portion of the fraction) is the winning percentage of the opposing team multiplied by one minus the winning percentage of your team.  In other words, you just flip the subscripts A and B in the numerator.

As far as I can tell, the only problem with this formula is that when both teams have a 0% winning percentage, or both have a 100% winning percentage, the result is 0/0, which is undefined.  In the following analysis I came across this problem once, in a game where the home team had a 100% winning percentage at home the previous year and the road team had a 100% winning percentage on the road the previous year.  To work around it, I manually gave both teams a 50% probability of winning.  For more information on this formula, please refer to Win Probability Formula.
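Here is a minimal Python sketch of the formula described above, including the 50% fallback for the 0/0 case.  The function name and the example percentages are my own illustration (the 0.250 opponent is hypothetical, not an actual team record).

```python
def win_probability(pct_a: float, pct_b: float) -> float:
    """Estimate the probability that team A beats team B.

    pct_a and pct_b are winning percentages expressed as fractions (0.0 to 1.0).
    """
    numerator = pct_a * (1 - pct_b)
    denominator = numerator + pct_b * (1 - pct_a)
    if denominator == 0:
        # Both teams at 0% or both at 100%: the formula gives 0/0,
        # so fall back to an even 50% chance.
        return 0.5
    return numerator / denominator


# Two 0.750 teams are an even matchup.
print(win_probability(0.750, 0.750))  # 0.5
# A 0.750 team against a hypothetical 0.250 team.
print(win_probability(0.750, 0.250))  # 0.9
```

Note that the two teams’ probabilities always add up to 1, since both share the same denominator.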

With a method for calculating the winning probability in hand, I went about creating two types of analysis:

[Top graph: Predicted vs. Actual Wins]

[Bottom graph: Actual Wins vs. Diff. Sched. Wins]

For the top graph, I used home and away winning percentage data from 2011 to 2015 and the formula outlined above to predict each team’s record in a given season from the previous season’s winning percentages and the current season’s schedule.  For example, I applied each team’s 2015 home and away winning percentages to its 2016 schedule and used the formula to get predicted wins for 2016 based solely on the previous year’s winning percentages.  Then I compared those predicted wins with the actual wins in 2016 by plotting predicted wins (X-Axis) against actual wins (Y-Axis) and running an R Squared analysis.  What I hoped to measure was how much of the variation in wins in a given season is predicted by the previous season’s results.  The resulting number wasn’t insignificant, but it wasn’t large either: the R Squared is about 10%.  In other words, about 10% of the variation in a team’s wins can be explained by its results in the previous season.
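The prediction step can be sketched in Python as follows.  I am assuming here that a team’s predicted wins are the sum of its per-game win probabilities, with the home team’s home percentage matched against the road team’s road percentage.  The team names, percentages and two-game schedule are made up purely for illustration.

```python
def win_probability(pct_a, pct_b):
    """Win probability formula, with a 50% fallback for the 0/0 case."""
    numerator = pct_a * (1 - pct_b)
    denominator = numerator + pct_b * (1 - pct_a)
    return 0.5 if denominator == 0 else numerator / denominator


def predicted_wins(team, schedule, home_pct, away_pct):
    """Sum per-game win probabilities over a team's schedule.

    schedule is a list of (opponent, is_home) tuples; home_pct and away_pct
    map each team to its previous-season home and road winning percentage.
    """
    total = 0.0
    for opponent, is_home in schedule:
        if is_home:
            total += win_probability(home_pct[team], away_pct[opponent])
        else:
            total += win_probability(away_pct[team], home_pct[opponent])
    return total


# Hypothetical previous-season percentages (not real data).
home_pct = {"A": 0.75, "B": 0.50, "C": 0.25}
away_pct = {"A": 0.50, "B": 0.50, "C": 0.25}

# Team A hosts B, then travels to C.
schedule_a = [("B", True), ("C", False)]
print(predicted_wins("A", schedule_a, home_pct, away_pct))  # 1.5
```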

The bottom graph compares a given season’s actual wins with the wins predicted using the following season’s schedule.  For example, I took each team’s actual wins from 2015 and compared them to the wins projected using the 2016 schedule.  The logic behind this analysis is to isolate the importance of the schedule.  Once again, I did an R Squared analysis.  As one would expect, the R Squared is very high at 93.5%.  In other words, 93.5% of the variation in the 2016 projections can be explained by the 2015 home/away win percentages, which means that as much as 6.5% of the variation can be explained by the differences between schedules, although there are potentially other factors, such as randomness in play.
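For anyone who wants to reproduce the R Squared step, here is a minimal Python sketch.  It assumes a simple straight-line fit between the two sets of win totals, in which case R Squared equals the squared correlation between them.  The function name and the sample numbers are illustrative, not the data from the graphs.

```python
def r_squared(x, y):
    """R Squared of a simple linear fit of y on x (squared Pearson correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("R Squared is undefined when either series is constant")
    return covariance ** 2 / (var_x * var_y)


# Made-up win totals that line up perfectly, so R Squared is 1.
predicted = [1, 2, 3, 4]
actual = [2, 4, 6, 8]
print(r_squared(predicted, actual))  # 1.0
```

The share of variation left unexplained is simply one minus this value.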

And that’s it for the mini-blog post.  I hope you enjoyed it!

My sources are:

Win Probability Formula

Team Schedules

Total Record

Home/Away Record
