Hello again, this time the break wasn’t as long!

The second part in this series looked at factors that affected average fantasy points.  The variables were continuous, meaning they can take any of an infinite number of values within a range.  This time I decided to run a somewhat more complicated regression and test categorical variables (an example of a categorical variable is gender).  I personally think that the best categorical variables to test are dichotomous.  In other words, the variable has exactly two categories that are distinct from each other.  The categorical variables that I chose were type of stadium (dome/open) and type of weather (inclement/clear).  As a reminder, I classified fog, rain, sleet, and snow as inclement weather.  I will come back to this a bit later on.  First, I will explain the regression I ran.

Categorical variables are represented in a regression as dummy variables.  For gender, either male or female is chosen as the dummy variable, and the other category is omitted and serves as the reference group; one category always has to be left out to make the math work.  If male is chosen as the dummy variable, each observation is coded as 1 (for male) or 0 (for female).  A good example of dummy variables can be found on this linked site.
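To make the coding concrete, here is a minimal sketch of how the two categorical variables in this post could be turned into 0/1 dummies.  The helper name and the sample values are made up for illustration, and I'm assuming "dome" and "inclement" are the categories coded as 1, which leaves "open" and "clear" as the omitted reference categories.

```python
# Hypothetical helper and sample values, for illustration only.
# Assumption: "dome" and "inclement" are coded as 1, so "open" and
# "clear" are the omitted reference categories.
def to_dummy(value, category_coded_as_one):
    """Return 1 if the observation falls in the chosen category, else 0."""
    return 1 if value == category_coded_as_one else 0


stadiums = ["dome", "open", "open", "dome"]
weather = ["clear", "inclement", "clear", "clear"]

dome_dummy = [to_dummy(s, "dome") for s in stadiums]
inclement_dummy = [to_dummy(w, "inclement") for w in weather]

print(dome_dummy)       # [1, 0, 0, 1]
print(inclement_dummy)  # [0, 1, 0, 0]
```

Only one column is needed per two-category variable, because a 0 in the "dome" column already tells you the game was in an open stadium.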

Below is an example from the site:

The example from Interpreting Regression Coefficients was a model of the height of a shrub (Height) based on the amount of bacteria in the soil (Bacteria) and whether the shrub is located in partial or full sun (Sun). Height is measured in cm, Bacteria is measured in thousand per ml of soil, and Sun = 0 if the plant is in partial sun, and Sun = 1 if the plant is in full sun. The regression equation was estimated as follows:

Height = 42 + 2.3*Bacteria + 11*Sun

The interpretation of the bacteria coefficient is that, controlling for “Sun”, a plant with 1,000 more bacteria per ml of soil is expected to be 2.3 centimeters taller.  In addition, controlling for bacteria, a plant in full sun is expected to be 11 centimeters taller than one in partial sun.  Obviously, both of these interpretations are contingent on the coefficients being significant (p-value < 0.05).
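A quick way to check that reading is to plug numbers into the equation.  The sketch below just evaluates the quoted formula; the function name and the example plant (3 thousand bacteria per ml) are my own choices, not something from the site.

```python
def predicted_height(bacteria, sun):
    """Height in cm; bacteria in thousands per ml; sun is 0 (partial) or 1 (full)."""
    return 42 + 2.3 * bacteria + 11 * sun


# Hypothetical plant with 3 thousand bacteria per ml of soil.
partial = predicted_height(3, 0)
full = predicted_height(3, 1)
print(f"{partial:.1f} cm in partial sun, {full:.1f} cm in full sun")

# One more unit of bacteria, holding sun fixed, adds the bacteria coefficient.
one_more = predicted_height(4, 0) - predicted_height(3, 0)
print(f"{one_more:.1f} cm per extra 1,000 bacteria/ml")
```

The gap between the two sun conditions is 11 cm no matter which bacteria level you pick, which is exactly what "controlling for bacteria" means in a model without an interaction.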

An additional wrinkle to these regressions is interaction terms.  Below is an example from the same website:

Height = B0 + B1*Bacteria + B2*Sun + B3*Bacteria*Sun

Adding an interaction term to a model drastically changes the interpretation of all the coefficients. If there were no interaction term, B1 would be interpreted as the unique effect of Bacteria on Height. But the interaction means that the effect of Bacteria on Height is different for different values of Sun.  So the unique effect of Bacteria on Height is not limited to B1 but also depends on the values of B3 and Sun. The unique effect of Bacteria is represented by everything that is multiplied by Bacteria in the model: B1 + B3*Sun. B1 is now interpreted as the unique effect of Bacteria on Height only when Sun = 0.
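For anyone curious how such a model is actually estimated, here is a rough sketch in plain Python.  It is not what the site or I used: the toy plants are invented, the heights are generated without noise from the coefficients the example uses a little further on (35, 4.2, 9, 3.2), and `fit_ols` is a hypothetical helper that solves the normal equations by Gauss-Jordan elimination.  The point is only to show that the interaction is just one more column, the product Bacteria*Sun.

```python
def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Invented toy data: 4 bacteria levels in each sun condition, no noise.
plants = [(b, s) for s in (0, 1) for b in (1.0, 2.0, 3.0, 4.0)]
heights = [35 + 4.2 * b + 9 * s + 3.2 * b * s for b, s in plants]

# Design row: intercept, Bacteria, Sun, and the interaction column Bacteria*Sun.
design = [[1.0, b, float(s), b * s] for b, s in plants]

coefs = fit_ols(design, heights)
for name, value in zip(["B0", "B1", "B2", "B3"], coefs):
    print(f"{name} = {value:.2f}")
```

Because the toy heights contain no noise, the fit should hand back the coefficients it was built from.  Real data would also need standard errors and p-values, which a statistics package provides and this sketch does not.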

The site then explains this using actual coefficients:

In our example, once we add the interaction term, our model looks like:

Height = 35 + 4.2*Bacteria + 9*Sun + 3.2*Bacteria*Sun

Adding the interaction term changed the values of B1 and B2. The effect of Bacteria on Height is now 4.2 + 3.2*Sun. For plants in partial sun, Sun = 0, so the effect of Bacteria is 4.2 + 3.2*0 = 4.2. So for two plants in partial sun, a plant with 1000 more bacteria/ml in the soil would be expected to be 4.2 cm taller than a plant with less bacteria.

For plants in full sun, however, the effect of Bacteria is 4.2 + 3.2*1 = 7.4. So for two plants in full sun, a plant with 1000 more bacteria/ml in the soil would be expected to be 7.4 cm taller than a plant with less bacteria.
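The two cases above come from the same small formula, B1 + B3*Sun.  Here is a sketch of that arithmetic using the coefficients from the quoted model; the constant and function names are mine.

```python
B_BACTERIA = 4.2      # coefficient on Bacteria in the quoted model
B_INTERACTION = 3.2   # coefficient on Bacteria*Sun in the quoted model


def bacteria_effect(sun):
    """Expected cm of height per extra 1,000 bacteria/ml, for sun = 0 or 1."""
    return B_BACTERIA + B_INTERACTION * sun


for sun, label in [(0, "partial sun"), (1, "full sun")]:
    print(f"{label}: {bacteria_effect(sun):.1f} cm per extra 1,000 bacteria/ml")
```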

With the interaction term in the model, the Sun coefficient of 9 is the expected height difference between full and partial sun only when Bacteria = 0; at any other bacteria level the difference is 9 + 3.2*Bacteria centimeters.  Of course, all of these interpretations are once again contingent on significance.
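One way to see what the Sun coefficient is doing in the interaction model is to compute the full-sun vs. partial-sun gap directly from the equation at a few bacteria levels.  This is just arithmetic on the quoted coefficients; the function names and the bacteria levels I picked are for illustration.

```python
def predicted_height(bacteria, sun):
    """Quoted interaction model: height in cm, bacteria in thousands per ml."""
    return 35 + 4.2 * bacteria + 9 * sun + 3.2 * bacteria * sun


def sun_gap(bacteria):
    """Full-sun minus partial-sun height at the same bacteria level."""
    return predicted_height(bacteria, 1) - predicted_height(bacteria, 0)


# Hypothetical bacteria levels, in thousands per ml.
for bacteria in (0, 1, 2):
    print(f"Bacteria = {bacteria}: full sun is {sun_gap(bacteria):.1f} cm taller")
```

The gap equals the Sun coefficient of 9 only at Bacteria = 0 and grows by 3.2 cm for each additional unit of bacteria.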

In addition, one of the best explanations I’ve seen on this subject is this half-hour video. It’s definitely worth the watch.

Now let’s come back to fantasy football.  I used two continuous variables: “Def Pass Yards”, or pass yards given up per game by the defense, and “Def Passer Rtg”, or passer rating given up per game by the defense.  I also used two categorical variables as previously mentioned: type of stadium (dome/open) and type of weather (inclement/clear).  I ran eight regressions in total: four with just a dummy variable and four more with both a dummy variable and an interaction term.
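The eight regressions are simply every combination of one continuous variable, one categorical variable, and with or without an interaction.  A small sketch that lists them in the order used in this post (the loop structure is mine; the variable labels are the ones used here):

```python
from itertools import product

CONTINUOUS = ["Def Pass Yards", "Def Passer Rtg"]
CATEGORICAL = ["Type of stadium", "Type of weather"]

specs = []
# product() varies the last argument fastest, which reproduces the
# numbering in this post: regressions 1-4 without interaction, 5-8 with.
for with_interaction, categorical, continuous in product(
        [False, True], CATEGORICAL, CONTINUOUS):
    terms = [continuous, categorical]
    if with_interaction:
        terms.append(f"{continuous} X {categorical}")
    specs.append(terms)

for number, terms in enumerate(specs, start=1):
    print(f"Regression {number}: Fantasy points ~ " + " + ".join(terms))
```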

Note:  Just another reminder to pay attention to the p-value.  If it’s less than 0.05 the coefficient is significant.

Regression 1:

Continuous variable:  Def Pass Yards

Categorical dummy variable:  Type of stadium

Dependent variable:  Fantasy points

A reminder that I took out all the games where a quarterback had 10 or fewer attempts.  Also, unlike in Part 2, I didn’t look at averages for fantasy points.  Because the categorical variables complicate averaging, I included every relevant QB game that fit the criteria (each QB game is a unique row of data).  I followed this method for all 8 regressions.
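As a rough illustration of that setup, the sketch below filters a handful of invented QB games and turns each remaining game into one regression row.  The field names and all the numbers are hypothetical, and I'm assuming dome is coded as 1; the only rule taken from above is that games with 10 or fewer attempts are dropped.

```python
# Invented sample rows; field names and values are hypothetical.
games = [
    {"qb": "QB A", "attempts": 35, "fantasy_points": 21.4,
     "def_pass_yards": 262.0, "stadium": "dome"},
    {"qb": "QB B", "attempts": 10, "fantasy_points": 3.1,
     "def_pass_yards": 231.0, "stadium": "open"},
    {"qb": "QB C", "attempts": 28, "fantasy_points": 14.8,
     "def_pass_yards": 218.0, "stadium": "open"},
    {"qb": "QB D", "attempts": 4, "fantasy_points": 0.6,
     "def_pass_yards": 249.0, "stadium": "dome"},
]

# Drop games with 10 or fewer attempts, i.e. keep attempts > 10.
kept = [g for g in games if g["attempts"] > 10]

# One row per QB game: (dependent variable, continuous variable, dome dummy).
rows = [
    (g["fantasy_points"], g["def_pass_yards"], 1 if g["stadium"] == "dome" else 0)
    for g in kept
]

for row in rows:
    print(row)
```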

Dummy and interaction regression 5

We see that the pass yards per game given up by the defense a QB faces is significant, controlling for type of stadium.  Each additional yard allowed per game results in ~0.09 more fantasy points on average.  Playing in a dome vs. an open stadium is also significant.  Controlling for yards per game allowed by a defense, a dome stadium results in ~1.42 more fantasy points than an open stadium.  This suggests starting a QB who is facing a poor pass defense and/or playing in a dome.  (We will see later that it’s not quite as simple as this.)
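To put those two coefficients to work, here is a small sketch that compares two matchups.  Since the intercept is the same for both quarterbacks, it cancels out and only the differences matter.  The two matchups (260 vs. 240 yards allowed per game) are invented, and the coefficients are the rounded values quoted above, so treat the result as a ballpark figure.

```python
YARDS_COEF = 0.09  # ~fantasy points per extra pass yard allowed per game
DOME_COEF = 1.42   # ~extra fantasy points in a dome vs. an open stadium


def expected_point_gap(yards_a, dome_a, yards_b, dome_b):
    """Expected fantasy points of QB A minus QB B; the intercept cancels."""
    return YARDS_COEF * (yards_a - yards_b) + DOME_COEF * (dome_a - dome_b)


# Hypothetical matchups: QB A faces a defense allowing 260 yds/game in a dome,
# QB B faces a defense allowing 240 yds/game in an open stadium.
gap = expected_point_gap(260, 1, 240, 0)
print(f"QB A is expected to score about {gap:.2f} more fantasy points")
```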

Please note that, in order not to get repetitive, I won’t continue to write “controlling for…”; it should be assumed.

Regression 2:

Continuous variable:  Def Passer Rtg

Categorical dummy variable:  Type of stadium

Dependent variable:  Fantasy points

Dummy and interaction regression 6

Once again we see that playing in a dome vs. an open stadium is significant, and the worse the defense a QB faces in terms of passer rating, the more fantasy points result.  Each one-point increase in the average passer rating allowed per game by the defense results in ~0.26 more fantasy points.  Playing in a dome results in ~1.17 more fantasy points than playing in an open stadium.  The conclusion is similar to the first regression.

Regression 3:

Continuous variable:  Def Pass Yards

Categorical dummy variable:  Type of weather

Dependent variable:  Fantasy points

Dummy and interaction regression 7

Surprisingly, inclement weather is insignificant in terms of affecting a QB’s fantasy points.  Or maybe it’s not that surprising.  Eyeballing the pivot from Part 1, there doesn’t appear to be a strong relationship between inclement weather and a drop in QBs’ fantasy points.  Therefore, maybe it’s best not to worry about the forecast unless it predicts high wind.  We saw in Part 2 that high winds are in fact significant.

The number of passing yards the defense allowed per game is once again significant.

Regression 4:

Continuous variable:  Def Passer Rtg

Categorical dummy variable:  Type of weather

Dependent variable:  Fantasy points

Dummy and interaction regression 8

The same pattern is followed here and the same conclusions can be drawn.  The defensive stat is significant and the weather doesn’t really matter.

Regression 5:

Continuous variable:  Def Pass Yards

Categorical dummy variable:  Type of stadium

Interaction term: Def Pass Yards X Type of stadium

Dependent variable:  Fantasy points

Dummy and interaction regression 5

We see that “Def Pass Yards” is significant with a coefficient of ~0.08.  In this and the next three regressions, the coefficient on the continuous variable has to be read with the interaction in mind: it is the effect for the omitted category only (open stadium or clear weather).  Here it means that for every 1 extra yard per game a defense gives up, a QB playing in an open stadium will have on average ~0.08 more fantasy points.  In addition, the interaction term of “Def Pass Yards” and “Dome” is almost significant at the 0.05 level.  It’s worth considering this variable since its p-value is so close to the cutoff.  The interaction coefficient means that for every 1 extra yard a defense gives up, a QB playing in a dome will accumulate ~0.04 more fantasy points than one playing in an open stadium.  This results in a total of ~0.12 extra fantasy points for every yard a pass defense gives up per game in a dome.  Therefore, it does appear that a dome amplifies the effect of a bad pass defense and leads to more fantasy points.  If you have two quarterbacks facing similar pass defenses (in terms of passing yards allowed) and one will play in a dome while the other will play in an open stadium, it may be worthwhile to start the one playing in the dome.
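The per-yard arithmetic above can be sketched the same way as in the shrub example.  The coefficients are the rounded values from this regression, the 20-yard comparison is a made-up scenario, and the sketch looks only at the per-yard slopes, not the non-interaction “Dome” term.

```python
OPEN_SLOPE = 0.08  # ~fantasy points per extra yard allowed, open stadium
DOME_EXTRA = 0.04  # ~additional points per yard in a dome (interaction term)


def points_per_yard(dome):
    """Per-yard effect of Def Pass Yards for dome = 0 (open) or 1 (dome)."""
    return OPEN_SLOPE + DOME_EXTRA * dome


def extra_points(extra_yards_allowed, dome):
    """Extra fantasy points from facing a defense that allows more yards."""
    return points_per_yard(dome) * extra_yards_allowed


# Hypothetical comparison: a defense allowing 20 more pass yards per game.
for dome, label in [(0, "open stadium"), (1, "dome")]:
    print(f"{label}: {points_per_yard(dome):.2f} per yard, "
          f"{extra_points(20, dome):.1f} points for 20 extra yards")
```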

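One interesting phenomenon to consider is that the non-interaction “Dome” variable is no longer significant once the interaction with the defense is included.  This may mean that its previous significance was based on its interaction with, and effect on, the pass defense.  Keep in mind that with the interaction in the model, the “Dome” coefficient describes the dome effect for a defense allowing 0 pass yards per game, which is far outside the data.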

Regression 6:

Continuous variable:  Def Passer Rtg

Categorical dummy variable:  Type of stadium

Interaction term: Def Passer Rtg X Type of stadium

Dependent variable:  Fantasy points

Dummy and interaction regression 6

Defensive passer rating per game in an open stadium is significant.  In other words, an increase of one in passer rating given up per game leads to ~0.25 more fantasy points in an open stadium.  Surprisingly, however, the other two variables are not significant at all.  Once again the “Dome” variable is no longer significant.  Most interesting is that the interaction term of “Def Passer Rtg X Dome” is also not significant.  In other words, going from an open to a dome stadium doesn’t change the effect of defensive passer rating on fantasy points.  This is surprising because it doesn’t follow the narrative from the previous example: here “Dome” didn’t become insignificant because its significance was based on its interaction with passer rating defense.  I’m frankly at a loss as to how to explain this finding, but it appears that knowing a QB is playing in a dome is of limited use when you already know the opposing team’s passer rating defense.

Also, the fact that the dome interaction is nearly significant for pass yards but not at all for passer rating raises the question of why passer rating behaves so differently.  As a reminder, passer rating includes the following: completion percentage, yards per attempt, touchdowns per attempt, and interceptions per attempt.  It’s possible that the type of stadium is irrelevant for factors such as interceptions.

Regression 7:

Continuous variable:  Def Pass Yards

Categorical dummy variable:  Type of weather

Interaction term: Def Pass Yards X Type of weather

Dependent variable:  Fantasy points

Dummy and interaction regression 3

We see that pass yards allowed by the defense the QB is facing is significant in clear weather.  Every pass yard allowed per game by a defense that a QB faces in clear weather results in ~0.07 more fantasy points.  Inclement weather doesn’t significantly change the fantasy points gained for each passing yard allowed.  Nor is inclement weather significant on its own.  Once again we see that knowing that it’ll rain in a game, for example, is more or less useless.  Also, bad defensive teams are bad defensive teams, and weather doesn’t necessarily exacerbate those effects.

Regression 8:

Continuous variable:  Def Passer Rtg

Categorical dummy variable:  Type of weather

Interaction term: Def Passer Rtg X Type of weather

Dependent variable:  Fantasy points

Dummy and interaction regression 4

Once again defensive passer rating for clear weather is significant (every increase of 1 in rating results in 0.24 more fantasy points) and there’s no added effect on fantasy points if these games are played in poor weather.

In conclusion, as we saw in Part 2, the defense a quarterback is facing, whether measured by yards allowed per game or passer rating allowed per game, is extremely valuable in predicting how well a quarterback will do.  The type of stadium has limited predictive value.  It helps when you know the opposing pass defense as measured in pass yards per game, as a dome can amplify the effects of a poor defense.  It adds nothing if you know the defense’s average passer rating allowed.  Finally, inclement weather has no predictive value.  The only type of weather that has predictive value is wind.  Temperature and precipitation don’t matter, as seen in Part 2.

Once again my data comes from:

QB Stats

Defensive Stats

Weather Information
