I’m back after an almost year-long hiatus!  This latest project is a ranking of NHL coaches based on how their players performed under other NHL coaches.

The two major stats I used were Corsi For % (CF%) and PDO, both popular advanced stats in the NHL analytics community.  CF% is a possession stat: the shot attempts a team directs at the opposing net (shots on goal, blocked shots, and missed shots) at 5 on 5, divided by the total shot attempts for both teams.  It’s meant to gauge how much a team had the puck, and it correlates well with winning.

Hereafter, Corsi For (shot attempts for) will be classified as CF and Corsi Against (shot attempts against) will be classified as CA.

PDO is 5 on 5 shooting percentage plus 5 on 5 save percentage.  This stat is supposed to regress to 100%, so anything above that is considered good luck and anything below bad luck.
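
As a quick reference, here’s a minimal sketch of both stats in Python; the inputs are made-up 5 on 5 totals, not numbers from the dataset:

```python
# Minimal sketch of the two stats, using made-up 5-on-5 totals.
def corsi_for_pct(cf, ca):
    """CF% = shot attempts for, divided by all shot attempts, at 5 on 5."""
    return 100 * cf / (cf + ca)

def pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO = 5-on-5 shooting % plus 5-on-5 save %, so 100 is roughly league average."""
    shooting_pct = 100 * goals_for / shots_for
    save_pct = 100 * (1 - goals_against / shots_against)
    return shooting_pct + save_pct

print(round(corsi_for_pct(3500, 3300), 1))   # 51.5
print(round(pdo(160, 1900, 150, 1850), 1))   # 100.3
```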

I regressed team points against both of these variables for each season from 07-08 through 18-19 (07-08 was the first season this data was available), because looking at both paints a more complete picture.  In particular, if I were comparing players on different teams I couldn’t just look at CF%: a team with a much higher PDO would be more successful regardless.

For 2012-13 I projected point totals over an 82-game season, since the lockout cut that season to 48 games; the projection simply multiplied each team’s point total by 82/48.  I made a similar adjustment for the 2007-08 New York Islanders, who played 81 games under Head Coach Ted Nolan and 1 game under Al Arbour.  To keep that season in the data, I adjusted all of the stats for that one game, including points: I removed the points earned in the game Nolan didn’t coach and multiplied the remaining total by 82/81.  For all intents and purposes the game was treated as an overtime loss, since its projected points came to about 1.  I did the same with CF%, deleting the Corsi for and Corsi against from that one game (68 for and 40 against).
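
Here’s a rough sketch of those two adjustments; the point totals in the example are placeholders, not the actual figures:

```python
# Sketch of the season-length adjustments (placeholder point totals, not real ones).
def project_points(points, games_played, full_season=82):
    """Scale a point total to an 82-game pace."""
    return points * full_season / games_played

# 2012-13 lockout season: every team's points scaled by 82/48.
print(round(project_points(55, 48)))    # a 55-point season projects to about 94

# 2007-08 Islanders: subtract roughly 1 point for the single Arbour-coached game
# (treated as an overtime loss), then rescale the remaining 81 games to 82.
isles_points = 80                       # placeholder, not the actual total
print(round(project_points(isles_points - 1, 81)))
```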

This was the output of the regression I ran:

CF%-PDO Regression

As you can see, the p-values for both variables and the intercept are highly significant (<0.05), and the R-squared of 0.84 indicates the model is quite predictive.
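
For anyone who wants to reproduce this step, a sketch of the regression using statsmodels might look like the following; the file and column names are my own, not the original dataset’s:

```python
# Hedged sketch of the CF%-PDO regression; "team_seasons.csv" and the column
# names are assumptions about how the data might be laid out.
import pandas as pd
import statsmodels.api as sm

team_seasons = pd.read_csv("team_seasons.csv")   # one row per team-season

X = sm.add_constant(team_seasons[["CF_pct", "PDO"]])
y = team_seasons["points"]

model = sm.OLS(y, X).fit()
print(model.summary())   # coefficients, p-values, and the ~0.84 R-squared
```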

Next, I went about collecting data (CF, CA, CF%) for individual players for the same time period, as well as the coaches that coached the team at the time.

I originally thought that CF% would depend on experience: perhaps rookies performed worse, as did older players.  However, when I plotted CF% by years of experience I saw no pattern, nor much of a gap between the worst and best performers, so I decided not to adjust the data for this variable.  Below is the chart:

Corsi by Experience

For individual player data, I used players who played 60 or more games in each relevant season, and 35 or more in the lockout-shortened 2012-13.  I chose 60 games because it gives a larger sample, is more likely to capture legitimate NHLers rather than farm-club call-ups, and made the data more manageable; 35 is roughly the proportional equivalent for a 48-game season of what 60 is for an 82-game season.
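
The threshold logic is simple enough to show directly; 35 is roughly 60 scaled down to a 48-game schedule (60 × 48/82 ≈ 35):

```python
# Games-played filter used to decide which player-seasons enter the dataset.
def meets_threshold(games_played, season):
    min_games = 35 if season == "2012-13" else 60   # 60 * 48/82 is about 35
    return games_played >= min_games

print(meets_threshold(62, "2013-14"))   # True
print(meets_threshold(34, "2012-13"))   # False
```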

I also deleted players who played for more than one team in a given season.  The data source I used doesn’t identify those teams, so I couldn’t link those players to any coaches.

For the coaches’ data, I identified who the coaches were per team per season.  Most teams only had one, some had two, and in the relevant years, only the 2011-12 LA Kings had three.  They won the cup that year, so perhaps that was their secret.

As previously mentioned, the 2007-08 Islanders were listed as having two coaches, but I treated them as a team coached only by Ted Nolan and made the appropriate point and stat adjustments for both the players and the team.

I excluded team-seasons that had multiple coaches because there was no reasonable way to assign Corsi to the appropriate coach.  I did, however, use those seasons when comparing players’ performance under different coaches.  For example, suppose Jaromir Jagr played for a Rangers team coached by Tom Renney and we want his CF% under other coaches.  If he played another season for a team that had two coaches, neither of them named Tom Renney, I would include that season in the comparison: it doesn’t matter which of those coaches was responsible for which Corsi, only what that season’s Corsi was.  On the other hand, if Jagr had played in 08-09 under both Renney and Tortorella (hypothetical, since he didn’t), that season wouldn’t make it into his average under different coaches, because Renney’s share of it can’t be separated out.
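
The inclusion rule boils down to a single check, sketched here with hypothetical coach lists:

```python
# A player-season counts toward the "under other coaches" average as long as the
# coach being evaluated wasn't behind the bench that season (hypothetical data).
def usable_for_other_coaches(season_coaches, coach_being_evaluated):
    return coach_being_evaluated not in season_coaches

print(usable_for_other_coaches({"Coach A", "Coach B"}, "Tom Renney"))              # True
print(usable_for_other_coaches({"Tom Renney", "John Tortorella"}, "Tom Renney"))   # False
```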

To compare players’ performance under different coaches I used a slight variation on CF%, called relative CF%.  Relative CF% accounts for the fact that a strong team inflates an individual player’s CF% and a weak team deflates it: it measures the team’s CF% while the player is on the ice minus its CF% while he’s off the ice.

The statistic is already tracked for each player within a season, but I had to estimate it for the set of seasons each player spent under other coaches.  To do that, I first projected the Corsi events (CF + CA) for the games a player actually played, by multiplying the team’s Corsi events for the season by the player’s games played divided by the games in the season.  Subtracting the player’s own on-ice Corsi events from this projection gave the Corsi events while he was off the ice.  From there I used relative CF% to estimate the CF and CA while he was off the ice: subtracting relative CF% from his actual CF% gives the off-ice CF%, multiplying that by the off-ice events gives the off-ice CF, and subtracting the off-ice CF from the off-ice events gives the off-ice CA.  Finally, looking strictly at the seasons a player spent under other coaches, I summed his CFs across those seasons and divided by the summed events, then subtracted the summed off-ice CF divided by the summed off-ice events.  The result was an estimated relative CF% for each player under other coaches.
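
Here’s a sketch of that estimation in code, with my own field names standing in for the spreadsheet columns:

```python
# Sketch of the off-ice estimation; field names are my own shorthand.
def off_ice_estimate(team_events, team_games, player_games,
                     player_cf, player_ca, player_rel_cf_pct):
    """Estimate (off_ice_cf, off_ice_ca) for one player-season."""
    projected_events = team_events * player_games / team_games   # events in the games he played
    on_ice_events = player_cf + player_ca
    off_ice_events = projected_events - on_ice_events
    on_ice_share = player_cf / on_ice_events                     # on-ice CF% as a fraction
    off_ice_share = on_ice_share - player_rel_cf_pct / 100       # rel CF% = on-ice minus off-ice
    off_ice_cf = off_ice_share * off_ice_events
    return off_ice_cf, off_ice_events - off_ice_cf

def estimated_relative_cf_pct(seasons):
    """Pool a player's seasons under other coaches into one relative CF% estimate.

    Each season is a dict with cf, ca, off_ice_cf, off_ice_ca (from off_ice_estimate)."""
    on_cf = sum(s["cf"] for s in seasons)
    on_events = sum(s["cf"] + s["ca"] for s in seasons)
    off_cf = sum(s["off_ice_cf"] for s in seasons)
    off_events = sum(s["off_ice_cf"] + s["off_ice_ca"] for s in seasons)
    return 100 * (on_cf / on_events - off_cf / off_events)
```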

Next, I took this estimated relative CF% and added it to each player’s off-ice CF% in the season in question (as a reminder, off-ice CF% is CF% minus relative CF%).  The reason for using relative Corsi was to mitigate the effect of team strength: if Jagr played on a stronger team than a given Rangers team, we don’t want to carry over his raw CF%, which would likely have been lower on the Rangers.

After recording the CF% each player had under different coaches, I adjusted it.  I took each team’s actual CF% and divided it by the average CF% of the players in the dataset (using total CF divided by total events, so that players are weighted by their events).  I then applied that same multiple to the under-other-coaches CF% to project what the team’s CF% would hypothetically be if every player produced the CF% he had under other coaches.  This adjustment was necessary because there’s no way to get a team Corsi number by simply averaging player CF and CA, since they overlap between players, and not all of a team’s players are even in the dataset.
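
A sketch of that adjustment, again with my own shorthand for the fields:

```python
# Team-level adjustment: scale the event-weighted player average to the team's actual CF%,
# then apply the same multiple to the "under other coaches" average (field names are mine).
def projected_team_cf_pct(team_actual_cf_pct, players):
    """players: dicts with this season's cf/ca plus a projected_cf_pct under other coaches."""
    events = [p["cf"] + p["ca"] for p in players]
    actual_avg = 100 * sum(p["cf"] for p in players) / sum(events)
    projected_avg = sum(p["projected_cf_pct"] * e for p, e in zip(players, events)) / sum(events)
    multiple = team_actual_cf_pct / actual_avg
    return multiple * projected_avg
```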

Finally, I fed this projected CF% into the regression model: I multiplied it by the CF% coefficient, added the team’s actual PDO multiplied by the PDO coefficient, and added the intercept.  The result is the projected points this hypothetical team would earn if the players performed under the given coach the way they did under other coaches.  I then took the delta between these projected points and the actual points.  That number is how much the coach over- or underperformed relative to what could be expected from these players and their historical CF%.
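
And the final step, with placeholder coefficients standing in for the fitted ones, reading the delta as actual points minus projected points (so a positive delta means the team beat what its players’ history suggests):

```python
# Final step: projected points from the regression, then the coach's delta.
# Coefficient and point values below are placeholders, not the fitted ones.
def model_points(cf_pct, pdo_value, b_cf=4.0, b_pdo=7.5, intercept=-860.0):
    return intercept + b_cf * cf_pct + b_pdo * pdo_value

actual_points = 98
projected = model_points(cf_pct=51.2, pdo_value=100.4)
coach_delta = actual_points - projected   # positive = overperformed expectations
print(round(coach_delta, 1))
```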

The output of the analysis is the top 10 and bottom 10 coaches in the NHL during this time period based on this point delta.

Top 10 Coaches by Pts

Bottom 10 Coaches by Pts

The tables above display the coaches, their average delta of points above or below expected, their most or least successful team, that team’s delta from expected, its points, projected points, CF%, projected CF%, and CF% delta.

It’s worth noting that the 2008-09 Avalanche, the 2011-12 and 2012-13 Oilers, and the 2011-12 Bruins all had actual CF% very close to their projected CF%, and therefore over- or underperformed what the model would have predicted had we fed it their actual CF% and PDO.  In fact, both Oilers teams and the Bruins had an actual CF% that moved in the opposite direction of what the wins delta would suggest.  My hypothesis was that these teams did very well on special teams, so metrics that ignore special teams under- or overstated their performance.  That hypothesis turned out to be wrong, however:

Power Play Effectiveness

The Oilers and Bruins didn’t really stand out in terms of power-play goals scored or allowed, and the Avalanche actually overperformed the model even though their special teams were quite poor.  Then again, we don’t expect a 69-point team to have amazing special teams, so it’s possible that -14 is actually better than expected.  It’s also possible that the Avalanche overperformed the model because, relative to what was expected, they won a lot of close games and lost a lot of blowouts, while the Oilers and Bruins did the opposite.

I also looked at the top 10 coaches and bottom 10 coaches in terms of CF% vs. projected.

Top 10 Coaches by CF%

Bottom 10 Coaches by CF%

Here we look at the top and bottom 10 coaches in terms of CF% vs. projected: each coach’s average delta, his most or least successful team, that team’s delta, its CF% and projected CF%, the player on that team with the greatest individual gain or drop in CF% vs. projected, and the size of that gain or drop.

The most surprising result was Jack Capuano winning significantly more than expected, given his poor reputation.  The least surprising was Adam Oates performing poorly on both metrics, since he’s universally considered a poor coach.

CF% is a more direct way of ranking coaches.  However, this is a “wins and points” business, and it’s important to gauge who over- or underperformed in terms of points, since that’s ultimately what coaches are judged by.

SUMMARY:

Here are the most important data in a nutshell, as well as the coaches that take the top (or bottom) spot in each list.  Each picture is a link to the coach’s Wikipedia page if you’re interested in learning more.

[Photo: Andy Murray]

Top 10 Coaches by Points Chart

[Photo: Adam Oates]

Bottom 10 Coaches by Points Chart

[Photo: Pat Quinn]

Top 10 Coaches by CF% Chart

[Photo: Guy Carbonneau]

Bottom 10 Coaches by CF% Chart

That’s it for this blog post.  I thought this was an interesting topic and am looking forward to more such projects.

Data Source:

Hockey Reference

