Delving Deeper into Hockey's Advanced Stats
Many are at least vaguely familiar with the typical advanced stats: your Corsi, Fenwick, PDO, etc. But what about Total Hockey Rating? Or Expected Shooting Percentage? Or DeltaCorsi? Zone Start Adjusted Corsi, anyone?
As the analytics movement in hockey grows, new stats and measurements will keep arriving that try not only to simplify the process, but also to give those who use them an edge in analyzing players. This post goes over some new stats that I've come across, and I'll probably factor some of them into analysis that I'll perform once the 2014-15 season gets underway (quick glossary of terms here).
Total Hockey Rating (THoR)
Did somebody say THOR?
YES. WITH THIS ALL MIGHTY STAT, WE MEASURE THE ABILITY OF EACH PLAYER TO PUT ON AN ASGARDIAN PERFORMANCE, WORTHY OF RECEIVING THE HIGHEST HONOR; TO BE THOUGHT OF ON THE SAME LEVEL AS THOR, PRINCE OF ASGARD.
Kidding aside, THoR is a way of encompassing all that a player does on the ice into one complete stat. Statistical Sports Consulting owners (and creators of THoR) Michael Schuckers and Jim Curro describe it thusly:
"A two-way player rating that accounts for the all of the on-ice action events when a players is on the ice as well as their linemates, their opponents, the current score of the game, and where their shift starts. Each event is assessed a value according to the chance that it leads to a goal. THoR uses a statistical model to determine the value of each player’s contribution to the overall outcomes that occur while they are on the ice."
Basically, things such as shot attempts, shots on goal, hits, faceoffs, zone starts, and teammates are factored into a statistical model that estimates 1) how likely each event a player causes is to result in a goal for his team and 2) how much of that reflects the player's own skill rather than his linemates. THoR is measured in Wins Above Replacement (WAR).
Like all metrics, it isn't perfect. It has a number of shortcomings, one of them major: it doesn't account for shooting percentage. A shot taken by Tyler Kennedy is valued the same as one taken by Steven Stamkos, though those two obviously should have different values. Regardless, THoR does hold some advantages over RelCorsi. Its autocorrelation (year-to-year repeatability) is about 0.1 higher than that of RelCorsi, though it has a slightly lower correlation to winning. It can be more easily predicted from year to year, and is still able to be a valid measurement over 50% of the time. (For the full research paper presentation of THoR, which goes over every metric included in the equation, see here.)
DeltaCorsi (dCorsi)
DeltaCorsi (dCorsi for short) is the first true measurement of its kind. Essentially, it takes RelCorsi and then factors in team effects, zone starts, quality of teammates, quality of competition, skater age, and so on, for a total of about 8-11 variables per player, to develop an Expected Corsi. The Expected Corsi is then measured against Observed Corsi, with the difference being dCorsi. Matt Pfeffer, statistical analyst for the Ottawa 67's, sums it up perfectly for us:
"Big picture, I see this as the replacement for Relative Corsi as the best stat fancy stats has to offer... Corsi Rel has a ton of value add and dCorsi is exposing its weaknesses."
Just some notes, though I'm not going to delve into the stats: dCorsi is a repeatable stat, and it's normally distributed. The best current use for it is to compare players who have similar Expected Corsi values, to see which ones handle the workload best. Ideally, a team will have a dCorsi of zero, which would mean each player on the team is given assignments that he is capable of handling. (For Steve Burtch's full research paper on dCorsi, look here. For Steve's shortened version in article format, look here.)
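To make the "compare players with similar Expected Corsi" idea concrete, here is a minimal sketch. It assumes the Expected Corsi values have already been produced by some model (the regression itself isn't reproduced here), and it assumes the sign convention of observed minus expected, so a positive dCorsi means a player is beating his usage. The player names, the numbers, and the helper names are all made up for illustration.

```python
def dcorsi(observed, expected):
    """dCorsi as observed minus expected (sign convention is an assumption)."""
    return observed - expected


def peers_ranked_by_dcorsi(players, target_expected, tolerance=1.0):
    """Keep players whose Expected Corsi is within `tolerance` of the target,
    then rank them from best to worst dCorsi."""
    peers = [
        (name, dcorsi(observed, expected))
        for name, observed, expected in players
        if abs(expected - target_expected) <= tolerance
    ]
    return sorted(peers, key=lambda pair: pair[1], reverse=True)


# (name, observed Corsi, expected Corsi): hypothetical values
players = [
    ("Player A", 52.0, 49.5),
    ("Player B", 48.0, 49.0),
    ("Player C", 55.0, 54.0),
    ("Player D", 50.5, 50.0),
]

ranking = peers_ranked_by_dcorsi(players, target_expected=49.5)
for name, value in ranking:
    print(f"{name}: dCorsi {value:+.1f}")
```

Player C has the best raw Corsi of the bunch, but he drops out of this comparison because his expected workload is in a different range, which is the whole point of comparing like with like.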
Zone Start Adjusted Corsi
This stat is by no means perfect, but it is another way to look at the adjustments made for Corsi. I have an amplified version of the one found on stats.hockeyanalysis.com, which can give us a better indication of which players outperform their zone starts. Combined with dCorsi, it should help us tell which players are performing above their usage in terms of controlling the puck for their team.
(The stats come from the Boys on the Bus blog. For the original post, check here.)
Expected Shooting Percentages
Here, we have work done by two different groups. Group 1 is Rob Vollman, Christophe Perreault, and Arik Parnass. Group 2 is Matt Pfeffer.
Vollman, Perreault, and Parnass
The work done here focuses on calculating expected shooting percentages based on previous shooting percentages of players. In its base form:
"The methodology was time-consuming, but really quite simple. We took the past nine seasons, all the way back to the 2005 lockout, and worked out each team's expected shooting percentage based on each of its player's individual shots and previous career shooting percentages.
"For example, for Montreal you would multiply Max Pacioretty's 270 shots by his previous career shooting percentage of 9.83% to get 27 expected goals. Do this for all the Canadiens and you get an expected number of goals, their actual number of shots, and therefore an expected shooting percentage for the entire team."
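The arithmetic in that example is easy to sketch in a few lines. The Pacioretty figures (270 shots, 9.83%) are the ones quoted above; the other two skaters are made-up filler so the team total has something to sum over, and the function name is my own.

```python
def expected_team_shooting_pct(skaters):
    """skaters: list of (name, shots this season, prior career sh% as a fraction).

    Returns (expected goals, total shots, expected team shooting percentage).
    """
    expected_goals = sum(shots * career_pct for _, shots, career_pct in skaters)
    total_shots = sum(shots for _, shots, _ in skaters)
    if total_shots == 0:
        return 0.0, 0, 0.0
    return expected_goals, total_shots, expected_goals / total_shots


roster = [
    ("Max Pacioretty", 270, 0.0983),  # figures quoted in the article
    ("Skater B", 150, 0.0800),        # hypothetical
    ("Skater C", 80, 0.0500),         # hypothetical
]

pacioretty_expected = 270 * 0.0983    # about 26.5, which rounds to the quoted 27
goals, shots, pct = expected_team_shooting_pct(roster)
print(f"{goals:.1f} expected goals on {shots} shots = {pct:.2%} expected sh%")
```

Note that each skater's percentage is weighted by how many shots he actually took, so a high-volume shooter moves the team number far more than a fourth-liner does.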
There are more bells and whistles to it that can be found in the whole article. Also of note is the fact that these expected shooting percentages are calculated for teams, and not for players. They probably won't be used in player evaluations, especially if they don't lend any value to the analysis, but could be useful for predicting how the Panthers will do next season.
Matt Pfeffer goes a different route, calculating an expected shooting percentage from a number of variables such as shot distance, shot type, and timing (face-off, rebound, etc.), which is then compared to the observed shooting percentage. I see this as a partial replacement for the missing piece in THoR. Players who are consistently above their expected shooting percentage can be considered "snipers" (Stamkos), whereas players who are below it (Kennedy) should be considered "not as dangerous." It's important to note that the Pfeffer model doesn't account for past shooting percentages, only for the factors that he goes over in the full article. Even more important: though the theory is sound, the percentages Pfeffer uses in the article are woefully inaccurate. Sorry, but Tomas Fleischmann did not have a 12% shooting percentage last season. I wish he did, but he did not. If the data is fixed, this stat should be pretty useful.
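For a feel of how a shot-quality expectation might be put together, here is a rough sketch. To be clear, this is not Pfeffer's model, whose details are in his article. It swaps in the simplest thing I can think of: a lookup table of conversion rates by shot type, distance bucket, and whether the shot was a rebound. Every rate in the table, the 25-foot cutoff, and the sample shots are invented for illustration.

```python
# Hypothetical conversion rates keyed by (shot type, distance bucket, rebound?)
BUCKET_RATES = {
    ("wrist", "close", False): 0.12,
    ("wrist", "close", True): 0.25,
    ("wrist", "far", False): 0.04,
    ("slap", "far", False): 0.05,
}


def distance_bucket(feet):
    """Split shots into two buckets; the 25-foot cutoff is arbitrary."""
    return "close" if feet < 25 else "far"


def expected_vs_observed(shots):
    """Return (expected sh%, observed sh%) for a list of shot records."""
    if not shots:
        raise ValueError("need at least one shot")
    expected_goals = sum(
        BUCKET_RATES[(s["type"], distance_bucket(s["distance_ft"]), s["rebound"])]
        for s in shots
    )
    observed_goals = sum(1 for s in shots if s["goal"])
    return expected_goals / len(shots), observed_goals / len(shots)


# Four made-up shots by one player
sample_shots = [
    {"type": "wrist", "distance_ft": 15, "rebound": False, "goal": True},
    {"type": "wrist", "distance_ft": 40, "rebound": False, "goal": False},
    {"type": "slap", "distance_ft": 55, "rebound": False, "goal": False},
    {"type": "wrist", "distance_ft": 8, "rebound": True, "goal": False},
]

expected_pct, observed_pct = expected_vs_observed(sample_shots)
print(f"expected {expected_pct:.1%}, observed {observed_pct:.1%}")
```

A player whose observed number stays above his expected number over a large sample would be the "sniper" type. Four shots obviously prove nothing; the sample is only there to show the mechanics.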
Passing Stats (SAGE)
Passing data, as a whole, doesn't necessarily tell us much, as passes on their own can't really be correlated to winning hockey games. SAGE (Shot Attempt Generation Efficiency), however, has a very high correlation to winning. Ryan Stimson, of In Lou We Trust, single-handedly tracked passing data for the New Jersey Devils last season, and from that data he calculated that the team that is more efficient at getting shots on goal from passes goes on to win the game 81.7% of the time. The next-closest predictor is even-strength Fenwick, which picks the winner only 54.9% of the time. Hopefully, another full season of data will help further validate SAGE as a stat and allow us to assess player value even better.
(The theory behind SAGE is that shots taken after a pass are more likely to go in, because the goaltender has to reassess the situation (location, shooter, type of shot) in a limited time frame. For more on the passing stats, check out the original post at ILWT.)
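Here is a small sketch of how a per-game efficiency comparison could look. It assumes that "efficiency" means the share of pass-generated shot attempts that actually end up as shots on goal; check the original ILWT post for the exact definition. The game totals and the function names are invented.

```python
def sage(shots_on_goal_from_passes, shot_attempts_from_passes):
    """Share of pass-generated attempts that reach the net (assumed definition)."""
    if shot_attempts_from_passes == 0:
        return 0.0
    return shots_on_goal_from_passes / shot_attempts_from_passes


def more_efficient_team(game):
    """game: dict of team name -> (shots on goal from passes, attempts from passes).

    Returns the team with the higher efficiency, or None on a tie.
    """
    (team_a, stats_a), (team_b, stats_b) = game.items()
    eff_a, eff_b = sage(*stats_a), sage(*stats_b)
    if eff_a == eff_b:
        return None
    return team_a if eff_a > eff_b else team_b


# One hypothetical game: (shots on goal from passes, shot attempts from passes)
game = {"Devils": (22, 40), "Opponent": (20, 45)}

devils_sage = sage(*game["Devils"])
pick = more_efficient_team(game)
print(f"Devils SAGE {devils_sage:.1%}; more efficient team: {pick}")
```

Running the same comparison over every tracked game and counting how often the more efficient team also won is how you would arrive at a figure like the 81.7% above.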
*I'll be tracking passing stats for the Panthers next season.
These statistics add more pieces to an already complex puzzle. Hopefully, through using them and analyzing players, we'll be able to better understand both how the stats can be used, and which players are better at helping their team win. In the upcoming weeks, I'll be going over how certain players did, to see if we can glean useful information headed into next season.
There are also other stats, Adjusted Plus-Minus (APM) and Expected Goals. Both come from work done by Brian Macdonald, who has been hired as Florida's new Director of Hockey Analytics.
There's not yet much to be found on his work besides research papers, so I'm not entirely sure how effective it is, but I hope he's open-minded enough to take in the vast amount of information available on the web and factor it into his calculations. It's good news that the Panthers have created such a position in any case, even more so that they filled it with someone who has done legitimate, quality work in this quickly expanding field. I don't entirely understand why Florida hasn't promoted the hire (and position) a bit more publicly, given that it is a clear step in the right direction for the club.