Publisher’s Note: I am delighted that a good friend of the blog, Clay Collins, is making his TLN debut today in a guest post. I keep my fingers crossed that we can make it a regular thing. Clay developed a player point projection model, and admittedly I’m a sucker for it, as I’m a sucker for anything that gives me a range to work with rather than an absolute number. I encourage you all to check out Clay’s work and follow him on Twitter. Hopefully this will lead to more of his work appearing on The Leafs Nation.
Maple Leafs 2021-2022 player point projections
by Clay Collins
DISCLAIMER – This model does NOT use best practices in creating and implementing a model. And that was intended, so to speak. I wanted to put together a very quick and dirty model this season and see my own personal growth through my models as I pursue my education and career as a data scientist. This season is a baseline for me and as such it has some very big problems. There are many much smarter and more experienced modelers and projectors throughout the hockey world that you should definitely explore!
The components of my model:
I made this model using a very simple multiple linear regression. I promise it sounds more complex than it is. It’s basically the same formula as:
y = mx + b
but with more variables added, which gives:
y = m1x1 + m2x2 + … + b
I’ve used the last five seasons (2016-17 to 2020-21) as my dataset. I also used four pretty simple stats and one additional statistic to help. The four raw statistics are: Shots on Goal (SOG), Shot Attempts (Satt), Corsi-For (CF) and Takeaways (TK). The “cheat” statistics I introduced were Goals/60 (Gper60) in the goals model and Assists/60 (Aper60) in the assists model. I’ll get into why I chose them when I address why there are ranges.
So this is where these ugly math formulas come in, the actual models themselves:
Goals = 0.102*SOG - 0.0008*CF + 0.065*TK - 0.0169*Satt + 9.21*Gper60 - 3.655
Assists = -0.00011*SOG + 0.0009*CF + 0.088*TK + 0.028*Satt + 11.79*Aper60 - 8.567
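To make the two formulas concrete, here is a minimal sketch that plugs a set of inputs into them. It reads the fourth input as the shot-attempt total (Satt) and the assists intercept as 8.567. The `predict` helper and the example stat line are my own illustration, not part of the original model code, and the player is invented.

```python
# Coefficients copied from the goals and assists formulas above.
GOAL_COEFS = {"SOG": 0.102, "CF": -0.0008, "TK": 0.065, "Satt": -0.0169, "Gper60": 9.21}
GOAL_INTERCEPT = -3.655

ASSIST_COEFS = {"SOG": -0.00011, "CF": 0.0009, "TK": 0.088, "Satt": 0.028, "Aper60": 11.79}
ASSIST_INTERCEPT = -8.567


def predict(coefs, intercept, stats):
    """Evaluate y = m1*x1 + m2*x2 + ... + b for one player's inputs."""
    return intercept + sum(coefs[name] * stats[name] for name in coefs)


# Hypothetical season inputs for an invented player (not real data).
example = {"SOG": 200, "CF": 1500, "TK": 40, "Satt": 350, "Gper60": 1.0, "Aper60": 1.2}

goals = predict(GOAL_COEFS, GOAL_INTERCEPT, example)
assists = predict(ASSIST_COEFS, ASSIST_INTERCEPT, example)
print(f"goals: {goals:.1f}, assists: {assists:.1f}, points: {goals + assists:.1f}")
```

Keeping the coefficients in dictionaries keyed by stat name means each model only reads the inputs it actually uses, so one stat line can feed both.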
There’s a lot more to dive into, but only real statistics nerds would love to talk about it, and if you are a real statistics nerd, please don’t judge my model, because it’s not good.
Why are there ranges?
The models listed above were calculated using historical data. However, we need to decide what to input for the SOG, CF, TK, Satt, and Gper60 (or Aper60) stats so that the models can spit out a real number for every player. So what do we plug in there? I used a “per game” rate for the stats (excluding goals and assists per 60) and multiplied it by the number of games the player is expected to play. These are the numbers I will adjust as the season progresses and players miss time due to injuries. An example of this is that I had Matthews at 82 games when I originally ran these, but since he wasn’t ready, I knocked him down to 75 games.
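As a rough sketch of that step: each counting stat is a per-game rate multiplied by expected games, so changing the games played rescales every input at once. The per-game rates below are invented for illustration, and `project_totals` is a hypothetical helper name.

```python
def project_totals(per_game_rates, games):
    """Scale per-game rates up to season totals for the expected number of games."""
    return {stat: rate * games for stat, rate in per_game_rates.items()}


# Hypothetical per-game rates (not any real player's numbers).
per_game = {"SOG": 4.0, "CF": 20.0, "TK": 0.8, "Satt": 7.0}

full_season = project_totals(per_game, 82)   # original assumption: 82 games
shortened = project_totals(per_game, 75)     # adjusted assumption: 75 games

print(full_season["SOG"], shortened["SOG"])
```

The per-60 stats are left out on purpose, since the article feeds them into the models as rates rather than season totals.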
The ranges of my models are not a spectrum of all possibilities between low and high. Instead, I used two different inputs: the average over the past five seasons and the trend over the past five seasons. This is also why some players have a narrow range (like Tavares, projected for 28-30 goals) and others have a wider range (like Matthews at 37-45). A player like Tavares has 5-year averages very similar to his 5-year trends, while Matthews (believe it or not) is still trending up from his averages.
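Here is a small sketch of how a range can fall out of two inputs rather than a spread of possibilities. It assumes a straight-line trend extended one season ahead, and it uses a deliberately simple one-variable stand-in (`toy_model`) in place of the full regression. The data, the 0.1 weight, and the helper names are all hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def linear_trend_next(values):
    """Fit a least-squares line through seasons 0..n-1 and evaluate it at season n."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(values))
    slope = sxy / sxx
    return y_bar + slope * (n - x_bar)


def projection_range(model, avg_input, trend_input):
    """Run the model on both inputs and report the lower and higher output."""
    a, b = model(avg_input), model(trend_input)
    return min(a, b), max(a, b)


def toy_model(sog_per_game, games=82):
    """Stand-in model for illustration only (hypothetical weight of 0.1 per shot)."""
    return 0.1 * sog_per_game * games


# Hypothetical shots on goal per game over five seasons (not real data).
sog_per_game = [2.0, 2.5, 3.0, 3.5, 4.0]
avg_rate = mean(sog_per_game)
trend_rate = linear_trend_next(sog_per_game)

low, high = projection_range(toy_model, avg_rate, trend_rate)
print(f"average input {avg_rate:.2f}, trend input {trend_rate:.2f}")
print(f"range: {low:.1f} to {high:.1f}")
```

A player whose trend sits close to his average gets a narrow range, and a player who is still climbing gets a wide one.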
What do I mean by the trends? Here is where the other bad modeling behavior comes into play. I’ll use my model for the Coyotes, which I made first, to explain this (and you’ll see why).
Alright #yotes friends:
Here are my goals, assists, points predictions / ranges for the Coyotes roster this year.
* This does NOT use data best practices, but I will go over my models and inputs in a following thread pic.twitter.com/eClgItQYYM
– Clay Collins (@Clay_C10) October 12, 2021
Let’s look at Chychrun’s five year trend for shot attempts per game.
To do this, I’ll use some opinion to help decide what type of trend to use. My reasoning here: “Chychrun has been trending upward in many ways. So is it reasonable to assume that his 6.34 shot attempts per game last season could be bumped up to the forecast 6.8 shot attempts per game? I think so. He is becoming more and more confident, and the Coyotes have traded OEL away. This paves the way for even more time and growth for Chychrun.”
Another example in the other direction is Jay Beagle’s goals per 60.
Well … even if it makes sense in its own strange way, Beagle can’t have negative goals per 60. Even saying zero would be disingenuous.
For him, I went with an exponential model (think of radioactive half-lives):
This seemed to me to be more suitable for my purposes and it fits very well.
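To show why the choice of trend matters for a declining stat, here is a sketch comparing a straight-line forecast with an exponential-decay one. The exponential fit is done the simple way, by fitting a line to the logarithm of the values, which is one common approach and not necessarily how the original trend line was produced. The goals-per-60 values are invented, not Beagle’s real numbers.

```python
import math


def fit_line(ys):
    """Least-squares slope and intercept for the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    x_bar = (n - 1) / 2
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def linear_forecast(ys, x):
    slope, intercept = fit_line(ys)
    return intercept + slope * x


def exponential_forecast(ys, x):
    """Fit y = a * exp(k * x) by fitting a line to log(y); needs every y > 0."""
    slope, intercept = fit_line([math.log(y) for y in ys])
    return math.exp(intercept + slope * x)


# Hypothetical, steadily declining goals per 60 over five seasons.
g60 = [0.8, 0.4, 0.2, 0.1, 0.05]
next_season = len(g60)

linear_next = linear_forecast(g60, next_season)
exp_next = exponential_forecast(g60, next_season)
print(f"linear: {linear_next:.3f}, exponential: {exp_next:.3f}")
```

On data like this the straight line dips below zero for the next season, while the exponential curve keeps shrinking toward zero without ever crossing it.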
I do this for each player and each of their stats, adjusting the trend line to be more reasonable. That means a lot of bias and opinion goes into choosing the trend used to project each input, which is good for producing more reasonable player-by-player models. However, it is bad when trying to project across the entire NHL.
It also means that I am much more certain of my Coyotes stats than the Leafs numbers. I’ve invested a lot more time in how the team should play with the current squad for the next season and can make more informed decisions when deciding on a model.
For the “My ____ selection” columns, I picked whichever of the two outputs I thought was more likely for that player. That is, for some I chose the averages, for others I chose the trends.
How accurate will this model be?
As accurate as using averages and trends can be. For some players, I feel pretty good about my final picks. With others, I feel like my ranges are likely to be accurate. For players with little data or absurd trends, like Bunting, I really have no idea.
While I’d tend to say that using trending data is better overall than using averages, the past two seasons have been far from average for all of us, including NHL players. Because of this, I decided to use both and set the range between the lower and higher values from the two models.
Either way, this is just my first step in my personal journey into predictive modeling. I will be watching these projections throughout the season and hope to come back next season with a much better model and a lot more confidence in my choices.