1. Just so you know, since I've continued to work on this: the fundamental problem was actually that oZS was being drastically over-scaled. The whole point was to remove the correlation between CF% and oZS, and, as you will be glad to know, fixing the scaling changes many scores.
2. I have since stopped using oiS and oiSV and started using SCF%. I didn't do this because of their relation to the goalie, as both stats were used rel to the team. I changed it because of the low sample size.
3. It was a small project. I got an 87%, but I don't plan to stop working on it.
4. Maybe I was too harsh on hockeyviz. Some of their graphs are useful, but I really wanted to see the actual numbers, because I am interested in creating various metrics, which obviously I can't do by judging the thickness of the bars. I'm not trying to find a specific result; after all, it's the same data. Natural Stat Trick is more the kind of site I'm looking for: they have both the graphics and the actual data behind those graphics, and they have all the by-game data one could ask for. There are some useful visuals on hockeyviz, notably the shooting maps, but I don't believe they paint the full picture.
5. I think we are not really connecting on individual stats. My understanding was that you were not referring to on-ice stats (CF, FF, etc.), which are usually categorized differently than individual stats (iCF, P, etc.).
6. I started this project after getting frustrated with looking at
https://frozenpool.dobbersports.com/frozenpool_playerusage.php, which seems to be a very helpful tool, but I couldn't stand that there was no scaling on Corsi. It's common sense that if you start in the O zone more, you are going to have a higher CF% than if you started more in the D zone. You could also see that for many players their GF% was drastically different from their CF%, even if the goalies are very different. But obviously GF% doesn't have a very good sample size, especially for players who have only played a few games. That is one of the reasons I switched away from oiS and oiSV and went with SCF%, because typically there is around 9x more data.
newScore2 here is very simply CF%*SCF%/50. Obviously there is a small correlation between them, enough to change many scores.
This was my solution. If you can tell me whether there is a problem with it, I would actually appreciate it.
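For anyone who wants to check the arithmetic, here is a minimal Python sketch of the newScore2 formula (CF%*SCF%/50). The function name and the sample numbers are made up for illustration; they are not real player data or my actual code.

```python
def new_score2(cf_pct: float, scf_pct: float) -> float:
    """Combine CF% and SCF% (both on a 0-100 scale) as CF% * SCF% / 50.

    Dividing by 50 means a player at exactly 50% in both stats stays at 50.
    """
    return cf_pct * scf_pct / 50.0


# Hypothetical inputs, for illustration only.
even_player = new_score2(50.0, 50.0)      # break-even in both stats
volume_shooter = new_score2(53.0, 48.0)   # good CF%, below-average SCF%

print(even_player)
print(round(volume_shooter, 2))
```

With these made-up inputs, the 53% CF player with a 48% SCF% lands below 53, which is the kind of shot-quality correction I was after.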
The next step would be to scale it by the quality of competition: newScore + sigma[playerScore*TOI]/(Total TOI) - 50.
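Roughly, that quality-of-competition step could look like the sketch below. It assumes that playerScore means each opponent's score, that TOI means the time on ice spent against that opponent, and that Total TOI is the sum of those times. The function name and the numbers are hypothetical, not real data.

```python
def qoc_adjusted(new_score: float, opponents: list[tuple[float, float]]) -> float:
    """Apply newScore + sigma[playerScore*TOI]/(Total TOI) - 50.

    opponents is a list of (opponent_score, toi_against) pairs.
    Assumption: Total TOI is the sum of toi_against over all opponents.
    """
    total_toi = sum(toi for _, toi in opponents)
    if total_toi == 0:
        # No recorded time against anyone, so leave the score untouched.
        return new_score
    # TOI-weighted average of the opponents' scores.
    avg_opponent = sum(score * toi for score, toi in opponents) / total_toi
    # Average competition (50) leaves the score unchanged;
    # tougher competition raises it, easier competition lowers it.
    return new_score + avg_opponent - 50.0


# Hypothetical example: mostly tough opponents, so the score goes up.
print(qoc_adjusted(49.0, [(55.0, 30.0), (50.0, 10.0)]))
```

Subtracting 50 centers the adjustment, so a player who faces exactly average competition keeps the same score.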
Since this is what this whole deep dive down this stupid-ass rabbit hole started from: the point is basically to solve what is wrong with this graph. For example, Thornton has a pretty good 53% CF, but he starts more in the O zone and faces easier competition than the rest of the team, so obviously his score should be worse than that. Vlasic, on the other hand, starts with a 49% CF but plays in the defensive zone more and faces harder competition. And then there is the fundamental problem that Corsi (and Fenwick) don't consider the quality of the shots, hence the inclusion of SCF% and, previously, the rel difference of oiS and oiSV.
Still no clue where you got the idea that I'm purposefully weighting things differently to fit a narrative... but you do you.
Unfortunately, I have not seen many things like this, especially with zone start scaling, and surprisingly, quality of competition (QoC) stats seem to be hard to come by as well. I'm not re-inventing the wheel, I'm using it... it's not rocket science.