Similarity Scores and Projections
July 2, 2006
As many visitors to this site probably know, the latest craze in projection system design is the use of similarity scores. The idea, essentially, is that the performance of players similar to the one we are trying to project is somehow more predictive than his past statistics alone. First of all, I doubt this is really true for batters. As a group they are much more predictable, and their performance does not jump around nearly as much. Moreover, I am unconvinced (though Chris from firstinning.com tells me that this is unquestionably true) that overall similarity scores will tell us anything more than category-based similarity scores for hitters. In other words, I have a lot of doubts that two batters having similar walk totals and batting averages means that their power should develop along similar lines as well. While I am willing to believe that a 24-year-old with, for example, 10, 20, and 30 home runs in his first three seasons may not follow a Marcel projection (22 home runs) because of his young age and steep progression, I think this fact (if it exists) would be reflected in a category-based similarity score, that is, by finding the average performance of 24-year-olds who also hit 10, 20, and 30 home runs from ages 21-23.
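To make the 10/20/30 example concrete, here is a rough sketch in Python. It assumes the 5/4/3 season weighting usually associated with Marcel (most recent season weighted most) and leaves out the regression to the league mean and the age adjustment that the full system applies, so it is only an approximation of where the 22 comes from. The `category_comparables` helper, its `tolerance` parameter, and the `history` rows are all made up for illustration; they are not real players or anyone's actual method.

```python
from statistics import mean

def weighted_baseline(last_three, weights=(3, 4, 5)):
    """Weighted average of three seasons, listed oldest first.

    Assumption: a bare 5/4/3 weighting, with no regression to the
    mean and no age adjustment.
    """
    return sum(w * x for w, x in zip(weights, last_three)) / sum(weights)

def category_comparables(target, history, tolerance=3):
    """Average next-season total among players whose three-season line
    is within `tolerance` of the target in every season.

    Returns (average, number_of_matches); the average is None when
    nobody matches.
    """
    matches = [
        next_season
        for seasons, next_season in history
        if all(abs(a - b) <= tolerance for a, b in zip(seasons, target))
    ]
    return (mean(matches), len(matches)) if matches else (None, 0)

# Hypothetical history: (HR at ages 21, 22, 23) -> HR at age 24.
# These rows are invented purely to show the mechanics.
history = [
    ((9, 21, 29), 34),
    ((12, 18, 31), 30),
    ((11, 22, 28), 27),
    ((25, 24, 26), 25),  # similar total, flat trajectory: not a match
]

target = (10, 20, 30)
baseline = weighted_baseline(target)
comp_avg, n_comps = category_comparables(target, history)

print(f"weighted baseline: {baseline:.1f} HR")
print(f"category comparables: {comp_avg:.1f} HR from {n_comps} players")
```

With real data, the interesting question is whether the comparables' average differs from the weighted baseline by more than noise. The invented rows here say nothing about that either way.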
For pitchers, I am more inclined to believe that similarity scores might be a factor, and that the various components are in fact more likely to interact (making overall similarity scores more predictive than category-based ones), though there too I would assume that similarity scores are simply a substitute for what we really want: scouting information. Here’s my opinion on why similarity scores have been shown to be predictive (especially for pitchers): they are a decent stand-in for scouting information. Baseball Prospectus’ PECOTA system, for example, includes phenotypical attributes (height, weight), and things like strikeouts and walks give us a good idea of the quality (and “hardness”) of a pitcher’s stuff. They’re not always going to be spot-on, but generally, the more Ks and the more BBs, the harder a pitcher’s fastball, and if you get a guy with a lot of Ks and few BBs, you probably have a guy with great stuff. In reality, I think, a statistics-based similarity score system is just a substitute for a scouting-based similarity score system, which would be that much better. Which player is going to age better: the guy who throws mostly fastballs or the curveball pitcher? Probably the first, but PECOTA (or any other similarity score system today) doesn’t know which pitcher is which. It can only guess based on their component stats. At some point, we’ll need to build up an extensive database of scouting information and use that instead.
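For readers who have not seen one, an overall similarity score can be sketched as a distance between standardized stat lines. The version below is a generic z-score-and-Euclidean-distance approach, not PECOTA's actual formula, which I am not reproducing here. The four pitchers and their numbers (K/9, BB/9, height in inches, weight in pounds) are invented for illustration.

```python
from math import sqrt
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each column so that no single stat dominates the distance."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]  # guard against zero spread
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def rank_by_similarity(target, pitchers):
    """Return (distance, name) pairs for everyone but the target, closest first."""
    names = list(pitchers)
    z = dict(zip(names, zscore_columns([pitchers[n] for n in names])))
    t = z[target]
    return sorted(
        (sqrt(sum((a - b) ** 2 for a, b in zip(t, z[n]))), n)
        for n in names
        if n != target
    )

# Hypothetical pitchers: (K/9, BB/9, height_in, weight_lb).
pitchers = {
    "A": (9.5, 4.0, 76, 220),
    "B": (9.2, 3.8, 75, 215),
    "C": (5.5, 2.0, 72, 190),
    "D": (6.0, 4.5, 74, 200),
}

ranked = rank_by_similarity("A", pitchers)
for distance, name in ranked:
    print(f"{name}: {distance:.2f}")
```

Nothing in those four columns says whether pitcher A gets his strikeouts with a fastball or a curveball. A scouting-based version would swap in columns like pitch mix and velocity, if we had them.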
One last thing, on error ranges. One of the cooler features of PECOTA is that it includes error ranges in its forecasts and gives us percentage chances for things like whether a player will break out or collapse. My personal opinion is that those error ranges are done incorrectly (they are based on similarity scores); I believe they should be based on a player’s experience, with a normal distribution built off that. I doubt similarity scores give us better ranges than a simple normal distribution would. Finally, a bit of a personal rant, which should not be viewed as an attack on PECOTA, because I really, really admire what Nate has done with his system: I do not like how BP sells its error ranges. They say the ranges acknowledge the inaccuracy of any projection (which is obviously good). However, what the error ranges really do is add a false sense of accuracy, as if we knew that player X has a 90% chance of being better than his 10th percentile projection or a 10% chance of being better than his 90th percentile projection. Until someone actually studies the accuracy of those percentiles, overall and among different age groups, we really have no idea how much those percentiles mean, and IMO we should approach them with much, much caution.
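Both ideas in that paragraph are easy to sketch. The first function builds a percentile forecast from a plain normal distribution; how the standard deviation should shrink with a player's experience is exactly the part I have not worked out, so `sd` is just a parameter here and the values used are placeholders. The second function is the kind of check I would like to see someone run: take the published 10th percentile forecasts, compare them with what actually happened, and see whether about 90% of players beat them. The forecasts and actuals below are invented numbers, not PECOTA output.

```python
from statistics import NormalDist

def normal_percentile(projection, sd, pct):
    """Percentile forecast from a normal distribution centered on the projection.

    Assumption: `sd` would come from the player's experience (more seasons,
    smaller sd); that mapping is not specified here.
    """
    return NormalDist(mu=projection, sigma=sd).inv_cdf(pct / 100)

def share_beating(forecasts, actuals):
    """Fraction of players whose actual result exceeded the forecast."""
    beaten = sum(actual > forecast for forecast, actual in zip(forecasts, actuals))
    return beaten / len(forecasts)

# A 22-HR projection with a placeholder sd of 8.
p10 = normal_percentile(22, 8, 10)
p90 = normal_percentile(22, 8, 90)
print(f"10th-90th percentile range: {p10:.1f} to {p90:.1f} HR")

# Hypothetical 10th percentile forecasts and actual totals for ten players.
p10_forecasts = [12, 8, 20, 15, 5, 18, 10, 25, 7, 14]
actual_totals = [20, 6, 31, 22, 9, 17, 19, 30, 12, 21]

share = share_beating(p10_forecasts, actual_totals)
print(f"{share:.0%} beat their 10th percentile forecast (well calibrated would be near 90%)")
```

A real study would need far more than ten players, and should repeat the check at several percentiles and within age groups, since a system can be well calibrated overall and still be off for young or old players.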
EDIT: You can read more discussion of this here.