Some statistics just don’t make sense
August 3, 2005
Currently, I’m loving Baseball Prospectus and greedily taking advantage of their week-long free preview. But I have to say: it’s not all good. It’s not just that the folks at BP charge for almost all their content, or that their research is not open source, or that they won’t adopt measures not invented by BP. It’s that they’re willing to come out with a bald-faced lie just to justify themselves. They have plenty of stupid ideas, but instead of scrapping them and admitting that they’re not at all useful (or even scrapping them quietly), BP continues to flaunt ideas that just don’t make sense.
Take, for example, EQA, a statistic that is basically supposed to measure offensive production on a scale similar to batting average. All right, the idea is a nice one, and something I’ll address a little later. But here is the basic problem: EQA does not work on a linear scale, meaning that the difference between a player with a .260 EQA and a player with a .270 EQA is not the same as the difference between a player with a .290 EQA and a player with a .300 EQA. That makes no sense, at least not from the perspective of doing quick analysis. But that’s supposed to be the whole point of a statistic like EQA: by putting offensive contribution on a batting-average-like scale, it’s supposed to help you quickly analyze a player. But it doesn’t do that!
But that’s not all. EQA is a statistic that even BP has acknowledged came about accidentally, through much random tinkering to improve its accuracy. It doesn’t make any more sense than Runs Created, Equivalent Runs, or a million other run estimators whose accuracy is based solely on the fact that they were fitted to the data. There are, in my opinion, two “true” run estimators: linear weights and BaseRuns. Without going into the details and the whys, these are the only two that make sense. And since the accuracy of all run estimators is pretty much the same, it makes no sense to use anything but those two. Yet in the article I linked to, BP touts EQA as a tool whose “ability to estimate runs scored from team and league data is unsurpassed.” Great, except that is a lie. First of all, they likely messed up the way they measured BaseRuns. Secondly, who the hell cares that EQA better evaluates offenses from the 1870s? And thirdly, well, thirdly, there was a great discussion of this on the Fanhome board when this article was written, so I’ll stop.
But let me get back to my point: EQA’s main goal, the reason it is considered a useful statistic, is that it gives us an easy way to evaluate and compare batters. But it doesn’t, really. And that’s why I would much rather use GPA, which uses the same idea (forcing a scale equivalent to batting average) but makes more sense, is easier to calculate ((1.8*OBP+SLG)/4), and doesn’t aspire to be anything more than it is: a useful, and generally accurate, tool.
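To show just how easy GPA is to calculate, here is a minimal Python sketch of the formula quoted above. The function name `gpa` and the sample OBP and SLG values are my own, made up purely for illustration; they are not any real player's numbers.

```python
def gpa(obp: float, slg: float) -> float:
    """Gross Production Average: (1.8 * OBP + SLG) / 4."""
    return (1.8 * obp + slg) / 4


# Hypothetical stat lines (illustrative values, not real players).
hitters = {
    "Hitter A": (0.350, 0.450),
    "Hitter B": (0.320, 0.520),
}

for name, (obp, slg) in hitters.items():
    # Print on a batting-average-like scale, three decimal places.
    print(f"{name}: GPA = {gpa(obp, slg):.3f}")
```

Because the formula is just a weighted sum, a given bump in OBP or SLG moves GPA by the same amount no matter where the hitter starts: ten points of SLG is always worth 2.5 points of GPA.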