# Some statistics just don’t make sense

August 3, 2005 10 Comments

Currently, I’m loving Baseball Prospectus and greedily taking advantage of their week-long free preview. But I have to say: it’s not all good. It’s not just that the folks at BP charge for almost all their content, or that their research is not open source, or that they won’t adopt measures not invented by BP. It’s that they’re willing to come out with a bald-faced lie just to justify themselves. They have plenty of stupid ideas, but instead of scrapping them and admitting that they’re not at all useful (or even just dropping them quietly), BP continues to flaunt ideas that just don’t make sense.

Take, for example, EQA, a statistic that is basically supposed to measure offensive production on a scale similar to batting average. All right, the idea is a nice one, and something I’ll address a little later. But here is the basic problem: EQA does not work on a linear scale, meaning that the difference between a player with a .260 EQA and a player with a .270 EQA is not the same as the difference between a player with a .290 EQA and a player with a .300 EQA. That makes no sense, at least not from the perspective of doing quick analysis. But that’s supposed to be the whole point of a statistic like EQA: by putting offensive contribution on a batting-average-like scale, it’s supposed to help you quickly analyze a player. But it doesn’t do that!

But that’s not all. EQA is a statistic that even BP has acknowledged came about accidentally, through much random tinkering to help its accuracy. It doesn’t make any more sense than Runs Created, Equivalent Runs, or a million other run estimators whose accuracy is based solely on the fact that they were fitted to the data. There are, in my opinion, two “true” run estimators: linear weights and BaseRuns. Without going into the details and the whys, these are the only two that make sense. And since the accuracy of all run estimators is pretty much the same, it makes no sense to use anything but those two. Yet in the article I linked to, BP touts EQA as a tool whose “ability to estimate runs scored from team and league data is unsurpassed.” Great, except that is a *lie*. First of all, they likely messed up the way they measured BaseRuns. Secondly, who the hell cares that EQA better evaluates offenses from the 1870s? And thirdly, well, thirdly, there was a great discussion of this on the Fanhome board when this article was written, so I’ll stop.

But let me get back to my point: EQA’s main goal, the reason it is considered a useful statistic, is that it gives us an easy way to evaluate and compare batters. But it doesn’t, really. And that’s why I would much rather use GPA, which uses the same idea (forcing a scale equivalent to batting average) but makes more sense, is easier to calculate ((1.8*OBP+SLG)/4), and doesn’t aspire to be anything more than it is: a useful, and generally accurate, tool.
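To show just how easy GPA is to calculate, here is a minimal Python sketch of the formula above. The function name and the two batting lines are made up purely for illustration; they are not real players or real seasons.

```python
def gross_production_average(obp: float, slg: float) -> float:
    """GPA as given in the post: (1.8 * OBP + SLG) / 4."""
    return (1.8 * obp + slg) / 4


# Hypothetical batting lines (OBP, SLG), invented for illustration only.
hitters = {
    "On-base guy": (0.400, 0.500),
    "Slugger": (0.330, 0.550),
}

for name, (obp, slg) in hitters.items():
    # Print on a batting-average-like scale, three decimal places.
    print(f"{name}: {gross_production_average(obp, slg):.3f}")
```

With these made-up inputs the on-base guy comes out at about .305 and the slugger at about .286, which shows how the 1.8 multiplier lets a 70-point edge in OBP outweigh a 50-point edge in SLG.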

In response to a comment about my EQA post, I posted a link to a Baseball Primer discussion thread on BaseRuns. Obviously, I […]

Hey Im a huge BP fan, and its nice to hear ur opinion of their stuff. However, could u link me that fanhome discussion i must have missed that one and Id love to read it. Email it to me if u cant do it here. Thanks

TJ, I can’t seem to find it. It may have been deleted. Unfortunately, it seems that Google does not have it in its cache either. There was also a good BsR thread on Primer:

http://www.baseballthinkfactory.org/files/primer/studies_discussion/29382/

Of course I could have named Putsy Caballero, Jimmy Bloodworth, and Steve Jeltz. The point is best made using the example of overrated, power-deficient corner outfielders.

I am skeptical of any and all formulas which include SLG, for the obvious reason that SLG includes SINGLES, which are a component of CONSISTENCY, not power. Isolated power, or power average as I prefer to call it, credits one power base for a double, two power bases for a triple, and three power bases for a home run. While we know linear and regression values for extra base hits deviate from the above assigned values, that is not a case for continuing to employ the archaic SLG statistic, which does little more than pollute research results. Singles hitters like Carew, Gwynn and Suzuki should never be made to look at least proficient in the SLG category because they are/were adept at slapping a lot of singles around the ballpark. To find the unadulterated Power Average, simply subtract batting average from slugging average. NOW the overrated high-average banjo hitters are exposed for what they are – one dimensional imposters!

Rob, you just named three of the best hitters in the history of baseball, which is true whether you use runs created, linear weights, base runs, etc. Singles contribute plenty to scoring and I see no reason to discount them. If anything, SLG underrates singles, because a double is not worth twice as much, nor is a triple worth three times as much, nor is a home run worth four times as much as a single.

Can players be overrated based just on batting average? Yes, and some are. But if you use an advanced run measure like linear weights, you’ll be fine.

I neglected to trash Gwynn and Suzuki for their abysmally low walk-rate, as well. So here you have two RF’s who do not hit for power and who do not work the count to take the walk and help the team. Bill James, years ago in an “Abstract,” made reference to two players who fit the above mold at the time (Steve Garvey and Garry Templeton) as “Notoriously selfish ballplayers.” It’s long past time, in this computerized, information age, to take another look at idolized singles hitters who play corner positions who neither walk nor hit for power. All the P.R. coming down has blinded the general public, but surely we sabermetricians can do better than to fall for the hype.

Rob, Gwynn had 398 career win shares, more than Wade Boggs, Mark McGwire, and I believe more than Rafael Palmeiro. Ichiro has had some great (30+ WS) seasons as well. I see no reason why non-power hitters can’t be valuable and the numbers back that up.

[…] Okay, it’s actually simply an attempt to correct an old one. Last week, I wrote that GPA is a better statistic than EQA and easier to use for comparison purposes. But there is a […]