What’s Missing From Zone Ratings
September 17, 2008
Last week I looked at which pieces of data are needed, and which of those we do not always have available, when constructing defensive metrics. This week I take a look at Zone Rating, pointing out one major missing piece of data.
Bill James’ Defensive Efficiency Rating (DER) is the simplest of calculations, dividing successes (plays made) by opportunities (batted balls). The batter is either safe or out, with no judgment needed on the part of the scorer. To go from team DER to individual DER, the only information that needs to be added is which fielder was responsible for each batted ball, a system I developed in my college summer league in 1982. John Dewan, then of STATS Inc., developed Zone Rating (ZR) in the late 1980s. There are many reasons I like ZR, but there are also several shortcomings. Dave Studeman gave his own review of ZR.
ZR is quite similar to individual DER in that it explicitly answers the “Who?” question for every ball in play, and uses the same successes-per-opportunity percentage. Where ZR differs is in judging the balls in play by level of difficulty, deciding whether each ball is in the fielder’s “zone” (better than a 50% chance) or out of the zone (less than a 50% chance). STATS divides the playing field into zones, defined by angle and distance from home plate. Historical data is used to determine which of those zones have a greater than 50% success rate for the fielder, and balls that fall in these are considered “in zone” for the fielder. Some plays are made out of zone, but a much higher percentage is made in zone. The simpler incarnation of ZR reports only the in zone figures, and ignores any plays, made or not, out of zone. Chris Dial wrote a detailed description of how the balls are coded.
What do we do to account for plays made out of zone? STATS’ current ZR puts plays made out of zone into both the numerator and denominator. This makes sure the percentage never exceeds 1.00, but does nothing to show us what percent of opportunities out of zone were successful. Revised ZR at hardballtimes.com lists separately the total plays made out of zone. This avoids any math problems that come with STATS’ approach, but still doesn’t tell us much. This would be like saying Jeff Reed has made 92% of his field goals inside the 45-yard line, and 10 outside. OK, but how many attempts did he have?
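To make the difference concrete, here is a minimal sketch of the two treatments as I have described them. The function names and the fielder's totals are made up for illustration; they are not STATS' or The Hardball Times' actual code or data.

```python
def stats_style_zr(in_zone_plays, in_zone_chances, out_zone_plays):
    """Out of zone plays go into both the numerator and the denominator."""
    return (in_zone_plays + out_zone_plays) / (in_zone_chances + out_zone_plays)


def revised_style_zr(in_zone_plays, in_zone_chances, out_zone_plays):
    """In zone percentage, with out of zone plays reported as a bare count."""
    return in_zone_plays / in_zone_chances, out_zone_plays


# Hypothetical fielder: 240 plays on 300 in zone chances, 30 plays out of zone.
combined = stats_style_zr(240, 300, 30)
in_zone_pct, ooz_plays = revised_style_zr(240, 300, 30)

print(round(combined, 3))               # 270 / 330
print(round(in_zone_pct, 3), ooz_plays)
```

Neither function takes the number of out of zone chances as an input, because neither version of ZR has it. The first can never exceed 1.00; the second hands us a count with no denominator.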
As STATS scores the games and produces the data, I will lay it at their feet that no incarnation of ZR gives us one simple piece of data – how many out of zone opportunities were there? Given that, we would have, for each fielder, the percentage of plays made in zone, and the percentage made out of zone. Each play would be classified into one of the two categories, each with its own expected and observed values.
Even with only two categories for level of difficulty, ZR data could then be represented in a probabilistic model, using a plus/minus or percentage system to show plays made compared to plays expected. In zone and out of zone percentages could also be combined into a single value in a much more mathematically valid way than is being done now.
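As a rough sketch of what that could look like, assume we did have out of zone opportunities, plus a league-average success rate for each of the two categories. Every number below is invented for illustration, including the out of zone chances that are not actually published, and the function names are my own.

```python
def expected_plays(in_zone_chances, out_zone_chances, lg_in_rate, lg_out_rate):
    """Plays an average fielder would make on the same mix of chances."""
    return in_zone_chances * lg_in_rate + out_zone_chances * lg_out_rate


def plus_minus(in_zone_plays, in_zone_chances,
               out_zone_plays, out_zone_chances,
               lg_in_rate, lg_out_rate):
    """Plays made minus plays expected, over both categories."""
    made = in_zone_plays + out_zone_plays
    expected = expected_plays(in_zone_chances, out_zone_chances,
                              lg_in_rate, lg_out_rate)
    return made - expected


def overall_rate(in_zone_plays, in_zone_chances,
                 out_zone_plays, out_zone_chances):
    """Single combined percentage: all plays made over all chances."""
    return (in_zone_plays + out_zone_plays) / (in_zone_chances + out_zone_chances)


# Hypothetical fielder: 240 of 300 in zone, 30 of 200 out of zone.
# Hypothetical league rates: 80% in zone, 10% out of zone.
pm = plus_minus(240, 300, 30, 200, 0.80, 0.10)
rate = overall_rate(240, 300, 30, 200)

print(round(pm, 1))    # plays above (or below) average
print(round(rate, 3))  # 270 / 500
```

With these made-up inputs the fielder is average in zone and about ten plays better than average out of zone, which is exactly the kind of distinction the current ZR figures cannot show.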
Next week I’ll take a look at probabilistic methods.