How To Construct a Defensive Metric

Given sufficient detail in the play-by-play, defensive stats would be a compilation not only of putouts, assists and errors, but also of the hits, doubles, triples, home runs, extra bases taken by runners, etc., that are charged against each fielder. Then, to provide proper context, these observed values are compared to the expected values for the collection of batted balls that were hit to each fielder.
Whether they are soft grounders to short or line drives in the gap, each play is described by whether it’s a hit or an out, where it is hit, how hard, whether it’s a fly or a grounder, etc. Plays with the same description are grouped, and then the probability of each grouping being an out, error, single, double, triple or homer is calculated. By counting the number of each type of play each fielder is presented with, and then multiplying those counts by the probability distribution of each type, the expected number of outs, hits, etc. for each fielder is derived. Typically, the difference between the observed and expected values is expressed either by subtraction, as a plus or minus number of plays, or as a ratio.
This is one of the places where I use a favorite tool, which I call the “Inverse James Function”. The ever brilliant Dan Fox gave a rundown of James’ original log5 method at The Hardball Times:
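A minimal sketch of that bookkeeping in Python, assuming each play has already been reduced to a (fielder, description bucket, outcome) record. The record layout, the bucket labels and the function names are hypothetical, made up for this illustration rather than taken from any real play-by-play feed:

```python
from collections import Counter, defaultdict


def outcome_probabilities(plays):
    """Share of each outcome within each play-description bucket."""
    counts = defaultdict(Counter)
    for _fielder, bucket, outcome in plays:
        counts[bucket][outcome] += 1
    return {
        bucket: {o: n / sum(c.values()) for o, n in c.items()}
        for bucket, c in counts.items()
    }


def expected_by_fielder(plays, probs):
    """Add up the bucket probabilities over the plays each fielder saw."""
    expected = defaultdict(Counter)
    for fielder, bucket, _outcome in plays:
        for outcome, p in probs[bucket].items():
            expected[fielder][outcome] += p
    return expected


def observed_by_fielder(plays):
    """Count what actually happened on the plays each fielder saw."""
    observed = defaultdict(Counter)
    for fielder, _bucket, outcome in plays:
        observed[fielder][outcome] += 1
    return observed


# Made-up sample: two shortstops, two play descriptions.
plays = [
    ("A", "soft GB to SS", "out"),
    ("A", "soft GB to SS", "out"),
    ("A", "hard GB in hole", "out"),
    ("B", "soft GB to SS", "out"),
    ("B", "soft GB to SS", "single"),
    ("B", "hard GB in hole", "single"),
]

probs = outcome_probabilities(plays)
expected = expected_by_fielder(plays, probs)
observed = observed_by_fielder(plays)

# Plus/minus plays made, and the same comparison as a ratio.
plays_made_a = observed["A"]["out"] - expected["A"]["out"]
ratio_a = observed["A"]["out"] / expected["A"]["out"]
```

With real data the buckets would be as fine as the play descriptions allow, and the probabilities would come from the whole league rather than from the same handful of plays being rated.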
ExAVG = ((BAVG * PAVG) / LgAVG) / ((BAVG * PAVG) / LgAVG + ((1 - BAVG) * (1 - PAVG)) / (1 - LgAVG))
Bill James introduced this in the 1981 Baseball Abstract to answer the question “Given a certain batter and a certain pitcher, in the context of the league mean, what should the result be?”
My inquiring mind twisted this around to ask “What if ExAVG is instead what I observe? I know LgAVG and can calculate PAVG as the expected value. Then if I solve for BAVG I have the true value of B in the context of P and L.” Changing ExAVG to ObsAVG and solving for BAVG, the formula becomes
BAVG = ((ObsAVG * LgAVG) / PAVG) / ((ObsAVG * LgAVG) / PAVG + ((1 - ObsAVG) * (1 - LgAVG)) / (1 - PAVG))
I have used this formula as the core for my Park Factors, comparing the observed home values to the expected road, and also for MLEs, comparing the observed minor league value to the expected major league.
Basically, the formula expresses the ratio of the observed to the expected, multiplied by the mean, but it’s constructed so that the result will never be less than 0 or more than 1. Calling the result R, if Obs = Exp, then R = Lg. If Obs > Exp, then R > Lg, and if Obs < Exp, then R < Lg.
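For anyone who wants to check those properties numerically, here is a small Python sketch. The function names are my own for this illustration, and the inverse is written in one algebraically equivalent form obtained by solving the log5 formula for BAVG; it assumes all inputs are rates strictly between 0 and 1:

```python
def log5(bavg, pavg, lgavg):
    """James' log5 estimate for a batter against a pitcher in a league context."""
    num = bavg * pavg / lgavg
    return num / (num + (1 - bavg) * (1 - pavg) / (1 - lgavg))


def inverse_james(obs, exp, lg):
    """Solve log5 for the batter term, given observed, expected and league rates.

    This is the ratio obs / exp times the league mean, balanced against the
    same ratio for the complementary rates so the result stays between 0 and 1.
    """
    num = obs * lg / exp
    return num / (num + (1 - obs) * (1 - lg) / (1 - exp))


# Round trip with made-up rates: push a .300 hitter through log5 against a
# .250 expectation in a .260 league, then recover him with the inverse.
ex_avg = log5(0.300, 0.250, 0.260)
recovered = inverse_james(ex_avg, 0.250, 0.260)

# When observed equals expected, the result is the league mean.
neutral = inverse_james(0.250, 0.250, 0.260)
```

Feeding in an observed rate above the expected one gives a result above the league rate, and one below gives a result below it, which is the behavior described above.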
Count every batted ball?
Everything gained by the offense is allowed by the pitching and defense. Philosophically, we can take a top-down approach in which the team stats are the sum of the individuals on the team. DER, as a team statistic, uses all batted balls, but individual defensive metrics like ZR and UZR exclude popups. PMR, SFR and OPA! take everything into account, but PMR doesn’t break out the results, such as with separate ratings for groundballs and popups.
Account for every base?
DER and its derivative ZR measure the percentage of batted balls that are made into outs, and so are analogous to batting average. They do not consider extra base hits, as slugging average does. It is a defensive skill, a combination of positioning, range and arm, to keep a batter from stretching a single into a double, or a double into a triple. When evaluating the ability to keep baserunners from advancing, we can use a weighted mean of every groundball and flyball by base and out situation.
I think Dan Fox is also clairvoyant, as he seems to have stolen all my best ideas. His Simple Fielding Runs (SFR) and Pizza Cutter’s OPA! are two metrics which are capable of all these things. As both were designed to evaluate Retrosheet play-by-play data, they have the flexibility to handle most any kind of data input, calculating expected values on the weighted means of whatever play descriptions are available. I suggested to Dan that he could use GameDay data with SFR, and he was able to produce minor league ratings for 2007. This flexibility can also be a weakness, as the generated ratings are only as good as the precision of the play-by-play data being input. Therefore, not all SFR or OPA! ratings, even with the same sample size, have the same level of certainty, as this is dependent on the source of the data, although the difference fades with large enough sample sizes. Given the same input data as UZR or PMR, SFR and OPA! should give equivalent results, but SFR and OPA! can also give results when less than optimal play-by-play is all that is available.
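To make the batting average versus slugging analogy concrete, here is a deliberately simplified Python sketch. It assumes each ball in play is tagged only as an out, single, double or triple; the names and the sample data are hypothetical, and it ignores errors, baserunner advancement and the base and out situation:

```python
# Bases given up on each outcome of a ball in play (illustrative only).
BASES = {"out": 0, "single": 1, "double": 2, "triple": 3}


def out_rate(outcomes):
    """Share of balls in play turned into outs (the batting-average view)."""
    return sum(o == "out" for o in outcomes) / len(outcomes)


def bases_per_ball(outcomes):
    """Bases allowed per ball in play (the slugging-average view)."""
    return sum(BASES[o] for o in outcomes) / len(outcomes)


# Two made-up outfielders who each turn 7 of 10 balls into outs.
fielder_x = ["out"] * 7 + ["single"] * 3
fielder_y = ["out"] * 7 + ["double"] * 3

same_out_rate = out_rate(fielder_x) == out_rate(fielder_y)
extra_bases_y = bases_per_ball(fielder_y) - bases_per_ball(fielder_x)
```

An out-rate metric rates the two fielders identically, while the bases-per-ball view charges the second one for the doubles he failed to hold to singles.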
One other neat thing Pizza Cutter has done with OPA! is the ability to rate infielders on different skills used to make outs, such as range, hands and arm. Each is expressed in terms of runs, so that they can be added to a grand total as well as listed individually. The components can also be scaled, such as 1-10 or A-E. Then you can say that Derek Jeter has a range of E, but an arm of A, which still adds up to a poor shortstop.
With detailed play-by-play now available for all levels of professional baseball, the ability to measure defensive performance is light years ahead of just a few years ago. There are still a few tweaks to the scorekeeping that can eliminate most or all of the need for estimations, and there are also the more major upgrades like fielder locations and batted ball trajectories that may take more time to realize. I’ve commented on several of the metrics currently available. They all have a lot to like, but still have some limitations. Let’s not stand still; there’s still a lot of development to be done.


2 Responses to How To Construct a Defensive Metric

  1. dan says:

    Is there a reason that WOWY isn’t used more? It receives almost no attention (that I’ve seen) but seems like a pretty sound way to evaluate fielders without using PBP data.

  2. Brian Cartwright says:

    WOWY is pretty new, and I like it, although I admit to having it slip my mind when writing this article, as it didn’t quite fit into the methodologies.
    I think that WOWY could be used with or without play by play data. Conceptually, it uses matched pairs of players, so it is probably the best or at least easiest way to split combined performances, such as the feed and pivot men in the double play, or the pitcher and catcher in stolen bases.
    Coding a WOWY tool in SQL needs to go on my to-do list. Tango has shown that it can be used to measure range, but right now I think I’d use it in conjunction with OPA! as a tool to measure some of those specific skills.
