The PanDIBS theory: Pitching and Defense Independent Batting Statistics
September 28, 2008
DIPS changed everything. (Thanks Voros!) It was the first sustained theory that evaluated players not so much by what a player had done over the last year, but by which parts of the player’s (in this case, the pitcher’s) performance were within his control and which were not. The theory has been refined here and there, but the basic idea remains: there are some things that a pitcher has more control over than others. It’s a little disconcerting to think that so much of baseball rides on luck, but it’s important to know.
What’s odd is that this line of theories seemed to stop there. To my knowledge, no one’s really looked at whether there’s any analogous coherent theory out there for batting statistics. Are there some batting stats that seem to be more statistically reliable (i.e., skill based) and some that are more unreliable (i.e., luck based)? I’d contend that the answer is yes, and the pattern works in a specific way. In previous columns, I’ve taken some time to meditate on the statistical reliability of many stats, some more esoteric than others. (Eric Seidman once called me the master of statistical reliability.) When I looked through some of that work from a more holistic standpoint, the pattern became pretty clear.
Take a look back at this article on when different statistics stabilize to the point where they can be considered reliable. The stats that stabilize the quickest are the ones over which the batter might be expected to have the most control (whether or not he swings, how often he makes contact). Next in controllability come the things that depend on some interaction between the batter and the pitch (type of batted ball), and last come the actual results of that batted ball (single, home run, out). Roughly.
Let’s model the outcome of a plate appearance in a flowchart. A plate appearance can basically end in one of four ways. The batter can walk, strike out, be hit by a pitch, or do something that involves hitting the ball (or he’ll reach on fielder’s interference… once every five years). The first three events end the plate appearance right there. If he hits the ball, it will either be a flyball, grounder, liner, or popup. If it’s a flyball, it might be a HR, or it might be an XBH or a single or be caught (or dropped) by a fielder. I could do the same basic breakdown for all the other types of batted balls.
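For readers who think better in code than in flowcharts, here is a rough sketch of that tree in Python. The names (`PA_TREE`, `leaf_paths`, `depth`) are my own labels, and giving every batted-ball type the same four results that the text lists for flyballs is a simplifying assumption, not something the data require.

```python
# Outcome tree for one plate appearance, following the breakdown in the text.
# None marks an outcome that ends the plate appearance with no further split.
# Assumption: every batted-ball type gets the same four results the text
# lists for flyballs; fielder's interference is left out.
BATTED_BALL_RESULTS = ("home_run", "extra_base_hit", "single", "caught_or_dropped")

PA_TREE = {
    "walk": None,
    "strikeout": None,
    "hit_by_pitch": None,
    "ball_in_play": {
        ball_type: {result: None for result in BATTED_BALL_RESULTS}
        for ball_type in ("flyball", "grounder", "liner", "popup")
    },
}


def leaf_paths(tree, prefix=()):
    """Yield every root-to-leaf path through the outcome tree as a tuple."""
    for name, subtree in tree.items():
        path = prefix + (name,)
        if subtree is None:
            yield path
        else:
            yield from leaf_paths(subtree, path)


def depth(path):
    """Number of steps from the start of the plate appearance to the outcome."""
    return len(path)


if __name__ == "__main__":
    for path in leaf_paths(PA_TREE):
        print(depth(path), " -> ".join(path))
```

A walk sits one step from the root, while a home run on a flyball sits three steps down, which is the sense of "further down the flowchart" used in the rest of this piece.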
As you get further and further down the flowchart, with more steps involved in the process, the underlying rates become more unreliable statistically. Part of it is the fact that as you split off further and further, the samples shrink: a player may have 600 plate appearances but only, say, 150 ground balls. Any rate built on 600 measurements will be more reliable than one built on 150. But, in general, when you constrain the data set so that you’re comparing the reliability at 150 PA vs. 150 GB, the stats closer to the base of the flowchart still show up as more reliable.
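The sample-size half of that argument is easy to see in a toy simulation. The sketch below is not the analysis from my earlier columns. It invents a league of players whose true rates are spread around .450 (an arbitrary assumption), gives each one a fixed number of trials, and correlates the first half of his trials with the second half. The function names and every parameter are made up for illustration.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def split_half_r(n_trials, n_players=300, seed=0):
    """Correlate each simulated player's first-half rate with his second-half rate.

    Assumption: true talent is normal around .450 with SD .050, clipped to [0, 1].
    """
    rng = random.Random(seed)
    half = n_trials // 2
    first, second = [], []
    for _ in range(n_players):
        talent = min(max(rng.gauss(0.45, 0.05), 0.0), 1.0)
        first.append(sum(rng.random() < talent for _ in range(half)) / half)
        second.append(sum(rng.random() < talent for _ in range(half)) / half)
    return pearson(first, second)


if __name__ == "__main__":
    for n in (150, 600):
        print(n, "trials: split-half r =", round(split_half_r(n), 2))
```

With the same spread of talent, the 600-trial correlation comes out well above the 150-trial one. This only illustrates the sample-size effect. It says nothing about the equal-sample comparison (150 PA vs. 150 GB), which has to come from real data.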
DIPS proposed two categories for statistical reliability. Category one was a pitcher’s K rate, BB rate, HBP rate, and (I believe erroneously) HR rate. Category two was the now-famous BABIP. BABIP was considered to be the product of luck, while K and BB were the product of skill. Here I propose PanDIBS, with three (perhaps four) tiers of batting statistics to consider. The most reliable of all stats are the swing diagnostics (and we know that they’re important), although no one ever really wants to project what J.D. Drew’s contact percentage will be. Let’s call swing diagnostics the zeroth level. The first level, in terms of reliability, is the DIPS stats: K rate, BB rate, HBP rate. The second level is the player’s batted ball profile (GB%, LD%, FB%). The third level is what we really care about, things like HR and doubles. Sadly, those are the ones most likely to be influenced by luck.
I’d also propose that it’s important to look at each type of batted ball separately. A little while ago, I looked at Kelly Johnson’s season and asked how much consistency there was from year to year when it came to outhitting one’s expected BABIP. The answer was “not much.” (I found a four-year intra-class correlation of around .27.) What I didn’t know then was that there are different effects for different types of batted balls. I looked at how well players did in “outhitting” their expected BABIP, chopped up by each different type of batted ball. For example, 24% of grounders go for hits, while 73% of line drives do. So, we would expect players to have a .730 BABIP on liners and a .240 BABIP on grounders. Of course, things vary, but do they vary consistently? If a player is above average in year one, he should be above average in year two. That’s the mark of a skill-based stat.
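To make the “outhitting” arithmetic concrete, here is a small sketch. The .240 and .730 rates are the ones quoted above. The season line is hypothetical, invented purely for illustration, and `hits_above_expected` is just a name I picked.

```python
# League-wide hit rates by batted-ball type, taken from the text.
EXPECTED_BABIP = {"grounder": 0.24, "liner": 0.73}


def hits_above_expected(batted_balls):
    """Return actual minus expected hits for each batted-ball type.

    batted_balls maps a type name to a (count, hits) pair.
    """
    return {
        ball_type: hits - count * EXPECTED_BABIP[ball_type]
        for ball_type, (count, hits) in batted_balls.items()
    }


# Hypothetical season line, invented for illustration only.
season = {"grounder": (200, 55), "liner": (100, 70)}

if __name__ == "__main__":
    for ball_type, extra in hits_above_expected(season).items():
        print(f"{ball_type}: {extra:+.1f} hits vs. expectation")
```

This made-up hitter is seven hits ahead of expectation on grounders (55 against an expected 48) and three behind on liners (70 against 73). The consistency question is whether those gaps show up again the following year.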
The answer depends on the type of batted ball. Players were more consistent in “outhitting” their expected BABIP on flyballs (ICCs over 4 years were in the mid-.30s, depending on which PA inclusion criteria were used) than on grounders (ICC in the mid-.20s) or line drives (about .10). When I split up flyballs into infield flies (which have an expected BABIP of about .025) vs. non-infield flies, getting more hits on popups turned out to be almost entirely luck (.02 ICC over four years).
It makes sense that there would be more of a “skill” in out-hitting expectations on flyballs. Some players are rather skilled at hitting them off the wall, and some are not. The skill in out-hitting expectations on ground balls is called “speed.” Line drives, on the other hand, are just a matter of luck as to whether someone catches them or not. A high line drive will likely go off the wall, but that’s about it. If a popup goes for a hit, either someone missed it, or the batter simply lucked into a Texas Leaguer. So, when looking at whether a player will continue getting all those hits, it’s important to know what type of hits he’s getting and what the base rate expectations are for that type of ball. So, if you see someone who hits a lot of line drives have a dip in his performance (or a breakout year), expect a lot of regression to the mean. If he’s the kind of guy who hits a lot of flyballs, he’s not going to have to give as much of that back in regression to the mean.
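One rough way to picture how much a hitter “gives back” is to treat the ICC as a shrinkage weight on his surplus hits. That is an assumption on my part, not something the ICCs above strictly license (a four-year ICC is not the same thing as a one-year projection weight), and rounding “mid-.30s” and “mid-.20s” to single numbers is also a simplification. The function name is hypothetical.

```python
# Approximate four-year ICCs reported in the text. "Mid-.30s" and "mid-.20s"
# are rounded to single values here, which is an assumption.
ICC = {"flyball": 0.35, "grounder": 0.25, "liner": 0.10, "popup": 0.02}


def retained_excess(ball_type, hits_above_expected):
    """Shrink an observed surplus (or deficit) of hits toward zero.

    Assumption: the ICC is used directly as the share of the surplus
    a player keeps; the rest is treated as luck that regresses away.
    """
    return ICC[ball_type] * hits_above_expected


if __name__ == "__main__":
    for ball_type in ICC:
        kept = retained_excess(ball_type, 10)
        print(f"{ball_type}: keeps about {kept:.1f} of 10 extra hits")
```

Under those assumptions, a hitter who was ten hits ahead of expectation on flyballs would be expected to keep three or four of them, while ten extra hits on line drives would shrink to about one.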
So, yes, Virginia, it is possible to sort out, in a fairly coherent way, which stats are the result of luck and which are the result of skill for batters too. There is variation in how reliable each stat is, but in general, the farther away the ball gets from the bat, the more luck creeps in to influence the outcome.