# So… maybe clutch hitting exists…

April 8, 2009

A few weeks ago, I looked at whether BABIP eventually stabilized enough that, after X balls in play, a pitcher’s resulting BABIP could be considered an accurate reflection of his actual ability over those X BIP. It turns out that the number was on the order of 3800 balls in play, which is roughly 7-8 years’ worth at 180 innings or so per year. So, it’s not really useful in figuring out who’s good or bad at preventing hits on balls in play, but it suggests that such a skill, however overwhelmed by noise, is out there.

So what about clutch hitting? It’s been found that year to year, there is little in the way of consistency in clutch hitting performance, no matter how you measure it. One year doesn’t tell you much of anything about a player’s clutch abilities. But, that’s not the same thing as no skill being there. It might just be overwhelmed by noise. Maybe we just need a longer view.

I took the PBP data files from 1979-2008 and dumped them into one file. I then calculated win percentages based on the usual inning/out/baserunner/score framework, and then calculated leverage for each situation based on this handy dandy tutorial from Tom Tango. (Tom’s going to yell at me for using that… he’s since moved to using a Markov model, which he says, correctly, is more accurate. But the values that this method produces are good enough for government work, and I don’t know much about how to do Markov modeling.)

Once I had win probability added (WPA) values for each batting event and leverage index (LI) values for each situation, I could measure the amount of clutch in a given PA as WPA – WPA/LI, which has become the standard operational definition of the term. The rest is just a matter of split-half reliability.
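For anyone who wants to see the arithmetic, here is a minimal sketch of that per-PA definition. The function names and the sample numbers are made up for illustration, and rejecting a non-positive LI is my own assumption rather than anything from the original calculation.

```python
def pa_clutch(wpa, li):
    """Clutch for a single PA: raw WPA minus leverage-neutral WPA (WPA/LI)."""
    if li <= 0:
        # Assumption: treat a non-positive leverage index as bad input.
        raise ValueError("leverage index must be positive")
    return wpa - wpa / li


def total_clutch(events):
    """Sum per-PA clutch over an iterable of (wpa, li) pairs."""
    return sum(pa_clutch(wpa, li) for wpa, li in events)


# Hypothetical plate appearances as (WPA, LI) pairs.
sample = [(0.10, 2.0), (-0.05, 0.5), (0.02, 1.0)]
print(round(total_clutch(sample), 3))
```

Note that a PA at exactly average leverage (LI = 1.0) contributes zero clutch, a good result in a high-leverage spot counts as positive clutch, and a bad result in a low-leverage spot also comes out positive, since the damage was done when it mattered least.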

As per usual, I went through each player’s plate appearances over the course of those 30 years and numbered them sequentially. I then took matching samples (even numbered vs. odd numbered) of X number of plate appearances. So, if I was looking for 1000 PA, what I really found were all players who had 2000 PA total, and took 1000 in each column. I calculated how much clutch was present for each player in each of those matching samples and ran a correlation between the two. If clutch is a repeatable skill, the correlation between the two should creep up as X gets bigger. In theory, when we get to a trillion PA, the numbers should match perfectly and the correlation will be 1.0. Of course, no one will ever accumulate that many PA.
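The even/odd procedure described above can be sketched roughly like this. It is a sketch under assumptions, not the code actually used: the input layout (a dict of per-PA clutch values in chronological order), the function names, and the choice to use each player’s first 2X plate appearances are all mine.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def split_half(clutch_by_player, pa_per_half):
    """Return (number of qualifying players, even/odd split-half correlation).

    clutch_by_player maps a player id to his per-PA clutch values,
    ordered chronologically (hypothetical layout).
    """
    odd_totals, even_totals = [], []
    for values in clutch_by_player.values():
        if len(values) < 2 * pa_per_half:
            continue  # not enough PA to fill both halves
        # Assumption: use the first 2X PA of the player's career.
        window = values[: 2 * pa_per_half]
        odd_totals.append(sum(window[0::2]))   # PA numbered 1, 3, 5, ...
        even_totals.append(sum(window[1::2]))  # PA numbered 2, 4, 6, ...
    return len(odd_totals), pearson(odd_totals, even_totals)
```

Calling `split_half(data, 1000)` would keep only players with at least 2000 PA and correlate their odd-numbered and even-numbered halves. It needs at least two qualifying players whose totals are not all identical, otherwise the correlation is undefined.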

| Number of PA (per half) | N (players) | Split-half correlation |
| --- | --- | --- |
| 1000 | 869 | .174 |
| 2000 | 429 | .304 |
| 3000 | 186 | .431 |
| 4000 | 74 | .489 |
| 5000 | 20 | .656 |

After 5000 PA, I couldn’t go any higher, and that sample of 20 players who have 10,000 PA to split into 2 halves with 5000 PA each (for the curious: R. Alomar, Baines, Biggio, Boggs, Bonds*, Steve Finley, Luis Gonzalez, Griffey, Gwynn, Rickey, McGriff, Molitor, Murray, Palmeiro, Raines, Ripken, Sheffield, Oz. Smith, F. Thomas, Vizquel) is a very selective and very small sample. (Oddly enough, most of these guys are career negative clutch.) But, with that said, the split-half number is approaching .70, which I use as my own personal cutoff for “stable enough.” (Reason: at .707, you’ve got an R-squared of .50, which means that half of the variance has been accounted for within the player himself.) A decent guess is that somewhere around 6000 PA, we have enough of a read on a player’s clutch-iness that it actually means something. That’s about 8-10 years as a full-time starter.
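To make the .707 cutoff concrete, here is a small sketch that squares each split-half correlation from the table to get the share of variance accounted for, and checks it against the cutoff. The variable and function names are just for illustration.

```python
# Split-half correlations from the table, keyed by PA per half.
split_half_by_pa = {1000: 0.174, 2000: 0.304, 3000: 0.431, 4000: 0.489, 5000: 0.656}

CUTOFF = 0.707  # correlation at which R-squared is roughly .50


def shared_variance(r):
    """R-squared: the share of variance accounted for at correlation r."""
    return r * r


for pa, r in split_half_by_pa.items():
    print(pa, round(shared_variance(r), 3), r >= CUTOFF)
```

None of the observed values clears the cutoff, which is why the 6000 PA figure is a guess about where the trend is heading rather than something read off the table.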

So maybe we can start talking about clutch careers in retrospect. These aren’t the kind of data that I would want to bet on from a technical statistical perspective (methodological, sample-size, and selectivity issues abound!). From a front office perspective, this is useless. Who wants a stat that just barely makes it over the “stable” mark after 8-10 years? But at the sports bar… well now that’s a different story.

This would totally be an acceptable sports bar conversation.