More on plate discipline
June 10, 2007
A few follow-ups to my “Plate Discipline” article and post. Several people have suggested further directions, and I’ve taken a quick look at some of them. For those of you interested in all of the raw numbers from 2006, you can find them here.
Justin at On Baseball and the Reds commented on the piece, suggesting that I might be able to identify groups of players (for example, those with low sensitivity and low response bias). That’s doable. I reached back into my bag of statistical tricks and pulled out a method that does just that. It’s called cluster analysis. It takes a bunch of data points and figures out which two are the closest together. Those two get put into a group. Then it repeats the same process with what’s left, over and over. At some point, I’ll do an explanation piece on how it works, but for now, just understand that it creates groups of data points. The most logical solution was a five-cluster solution. Three of the clusters had high sensitivity: one with really high response bias, one with really low response bias, and one with a response bias around 1.0. The final two groups were medium sensitivity with a response bias around 1.0 and low sensitivity with a response bias around 1.0. I found that odd. The group with the widest variation in response bias was the high-sensitivity players. I looked at a graph plotting sensitivity numbers against response bias numbers and found the following, which looks like a megaphone.
It looks like the better a hitter knows the strike zone, the more he can get away with either being very patient or being very aggressive. Interesting.
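For anyone curious about the mechanics, here is a rough sketch of the "find the two closest, group them, repeat" idea behind the cluster analysis above. It is not the exact procedure I ran. The function name, the use of straight-line distance between group centers, and the six (sensitivity, response bias) pairs are all assumptions made up for illustration.

```python
from itertools import combinations
from math import dist


def agglomerate(points, n_clusters):
    """Merge the two closest groups until only n_clusters remain.

    Closeness here is the Euclidean distance between group centers.
    That is an assumption for this sketch; other rules exist.
    """
    clusters = [[p] for p in points]  # every point starts as its own group

    def center(cluster):
        # Average each coordinate across the points in the group.
        return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))

    while len(clusters) > n_clusters:
        # Find the pair of groups whose centers are closest together.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: dist(center(clusters[pair[0]]),
                                  center(clusters[pair[1]])),
        )
        # i < j always holds, so removing group j leaves index i valid.
        merged = clusters[i] + clusters.pop(j)
        clusters[i] = merged
    return clusters


# Made-up (sensitivity, response bias) pairs, NOT real 2006 numbers.
hitters = [
    (2.0, 0.5), (2.1, 0.6),  # high sensitivity, low response bias
    (2.0, 1.6), (2.1, 1.5),  # high sensitivity, high response bias
    (1.0, 1.0), (1.1, 1.0),  # lower sensitivity, response bias near 1.0
]

for group in agglomerate(hitters, 3):
    print(group)
```

With real data the two measures sit on different scales, so you would probably want to standardize them before measuring distances. Deciding how many groups to stop at (five, in my case) is a judgment call this sketch leaves to the caller.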
Peter Jensen, commenting on TangoTiger’s blog, pointed out that a batter might take a strike if the pitch was one that he couldn’t hit. He recommended doing a study only on full counts. This way, the pitcher has no incentive to throw anything out of the strike zone (a throw-away pitch) and the batter has no incentive to take the pitch if he thinks it’s in the strike zone. I ran the numbers the old-fashioned way on overall data from 2004-2006, and then again on 3-2 pitches only (using only players who had seen at least 100 such pitches over those three years), and found that response biases jumped. And how! The correlation between overall response bias and 3-2 response bias was .209. The correlation between sensitivity in the two situations, however, was .605. Not perfect, but still pretty good. It looks like players change their approach in different counts, but are still pretty consistent in differentiating between balls and strikes.
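As a rough sketch of the comparison above: drop anyone with fewer than 100 full-count pitches, then correlate each remaining player's overall number with his 3-2 number. The player names, the values, and the function names below are invented for illustration, so the output will not match the .209 or .605 figures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


def full_count_correlation(players, min_pitches=100):
    """Correlate overall vs. 3-2 values for players with enough 3-2 pitches."""
    qualified = [row for row in players.values() if row[2] >= min_pitches]
    overall = [row[0] for row in qualified]
    full_count = [row[1] for row in qualified]
    return pearson_r(overall, full_count), len(qualified)


# Hypothetical rows: (overall response bias, 3-2 response bias, 3-2 pitches seen)
players = {
    "Hitter A": (0.80, 1.40, 150),
    "Hitter B": (1.10, 1.90, 120),
    "Hitter C": (0.95, 1.20, 101),
    "Hitter D": (1.30, 2.10, 60),  # under 100 full-count pitches, so dropped
}

r, n_players = full_count_correlation(players)
print(f"r = {r:.3f} across {n_players} qualifying hitters")
```

The same function works for sensitivity: put the overall and 3-2 sensitivity values in the first two slots of each row instead.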
Dan Fox from Baseball Prospectus (reproduced on his blog) has his own take on the subject, which I found to be rather enlightening. He pulled pitch location data from MLB Enhanced GameDay, took a look, and published a few metrics of his own. He points out that the problem with the dataset is that this information is only collected at certain ballparks. But, it’s the beginning of a good framework. I’d like to see how many pitches in the strike zone a batter swings at but misses. (Update: ask and ye shall receive!) Maybe there will be more from that.
In any case, thank you all for your comments.