The measure of a man, part I
February 3, 2009
This is a rather special post for me to write. About 2.5 years ago, I was at a professional conference at the University of Kansas (Rock Chalk, Jayhawk), and truth be told, I was bored. At that time, I had been hanging out on a few Sabermetric websites, and had briefly toyed with the idea of getting into Sabermetric research myself. So, instead of listening to the presentations on child clinical psychology, I made it look like I was taking notes and instead sketched out a research plan for some Sabermetric work. This is the piece that I envisioned writing. Apparently, if you want to get Sabermetrically inspired, you go to Kansas.
I wanted to look at how I might take hitters and reduce them down to a few basic stats that would describe their abilities, rather than their performances. I suppose that there are thousands of numbers that I might generate, but I wanted to break it down to a manageable number, perhaps ten to twelve numbers total.
So, I started off by breaking things down piece by piece within a plate appearance from the batter’s perspective. No matter what else happens, the pitcher will throw a pitch. What happens next will depend on a few things.
- The batter will have to look at that pitch and figure out what it’s going to do. Is it a strike? Is it hittable? Is it juuuuuust a bit outside?
- With that information, the batter must decide whether or not to swing. Some people (*cough*Vlad*cough*) will swing at anything. Some prefer to keep their bat behind their ear until they’re absolutely sure.
- If he swings, he will either make contact or not. Some folks are good at this. Some… are not.
- If he makes contact, the ball will either go fair or foul. If it’s a foul ball, the plate appearance continues (yeah, I know someone could catch it.)
- If the ball goes fair, it will be a ground ball, a line drive, or a fly ball.
- And it will either go far, far away from home plate or stay close by.
- Either way, the batter will have to run to first (and beyond?) once the ball is hit… unless, of course, he hits it out of the park. Or right at the shortstop.
Now, in order to capture abilities (rather than performance), what we’d ideally see are statistics that hold up over time (hence, my obsession with reliability). Some of the stats that would measure the abilities above already existed (GB%, FB%, LD%, contact%). Some easy reliability analyses show that they stabilize rather quickly, so I’m comfortable with these things being considered repeatable. It’s pretty easy to see, even from a small sampling of plate appearances, whether a player is a ground ball or a fly ball guy.
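For anyone wondering what an "easy reliability analysis" might look like in practice, here is a minimal sketch. The post doesn't say which method was used, so split-half reliability with the Spearman-Brown correction is an assumption on my part, and the players and their ground ball tendencies below are simulated, not real data. The function names are made up for illustration.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def split_half_reliability(outcomes_by_player):
    """Correlate each player's rate on even-numbered events with his rate
    on odd-numbered events, then apply the Spearman-Brown correction.

    outcomes_by_player: one list of 0/1 outcomes per player, in order
    (here, 1 = ground ball, 0 = anything else).
    """
    even_rates = [mean(o[0::2]) for o in outcomes_by_player]
    odd_rates = [mean(o[1::2]) for o in outcomes_by_player]
    r = pearson(even_rates, odd_rates)
    return 2 * r / (1 + r)  # Spearman-Brown: step up from half length to full


# Simulated league (an assumption, not real data): 200 hitters, each with his
# own "true" ground ball tendency, observed over 100 batted balls apiece.
rng = random.Random(0)
players = []
for _ in range(200):
    true_gb = rng.uniform(0.30, 0.60)
    players.append([1 if rng.random() < true_gb else 0 for _ in range(100)])

reliability = split_half_reliability(players)
print(f"split-half reliability of GB%: {reliability:.2f}")
```

The closer that number is to 1.0, the more a hitter's rate in one half of his sample tells you about his rate in the other half, which is the sense of "repeatable" used here.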
Then, there were some stats that needed creating from the ground up to describe each of these steps. So, to measure parts 1 and 2, I created my twin plate discipline scores, sensitivity and response bias. These two scores ended up correlating quite nicely with strikeout rate (sensitivity) and walk rate (response bias).
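Sensitivity and response bias are terms borrowed from signal detection theory. The exact formulas behind my scores aren't spelled out in this post, so treat the following as a sketch of the textbook versions (d′ and the criterion c), not as the actual plate discipline calculation. The function name, the four pitch counts it takes, and the small-sample correction are all assumptions made for illustration.

```python
from statistics import NormalDist


def plate_discipline(swings_at_strikes, strikes_seen, swings_at_balls, balls_seen):
    """Textbook signal detection scores for swing decisions (an illustration,
    not necessarily the formula behind the scores in this post).

    A "hit" is swinging at a pitch in the zone; a "false alarm" is swinging
    at a pitch out of the zone.
    """
    # Log-linear correction keeps both rates strictly between 0 and 1,
    # so the inverse normal below is always defined.
    hit_rate = (swings_at_strikes + 0.5) / (strikes_seen + 1)
    false_alarm_rate = (swings_at_balls + 0.5) / (balls_seen + 1)

    z = NormalDist().inv_cdf
    # Sensitivity (d'): how well the batter tells strikes from balls.
    d_prime = z(hit_rate) - z(false_alarm_rate)
    # Criterion (c): negative means he leans toward swinging, positive toward
    # taking. The post describes higher bias as more swinging, so its scale
    # may run in the opposite direction from c.
    criterion = -(z(hit_rate) + z(false_alarm_rate)) / 2
    return d_prime, criterion


# Two made-up hitters: a selective one and a free swinger.
selective = plate_discipline(700, 1000, 200, 1000)
free_swinger = plate_discipline(850, 1000, 450, 1000)
print("selective:    d' = %.2f, c = %.2f" % selective)
print("free swinger: d' = %.2f, c = %.2f" % free_swinger)
```

The point of splitting things this way is that two hitters can swing at the same share of pitches overall and still differ in how well they pick which ones to swing at.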
I studied foul balls, and while it’s easy enough to get a foul ball rate, I found that not all foul balls are created equal. Two-strike foul balls were good foul balls, indicating a batter who made better contact. Foul balls at zero or one strike indicated a plate appearance more likely to end in a strikeout… or a home run. They suggested a player who took riskier swings. Plus, rates of the two types of foul balls were largely uncorrelated, suggesting that they are two separate skills. So, I split two-strike fouls from 0-and-1-strike fouls.
For number six, I created a power score. Why? Because I’m cool like that. For number seven, Bill James had already created a speed score formula, which I simply took and made slightly better, although infinitely more complicated to calculate. I don’t expect anyone to calculate my scores by hand, but I wrote syntax that will calculate them automatically. And since mine are slightly better (and because it’s my party), I’ll use mine.
I hit my goal of ten numbers: sensitivity, response bias, contact%, 0&1 strike foul rate, 2 strike foul rate (per 2 strike PA’s), GB%, LD%, FB%, power score, and speed score (mine). I’ve subjected all of them to reliability analyses and they all pass with flying colors, even at sample sizes as low as 100 PA.
The thing is that some of these numbers might just overlap. After all, what’s a good way to spoil a lot of 2 strike pitches? Be a good contact hitter! So, I needed to see whether these ten factors stood on their own, or whether they might be reduced further. Warning: it’s going to get nerdy.
I calculated each of the above for each player in 2008 who had at least 100 PA, and subjected the numbers to an exploratory factor analysis to see which ones stuck together. In theory, if these were ten completely independent skills, none of them would. EFA is one of those things that if you’re reading this blog, you could probably grasp with a little bit of reading if you don’t know it already. The two sentence version is this. Suppose you had a bunch of questions or measures or something. Which of them inter-correlate with one another?
For those in the know, gory details: I used a Varimax rotation, and asked the computer to save factors with an Eigenvalue over 1.0. (And if you’re an “elbow rule” devotee, the last factor had an EV of 1.05, but it was a really well-defined factor…)
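If you'd like to see the moving parts, here is a rough sketch of that recipe: build the correlation matrix, keep factors with an eigenvalue over 1.0, apply a Varimax rotation, and blank out loadings below .30. It assumes principal-component extraction (the post doesn't say which extraction method was used), it needs NumPy, and it runs on simulated data with two built-in factors, not on the 2008 hitters. The function names are made up for illustration.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Standard Varimax rotation of a (variables x factors) loading matrix."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    last_total = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3 - rotated @ column_ss / n_vars)
        )
        rotation = u @ vt
        total = s.sum()
        if total < last_total * (1 + tol):  # no meaningful improvement left
            break
        last_total = total
    return loadings @ rotation


def extract_factors(data, min_eigenvalue=1.0):
    """Keep components of the correlation matrix whose eigenvalue exceeds
    min_eigenvalue, then rotate them with Varimax."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > min_eigenvalue
    unrotated = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, varimax(unrotated)


# Simulated data (an assumption): 300 "players", six measures driven by two
# hidden traits, with measures 2 and 6 running opposite to their trait.
rng = np.random.default_rng(0)
n = 300
f1, f2 = rng.normal(size=n), rng.normal(size=n)


def noise():
    return rng.normal(scale=0.5, size=n)


data = np.column_stack([
    f1 + noise(), -f1 + noise(), f1 + noise(),
    f2 + noise(), f2 + noise(), -f2 + noise(),
])

eigvals, loadings = extract_factors(data)
# Suppress small loadings the same way the table does (below .30).
display = np.where(np.abs(loadings) < 0.30, np.nan, loadings.round(2))
print("eigenvalues:", eigvals.round(2))
print(display)
```

Measures driven by the same hidden trait should land on the same factor, with opposite signs where they run in opposite directions.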
The factor loading plots looked like this (loadings below .30 suppressed):
| variable | factor 1 | factor 2 | factor 3 | factor 4 |
Not surprisingly, some of these skills clump together and in ways that make sense. On factor 1, ground ball percentage and flyball percentage load beautifully (1.0 is the maximum for a factor loading), and in opposite directions. So, you’re either a flyball hitter or a groundball hitter. But what’s more interesting is that power and speed both load on this factor. Power hitters hit fly balls and fast guys hit grounders. Those loadings aren’t great but they do suggest a distinction between the little slap and run hitter and the big fly (lead footed) power hitter. Let’s call it the Ichiro-Ryan Howard continuum.
Factor two shows that players who are good at making contact in general and in spoiling 2 strike pitches (which common sense tells us is a function of making contact as well) are the ones who are skilled at avoiding strikes. Let’s call factor two “contact.”
Factor three is an interesting factor. The higher the response bias (likelihood of swinging), the more early-count foul balls a player hits, and to some extent, even though he swings more, he makes contact less. So, he’s coming up empty a lot of the time. The thing is that early-count foul balls are associated with home runs, so a guy who swings a lot is a guy who likes to take his chances. Factor three is “risk taking.”
Factor four is something that surprised me. It’s pretty much line drive percentage, but power score loads pretty heavily on there. LD% is in the formula for power score, so maybe that’s what’s driving the correlation, and power score, as I’ve defined it, has much more to do with making good contact. Still, it has a pretty good correlation with HR/FB, so one of the marks of a good power hitter is apparently hitting a lot of line drives. Let’s call this factor “solid contact.”
So we have a four-factor structure: What sort of approach does the player take (slap and run vs. swing hard and hope you hit it), how likely is he to take risks, how good is he at making contact, and how solid is that contact. Makes sense. To be on the safe side, I re-did the same factor analysis with the same measures, this time using data from 1993. The point there is that it’s mostly a different group of guys (and the few carry-overs to 2008 were all 15 years younger then…) I got virtually the same factor loading plot. Looks like this model holds across time.
Why is this important? Because now we have scales which are orthogonal, based on statistics with good metric properties, and follow a reasonable flowchart of what a hitter is actually expected to do. This should come in rather handy.