September 8, 2008
Consider the act that you are doing now. No, not wasting time at work. You’re reading. Ever stop for a moment and think about how complicated a process reading really is? After all, you must be able to see the squiggly lines on the screen, recognize them as symbols of a language system (aka letters), be able to sound them out, put the combinations of letters together as words, and understand what they mean both as words and as part of a sentence. That’s a lot of work, but of course, you’ve learned to do it almost automatically after years of practice. Babies can’t read. Why not? They can see the squiggles, but they don’t (yet) recognize them as meaningful symbols.
One of the things that drives me nuts when people start diagnosing the problems (or successes!) of baseball teams or players is that most of the explanations focus on the need for the team/player to increase or decrease their performance in just one area. (“He just needs to walk more. Then he’ll be a Hall of Famer.”) Beyond my general distaste for any theory that relies on “one magic bullet,” this kind of explanation shows a general misunderstanding of how people, including baseball players, develop. A few weeks ago, I talked about a few insights that I have on baseball, specifically the development of baseball players, that come out of my work studying the development of kids. Here’s another.
Now, this is one where I think the casual fan and the Sabermetrician (including myself) are both guilty of using “one magic bullet” thinking. Sabermetricians just have prettier bullets. But saying that “He just needs to walk more” is kinda like looking at a child who is having trouble reading and saying that “he just needs glasses.” The logic is fairly sound. If he can’t see the words, he won’t be able to read them, so certainly giving him a pair of glasses won’t hurt. And, if he’s got all of the other skills necessary for reading (good phonemic awareness and processing, symbolic decoding skills, and a grasp of vocabulary, grammar, and usage), glasses will probably do him a lot of good. Yes, the better your visual acuity (to a point), the better your reading ability will be, if you have all the other skills necessary for reading. However, having perfect visual acuity is useless if you don’t have the ability to understand that those squiggly lines on the screen are letters!
An example: when I wake up in the morning in Cleveland, my glasses make the difference in my being able to read the road signs I pass on my way to work. When I went to Moscow with my wife two years ago, glasses or no, I couldn’t read the signs on the subway. In Cleveland, I know what the squiggly lines on the signs mean. In Moscow, they’re just meaningless squiggles because they’re in Russian. ?? ?? ?????? ????????? ???. ???? ?? ??????, ?? ?? ??????????? ???????.
When you change the level of one skill (comprehension of the language), the relationship between two other things (visual acuity and reading ability) changes. In the Cleveland case, improving my visual acuity by giving me glasses helped because it happened in the presence of another skill (comprehension of the English language). In the Moscow case, improving my visual acuity made no difference and was not correlated with my ability to read Russian because I have no clue when it comes to the Russian language. In one case, visual acuity and reading are correlated. In the other, the same two variables are uncorrelated. That’s the essence of a moderator.
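To make the glasses example concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the `simulate_readers` helper, the rule that acuity only helps when the reader knows the language, and the noise level are assumptions, not measurements of anything.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_readers(knows_language, n=2000, seed=0):
    """Return (acuity, reading) lists for n hypothetical readers."""
    rng = random.Random(seed)
    acuity = [rng.random() for _ in range(n)]
    # Assumed rule: acuity only translates into reading ability
    # when the reader understands the language on the sign.
    slope = 1.0 if knows_language else 0.0
    reading = [slope * a + rng.gauss(0, 0.2) for a in acuity]
    return acuity, reading


cleveland_r = pearson(*simulate_readers(knows_language=True, seed=1))
moscow_r = pearson(*simulate_readers(knows_language=False, seed=2))
print(f"knows the language:        r = {cleveland_r:.2f}")
print(f"does not know the language: r = {moscow_r:.2f}")
```

The same two variables are strongly correlated in one group and essentially uncorrelated in the other, purely because a third variable differs between the groups.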
Sabermetrics (and a lot of other fields) needs a more nuanced approach. We like bivariate correlations and regressions because they show one variable’s effects on another. They’re easy to understand, and they may indeed produce some interesting (which is not necessarily the same thing as useful) conclusions. But here, I am arguing for a proper understanding of something called moderator effects and their application to baseball. For illustration, I specifically chose to look at the effects that moderators have on batters’ home run rates. I could do this type of analysis for any stat, really, but home runs are fun and people obsess over them.
Reader, if you’re interested in the mathematical guts of the method, they are hiding behind the cut. The short version: I calculated a bunch of stats, including rate stats (1B rate, K rate, BB rate, etc.), swing diagnostics (swing %, contact %, pitches per PA), and batted ball stats (LD rate, GB rate), for all player-seasons from 2003-2007 (min 250 PA) and looked for interactions between these stats in predicting HR rates. (I also looked at HR/FB, but the results were pretty much the same.) A bunch of interactions popped up (I ran everything as a moderator of everything else), which tells me that (publicly available) Sabermetrics has a lot of work to do on this one. I picked the strongest ones and, of those, the most interesting ones to report on, but there’s plenty more where this came from.
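For readers who want to see what "looking for an interaction" can mean in practice, one common approach is a regression that includes a product term between the predictor and the moderator. The sketch below assumes that approach; it is not necessarily the exact model behind the cut. The data are synthetic, and the function names (`fit_moderated_regression` and friends) and the "true" coefficients are hypothetical.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def fit_moderated_regression(x, m, y):
    """Fit y = b0 + b1*x + b2*m + b3*(x*m) on mean-centered x and m."""
    mean_x = sum(x) / len(x)
    mean_m = sum(m) / len(m)
    rows = [[1.0, xi - mean_x, mi - mean_m, (xi - mean_x) * (mi - mean_m)]
            for xi, mi in zip(x, m)]
    return fit_ols(rows, y)


# Synthetic "player-seasons" (NOT real data): z-scored predictor and moderator.
rng = random.Random(42)
n = 1000
xbh = [rng.gauss(0, 1) for _ in range(n)]
contact = [rng.gauss(0, 1) for _ in range(n)]
# Made-up generating model with an interaction coefficient of 0.4.
hr = [1.0 + 0.2 * x - 0.5 * m + 0.4 * x * m + rng.gauss(0, 0.1)
      for x, m in zip(xbh, contact)]

b0, b_xbh, b_contact, b_interaction = fit_moderated_regression(xbh, contact, hr)
print(f"interaction coefficient: {b_interaction:.2f}")
```

A product-term coefficient that is reliably different from zero is the statistical signature of a moderator: the slope of the outcome on one variable depends on the level of the other. Centering the predictors first keeps the main-effect coefficients interpretable as slopes at the average level of the other variable.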
Three variables came out most strongly as moderators and in some rather interesting ways: contact percentage, swing percentage, and pitches per PA. Swing diagnostics make a difference.
For example, there was an interaction between extra base hit (XBH) rate (doubles and triples) and contact rate. Generally, we figure more XBH means more HR, but not always. It depends on what’s going on with another skill, contact rate. Contact rate changes the relationship between XBH and HR.
Players who had low contact rates in general hit more homers than those with high contact rates. Makes sense since a power swing is generally one which sacrifices the chances for making contact in exchange for a chance at hitting the ball farther if you do make contact. But, let’s say that you see a hitter’s XBH rate creeping up. Should you expect more or fewer HR from him? If he’s a high contact hitter, you should expect more. If he’s a low contact hitter, you should expect fewer.
Players who are already aiming for the fences will probably succeed, but some of the balls are bound to go over the fence and others just to hit the fence. Some guys get unlucky in the sense that they have to make do with more doubles and fewer HR. Players who are contact hitters and who usually hit singles are probably changing their approach a little and instead of prioritizing contact, they are aiming a bit more for the fences. Moderators make a difference.
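The sign flip described above is what falls out of an interaction term when you compute "simple slopes," meaning the slope of HR rate on XBH rate at a chosen level of contact rate. In the sketch below, the coefficients are hypothetical numbers picked only to reproduce the direction of the pattern (they are not the fitted values from this analysis), and the variables are assumed to be standardized.

```python
def simple_slope(b_x, b_interaction, moderator_value):
    """Slope of the outcome on x at a given (centered) moderator value.

    From y = b0 + b_x*x + b_m*m + b_interaction*x*m, the slope with
    respect to x is b_x + b_interaction*m.
    """
    return b_x + b_interaction * moderator_value


# Hypothetical coefficients on standardized variables, for illustration only.
B_XBH = 0.05
B_XBH_BY_CONTACT = 0.30

low_contact_slope = simple_slope(B_XBH, B_XBH_BY_CONTACT, -1.0)   # 1 SD below average
high_contact_slope = simple_slope(B_XBH, B_XBH_BY_CONTACT, 1.0)   # 1 SD above average

print(f"low-contact hitter:  HR slope on XBH = {low_contact_slope:+.2f}")
print(f"high-contact hitter: HR slope on XBH = {high_contact_slope:+.2f}")
```

With a positive interaction coefficient and a small main effect, the slope is negative for the low-contact hitter and positive for the high-contact hitter, which is the "more XBH means more HR, but not always" pattern in numeric form.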
There’s another interaction I found between swing percentage and contact percentage. Again, low contact hitters hit more HR, but what happens when a hitter swings more? If he’s a high contact hitter, swinging more won’t really do much to his HR rate. But, if he doesn’t make contact a lot, swinging more will actually depress his HR rate. I’m guessing that if he doesn’t make contact and he swings a lot, instead of hitting HR, he’s striking out.
One other interesting find. Pitches per plate appearance (PPA) is a rather interesting moderator of a well-known property of HR hitters. HR hitters strike out a lot. But among hitters who see more pitches per PA, HR rates climb much faster as strikeout rates go up than they do among hitters who don’t see a lot of pitches per PA. So, the ones who are better at extending the count are the ones who get more bang for the buck in terms of HR gained for each strikeout. But there’s another effect of PPA that should prove rather interesting. High PPA hitters, when they hit more flyballs, have a sharper increase in HR rate than hitters with a low PPA number. So a high PPA player, when he hits more flyballs, hits better quality flyballs.
A rather important note: everything here is cross-sectional. The reader may be thinking that since everything here is measured within the same year, there’s no way to prove causation. For example, is it that hitters who like to extend at bats out (and have high PPA numbers) then develop the patience and selectivity to pick out good HR pitches or is it that players who hit a lot of HR choose instead to wait on more pitches? If you’re thinking that, you’re ready for the next step: multi-latent developmental models.
For those who are hungry for more methodology, some major numerical nerdiness follows. If you’re happy just knowing a few more things about home runs and perhaps a little bit more about a few new wrinkles that should be in the greater Sabermetric methodology, you are now excused.