## Batter Walks

August 29, 2009

Last entry, I looked at a quick regression model for batter strikeouts based on contact percentage and swing percentage. This week, we’ll look at batter walk percentage based on swing percentage, zone percentage, and contact percentage.

Same rules as last time. All 2008 qualified batters, linear regression.
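For anyone who wants to try this kind of fit at home, here is a minimal sketch of a three-predictor least-squares regression in plain Python. It is not the Wessa software used for the numbers in this post. The function names (`solve_linear_system`, `fit_walk_model`) are made up for illustration, and the input layout (one `(swing, zone, contact)` tuple per batter) is an assumption.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; the fit is not unique")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_walk_model(
    predictors: Sequence[Sequence[float]], walk_rates: Sequence[float]
) -> Tuple[List[float], float]:
    """Ordinary least squares fit of BB% on (swing%, zone%, contact%).

    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    Assumes the walk rates are not all identical.
    """
    design = [[1.0, *row] for row in predictors]  # prepend intercept column
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, walk_rates)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in design]
    mean_y = sum(walk_rates) / len(walk_rates)
    ss_res = sum((y - f) ** 2 for y, f in zip(walk_rates, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in walk_rates)
    return coefs, 1.0 - ss_res / ss_tot
```

Calling `fit_walk_model(rows, bb_rates)` on the full set of qualified batters would return the intercept, the three slopes, and the r-squared. In practice a statistics package does the same job with far less code; the long-hand version is only here to show what the fit is doing.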

The results were pretty good, though not as strong as last time. The r-squared of the equation is .7349, which is solid but falls short of the fit for batter strikeouts. I’m sure part of the gap comes from intentional walks, which likely have little to do with swing%, zone%, or contact rate, but that is a topic for another day. Let’s look at some of the results.
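For reference, the model and the r-squared can be written out as follows. This assumes the usual ordinary least squares setup; the fitted coefficients are not listed in this post, so the betas are placeholders rather than actual values.

```latex
\widehat{\mathrm{BB\%}}_i = \beta_0 + \beta_1\,\mathrm{Swing\%}_i + \beta_2\,\mathrm{Zone\%}_i + \beta_3\,\mathrm{Contact\%}_i

R^2 = 1 - \frac{\sum_i \left(\mathrm{BB\%}_i - \widehat{\mathrm{BB\%}}_i\right)^2}{\sum_i \left(\mathrm{BB\%}_i - \overline{\mathrm{BB\%}}\right)^2}
```

Read that way, an r-squared of .7349 says the three inputs account for roughly 73% of the variation in walk rate across the sample, with the rest left in the residuals.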

More than Expected

| Player | Swing% | Zone% | Contact% | Actual BB% | Pred BB% | Difference |
| --- | --- | --- | --- | --- | --- | --- |
| Albert Pujols | 0.415 | 0.471 | 0.901 | 0.166 | 0.125 | 0.041 |
| Pat Burrell | 0.42 | 0.497 | 0.813 | 0.16 | 0.121 | 0.039 |
| BJ Upton | 0.404 | 0.511 | 0.805 | 0.154 | 0.118 | 0.036 |

I did not say “lucky” on this one for a couple of reasons. First, since intentional walks are undoubtedly part of the error, I don’t feel that “luck” is the appropriate term, given how much we use it in the statistics community. Second, while an r-squared over .7 is certainly a good one, it still leaves a number of other factors to be analyzed.

Fewer than Expected

| Player | Swing% | Zone% | Contact% | Actual BB% | Pred BB% | Difference |
| --- | --- | --- | --- | --- | --- | --- |
| Garrett Anderson | 0.487 | 0.484 | 0.828 | 0.049 | 0.092 | -0.043 |
| Aubrey Huff | 0.433 | 0.483 | 0.848 | 0.081 | 0.116 | -0.035 |
| Jeremy Hermida | 0.435 | 0.492 | 0.778 | 0.087 | 0.121 | -0.034 |

Close to Expected

| Player | Swing% | Zone% | Contact% | Actual BB% | Pred BB% | Difference |
| --- | --- | --- | --- | --- | --- | --- |
| Adam Jones | 0.535 | 0.527 | 0.769 | 0.046 | 0.0464 | -0.0004 |
| Brian McCann | 0.464 | 0.481 | 0.855 | 0.101 | 0.10127 | -0.00027 |
| Josh Hamilton | 0.555 | 0.453 | 0.741 | 0.093 | 0.093057 | -0.000057 |
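The Difference column in these tables is just actual BB% minus predicted BB%. As a quick check, the short sketch below recomputes it from the nine rows shown in this post and sorts the players from biggest surplus to biggest shortfall. The name `walk_residuals` is a made-up helper, not something from the original analysis.

```python
# (player, actual BB%, predicted BB%) copied from the three tables in this post
ROWS = [
    ("Albert Pujols", 0.166, 0.125),
    ("Pat Burrell", 0.16, 0.121),
    ("BJ Upton", 0.154, 0.118),
    ("Garrett Anderson", 0.049, 0.092),
    ("Aubrey Huff", 0.081, 0.116),
    ("Jeremy Hermida", 0.087, 0.121),
    ("Adam Jones", 0.046, 0.0464),
    ("Brian McCann", 0.101, 0.10127),
    ("Josh Hamilton", 0.093, 0.093057),
]


def walk_residuals(rows):
    """Return (player, actual - predicted) pairs, largest surplus first."""
    diffs = [(name, actual - predicted) for name, actual, predicted in rows]
    return sorted(diffs, key=lambda pair: pair[1], reverse=True)


for name, diff in walk_residuals(ROWS):
    # Positive means more walks than the model predicted.
    print(f"{name:<17}{diff:+.6f}")
```

Keep in mind these nine players were picked because they sit at the extremes or right on the line, so they say nothing about the typical miss across all qualified batters.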

While there is still some work to be done, most importantly on how to handle intentional walks, the model is fairly accurate for a basic linear model. Next time, we will see how accurate such a regression formula is for pitchers.

*Thanks to Fangraphs.com for their contributions to this article*.

Regression calculations performed by:

**Wessa, P. (2009), Free Statistics Software, Office for Research Development and Education, version 1.1.23-r4, URL http://www.wessa.net/**

Mike Silver recently completed his requirements for the Sport Management Major at THE University of Massachusetts-Amherst, where he is a brother of Theta Chapter of Theta Chi Fraternity, the best house in the country. He is a huge Red Sox and Bruins fan, and longs for the days of the REAL Boston Garden, Cam Neely, and the ultimate Dirt Dog Trot Nixon. If you have any questions, you can reach him at mjasilver@gmail.com. Have a good night, readers, and know that Mike hopes to hear from you soon. If you quote Mike in an article, please let him know. He’d love to hear it.