Batter Walks

In the last entry, I looked at a quick regression model for batter strikeouts based on contact percentage and swing percentage. This week, we’ll look at batter walk percentage based on swing percentage, zone percentage, and contact percentage.

Same rules as last time. All 2008 qualified batters, linear regression.

The results were pretty good, though not as accurate as last time. The r-squared of the equation is .7349, which is solid but lower than what the batter strikeout model produced. I’m sure part of the gap comes from intentional walks, which likely have little to do with a batter’s swing%, zone%, or contact rate, but that is a topic for another day. Let’s look at some of the results.
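For readers who want to try this kind of fit themselves, here is a minimal sketch of an ordinary least squares regression of BB% on swing%, zone%, and contact%, with the r-squared computed from the residuals. It is not the calculation behind the .7349 figure: it uses only the nine batters printed in this article’s tables rather than all 2008 qualified batters, so its coefficients and r-squared will differ. The function names are made up for illustration, and the solver is plain Python so the example runs without extra libraries.

```python
# The nine batters shown in this article's tables (NOT the full 2008 sample).
# Fields: name, Swing%, Zone%, Contact%, actual BB%
rows = [
    ("Albert Pujols",    0.415, 0.471, 0.901, 0.166),
    ("Pat Burrell",      0.420, 0.497, 0.813, 0.160),
    ("BJ Upton",         0.404, 0.511, 0.805, 0.154),
    ("Garrett Anderson", 0.487, 0.484, 0.828, 0.049),
    ("Aubrey Huff",      0.433, 0.483, 0.848, 0.081),
    ("Jeremy Hermida",   0.435, 0.492, 0.778, 0.087),
    ("Adam Jones",       0.535, 0.527, 0.769, 0.046),
    ("Brian McCann",     0.464, 0.481, 0.855, 0.101),
    ("Josh Hamilton",    0.555, 0.453, 0.741, 0.093),
]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_walk_model(rows):
    """Fit BB% = b0 + b1*Swing% + b2*Zone% + b3*Contact% by least squares."""
    design = [[1.0, swing, zone, contact] for _, swing, zone, contact, _ in rows]
    y = [bb for *_, bb in rows]
    k = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    coefs = solve(xtx, xty)
    predicted = [sum(b * v for b, v in zip(coefs, r)) for r in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, predicted, 1.0 - ss_res / ss_tot


coefs, predicted, r_squared = fit_walk_model(rows)
print("intercept, swing, zone, contact:", [round(c, 4) for c in coefs])
print("r-squared on these nine rows:", round(r_squared, 4))
```

With only nine hand-picked rows and four fitted parameters, the r-squared from this sketch says little about the real model; the point is only to show the mechanics of the fit.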

More Than Expected

Player            Swing%  Zone%  Contact%  Actual BB%  Predicted BB%  Difference
Albert Pujols     0.415   0.471  0.901     0.166       0.125          0.041
Pat Burrell       0.420   0.497  0.813     0.160       0.121          0.039
BJ Upton          0.404   0.511  0.805     0.154       0.118          0.036
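The Difference column is simply actual BB% minus predicted BB%, so a positive number means the batter walked more often than the model expected. A quick sketch that reproduces the column for the three batters above (the helper name is mine, not part of the original analysis):

```python
# Actual and predicted BB% for the three batters in the table above.
more_than_expected = [
    ("Albert Pujols", 0.166, 0.125),
    ("Pat Burrell",   0.160, 0.121),
    ("BJ Upton",      0.154, 0.118),
]


def walk_difference(actual_bb, predicted_bb, digits=3):
    """Return actual minus predicted BB%, rounded to match the table."""
    return round(actual_bb - predicted_bb, digits)


for name, actual, predicted in more_than_expected:
    # Positive values mean more walks than the model predicted.
    print(f"{name}: {walk_difference(actual, predicted):+.3f}")
```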


I did not label this group “lucky,” for a couple of reasons. First, intentional walks are undoubtedly part of the error, so “luck,” a term the statistics community leans on heavily, doesn’t seem appropriate. Second, while an r-squared over .7 is certainly good, the model leaves a number of other factors unanalyzed.


Fewer Than Expected

Player            Swing%  Zone%  Contact%  Actual BB%  Predicted BB%  Difference
Garrett Anderson  0.487   0.484  0.828     0.049       0.092          -0.043
Aubrey Huff       0.433   0.483  0.848     0.081       0.116          -0.035
Jeremy Hermida    0.435   0.492  0.778     0.087       0.121          -0.034


Close to Expected

Player            Swing%  Zone%  Contact%  Actual BB%  Predicted BB%  Difference
Adam Jones        0.535   0.527  0.769     0.046       0.0464         -0.0004
Brian McCann      0.464   0.481  0.855     0.101       0.10127        -0.00027
Josh Hamilton     0.555   0.453  0.741     0.093       0.093057       -0.000057


While there is still some work to be done, most importantly accounting for intentional walks, the model is fairly accurate for a basic linear model. Next time, we will see how accurate such a regression formula is for pitchers.
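One way the intentional-walk adjustment might look is sketched below. This is an assumption on my part, not something the model above does: it treats BB% as walks divided by plate appearances, subtracts intentional walks from the walk total, and optionally drops them from the plate appearance count as well. The function name and the example numbers are hypothetical, not any real batter’s line.

```python
def unintentional_walk_rate(walks, intentional_walks, plate_appearances,
                            drop_ibb_from_pa=True):
    """Walk rate with intentional walks removed.

    Assumes BB% = BB / PA. If drop_ibb_from_pa is True, intentional walks
    are also removed from the denominator, since the batter never had a
    chance to swing in those plate appearances.
    """
    if intentional_walks > walks:
        raise ValueError("intentional walks cannot exceed total walks")
    unintentional = walks - intentional_walks
    denominator = plate_appearances
    if drop_ibb_from_pa:
        denominator -= intentional_walks
    if denominator <= 0:
        raise ValueError("no plate appearances left after adjustment")
    return unintentional / denominator


# Hypothetical batter: 100 walks, 20 of them intentional, 620 plate appearances.
raw_rate = 100 / 620
adjusted_rate = unintentional_walk_rate(100, 20, 620)
print(f"raw BB%: {raw_rate:.3f}, IBB-adjusted BB%: {adjusted_rate:.3f}")
```

Whether intentional walks should also come out of the denominator is a judgment call; the flag is there so either version can be fed back into the regression.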


Thanks to for their contributions to this article.

Regression calculations performed by:

Wessa, P. (2009), Free Statistics Software, Office for Research Development and Education,
version 1.1.23-r4, URL

Mike Silver recently completed his requirements for the Sport Management Major at THE University of Massachusetts-Amherst, where he is a brother of Theta Chapter of Theta Chi Fraternity, the best house in the country. He is a huge Red Sox and Bruins fan, and longs for the days of the REAL Boston Garden, Cam Neely, and the ultimate Dirt Dog Trot Nixon. If you have any questions, you can reach him at Have a good night readers, and know that Mike hopes to hear from you soon. If you quote Mike in an article, please let him know. He’d love to hear it.


3 Responses to Batter Walks

  1. Mike Rogers says:

    Why not just take IBB’s out of the BB totals, then?

  2. Mike Silver says:

    Nice. I was wondering who would be the first to ask that question. I’m glad it’s a BtB writer who caught it (nice posts, by the way).
    A couple reasons. The first is that I didn’t realize the variable until I was in the process of writing the article.
    The second is that I’m too impatient not to have printed it anyway, with the IBB in the calculations. The next article will probably remove the IBBs from the equation.

  3. Mike Rogers says:

    Thanks, Mike. I enjoy Stat Speak, so getting a compliment from a writer here is awesome.
    And, gotcha. I was thinking there was some angle I was missing where having them in there was important. I look forward to reading that.
