The Current Criteria For Defining Batted Balls
October 11, 2009
With all the emphasis placed on BABIP in the statistical forums, we really could use a better method of classifying batted balls than line drives, ground balls, fly balls, and pop-ups. I guess that’s why Hit F/X is about to take the stat world by storm. For now, we have to deal with what we have.
To get a good sense of what we are dealing with, we should see how well these batted ball descriptions correlate with BABIP. I therefore took a sample of all qualified 2008 starting pitchers and fit a linear regression of each pitcher's BABIP on his batted ball rates. The results were not particularly encouraging.
Here’s the equation:
Pitcher BABIP = 1.90 – 1.11 LD% – 1.67 FB% – 1.75 GB% – 0.144 IFFB%
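To make the equation concrete, here is a minimal Python sketch that plugs a pitcher's batted ball rates into it. It assumes the rates are entered as decimals (0.20 for 20%), which the post does not state outright but which the size of the coefficients suggests. The function name and the sample pitcher are hypothetical, invented purely for illustration.

```python
def predict_babip(ld, fb, gb, iffb):
    """Apply the fitted equation from the post.

    Assumption: every rate is a decimal fraction (0.20 means 20%).
    """
    return 1.90 - 1.11 * ld - 1.67 * fb - 1.75 * gb - 0.144 * iffb


# Hypothetical pitcher line, not taken from the 2008 sample.
example = predict_babip(ld=0.20, fb=0.35, gb=0.45, iffb=0.10)
print(round(example, 3))  # 0.292
```

A line like this lands at roughly .292, which is at least in the range we expect for BABIP, so the equation behaves sensibly for typical inputs.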
The R-squared of this equation was 0.352, meaning the four batted ball rates together explain only about 35% of the variation in pitcher BABIP. Unfortunately, that is a moderate-to-weak fit. In other studies, such as relating curveball break to curveball success, or count to BABIP, we might be happy with this result. However, given the importance placed on batted ball data, especially when analyzing pitchers, it suggests that the current classifications are inadequate.
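For readers who want to see what that 0.352 measures, the sketch below computes R-squared from observed and predicted values, and then converts the reported figure into the implied correlation between predicted and actual BABIP. This relies on the standard property that, for a least-squares fit with an intercept, R-squared equals the squared correlation between fitted and observed values. The function name is my own, and no data from the study is reproduced here.

```python
import math


def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by `predicted`."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Correlation between predicted and actual BABIP implied by R-squared = 0.352.
implied_r = math.sqrt(0.352)
print(round(implied_r, 2))  # 0.59
```

A correlation near 0.59 is what "moderate" looks like here: the predictions move with actual BABIP, but most of the pitcher-to-pitcher variation is left unexplained.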
Another important factor to remember is defense. Every defense influences the pitchers who throw in front of it. Therefore, we should rerun the regression with a term for defense, here the team's BABIP, to see whether the fit improves.
Here is the regression equation:
Pitcher BABIP = 1.06 + 0.616 Team BABIP – 0.51 LD% – 1.01 FB% – 1.09 GB% – 0.102 IFFB%
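As a rough illustration of how much the defense term matters, this Python sketch applies the defense-adjusted equation to the same batted ball profile behind two different defenses. As before, it assumes all rates and Team BABIP are decimals, and the function name, the profile, and the two Team BABIP values are hypothetical.

```python
def predict_babip_with_defense(team_babip, ld, fb, gb, iffb):
    """Apply the defense-adjusted equation from the post.

    Assumption: team_babip and all rates are decimals (0.300, 0.20, ...).
    """
    return (1.06 + 0.616 * team_babip
            - 0.51 * ld - 1.01 * fb - 1.09 * gb - 0.102 * iffb)


# Hypothetical batted ball profile, identical for both cases.
rates = dict(ld=0.20, fb=0.35, gb=0.45, iffb=0.10)

good_defense = predict_babip_with_defense(0.290, **rates)
poor_defense = predict_babip_with_defense(0.310, **rates)
print(round(poor_defense - good_defense, 4))  # 0.0123
```

In other words, a 20-point swing in Team BABIP moves the predicted pitcher BABIP by about 12 points, holding the batted ball mix constant.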
Again, the fit is only moderate: even with defense factored into the equation, the R-squared rose only marginally.
As we are on the eve of the availability of Hit F/X data, hopefully these points will soon become moot. Until then, be sure to take pitchers' batted ball tendencies with a grain of salt when making inferences about BABIP.