September 29, 2009
Sorry about the last entry, everyone. I made a mistake in my query, which skewed the results.
Anyway, here are the corrected results for fastball velocity, focusing on swing-and-miss percentage and foul balls.
The sample includes every fastball thrown at 85 mph or faster that was swung at during the 2008 season, broken down by velocity. This yielded a sample of 38,108 events. There were a number of interesting trends, particularly in the correlation coefficients relating fastball velocity to swing-and-miss percentage and to foul ball percentage.
To me, the foul ball percentage was the most interesting, but I’ll let you decide.
Below is a description of the data. Velocity is in the first column, followed by the percentage of all swings at that velocity that ended in each outcome: swing-and-miss percentage, foul percentage, foul tip percentage, and in-play percentage. The last column is the total number of events at each velocity.
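For anyone who wants to build this kind of table themselves, here is a minimal sketch of the bookkeeping. The record layout, the outcome labels, and the choice to bucket by velocity rounded to the nearest mph are all assumptions for illustration, and the handful of rows are invented, not pulled from the 2008 sample.

```python
from collections import Counter, defaultdict

# Outcome labels are hypothetical names, not taken from any real data feed.
OUTCOMES = ("swing_miss", "foul", "foul_tip", "in_play")

# Hypothetical swing records: (velocity in mph, outcome of the swing).
# These rows are invented for illustration; they are NOT the 2008 sample.
swings = [
    (91.4, "in_play"), (91.2, "foul"), (90.8, "swing_miss"), (91.0, "in_play"),
    (95.4, "foul"), (95.1, "swing_miss"), (94.9, "foul_tip"), (95.3, "in_play"),
]


def rates_by_velocity(events):
    """Return {mph: {outcome: percent of swings, ..., "total": count}}."""
    counts = defaultdict(Counter)
    for velocity, outcome in events:
        # Assumption: bucket each pitch by velocity rounded to the nearest mph.
        counts[round(velocity)][outcome] += 1

    table = {}
    for mph in sorted(counts):
        total = sum(counts[mph].values())
        # Each outcome as a percentage of all swings at this velocity.
        row = {name: 100.0 * counts[mph][name] / total for name in OUTCOMES}
        row["total"] = total
        table[mph] = row
    return table


table = rates_by_velocity(swings)
for mph, row in table.items():
    print(mph, row)
```

Each row divides by the number of swings at that velocity, so the four percentages in a row sum to 100 as long as every swing carries one of the four outcome labels.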
The correlation coefficient between velocity and swing-and-miss percentage was 0.89, so there is a strong linear relationship between the velocity thrown and the percentage of swings and misses at the pitch. One thing to notice, however, is that this graph is not completely linear. As the velocity gets above 95, especially with the point at 98 (which, granted, has a small sample size), the graph becomes non-linear, with what looks like an exponential relationship. In other words, at the top of the range each additional mile per hour makes contact increasingly harder.
This curvature also lowers the value of the correlation coefficient, even though the graph has a clear upward trend. Remember, a correlation coefficient is a measure of linear relation. When the relationship is exponential, the coefficient understates how tightly the two variables move together.
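The point about correlation being a linear measure can be checked with a quick sketch. The two curves below are synthetic, with made-up slopes and growth rates rather than the actual swing-and-miss numbers, and `pearson_r` is just a hand-rolled helper for the standard Pearson formula.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


velocities = list(range(85, 99))  # 85 through 98 mph

# Synthetic curves (invented parameters, not the real data):
linear = [10 + 0.5 * (v - 85) for v in velocities]
exponential = [10 * math.exp(0.15 * (v - 85)) for v in velocities]

r_linear = pearson_r(velocities, linear)
r_exponential = pearson_r(velocities, exponential)
# Taking logs straightens an exponential curve back into a line.
r_log = pearson_r(velocities, [math.log(y) for y in exponential])

print(round(r_linear, 3), round(r_exponential, 3), round(r_log, 3))
```

The exponential curve rises at every step, yet its coefficient comes out below that of the straight line. Correlating velocity against the log of the rate recovers a coefficient of essentially 1, which is one way to test whether a curve really is exponential.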
Still, no surprises, as this was expected.
Part II: Foul Ball Rates
For the rate of foul balls per swing, there is a clear upward, linear trend until 95 mph, where the graph falls at a pretty steep rate.
Foul balls are one of the last unexplored realms of baseball statistical analysis. Hopefully Hit F/X will be able to give us some useful data, but until then, I’ll be waiting. Also, why do we only measure foul balls when they are caught by a fielder? Otherwise, they wouldn’t even be counted as a ball in play. There’s a lot we can learn about the batter-pitcher interaction by foul balls, but there is very little information out there. It would be a great leap forward if there were some good studies on foul ball data.
But, back to the graph. Over the full range there isn’t a strong linear trend because of the graph’s parabolic shape. However, the correlation coefficient between 85 and 95 mph is 0.97949, which is an incredibly strong correlation.
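To see how much the range matters, here is a small sketch that computes the coefficient over the whole range and then only up to 95 mph. The foul rates are invented to mimic a rise-then-fall shape and are not the real figures. The helper name and the cutoff variable are hypothetical too.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


velocities = list(range(85, 99))  # 85 through 98 mph

# Invented foul rates (percent of swings): a steady climb through 95 mph,
# then a steep fall. These are NOT the values from the study.
foul_rates = [30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 36, 32, 27]

CUTOFF_MPH = 95  # upper bound of the restricted range

# Keep only the (velocity, rate) pairs at or below the cutoff.
restricted = [(v, f) for v, f in zip(velocities, foul_rates) if v <= CUTOFF_MPH]
restricted_v = [v for v, _ in restricted]
restricted_f = [f for _, f in restricted]

r_full = pearson_r(velocities, foul_rates)
r_restricted = pearson_r(restricted_v, restricted_f)

print(round(r_full, 3), round(r_restricted, 3))
```

With these made-up numbers the full-range coefficient is weak, while the restricted one is essentially perfect. That is the same pattern as in the real data, where the drop after 95 mph drags down an otherwise very tight linear trend.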
This is a very important point when analyzing the success of soft-tossing pitchers. For pitchers who throw at low velocities, it is important to note that by getting fewer fouls, they are essentially giving away free strikes. These batted balls become balls in play, while for pitchers at higher velocities, the batter now has one additional strike on them, with a great chance for a strikeout. Besides the low swing and miss totals, these low-velocity pitchers have fewer strikes in their favor.
I’m not totally sure why there is a sharp downward trend in the data after 95 mph, though I do have a hypothesis. The idea is to think of the graph not in terms of foul or non-foul, but in terms of being late on a pitch. While some of these fouls are going to be pulled, the fact that the foul rate tracks velocity suggests that the fouls affected by velocity are the ones the batter is late on. As the velocity goes up, the batter will be late on the pitch to a greater degree. Once the velocity gets beyond 95 mph, batters are no longer late and fouling off the pitch; they are late enough to swing and miss. This probably has something to do with the exponential increase in swings and misses at high velocities.
This may not change the end result of the at-bat too much, as a strike is still a strike whether it’s a whiff or a foul. Higher velocities will, however, produce more two-strike swings and misses (for a K), while lower velocities produce longer two-strike at-bats, because the foul keeps the at-bat alive. The lower velocities will probably have more foul-outs as a result.
Part III: Ball In Play Rate
This last graph shows the rate of balls in play per swing at each velocity. Again, the data is about where we’d expect it, as it’s harder to put a ball in play at a higher velocity. This speaks volumes as to why low-velocity pitchers struggle in the majors: if the batters can put your stuff in play more often, there are more chances for hits, and fewer for free outs (strikeouts). This one follows common logic: the faster the velocity, the fewer balls in play per swing.
The graph follows a very consistent linear trend from 85 to 97 mph, then drops suddenly at 98+ mph. It is difficult to say why there is a sudden drop, as it could be due to small sample size or due to the fact that they’re just so hard to make contact with at those speeds. It may very well be a mix of both, though the fact that even 97 mph is within the linear trend makes me believe there is a significant sample size component to this issue.
From 85 mph to 97 mph, the correlation coefficient is -0.977, which is another very, very strong correlation. The fact that there is a correlation is not surprising, though the strength of the correlation is quite shocking. I didn’t expect there to be such a substantial correlation.
This study brings out some very interesting trends, as these correlations are very strong. The relationship between velocity and foul balls (which is probably causal, with velocity driving foul ball percentage for the reason explained) is particularly interesting, especially because the issue is rarely discussed. I think this could give us some insight into the relationship between velocity and pop-up rate, as pop-ups are generally thought to be the result of being late on a pitch, particularly on inside pitches, where it’s hard to get the bat head to the ball on time.
In the end, the data seem to back up the reasons why it is so hard to succeed in MLB without fastball velocity: low velocity means fewer Ks and more balls in play. I’ll do more research on this, and I hope to post more next time.
Thanks to TheHardballTimes.com for their contributions to this article.
Mike Silver recently completed his requirements for the Sport Management Major at THE University of Massachusetts-Amherst, where he is a brother of Theta Chapter of Theta Chi Fraternity, the best house in the country. He is a huge Red Sox and Bruins fan, and longs for the days of the REAL Boston Garden, Cam Neely, and the ultimate Dirt Dog Trot Nixon. Aside from StatSpeak, you can find Mike at TheHardballTimes.com and FireBrandAL.com. If you have any questions, you can reach him at firstname.lastname@example.org. Have a good night readers, and know that Mike hopes to hear from you soon. If you quote Mike in an article, please let him know. He’d love to hear it.