# Breaking Down a Pitcher's Stuff: Fastball Velocity

September 29, 2009 3 Comments

Sorry about the last entry, everyone. I made a mistake in my query, which skewed the results.

Anyway, here are the actual results for fastball velocity, particularly swing and miss percentage and foul balls.

The sample includes every fastball of 85 mph or faster that was swung at during the 2008 season, grouped by velocity. This yielded a sample of 38,108 events. There were a number of interesting trends, particularly in the correlation between fastball velocity and both swing-and-miss percentage and foul ball percentage.

To me, the foul ball percentage was the most interesting, but I’ll let you decide.

Below is a description of the data, with velocity in the first column, followed by the share of all swings at each velocity that ended in each outcome: Swing&Miss Percentage, Foul Percentage, Foul Tip Percentage, and In Play Percentage, followed lastly by the total number of events at each velocity.
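For anyone who wants to build a table like this from raw pitch data, here is a minimal sketch. It assumes each swung-at fastball arrives as a (velocity, outcome) pair, that velocities are rounded to the nearest mph, and that the outcomes carry the labels below. All of those are assumptions for illustration, not the actual query behind this post, and the sample events are made up.

```python
from collections import Counter, defaultdict

# Hypothetical outcome labels; real pitch data may name these differently.
OUTCOMES = ("swing_miss", "foul", "foul_tip", "in_play")

def rates_by_velocity(events, min_mph=85):
    """Return {velocity: {outcome: percent of swings, ..., "events": count}}.

    events: iterable of (velocity_mph, outcome) pairs for swung-at fastballs.
    """
    counts = defaultdict(Counter)
    for mph, outcome in events:
        bucket = round(mph)  # assumption: bucket by nearest whole mph
        if bucket >= min_mph:
            counts[bucket][outcome] += 1

    table = {}
    for bucket in sorted(counts):
        total = sum(counts[bucket].values())
        row = {o: 100.0 * counts[bucket][o] / total for o in OUTCOMES}
        row["events"] = total
        table[bucket] = row
    return table

# Made-up events, only to show the shape of the result.
sample = [
    (90.2, "foul"),
    (90.4, "in_play"),
    (89.8, "swing_miss"),
    (90.1, "foul"),
    (84.0, "foul"),  # below 85 mph, so it is dropped
]
table = rates_by_velocity(sample)
print(table)
```

Each row's four percentages sum to 100, since every swing ends in exactly one of the four outcomes.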

Part 1: Velocity versus Swing and Miss Percentage

This one was no big surprise. In essence, the higher your velocity is, the more swings and misses you get.

The correlation coefficient for this data set was 0.89. Therefore, there is a strong linear relationship between the velocity thrown and the percentage of swings and misses at the pitch. One thing to notice, however, is that this graph is not completely linear. As the velocity gets above 95, especially with the point at 98 (which, granted, has a small sample size), the graph becomes non-linear, with what looks like an exponential relationship. In other words, each additional mile per hour makes it increasingly harder to make contact with the pitch.

As a result, the curve also lowers the value of the correlation coefficient, even though the graph has a clear upward trend. Remember, a correlation coefficient is a measure of *linear* relation. Therefore, when the graph is exponential, the measured linear relation will be weaker.
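To see why a curved but steadily rising relationship scores below 1, here is a small sketch that computes the Pearson correlation coefficient by hand for two made-up series over the same velocities: one perfectly linear and one exponential. The numbers are synthetic and chosen only to show the effect; they are not the data from this study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

velocities = list(range(85, 99))  # 85 through 98 mph

# Synthetic series: both rise with every mph, but only one is a straight line.
linear = [2.0 * v - 150.0 for v in velocities]
curved = [math.exp(0.3 * (v - 85)) for v in velocities]

r_linear = pearson_r(velocities, linear)
r_curved = pearson_r(velocities, curved)
print(f"linear series: r = {r_linear:.3f}")
print(f"curved series: r = {r_curved:.3f}")
```

Both series increase at every step, yet only the straight line reaches a coefficient of 1. The curved one falls short because a single line cannot fit both its flat start and its steep finish.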

Still, no surprises, as this was expected.

Part 2: Foul Ball Rates

This graph was particularly surprising. Maybe it’s because I have never really given it much thought, but I didn’t think I would find such an interesting trend. Here’s the graph:

For the rate of foul balls per swing, there is a clear upward, linear trend until 95 mph, where the graph falls at a pretty steep rate.

Foul balls are one of the last unexplored realms of baseball statistical analysis. Hopefully Hit F/X will be able to give us some useful data, but until then, I’ll be waiting. Also, why do we only measure foul balls when they are caught by a fielder? The rest aren’t even counted as a ball in play. There’s a lot we can learn about the batter-pitcher interaction from foul balls, but there is very little information out there. It would be a great leap forward if there were some good studies on foul ball data.

But, back to the graph. Over the full range, there isn’t a strong linear trend because of the graph’s parabolic shape. However, the correlation coefficient between 85 and 95 mph is 0.97949, which is an incredibly strong correlation.
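Restricting the correlation to one part of the range, as done here for 85 to 95 mph, can be sketched as follows. The foul rates are synthetic, built to rise in a straight line until 95 mph and then fall, so the coefficients they produce are illustrative only and are not the study's figures.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def correlation_in_range(xs, ys, low, high):
    """Pearson r using only the points whose x lies in [low, high]."""
    kept = [(x, y) for x, y in zip(xs, ys) if low <= x <= high]
    return pearson_r([x for x, _ in kept], [y for _, y in kept])

velocities = list(range(85, 99))  # 85 through 98 mph

# Synthetic foul rates: steady climb to 95 mph, then a steep drop.
foul_rate = [
    40.0 + 0.8 * (v - 85) if v <= 95 else 48.0 - 3.0 * (v - 95)
    for v in velocities
]

r_full = pearson_r(velocities, foul_rate)
r_85_95 = correlation_in_range(velocities, foul_rate, 85, 95)
print(f"85-98 mph: r = {r_full:.3f}")
print(f"85-95 mph: r = {r_85_95:.3f}")
```

The restricted coefficient only describes the rising segment. It says nothing about what happens past 95 mph, which is why the two numbers can differ so much.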

This is a very important point when analyzing the success of soft-tossing pitchers. For pitchers who throw at low velocities, it is important to note that by getting fewer fouls, they are essentially giving away free strikes. These batted balls become balls in play, while for pitchers at higher velocities, the batter now has one additional strike on them, with a great chance for a strikeout. Besides the low swing and miss totals, these low-velocity pitchers have fewer strikes in their favor.

As to why there is a sharp downward trend in the data after 95 mph, I’m not totally sure, though I do have a hypothesis. Think of the graph not in terms of foul or non-foul, but in terms of being late on a pitch. While some of these fouls are going to be pulled, the fact that the foul rate tracks velocity suggests that the fouls affected by velocity are the ones the batter is late on. Therefore, as the velocity goes up, the batter will be late on the pitch to a greater degree. As a result, when the pitch gets beyond 95 mph, batters are no longer late and fouling off the pitch; they are late enough to swing and miss. This probably has something to do with the exponential increase in swings and misses at high velocities.

This may not change the end result of the at-bat too much, as a strike is still a strike whether it’s a whiff or a foul. Still, higher velocities will have more two-strike swings and misses (for a K), while lower velocities will have longer two-strike at-bats, because a foul with two strikes keeps the at-bat alive. The lower velocities will probably have more foul-outs as a result, however.

Part 3: Ball In Play Rate

This last graph shows the rate of balls in play per swing at each velocity. Again, the data is about where we’d expect it, as it’s harder to put a ball in play at a higher velocity. This speaks volumes as to why low-velocity pitchers struggle in the majors: if the batters can put your stuff in play more often, there are more chances for hits and fewer free outs (strikeouts). This one follows common logic: the faster the velocity, the fewer balls in play per swing.

The graph follows a very consistent linear trend from 85 to 97 mph, then drops suddenly at 98+ mph. It is difficult to say why there is a sudden drop, as it could be due to small sample size or due to the fact that they’re just so hard to make contact with at those speeds. It may very well be a mix of both, though the fact that even 97 mph is within the linear trend makes me believe there is a significant sample size component to this issue.
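One rough way to gauge how much a small sample could be moving a point like this is the standard error of a proportion, sqrt(p(1-p)/n). The sketch below uses made-up rates and swing counts, since the per-velocity counts are not reproduced here, so it only illustrates how quickly the uncertainty grows as the count shrinks.

```python
import math

def proportion_standard_error(p, n):
    """Standard error of an observed proportion p measured over n swings."""
    return math.sqrt(p * (1.0 - p) / n)

# Hypothetical values: the same in-play rate seen on very different counts.
in_play_rate = 0.40
se_small = proportion_standard_error(in_play_rate, 100)     # thin bucket
se_large = proportion_standard_error(in_play_rate, 10_000)  # busy bucket

print(f"n = 100:    +/- {100 * se_small:.1f} percentage points")
print(f"n = 10,000: +/- {100 * se_large:.1f} percentage points")
```

With 100 times fewer swings, the standard error is 10 times larger, so a thinly sampled velocity can sit well off the trend line by chance alone.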

From 85 mph to 97 mph, the correlation coefficient is -0.977, which is another very, very strong correlation. The fact that there is a correlation is not surprising, though the strength of the correlation is quite shocking. I didn’t expect there to be such a substantial correlation.

This study brings out some very interesting trends, as these correlations are very strong. In particular, the relationship between velocity and foul balls (which is probably causal, with velocity driving foul ball percentage for the reason explained) is especially interesting because the issue is rarely discussed. I think this could give us some insight into the relationship between velocity and pop-up rate, as pop-ups are generally thought to be the result of being late on a pitch, particularly on inside pitches, where it’s hard to get the bat head to the ball on time.

In the end, the data seem to back up the reasons why it is so hard to succeed in the MLB without fastball velocity: low velocity means fewer Ks and more balls in play. I’ll do more research on this, and I hope to post more next time.

*Thanks to TheHardballTimes.com for their contributions to this article.*

Mike Silver recently completed his requirements for the Sport Management Major at THE University of Massachusetts-Amherst, where he is a brother of Theta Chapter of Theta Chi Fraternity, the best house in the country. He is a huge Red Sox and Bruins fan, and longs for the days of the REAL Boston Garden, Cam Neely, and the ultimate Dirt Dog Trot Nixon. Aside from StatSpeak, you can find Mike at TheHardballTimes.com and FireBrandAL.com. If you have any questions, you can reach him at mjasilver@gmail.com. Have a good night readers, and know that Mike hopes to hear from you soon. If you quote Mike in an article, please let him know. He’d love to hear it.

You make a great point about foul ball data.

Why aren’t they kept track of better? And I don’t mean advanced data like landing location, angle of deflection, speed off the bat, etc. I’m just talking straight counting. We’ve got to start somewhere.

Does retrosheet or baseball-reference.com have this data hidden somewhere? I know we have pitches / pitcher; pitches / game; pitches / at bat; but the data needs to be a bit more raw so it can be recombined for a different purpose — tracking foul ball stars.

I’ve always wondered which hitters (and pitchers) produce the most foul balls. In particular, it would be interesting to find out if there’s actual skill involved. Intuitively, I believe some hitters “hang tough” better than others, especially with 2 strikes. For example, David Eckstein would have to be near the league leaders in ABs that have more than 3 foul balls after 2 strikes, just based on personal observation.

Some have suggested pitches per at bat, but that’s really not granular enough and includes balls. I’d also like to see more than just pure foul ball data, as that includes 0 and 1 strike situations as well as bunts and foul tips.

There could even be a foul ball stat for AB’s similar to a “Dominating” start used by some analysts (such as Ron Shandler and his team). Again, I would think Eckstein would be a leader in that category each season.

Nice article and thanks for providing a forum for me to get this off my chest.

The difference in points in the Velocity/Whiff rate graph at 95 and 98 looks to be the same distance between 95 and 98 on the foul vs. velocity graph. So I’d guesstimate that your conclusion about batters going from fouling them off to just missing them altogether is most likely correct.

Stuff on foul balls:

http://statspeak.net/2008/04/the-foul-ball-part-one-what-does-it-tell-us-about-a-batter.html

http://statspeak.net/2008/04/the-foul-ball-part-two-what-does-it-tell-us-about-a-pitcher.html

http://statspeak.net/2008/04/the-foul-ball-part-three-what-does-it-tell-us-about-an-at-bat.html

Foul balls are included in Retrosheet data from about 1993-present.