Back on December 20th, John Walsh wrote a very interesting article at The Hardball Times, taking everything recorded by the Pitch F/X system in 2007 and, among other things, calculating the average velocity, horizontal movement, and vertical movement for the four major pitches: fastball, curveball, slider, and changeup. The results showed that the average fastball clocked in at 91 mph with -6.2 inches of horizontal movement and 8.9 inches of vertical movement. The author acknowledged that he did not differentiate between four-seamers, two-seamers, and cutters, but rather lumped them all together in determining the averages, even though two-seamers and cutters differ from four-seamers in both velocity and movement.
While I plan on calculating the averages for all different sub-groupings of pitches at some point, what recently piqued my interest was finding the averages for different velocity groupings. As in, what is the average horizontal movement for all 94 mph fastballs? Or, the BABIP for 98 mph fastballs?
With that knowledge we could effectively compare certain pitchers to the means of their velocity grouping rather than overall averages of every grouping. Instead of comparing, say, Edwin Jackson’s 94 mph fastball to a group including those who throw slower, we can compare him to his “peers.”
I started at 92 mph and queried my database for one-mph groupings (92-92.99, 93-93.99, etc.) all the way up to 98+ mph. I figured 92 mph would be a solid starting point since the sample size would be extraordinarily large, large enough for four-seamers to outweigh the two-seamers and cutters that may inevitably sneak in. Anything 98 mph or higher was grouped together to ensure a large enough sample since, as you will see below, the higher the velocity, the smaller the sample:
Velocity | Sample | % of sample
92 mph | 41,157 | 31.4
93 mph | 33,368 | 25.5
94 mph | 24,315 | 18.6
95 mph | 16,586 | 12.7
96 mph | 9,245 | 7.1
97 mph | 4,236 | 3.2
98+ mph | 2,018 | 1.5
All of the sample sizes here were large enough for analysis. Even though the 98+ group is roughly 1/20th the size of the 92 mph group, that says more about how large the latter is than about any shortcoming of the former.
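For anyone who wants to reproduce the grouping on their own pitch data, here is a minimal sketch in Python. It is not the database query I actually ran; the function names `velocity_bucket` and `bucket_shares` are made up for illustration, and the speeds at the bottom are invented values, not Pitch F/X data.

```python
from collections import Counter

def velocity_bucket(mph, floor=92, cap=98):
    """Return the bucket label for a pitch speed, or None if below the floor."""
    if mph < floor:
        return None          # slower than the first group, so not counted
    if mph >= cap:
        return f"{cap}+ mph"  # everything at or above the cap is pooled
    return f"{int(mph)} mph"  # 92.00-92.99 -> "92 mph", and so on

def bucket_shares(speeds):
    """Count pitches per bucket and each bucket's share (%) of counted pitches."""
    counts = Counter(b for b in map(velocity_bucket, speeds) if b is not None)
    total = sum(counts.values())
    return {b: (n, round(100 * n / total, 1)) for b, n in counts.items()}

# Invented speeds, for illustration only
speeds = [91.5, 92.0, 92.99, 93.4, 94.1, 97.9, 98.0, 100.2]
shares = bucket_shares(speeds)
for label, (n, pct) in sorted(shares.items()):
    print(label, n, pct)
```

Note that the percentages are shares of the pitches at 92 mph or faster, not of all fastballs, which matches how the table above adds up to roughly 100.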
Next, how do the movement components look for each group?
Velocity | Horiz. (in.) | Vert. (in.)
92 mph | -6.34 | 9.24
93 mph | -6.28 | 9.51
94 mph | -6.16 | 9.80
95 mph | -5.98 | 10.07
96 mph | -5.84 | 10.23
97 mph | -5.89 | 10.41
98+ mph | -6.03 | 10.38
It should be fairly apparent that the tendency is for horizontal movement to shrink in magnitude and vertical movement to grow as the velocity increases, at least through 96 mph. At 97 mph, both movement components increase. At 98+ mph, the vertical movement is essentially flat (10.38 versus 10.41 inches) while the horizontal movement jumps back above six inches.
The next area to discuss includes the percentage of balls (B%), the percentage of strikes (K%), the home run percentage (HR%), and BABIP:
Velocity | B% | K% | HR% | BABIP
92 mph | 35.9 | 44.6 | 0.65 | .302
93 mph | 36.3 | 45.1 | 0.55 | .303
94 mph | 35.5 | 45.9 | 0.55 | .292
95 mph | 35.8 | 46.4 | 0.76 | .303
96 mph | 35.2 | 47.0 | 0.54 | .291
97 mph | 36.1 | 46.8 | 0.41 | .273
98+ mph | 33.9 | 49.3 | 0.69 | .293
The percentage of balls doesn’t move too much until it dips by over two percentage points at 98+ mph. The percentage of strikes, however, climbs fairly steadily, with only a slight pause at 97 mph. There is no real discernible pattern in the home run percentages; the most came on 95 mph heaters while the fewest came on those registering 97 mph.
Speaking of the 97 mph group, notice anything odd? Perhaps that their BABIP is .273, a full eighteen points below any other group? Prior to getting the results I expected each group to fall somewhere in the .290-.310 range; that all of them did except the .273 struck me as very peculiar.
I spoke to several other analysts, all of whom initially mentioned small sample size syndrome, only to retract the assessment after learning the sample sizes in question. The dropoff in home run percentage was tossed around, as well, since fewer home runs means more balls in play to be counted in the BABIP formula. This is a “could be,” though, rather than a “definitely why.” As was mentioned in these discussions, too, it could be nothing; perhaps there were more warning track flyballs that just missed leaving the yard as opposed to weaker hit balls.
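To see why the home run idea is at least plausible, here is a small sketch assuming the standard BABIP definition, (H - HR) / (AB - K - HR + SF). The function name `babip` and every number in the example are invented for illustration; none of them come from the 97 mph sample.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play

# Invented line: 100 AB, 20 K, 30 hits, 5 of them home runs
with_homers = babip(hits=30, home_runs=5, at_bats=100, strikeouts=20)

# Same line, but two of those home runs die on the warning track instead:
# they stop being hits and home runs, and become outs on balls in play.
warning_track = babip(hits=28, home_runs=3, at_bats=100, strikeouts=20)

print(round(with_homers, 3), round(warning_track, 3))
```

In this toy case the non-homer hits stay at 25 while the balls in play rise from 75 to 77, so BABIP falls even though nothing about the quality of contact changed. Whether that is what happened at 97 mph is still an open question.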
Now, while the 4,236 pitches at 97 mph constitute a large enough sample to analyze, the number of balls in play was not yet large enough to break down by individual counts or locations. When it does get big enough, this could serve as a means of explanation; perhaps something in either or both does not jibe with the other velocity groups. Of the splits with meaningful samples, however, there was a .263 BABIP on 0-0 counts, and a .286 BABIP on pitches in the middle of the strike zone.
Pizza Cutter, or “The Master of Statistical Reliability” as I like to call him (yeah, a nickname for a nickname), suggested that BABIP is one of those stats that is super-unreliable, even with my large sample of pitches. I did a split-half reliability test, randomly splitting the sample in half and calculating the BABIP of each half. For those unfamiliar, this serves to test the reliability of the sample; if it truly is large enough, then no matter how we cut the sample in half we will get fairly convergent results. If the results were wildly divergent, we would be dealing with an unreliable sample. The BABIPs of the two halves were .271 and .275, which essentially threw that idea out the window.
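The random split described above can be sketched as follows. This is an illustration under assumptions, not the code I ran: the function name `split_half_babip` is hypothetical, each ball in play is represented as 1 for a hit and 0 for an out with home runs already excluded, and the 1,000 outcomes in the example are synthetic.

```python
import random

def split_half_babip(in_play_outcomes, seed=None):
    """Randomly split ball-in-play outcomes in half; return each half's BABIP.

    in_play_outcomes: sequence of 1 (hit) or 0 (out), home runs excluded.
    """
    if len(in_play_outcomes) < 2:
        raise ValueError("need at least two balls in play to split")
    rng = random.Random(seed)          # seeded so a split can be repeated
    shuffled = list(in_play_outcomes)  # copy, so the caller's data is untouched
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    first, second = shuffled[:mid], shuffled[mid:]
    return sum(first) / len(first), sum(second) / len(second)

# Synthetic sample: 273 hits in 1,000 balls in play (not the real data)
outcomes = [1] * 273 + [0] * 727
half_a, half_b = split_half_babip(outcomes, seed=2007)
print(round(half_a, 3), round(half_b, 3))
```

A single split only shows one of many possible cuts. Repeating it with different seeds and looking at how far apart the two halves tend to land would probably give a better feel for how stable the number is.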
Something interesting to consider is how, in each of these tables, the patterns seem to break once they reach 97 mph or higher. Horizontal movement grew instead of continuing to shrink; vertical movement dipped slightly after its increase at 97; the percentage of strikes paused its climb; and home runs reached their low. Could be something, could be nothing, but interesting nonetheless.
For now I am going to chalk this BABIP drop up to extreme random statistical variation and hope that you loyal readers out there might chime in with some more ideas to investigate. Otherwise, though, when gauging the movement components, percentage of balls/strikes/home runs, or even BABIP, we can compare individual pitchers to their “like-minded” averages by velocity grouping. If I get enough feedback on other aspects of these fastballs to measure, we will look at that in the next day or two. If not, next week I have something similar to this, looking at BABIP by movement.