Position Players on the Hill

Since Voros McCracken’s laws were published way back in the early 21st century, baseball statistical analysts have sought to firmly establish whether pitchers have any control over their batting average on balls in play (BABIP). However, one confounding variable has always been selection bias: it may be that the pitchers who make it to the major leagues have some special control over their BABIPs, and that the pitchers who lack that control are not in the majors precisely because they can’t control this variable. As a result, statistical analyses have no set “control group” against which to compare major league pitchers. If we had a reliable control group to test this BABIP theory against, we would have a clearer picture of whether pitchers can control their BABIP.

Therefore, I thought it would be interesting to see the results of position players who take the mound. Certainly, they fit the criteria that we want in a control group. First, they are almost certainly worse than minor league pitchers (though it’s possible that some of them are better). Second, they sidestep the “scout selection bias” that goes along with pitchers who make the major leagues: they were not selected by scouts or player analysis experts to pitch in the majors. It should be noted, though, that many of these players do have some sort of pitching experience and should have enough athletic ability to post good velocities. In addition, their managers did pick them as competent enough to pitch. Either way, it is a reasonable assumption that these pitchers are far worse than their major league counterparts and that they do not fall under the “attrition” bias, in which poor performance shuts a pitcher out of the league, as happens to many players with poor debuts.

Anyway, let’s get on to the results. The sample was taken from all player seasons in the last 15 years in which a player threw fewer than 10 innings and appeared as a position player in more than 50 games. The table is compiled at the end of the page and was derived from statistics at the Baseball Databank. The total sample comprised 54 innings.

I’ll leave the results here then let you guys talk it over.

First, the BABIP. I still think that I may have totaled the number of balls in play wrong, so I’d love for someone else to check it for me. That said, the total BABIP for the sample was .269. This was especially intriguing given that it was actually lower than the standard .300. I was hoping to see a number in the upper .300s, which would mean that there is a spectrum of BABIPs that extends to the results of lesser pitchers. It’s still possible that there is such a spectrum, but this study did not provide evidence for it.

Second, the relative skill of the pitchers. Don’t fear: just because the BABIP didn’t pan out as expected doesn’t mean that the rest of the numbers didn’t either. The pitchers compiled a total 7.33 ERA, with a 7.67 BB/9 rate and a 4.0 K/9 rate. These results were a little surprising, as I expected the ERA to be much higher than 7.33, somewhere in the teens. In addition, I thought that the K rate would be much lower, as I didn’t think that MLB hitters struck out against position players at such a frequency. Maybe it isn’t so embarrassing to be retired via the K by a non-pitcher, or maybe players should just be embarrassed every time they are K’d by Carlos Silva.

Without fly ball data, I was unable to assess the HR/FB rate directly. Home runs were not all that frequent, though: 9 were hit, against 178 balls in play. If we guess that 37.07 percent of balls in play were fly balls (a total of 66 fly balls), then 9 of 66 + 9 = 75 fly balls left the yard, or 12 percent of fly balls, just 1-2 percentage points worse than the league average for MLB pitchers. Strange.
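To make the HR/FB guess easy to rerun with a different fly ball share, here is a minimal Python sketch of the arithmetic above. The 37.07 percent share is the guess from the paragraph, not measured data, and the function name is just an illustrative choice.

```python
# Totals from the post; FLY_BALL_SHARE is a guess, not measured data.
HOME_RUNS = 9
BALLS_IN_PLAY = 178
FLY_BALL_SHARE = 0.3707


def estimated_hr_per_fb(home_runs, balls_in_play, fly_ball_share):
    """Estimate HR/FB when only home runs and balls in play are known."""
    # Guess how many balls in play were fly balls.
    fly_balls_in_play = round(balls_in_play * fly_ball_share)
    # Home runs are not counted as balls in play, so add them back
    # to get the total number of fly balls.
    return home_runs / (fly_balls_in_play + home_runs)


rate = estimated_hr_per_fb(HOME_RUNS, BALLS_IN_PLAY, FLY_BALL_SHARE)
print(f"Estimated HR/FB: {rate:.1%}")
```

The estimate is only as good as the assumed fly ball share, so it is worth trying a few other values to see how far the result moves.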

With such a small sample size, it is hard to pull any concrete results from the data. However, it does seem to lend evidence against the notion that major league pitchers are selected for their BABIP or HR/FB rates.

Beyond that, I’ll let you readers discuss.

Here are the sums of the data:

BIP: 178        H on BIP: 48      BABIP .26966

HBP: 6           H: 57                     IPouts: 162

BFP: 263      HR: 9                     BB: 46

SO: 24           IBB: 0                  ER: 44

IP: 54          K/9: 4.0               ERA: 7.3333

BB/9: 7.666
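For anyone who wants to check the totals, here is a minimal Python sketch that recomputes the rate stats from the sums above. The balls-in-play formula is an assumption about how the 178 figure was reached: it treats every batter faced who did not walk, strike out, get hit by a pitch, or homer as a ball in play. That means sacrifice bunts would be counted too, which the usual BABIP definition excludes; the table has no column for them.

```python
# Summed totals copied from the lines above.
totals = {
    "H": 57, "HR": 9, "BB": 46, "SO": 24,
    "HBP": 6, "BFP": 263, "IPouts": 162, "ER": 44,
}


def rate_stats(t):
    """Recompute BIP, BABIP, K/9, BB/9 and ERA from raw totals."""
    innings = t["IPouts"] / 3  # three outs per inning
    # Assumed definition: batters faced minus every outcome in the
    # table that is not a ball in play.
    balls_in_play = t["BFP"] - t["BB"] - t["SO"] - t["HBP"] - t["HR"]
    hits_in_play = t["H"] - t["HR"]
    return {
        "IP": innings,
        "BIP": balls_in_play,
        "BABIP": hits_in_play / balls_in_play,
        "K/9": 9 * t["SO"] / innings,
        "BB/9": 9 * t["BB"] / innings,
        "ERA": 9 * t["ER"] / innings,
    }


stats = rate_stats(totals)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Under that assumption the sketch lands on the same 178 balls in play and .2697 BABIP listed above.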

One last note: I removed Rick Ankiel from the results, as he was formerly an accomplished pitcher but still crept into the query.

playerID playerID G HBP H IPouts BFP HR BB SO IBB ER G_batting AB yearID
alexama02 alexama02 1 0 1 2 7 1 4 0 0 5 54 149 1997
bellde01 bellde01 1 0 3 3 10 0 3 0 0 4 158 627 1996
benjami01 benjami01 1 0 0 3 3 0 0 0 0 0 35 103 1996
bogarti01 bogarti01 2 0 2 6 9 1 1 1 0 1 97 241 1997
boggswa01 boggswa01 1 0 0 3 4 0 1 1 0 0 132 501 1996
bonilbo01 bonilbo01 1 0 3 3 6 1 1 0 0 2 159 595 1996
burkeja02 burkeja02 1 0 1 3 4 0 0 0 0 1 57 120 2004
burrose01 burrose01 1 0 4 3 7 1 0 0 0 3 63 192 2002
cangejo01 cangejo01 1 0 1 6 7 0 0 0 0 0 108 262 1996
cansejo01 cansejo01 1 0 2 3 8 0 3 0 0 3 96 360 1996
cirilje01 cirilje01 1 0 0 3 5 0 2 1 0 0 158 566 1996
davisch01 davisch01 1 1 0 6 7 0 0 0 0 0 145 530 1996
durritr01 durritr01 1 0 0 1 1 0 0 0 0 0 43 122 1999
espinal01 espinal01 1 0 0 2 2 0 0 0 0 0 59 112 1996
finlest01 finlest01 1 1 0 3 4 0 1 0 0 0 161 655 1996
francma01 francma01 2 0 3 4 10 1 3 2 0 2 112 163 1997
gaettga01 gaettga01 1 1 1 1 3 0 0 0 0 0 141 522 1996
giovaed01 giovaed01 1 0 1 4 7 0 2 0 0 0 92 139 1998
gonzawi01 gonzawi01 1 0 0 3 4 0 1 0 0 0 95 284 2000
gracema01 gracema01 1 0 1 3 4 1 0 0 0 1 142 547 1996
haltesh01 haltesh01 1 0 1 3 3 0 0 0 0 0 74 123 1997
harrile01 harrile01 1 0 0 3 3 0 0 1 0 0 125 302 1996
howarda02 howarda02 1 0 2 6 12 0 5 0 0 1 143 420 1996
jacksda03 jacksda03 1 0 3 6 10 0 2 0 0 2 49 130 1997
jimenda01 jimenda01 1 0 0 4 4 0 0 0 0 0 86 308 2001
lakerti01 lakerti01 1 0 1 3 5 0 1 1 0 0 52 162 2003
loretma01 loretma01 1 0 1 3 5 0 1 2 0 0 73 154 1996
mabryjo01 mabryjo01 1 0 3 2 6 0 1 0 0 2 151 543 1996
martida01 martida01 1 0 2 1 5 0 2 0 0 2 146 440 1996
maynebr01 maynebr01 1 0 1 3 5 0 1 0 0 0 85 256 1997
mccarda01 mccarda01 3 0 2 11 14 0 1 4 0 1 91 175 1996
menecfr01 menecfr01 1 0 6 3 8 1 0 0 0 4 66 145 2000
milesaa01 milesaa01 2 1 3 6 9 1 0 0 0 2 134 522 2004
nunezab01 nunezab01 1 0 0 1 1 0 0 0 0 0 90 259 1999
ojedaau01 ojedaau01 1 0 0 3 3 0 0 0 0 0 78 144 2001
oneilpa01 oneilpa01 1 0 2 6 11 1 4 2 0 3 150 546 1996
osikke01 osikke01 1 1 2 3 8 0 2 1 0 4 48 140 1996
penato02 penato02 1 0 0 3 3 0 0 1 0 0 152 509 2007
perezto03 perezto03 1 0 0 1 2 0 0 0 0 0 91 295 1996
relafde01 relafde01 1 0 0 3 3 0 0 1 0 0 142 494 1998
seitzke01 seitzke01 1 0 0 1 1 0 0 1 0 0 132 490 1996
sheldsc01 sheldsc01 1 0 0 1 1 0 0 1 0 0 58 124 2000
spiezsc01 spiezsc01 1 0 0 3 4 0 1 0 0 0 147 538 1997
venturo01 venturo01 1 0 1 3 4 0 0 0 0 0 158 586 1996
wallati01 wallati01 1 0 1 3 3 0 0 0 0 0 57 190 1996
whitema01 whitema01 1 1 1 3 7 0 2 3 0 1 40 140 1996
wilsojo03 wilsojo03 1 0 1 3 5 0 1 0 0 0 90 263 2007
woodja02 woodja02 1 0 0 3 3 0 0 0 0 0 98 117 2007
zeileto01 zeileto01 1 0 1 3 3 0 0 1 0 0 29 117 1996

The Current Criteria For Defining Batted Balls

With all the emphasis placed on BABIP in the statistical forums, we really could use a better method of classifying batted balls than line drives, groundballs, fly balls, and pop-ups. I guess that’s why Hit F/X is about to take the stat world by storm. For now, we have to deal with what we have.

To get a good sense of what we are dealing with, we should see how well these batted ball descriptions correlate with BABIP. Therefore, I took a sample of all qualified 2008 starting pitchers and fit a regression equation relating their batted ball rates to BABIP. The results were not particularly encouraging.

Here’s the equation:

Pitcher BABIP = 1.90 – 1.11 LD% – 1.67 FB% – 1.75 GB% – 0.144 IFFB%

The R-Squared of this equation was 0.352, meaning that the batted ball mix accounts for only about 35 percent of the variation in pitcher BABIP. Unfortunately, this is a moderate to weak correlation. In other settings, such as trying to find the relationship between break and curve ball success, or between count and BABIP, we might be happy with this result. However, given the importance placed on batted ball data, especially when analyzing pitchers, this shows that the current classifications are inadequate.
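To get a feel for what the equation predicts, here is a minimal Python sketch that evaluates it for one pitcher. It assumes the rates are entered as fractions (0.20 for 20 percent), which is the only scale on which the coefficients give sensible BABIPs; the batted ball profile below is hypothetical, not a real pitcher.

```python
def predict_babip(ld, fb, gb, iffb):
    """Evaluate the fitted equation; all rates are fractions, not percents."""
    return 1.90 - 1.11 * ld - 1.67 * fb - 1.75 * gb - 0.144 * iffb


# Hypothetical profile: 20% line drives, 35% fly balls, 45% ground balls,
# with 10% of fly balls staying on the infield.
predicted = predict_babip(ld=0.20, fb=0.35, gb=0.45, iffb=0.10)
print(f"Predicted BABIP: {predicted:.3f}")
```

With an R-Squared this low, a prediction like this should be read as a rough central guess rather than a forecast for any individual pitcher.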

Another important factor to remember is defense. Every defense influences the pitchers who throw in front of it. Therefore, we should rerun the regression while accounting for defense, here measured by team BABIP, to see whether the fit improves.

Here is the regression equation:

Pitcher BABIP = 1.06 + 0.616 Team BABIP – 0.51 LD% – 1.01 FB% – 1.09 GB%
                – 0.102 IFFB%

R-Squared: .418

Again, there is only a moderate correlation: even factoring defense into the equation raised the R-Squared only modestly, from 0.352 to 0.418.
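As a rough illustration of how much the defense term matters, here is a minimal Python sketch that evaluates the second equation for the same pitcher in front of two different defenses. As before, rates are assumed to be fractions, and both the batted ball profile and the two team BABIP values are hypothetical.

```python
def predict_babip_with_defense(team_babip, ld, fb, gb, iffb):
    """Evaluate the defense-adjusted equation; rates are fractions."""
    return (1.06 + 0.616 * team_babip
            - 0.51 * ld - 1.01 * fb - 1.09 * gb - 0.102 * iffb)


# Hypothetical pitcher profile, held fixed across both defenses.
profile = {"ld": 0.20, "fb": 0.35, "gb": 0.45, "iffb": 0.10}

# Hypothetical team BABIPs: a good defense and a poor one.
behind_good_defense = predict_babip_with_defense(0.290, **profile)
behind_poor_defense = predict_babip_with_defense(0.310, **profile)
gap = behind_poor_defense - behind_good_defense

print(f"Good defense: {behind_good_defense:.3f}")
print(f"Poor defense: {behind_poor_defense:.3f}")
print(f"Gap: {gap:.4f}")
```

Because the team BABIP coefficient is 0.616, a 20-point swing in team BABIP moves the predicted pitcher BABIP by about 12 points, whatever the batted ball profile.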

As we are on the eve of the availability of Hit F/X data, hopefully these points will become moot. Until then, be sure to take batted ball tendencies of pitchers with a grain of salt when making inferences on BABIP.
