# The “toughest out” study, redux

September 25, 2007

I never expected the “toughest out” study to be much of anything. I was bored one night, fooling around a little bit with some Retrosheet data files, and thought I might test out the old “What’s the toughest out?” question. Then, Rob Neyer from ESPN linked to me, and it became my all-time most read piece. All this for a study that I did in about 15 minutes. Of course, the stuff that takes me hours of detail-oriented work to do gets read by five people.

When writing the piece, I knew that I wasn’t really doing the study much justice. I didn’t control for batter or pitcher quality, and my sampling methods were based on how quickly I could get the study done. A few good commenters pointed out a few possible improvements, and then (insert spooky sound effects here), Bill James himself visited me in a dream last night and told me the spirits of Sabermetrics were angry at me for shirking my duty. (OK, not really.) So, here is the “toughest out” study, done properly.

First, the ever-reliable Tango Tiger suggested that I look at the overall league OBP for the plate appearances when there had already been 1 out recorded, then 2, and so on. Fair enough. (A small confession: what I calculated wasn’t exactly OBP. Because of my database setup, I had to make do with whether an out had been recorded in each plate appearance. The great majority of those outs were made by the batter, but occasionally a batter singles and his idiot teammate gets thrown out at third. There are also times when a batter strikes out but reaches first on a passed ball.) My data set is almost everything that happened in 2006, throwing away my original stipulation that the only interesting things to look at were the games in which all 27 outs had been recorded. (That stipulation had thrown away all ninth-inning comebacks by the home team, as well as all home wins in which the bottom of the ninth was superfluous to requirements.) I *didn’t* look at outs that were recorded on caught stealings or pickoffs.

The out with the highest OBP? The 17th out (which came in 2nd place in the original study) with an OBP of .3614. It was followed closely by the 9th out at .3608, then the 10th out, 2nd out, and 1st out. The easiest out to get was the 25th out (1st out of the ninth inning) with an OBP of .3071. So the difference between the highest and the lowest is .054, which is one non-out in twenty plate appearances. Not a huge difference, but definitely a difference. The 27th out was actually the 6th easiest to come by.

Still, in my original study, the 1st out was the most difficult to come by. Several folks properly pointed out that this had something to do with the fact that the first person up in a game is the leadoff hitter and, Juan Pierre notwithstanding, the leadoff guy is usually a high-OBP guy. So, it’s important to control for the batter’s ability to avoid outs and the pitcher’s ability to induce them.

I calculated OBP for all pitchers and hitters over the course of the season and converted them into odds ratios (ORs). For those who aren’t familiar, an odds ratio takes the probability (p) from a yes/no question (Did the batter avoid making an out or not?) and turns it into something that is much easier to work with mathematically. The formula is p / (1-p).

Now, suppose that Larry is pitching and he has an OBP against of .333. He is facing Neifi (a name I just pulled out of nowhere), who has an OBP of .200. Neifi’s odds ratio is .200 / (1 – .200), which is (.200 / .800), or 0.25. Larry is at .333 / (1 – .333), or about 0.5. What is the expected probability that this confrontation ends without an out? We can find it with the following formula:

(batter OR / league OR) * (pitcher OR / league OR) = (expected OR / league OR)

I had to calculate the OBP for all plate appearances in the 2006 season (.3409 for the curious, which may not match up to other sources, but remember I’m using a slightly different definition for the purposes of this study), but the rest is just plugging numbers into the formula and solving. Once we’ve got the expected OR, it’s easy enough to convert it back into a probability: p = OR / (OR + 1).
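For anyone who wants to follow the arithmetic, here is a minimal Python sketch of the Neifi vs. Larry example. The function names are my own for illustration; the only inputs are the two OBPs and the .3409 league figure from the text. Solving the formula for the expected OR gives batter OR * pitcher OR / league OR.

```python
def to_odds(p):
    """Convert a probability into odds: p / (1 - p)."""
    return p / (1.0 - p)


def to_prob(odds):
    """Convert odds back into a probability: OR / (OR + 1)."""
    return odds / (odds + 1.0)


def expected_obp(batter_obp, pitcher_obp, league_obp):
    """Expected OBP for one batter/pitcher matchup.

    Starts from (batter OR / league OR) * (pitcher OR / league OR)
    = (expected OR / league OR) and solves for the expected OR.
    """
    expected_or = to_odds(batter_obp) * to_odds(pitcher_obp) / to_odds(league_obp)
    return to_prob(expected_or)


LEAGUE_OBP = 0.3409  # 2006 league figure quoted in the text

neifi_vs_larry = expected_obp(0.200, 0.333, LEAGUE_OBP)
print(f"Expected OBP, Neifi vs. Larry: {neifi_vs_larry:.3f}")
```

With these inputs the expected OBP comes out at about .194, a shade below Neifi’s own .200, because Larry is slightly better than league average at getting outs.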

Given all that, we can figure out what the expected OBP would be for any given plate appearance and, by summing a few things up, what the overall expected OBP would be for all PAs at a specific level of outs. Then, we can compare what the actual OBP was for that number of outs versus what could be expected given batter and pitcher quality.
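A rough sketch of that bookkeeping, in Python. This is not the actual code behind the study: the tuple layout, the function names, and the five sample plate appearances are all made up for illustration, and I am assuming that “summing a few things up” means averaging the per-PA expected probabilities within each out slot.

```python
from collections import defaultdict

LEAGUE_OBP = 0.3409  # league figure quoted in the text


def expected_obp(batter_obp, pitcher_obp, league_obp=LEAGUE_OBP):
    """Expected OBP for one matchup, using the odds formula from the text."""
    def odds(p):
        return p / (1.0 - p)

    expected_or = odds(batter_obp) * odds(pitcher_obp) / odds(league_obp)
    return expected_or / (expected_or + 1.0)


def summarize_by_out_slot(plate_appearances):
    """Return {out_slot: (actual, expected, diff)}.

    Each plate appearance is a hypothetical tuple:
    (out_slot, batter_obp, pitcher_obp, no_out_recorded).
    """
    # Per slot: [PAs with no out recorded, sum of expected OBPs, PA count]
    totals = defaultdict(lambda: [0, 0.0, 0])
    for out_slot, batter_obp, pitcher_obp, no_out in plate_appearances:
        slot_totals = totals[out_slot]
        slot_totals[0] += 1 if no_out else 0
        slot_totals[1] += expected_obp(batter_obp, pitcher_obp)
        slot_totals[2] += 1

    summary = {}
    for slot in sorted(totals):
        no_outs, expected_sum, count = totals[slot]
        actual = no_outs / count
        expected = expected_sum / count
        summary[slot] = (actual, expected, actual - expected)
    return summary


# Invented plate appearances, only to show the shape of the input.
sample = [
    (1, 0.360, 0.330, True),
    (1, 0.350, 0.340, False),
    (1, 0.370, 0.320, False),
    (2, 0.340, 0.330, True),
    (2, 0.330, 0.340, False),
]

summary = summarize_by_out_slot(sample)
for slot, (actual, expected, diff) in summary.items():
    print(f"{slot:>2}  {actual:.4f}  {expected:.4f}  {diff:+.4f}")
```

The diff column is simply actual minus expected, so a positive number means the out was harder to get than the matchups alone would predict.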

The toughest out to get using this formula? Still the 17th out. It had an expected OBP of .344 given who was batting and pitching at that time, but had an actual OBP of .361 for a difference of .017. Following close behind it were the 9th, 12th, and 14th outs. The easiest out to get was still the 25th out, followed by the third and fifth outs. I couldn’t discern any kind of pattern running through the numbers. Maybe I’ll take a look at a few other years to see whether certain outs are tough to come by from year to year.

The first out of the game actually drops down into 16th place. Interestingly enough, the expected OBP for the batter/pitcher matchups that tried to produce the first out was .3556, while the actual OBP was .3551. The first out of the game is almost exactly as hard to come by as one would expect given the people who generally bat (and pitch) there. The 27th out actually had a similar pattern, with an expected OBP of .3279 and an actual OBP of .3282. The last out of the game isn’t any harder (or easier) to get than one might expect given the batter/pitcher matchups that happen there.

Cool stuff. Can you list the results in a table: out/actual/expected/diff/SD?

Assuming about 7000 PA per out-slot, the random diff should be .0057. (SD above is diff/.0057). For the 17th out, you are reporting a diff of .017, meaning 3.0 SD from the mean. It’s possible this is evidence of a tiring pitcher (17th out would mean around the 26th batter, which is right around when pitchers are pulled).
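The .0057 figure can be reproduced with the usual binomial standard deviation, sqrt(p * (1 - p) / n). The sketch below is only a check of that arithmetic: it assumes p is the .3409 league OBP from the post (the comment does not say which p was used) and takes n = 7000 as stated.

```python
import math


def binomial_sd(p, n):
    """Standard deviation of an observed rate over n trials with success rate p."""
    return math.sqrt(p * (1.0 - p) / n)


LEAGUE_OBP = 0.3409   # assumed value of p, taken from the post
PA_PER_SLOT = 7000    # rough plate appearances per out slot, as stated above

sd = binomial_sd(LEAGUE_OBP, PA_PER_SLOT)
z_17th = 0.017 / sd   # the 17th out's diff, expressed in SDs

print(f"random SD of a diff: {sd:.4f}")
print(f"17th out: {z_17th:.1f} SD")
```

This gives an SD of about .0057 and puts the 17th out at about 3.0 SD, matching the numbers above.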

When I looked at it by PA (not out), there was a definite tiring pattern (the OBP goes up, the more batters a pitcher faces). It’s in The Book if you want to reference it.

Anyway, the 17 point difference is more like a 9 point difference after reflecting the tiring aspect, turning the 3 SD into 1.5 SD.

You’ll probably find that, after you apply a “tiring/starter” effect, the differences are random, as you’ve suspected.

I should have added that if you take the SD of the (adjusted) SD, you’ll probably get something very close to 1.00 (i.e., random).

Hopefully this works formatting-wise…

| Out | Actual | Expected | Diff |
|-----|--------|----------|------|
| 17 | .3614 | .3441 | .0173 |
| 9 | .3608 | .3484 | .0125 |
| 12 | .3439 | .3327 | .0111 |
| 14 | .3450 | .3363 | .0087 |
| 21 | .3506 | .3425 | .0081 |
| 10 | .3555 | .3493 | .0063 |
| 18 | .3461 | .3412 | .0049 |
| 8 | .3434 | .3391 | .0044 |
| 11 | .3441 | .3403 | .0038 |
| 13 | .3357 | .3343 | .0013 |
| 16 | .3453 | .3440 | .0013 |
| 20 | .3408 | .3400 | .0008 |
| 27 | .3282 | .3279 | .0003 |
| 4 | .3449 | .3449 | .0001 |
| 23 | .3416 | .3417 | -.0001 |
| 1 | .3551 | .3556 | -.0004 |
| 6 | .3192 | .3207 | -.0015 |
| 15 | .3364 | .3380 | -.0016 |
| 22 | .3402 | .3429 | -.0026 |
| 24 | .3363 | .3408 | -.0046 |
| 2 | .3554 | .3630 | -.0076 |
| 26 | .3214 | .3304 | -.0090 |
| 19 | .3331 | .3426 | -.0094 |
| 7 | .3166 | .3262 | -.0097 |
| 5 | .3174 | .3291 | -.0117 |
| 3 | .3518 | .3639 | -.0120 |
| 25 | .3071 | .3325 | -.0254 |

Interesting… let’s sort that again, this time by out number.

| Out | Actual | Expected | Diff |
|-----|--------|----------|------|
| 1 | .3551 | .3556 | -.0004 |
| 2 | .3554 | .3630 | -.0076 |
| 3 | .3518 | .3639 | -.0120 |
| 4 | .3449 | .3449 | .0001 |
| 5 | .3174 | .3291 | -.0117 |
| 6 | .3192 | .3207 | -.0015 |
| 7 | .3166 | .3262 | -.0097 |
| 8 | .3434 | .3391 | .0044 |
| 9 | .3608 | .3484 | .0125 |
| 10 | .3555 | .3493 | .0063 |
| 11 | .3441 | .3403 | .0038 |
| 12 | .3439 | .3327 | .0111 |
| 13 | .3357 | .3343 | .0013 |
| 14 | .3450 | .3363 | .0087 |
| 15 | .3364 | .3380 | -.0016 |
| 16 | .3453 | .3440 | .0013 |
| 17 | .3614 | .3441 | .0173 |
| 18 | .3461 | .3412 | .0049 |
| 19 | .3331 | .3426 | -.0094 |
| 20 | .3408 | .3400 | .0008 |
| 21 | .3506 | .3425 | .0081 |
| 22 | .3402 | .3429 | -.0026 |
| 23 | .3416 | .3417 | -.0001 |
| 24 | .3363 | .3408 | -.0046 |
| 25 | .3071 | .3325 | -.0254 |
| 26 | .3214 | .3304 | -.0090 |
| 27 | .3282 | .3279 | .0003 |

If I take the standard deviation of the SD, I get 1.59, which is very significant. If I only do it for the first 18 outs (6 innings, basically the starter), I get an SD of 1.45.

If I stick with the first 18 outs, and adjust the first 7 outs diff upward by 6 OBP points, and the other 11 outs down by 6 OBP points (as a way to handle the tiring pitcher and/or advantage of batter in facing same pitcher multiple times), the SD is 0.95. That is, random.

Try the same with the final 9 outs, and it becomes readily apparent that the easiest out, more than can be explained by chance, is the 25th out, as pizza has pointed out. The reason here is likely that it’s the first out of the 9th inning, and you have a “fresh” pitcher.

Otherwise, all other outs are within the realm of chance.
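For readers who want to rerun these checks, here is a sketch in Python using the Diff column of the posted table. A few assumptions are mine: each diff is divided by the rounded .0057 figure, the spread is measured with the sample standard deviation, and the 6-point tiring adjustment is applied exactly as described (first 7 outs up, outs 8 through 18 down). Because the posted diffs are rounded to four places, the results can differ from the quoted figures in the second decimal.

```python
import statistics

RANDOM_SD = 0.0057  # chance-level SD of a diff, from the ~7000 PA estimate

# Diff column (actual minus expected) from the posted table, outs 1 through 27.
DIFFS = [
    -0.0004, -0.0076, -0.0120, 0.0001, -0.0117, -0.0015, -0.0097, 0.0044, 0.0125,
    0.0063, 0.0038, 0.0111, 0.0013, 0.0087, -0.0016, 0.0013, 0.0173, 0.0049,
    -0.0094, 0.0008, 0.0081, -0.0026, -0.0001, -0.0046, -0.0254, -0.0090, 0.0003,
]


def sd_of_z(diffs):
    """Sample SD of the diffs after scaling each one by RANDOM_SD."""
    return statistics.stdev([d / RANDOM_SD for d in diffs])


def tiring_adjusted(diffs, shift=0.006, early_outs=7):
    """Raise the first `early_outs` diffs by `shift` and lower the rest by `shift`."""
    return [d + shift if i < early_outs else d - shift for i, d in enumerate(diffs)]


all_27 = sd_of_z(DIFFS)
first_18 = sd_of_z(DIFFS[:18])
first_18_adjusted = sd_of_z(tiring_adjusted(DIFFS[:18]))
z_25th = DIFFS[24] / RANDOM_SD  # the 25th out on its own

print(f"all 27 outs:             {all_27:.2f}")
print(f"first 18 outs:           {first_18:.2f}")
print(f"first 18 outs, adjusted: {first_18_adjusted:.2f}")
print(f"25th out z-score:        {z_25th:.1f}")
```

The three spreads should land close to the 1.59, 1.45, and 0.95 quoted above, and the 25th out sits more than 4 SD below expectation, which is why it stands out even after the rest looks random.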

But, but, but, hang on here. If you’ve eliminated all the home team come-from-behind wins, does that affect the probabilities on outs 25, 26 & 27 in a statistically significant way? Aren’t there a bunch of ABs for those outs that just got thrown out of the study? Or is it not enough to matter? Seems like I’m always seeing this on SportsCenter, but then again, it’s not much fun seeing the home team go down 1-2-3 in the 9th (unless it’s your team as visitors, of course).

A small misunderstanding. I did an earlier version of the study in which I had eliminated all come-from-behind wins. This post addressed that problem (and others) from the initial study.