# The "toughest out" study, redux

September 25, 2007 7 Comments

I never expected the “toughest out” study to be much of anything.

An archive of StatSpeak from its days on MVN

Cool stuff. Can you list the results in a table, with columns out/actual/expected/diff/SD?

Assuming about 7000 PA per out-slot, the standard deviation of the diff from chance alone should be about .0057. (The SD column I'm asking for above is diff/.0057.) For the 17th out, you are reporting a diff of .017, meaning 3.0 SD from the mean. It’s possible this is evidence of a tiring pitcher (the 17th out would come around the 26th batter, which is right around when pitchers are pulled).
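The arithmetic behind those figures can be sketched as follows. This is only an illustration: the comment does not say how .0057 was derived, so the binomial formula and the league OBP of .340 used here are assumptions that happen to land on the same value.

```python
import math

# Assumptions (not stated in the comment): a typical OBP of about .340 and a
# simple binomial model for reaching base within one out-slot.
p = 0.340   # assumed typical OBP
n = 7000    # plate appearances per out-slot, as estimated above

# Standard deviation of an observed OBP over n PA if only chance is at work
sigma = math.sqrt(p * (1 - p) / n)

# 17th out: actual minus expected OBP, expressed in SD units
diff_17 = 0.017
z_17 = diff_17 / sigma

# Rough number of batters faced by the time the 17th out is recorded
batters_at_17 = 17 / (1 - p)

print(f"sigma = {sigma:.4f}")
print(f"z for 17th out = {z_17:.1f}")
print(f"batters faced at 17th out = {batters_at_17:.0f}")
```

Under these assumptions the sketch lands on roughly .0057, 3.0 SD, and the 26th batter, in line with the numbers quoted above.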

When I looked at it by PA (not out), there was a definite tiring pattern (the OBP goes up, the more batters a pitcher faces). It’s in The Book if you want to reference it.

Anyway, the 17 point difference is more like a 9 point difference after reflecting the tiring aspect, turning the 3 SD into 1.5 SD.

You’ll probably find that, after you apply a “tiring/starter” effect, the differences are random, as you’ve suspected.

I should have added that if you take the SD of the (adjusted) SD values, you’ll probably get something very close to 1.00 (i.e., random).
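Why "close to 1.00" means random can be checked with a small simulation. This is a sketch under assumed inputs (the same true OBP of .340 in every out-slot, 7000 PA per slot, 27 slots), not a reproduction of the study's data: if nothing but chance separates the slots, each diff divided by its chance SD behaves like a standard normal draw, so the SD of those values should hover around 1.

```python
import random
import statistics

random.seed(2007)  # fixed seed so the sketch is repeatable

P = 0.340      # assumed true OBP, identical for every out-slot (no real effect)
N = 7000       # plate appearances per out-slot
SLOTS = 27     # out-slots in a nine-inning game
TRIALS = 20    # independent simulated data sets
SIGMA = (P * (1 - P) / N) ** 0.5  # chance SD of an observed OBP

def simulated_sd_of_z():
    """Simulate one data set and return the SD of its 27 z-scores."""
    zs = []
    for _ in range(SLOTS):
        times_on_base = sum(random.random() < P for _ in range(N))
        zs.append((times_on_base / N - P) / SIGMA)
    return statistics.stdev(zs)

sds = [simulated_sd_of_z() for _ in range(TRIALS)]
mean_sd = statistics.mean(sds)
print(f"average SD of z-scores over {TRIALS} trials: {mean_sd:.2f}")
```

Any single set of 27 slots will wander somewhat around 1, which is why a value well above 1 is read as a sign of something beyond chance.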

Hopefully this works formatting-wise…

| Out | Actual | Expected | Diff |
|-----|--------|----------|------|
| 17 | .3614 | .3441 | .0173 |
| 9 | .3608 | .3484 | .0125 |
| 12 | .3439 | .3327 | .0111 |
| 14 | .3450 | .3363 | .0087 |
| 21 | .3506 | .3425 | .0081 |
| 10 | .3555 | .3493 | .0063 |
| 18 | .3461 | .3412 | .0049 |
| 8 | .3434 | .3391 | .0044 |
| 11 | .3441 | .3403 | .0038 |
| 13 | .3357 | .3343 | .0013 |
| 16 | .3453 | .3440 | .0013 |
| 20 | .3408 | .3400 | .0008 |
| 27 | .3282 | .3279 | .0003 |
| 4 | .3449 | .3449 | .0001 |
| 23 | .3416 | .3417 | -.0001 |
| 1 | .3551 | .3556 | -.0004 |
| 6 | .3192 | .3207 | -.0015 |
| 15 | .3364 | .3380 | -.0016 |
| 22 | .3402 | .3429 | -.0026 |
| 24 | .3363 | .3408 | -.0046 |
| 2 | .3554 | .3630 | -.0076 |
| 26 | .3214 | .3304 | -.0090 |
| 19 | .3331 | .3426 | -.0094 |
| 7 | .3166 | .3262 | -.0097 |
| 5 | .3174 | .3291 | -.0117 |
| 3 | .3518 | .3639 | -.0120 |
| 25 | .3071 | .3325 | -.0254 |

Interesting… let’s sort that again, this time by out number:

| Out | Actual | Expected | Diff |
|-----|--------|----------|------|
| 1 | .3551 | .3556 | -.0004 |
| 2 | .3554 | .3630 | -.0076 |
| 3 | .3518 | .3639 | -.0120 |
| 4 | .3449 | .3449 | .0001 |
| 5 | .3174 | .3291 | -.0117 |
| 6 | .3192 | .3207 | -.0015 |
| 7 | .3166 | .3262 | -.0097 |
| 8 | .3434 | .3391 | .0044 |
| 9 | .3608 | .3484 | .0125 |
| 10 | .3555 | .3493 | .0063 |
| 11 | .3441 | .3403 | .0038 |
| 12 | .3439 | .3327 | .0111 |
| 13 | .3357 | .3343 | .0013 |
| 14 | .3450 | .3363 | .0087 |
| 15 | .3364 | .3380 | -.0016 |
| 16 | .3453 | .3440 | .0013 |
| 17 | .3614 | .3441 | .0173 |
| 18 | .3461 | .3412 | .0049 |
| 19 | .3331 | .3426 | -.0094 |
| 20 | .3408 | .3400 | .0008 |
| 21 | .3506 | .3425 | .0081 |
| 22 | .3402 | .3429 | -.0026 |
| 23 | .3416 | .3417 | -.0001 |
| 24 | .3363 | .3408 | -.0046 |
| 25 | .3071 | .3325 | -.0254 |
| 26 | .3214 | .3304 | -.0090 |
| 27 | .3282 | .3279 | .0003 |

If I take the standard deviation of the SD values (each diff divided by .0057), I get 1.59, which is very significant. If I restrict it to the first 18 outs (6 innings, basically the starter), I get an SD of 1.45.

If I stick with the first 18 outs, and adjust the diffs for the first 7 outs upward by 6 OBP points and those for the other 11 outs downward by 6 OBP points (as a way to handle the tiring pitcher and/or the batter's advantage in facing the same pitcher multiple times), the SD is 0.95. That is, random.
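These three figures can be approximately reproduced from the Diff column of the table. The sketch below makes two assumptions the comment does not spell out: that the chance SD is the same for every slot (computed from an OBP of .340 over 7000 PA, which gives about .0057), and that "SD" means the sample standard deviation.

```python
import math
import statistics

# Diff column (actual minus expected OBP) from the table above, outs 1 through 27
diffs = [-.0004, -.0076, -.0120, .0001, -.0117, -.0015, -.0097, .0044, .0125,
         .0063, .0038, .0111, .0013, .0087, -.0016, .0013, .0173, .0049,
         -.0094, .0008, .0081, -.0026, -.0001, -.0046, -.0254, -.0090, .0003]

# Assumption: one chance SD for all slots, from OBP ~ .340 over ~7000 PA
sigma = math.sqrt(0.340 * 0.660 / 7000)

def sd_of_z(values):
    """Sample SD of the diffs after converting each one to SD units."""
    return statistics.stdev([d / sigma for d in values])

all_27 = sd_of_z(diffs)
first_18 = sd_of_z(diffs[:18])

# Tiring adjustment described above: outs 1-7 up 6 OBP points, outs 8-18 down 6
adjusted = [d + 0.006 for d in diffs[:7]] + [d - 0.006 for d in diffs[7:18]]
first_18_adjusted = sd_of_z(adjusted)

print(f"all 27 outs:        {all_27:.2f}")
print(f"first 18 outs:      {first_18:.2f}")
print(f"first 18, adjusted: {first_18_adjusted:.2f}")
```

With these assumptions the three values come out within roughly 0.01 of the 1.59, 1.45, and 0.95 quoted above; small differences are expected because the table's diffs are rounded and the true PA counts per slot are not given.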

When I try the same with the final 9 outs, it becomes readily apparent that the easiest out, by more than chance can explain, is the 25th out, as pizza has pointed out. The likely reason is that it’s the first out of the 9th inning, and you have a “fresh” pitcher.

Otherwise, all other outs are within the realm of chance.
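For the last nine outs, a rough check is to convert each diff into SD units and see which ones stand out. The sketch below uses the unadjusted diffs from the table, the same assumed chance SD of about .0057 (OBP of .340 over 7000 PA), and a 3 SD cutoff that is only an illustrative threshold, not one taken from the comment.

```python
import math

# Diff column for outs 19 through 27, copied from the table above (unadjusted)
late_diffs = {19: -.0094, 20: .0008, 21: .0081, 22: -.0026, 23: -.0001,
              24: -.0046, 25: -.0254, 26: -.0090, 27: .0003}

# Assumption: chance SD per slot from OBP ~ .340 over ~7000 PA
sigma = math.sqrt(0.340 * 0.660 / 7000)

# Express each diff in SD units
z_scores = {out: d / sigma for out, d in late_diffs.items()}

# Illustrative cutoff: flag anything more than 3 SD from zero
outliers = [out for out, z in z_scores.items() if abs(z) > 3]

for out, z in z_scores.items():
    print(f"out {out}: {z:+.2f} SD")
print("beyond 3 SD:", outliers)
```

Only the 25th out clears the cutoff, at roughly 4.5 SD below expectation.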

But, but, but, hang on here. If you’ve eliminated all the home team come-from-behind wins, does that affect the probabilities on outs 25, 26 & 27 in a statistically significant way? Aren’t there a bunch of ABs for those outs that just got thrown out of the study? Or is it not enough to matter? Seems like I’m always seeing this on SportsCenter, but then again, it’s not much fun seeing the home team go down 1-2-3 in the 9th (unless it’s your team as visitors, of course).

A small misunderstanding. I did an earlier version of the study in which I had eliminated all come-from-behind wins. This post addressed that problem (and others) from the initial study.