Debunking a Statistical Myth

Over the last few weeks, I’ve read a lot about the Moneyball backlash that the White Sox World Series victory supposedly foreshadows. Couple that with the Dodgers’ inane knee-jerk firing of GM Paul DePodesta, and baseball traditionalists are crying triumph.
Bob Cook’s recent article over at Flak Magazine is one of those pieces. While Cook’s piece is fraught with jabs at the sabermetric community, he concludes with a keen observation:

The concept at the core of “Moneyball” is not statistical acronyms such as VORP and OPS, but finding players whom the marketplace undervalues. Williams did that with the White Sox by signing players who weren’t over the hill, but had worn out their welcomes elsewhere and therefore would be available for bargain prices.

There are, I would say, kernels of a very accurate representation of the White Sox World Series victory here. The lessons of Moneyball focus more on finding hidden value in players through statistical analysis when working under a constrained budget. I don’t think Williams did that by signing players who were not welcome elsewhere; the only one of the White Sox unwelcome anywhere else was Carl Everett. Rather, Williams took a chance on a bunch of players and got great pitching. Since baseball is so dollars-oriented, there’s bound to be a little bit of Moneyball in every General Manager not named Brian Cashman.
But my main complaint with Cook’s article is this gem:

Williams’ White Sox won the World Series this year with a strategy that included use of the sacrifice bunt and the stolen base, two plays that statheads will tell you, with convincing evidence, result in fewer runs than if a team just let its players swing away and wait for the next batter to move them over.

This is a charge I see repeated over and over again by people who are afraid of the statistically-minded community, and it’s simply not true.
Statistics do not say that the use of the sacrifice bunt results in fewer runs than swinging away every time. Rather, it’s necessary to explore when it’s a good idea to use the sacrifice bunt and when it is not. There is indeed a place in baseball for the sacrifice bunt if it’s used properly. Let’s take a look at some numbers culled from the 2005 Expected Run Matrix.
This season, a team that had a runner on first and no out could be expected to score 0.8968 runs. In a runner-on-first, no-out situation, what happens to your expected run output if you sacrifice? Well, it declines. A team with a runner on second and one out could be expected to score 0.6911 runs. That’s a drop of about 23 percent in scoring. In that case, it doesn’t make sense to sacrifice (unless your pitcher is up) because it doesn’t help your chances of scoring a run.
But what happens if you have a runner on second and no one out? The whole picture changes. A runner on second and no one out results in 1.1385 runs on average. A runner on third and one out results in 0.9795 runs, a drop of 14 percent. While bunting here doesn’t necessarily help your chances of scoring a run, the effect is not that important. Teams will score a run with a runner on third and one out 98 times out of 100 (unless it’s your favorite team, and then they never succeed there). Teams succeed in this situation nearly all the time, as you can see that it’s much more likely that a run will score than that it will not. (EDIT: My original statement wasn’t a correct interpretation of the run expectancy matrix. An average of 0.9795 runs score with a runner on third and one out. That’s not the same as saying that the run scores 98 times out of 100.)
With runners on first and second and no out, bunting is a statistically irrelevant move. In that situation, a team goes from scoring 1.4693 runs to 1.4144 runs. The drop, less than 4 percent, is negligible.
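For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the three sacrifice scenarios. The run expectancy values are the ones quoted above; the state labels, the `SACRIFICES` list, and the `percent_change` helper are just illustrative names, not part of any published matrix.

```python
# Run expectancy values quoted in the article (2005 Expected Run Matrix).
# Keys are (runners on base, outs); the labels are illustrative only.
RUN_EXPECTANCY = {
    ("1st", 0): 0.8968,
    ("2nd", 1): 0.6911,
    ("2nd", 0): 1.1385,
    ("3rd", 1): 0.9795,
    ("1st+2nd", 0): 1.4693,
    ("2nd+3rd", 1): 1.4144,
}

# A successful sacrifice moves each runner up one base and costs one out.
SACRIFICES = [
    (("1st", 0), ("2nd", 1)),
    (("2nd", 0), ("3rd", 1)),
    (("1st+2nd", 0), ("2nd+3rd", 1)),
]


def percent_change(before: float, after: float) -> float:
    """Percentage change in expected runs, relative to the starting state."""
    return (after - before) / before * 100


for start, end in SACRIFICES:
    change = percent_change(RUN_EXPECTANCY[start], RUN_EXPECTANCY[end])
    print(f"{start} -> {end}: {change:+.1f}%")
```

Keep in mind this only measures a bunt that works as intended. A failed bunt, or a bunt that turns into a hit, would need its own entries.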
So stats-minded analysts will tell you that, while bunting never really adds to your run-scoring chances, there are times when it’s acceptable to bunt. You never want to see your team surrendering outs in a meaningless fashion. So if a sacrifice bunt is in order, it had better not be to move a player over from first to second.
What about stolen bases? Again, stolen bases have to be used judiciously. Take, for example, a runner on first and no out. That situation, if you recall, results in 0.8968 runs. If he’s caught stealing, the run expectancy value drops to 0.2796, a decrease of about 69 percent. If he’s successful, the run expectancy value goes up to 1.1385, a 27 percent increase. In pure numerical terms, the loss is about 2.5 times the gain.
Much math, discussed previously at Baseball Prospectus and at ESPN.com during its Golden Era, shows that you need three successful steals for every time caught stealing.
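As a rough sketch of where a break-even rate comes from, the snippet below solves for the success rate at which the expected gain from steals equals the expected loss from being caught, using only the three run expectancy values quoted above. The function and variable names are my own, and this covers just one base-out state (runner on first, no out), so it need not land exactly on the three-to-one rule of thumb.

```python
# Run expectancy values quoted in the article (2005 matrix).
RE_RUNNER_ON_FIRST_NO_OUT = 0.8968   # before the steal attempt
RE_RUNNER_ON_SECOND_NO_OUT = 1.1385  # after a successful steal
RE_BASES_EMPTY_ONE_OUT = 0.2796      # after a caught stealing


def break_even_success_rate(before: float, success: float, failure: float) -> float:
    """Success rate p at which p * gain equals (1 - p) * loss."""
    gain = success - before
    loss = before - failure
    return loss / (gain + loss)


rate = break_even_success_rate(
    RE_RUNNER_ON_FIRST_NO_OUT,
    RE_RUNNER_ON_SECOND_NO_OUT,
    RE_BASES_EMPTY_ONE_OUT,
)
successes_per_failure = rate / (1 - rate)

print(f"break-even success rate: {rate:.1%}")
print(f"successful steals needed per caught stealing: {successes_per_failure:.2f}")
```

In this one state the break-even point comes out in the low 70s percent. The three-to-one figure works out to 75 percent, so treat the sketch as a sanity check on the order of magnitude rather than a replacement for the fuller analyses.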
Again, stats-minded folks don’t believe the stolen base should be discarded. Rather, it should be used intelligently. This season, Scott Podsednik, credited with leading the White Sox’ revamped offense, was caught stealing 23 times while successfully swiping 59 bases. That is fewer than three steals for every time caught, so he actually cost the White Sox runs by running so frequently.
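Here is a quick check of Podsednik’s numbers against the three-to-one rule of thumb. The steal and caught-stealing totals are the ones given above; the threshold is the rule as stated in this post, and the variable names are only for illustration.

```python
STEALS = 59
CAUGHT = 23
REQUIRED_RATIO = 3.0  # three successes per caught stealing, as cited in the post

success_rate = STEALS / (STEALS + CAUGHT)
steals_per_caught = STEALS / CAUGHT
required_rate = REQUIRED_RATIO / (REQUIRED_RATIO + 1)
steals_needed = REQUIRED_RATIO * CAUGHT  # steals needed to offset 23 times caught

print(f"success rate: {success_rate:.1%} (rule of thumb: {required_rate:.0%})")
print(f"steals per caught stealing: {steals_per_caught:.2f}")
print(f"steals needed to break even at {CAUGHT} caught: {steals_needed:.0f}")
```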
So all of this analysis shows a few things, in my opinion. First, people who look down upon statistical analysis tend to do so because of the seeming complexity of the numbers. Math is hard, and it can be tedious to wade through all of these numbers. Second, statistical analysis is born out of the game and not the other way around. By studying the game, stats analysts can better discern which strategies work and which don’t. Without the numbers from games already played, though, stats can’t predict anything. The numbers can help us understand what to do in the future, but we can only do that by learning from the past.

11 Responses to Debunking a Statistical Myth

  1. John says:

    I think what some people fail to recognize is that the chances of scoring a run after bunting are lower and the benefit-cost ratio of stealing bases is .33 for average batters (or a succession of batters in a lineup).
    Clearly bunting a runner over when the batter has a BA/OBP/SLG below the league averages will increase the chances of scoring runs. Same goes for baserunning – certainly a successful steal increases scoring in certain situations but a caught stealing is even more costly in some situations.
    Anyone have any thoughts on what those thresholds and situations would be? Or is it like Bill James and his reversal on clutch hitting – we don’t have enough situations to analyze for a big enough sample?

  2. Voros McCracken says:

    Nice article, Benjamin. However the statement:
    “Teams will score a run with a runner on third and one out 98 times out of 100 (unless it’s your favorite team and then they never succeed there).”
    is incorrect. That .9795 represents the expected number of runs scored in the remainder of the inning given the base out condition listed (man on third one out). This includes runs that might be scored in that inning that are by people other than the guy on third base. To get a more accurate idea of how often that particular guy on third scores (though not a perfect one), subtract whatever the expected runs are for a no one on and one out inning from that .9795 above. Most of the time that number hovers at a little under 70%.
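To make that subtraction concrete, here is a small sketch using figures already quoted in the article. It assumes the 0.2796 value the article gives for a caught stealing (runner on first, no out) is the bases-empty, one-out state, which is where that play leaves you. The names are illustrative.

```python
# Values quoted in the article (2005 matrix).
RE_RUNNER_ON_THIRD_ONE_OUT = 0.9795
RE_BASES_EMPTY_ONE_OUT = 0.2796  # assumed: the state after a caught stealing from first, no out


def approx_runner_value(re_with_runner: float, re_bases_empty: float) -> float:
    """Expected runs attributable to the runner: total expectancy minus the
    expectancy of the same inning state with nobody on base."""
    return re_with_runner - re_bases_empty


runner_scores_estimate = approx_runner_value(
    RE_RUNNER_ON_THIRD_ONE_OUT, RE_BASES_EMPTY_ONE_OUT
)
print(f"approximate chance the runner on third scores: {runner_scores_estimate:.1%}")
```

As noted above, this is an approximation, not an exact scoring probability.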

  3. mezzie says:

    “A runner on third and one out results in .9795 runs for a drop of 14 percent. While bunting here doesn’t necessarily help your chances of scoring a run, the effect is not that important. Teams will score a run with a runner on third and one out 98 times out of 100”
    This is wrong.
    The .9795 figure is not a %; it’s the # of expected runs in the inning. Using the same logic as the author, one should conclude that a team will score a run with a runner on second and no outs 113.85 times out of 100…
    The question of helping the team’s chances of scoring a single run is vastly different from the question of how many runs the team will be expected to score on average in the rest of the inning. The former question is not addressed in the article; the answer to the latter question is -14% after a SUCCESSFUL bunt.
    It is clear that sacrificing a runner from 2nd to 3rd is a disastrous play from a run-expectancy point of view. If the team needs a single run to win the game, then the bunt is justified when the odds of scoring THAT ONE RUN (not 98 out of 100, mind you!) after a successful bunt rise based on the skills of the bunter and the next couple of batters.

  4. Benjamin Kabak says:

    Voros and Mezzie: Sorry. You’re right. I got a little confused there, and I was writing this entry up fairly quickly this afternoon while at work. Lame excuse, I know.
    However, Mezzie, I don’t understand how you can say it’s disastrous to sacrifice from 2nd to 3rd from a run-expectancy point of view. It’s not disastrous to do that. In fact, that’s one of the times it’s not a horrible play.

  5. mezzie says:

    I think it’s reasonable to say that any play which leads to a 14% drop in scoring over the long haul is disastrous 🙂
    If a team’s scoring drops from 800 runs one season to 688, all else being equal, wouldn’t that be disastrous to their W/L record? (a 14% drop) Such seemingly small numbers can make an enormous difference given a large enough sample.
    Let’s say a team has 80 such opportunities over the course of a season (# taken out of thin air). If they make a successful sacrifice in every case, they will lose .16 of a run each time, for a net loss of 80 x 0.16 = 12.8 runs, or a little more than an extra team loss.
    Of course no team would ever employ that strategy all the time. The point is that teams which do it regularly in the 1st inning or early in the game with good hitters up are costing themselves runs, and hence wins. Teams which use the sacrifice in the late innings of a close game may be increasing their win expectancy even though they decrease their run expectancy, since it’s possible that the chances of scoring the first run increase. That depends on all the other factors at play in the given situation. Managers who blindly bunt without considering such factors are costing their team runs and wins.
    All else being equal, based on straight run-expectancy, it is disastrous in my book. Feel free to disagree!
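A short sketch of that season-long arithmetic, using the unrounded run expectancy values from the article. The 80 opportunities is the same number taken out of thin air in the comment above, and the variable names are illustrative.

```python
# Run expectancy values quoted in the article (2005 matrix).
RE_RUNNER_ON_SECOND_NO_OUT = 1.1385
RE_RUNNER_ON_THIRD_ONE_OUT = 0.9795

OPPORTUNITIES = 80  # hypothetical count, as in the comment above

# Expected runs given up by each successful sacrifice from second to third.
loss_per_bunt = RE_RUNNER_ON_SECOND_NO_OUT - RE_RUNNER_ON_THIRD_ONE_OUT
season_loss = OPPORTUNITIES * loss_per_bunt

print(f"runs lost per successful sacrifice: {loss_per_bunt:.3f}")
print(f"runs lost over {OPPORTUNITIES} sacrifices: {season_loss:.2f}")
```

With the unrounded per-bunt figure the total comes out slightly under the 12.8 runs quoted above, which used .16 per bunt.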

  6. Benjamin Kabak says:

    I see your point there about it being disastrous from an overall standpoint. I guess this is where you have to consider the circumstances. If you need one run to tie and it’s late in the game, a manager may feel more comfortable bunting a runner over from second to third with no one out, depending upon who’s up. This is the kind of example where sabermetrics can only do so much and game-time situations are seemingly more important. If your 8 hitter gets on, bunt with the 9 hitter. But if your 2 hitter gets on, hit away with the 3-4-5 guys.
    Still, it’s more disastrous to attempt to bunt a runner from 1st to 2nd because you lose EVEN MORE runs that way!

  7. mezzie says:

    Agreed 🙂
    The overall effect of the “other” factors far outweighs any simplistic analysis based on run expectancy tables. A manager who doesn’t factor in any of the surrounding situation and simply bunts because it’s the “right thing to do” is applying a strategy which, even if the outcome is as desired, loses runs.

  8. Rob Bonter says:

    “Odds of” (accomplishing something) is a misnomer. The correct syntax is “Chances of.” Odds should be stated in the context “against” something transpiring, such as “The odds against Bellamy Road winning the Kentucky Derby are 3-1” (the equivalent of a 25% chance). Or you could say the chances of Bellamy Road winning the Derby are 25%. But NOT “the odds of Bellamy Road winning are 3-1.” Chances and odds: 99.8% of the scribes in this country do not know the difference between the two and almost always screw it up in print.
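The conversion between the two is simple enough to sketch. The helper below is a hypothetical function, not from any library beyond the standard `fractions` module, and turns odds against an event into the chance of it happening.

```python
from fractions import Fraction


def odds_against_to_chance(against: int, in_favor: int) -> Fraction:
    """Convert odds of `against`-to-`in_favor` AGAINST an event into its probability."""
    return Fraction(in_favor, against + in_favor)


# Odds against of 3-1: three losing outcomes for every winning one.
chance = odds_against_to_chance(3, 1)
print(f"3-1 against is a {float(chance):.0%} chance")
```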

  9. Benjamin Kabak says:

    Yeah, Rob, you’re right. Odds of my messing that up are about 3-2. I do it a bit. It’s almost become an accepted part of the lexicon, not that it should be. Thanks for pointing that out!

  10. BosoxBob says:

    “Teams which use the sacrifice in the late innings of a close game may be increasing their win expectancy even though they decrease their run expectancy, since it’s possible that the chances of scoring the first run increase.”
    This is an excellent point, and Tangotiger has tables of data collected from the 1999-2002 seasons that show this effect. According to his data, the probability of scoring at least one run with a runner at first and nobody out is 43.7%, and with a runner at second and one out, that drops to 40.6%. However, the probability of scoring exactly one run goes up from 17.6% to 23.0%. Sacrificing a runner from second to third increases the single run probability from 34.8% to 47.8%.
    Another interesting thing about those tables relates to intentional walks after a successful sacrifice. With a runner on second and one out, an intentional walk increases the overall chance of scoring from 40.6% to 42.6%, but it decreases the single run probability from 23% to 16.1%. Even more interesting is the runner on third/one out case. Here, an intentional walk decreases both the overall scoring probability (66.2% to 65.5%) and the single run probability (47.8% to 37.0%).
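To line those figures up side by side, here is a small sketch that computes the change in percentage points for each case. The probabilities are the ones quoted in the comment above; the scenario labels and the `point_change` helper are illustrative only.

```python
# (before, after) scoring probabilities in percent, as quoted in the comment above.
SCENARIOS = {
    "sacrifice 1st to 2nd, at least one run": (43.7, 40.6),
    "sacrifice 1st to 2nd, exactly one run": (17.6, 23.0),
    "sacrifice 2nd to 3rd, exactly one run": (34.8, 47.8),
    "walk with runner on 2nd, one out, at least one run": (40.6, 42.6),
    "walk with runner on 2nd, one out, exactly one run": (23.0, 16.1),
    "walk with runner on 3rd, one out, at least one run": (66.2, 65.5),
    "walk with runner on 3rd, one out, exactly one run": (47.8, 37.0),
}


def point_change(before: float, after: float) -> float:
    """Change in percentage points, rounded to one decimal place."""
    return round(after - before, 1)


for label, (before, after) in SCENARIOS.items():
    print(f"{label}: {point_change(before, after):+.1f} points")
```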

  11. The Zoner says:

    Just found your blog–great stuff. I’ve written about it too–the small ball thing is a poorly put together argument. The Sox won in spite of their offensive strategy, not because of it. They still hit a bunch of home runs. That and the pitching and D are the reasons–not giving away outs by bunting and getting caught stealing.
