A quick look at baserunning

You’re probably looking around going, “Where’s my roundtable?” And you will have a roundtable. Probably Friday. In the meantime, I’m laying out some finger sandwiches and lemonade – a light afternoon snack, if you’d like. Partake if you wish.

So I have a baserunning evaluation metric, measured in runs above/below average. Nothing fancy or special, really. Dan Fox has covered this ground a lot better than I have. (And that’s just the tip of the iceberg.) So here’s how I do it:

  1. Start with Retrosheet play-by-play data.
  2. Calculate run expectancy separately for each base, like this, for each season.
  3. Looking only at the lead baserunner, calculate the average destination run expectancy for each event. Everything is broken down by the following categories:
    • Number of outs remaining,
    • Event code (single, double, out, wild pitch, etc.),
    • Batted ball type,
    • Whether the batter was bunting,
    • Whether the ball was hit to the battery (pitcher/catcher), an infielder or an outfielder,
    • Whether the ball was hit to the left or right side of the field.
  4. Compare what a player did to the average.
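Steps 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the actual code behind the numbers in this post: the field names (`outs_remaining`, `event`, `dest_re` and so on) are hypothetical stand-ins rather than real Retrosheet column names, and the sample plays are made up. Each play is assumed to already carry the run expectancy of the base where the lead runner ended up.

```python
from collections import defaultdict

# Hypothetical field names, one per category in the list above.
GROUP_KEYS = ("outs_remaining", "event", "batted_ball", "bunt", "fielded_by", "side")


def group_key(play):
    """Bucket a play by the six categories."""
    return tuple(play[k] for k in GROUP_KEYS)


def average_destination_re(plays):
    """Step 3: mean destination run expectancy of the lead runner per bucket."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for play in plays:
        key = group_key(play)
        totals[key] += play["dest_re"]
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


def plus_minus(plays, baseline):
    """Step 4: sum each runner's difference from the bucket average."""
    result = defaultdict(float)
    for play in plays:
        result[play["runner"]] += play["dest_re"] - baseline[group_key(play)]
    return dict(result)


# Made-up sample: four plays in the same bucket, one runner takes the extra base.
bucket = {"outs_remaining": 3, "event": "single", "batted_ball": "ground",
          "bunt": False, "fielded_by": "outfielder", "side": "left"}
sample = [
    dict(bucket, runner="a", dest_re=0.5),
    dict(bucket, runner="b", dest_re=0.5),
    dict(bucket, runner="c", dest_re=0.5),
    dict(bucket, runner="d", dest_re=1.0),
]
baseline = average_destination_re(sample)
ratings = plus_minus(sample, baseline)
```

By construction the ratings across all runners in a bucket sum to zero, which is what "runs above/below average" means here.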

Let’s say you have a runner on first, no outs, and a ground-ball single is hit into left field. Most of the time the runner ends up on second; some of the time he ends up on third. If he ends up on second, he gets a (very slight) debit. If he ends up on third, he gets a credit. All of these changes are tracked and totaled up.
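To see why the debit is so slight, here is the arithmetic for that one situation with invented numbers. The two run expectancy values and the 25% share of runners reaching third are assumptions for illustration only, not figures from my data.

```python
# Hypothetical values for illustration only.
RE_ON_SECOND = 1.10          # assumed run expectancy when the runner stops at second
RE_ON_THIRD = 1.40           # assumed run expectancy when the runner reaches third
SHARE_REACHING_THIRD = 0.25  # assumed share of runners who take the extra base

# The average destination run expectancy sits close to the second-base value,
# because that is where most runners end up.
average_re = ((1 - SHARE_REACHING_THIRD) * RE_ON_SECOND
              + SHARE_REACHING_THIRD * RE_ON_THIRD)

debit = RE_ON_SECOND - average_re   # small negative number
credit = RE_ON_THIRD - average_re   # larger positive number


def season_total(times_on_second, times_on_third):
    """Total runs above/below average for this one situation."""
    return times_on_second * debit + times_on_third * credit
```

A runner who goes first-to-third exactly as often as the league does (here, once in every four chances) nets out to zero.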

Simple and easy, right? Here’s the top ten baserunning +/- seasons, 1953-2007:

YEAR_ID  PLAYER_ID  Name            TEAM_ID  PLUS_MINUS
1965     flooc101   Curt Flood      SLN      12
1976     patef101   Freddie Patek   KCA      12
2004     erstd001   Darin Erstad    ANA      11
1991     molip001   Paul Molitor    MIL      10
1978     puhlt001   Terry Puhl      HOU      10
2000     goodt001   Tom Goodwin     COL      10
1987     browj001   Jerry Browne    TEX      10
1974     bochb001   Bruce Bochte    CAL      10
1957     blasd101   Don Blasingame  SLN      10
1976     leflr101   Ron LeFlore     DET      10

You’ll note that the best baserunning season of the Retrosheet era was only worth 12 runs above average. Obviously you’d prefer a good baserunner to a bad baserunner, all else being equal, but baserunning definitely takes a backseat to hitting and defense.

Ten worst seasons?

YEAR_ID  PLAYER_ID  Name           TEAM_ID  PLUS_MINUS
2007     lodup001   Paul Lo Duca   NYN      -9
1959     thomf103   Frank Thomas   CIN      -9
1980     cruzj001   Jose Cruz      HOU      -9
1965     johnd103   Deron Johnson  CIN      -9
1962     brutb101   Bill Bruton    DET      -9
1976     sizet101   Ted Sizemore   LAN      -10
1974     darwb101   Bobby Darwin   MIN      -10
1999     stanm002   Mike Stanley   BOS      -10
1965     fairr101   Ron Fairly     LAN      -10
1964     bertd101   Dick Bertell   CHN      -13

UPDATE: This is too large for an EditGrid, so here’s a full spreadsheet, including career totals. Requires something that can read Excel files. Best I can do for y’all right now.

5 Responses to A quick look at baserunning

  1. Brian Cartwright says:

    Try grouping by run differential as well

  2. Pizza Cutter says:

    But he must be a good and valuable player… he’s fast!

  3. Colin Wyers says:

    I’m reluctant to add any more adjustments to the way I currently do it, Brian, because then you start to really shave down the sample sizes on the state-to-state transitions. Especially since I’m doing it season-by-season, it starts to get dangerous if you drill down any more. I’m sure there’s a way to handle it better, but a lot more work would have to go into it.
    And sportswriters say stuff like that, PC, but when it comes down to brass tacks, like the MVP award, they vote a Big Damn Slugger with no other positives second. It’s all lip service.

  4. Dan Novick says:

    “they vote for a Big Damn Slugger”
    Like Dustin Pedroia?
    I agree with you for the most part btw, I’m just giving you a hard time.

  5. Brian Cartwright says:

    This is an idea that I’ve actually had for close to 25 years, since I kept statistics and had all the play by play for a college summer league, but it’s still on my to-do list.
    The way I have it conceptualized, if there’s a rare grouping of events, the expected value will not be as accurate because it’s based on a much smaller sample size – but, if the player’s samples are weighted by how often that player is in each situation, then the effect in the final rating of a larger variance in the expected value of any subgroup will be minimized by the weighting.
    So, of any groupings that you have, calculate their expected rate (the league mean over x number of seasons). Find the number of times that a player was in each situation, and calculate the player’s weighted overall expected rate, then compare to the player’s observed rate, and convert to runs.
