The Middle 80%

In a recent interview with Kevin Orris at Major League Report, veteran lefty Jamie Moyer said something particularly interesting with regard to how he likes to evaluate himself and other pitchers.  According to Moyer, “I have a couple of outings a year where I am just not good.  But, if you look at any starting pitcher during the course of the season when you’re getting 30 or more starts, we’re all in the same boat.  It’s just how bad are you?  The way I look at it is if you take those 3-5 bad starts away and remove the 3-5 good starts, the bulk of your season is in the remainder of 25 or so starts.  If you pitch well in those games you have a chance to make big contributions to your ball club.”
While it’s probably not the ideal way to evaluate pitchers, it piqued my interest nonetheless.  Since the number of starts a pitcher makes in a season can constitute a small sample size–even in the early 1900s–a pitcher’s average Game Score may be inflated or deflated by 3-4 tremendous or terrible outings.  Now, this isn’t to say that these starts should be removed when evaluating said pitchers, but what happens if they are removed?  Would there be significant differences in the Game Score averages?  If Moyer is right–many could argue for and against his idea–that every pitcher making that many starts will have a couple of clunkers and a couple of standout performances, then looking at the remaining bulk would give a better sense of a pitcher’s general consistency.
With that in mind I probed the Baseball Reference Play Index for the highest average Game Scores in a single season from 2000-2007, among those making 30+ starts.  I then removed the top three and bottom three Game Scores from each pitcher and re-calculated their averages.  Since the pitchers all made between 30 and 35 starts, the six removed starts accounted for roughly 17-20% of their total outings (6 of 35 at one end, 6 of 30 at the other), leaving approximately the middle 80% to look at… hey, that sounds like a catchy title.
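The trimming step described above can be sketched in a few lines of Python. This is only an illustration: the function name `middle_average`, its `trim` parameter, and the sample Game Scores are invented for the example and are not taken from the Play Index data.

```python
def middle_average(game_scores, trim=3):
    """Average Game Score after dropping the `trim` best and `trim` worst starts."""
    if len(game_scores) <= 2 * trim:
        raise ValueError("need more than 2 * trim starts")
    ordered = sorted(game_scores)
    # Keep everything except the lowest `trim` and highest `trim` scores.
    middle = ordered[trim:len(ordered) - trim]
    return sum(middle) / len(middle)


# Hypothetical 30-start season (invented numbers, for illustration only).
season = [12, 25, 31, 40, 44, 47, 50, 52, 53, 55,
          56, 57, 58, 60, 61, 62, 63, 64, 65, 66,
          67, 68, 70, 71, 73, 75, 78, 84, 90, 96]

overall = sum(season) / len(season)
trimmed = middle_average(season)
print(f"overall {overall:.2f}, middle {trimmed:.2f}, change {trimmed - overall:+.2f}")
```

With 30 starts and `trim=3`, the function averages the 24 starts left in the middle, which matches the 20% removal for a 30-start season.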
The Play Index query brought back a plethora of names but the top twenty-five all happened to have averages of 60 or higher, so it seemed like a logical cutoff point.  Now, this analysis isn’t done to necessarily suggest we evaluate pitchers this way, by any means, but sometimes it’s just fun to look at interesting ideas and toy around with the numbers.  Perhaps we’ll find that the great or terrible starts really did have a true effect on the season even with their relatively small percentage of the whole.  For starters, here are the twenty-five pitchers, their seasons, and their overall average Game Scores:

  1. Randy Johnson, 2002: 67
  2. Randy Johnson, 2001: 67
  3. Johan Santana, 2004: 65
  4. Randy Johnson, 2004: 65
  5. Pedro Martinez, 2002: 65
  6. Curt Schilling, 2002: 64
  7. Randy Johnson, 2000: 64
  8. Roger Clemens, 2005: 63
  9. Johan Santana, 2005: 63
  10. Pedro Martinez, 2005: 63
  11. Ben Sheets, 2004: 63
  12. Mark Prior, 2003: 63
  13. Curt Schilling, 2001: 63
  14. Kevin Brown, 2000: 63
  15. Jake Peavy, 2007: 62
  16. Johan Santana, 2006: 62
  17. Jason Schmidt, 2004: 62
  18. Chris Carpenter, 2005: 61
  19. Jake Peavy, 2005: 61
  20. Oliver Perez, 2004: 61
  21. Kerry Wood, 2003: 61
  22. Andy Pettitte, 2005: 60
  23. Kevin Brown, 2003: 60
  24. Derek Lowe, 2002: 60
  25. Odalis Perez, 2002: 60

After removing the top and bottom three starts from each, here is a table showing the before and after photos, so to speak, ranked by differential (After minus Before).  A pitcher whose average rose from 60 to 62.5 once those starts were removed gets a 2.5; one whose average fell from 60 to 57.5 gets a -2.5.  In theory, those who benefited the most from three tremendous starts will see their averages decrease, while those who suffered the most from three bad starts will see their averages increase.

Player             Before    After    After - Before
Johnson, 2000      64        66.24     2.24
Clemens, 2005      63        65.08     2.08
Carpenter, 2005    61        62.93     1.93
Pettitte, 2005     60        61.78     1.78
Wood, 2003         61        62.77     1.77
Peavy, 2007        62        63.64     1.64
Santana, 2004      65        66.14     1.14
Peavy, 2005        61        62.08     1.08
Schmidt, 2004      62        62.92     0.92
Schilling, 2002    64        64.89     0.89
Martinez, 2002     65        65.88     0.88
Johnson, 2001      67        67.86     0.86
Johnson, 2002      67        67.79     0.79
Perez, 2004        61        61.79     0.79
Santana, 2005      63        63.67     0.67
Brown, 2003        60        60.50     0.50
Prior, 2003        63        63.42     0.42
Santana, 2006      62        62.32     0.32
Schilling, 2001    63        63.31     0.31
Brown, 2000        63        63.22     0.22
Lowe, 2002         60        60.08     0.08
Martinez, 2005     63        63.04     0.04
Sheets, 2004       63        62.57    -0.43
Johnson, 2004      65        64.55    -0.45
Perez, 2002        60        56.16    -3.84
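As a rough check on the table, the differentials can be recomputed from the Before and After columns. The sketch below is an illustration rather than the method used to build the table: the `rows` list simply re-enters the table's figures, the variable names are arbitrary, and the 2-point cutoff for a "notable" change is an assumption made for the example. Keep in mind that the Before column is rounded to whole numbers, so the differentials are approximate.

```python
# (player, season, before, after), re-entered from the table above.
rows = [
    ("Johnson", 2000, 64, 66.24), ("Clemens", 2005, 63, 65.08),
    ("Carpenter", 2005, 61, 62.93), ("Pettitte", 2005, 60, 61.78),
    ("Wood", 2003, 61, 62.77), ("Peavy", 2007, 62, 63.64),
    ("Santana", 2004, 65, 66.14), ("Peavy", 2005, 61, 62.08),
    ("Schmidt", 2004, 62, 62.92), ("Schilling", 2002, 64, 64.89),
    ("Martinez", 2002, 65, 65.88), ("Johnson", 2001, 67, 67.86),
    ("Johnson", 2002, 67, 67.79), ("Perez", 2004, 61, 61.79),
    ("Santana", 2005, 63, 63.67), ("Brown", 2003, 60, 60.50),
    ("Prior", 2003, 63, 63.42), ("Santana", 2006, 62, 62.32),
    ("Schilling", 2001, 63, 63.31), ("Brown", 2000, 63, 63.22),
    ("Lowe", 2002, 60, 60.08), ("Martinez", 2005, 63, 63.04),
    ("Sheets", 2004, 63, 62.57), ("Johnson", 2004, 65, 64.55),
    ("Perez", 2002, 60, 56.16),
]

# Differential = After minus Before, rounded to two decimals like the table.
diffs = [(player, year, round(after - before, 2))
         for player, year, before, after in rows]

increased = [d for d in diffs if d[2] > 0]
decreased = [d for d in diffs if d[2] < 0]

# Assumed cutoff: treat a swing of 2 or more Game Score points as notable.
NOTABLE = 2.0
notable = [d for d in diffs if abs(d[2]) >= NOTABLE]

print(len(increased), "increased;", len(decreased), "decreased")
for player, year, diff in notable:
    print(f"{player}, {year}: {diff:+.2f}")
```

Under that assumed cutoff, only Johnson's 2000, Clemens' 2005, and Perez's 2002 seasons are flagged.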

What initially stands out is that so few of these twenty-five seasons actually saw their average Game Score decrease. A closer look shows that not many increased meaningfully, either. Strictly speaking, all but the final three “increased,” but the increases were so small that I would call Johnson’s 2000 and Clemens’ 2005 the only two seasons to experience somewhat significant gains, while Odalis Perez’s 2002 season took a big hit. Everyone else may have increased or decreased a bit, but for an initial look at something like this, removing these starts does not seem to have much of an effect on the overall averages. So, Jamie, for now, it seems that it isn’t really hurting you to look at pitchers this way, but there really isn’t any need to get rid of the starts.


6 Responses to The Middle 80%

  1. dan says:

    Eric, I know you created your own version of Cy Points, but have you looked into doing something similar to improve Game Score? It seems to work pretty well in most cases, but it doesn’t account for park, quality of opponent, defense, and probably some other factors that I’m not thinking of at the moment.

  2. I have thought about it but apparently there was a presentation at SABR this weekend that showed Game Score holds up fairly well. I’m trying to get my hands on it or find a detailed recap from someone who saw it, though. The tough thing is that some stats differ in frequency through the decades so something like game score would, in theory, favor the past due to pitchers going more innings as we backtrack through time.
    Then again, perhaps this presentation proved that theory wrong.
    Something we could potentially do without even revising the stat is just breaking it down into different groups. We could look at the 35 Game Scores and break them into different splits: pitchers park vs hitters park, good opponent vs bad opponent, etc.
    Definitely something to consider.

  3. Ok great, Jeff Angus sent me his SABR presentation on Game Scores so I’m going to see what he finds and then report back.

  4. Okay, so the gist of his presentation, having read the first half of it (it’s long and thorough), is that the Game Score stat still holds up from its inception in 1987. Now, I haven’t finished his presentation so I’m not sure if it delves into the frequencies of the past, say, the pre-1987 years, but if I were to do anything with it, it would involve finding ways to better adjust for what was more frequent and relevant in the past.
    Unfortunately, the game-by-game logs don’t exist prior to 1956, or at least not enough of them exist yet to do something very thorough on the subject.
    Ideally, though, what I would do is not count innings as highly in the way-back Game Scores, because it was far more common for pitchers to go the distance than the 5.1 or so innings they go today.

  5. dan says:

    Yea I read about his presentation briefly, that’s what made me wonder if you’d done anything with it.

  6. Devon Shurick says:

    Eric, isn’t there something fishy about only taking the top 25 pitchers to run through this algorithm? My intuition says that good pitchers stay consistent, and occasionally become riddled with a few bad games, thus accounting for so many positive differentials. This metric might be more useful when ball clubs are trying to select lesser, but more consistent pitchers.
    I am only just getting into the field of Sabermetrics and I bet (and hope) that there is a better measure of consistency, but I am just interested in seeing players from multiple tiers, or a random sample, etc.
