StatSpeak World Famous Roundtable: April 28

What better cure for the Mondays can there be than a StatSpeak roundtable?  Today, StatSpeak is proud to welcome Geoff Young of Ducksnorts (where Geoff writes about the San Diego Padres), as well as Baseball Digest Daily and The Hardball Times.  Geoff joins Eric and Pizza in a discussion of what you should be seeing on the bottom of your screen, Trevor Hoffman, and which player who’s had a crazy good start to the season has the best chance of keeping things up.
Question #1: Is Trevor Hoffman a small sample size victim or toast?
Geoff Young: I’m going to cheat and say a little of each. On the one hand, I don’t feel comfortable making a firm judgment based on eight games to start the season. On the other, Hoffman is 40 years old and he hasn’t been dominant since the ’90s. Skills erode.  From a visual standpoint, what concerns me most is his decreased ability to locate the fastball. He used to be deadly accurate with that pitch, but not so much thus far in ’08. Whenever Hoffman has had this sort of problem in the past, he’s been able to correct it quickly. We’re still waiting for that to happen this time, and at his age, there’s no guarantee that it will.
From a statistical standpoint, the declining strikeouts are a huge yellow flag. His K/9 over the past five years in which he was healthy paints a troubling picture:
2002: 10.47
2004: 8.73
2005: 8.43
2006: 7.14
2007: 6.91
There isn’t a way to cast those numbers in a positive light. At the same time, largely because of smarts and great control, Hoffman has managed to remain effective despite decreased dominance. Here’s his ERA+ over that same period:
2002: 137
2004: 168
2005: 130
2006: 189
2007: 135
There’s no identifiable pattern, but from looking at the K/9, it’s clear that he’s more mirrors than smoke at this point. The agonizing part of the equation, from the standpoint of the Padres and their fans, is that by the time enough of a sample is gathered, it’s probably too late to make the right decision. This, of course, is why hindsight is 20-20. I’d say give him another month or so to right the ship. If Hoffman still hasn’t figured it out by the end of May, then slap together a new plan.
Eric Seidman: I would personally be inclined to think it’s a sample size issue right now but there seem to be luck-based factors at work, as well.  Hoffman’s K/BB has plummeted since 2004, dropping from 6.63 to 2.93 last year (it is currently at 2.00); however, his current K/9 is actually higher than in 2006 and 2007.  As of this moment his percentage of line drives has decreased from 17% to 7% while his grounder frequency has jumped from 30% to 39%.  His BABIP is currently the second-highest it has ever been in his career, likely as a result of the grounder increase.  Labeling is a big factor in situations like this, as I’m sure Pizza can attest, because once we say something about a person, every subsequent action is viewed in this light.  When Brad Lidge gave up the home run to Pujols in the playoffs, he was labeled a mentally bruised and battered pitcher.  Playing directly into that convenient narrative, Lidge posted a 5.28 ERA the following year and lost his spot.  Most in the media wrote him off as having a fragile psyche because he followed a devastating home run with a seemingly subpar season.  His FIP in 2006 was 3.84.  His FIP last year, when he had a solid on-the-surface statistical season?  3.84.  He was unlucky in 2006 and a tad lucky last year but because he was given the label of being toast we viewed every blown save as more evidence of his demise.  With regard to Hoffman, it seems that he is currently a bit unlucky: his FIP is 1.5 points lower than his ERA, on top of the aforementioned factors.  I’d love to revisit this question in June or July to see where he stands.  He won’t be good forever but I think this is a bit overblown.  (Ed. note: When did Eric get a degree in psychology? – P.C.)
Pizza Cutter: Mmmmm, I like toast… Well, it takes about 150 PA’s to get even some basic pitching stats to stabilize enough that they can really be counted as reliable.  As I write this, Trevor has faced 40.  His BABIP is high, as is his HR/FB, so luck has not been his friend.  On that evidence, I’d say he’s just the victim of a small sample size.  But, there are a few concerning signs.  Trevor hasn’t lost much velocity off his pitches (assuming that FanGraphs has good data on Hoffman’s pitch selection), but he’s been throwing more sliders than normal, at the expense of his changeup.  Why would a pitcher who’s had success in the past mess around that drastically with his pitch selection?  (An injury?  A lack of confidence in the changeup?  Maybe he has new confidence in the slider now.)  Plus, both last year and this year, he’s seen a decent-sized jump in his fly ball rate.  Last year, he gave up a ridiculously low HR/FB so he got away with it.  This year, he might not be so fortunate.  Closers are usually brought into high leverage situations where a home run is catastrophic for their team.  Sure, Hoffman pitches at Petco, so it might not be as big a concern half the time, but still, it’s not like it’s a good idea to be giving up so many fly balls.  Then, there’s the issue of his strikeout rate creeping downward (although his current rate is creeping back up to 8 per 9 innings), and his walk rate creeping upward.  Since his velocity isn’t down, perhaps it’s his control that is fading?  He is 40 years old.  I have a hard time reading too much into stats this early in the season, but I do see some signs for concern, even dating back to last season.


A Closer Look at Closers – Part One

Over the course of the next few weeks I will be primarily working with Closers – trying to determine the most effective ways to evaluate talent and quality at an inconsistent position that sure receives some hefty and consistent dollars.
This first part will introduce my opening step to a weighted formula to determine the value of a Closer, as well as discussing what a Closer is, and how we currently evaluate them.
Though this first part will focus solely on 2007, my study also consists of data from 2005 and 2006.
When compiling my data and examining game log after game log, I decided that my study and research should focus on some consistency, which can be hard to find for a Closer. 
I looked at the National League in 2005, 2006, and 2007, and wanted to limit my group to include only those who met certain criteria.  Initially I thought that anyone with 25+ saves in all three seasons should qualify.
Then, I actually saw the numbers and realized that would limit my study to include only Jason Isringhausen, Trevor Hoffman, Billy Wagner, and Chad Cordero.
Suffice it to say, I wanted to have some more people in there.  With that in mind, I altered my criteria to simply those who actually were closers during those three seasons.  I also took into account the fact that some were demoted, promoted, or injured, and so my criteria called for 15+ saves in 2005, 2006, and 2007.
With those numbers, the nine Closers who find themselves under my statistical microscope are – Isringhausen, Hoffman, Wagner, Chad Cordero, Francisco Cordero, Brad Lidge, Jose Valverde, Brian Fuentes, and Ryan Dempster.
Yes, Francisco Cordero was in the AL for 2005 and some of 2006, however he has recorded 103 saves in the last three seasons and 60 of them were in the NL.  Plus, the whole idea of working with Closers stemmed from the idea that an inconsistent one-inning pitcher could receive a 4 yr/40 mil deal.
Simply stated, a Closer is a pitcher called on in the 8th or 9th innings, whose job is to seal the win for his team.  If he does his job he records a “Save.”  If the other team comes back to tie the game, he records a “Blown Save.” 
If you asked anyone about those stats before 1969, though, they would assume you were discussing hockey or soccer since saves are a relatively new statistic.
There are three ways a pitcher can record a save. I know this is a recap for many readers but it is important in the grand scheme of my study. The first way, which is how most people generally describe saves, involves the pitcher entering in either the 8th or 9th inning, with a lead of three or less, and preventing the other team from coming back to tie.
The second way is contingent upon when you enter the game and in what situation.  If you enter the game with the tying run on base, no matter the lead (usually extends it to a 4-run lead), and prevent the team from tying, you get a save.
The third way, which is how most middle relievers will rack up their 1-3 random saves per season, involves a pitcher going for the final three innings of the game – regardless of the score.  If the Phillies lead the Braves 9-1 and Ryan Madson pitches the 7th, 8th, and 9th, he gets a save.
If there are different types of save categories, doesn’t that mean there are different save types for each category?
Yes.  Plenty.  Think of it this way.  If you enter the 9th inning with only one out to go, and a 5-3 lead and bases empty, and you end the game, you get a save.  If you enter the 9th inning with only one out to go and the bases are loaded, and you end the game, you get a save.  One is clearly harder to do than the other and has a higher risk of resulting in a blown save, yet each ultimately results in the same statistic – a save.
With that in mind, I looked at the 9th inning and thought of all the possible situations that someone could receive a save.  In the 9th inning, there are 72 different ways to record a save, excluding what the pitcher does in the inning. 
If we count what the pitcher does, either giving up a run with a two-run lead or two runs with a three-run lead, and so forth, in the 9th inning there are 144 total ways to record a save.  I will get more into these different ways in Part Two, however the basic idea is that there are eight situations of baserunners (empty, 1st, 2nd, 3rd, 1st and 2nd, 1st and 3rd, 2nd and 3rd, bases full) and 18 different variations of these eight situations.  These variations include entering with 1 out, with 2 outs, with no outs, entering with 1-run, 2-run, or 3-run leads, and more of the same.
A pitcher can record a save in the 9th inning in 144 different ways, depending on how many outs he records, what the baserunning situation is, and how many runs he gives up.  Yes, this can be said for many other statistics, but Saves generally only span two innings at most, and so the huge number of different types means a bit more here.
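For anyone who wants to check the counting, here is a minimal sketch of where 72 and 144 can come from. It assumes the 72 entry situations are eight base states times three out counts times three lead sizes, and it models "what the pitcher does" as a simple yes/no flag for allowing a run. That flag is my own placeholder, not the breakdown promised for Part Two, and it is crude: a closer protecting a 1-run lead cannot allow a run and still get the save.

```python
from itertools import product

# Eight baserunner situations listed in the text.
BASES = ["empty", "1st", "2nd", "3rd",
         "1st and 2nd", "1st and 3rd", "2nd and 3rd", "bases full"]
OUTS_ON_ENTRY = [0, 1, 2]   # outs already recorded when the closer enters
LEADS = [1, 2, 3]           # size of the lead in runs

# Situations at the moment the closer enters the 9th inning.
entry_situations = list(product(BASES, OUTS_ON_ENTRY, LEADS))

# Hypothetical placeholder for "what the pitcher does in the inning".
RUNS_ALLOWED = [False, True]
all_situations = list(product(BASES, OUTS_ON_ENTRY, LEADS, RUNS_ALLOWED))

variations_per_base_state = len(all_situations) // len(BASES)

print(len(entry_situations))       # 72
print(len(all_situations))         # 144
print(variations_per_base_state)   # 18
```

Under these assumptions the 18 variations per base state are simply 3 out counts times 3 leads times the 2 outcomes.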
I am not dealing with “clutch” in my study.  To read some fascinating insights into relief pitching and relief clutch, read Pizza Cutter’s articles on the subject.
Instead, I am looking at what actually happens and how it happens, not the potential of why it happens.
Many people will look at two, and only two, stats when determining the quality of a closer – total saves, and percentage of successful saves (saves/save opportunities).  It has been pounded into our heads as a barometer and these statistics are supposed to inform us that the “best” closers are the ones with either the most saves or the fewest blown saves.
What I am contending is that if there are 144 different types of 9th inning saves, and the barometer is the sum of all converted opportunities, regardless of the type of save, the needs of your team, and the situation at hand, it is impossible to equate total saves to quality.
Think of it this way – Closer A and Closer B both have 30 saves.  Closer A has 6 blown saves while B has 4 blown saves.  With those numbers, which are usually the only ones readily available, we assume that Closer B was better.  After all, he blew fewer saves.  But what if the 4 saves he blew were all 3-run leads with bases empty and only 1 out to go in the 9th inning, which is the least dangerous save situation of the whole 144?  And what if the 6 blown saves of Closer A were all games he came in with a runner on third base and no outs, or games where he entered in the 8th inning?
It becomes very difficult to gauge the “better” factor with just those numbers.
Regardless, even looking at hypotheticals like that, which take into account different types of saves, we cannot determine true quality because it does not take into account the needs of the teams these closers are on – which is ultimately the point of the closer.
When we discuss who the best closer is, what are we asking?  Are we wondering who was best with the most pressure?  Who posted the best numbers?  And if we are talking about numbers, what numbers are the best numbers?
These questions, and more, can cause a headache.  My point here is that we cannot compare closers to each other or determine true quality and effectiveness without analyzing what each closer did for his team.
In order to do this we need to find the number of games that each team won in a save situation (meaning no walk-off wins or 3-inning saves) and add it to the number of Blown Save-Losses because that tells us the true number of save opportunities each team had.  I call that a TSO – Team Save Opportunity.
Jose Valverde had 47 saves this year, leading the NL; however, the DBacks had 64 Team Save Opportunities, whereas Ryan Dempster’s Cubs only had 48 of those games – sixteen fewer than the DBacks.
Dempster had 28 saves, far fewer than Valverde, but his conversion rate (28/31) was higher.  Valverde had more saves, but he also had more opportunities because his team played a different way and, as a team, played more close games that needed saving.  And even though Dempster’s percentage was higher, he also had fewer opportunities to blow saves.  If he had the 54 appearances of Valverde, he may have also blown more saves and had a worse conversion rate.
What we need to do here is level the field of play between those on teams with many save opportunities and teams with fewer.  After all, it is not Dempster’s fault that the Cubs had a better offense and blew teams out more than the DBacks.  He was not needed as often as Valverde and so his raw save and blown save totals do nothing but compare one number of Dempster’s to the overall need of Valverde and the needs of the Diamondbacks.
To really do this, the effectiveness of one pitcher to his team needs to be compared to the effectiveness of another pitcher to another team.
The DBacks had 64 TSO’s and Valverde had 54 opportunities.  This means that Valverde appeared in 54 of the 64 total save opportunities for his team, or 84.4 %.
That 84.4 % tells us he was durable since the team had so many potential save opportunities and his appearances were so high.
The Cubs only had 48 team save opportunities and Dempster only had 31 attempts.  His appearance rate would be 31 of 48, or 64.6 %.
Yes, Dempster was hurt, but this does make sense because you cannot be more effective (positive or negative) for your team if you are not involved as often as possible.  The fact that other pitchers were involved in over 1/3 of the Cubs save opportunities says that Dempster was not truly effective in making appearances.
To see the order of the nine closers in terms of Appearance Rate, look at the table below. The table shows the saves and save opportunities of the individual, as well as the total real save opportunities of the team, and then the Appearance Rate.

Closer         Saves   Save Opps   Team Save Opps   Appearance Rate (%)
F. Cordero     44      51          58               87.9
Valverde       47      54          64               84.4
Hoffman        42      49          60               81.7
Wagner         34      39          48               81.3
C. Cordero     37      46          59               78.0
Isringhausen   32      34          46               73.9
Dempster       28      31          48               64.6
Lidge          19      27          55               49.1
Fuentes        20      27          59               45.8
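If you want to reproduce the table yourself, a minimal sketch follows. The numbers are copied from the table above; the function and variable names are my own and only illustrative.

```python
# name: (saves, individual save opportunities, team save opportunities)
CLOSERS = {
    "F. Cordero": (44, 51, 58),
    "Valverde": (47, 54, 64),
    "Hoffman": (42, 49, 60),
    "Wagner": (34, 39, 48),
    "C. Cordero": (37, 46, 59),
    "Isringhausen": (32, 34, 46),
    "Dempster": (28, 31, 48),
    "Lidge": (19, 27, 55),
    "Fuentes": (20, 27, 59),
}

def appearance_rate(save_opps, team_save_opps):
    """Percent of the team's save opportunities the closer appeared in."""
    return 100.0 * save_opps / team_save_opps

# Rank closers from highest to lowest Appearance Rate.
ranked = sorted(
    CLOSERS,
    key=lambda name: appearance_rate(CLOSERS[name][1], CLOSERS[name][2]),
    reverse=True,
)

for name in ranked:
    _, opps, tso = CLOSERS[name]
    print(f"{name:<13} {opps:>2}/{tso:<2} {appearance_rate(opps, tso):5.1f}")
```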

Despite this stat being useful to tell us how durable or useful a closer can be in making appearances based on team need, it does not tell us how successful they were in actually converting these saves. Just because Valverde appeared in 54 of 64 team save opportunities for the DBacks does not mean he converted 54 saves – just that he made 54 appearances.
After careful thought, I came up with “Save Rate”, which takes the total number of saves by a closer and divides it by the total number of team opportunities.  This statistic takes the Appearance Rate to the next level.  Since closers can have a high Appearance Rate but low number of saves or low save percentage, Save Rate balances that out.
Save Rate lets us know how successful a Closer was in recording saves relative to the percentage of his team’s save opportunities.  It tells us how successful one was based on how effective he was in fulfilling his team’s need.
Essentially, it rewards those with more saves in fewer team opportunities, and takes away from those with fewer saves in more opportunities.
Valverde had 47 saves out of 54 chances, and his team had 64 real save opportunities.  His Save% would be 47/54 and his Appearance Rate would be 54/64.
His Save Rate would be 47 (# of saves)/64 (# of total team chances for a save), which comes out to 73.4 %, meaning that Valverde successfully saved 73.4 % of the DBacks team save opportunities.
Francisco Cordero of the Brewers was 2nd in the NL with 44 total saves.  He also blew seven saves, giving him 51 opportunities.  His Save% was 44/51, very similar to Valverde’s, but his Appearance Rate was higher because the Brewers had six fewer team save opportunities and he only had three fewer appearances than Valverde.
His Appearance Rate was 51/58, or 87.9 %.  He appeared in more games proportionate to his team’s need.
His Save Rate would be 44/58, or 75.9 %, higher than Valverde’s.
To see the nine closers in order of Save Rate, look at the table below.  Again, it lists the total saves and opportunities of the individual, as well as the total team opportunities, and then the actual Save Rate.

Closer         Saves   Save Opps   Team Save Opps   Save Rate
F. Cordero     44      51          58               75.9 %
Valverde       47      54          64               73.4 %
Wagner         34      39          48               70.8 %
Hoffman        42      49          60               70.0 %
Isringhausen   32      34          46               69.6 %
C. Cordero     37      46          59               62.7 %
Dempster       28      31          48               58.3 %
Lidge          19      27          55               34.5 %
Fuentes        20      27          59               33.9 %
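One way to see how the three rates fit together: Save Rate is just Save% multiplied by Appearance Rate, because the individual opportunities cancel out. The sketch below checks that with exact fractions for Valverde and Francisco Cordero, using the numbers from the tables above. The function names are my own.

```python
from fractions import Fraction

def save_pct(saves, save_opps):
    return Fraction(saves, save_opps)

def appearance_rate(save_opps, team_save_opps):
    return Fraction(save_opps, team_save_opps)

def save_rate(saves, team_save_opps):
    return Fraction(saves, team_save_opps)

# (saves, individual save opportunities, team save opportunities)
valverde = (47, 54, 64)
f_cordero = (44, 51, 58)

for saves, opps, tso in (valverde, f_cordero):
    # (saves/opps) * (opps/tso) == saves/tso
    assert save_pct(saves, opps) * appearance_rate(opps, tso) == save_rate(saves, tso)

print(f"{float(save_rate(47, 64)):.1%}")  # Valverde
print(f"{float(save_rate(44, 58)):.1%}")  # F. Cordero
```

So a closer can climb the Save Rate list either by converting a higher share of his own chances or by taking a higher share of his team's chances.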

It makes sense that Cordero would be higher because even though his save totals and appearance totals were slightly lower, he was involved in a higher percentage of his team’s chances and he converted saves at almost an identical number and percent.  Basically, he had fewer opportunities and still did the same exact thing – not the same ratio, but the same thing.
This does not necessarily mean Cordero had a better season.  This is merely one part of a two or three part article series and Save Rate is only the first part to a weighted system that should be able to determine who the best Closers are based on statistics that essentially define a good Closer.
Next week I will get into the different types of saves featured in the data sheets and discuss their importance in determining quality and effectiveness. WPA and Win Predictors will be discussed as well.
I will also look at raw numbers to help come up with the Seidman Closer Model to properly evaluate Closers.
In closing (pun very intended), I just want to add that the Closer position has become such a fickle one over the years that these evaluations need to be done on a year to year basis.  Jose Valverde was arguably one of the best NL Closers in 2007, and somewhat of a replacement, or makeshift closer, in 2005.  Brian Fuentes was dynamite in 2005 and still pretty good in 2006, yet so bad in 2007 that he lost his job.
It is remarkable how inconsistent Closers are, and that is one of the primary reasons (along with playoff success) why Mariano Rivera will go down as the greatest ever.
Lastly, of the nine closers used in this ongoing study:

  • Lidge and Fuentes were demoted in 2007
  • Lidge and Valverde were traded to new teams
  • Dempster is likely going back to the starting rotation
  • Francisco Cordero signed a huge four-year deal with a new team
  • Billy Wagner changed teams from 2005 to 2006

The only NL Closers that have actually kept their job for the same team between 2005 and 2007 are – Trevor Hoffman, Jason Isringhausen, and Chad Cordero.

2007 NL Starting Pitching Analysis

When it comes to analyzing and comparing pitchers, those conducting the comparisons will often find themselves in a tricky situation.  Sure, certain pitchers are better than others, but what are they specifically better at? 

How can we conduct an honest analysis when there are so many variables to consider?  And how can we truly determine which pitchers were better than others when some are on terrible teams with no run support and others are on tremendous teams with tons of run support?
The first step is to determine what we are measuring.  If we want to know who the best strikeout pitcher is, we should look at the raw total for strikeouts and also an average of K/IP, since some guys will make fewer starts than others.  To figure out who walks the least, we measure the number of walks each pitcher gives up and a walk-IP ratio.
These measurements are contingent on one category, though, and cannot tell us who is better or more effective than the rest.  All of the research and ideas presented in this article are designed to measure the “effectiveness” of a pitcher. 
In order to determine this effectiveness, a whole heck of a lot of numbers need to be measured and properly weighted/scaled so that everybody has a fair shot – whether or not they are on a great team.
I took the 1-3 best pitchers from each National League team and entered their statistics into a database, measuring everything from their raw Innings Pitched totals to their Adjusted Quality Start % (you’ll read more on that below).  After entering all of the statistics, and crunching numbers until my brain turned to mush, I came up with my weighted points system.  I assigned the corresponding point totals and added everything up to determine what I feel is a very accurate measurement of pitching effectiveness amongst the NL’s best. 
This was not applied to every single NL Pitcher in 2007 (I will do that another time) but rather amongst these 30 selected #1, #2, or #3 starters.  For instance, a guy like Jeff Suppan may have been more effective than Jason Bergmann but I wanted to have at least one person from each team.
The system is not 100% perfect and does not take into account every single statistic (do you know how many statistics there are??), but it definitely levels the playing field between those on good or bad teams, those injured/called up or just plain bad, and those who got lucky or unlucky with run support.  The points are assigned based on the areas I, as an intense student of the game, feel are most important to determine true effectiveness. 
The basic idea of this system is to measure the true quality of a pitcher over his season – i.e., what would happen if a pitcher was rewarded every time he pitched well and discredited every time he pitched poorly – something that happens perfectly just about 0% of the time. 
We will begin by going over the statistics involved, what their points scale was, and why they are used.  The idea behind these corresponding point totals is to properly weight the areas that most people intuitively attribute to success and quality.
The points given to each statistical subset are designed to separate the aces from the workhorses and the workhorses from the seemingly replacement level pitchers.  They may seem arbitrary and could be replaced with different numbers, or fractions/decimals; however, the difference between the points in subsets was based on the number of pitchers who fall into certain categories.
In order to be as effective as possible, a pitcher needs to make as many starts as he can.  How can we say that a pitcher with 14 starts is more effective than one with 34-35, even if his numbers in those 14 starts are tremendous and the numbers of the one with 34-35 are a bit worse?  His numbers may be better than the pitcher with 35 starts, however the latter pitcher was involved in 21 more games and proved to be durable enough to pitch an entire season, and solid enough to maintain his SP status for 162 games. 
This does not mean that a pitcher with 35 starts is necessarily “better” than one with 14-16, but rather he is more effective because he is involved in more of his team’s season. 
If the pitcher with 14-16 starts posted the same numbers in 32 starts, it would not be a contest.  But, he didn’t – it was only 14-16.  You cannot have as much of an effect on your team (actual play, not motivational or anything) unless you are out there as often as possible.
***What the end result of this effectiveness points system showed is that those with average numbers, over 30+ starts, were as effective as, or slightly better/worse than, those with good numbers over 16-20 starts.***
If somebody makes only 14 starts in a season, it could be because he was injured for half of the season or was called up from the minors during the season, so he should not be penalized with negative points for that – he just should not be rewarded as highly as someone with 30+ starts.

  • if over 30 starts, +5
  • if 25-29 starts, +3
  • if 20-24 starts, +2
  • if under 20 starts, 0

Just like Games Started, IP can only get you positive numbers, because the low raw number of IP can be attributed to injury or a midseason call-up.  Those with more IP get higher point totals, though.  The reason for the lowest point total for under 100 innings is that you were not necessarily a bad pitcher, but the lack of innings (whether due to injury or a call-up) limits the effectiveness.

  • if 230+, +8
  • if 220-229, +7
  • if 200-219, +5
  • if 150-199, +3
  • if 100-149, +2
  • if under 100, +1

IP/Game is where negative numbers can begin.  If you were hurt, or called up from the minors, you are not penalized with negatives for the raw number of innings pitched or games started, but if you posted a high number of starts and low number of innings, this statistic will bite you in the rear.  IP/Game separates the hurt or called up from the downright below average or bad.  It also helps reward those with a couple fewer starts than others but with more raw innings pitched.  These types of pitchers were in the same GS range but some went deeper into games than others.  Nobody averaged over 7 IP/gm, so we start lower.

  • if 6.5-7 IP/gm, +7
  • if 6.0-6.49 IP/gm, +5
  • if 5.5-5.99 IP/gm, +3
  • if 5.0-5.49 IP/gm, 0
  • if below 5.0 IP/gm, -5
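A minimal sketch of the IP/Game lookup is below. Two things in it are my own assumptions: a value sitting exactly on a band edge (5.5, 6.0, 6.5) is given to the higher band, and anything above 7 gets the top score since nobody reached it. The helper also converts box-score innings, where "216.1" means 216 and one-third, into a true decimal before dividing.

```python
def innings_to_float(ip_text):
    """Convert box-score innings such as '216.1' (216 and 1/3) to a decimal."""
    whole, _, thirds = ip_text.partition(".")
    return int(whole) + int(thirds or 0) / 3

def ip_per_game_points(ip_per_game):
    """Points for average innings per start, per the bands listed above."""
    if ip_per_game >= 6.5:
        return 7
    if ip_per_game >= 6.0:   # assumption: edge values go to the higher band
        return 5
    if ip_per_game >= 5.5:
        return 3
    if ip_per_game >= 5.0:
        return 0
    return -5

# Example: 216.1 innings over 34 starts.
average = innings_to_float("216.1") / 34
print(round(average, 2), ip_per_game_points(average))
```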

If you cannot average at least 5 innings per game, you should not be a starting pitcher.  Even Adam Eaton averaged over 5 IP/gm in 2007.
Quality Starts can be an inaccurate statistic because it takes into account games in which a pitcher goes 6+ innings and gives up no more than 3 earned runs… and nothing else.
If a pitcher goes 8.1 innings and gives up 4 runs, it is arguably the same ratio and an equal game in terms of quality, but does not get counted as a quality start.
With that in mind, I came up with the stat of Adjusted Quality Starts, which takes into account all regular quality starts as well as games in which someone goes 7.2-9 innings and gives up no more than 4 runs.  This measures the true number of games in which a pitcher had a good-great performance.
***If you wonder why it is 7.2 IP, instead of 8, the number was derived from the number of times a pitcher was lifted after 7.2 IP for a specialist, or other sort of reliever, and from the sheer low average of innings pitched/game by a starter this year.  Reaching the 7th inning is now a great feat, let alone coming within one out of finishing the 8th.  Though the previous ratio for a QS was 2:1, due to the data mentioned above, going an extra 1.2 IP to get to 7.2 IP merits being able to give up one more run.***
I used the percentage of AQS to the total number of Games Started to measure effectiveness in this area.  Someone over 75% almost always pitches a good-great game, whereas someone under 50% only pitches a good game less than half of the time – not very effective.

  • if AQS % is 75% or above, +5
  • if AQS % is 67-74%, +3
  • if AQS % is 50-66%, 0
  • if AQS % is below 50%, -3

If you’re keeping score at home, AQS = 6+ IP with ER <= 3, OR 7.2+ IP with ER <= 4, where <= is the blog version of less than or equal to. 
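The AQS rule is easy to get wrong by a third of an inning, so here is a minimal sketch that counts outs instead of innings. It assumes innings arrive in box-score notation ("7.2" means 7 and two-thirds, or 23 outs) and that earned runs are what count; the function names are my own.

```python
def outs_recorded(ip_text):
    """Turn box-score innings such as '7.2' into a count of outs (23)."""
    whole, _, thirds = ip_text.partition(".")
    return 3 * int(whole) + int(thirds or 0)

def is_quality_start(ip_text, earned_runs):
    """Regular quality start: 6+ innings, no more than 3 earned runs."""
    return outs_recorded(ip_text) >= 18 and earned_runs <= 3

def is_adjusted_quality_start(ip_text, earned_runs):
    """AQS: any quality start, or 7.2+ innings with no more than 4 earned runs."""
    long_outing = outs_recorded(ip_text) >= 23 and earned_runs <= 4
    return is_quality_start(ip_text, earned_runs) or long_outing

# The 8.1 IP, 4 ER game from the text: not a QS, but it is an AQS.
print(is_quality_start("8.1", 4), is_adjusted_quality_start("8.1", 4))
```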
In addition to AQS, something that needs to be taken into account is how often a pitcher threw a complete game, since they are so rare.  We also need to take into account shutouts, since they occur even less often. 

  • For every CG, +2
  • For every SHO, additional +1

***NOTE: Aaron Harang had two games in 2007, one where he went 9 IP, and one where he went 10 IP, when he did not get a decision.  Even so, I am counting these 2 as a combined 1 CG, since he went 9+ innings.***
W-L Records are the most deceiving statistics because they do not take into account the true quality of the games pitched.  Just because a pitcher goes 14-7 does not mean he was necessarily a great pitcher.  He could have pitched terribly and had great run support in 10 of 14 wins, but brilliantly with terrible run support in the 7 losses.
The whole point of the adjusted W-L records is to credit every AQS, since that means you pitched well and should be rewarded, even if your team (offense or bullpen) does not help you. 
After all, Ian Snell cannot control the Pirates’ offense.  It is not his fault that 4 of his 12 losses were “Tough Losses” and all 11 of his No-Decisions were games in which he pitched brilliantly and had an AQS, yet he received little to no offense to help garner him a ‘W’.
With that in mind, I changed W-L to the following 5 stats:

  • Cheap Wins: wins in which one does not get an AQS (-1)
  • Tough Losses: losses in which one does get an AQS (+2)
  • Legit Wins: wins in which one does get an AQS (+2)
  • Legit Losses: losses in which one does not get an AQS (-2)
  • ND-AQS: no-decisions in which one gets an AQS (+1)
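To make the five categories concrete, here is a minimal sketch that labels each start from its decision and whether it was an AQS, then totals the points. The six-game log is made up purely for illustration, and a no-decision without an AQS is scored as 0 because it carries no points in this system.

```python
POINTS = {
    "Cheap Win": -1,
    "Tough Loss": 2,
    "Legit Win": 2,
    "Legit Loss": -2,
    "ND-AQS": 1,
    "ND-non AQS": 0,  # not scored in the system, so worth nothing here
}

def classify(decision, aqs):
    """decision is 'W', 'L', or 'ND'; aqs is True for an Adjusted Quality Start."""
    if decision == "W":
        return "Legit Win" if aqs else "Cheap Win"
    if decision == "L":
        return "Tough Loss" if aqs else "Legit Loss"
    if decision == "ND":
        return "ND-AQS" if aqs else "ND-non AQS"
    raise ValueError(f"unknown decision: {decision!r}")

def decision_points(game_log):
    """Sum the points for a list of (decision, aqs) pairs."""
    return sum(POINTS[classify(decision, aqs)] for decision, aqs in game_log)

# Hypothetical six-start log, one start of each kind.
sample_log = [("W", True), ("W", False), ("L", True),
              ("L", False), ("ND", True), ("ND", False)]
print(decision_points(sample_log))  # 2 - 1 + 2 - 2 + 1 + 0 = 2
```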

I received some questions for how these numbers came to be, and to keep it simple, the statistics that actually have an effect on the W-L record are valued higher (negatively and positively) than the statistics like ND-AQS, which prevent a pitcher from winning but do not hurt him with a loss.
ND-non AQS is not used here for the same reason that Cheap Wins is only negative one, which is that not every Cheap Win or ND-non AQS was a terrible start.  A large bulk of them were games in which a pitcher had a good outing but only went 5 or 5.1 innings.   A Cheap Win loses you a point (not two, only one) because you do not get an AQS but it does affect your win-loss record.  ND-non AQS means you do not get an AQS but it does not affect your win-loss record, which is why I decided to just leave it out.
Though I am not too fond of WHIP and originally tinkered around with separately evaluating H/IP and BB/IP, using it just seemed to make things easier.  Though it does not tell us which pitchers walk fewer and give up more hits, or vice versa, or tell us how many “empty innings” a pitcher had (innings where no baserunners got on), it does provide a valid average of baserunners to expect in a given inning since it is not put on a per-9 inning scale.

  • if WHIP 1.00-1.15, +3
  • if WHIP 1.16-1.25, +2
  • if WHIP 1.26-1.30, +1
  • if WHIP 1.31-1.40, 0
  • if WHIP above 1.40, -2

Instead of using K’s, I wanted to use the ratio of strikeouts to walks, since not every pitcher is a strikeout pitcher.  Even so, you do not have to be a strikeout pitcher to be an accurate one, and because of this I rewarded those with high K:BB ratios.  Greg Maddux only struck out 104 in 34 starts, but only walked 25 – a K:BB of 4.16.  This meant that Maddux kept more runners off-base by striking them out and not walking them.

  • if K:BB above 4, +7
  • if K:BB above 3, +5
  • if K:BB above 2, +3
  • if K:BB above 1, 0
  • if K:BB 1 or below, -3

Now that we have the points, let’s test it out and put it to use.  We will use Ian Snell and Carlos Zambrano.
The table below shows Ian Snell’s 2007 numbers and points he receives for each in my points system.

Stat       2007 Value   Points
Starts     32           +5
Innings    208.0        +5
Cheap W    0            0
Tough L    4            +8
Legit W    9            +18
Legit L    8            -16
ND-AQS     11           +11
AQS %      75%          +5
IP/Game    6.52         +7
WHIP       1.33         0
K:BB       2.60         +3
CG         1            +2
SHO        0            0

When we add up all thirteen of these numbers, we get Snell’s Effectiveness #, which comes to: +48.
Now, let’s look at Carlos Zambrano’s season numbers in the table below and add his point totals up.

Starts 34 +5
Innings 216.1 +5
Cheap W 0 0
Tough L 2 +4
Legit W 18 +36
Legit L 11 -22
ND-AQS 0 0
AQS % 53% 0
IP/Game 6.36 +5
WHIP 1.34 0
K:BB 1.75 0
CG 1 +2
SHO 0 0

We look at his numbers and add up the totals to get his Effectiveness #: +35.
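
The addition for both pitchers can be checked with a few lines of Python.  This is only a sketch: the point values are copied from the two tables above in row order, and the names `snell_points`, `zambrano_points`, and `effectiveness` are just labels chosen for illustration.

```python
# Point values copied from the two tables, in row order:
# Starts, Innings, Cheap W, Tough L, Legit W, Legit L, ND-AQS,
# AQS %, IP/Game, WHIP, K:BB, CG, SHO
snell_points = [5, 5, 0, 8, 18, -16, 11, 5, 7, 0, 3, 2, 0]
zambrano_points = [5, 5, 0, 4, 36, -22, 0, 0, 5, 0, 0, 2, 0]


def effectiveness(points):
    """The Effectiveness # is the plain sum of the category points."""
    return sum(points)


print(f"Snell: {effectiveness(snell_points):+d}")        # Snell: +48
print(f"Zambrano: {effectiveness(zambrano_points):+d}")  # Zambrano: +35
```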
Zambrano had more legit wins but also more legit losses, and of Zambrano’s 3 no-decisions, none were ND-AQS, whereas of Snell’s 11 no-decisions, all were ND-AQS. 
That tells us that if each pitcher got a win for every start in which he pitched well, kept the losses from starts in which he did not (no AQS), and was left with no-decisions only from starts in which he pitched poorly or did not go a full 6 IP, their records would look like this:

  • Carlos Zambrano (18-13) would actually be 20-11
  • Ian Snell (9-12) would actually be 24-8
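
Those adjusted records come straight from the category counts in the tables.  Here is a rough sketch of the bookkeeping, with one assumption of mine flagged in the code: a Cheap Win (a win without an AQS) is flipped to a loss.  Neither pitcher had one, so that assumption does not change either result.

```python
def adjusted_record(legit_w, tough_l, nd_aqs, legit_l, cheap_w=0):
    """Re-score a season so that every AQS counts as a win."""
    # Legit wins, tough losses and ND-AQS are all starts with an AQS.
    wins = legit_w + tough_l + nd_aqs
    # Assumption (not stated in the article): a Cheap Win becomes a loss.
    losses = legit_l + cheap_w
    return wins, losses


print(adjusted_record(legit_w=18, tough_l=2, nd_aqs=0, legit_l=11))  # Zambrano
print(adjusted_record(legit_w=9, tough_l=4, nd_aqs=11, legit_l=8))   # Snell
```

Zambrano's three remaining starts stay no-decisions because none of them was an AQS.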

Snell went further into his games, had a better K:BB ratio, and had that higher AQS %.  It also tells us that of Snell’s 32 starts, 24 of them were of great quality, whereas Zambrano had 18 good-great starts and 16 average-bad starts.
This essentially tells us that while Zambrano’s good-great starts may have been better than Snell’s good-great starts, when Zambrano had his bad starts, Snell was still having good-great ones.
As mentioned before, I used this points system to evaluate 30 National League pitchers.  I compiled a group of spreadsheets, ranking the pitchers in order in different categories to show that certain stats we rely on do a bad job of proving effectiveness.
To view all of my results, click on the links below.  You can use this data in other areas, but please credit my work.

  • To see the list of pitchers and their statistics used to assign points, click here.
  • To see the list of pitchers in order of effectiveness points, click here.

I do not want to post a ridiculously long table in this article, so you will need to look at the linked files to see the full results, but I will list the top 15 pitchers (16, with a three-way tie at +45) and their effectiveness points.

  1. Jake Peavy, +74
  2. Aaron Harang, +69
  3. John Smoltz, +69
  4. Brandon Webb, +67
  5. Cole Hamels, +65
  6. Brad Penny, +64
  7. Tim Hudson, +63
  8. Ted Lilly, +60
  9. Matt Cain, +52
  10. Roy Oswalt, +50
  11. Ian Snell, +48
  12. Bronson Arroyo, +47
  13. Derek Lowe, +47
  14. Greg Maddux, +45
  15. Adam Wainwright, +45
  16. Jeff Francis, +45

And, again, these points were assigned to statistics based on how strongly they correlate with effectiveness.  The points system essentially covers the statistics and averages from all angles.
The most shocking part of this was how low Chris Young of the Padres came out.  Young went 9-8, with a 3.12 ERA, in 30 starts.  He should have been more effective, I thought, based on those numbers.  After looking at his game logs, though, I changed my mind and realized it made sense.
Of his 30 starts, he was essentially two different people.  In the 19 starts in which he went for 6+ innings, he was 9-1 with a 1.64 ERA, averaging 6.6 IP/gm, with a 0.85 WHIP and 129 K’s in 126.1 innings.
In the other 11 starts, he was 0-7, with a 7.14 ERA, only going 4.2 IP/gm, with a 1.76 WHIP, and 38 K to his 36 BB, in 46.2 innings.
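
As a sanity check, the two halves of Young's season can be stitched back together.  The sketch below is illustrative only: it converts box-score innings (where .1 and .2 mean one and two outs) into real numbers, backs the earned runs out of each split's ERA, and recombines them.  The helper names are hypothetical, and rounding the earned runs to whole numbers is my assumption.

```python
def innings(ip_notation: str) -> float:
    """Convert box-score innings ("126.1" = 126 and 1/3) to a real number."""
    whole, _, outs = ip_notation.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def earned_runs(era: float, ip: float) -> int:
    # ERA = 9 * ER / IP, so ER = ERA * IP / 9, rounded to a whole run.
    return round(era * ip / 9)


good_ip, bad_ip = innings("126.1"), innings("46.2")
good_er = earned_runs(1.64, good_ip)   # the 19 starts of 6+ innings
bad_er = earned_runs(7.14, bad_ip)     # the other 11 starts

season_era = 9 * (good_er + bad_er) / (good_ip + bad_ip)
print(round(season_era, 2))
print(round(good_ip / 19, 1), round(bad_ip / 11, 1))  # IP per start
```

The recombined figure agrees with the 3.12 season ERA quoted above, which is a good sign that the split numbers hang together.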
After analyzing his situation and the points system I realized that my effectiveness model favors consistency and lower standard deviations (the average of how far someone strays from his average).  To me, that truly defines effectiveness.
I would much rather have a guy who I knew would amass an AQS 67% or more of the time than a guy who might strike out 20 batters and pitch a two-hitter in one game, but give up 5 runs in 6 innings for the next three, before again pitching a brilliant game.
As long as the consistency is of a good nature, consistency in this model proves effectiveness.
I know, we’re finally at the end of the article, right?  I apologize for the length but it took this long to get everything across. 
Looking at Jake Peavy, the most effective NL pitcher at +74, we see that the only counted statistic in which he led was AQS.  Peavy had the most good-great starts of any NL pitcher.  While he may not have led in IP, IP/gm, K:BB ratio, or fewest losses (Brad Penny only had 1 legit loss), he led in consistency and being consistently good-great.
These results also show that Cole Hamels, given the 6 starts he missed due to injury, would likely have challenged Peavy for #1 in effectiveness.  However, as my model dictates, the fact that he missed those 6 starts and Peavy did not shows that Peavy was more effective.
Yes, there were more stats we could add to this, and more variables to account for, but I feel this accurately levels the field of play between pitchers in distinctly different playing situations, and levels the difference between 2007 reputation and 2007 actual performance.
I must remind you before I come to a close, though, that this is only a measure of effectiveness, not the end-all solution to determining who the “best” pitchers are.
However, for this Sabermetrician, effectiveness directly correlates with quality and value.

2007 Sabermetric Year in Review: San Diego Padres

I was looking forward to reviewing a playoff team.  According to Pythagoras, the Padres were the second-best team in the National League last year.  Unfortunately, Pythagoras has been dead for a few millennia.  For what it’s worth, Friar fans, I consider the Padres to be a playoff team for 2007.
Record: 89-74, 3rd in NL West. 
Pythagorean Projection (Patriot formula): 89.55 wins (741 runs scored, 666 runs allowed, but it was the last three that hurt the most). 
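
For the curious, here is a minimal sketch of how a projection like that can be computed.  It assumes the "Patriot formula" means the Pythagenpat version, in which the exponent is total runs per game raised to the 0.287 power; the constant and the function name are my assumptions, not something stated above.

```python
def pythagenpat_wins(runs_scored, runs_allowed, games, power=0.287):
    """Expected wins with a run-environment-dependent exponent."""
    # Assumption: exponent = (runs per game, both teams) ** 0.287, one
    # common form of Pythagenpat; other published constants exist.
    runs_per_game = (runs_scored + runs_allowed) / games
    x = runs_per_game ** power
    win_pct = runs_scored ** x / (runs_scored ** x + runs_allowed ** x)
    return win_pct * games


# 2007 Padres: 741 runs scored, 666 allowed, 163 games (89-74).
print(round(pythagenpat_wins(741, 666, 163), 2))
```

With those inputs the result lands within a few hundredths of the 89.55 wins quoted above, which suggests the 0.287 constant is at least close to what was used.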
Team Statistical Pages:
Baseball Reference
Baseball Prospectus
MVN Blog:
San Diego Spotlight
Other Padres Resources:
Latest News
Contract Status
Trade Rumors
Overview: It’s really not fair to reduce the Padres season down to the playoff against the Rockies.  One game isn’t a big enough sample size to tell you much of anything.  I’ll leave the analysis of that game to the poets and philosophers. 
A lot of other things happened in 2007 for the Padres.  Trevor Hoffman recorded his 500th career save.  Chris Young got into a fight with the Cubs’ Derrek Lee, and both were suspended.  However, Young was able to pitch in the All-Star game, while on suspension!  (His first baseman while he was pitching?  Lee.)  Now that’s setting a good example for the kids!
What went right: Jake Peavy won the Cy Young Award, unanimously.  It wasn’t even that close.  Peavy finished the year with a VORP of 77.0 (best in baseball among pitchers), a WPA of 4.79 (best in baseball), and a strikeout rate of more than 1 per inning (behind only Erik Bedard and Scott Kazmir among starters).  Peavy was something of a worry for Padres fans coming into the season.  After showing signs of brilliance in 2005 (and being the ace of the staff of the 2006 World Baseball Classic U.S. team), Peavy dipped to an 11-14 performance in 2006, with a 4.09 ERA, a jump of 1.2 runs.  What’s odd is that his walk and strikeout rates were about the same from 2006 to 2007.  But, in 2006, his BABIP was 30 points higher.  In 2006, Peavy got a little unlucky.  The other thing that Peavy did better in 2007 was to induce more ground balls, which cut down on his HR allowed totals (that and pitching in Petco Park).  Maybe Greg Maddux showed him a few tricks.  Probably one of them will be how to win multiple Cy Young Awards.
In addition to the strong pitching of Peavy, the Padres also had the benefit of having the best relief pitcher in the National League on their team.  And I’m not talking about Trevor Hoffman.  Heath Bell pitched in 81 games, threw 93.2 innings, and was obtained from the Mets for two players who appeared in a combined total of 10 games.  I’d say the Padres did OK in that deal.  Methinks that someone in the Padres front office saw this chart in the off-season last year.  Bell, since he came up, has always been a near-strikeout per inning guy.  His walk rate was decent enough.  But, his BABIPs in 2005 and 2006 were .374 and .394, respectively.  Bell was clearly getting burned by Lady Luck, and burned bad.  But, even from 2005 to 2006, he started turning some of his flyballs into ground balls.  This is how a smart team finds themselves a player and steals him from a team that just doesn’t get it.  In 2007, his BABIP fell to .260, which is below the league average (read: Bell is due for a downward correction, but not a horrible one), and he induced even more ground balls, this time turning a lot of line drives into ground balls.  He’s a two-pitch pitcher (fastball and curve), but if there’s one area for improvement, it’s that Bell’s release points for the two pitches are rather distinct, and I suppose that a good hitter might be able to pick up on that.
And at least for this year, the Barfield-Kouzmanoff trade (my Russian-speaking wife threatened never to speak with me again after hearing me attempt to pronounce that name) turned out OK.  The trade was a good old youngun-for-youngun bet that makes baseball fun.  The failures of Marcus Giles at second base, failures such as being a below-replacement-level hitter, obscured the fact that Barfield was also well below replacement level this year in Cleveland (where he was eventually benched) and that Kouzmanoff had a decent year.  Bill James projects Kouzmanoff to have a 100 point jump in OPS next year, which seems a little optimistic to me.  He will be a mere 26 years old when he takes the field on Opening Day next April, so there’s some room for growth.
What went wrong:  Here’s a list of everyone who played left field for the Padres in 2007. 

  • Termel Sledge: recently signed with the Nippon Ham Fighters. Functioned below replacement level.
  • Jose Cruz, Jr.: Below replacement level.  Maybe it’s time to let go, Jose.
  • Milton Bradley: Injured Mike Cameron, then injured himself while arguing with an umpire in the same game.  Then again, I once sprained my ankle while conducting psychological research.  Don’t ask how.
  • Scott Hairston: Yawn.  A VORP of 9.2. 
  • Russell Branyan: the poster boy for the “swing real hard in case you hit it” school of thought.  He’s had a ten year MLB career!  Barely above replacement level.
  • Geoff Blum: What the heck was he doing in left field?  Barely above replacement level, and that’s second baseman replacement level.
  • Rob Mackowiak: Below replacement level.  And I can never figure out how to pronounce his name.
  • Paul McAnulty, Brady Clark, Hiram Bocachica: below replacement, barely above replacement, below replacement.

Yeah, that about sums it up: ROCKIES 13TH: T.HOFFMAN REPLACED C.HEADLEY (PITCHING); K.Matsui doubled to center field; T.Tulowitzki doubled to center field [K.Matsui scored]; M.Holliday tripled to right field [T.Tulowitzki scored]; T.Helton was walked intentionally; J.Carroll out on a sacrifice fly to B.Giles [M.Holliday scored… sorta]  (from here, with one minor edit… I’ll let you figure out where.)
I don’t think his hand hit the plate either.  Then again, Holliday’s triple was really a home run.  Sadly for Padres fans, that’s a game that will be shown on ESPN Classic for years and years to come.  But, since we’re playing the “then again” game, had the Padres just held on against the Brewers in the second-to-last game of the regular season, none of that would have happened and Jake Peavy would have started Game 1 of the Division Series against Cole Hamels.  And it would have been the Padres who got swept by the Red Sox.
Revisiting Scott Linebrink: I’m sure there was a bit of head-scratching that went on when the Padres dealt Scott Linebrink to the Brewers.  Linebrink had an outstanding 2005, regressed a bit to the mean in 2006 (but was still rather good), and wasn’t having a bad year in 2007.  Why trade a valuable piece of a bullpen?  Linebrink’s walks were up slightly and his strikeouts were down a bit, although the rest of his component numbers were pretty similar from 2005-2007.  I have to wonder whether the Padres were tipped off to a mechanical issue or an injury and decided to sell high (if they were, they got it wrong as his numbers in Milwaukee were better than his San Diego numbers), or maybe it was the fact that Linebrink was on the edge of free agency.  Maybe it was the best reason of all to make any trade: they had Hoffman, Bell, Cla Meredith, Justin Hampson, Kevin Cameron, and a rejuvenated Doug Brocail, all of whom had season ERAs under 4.00, in their pen, and they traded away a surplus part.
This has nothing to do with Sabermetrics, but it has to be said: The San Diego Padres have the worst fashion sense in all of baseball.  I feel better now.
Is Greg Maddux looking good because of Petco?:  Well, who doesn’t look better pitching in Petco?  But, for what it’s worth, see for yourself.  Maddux started 16 games at home and 18 on the road.  Double his road stats and he’s still a 12-12 pitcher, and he still never walks anyone.  He had a 4.65 ERA on the road.  Not bad for a 3rd starter.  The thing that’s probably making Maddux look better is the fact that his manager is using him properly.  None of Maddux’s starts went into a triple-digit pitch count.  It left him as mostly a 6 inning starter, but the Padres had the bullpen to absorb that sort of workload.  Bud Black, a former pitcher himself, apparently understands that while Maddux still has Greg Maddux’s mind, he is no longer his mid-90s physical self.  He’ll be 42 next year, but he’s still a pretty good, although no longer great, major league pitcher.
Outlook: Let’s see here.  Three good starters?  Check.  Good bullpen?  Check.  Offense?  Well, the Padres have that “one big bat away from contending” feel, except that they need more than one bat.  First, they’ll have to rebuild their outfield.  Brian Giles is 37, Mike Cameron is suspended (and a free agent anyway), and Milton Bradley is out for the year already.  But, this is a team that almost almost almost made it to the playoffs last year.  The Diamondbacks and Rockies hit a rather large vein of luck.  Games are played on grass, not on paper, but even with all its little imperfections, this Padres team was a playoff-caliber team… on paper.  It’s probably too much to ask them to rebuild an outfield that quickly, but maybe karma will come back and give a couple good breaks to the Friars next year.