The Santana Hypocrisy

Before getting into the article I wanted to mention that my personal website, www.ericjseidman.com, is now back up and running. The site holds information for all of my endeavors, including sabermetrics, magic, and my professional screenwriting.
DISCLAIMER: This will not truly be a statistical piece but rather more along the lines of psychology and opinion. And yes – the title sounds like a Matt Damon movie title.
I was watching Freaks and Geeks the other day and an incident in the episode sparked a metaphor in my mind. In the show, Sam really liked Cindy Sanders, a girl who was dating a jock and only wanted to be his friend. At dinner Sam told his mother about Cindy’s lack of interest. His mother, trying to keep her son optimistic, told him she was making a mistake/dumb decision and that it would be “her loss.”
I wondered, though, would Sam’s mother have been as “down” on Cindy if Sam came home with news that Cindy did like him?
As in, is it okay to “diss” or find flaws in something not yours if you would be ecstatic if said thing was yours?
Even though I would love to continue talking about one of my favorite television shows the purpose of this post is to direct the above question towards the recent trade of Johan Santana.
MY REACTION
Unequivocally, I am a die-hard Phillies fan. Though I seem to be adopting the Rays as a second team, the Phillies are the sole owners of the baseball area in my heart. Even though they are my favorite team, and the Mets are in their division, I am really excited about the Johan trade.
Yes, a Phillies fan excited that the Mets improved their team.
Johan has been a favorite of mine since 2002 when, via the MLB digital cable package, I watched him routinely make relief appearances. I always noted how “cool” or “funky” his windup and delivery were and loved watching him on the mound. He has also been the only non-Greg Maddux player that I like to exclusively follow.
Now he is in the same division as the team I root for and I cannot wait to see these games. I cannot wait to see a Hamels/Santana battle of the changeups, or Santana facing off against Jimmy Rollins in the 8th inning of a (hopefully) meaningful September game. I am greatly anticipating a Santana/Peavy Sunday Night Baseball matchup or even just simply watching the guy bat!
Unfortunately, I am mostly alone in my thoughts when it comes to non-NYM NL East fans, mainly because a stark contrast exists between the definitions of “die-hard fan.” There are fans whose personal lives are so affected by sports that it borders on sick obsession, and there are fans like me, fans who give so much of their heart and mind to the game but can continue their regular lives when the game ends.
I am a die-hard Phillies fan but, when the Mets landed Johan, I did not cry, pop pills, seek therapy, or curse on message boards. I grinned. I grinned as if to say – “Oh, you rascal Metropolitans!” I grinned because this is going to be a very exciting season.
In an initial reactive conversation with my brother Corey, though, he caught me doing the same thing I had been complaining about to him – falling into The Santana Hypocrisy.
THE HYPOCRISY ITSELF
I made a comment to him along the lines of – “I mean, honestly, how many games is he going to personally improve?”
Corey called me on it and I admitted fault. After all, this is such an easy hypocrisy to fall victim to but it becomes a problem when fans become so entrenched in it that they lose touch with reality.
DISCLAIMER 2: This is not meant to bash any fan of any team, so Mets, Phillies, Braves, and Twins fans, please do not scream down my throat. I am merely investigating the human nature and seemingly programmed response that falls into this hypocrisy.
I have read a plethora of reactions on this trade and, while most are valid or provide some semblance of a reasonable response, some are ridiculous in their hypocritical nature. The hypocrisy does not stem from the reactions themselves, but rather from the fact that these reactions would be completely reversed if the circumstances were different (i.e., if Santana were on their team).
The reactions to this trade seem to come in three forms – excited, disappointed, and angered. You’ll never guess which bunch are excited.
The disappointed department houses some Twins fans, Phillies fans, Braves fans, Manny Acta and Felipe Lopez, 12 of the 32 Marlins fans, some Red Sox/Yankees fans, and one Royals fan (Joe Posnanski). The angered department holds the rest of the Twins fans and some very opinionated Phillies and Braves fans.
Some of those in the angered department have lost some sense of reality. I have read so many posts that point out flaw after flaw after flaw about Johan, be it his home run total of last year, his decline in W-L record (useless stat), his high ERA (yeah, 3.33 in the AL is ridiculously high, right?), his potential arm troubles, how “overrated” he is, or anything else along those lines. These fans are finding everything they can to serve the dual roles of –

  • Raining on the parade of Mets fans
  • Making themselves feel better about not acquiring Johan

There is no way in hell these fans would search for these flaws if their teams landed Santana. If the Twins signed Johan to an extension he would have a great year and would be applauded for staying. If the Phillies got him then it would seem very likely that a team with the NL’s best offense, the MLB’s best pitcher, and arguably the best young pitcher would perform VERY well. If the Braves were able to line him up alongside Smoltz and Hudson, something tells me that his “flaws” would be forgotten more quickly than Mark Lemke’s pitching career.
Why do we all allow ourselves to criticize someone we would shower with love if in our presence? Is it jealousy? Fear? Ignorance? Probably all three.
THE MAN NAMED JOHAN
Johan is the best pitcher in baseball and makes a significant difference on any team he plays for. He did not single-handedly will the Twins to the playoffs during his tenure there but I would love to see how many of those Twins teams would have made the playoffs without his services.
To not acknowledge the difference he makes is to be an ignorant baseball fan.
To go as far as to say he is not that great, has a ton of flaws, or is overrated is to be a fan completely detached from reality. I can guarantee that every other pitcher on the teams that these fans root for has many more flaws than Johan.
There are reasons this guy has finished either #1 or in the top five in Wins, ERA, ERA+, WHIP, K, K:BB, SHO, GS, and IP over the last four years. The primary of those reasons is that he is extremely dominant and talented. In my SP Effectiveness System, where you need a +50 or higher to be considered a #1 SP, Johan has averaged a +71.3 in the last four years. That is clearly the most from 2004-2007 and the only four-year spans since 2000 that were higher were the 2000-2003 seasons of Curt Schilling and Randy Johnson, both of whom are at the end of their careers now.
He has made 134 starts since 2004, and 97 of them have been AQS, which is 72 %, more than anyone else in that span.
Looking even further, if we want to use W-L records as a barometer, we are going to use my Adjusted W-L. Johan has gone a recorded 70-32 in the last four years (an average of 18-8 per season), but by my calculations, his Adjusted W-L would be 78-24 (an average of 20-6 per season).
I have no problem with people being upset that Johan now plays for the Mets. I have no problem with people not personally liking Johan Santana. I have no problem with people not personally liking the Mets (hey, I don’t like them!). I also have no problem with fans questioning the opinions of other fans.
I do, however, have a problem with no middle ground of opinion existing.
It seems that Mets fans believe they have already won the World Series and, based on numerous message boards I have read, Phillies and Braves fans think Johan stinks. The Mets fans exaggerate and the other fans have to do the polar opposite to compensate. There are very few people, relative to those who express opinions, who can be fans of other teams affected by the trade and still acknowledge that the Mets did something positive by gaining a great player. It’s either Johan is the messiah or Johan is overrated.
If a player would, when on your team, increase a bulge in your pants worthy of Ron Burgundy’s thumbs-up, there is absolutely no justifiable reason to criticize said player and point out his flaws just because he is on another team. It is equivalent to really wanting a toy truck and, when you find out you can’t have it, calling that truck stupid or pretending like you don’t want it. In other words, it’s very childish.

A Closer Look at Closers – Part One

Over the course of the next few weeks I will be primarily working with Closers – trying to determine the most effective ways to evaluate talent and quality at an inconsistent position that sure receives some hefty and consistent dollars.
This first part will introduce my opening step to a weighted formula to determine the value of a Closer, as well as discussing what a Closer is, and how we currently evaluate them.
Though this first part will focus solely on 2007, my study also consists of data from 2005 and 2006.
THE “NINE”
When compiling my data and examining game log after game log, I decided that my study and research should focus on some consistency, which can be hard to find for a Closer. 
I looked at the National League in 2005, 2006, and 2007, and wanted to limit my group to include only those who met certain criteria.  Initially I thought that anyone with 25+ saves in all three seasons should qualify.
Then, I actually saw the numbers and realized that would limit my study to include only Jason Isringhausen, Trevor Hoffman, Billy Wagner, and Chad Cordero.
Suffice it to say, I wanted to have some more people in there.  With that in mind, I altered my criteria to simply those who actually were closers during those three seasons.  I also took into account the fact that some were demoted, promoted, or injured, and so my criteria called for 15+ saves in 2005, 2006, and 2007.
With those numbers, the nine Closers who find themselves under my statistical microscope are – Isringhausen, Hoffman, Wagner, Chad Cordero, Francisco Cordero, Brad Lidge, Jose Valverde, Brian Fuentes, and Ryan Dempster.
Yes, Francisco Cordero was in the AL for 2005 and some of 2006, however he has recorded 103 saves in the last three seasons and 60 of them were in the NL.  Plus, the whole idea of working with Closers stemmed from the idea that an inconsistent one-inning pitcher could receive a 4 yr/40 mil deal.
WHAT IS A CLOSER?
Simply stated, a Closer is a pitcher called on in the 8th or 9th innings, whose job is to seal the win for his team.  If he does his job he records a “Save.”  If the other team comes back to tie the game, he records a “Blown Save.” 
If you asked anyone about those stats before 1969, though, they would assume you were discussing hockey or soccer since saves are a relatively new statistic.
RECORDING SAVES
There are three ways a pitcher can record a save. I know this is a recap for many readers but it is important in the grand scheme of my study. The first way, which is how most people generally describe saves, involves the pitcher entering in either the 8th or 9th inning, with a lead of three runs or fewer, and preventing the other team from coming back to tie.
The second way is contingent upon when you enter the game and in what situation.  If you enter the game with the tying run on base, no matter the lead (usually extends it to a 4-run lead), and prevent the team from tying, you get a save.
The third way, which is how most middle relievers will rack up their 1-3 random saves per season, involves a pitcher going for the final three innings of the game – regardless of the score.  If the Phillies lead the Braves 9-1 and Ryan Madson pitches the 7th, 8th, and 9th, he gets a save.
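As a rough sketch, the three ways described above can be written as a single check. This follows the simplified wording of this article rather than the official scoring rule, and the function name and its parameters are hypothetical, chosen only for illustration.

```python
def qualifies_for_save(inning_entered, lead, tying_run_on_base, innings_pitched):
    """Check the article's three simplified save conditions for a reliever
    who finishes a game his team wins without surrendering the lead.

    Illustrative only: the parameter names are assumptions, and the official
    rule has details this sketch does not model.
    """
    # Way 1: enters in the 8th or 9th with a lead of three runs or fewer.
    if inning_entered in (8, 9) and 1 <= lead <= 3:
        return True
    # Way 2: enters with the tying run on base, no matter the lead.
    if tying_run_on_base:
        return True
    # Way 3: pitches the final three innings, regardless of the score.
    if innings_pitched >= 3:
        return True
    return False


# The Madson example: 7th, 8th, and 9th with a 9-1 lead (an 8-run lead).
print(qualifies_for_save(inning_entered=7, lead=8,
                         tying_run_on_base=False, innings_pitched=3))  # True
```

The Madson case only qualifies through the third branch, which is why a middle reliever can pick up a save in a blowout.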
DIFFERENCE IN SAVES
If there are different types of save categories, doesn’t that mean there are different save types for each category?
Yes.  Plenty.  Think of it this way.  If you enter the 9th inning with only one out to go, and a 5-3 lead and bases empty, and you end the game, you get a save.  If you enter the 9th inning with only one out to go and the bases are loaded, and you end the game, you get a save.  One is clearly harder to do than the other and has a higher risk of resulting in a blown save, yet each ultimately results in the same statistic – a save.
With that in mind, I looked at the 9th inning and thought of all the possible situations that someone could receive a save.  In the 9th inning, there are 72 different ways to record a save, excluding what the pitcher does in the inning. 
If we count what the pitcher does, either giving up a run with a two-run lead or two runs with a three-run lead, and so forth, in the 9th inning there are 144 total ways to record a save.  I will get more into these different ways in Part Two, however the basic idea is that there are eight situations of baserunners (empty, 1st, 2nd, 3rd, 1st and 2nd, 1st and 3rd, 2nd and 3rd, bases full) and 18 different variations of these eight situations.  These variations include entering with 1 out, with 2 outs, with no outs, entering with 1-run, 2-run, or 3-run leads, and more of the same.
A pitcher can record a save in the 9th inning in 144 different ways, depending on how many outs he records, what the baserunning situation is, and how many runs he gives up.  Yes, this can be said for many other statistics, but Saves generally only span 2 innings MAX, and so the huge number of different types means a bit more here.
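To make the counting concrete, here is a minimal sketch that enumerates the 72 entry situations. It assumes the 72 come from eight baserunner states times three out counts times three lead sizes; that breakdown is an inference from the description above, not something spelled out here. The doubling to 144 (what the pitcher does in the inning) is left out because the details are deferred to Part Two.

```python
from itertools import product

# The eight baserunner situations listed in the article.
BASE_STATES = [
    "empty", "1st", "2nd", "3rd",
    "1st and 2nd", "1st and 3rd", "2nd and 3rd", "bases full",
]
OUTS_ON_ENTRY = [0, 1, 2]   # assumption: outs when the closer enters
LEAD_ON_ENTRY = [1, 2, 3]   # assumption: 1-run, 2-run, or 3-run lead

# Every combination of base state, outs, and lead.
entry_situations = list(product(BASE_STATES, OUTS_ON_ENTRY, LEAD_ON_ENTRY))

print(len(entry_situations))   # 8 * 3 * 3 = 72
print(len(BASE_STATES) * 18)   # 8 situations * 18 variations = 144
```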
CLUTCH FACTOR
I am not dealing with “clutch” in my study.  To read some fascinating insights into relief pitching and relief clutch, read Pizza Cutter’s articles on the subject.
Instead, I am looking at what actually happens and how it happens, not the potential of why it happens.
CURRENT CLOSER EVALUATIONS
Many people will look at two, and only two, stats when determining the quality of a closer – total saves, and percentage of successful saves (saves/save opportunities).  It has been pounded into our heads as a barometer and these statistics are supposed to inform us that the “best” closers are the ones with either the most saves or the fewest blown saves.
What I am contending is that if there are 144 different types of 9th inning saves, and the barometer is the sum of all converted opportunities, regardless of the type of save, the needs of your team, and the situation at hand, it is impossible to equate total saves to quality.
Think of it this way – Closer A and Closer B both have 30 saves.  Closer A has 6 blown saves while B has 4 blown saves.  With those numbers, which are usually the only ones readily available, we assume that Closer B was better.  After all, he blew fewer saves.  But what if the 4 saves he blew were all 3-run leads with bases empty and only 1 out to go in the 9th inning, which is the least dangerous save situation of the whole 144?  And what if the 6 blown saves of Closer A were all games he entered with a runner on third base and no outs, or games where he entered in the 8th inning?
It becomes very difficult to gauge the “better” factor with just those numbers.
Regardless, even looking at hypotheticals like that, which take into account different types of saves, we cannot determine true quality because it does not take into account the needs of the teams these closers are on – which is ultimately the point of the closer.
THE “MVC”
When we discuss who the best closer is, what are we asking?  Are we wondering who was best with the most pressure?  Who posted the best numbers?  And if we are talking about numbers, what numbers are the best numbers?
These questions, and more, can cause a headache.  My point here is that we cannot compare closers to each other or determine true quality and effectiveness without analyzing what each closer did for his team.
In order to do this we need to find the number of games that each team won in a save situation (meaning no walk-off wins or 3-inning saves) and add it to the number of Blown Save-Losses because that tells us the true number of save opportunities each team had.  I call that a TSO – Team Save Opportunity.
Jose Valverde had 47 saves this year, leading the NL, however the DBacks had 64 Team Save Opportunities, whereas Ryan Dempster’s Cubs only had 48 of those games – sixteen fewer than the DBacks.
Dempster had 28 saves, far fewer than Valverde, but his conversion rate (28/31) was higher.  Valverde had more saves, but he also had more opportunities because his team played a different way and, as a team, played more close games that needed saving.  And even though Dempster’s percentage was higher, he also had fewer opportunities to blow saves.  If he had the 54 appearances of Valverde, he may have also blown more saves and had a worse conversion rate.
What we need to do here is level the field of play between those on teams with many save opportunities and teams with fewer.  After all, it is not Dempster’s fault that the Cubs had a better offense and blew teams out more than the DBacks.  He was not needed as often as Valverde and so his raw save and blown save totals do nothing but compare one number of Dempster’s to the overall need of Valverde and the needs of the Diamondbacks.
To really do this, the effectiveness of one pitcher to his team needs to be compared to the effectiveness of another pitcher to another team.
APPEARANCE RATE
The DBacks had 64 TSO’s and Valverde had 54 opportunities.  This means that Valverde appeared in 54 of the 64 total save opportunities for his team, or 84.4 %.
That 84.4 % tells us he was durable since the team had so many potential save opportunities and his appearances were so high.
The Cubs only had 48 team save opportunities and Dempster only had 31 attempts.  His appearance rate would be 31 of 48, or 64.6 %.
Yes, Dempster was hurt, but this does make sense because you cannot be more effective (positive or negative) for your team if you are not involved as often as possible.  The fact that other pitchers were involved in over 1/3 of the Cubs’ save opportunities says that Dempster was not truly effective in making appearances.
To see the order of the nine closers in terms of Appearance Rate, look at the table below. The table shows the saves and save opportunities of the individual, as well as the total real save opportunities of the team, and then the Appearance Rate.

NAME           SAVES   SV OPP   TEAM OPP   APP RATE (%)
F. Cordero        44       51         58           87.9
Valverde          47       54         64           84.4
Hoffman           42       49         60           81.7
Wagner            34       39         48           81.3
C. Cordero        37       46         59           78.0
Isringhausen      32       34         46           73.9
Dempster          28       31         48           64.6
Lidge             19       27         55           49.1
Fuentes           20       27         59           45.8
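For anyone who wants to rerun the numbers, here is a minimal sketch of Appearance Rate using the 2007 figures from the table. The dictionary layout and function name are illustrative choices, not part of the study's data sheets.

```python
# 2007 figures from the table: (saves, save opportunities, team save opportunities)
CLOSERS_2007 = {
    "F. Cordero": (44, 51, 58),
    "Valverde": (47, 54, 64),
    "Hoffman": (42, 49, 60),
    "Wagner": (34, 39, 48),
    "C. Cordero": (37, 46, 59),
    "Isringhausen": (32, 34, 46),
    "Dempster": (28, 31, 48),
    "Lidge": (19, 27, 55),
    "Fuentes": (20, 27, 59),
}


def appearance_rate(save_opps, team_save_opps):
    """Percentage of the team's save opportunities the closer appeared in."""
    return 100.0 * save_opps / team_save_opps


# Rank the nine closers from highest to lowest Appearance Rate.
ranking = sorted(
    CLOSERS_2007,
    key=lambda name: appearance_rate(CLOSERS_2007[name][1], CLOSERS_2007[name][2]),
    reverse=True,
)
print(ranking)
print(round(appearance_rate(54, 64), 1))  # Valverde: 84.4
print(round(appearance_rate(31, 48), 1))  # Dempster: 64.6
```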

Despite this stat being useful to tell us how durable or useful a closer can be in making appearances based on team need, it does not tell us how successful they were in actually converting these saves. Just because Valverde appeared in 54 of 64 team save opportunities for the DBacks does not mean he converted 54 saves – just that he made 54 appearances.
SAVE RATE
After careful thought, I came up with “Save Rate”, which takes the total number of saves by a closer and divides it by the total number of team opportunities.  This statistic takes the Appearance Rate to the next level.  Since closers can have a high Appearance Rate but low number of saves or low save percentage, Save Rate balances that out.
Save Rate lets us know how successful a Closer was in recording saves relative to the percentage of his team’s save opportunities.  It tells us how successful one was based on how effective he was in fulfilling his team’s need.
Essentially, it rewards those with more saves in fewer team opportunities, and takes away from those with fewer saves in more opportunities.
Valverde had 47 saves out of 54 chances, and his team had 64 real save opportunities.  His Save% would be 47/54 and his Appearance Rate would be 54/64.
His Save Rate would be 47 (# of saves)/64 (# of total team chances for a save), which comes out to 73.4 %, meaning that Valverde successfully saved 73.4 % of the DBacks team save opportunities.
Francisco Cordero of the Brewers was 2nd in the NL with 44 total saves.  He also blew seven saves, giving him 51 opportunities.  His Save% was 44/51, very similar to Valverde’s, but his Appearance Rate was higher because the Brewers had six fewer team save opportunities and he only had three fewer appearances than Valverde.
His Appearance Rate was 51/58, or 87.9 %.  He appeared in more games proportionate to his team’s need.
His Save Rate would be 44/58, or 75.9 %, higher than Valverde’s.
To see the nine closers in order of Save Rate, look at the table below.  Again, it lists the total saves and opportunities of the individual, as well as the total team opportunities, and then the actual Save Rate.

NAME           SAVES   SV OPP   TEAM OPP   SAVE RATE
F. Cordero        44       51         58      75.9 %
Valverde          47       54         64      73.4 %
Wagner            34       39         48      70.8 %
Hoffman           42       49         60      70.0 %
Isringhausen      32       34         46      69.6 %
C. Cordero        37       46         59      62.7 %
Dempster          28       31         48      58.3 %
Lidge             19       27         55      34.5 %
Fuentes           20       27         59      33.9 %
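Here is a small sketch of Save Rate built from the same 2007 figures. It also checks one arithmetic point that follows directly from the definitions: Save Rate is simply Save% multiplied by Appearance Rate, since the individual opportunities cancel out (47/54 × 54/64 = 47/64). The function and variable names are illustrative.

```python
import math

# 2007 figures from the table: (saves, save opportunities, team save opportunities)
CLOSERS_2007 = {
    "F. Cordero": (44, 51, 58),
    "Valverde": (47, 54, 64),
    "Hoffman": (42, 49, 60),
    "Wagner": (34, 39, 48),
    "C. Cordero": (37, 46, 59),
    "Isringhausen": (32, 34, 46),
    "Dempster": (28, 31, 48),
    "Lidge": (19, 27, 55),
    "Fuentes": (20, 27, 59),
}


def save_pct(saves, save_opps):
    """Traditional conversion rate: saves / individual save opportunities."""
    return saves / save_opps


def appearance_rate(save_opps, team_save_opps):
    """Share of the team's save opportunities the closer appeared in."""
    return save_opps / team_save_opps


def save_rate(saves, team_save_opps):
    """Saves / team save opportunities."""
    return saves / team_save_opps


# Save Rate equals Save% times Appearance Rate for every closer.
for saves, opps, team_opps in CLOSERS_2007.values():
    product_form = save_pct(saves, opps) * appearance_rate(opps, team_opps)
    assert math.isclose(save_rate(saves, team_opps), product_form)

# Rank the nine closers from highest to lowest Save Rate.
ranking = sorted(
    CLOSERS_2007,
    key=lambda name: save_rate(CLOSERS_2007[name][0], CLOSERS_2007[name][2]),
    reverse=True,
)
print(ranking)
print(round(100 * save_rate(47, 64), 1))  # Valverde: 73.4
print(round(100 * save_rate(44, 58), 1))  # F. Cordero: 75.9
```

Seen this way, a closer can only post a high Save Rate by both converting his chances and being the one who gets most of his team's chances.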

WHAT THIS MEANS
It makes sense that Cordero would be higher because even though his save totals and appearance totals were slightly lower, he was involved in a higher percentage of his team’s chances and he converted successful saves at almost an identical number and percent.  Basically, he had fewer opportunities and still did the same exact thing – not the same ratio, but the same thing.
WHAT THIS DOES NOT MEAN
This does not necessarily mean Cordero had a better season.  This is merely one part of a two or three part article series and Save Rate is only the first part to a weighted system that should be able to determine who the best Closers are based on statistics that essentially define a good Closer.
CONCLUSION TO PART ONE
Next week I will get into the different types of saves featured in the data sheets and discuss their importance in determining quality and effectiveness. WPA and Win Predictors will be discussed as well.
I will also look at raw numbers to help come up with the Seidman Closer Model to properly evaluate Closers.
In closing (pun very intended), I just want to add that the Closer position has become such a fickle one over the years that these evaluations need to be done on a year to year basis.  Jose Valverde was arguably one of the best NL Closers in 2007, and somewhat of a replacement, or makeshift closer, in 2005.  Brian Fuentes was dynamite in 2005 and still pretty good in 2006, yet so bad in 2007 that he lost his job.
It is remarkable how inconsistent Closers are, and that is one of the primary reasons (along with playoff success) why Mariano Rivera will go down as the greatest ever.
Lastly, of the nine closers used in this ongoing study:

  • Lidge and Fuentes were demoted in 2007
  • Lidge and Valverde were traded to new teams
  • Dempster is likely going back to the starting rotation
  • Francisco Cordero signed a huge four-year deal with a new team
  • Billy Wagner changed teams from 2005 to 2006

The only NL Closers that have actually kept their job for the same team between 2005 and 2007 are – Trevor Hoffman, Jason Isringhausen, and Chad Cordero.

2007 NL Starting Pitching Analysis

When it comes to analyzing and comparing pitchers, those conducting the comparisons will often find themselves in a tricky situation.  Sure, certain pitchers are better than others, but what are they specifically better at? 

How can we conduct an honest analysis when there are so many variables to consider?  And how can we truly determine which pitchers were better than others when some are on terrible teams with no run support and others are on tremendous teams with tons of run support?
The first step is to determine what we are measuring.  If we want to know who the best strikeout pitcher is, we should look at the raw total for strikeouts and also an average of K/IP, since some guys will make fewer starts than others.  To figure out who walks the fewest, we measure the number of walks each pitcher gives up and a walk-IP ratio.
These measurements are contingent on one category, though, and cannot tell us who is better or more effective than the rest.  All of the research and ideas presented in this article are designed to measure the “effectiveness” of a pitcher. 
In order to determine this effectiveness, a whole heck of a lot of numbers need to be measured and properly weighted/scaled so that everybody has a fair shot – whether or not they are on a great team.
I took the 1-3 best pitchers from each National League team and entered their statistics into a database, measuring everything from their raw Innings Pitched totals to their Adjusted Quality Start % (you’ll read more on that below).  After entering all of the statistics, and crunching numbers until my brain turned to mush, I came up with my weighted points system.  I assigned the corresponding point totals and added everything up to determine what I feel is a very accurate measurement of pitching effectiveness amongst the NL’s best. 
This was not applied to every single NL Pitcher in 2007 (I will do that another time) but rather amongst these 30 selected #1, #2, or #3 starters.  For instance, a guy like Jeff Suppan may have been more effective than Jason Bergmann but I wanted to have at least one person from each team.
The system is not 100% perfect and does not take into account every single statistic (do you know how many statistics there are??), but it definitely levels the playing field between those on good or bad teams, those injured/called up or just plain bad, and those who got lucky or unlucky with run support.  The points are assigned based on the areas I, as an intense student of the game, feel are most important to determine true effectiveness. 
The basic idea of this system is to measure the true quality of a pitcher over his season – i.e., what would happen if a pitcher was rewarded every time he pitched well and discredited every time he pitched poorly – something that happens perfectly just about 0% of the time. 
We will begin by going over the statistics involved, what their points scale was, and why they are used.  The idea behind these corresponding point totals is to properly weight the areas that most people intuitively associate with success and quality.
The points given to each statistical subset are designed to separate the aces from the workhorses and the workhorses from the seemingly replacement level pitchers.  They may seem arbitrary and could be replaced with different numbers, or fractions/decimals, however the difference between the points in subsets was based on the number of pitchers who fall into certain categories.
GAMES STARTED
In order to be as effective as possible, a pitcher needs to make as many starts as he can.  How can we say that a pitcher with 14 starts is more effective than one with 34-35, even if his numbers in those 14 starts are tremendous and the numbers of the one with 34-35 are a bit worse?  His numbers may be better than the pitcher with 35 starts, however the latter pitcher was involved in 21 more games and proved to be durable enough to pitch an entire season, and solid enough to maintain his SP status for 162 games. 
This does not mean that a pitcher with 35 starts is necessarily “better” than one with 14-16, but rather he is more effective because he is involved in more of his team’s season. 
If the pitcher with 14-16 starts posted the same numbers in 32 starts, it would not be a contest.  But, he didn’t – it was only 14-16.  You cannot have as much of an effect on your team (actual play, not motivational or anything) unless you are out there as often as possible.
***What the end result of this effectiveness points system showed is that those with average numbers, over 30+ starts, were as effective as, or slightly better/worse than, those with good numbers over 16-20 starts.***
If somebody makes only 14 starts in a season, it could be because he was injured for half of the season or was called up from the minors during the season, so he should not be penalized with negative points for that – he just should not be rewarded as highly as someone with 30+ starts.

  • if 30 or more starts, +5
  • if 25-29 starts, +3
  • if 20-24 starts, +2
  • if under 20 starts, 0
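A minimal sketch of the Games Started scale as a function follows. The function name is hypothetical, and treating exactly 30 starts as the top tier is an assumption based on the "30+ starts" wording in the text.

```python
def games_started_points(starts):
    """Points for raw Games Started; this category never goes negative."""
    if starts >= 30:   # assumption: exactly 30 counts as the top tier
        return 5
    if starts >= 25:
        return 3
    if starts >= 20:
        return 2
    return 0           # under 20 starts: no reward, but no penalty either


print(games_started_points(35))  # 5
print(games_started_points(14))  # 0
```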

INNINGS PITCHED
Just like Games Started, IP can only get you positive numbers, because a low raw number of IP can be attributed to injury or a midseason call-up.  Those with more IP get higher point totals, though.  The reason for the lowest point total for under 100 innings is that you were not necessarily a bad pitcher, but the lack of innings (whether due to injury or a call-up) limits the effectiveness.

  • if 230+, +8
  • if 220-229, +7
  • if 200-219, +5
  • if 150-199, +3
  • if 100-149, +2
  • if under 100, +1
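The Innings Pitched scale can be sketched the same way. The function name is illustrative, and fractional innings such as 229.2 are assumed to fall in the lower bracket because they do not reach the next threshold.

```python
def innings_pitched_points(innings):
    """Points for raw Innings Pitched, following the scale listed above."""
    if innings >= 230:
        return 8
    if innings >= 220:
        return 7
    if innings >= 200:
        return 5
    if innings >= 150:
        return 3
    if innings >= 100:
        return 2
    return 1  # under 100 innings still earns the minimum, never a negative


print(innings_pitched_points(232))  # 8
print(innings_pitched_points(95))   # 1
```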

IP/GAME
This is where negative numbers can begin.  If you were hurt, or called up from the minors, you are not penalized with negatives for the raw number of innings pitched or games started, but if you posted a high number of starts and low number of innings, this statistic will bite you in the rear.  IP/Game separates the hurt or called up from the downright below average or bad.  It also helps reward those with a couple fewer starts than others but with more raw innings pitched.  These types of pitchers were in the same GS range but some went deeper into games than others.  Nobody averaged over 7 IP/gm, so we start lower.

  • if 6.5-7 IP/gm, +7
  • if 6.0-6.49 IP/gm, +5
  • if 5.5-5.99 IP/gm, +3
  • if 5.0-5.49 IP/gm, 0
  • if below 5.0 IP/gm, -5

If you cannot average over 5 innings per game, or exactly 5 innings per game, you should not be a starting pitcher.  Even Adam Eaton averaged over 5 IP/gm in 2007.
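Here is a rough sketch of the IP/Game scale. It assumes the average is expressed in true decimal innings (total outs divided by three, divided by starts) and that each bracket includes its lower bound; both are assumptions made for illustration, as is the function name.

```python
def ip_per_game_points(ip_per_game):
    """Points for average innings per start; the first category that can go negative."""
    if ip_per_game >= 6.5:
        return 7
    if ip_per_game >= 6.0:
        return 5
    if ip_per_game >= 5.5:
        return 3
    if ip_per_game >= 5.0:
        return 0
    return -5  # cannot average even 5 innings per start


# Example: 620 outs over 32 starts is about 6.46 innings per start.
average = (620 / 3) / 32
print(ip_per_game_points(average))  # 5
```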
ADJUSTED QUALITY STARTS
Quality Starts can be an inaccurate statistic because it takes into account games in which a pitcher goes 6+ innings and gives up no more than 3 earned runs… and nothing else.
If a pitcher goes 8.1 innings and gives up 4 runs, it is arguably the same ratio and an equal game in terms of quality, but does not get counted as a quality start.
With that in mind, I came up with the stat of Adjusted Quality Starts, which takes into account all regular quality starts as well as games in which someone goes 7.2-9 innings and gives up no more than 4 runs.  This measures the true number of games in which a pitcher had a good-great performance.
***If you wonder why it is 7.2 IP, instead of 8, the number was derived from the amount of times a pitcher was lifted after 7.2 IP for a specialist, or other sort of reliever, and from the sheer low average of innings pitched/game by a starter this year.  Reaching the 7th inning is now a great feat, let alone coming within one out of finishing the 8th.  Though the previous ratio for a QS was 2:1, due to the data mentioned above, going an extra 1.2 IP to get to 7.2 IP merits being able to give up one more run.***
I used the percentage of AQS to the total number of Games Started to measure effectiveness in this area.  Someone over 75% almost always pitches a good-great game, whereas someone under 50% only pitches a good game less than half of the time – not very effective.

  • if AQS % is above 75%, +5
  • if AQS % is 67-74%, +3
  • if AQS % is 50-66%, 0
  • if AQS % is below 50%, -3

If you’re keeping score at home, AQS = 6+ IP with ER <= 3, OR 7.2+ IP with ER <= 4, where <= is the blog version of less than or equal to. 
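The AQS test and its points scale can be sketched as below. Innings are taken in box-score notation, where 7.2 means seven innings plus two outs, so they are converted to outs before comparing. The function names are hypothetical, and the handling of percentages that land exactly on 75% or between the listed ranges (for example 66.7%) is an assumption: they drop to the lower tier.

```python
def innings_to_outs(ip):
    """Convert box-score innings (7.2 = 7 innings + 2 outs) to total outs."""
    whole = int(ip)
    extra_outs = round((ip - whole) * 10)
    return whole * 3 + extra_outs


def is_aqs(ip, earned_runs):
    """Adjusted Quality Start: 6+ IP with ER <= 3, or 7.2+ IP with ER <= 4."""
    outs = innings_to_outs(ip)
    regular_quality_start = outs >= 18 and earned_runs <= 3
    long_start_four_runs = outs >= 23 and earned_runs <= 4
    return regular_quality_start or long_start_four_runs


def aqs_pct_points(aqs_count, games_started):
    """Points for AQS as a percentage of Games Started."""
    pct = 100.0 * aqs_count / games_started
    if pct > 75:       # assumption: exactly 75% does not reach the top tier
        return 5
    if pct >= 67:
        return 3
    if pct >= 50:
        return 0
    return -3


print(is_aqs(8.1, 4))          # True: past 7.2 IP with 4 ER
print(is_aqs(7.1, 4))          # False: short of 7.2 IP with 4 ER
print(aqs_pct_points(24, 32))  # 75% exactly -> 3 under the assumption above
```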
COMPLETE GAMES & SHUTOUTS
In addition to AQS, something that needs to be taken into account is how often a pitcher went for a complete game, since they are so rare.  We also need to take into account shutouts, since they occur even less often. 

  • For every CG, +2
  • For every SHO, additional +1

***NOTE: Aaron Harang had two games in 2007, one where he went 9 IP, and one where he went 10 IP, when he did not get a decision.  Even so, I am counting these 2 as a combined 1 CG, since he went 9+ innings.***
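In code form the complete game and shutout bonus is a one-liner. This sketch assumes every shutout is already counted among the complete games, which is how the "additional +1" reads; the function name is illustrative.

```python
def cg_sho_points(complete_games, shutouts):
    """+2 for every complete game, plus an additional +1 for every shutout."""
    return 2 * complete_games + 1 * shutouts


# Example: 3 complete games, 1 of them a shutout.
print(cg_sho_points(3, 1))  # 7
```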
WINS AND LOSSES (ADJUSTED)
W-L Records are the most deceiving statistics because they do not take into account the true quality of the games pitched.  Just because a pitcher goes 14-7 does not mean he was necessarily a great pitcher.  He could have pitched terribly and had great run support in 10 of 14 wins, but brilliantly with terrible run support in the 7 losses.
The whole point of the adjusted W-L record is to reward an AQS: if you pitched well, you should get credit for it, even if your team (offense or bullpen) does not help you.
After all, Ian Snell cannot control the Pirates’ offense.  It is not his fault that 4 of his 12 losses were “Tough Losses” and all 11 of his No-Decisions were games in which he pitched brilliantly and had an AQS, yet he received little to no offense to help garner him a ‘W’.
With that in mind, I changed W-L to the following 5 stats:

  • Cheap Wins: wins in which one does not get an AQS (-1)
  • Tough Losses: losses in which one does get an AQS (+2)
  • Legit Wins: wins in which one does get an AQS (+2)
  • Legit Losses: losses in which one does not get an AQS (-2)
  • ND-AQS: no-decisions in which one gets an AQS (+1)

I received some questions about how these numbers came to be.  To keep it simple, the outcomes that actually have an effect on the W-L record are weighted more heavily (negatively and positively) than outcomes like ND-AQS, which keep a pitcher from winning but do not hurt him with a loss.
ND-non AQS is not used here for the same reason that a Cheap Win is only worth negative one: not every Cheap Win or ND-non AQS was a terrible start.  A large share of them were games in which a pitcher had a good outing but only went 5 or 5.1 innings.   A Cheap Win costs you a point (not two, only one) because you do not get an AQS but it does affect your win-loss record.  An ND-non AQS means you do not get an AQS but it does not affect your win-loss record, which is why I decided to just leave it out.
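The five categories can be sketched as a small lookup in Python. The names below are hypothetical labels of my own, and the sample list is rebuilt from Ian Snell's season counts (9 legit wins, 4 tough losses, 8 legit losses, 11 ND-AQS) rather than from his actual game logs.

```python
# Point values from the list above; ND-non AQS is left out of the scoring.
POINTS = {
    "cheap_win": -1,
    "tough_loss": 2,
    "legit_win": 2,
    "legit_loss": -2,
    "nd_aqs": 1,
    "nd_non_aqs": 0,
}


def classify_start(decision: str, aqs: bool) -> str:
    """Map a decision ('W', 'L', or 'ND') plus an AQS flag to a category."""
    if decision == "W":
        return "legit_win" if aqs else "cheap_win"
    if decision == "L":
        return "tough_loss" if aqs else "legit_loss"
    if decision == "ND":
        return "nd_aqs" if aqs else "nd_non_aqs"
    raise ValueError(f"unknown decision: {decision!r}")


def decision_points(starts) -> int:
    """Total adjusted W-L points for an iterable of (decision, aqs) pairs."""
    return sum(POINTS[classify_start(d, a)] for d, a in starts)


# Season reconstructed from counts only, not from real game logs.
snell = [("W", True)] * 9 + [("L", True)] * 4 + [("L", False)] * 8 + [("ND", True)] * 11
print(decision_points(snell))
```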
WHIP
Though I am not too fond of this statistic and originally tinkered around with separately evaluating H/IP and BB/IP, using WHIP just seemed to make things easier.  Though it does not tell us which pitchers walk less and give up more hits, or vice versa, or tell us how many “empty innings” a pitcher had (innings where no baserunners got on), it does provide a valid average of baserunners to expect in a given game since it does not equate to a per-9 inning scale.

  • if WHIP 1.00-1.15, +3
  • if WHIP 1.16-1.25, +2
  • if WHIP 1.26-1.30, +1
  • if WHIP 1.31-1.40, 0
  • if WHIP above 1.40, -2
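Here is a minimal sketch of the WHIP tiers in Python. The list does not say what a WHIP below 1.00 earns, so giving it the top tier is my assumption, as is rounding to two decimals before comparing.

```python
def whip_points(whip: float) -> int:
    """Points for WHIP, using the tiers listed above."""
    w = round(whip, 2)
    if w <= 1.15:
        return 3   # assumption: a sub-1.00 WHIP also gets the top tier
    if w <= 1.25:
        return 2
    if w <= 1.30:
        return 1
    if w <= 1.40:
        return 0
    return -2


print(whip_points(1.33))
```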

K:BB RATIO
Instead of using K’s, I wanted to use the ratio of strikeouts to walks, since not every pitcher is a strikeout pitcher.  Even so, you do not have to be a strikeout pitcher to be an accurate one, and because of this I rewarded those with high K:BB ratios.  Greg Maddux only struck out 104 in 34 starts, but only walked 25 – a K:BB of 4.16.  This meant that Maddux kept more runners off-base by striking them out and not walking them.

  • if K:BB above 4, +7
  • if K:BB above 3, +5
  • if K:BB above 2, +3
  • if K:BB above 1, 0
  • if K:BB 1 or below, -3
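The K:BB tiers can be sketched the same way. The function name is my own, and treating a pitcher with zero walks as top tier is an assumption the list above does not cover.

```python
def kbb_points(strikeouts: int, walks: int) -> int:
    """Points for strikeout-to-walk ratio, using the tiers listed above."""
    if walks == 0:
        return 7   # assumption: no walks at all counts as the top tier
    ratio = strikeouts / walks
    if ratio > 4:
        return 7
    if ratio > 3:
        return 5
    if ratio > 2:
        return 3
    if ratio > 1:
        return 0
    return -3


# Greg Maddux: 104 strikeouts, 25 walks
print(round(104 / 25, 2), kbb_points(104, 25))
```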

EXAMPLE OF USAGE
Now that we have the points, let’s test it out and put it to use.  We will use Ian Snell and Carlos Zambrano.
The table below shows Ian Snell’s 2007 numbers and points he receives for each in my points system.

Stat       Value    Points
Starts     32       +5
Innings    208.0    +5
Cheap W    0        0
Tough L    4        +8
Legit W    9        +18
Legit L    8        -16
ND-AQS     11       +11
AQS %      75%      +5
IP/Game    6.52     +7
WHIP       1.33     0
K:BB       2.60     +3
CG         1        +2
SHO        0        0

When we add up all thirteen of these point values, we get Snell’s Effectiveness #, which comes to: +48.
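For anyone who wants to check the arithmetic, here is a short Python sketch that sums the point column from Snell's table. The dictionary is just a transcription of that table; the variable names are mine.

```python
# Point values transcribed from Ian Snell's 2007 table above.
snell_points = {
    "Starts": 5,
    "Innings": 5,
    "Cheap W": 0,
    "Tough L": 8,
    "Legit W": 18,
    "Legit L": -16,
    "ND-AQS": 11,
    "AQS %": 5,
    "IP/Game": 7,
    "WHIP": 0,
    "K:BB": 3,
    "CG": 2,
    "SHO": 0,
}

effectiveness = sum(snell_points.values())
print(f"{effectiveness:+d}")  # prints +48
```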
Now, let’s look at Carlos Zambrano’s season numbers in the table below and add his point totals up.

Stat       Value    Points
Starts     34       +5
Innings    216.1    +5
Cheap W    0        0
Tough L    2        +4
Legit W    18       +36
Legit L    11       -22
ND-AQS     0        0
AQS %      53%      0
IP/Game    6.36     +5
WHIP       1.34     0
K:BB       1.75     0
CG         1        +2
SHO        0        0

We look at his numbers and add up the totals to get his Effectiveness #: +35.
Zambrano had more legit wins but also more legit losses, and of Zambrano’s 3 no-decisions, none were ND-AQS, whereas of Snell’s 11 no-decisions, all were ND-AQS. 
That tells us what each pitcher’s record would look like if he got a win for every game he pitched well (an AQS) and a loss for every decision in which he did not, with the only no-decisions left being those in which he pitched poorly or did not go a full 6 IP:

  • Carlos Zambrano (18-13) would actually be 20-11
  • Ian Snell (9-12) would actually be 24-8
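The two hypothetical records above can be reproduced from the category counts in each pitcher's table. This is a sketch in Python with my own function name. It assumes that every AQS becomes a win, that non-AQS decisions (including Cheap Wins, of which neither pitcher had any) become losses, and that non-AQS no-decisions stay no-decisions.

```python
def adjusted_record(legit_w, tough_l, nd_aqs, legit_l, cheap_w):
    """Return (wins, losses) if every AQS were a win and every
    non-AQS decision were a loss."""
    wins = legit_w + tough_l + nd_aqs
    losses = legit_l + cheap_w
    return wins, losses


snell = adjusted_record(legit_w=9, tough_l=4, nd_aqs=11, legit_l=8, cheap_w=0)
zambrano = adjusted_record(legit_w=18, tough_l=2, nd_aqs=0, legit_l=11, cheap_w=0)
print(snell, zambrano)  # prints (24, 8) (20, 11)
```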

Snell went further into his games, had a better K:BB ratio, and had that higher AQS %.  It also tells us that of Snell’s 32 starts, 24 of them were of great quality, whereas Zambrano had 18 good-great starts and 16 average-bad starts.
This essentially tells us that while Zambrano’s good-great starts may have been better than Snell’s good-great starts, when Zambrano had his bad starts, Snell was still having good-great ones.
RESULTS
As mentioned before, I used this points system to evaluate 30 National League pitchers.  I compiled a group of spreadsheets, ranking the pitchers in order in different categories to show that certain stats we rely on do a bad job of proving effectiveness.
To view all of my results, click on the links below.  You can use this data in other areas, but please credit my work.

  • To see the list of pitchers and their statistics used to assign points, click here.
  • To see the list of pitchers in order of effectiveness points, click here.

I do not want to post a ridiculously long table in this article, so you will need to look at the linked files to see the full results, but I will list the top 16 pitchers (three are tied at +45) and their effectiveness points.

  1. Jake Peavy, +74
  2. Aaron Harang, +69
  3. John Smoltz, +69
  4. Brandon Webb, +67
  5. Cole Hamels, +65
  6. Brad Penny, +64
  7. Tim Hudson, +63
  8. Ted Lilly, +60
  9. Matt Cain, +52
  10. Roy Oswalt, +50
  11. Ian Snell, +48
  12. Bronson Arroyo, +47
  13. Derek Lowe, +47
  14. Greg Maddux, +45
  15. Adam Wainwright, +45
  16. Jeff Francis, +45

And, again, these points were assigned to statistics based on how strongly they correlate with effectiveness.  The points system essentially covers the statistics and averages from all angles.
CHRIS YOUNG
The most shocking part of this was how low Chris Young of the Padres came out.  Young went 9-8, with a 3.12 ERA, in 30 starts.  He should have been more effective, I thought, based on those numbers.  After looking at his game logs, though, I changed my mind and realized it made sense.
Over his 30 starts, he was essentially two different pitchers.  In the 19 starts in which he went 6+ innings, he was 9-1 with a 1.64 ERA, averaging 6.6 IP/gm, with a 0.85 WHIP and 129 K’s in 126.1 innings.
In the other 11 starts, he was 0-7, with a 7.14 ERA, only going 4.2 IP/gm, with a 1.76 WHIP, and 38 K to his 36 BB, in 46.2 innings.
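As a sanity check on those two splits, weighting each ERA by its innings should bring us back to the 3.12 season ERA. The Python sketch below does that. The helper names are mine, and the earned-run totals are backed out of the rounded ERAs, so they are approximations rather than numbers from his game logs.

```python
def ip_to_innings(ip: str) -> float:
    """Convert box-score innings ('126.1' = 126 innings + 1 out) to a float."""
    whole, _, frac = ip.partition(".")
    return int(whole) + int(frac or 0) / 3


def earned_runs(era: float, ip: str) -> float:
    """Approximate earned runs implied by an ERA over a given innings total."""
    return era * ip_to_innings(ip) / 9


good_er = earned_runs(1.64, "126.1")   # the 19 starts of 6+ innings
bad_er = earned_runs(7.14, "46.2")     # the other 11 starts
total_ip = ip_to_innings("126.1") + ip_to_innings("46.2")

season_era = 9 * (good_er + bad_er) / total_ip
print(round(total_ip, 1), round(season_era, 2))
```

The gap between the two halves is what the points system penalizes, even though the blended ERA looks excellent.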
After analyzing his situation and the points system I realized that my effectiveness model favors consistency and lower standard deviations (roughly, how far a pitcher’s individual starts tend to stray from his average).  To me, that truly defines effectiveness.
I would much rather have a guy who I knew would amass an AQS 67% or more of the time than a guy who might strike out 20 batters and pitch a two-hitter in one game, but give up 5 runs in 6 innings in each of the next three, before again pitching a brilliant game.
As long as the consistency is of a good nature, consistency in this model proves effectiveness.
CONCLUSION
I know, we’re finally at the end of the article, right?  I apologize for the length but it took this long to get everything across. 
Looking at Jake Peavy, the most effective NL pitcher at +74, we see that the only counted statistic in which he led was AQS.  Peavy had the most good-great starts of any NL pitcher.  While he may not have led in IP, IP/gm, K:BB ratio, or fewest losses (Brad Penny only had 1 legit loss), he led in consistency and being consistently good-great.
These results also suggest that Cole Hamels, had he made the 6 starts he missed due to injury, would likely have challenged Peavy for #1 in effectiveness.  However, as my model dictates, the fact that he missed those 6 starts and Peavy did not shows that Peavy was more effective.
Yes, there were more stats we could add to this, and more variables to account for, but I feel this accurately levels the field of play between pitchers in distinctly different playing situations, and levels the difference between 2007 reputation and 2007 actual performance.
I must remind you before I come to a close, though, that this is only a measure of effectiveness, not the end-all solution to determining who the “best” pitchers are.
However, for this Sabermetrician, effectiveness directly correlates with quality and value.

2007 Sabermetric Year in Review: Philadelphia Phillies

Here’s to the folks at MVN command central.  They’ve put a lovely graphic in the top right corner of the page through which you can go to a listing of all of the articles in this year in review series.  You’re about to read the #10 stop on our reverse-alphabetical tour (sorry, Atlanta), as we head to Philadelphia to look at the Phillies. 
Record: 89-73, 1st in NL East 
Pythagorean Projection (Patriot formula):  87.60 wins (892 runs scored,  821 runs allowed)
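For readers who want to reproduce that projection, here is a rough Python sketch. It assumes "Patriot formula" refers to the Pythagenpat variant, in which the exponent is runs per game raised to a constant. The 0.287 constant and the 162-game schedule are my assumptions, chosen because they land on roughly the 87.6 wins quoted above.

```python
def pythagenpat_wins(runs_scored: int, runs_allowed: int,
                     games: int = 162, k: float = 0.287) -> float:
    """Projected wins using a Pythagorean formula with a floating exponent."""
    runs_per_game = (runs_scored + runs_allowed) / games
    exponent = runs_per_game ** k   # assumed Pythagenpat exponent
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


print(round(pythagenpat_wins(892, 821), 1))
```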
Team Statistical Pages:
Baseball Reference
Baseball Prospectus
FanGraphs
MVN Blog:
Phanatic Phollow Up (I went to a college which was founded by a man named Philander… everything on campus that should have started with ‘F’ started with Ph… like Philander’s Phebruary Phling.)
More Phillies Resources:
Latest News
Contract Status
Trade Rumors
Overview: It took a monumental collapse by the Mets to put the Phillies in the playoffs, but let’s give credit where it’s due.  The Phillies won 16 of their last 22 games.  The Phillies started the year in the shadow of their much more expensive division mates, the Mets, although that was never really fair.
What went right: The Phillies somehow fashioned together a pitching staff out of what looked like a collection of spare parts.  Hamels is and has been legit.  Jamie Moyer cannot be explained by any rational process, and then there’s Kyle Kendrick.  Kendrick is 22, doesn’t have overpowering stuff, and basically has a job because he gets people to beat the ball into the ground.  Not a bad life for a pitcher, but he’ll only ever be as good as the defense behind him.
How on earth the Phillies did what they did with the “bullpen” (note the quotes) they had is beyond me.
Ryan Howard turned out to not be a fluke.  After his first full season in MLB turned out to be an MVP-worthy(?) season, the Philly Phaithful were phearing that he would be a phluke.  Nah.  He still hits fly balls, a lot of them leave the yard, and he still is a danger to strike out 200 times in a season.  Yes, Philly phans, his numbers were “down.”  Please don’t confuse that with “bad player.”  Home runs and strikeouts go together like peas and carrots.  Take a look at the all-time leaders in strikeouts.  See a few names on there that you like?  Howard is a prodigious HR hitter.  It’s probably difficult to watch him knowing that it’s 4 times more likely that he’ll strike out than hit a HR, but the HR are worth their weight in gold.  Remember, the best strategies are not always the ones that make you feel the best when you use them.  Howard is a prime example of this very theory.
What went wrong: For a team with two guys on the infield who have won the last two NL MVPs (and Chase Utley, who’s better than either of them), third base is a rather sore spot in the Phillies organization.   Wes Helms and Abraham Nunez were well below acceptable (let’s not even go near “replacement level”).  The Phillies had a third base prospect in their system, Michael Costanzo, who tore up AA ball last year.  He was sent to Houston for Brad Lidge (a good pickup; see below), but ummm…
Speaking of the Phillies bullpen though, let’s talk about J.C. Romero.  Why is he in the “What went wrong” category?  The Phillies just inked him to a three year, $12M contract.  Here’s what they get for that $4M per year.  Granted that Romero had a good half of a year last year when he came to Philadelphia, but the guy had a crazy low BABIP during his time in Philly, and he walks 6.39 per nine innings.  He’s not a strikeout pitcher, and he’s at best a LOOGY.  He’s not a setup/closer caliber pitcher, and he’s certainly not worth $4M per year. 
Yeah, that about sums it up: Pat Burrell, metaphor?  The Phillies up to the All-Star break: 44-44.  After: 45-29.  Burrell’s splits: pre-break, .786 OPS; post-break 1.010 OPS.  (smell that ecological fallacy?)  In fact, if you look at Burrell’s splits, especially by month, you can see that his BABIP varied wildly from month-to-month.  What’s eerie about how it ended is that his 2007 stats were a near copy of his 2006 stats.  Creepy.
Was Jimmy Rollins even the Most Valuable Phillie?: Jimmy Rollins was the Most Valuable Player in the NL?  Rollins?  He had a really really good year, no doubt.  But, he got the award based on some very gaudy raw hit totals.  Rollins did the 20-20-20-20 thing this year with at least 20 doubles, triples, HR, and SB.  (Oddly enough, Curtis Granderson, who did the same thing, finished in tenth place in the AL voting.)  Rollins played all 162 games, usually hitting leadoff on a very good offense (tops in the NL in runs scored).  It led to his amassing 715 AB and 778 PAs, both best in the league.  He had more chances to put up counting stats than anyone else in baseball.  Again, this is not to say he had a bad year, but I’m curious as to how he was the best player in the NL when statistically, he wasn’t the best player on his own team.  Chase Utley was.  Utley out-VORPed Rollins, had a better on-base percentage (Rollins rarely walks), and had a better slugging percentage than Rollins.  (Rollins did barely nick Utley in ISO, .235 to .234.)  Utley had more WPA and context neutral WPA.  Utley was better in RC/27 by about three RC’s.  Utley’s big crime was that he broke his hand toward the end of the season.
But, OK.  Rollins played shortstop, and it’s harder to find a good shortstop, or something like that, you will surely say.  If we want to play that game, then the award should have gone to Hanley Ramirez, who led Rollins in all of the above categories with the exception of raw WPA.  Ramirez even stole more bases than Rollins (51-41).  (To be fair, Ramirez was one of the worst-fielding shortstops in baseball, and Rollins was among the better.)  I’m finding it hard to justify Rollins as MVP on this one.  Rollins was a valuable player, just not the most valuable one in the NL this year.  Or on the Phillies.
Someone in Philadelphia understands the basics of statistics (and psychology!): A small meditation on Brad Lidge, one of the new Phillies.  Here’s what happens when a manager uses his emotions to manage the game rather than logic.  In 2005, Lidge gave up that home run to Albert Pujols and then didn’t do so well in the World Series.  Suddenly, he wasn’t a very good pitcher.  Under the belief that Lidge should be judged by the fact that the best hitter in the game of baseball had hit a home run off of him in a single at-bat, rather than… well, everything else Lidge had done up to that point and since, Lidge was unfairly devalued in Houston.  Detractors point to his 2006 season as proof that Lidge had “lost it,” conveniently passing over his 2007 season.  His walk and HR rates were up in both years.  In 2007, the home run rate had more to do with the fact that he was giving up more fly balls, rather than line drives, and he lived in the “Juice Box”.  His walk rate being up is what it is.  2005 was probably an outlier season for Lidge and he’s not likely to match that again, but again, this is not the same thing as Lidge being a bad pitcher.  He’s still a strikeout machine and that’s a pretty good thing to have in a reliever.  Someone in Philadelphia understands that, and took advantage of the fact that the Astros had a broken heart over this guy.
Let’s even ignore the fact that Lidge, since his little “accident” in 2005, has been a really good pitcher.  He still has the perception of having a case of the yips, so let’s for a moment pretend that he was greatly psychologically damaged by giving up that home run.  How long does it take most people to adjust to major life events (and here, I mean things like a spouse dying suddenly)?  Usually about 6 months.  (The psychologist in me should point out that sometimes, it does take more than 6 months, and if it does, the best thing to do is to seek treatment.  End PSA.)  It’s been two years, folks.  While some Astros fans may not have gotten over that night, it’s likely that Lidge has.  Philadelphia, enjoy your new pitcher.  He’s going to make you one heck of a closer.
Although given what the Phillies signed Romero for… maybe they don’t understand the basics of statistics.
Mensches: No, this has nothing to do with the part-time Brewers outfielder named Kevin.  Mensch is a Yiddish word for an upstanding man of honor.  The Phillies deserve special mention this year for an act that goes well beyond baseball.  They deserve a Mensch of the Year award.  During a game in Colorado, there was a storm a’brewin’ (again, not the former Phillies outfielder), and the Rockies ground crew was scrambling to put the tarp on the field.  The storm got a little crazy and a wind gust picked up one of the guys, who was trying to hold down the tarp, and threw him into the air.  Thankfully, he wasn’t hurt, but it was clear that this was a dangerous situation with 8 guys trying to fight the elements and the tarp to get it on the field.  The Phillies players ran out of the dugout to help out.  Lest we believe that life is a collection of numbers, here’s to the Phillies for a simple act of kindness.  They got enough karma going for them that the Mets collapsed in front of them and ended up letting them into the playoffs… where they lost to… well, Colorado of all teams.  Funny how life works like that.
Outlook: Well, Aaron Rowand has gone off to San Francisco, I suppose to be replaced by the Flyin’ Hawaiian himself, Shane Victorino.  It’s never good to lose a 50+ VORP player, but that’s life.  Brett Myers goes back into the rotation with the arrival of Lidge.  The Phillies still have some legit star power, but are lacking in a supporting cast.  For a playoff team, the Phillies seem to have an awful lot of holes.  But in the three-team NL East division that is the ever-underachieving Mets, and the living-off-their-laurels Braves, what’s to say that the Phillies don’t return to the playoffs next year?