Zito Gaining Speed?

Warning: Low(ish) Content

When running some pitch f/x numbers for an upcoming piece, I noticed Barry Zito’s fastball has been rising in velocity.
When Zito was having success with the A’s, his fastball was around 87 mph. It had dipped before he got to the Giants, but it has started to rise again.
Average lines are calculated approximations (if that makes any sense).

The numbers:
2007: 84.8 mph
2008: 85.1
2009: 86.3
One thing to note is that the velocity of his fastball has been dropping steadily over his last few starts. Not by a lot, but it is there.
Not sure if this even really means anything, but with all the talk about Zito today, it’s a cool thing to look at.

The Name Game

Growing up in Philadelphia, and raised in an extreme sports environment, Jayson Stark has always been an idol of mine. In fact it was reading his Philadelphia Inquirer column every week that eventually propelled me into sabermetrics. His columns always combined humor and statistics in order to show all of the hilarious or newsworthy baseball happenings that could not be seen on an ESPN show. Not shocking in the least, ESPN eventually brought him onboard. That being said, I thought I would do my sports-writing idol proud by writing an article in a style similar to his.
The idea for this came to me when the Phillies signed Chad Durbin to be their: (circle the correct answer)

  • A) 5th Starter
  • B) 6th Starter
  • C) Mop-Up Reliever
  • D) Waste of Space
  • E) Who cares, we have Adam Eaton!?

Regardless of the answer you selected, this now gave the Phillies Chad Durbin and J.D. Durbin – two completely unrelated Durbins. Now, it isn’t as if we’re talking about two guys with the last name of Smith. I never knew “Durbin” was a last name until a couple of years ago and now there are not only two in major league baseball but two on the same team?
Interestingly enough, there have been only four Durbins in the history of major league baseball, and the other two ended their careers in or before 1909. The only two Durbins in the last 98 seasons of major league baseball are now on the same team, and they have no relation to one another.
The Phillies acquired J.D. Durbin after the Diamondbacks placed him on waivers in April. Durbin had appeared in one game for Arizona and surrendered 7 hits and 7 runs in 2/3 of an inning. For the Phillies, Durbin was somewhat serviceable, even throwing a complete game shutout against the Padres.
J.D. Durbin made his Phillies debut on June 29th during the first game of a double-header against the Mets.
At the time of acquiring J.D. Durbin, the Phillies had a minor league prospect with the name J.A. Happ. Due to rotation injuries, Happ made his first major league start on June 30th, against the Mets.
That would be odd enough on its own, but the Phillies also acquired J.C. Romero from the Red Sox. Romero also made his Phillies debut on June 29th, during the second game of Durbin’s double-header.
So, to recap, not only did the Phillies have three pitchers with the first names of J.A., J.C., and J.D., but all three of them made their Phillies debuts within the span of 48 hours from June 29th-June 30th!
And, speaking of the Phillies, they acquired Tad Iguchi from the White Sox towards the end of the season. Since he would not have been able to play for the Phillies until May 15th, if he re-signed with them, he went elsewhere (Padres). The Phillies, in need of another bench player, decided to sign So Taguchi. I guess this way the transition will be easier for the players.
Or how about the Twins deciding to replace Luis Castillo with Alexi Casilla.

  • Believe it or not, the American League had an Ellis, an Ellison, and an Ellsbury.  And no, they were not Dale, Pervis, or Doughboy.
  • The Athletics had Dan Haren and Rich Harden.
  • The American League also had a Joakim, a Joaquin, and a Johan.  That’s never happened before with different players.
  • Lastly, there was the Rays’ Delmon Young and the Dodgers’ Delwyn Young, who sadly never got to face each other.

Speaking of “Youngs,” the NL West not only had two of them, but two Chris Youngs.  They could not be more different, either, as one is a 9-ft tall, white, former Ivy League pitcher and the other is a 6-ft, black, college-less outfielder.  Pitcher Chris Young (PCY for those keeping track) won the 2007 battle as his younger counterpart went 0-10, with a walk and 4 K’s against him.

  •  Orlando Hudson went 2-11, with an RBI and 4 BB, against his “River” counterpart Tim Hudson.
  • Unfortunately, Reggie Abercrombie never got to face Jesse Litsch.  I wonder what Sportscenter would call that matchup.  Reggie and Jesse?  Reggie and Litsch?  Abercrombie and Jesse?  Ugh, who knows…
  • Aaron Rowand and Robinson Cano didn’t face each other this past year either.
  • Somehow, the Blue Jays and Rockies have played nine times and we are still waiting on a Halladay/Holliday matchup.
  • Scott Baker didn’t pitch against, or to, Paul Bako in 2007, though my fingers are crossed for 2008.

Mike Lamb is 3-9 in his career against Adam Eaton (who isn’t?) as well as 1-7 off of Todd Coffey.
Coffey and Lamb usually don’t go well together, though, but Felix Pie is also 0-1 off of the caffeinated one.
Eaton has never gotten to face Pie yet.  I’d like to put a pie in Eaton’s face.  3 yrs and 24 mil worth of pies!
In what would probably cause the universe to crumble, I am patiently awaiting a Rick VandenHurk vs. Todd Van Benschoten matchup.  I’m feeling 2008 or 2009.
In the long-name department, Jarrod Saltalamacchia went 1-2 against Andy Sonnanstine.  Salty also went 0-2 against Mark Hendrickson.  He went 1-1 against Ryan Rowland-Smith, but Ryan had two last names to reach eleven letters and therefore had an unfair advantage.
Easily the most hypocritical name award goes to Angel Pagan.  You can figure that one out.  Did you know, though, that the National League had “Two Wise Men”?  That’s right – Matt and Dewayne.
Though Matt Wise surrendered a hit to Angel Pagan, he struck out Dewayne Wise, proving what we already knew – Matt Wise is the smartest pitcher ever.
On a sad note, 2007 proved to be a disappointment in the generic name field (not Nate Field or Josh Fields).  Combined, there were only four Smiths: Jason, Joe, Matt, and Seth.
Even sadder, we only had three Williamses: Dave, Jerome, and Woody.  Scott Williamson tried his hardest but that does not count.  Could be a cool sitcom title: Three Williams and a Williamson.
Major League Baseball spanned the endpoints of the life cycle this year.  On one side we had Alan Embree (embryo) and Omar Infante (infant) and on the other there were Jermaine Dye (die) and Manny Corpas (corpse).
Dye has never faced Corpas but is 2-7 in his career off of Embree.  Infante has also never faced Corpas but has doubled in 4 at-bats against Embree.
Jorge de la Rosa and Eulogio de la Cruz did not face each other this year despite being the only two “of-the” names.  And, just to clarify for the none of you who asked, Valerio de los Santos would not qualify for this category since de los would technically be “of-them” or “of-those.”
Miguel Cairo has long been the MVP of this group but he welcomed two additions this year in the forms of Ben Francisco and Frank Francisco.  I had always thought of Francisco as a Spanish first name but was very surprised to find it as an American last name.  In fact, if you say Ben Francisco really quickly and in front of a drunk, it could even sound like San Francisco.
I recently got an original NES and could not help but notice that two major leaguers sound like items from a Zelda game.  Don’t both of these sentences make sense?

  1. Link, to defeat Ganon, you must hit him in the lower Velandia.
  2. Use your Verlander to blow up the stones blocking the entrance.

One of my favorite movies is Sinbad’s Houseguest, and whenever I hear the name of Giants’ 2B Kevin Frandsen I am reminded of Sinbad’s character Kevin Franklin.  Something tells me Frandsen never impersonated a dentist.
In addition to everyone else we had six players with job names.  Chris Carpenter and Lee Gardner maintained the stadiums and fields, Scott Proctor made sure they didn’t cheat, Skip Schumaker supplied them all with cleats, while Matt Treanor helped rehab Torii Hunter.
Schumaker did not face Carpenter, Gardner, or Proctor.  Treanor is 1-3 off of Carpenter in his career.  Hunter was 3-6 with a HR and 2 RBI off of Carpenter (career), as well as 2-6 with an RBI off of Proctor.
Clearly, a Hunter is more valuable than a Proctor and a Carpenter.
Point blank – the following names sound incredibly made up and fake:

  • Frank Francisco
  • Dave Davidson
  • Emilio Bonifacio
  • Rocky Cherry

When primitive men first began to speak it was easiest to combine two words together without any intermediates.  Thousands of years later we still have names like Grady Sizemore, Jarrod Washburn, Mark Bellhorn, and Chris Bootcheck.
Speaking of Chris Bootcheck, I wonder what he and Jon Knotts would talk about.
In the anatomy field, Rick Ankiel and Brandon Backe were in the same division, with Ankiel going 0-3 with an RBI off Backe.

  • DIRTY NAME AWARD – Rich (Dick) Harden
  • ACADEMY AWARD – Sean Henn
  • LED ZEPPELIN AWARD – Scott Kazmir
  • FUTURE PIZZA SHOP NAME AWARD – Doug Mirabelli (hon. mention – Mike Piazza)
  • FICTIONAL SERIAL KILLER AWARD – Mike Myers (as usual)
  • NAME TYPO AWARD – Jhonny Peralta
  • MOST FUN TO SAY AWARD – Jonathan Albaladejo
  • IMPERVIOUS AWARD – (tie) James Shields and Scot Shields

And there you have it.  We covered the life cycle, the entertainment (regular and adult) industry, jobs, cities, the bible, and more.
We can only hope that 2008 will finally bring us a VandenHurk/Van Benschoten or a Holliday/Halladay.
Keep your fingers crossed.

2007 NL Starting Pitching Analysis

When it comes to analyzing and comparing pitchers, those conducting the comparisons will often find themselves in a tricky situation.  Sure, certain pitchers are better than others, but what are they specifically better at? 

How can we conduct an honest analysis when there are so many variables to consider?  And how can we truly determine which pitchers were better than others when some are on terrible teams with no run support and others are on tremendous teams with tons of run support?
The first step is to determine what we are measuring.  If we want to know who the best strikeout pitcher is, we should look at the raw strikeout total and also the K/IP rate, since some guys will make fewer starts than others.  To figure out who walks the least, we measure the number of walks each pitcher gives up and his BB/IP rate.
These measurements are contingent on one category, though, and cannot tell us who is better or more effective than the rest.  All of the research and ideas presented in this article are designed to measure the “effectiveness” of a pitcher. 
In order to determine this effectiveness, a whole heck of a lot of numbers need to be measured and properly weighted/scaled so that everybody has a fair shot – whether or not they are on a great team.
I took the 1-3 best pitchers from each National League team and entered their statistics into a database, measuring everything from their raw Innings Pitched totals to their Adjusted Quality Start % (you’ll read more on that below).  After entering all of the statistics, and crunching numbers until my brain turned to mush, I came up with my weighted points system.  I assigned the corresponding point totals and added everything up to determine what I feel is a very accurate measurement of pitching effectiveness amongst the NL’s best. 
This was not applied to every single NL Pitcher in 2007 (I will do that another time) but rather amongst these 30 selected #1, #2, or #3 starters.  For instance, a guy like Jeff Suppan may have been more effective than Jason Bergmann but I wanted to have at least one person from each team.
The system is not 100% perfect and does not take into account every single statistic (do you know how many statistics there are??), but it definitely levels the playing field between those on good or bad teams, those injured/called up or just plain bad, and those who got lucky or unlucky with run support.  The points are assigned based on the areas I, as an intense student of the game, feel are most important to determine true effectiveness. 
The basic idea of this system is to measure the true quality of a pitcher over his season, i.e., what would happen if a pitcher were rewarded every time he pitched well and discredited every time he pitched poorly, something that happens perfectly just about 0% of the time.
We will begin by going over the statistics involved, what their points scales are, and why they are used.  The idea behind these point totals is to properly weight the areas that most people intuitively associate with success and quality.
The points given to each statistical subset are designed to separate the aces from the workhorses and the workhorses from the seemingly replacement-level pitchers.  They may seem arbitrary and could be replaced with different numbers, or fractions/decimals, but the gaps between the point values within each subset were based on the number of pitchers who fall into each category.
In order to be as effective as possible, a pitcher needs to make as many starts as he can.  How can we say that a pitcher with 14 starts is more effective than one with 34-35, even if his numbers in those 14 starts are tremendous and the numbers of the one with 34-35 are a bit worse?  His numbers may be better than those of the pitcher with 35 starts, but the latter pitcher was involved in 21 more games and proved to be durable enough to pitch an entire season, and solid enough to maintain his SP status for 162 games.
This does not mean that a pitcher with 35 starts is necessarily “better” than one with 14-16, but rather he is more effective because he is involved in more of his team’s season. 
If the pitcher with 14-16 starts posted the same numbers in 32 starts, it would not be a contest.  But, he didn’t – it was only 14-16.  You cannot have as much of an effect on your team (actual play, not motivational or anything) unless you are out there as often as possible.
***What the end result of this effectiveness points system showed is that those with average numbers over 30+ starts were as effective as, or only slightly better or worse than, those with good numbers over 16-20 starts.***
If somebody makes only 14 starts in a season, it could be because he was injured for half of the season or was called up from the minors during the season, so he should not be penalized with negative points for that – he just should not be rewarded as highly as someone with 30+ starts.

  • if 30 or more starts, +5
  • if 25-29 starts, +3
  • if 20-24 starts, +2
  • if under 20 starts, 0

Just like Games Started, IP can never cost you points, because a low raw number of IP can be attributed to injury or a midseason call-up.  Those with more IP get higher point totals, though.  The reason for the minimal points for under 100 innings is that you were not necessarily a bad pitcher, but the lack of innings (whether due to injury or a call-up) limits your effectiveness.

  • if 230+, +8
  • if 220-229, +7
  • if 200-219, +5
  • if 150-199, +3
  • if 100-149, +2
  • if under 100, +1

This is where negative numbers can begin.  If you were hurt, or called up from the minors, you are not penalized with negatives for the raw number of innings pitched or games started, but if you posted a high number of starts and a low number of innings, this statistic will bite you in the rear.  IP/Game separates the hurt or called up from the downright below average or bad.  It also helps reward those with a couple fewer starts than others but with more raw innings pitched.  These types of pitchers were in the same GS range but some went deeper into games than others.  Nobody averaged over 7 IP/gm, so we start lower.

  • if 6.5-7.0 IP/gm, +7
  • if 6.0-6.49 IP/gm, +5
  • if 5.5-5.99 IP/gm, +3
  • if 5.0-5.49 IP/gm, 0
  • if below 5.0 IP/gm, -5

If you cannot average at least 5 innings per game, you should not be a starting pitcher.  Even Adam Eaton averaged over 5 IP/gm in 2007.
The Quality Start can be an inaccurate statistic because it only counts games in which a pitcher goes 6+ innings and gives up no more than 3 earned runs… and nothing else.
If a pitcher goes 8.1 innings and gives up 4 runs, it is arguably the same ratio and an equal game in terms of quality, but does not get counted as a quality start.
With that in mind, I came up with the stat of Adjusted Quality Starts, which takes into account all regular quality starts as well as games in which someone goes 7.2-9 innings and gives up no more than 4 runs.  This measures the true number of games in which a pitcher had a good-great performance.
***If you wonder why it is 7.2 IP, instead of 8, the number was derived from the number of times a pitcher was lifted after 7.2 IP for a specialist, or other sort of reliever, and from the sheer low average of innings pitched/game by a starter this year.  Reaching the 7th inning is now a great feat, let alone coming within one out of finishing the 8th.  Though the previous ratio for a QS was 2:1, due to the data mentioned above, going an extra 1.2 IP to get to 7.2 IP merits being able to give up one more run.***
I used the percentage of AQS to the total number of Games Started to measure effectiveness in this area.  Someone over 75% almost always pitches a good-great game, whereas someone under 50% only pitches a good game less than half of the time – not very effective.

  • if AQS % is 75% or above, +5
  • if AQS % is 67-74%, +3
  • if AQS % is 50-66%, 0
  • if AQS % is below 50%, -3

If you’re keeping score at home, AQS = 6+ IP with ER =< 3, OR 7.2+ IP with ER =< 4, where =< is the blog version of less than or equal to.
In addition to AQS, we need to take into account how often a pitcher threw a complete game, since they are so rare.  We also need to take into account shutouts, since they occur even less often.
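For anyone who would rather check a game log with code than by hand, here is a minimal sketch of the AQS test. The function names are mine, and it assumes innings are written in box-score notation, where "7.2" means seven innings plus two outs.

```python
def innings_to_outs(ip: str) -> int:
    """Convert box-score innings ("7.2" = 7 innings + 2 outs) into total outs."""
    whole, _, frac = ip.partition(".")
    extra = int(frac) if frac else 0
    if extra not in (0, 1, 2):
        raise ValueError(f"invalid innings notation: {ip!r}")
    return int(whole) * 3 + extra


def is_aqs(ip: str, earned_runs: int) -> bool:
    """Adjusted Quality Start: 6+ IP with ER <= 3, or 7.2+ IP with ER <= 4."""
    outs = innings_to_outs(ip)
    regular_qs = outs >= innings_to_outs("6.0") and earned_runs <= 3
    long_outing = outs >= innings_to_outs("7.2") and earned_runs <= 4
    return regular_qs or long_outing


print(is_aqs("6.0", 3))  # True: a regular quality start
print(is_aqs("8.1", 4))  # True: the 8.1 IP, 4-run game a plain QS misses
print(is_aqs("7.1", 4))  # False: one out short of the 7.2 IP cutoff
```

Working in outs rather than decimal innings avoids treating 7.2 as "seven and two tenths."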

  • For every CG, +2
  • For every SHO, additional +1

***NOTE: Aaron Harang had two games in 2007, one where he went 9 IP, and one where he went 10 IP, when he did not get a decision.  Even so, I am counting these 2 as a combined 1 CG, since he went 9+ innings.***
W-L Records are the most deceiving statistics because they do not take into account the true quality of the games pitched.  Just because a pitcher goes 14-7 does not mean he was necessarily a great pitcher.  He could have pitched terribly and had great run support in 10 of 14 wins, but brilliantly with terrible run support in the 7 losses.
The whole point of the adjusted W-L record is to reward an AQS: if you get one, you pitched well and should be credited for it, even if your team (offense or bullpen) does not help you.
After all, Ian Snell cannot control the Pirates’ offense.  It is not his fault that 4 of his 12 losses were “Tough Losses” and all 11 of his No-Decisions were games in which he pitched brilliantly and had an AQS, yet he received little to no offense to help garner him a ‘W’.
With that in mind, I changed W-L to the following 5 stats:

  • Cheap Wins: wins in which one does not get an AQS (-1)
  • Tough Losses: losses in which one does get an AQS (+2)
  • Legit Wins: wins in which one does get an AQS (+2)
  • Legit Losses: losses in which one does not get an AQS (-2)
  • ND-AQS: no-decisions in which one gets an AQS (+1)
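Here is a rough sketch of how a single start could be sorted into these five buckets and scored. The names are my own, and each start is assumed to arrive as a decision ("W", "L", or "ND") plus a flag saying whether it was an AQS. The ND-non AQS bucket is included only so every start has a home; it is worth 0 because the system does not score it.

```python
POINTS = {
    "cheap_win": -1,
    "tough_loss": 2,
    "legit_win": 2,
    "legit_loss": -2,
    "nd_aqs": 1,
    "nd_non_aqs": 0,  # not part of the points system, so it scores nothing
}


def classify_start(decision: str, aqs: bool) -> str:
    """Map a decision ("W", "L", "ND") and an AQS flag to one of the buckets."""
    if decision == "W":
        return "legit_win" if aqs else "cheap_win"
    if decision == "L":
        return "tough_loss" if aqs else "legit_loss"
    if decision == "ND":
        return "nd_aqs" if aqs else "nd_non_aqs"
    raise ValueError(f"unknown decision: {decision!r}")


def record_points(starts) -> int:
    """Sum the adjusted W-L points over (decision, aqs) pairs."""
    return sum(POINTS[classify_start(decision, aqs)] for decision, aqs in starts)


# Ian Snell's 2007: 9 legit wins, 4 tough losses, 8 legit losses, 11 ND-AQS.
snell = ([("W", True)] * 9 + [("L", True)] * 4
         + [("L", False)] * 8 + [("ND", True)] * 11)
print(record_points(snell))  # 18 + 8 - 16 + 11 = 21
```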

I received some questions about how these numbers came to be.  To keep it simple, the statistics that actually have an effect on the W-L record are valued higher (negatively and positively) than statistics like ND-AQS, which keep a pitcher from winning but do not hurt him with a loss.
ND-non AQS is not used here for the same reason that Cheap Wins is only negative one, which is that not every Cheap Win or ND-non AQS was a terrible start.  A large bulk of them were games in which a pitcher had a good outing but only went 5 or 5.1 innings.  A Cheap Win costs you a point (not two, only one) because you do not get an AQS but it does affect your win-loss record.  ND-non AQS means you do not get an AQS but it does not affect your win-loss record, which is why I decided to just leave it out.
Though I am not too fond of WHIP, and originally tinkered with evaluating H/IP and BB/IP separately, using it just seemed to make things easier.  It does not tell us which pitchers walk fewer batters but give up more hits, or vice versa, or how many “empty innings” a pitcher had (innings where no baserunners got on), but it does provide a valid average of baserunners to expect per inning, since it is not scaled to a per-9-inning basis.

  • if WHIP 1.00-1.15, +3
  • if WHIP 1.16-1.25, +2
  • if WHIP 1.26-1.30, +1
  • if WHIP 1.31-1.40, 0
  • if WHIP above 1.40, -2

Instead of using K’s, I wanted to use the ratio of strikeouts to walks, since not every pitcher is a strikeout pitcher.  Even so, you do not have to be a strikeout pitcher to be an accurate one, and because of this I rewarded those with high K:BB ratios.  Greg Maddux only struck out 104 in 34 starts, but only walked 25 – a K:BB of 4.16.  This meant that Maddux kept more runners off-base by striking them out and not walking them.

  • if K:BB above 4, +7
  • if K:BB above 3, +5
  • if K:BB above 2, +3
  • if K:BB above 1, 0
  • if K:BB 1 or below, -3

Now that we have the points, let’s test it out and put it to use.  We will use Ian Snell and Carlos Zambrano.
The table below shows Ian Snell’s 2007 numbers and points he receives for each in my points system.

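| Stat | 2007 | Points |
|---|---|---|
| Starts | 32 | +5 |
| Innings | 208.0 | +5 |
| Cheap W | 0 | 0 |
| Tough L | 4 | +8 |
| Legit W | 9 | +18 |
| Legit L | 8 | -16 |
| ND-AQS | 11 | +11 |
| AQS % | 75% | +5 |
| IP/Game | 6.52 | +7 |
| WHIP | 1.33 | 0 |
| K:BB | 2.60 | +3 |
| CG | 1 | +2 |
| SHO | 0 | 0 |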

When we add up all of these point values, we get Snell’s Effectiveness #, which comes to: +48.
Now, let’s look at Carlos Zambrano’s season numbers in the table below and add his point totals up.

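| Stat | 2007 | Points |
|---|---|---|
| Starts | 34 | +5 |
| Innings | 216.1 | +5 |
| Cheap W | 0 | 0 |
| Tough L | 2 | +4 |
| Legit W | 18 | +36 |
| Legit L | 11 | -22 |
| ND-AQS | 0 | 0 |
| AQS % | 53% | 0 |
| IP/Game | 6.36 | +5 |
| WHIP | 1.34 | 0 |
| K:BB | 1.75 | 0 |
| CG | 1 | +2 |
| SHO | 0 | 0 |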

We look at his numbers and add up the totals to get his Effectiveness #: +35.
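To double-check both totals, here is a sketch of the whole points system in one place. It is only my reading of the scales above, not an official implementation, and the names are mine. It leans on three assumptions: exactly 30 starts earns the top +5, an AQS % of exactly 75 earns +5 (which is how Snell's line is scored), and a WHIP below 1.00 earns the same +3 as the best listed tier.

```python
from dataclasses import dataclass


@dataclass
class Season:
    starts: int
    innings: float      # buckets are whole innings, so box-score decimals are harmless
    ip_per_game: float
    aqs_pct: float      # percent, e.g. 75 for 75%
    whip: float
    k_bb: float
    cg: int
    sho: int
    cheap_w: int
    tough_l: int
    legit_w: int
    legit_l: int
    nd_aqs: int


def at_least(value, cutoffs, default):
    """Points for the first (minimum, points) pair that value reaches."""
    for minimum, points in cutoffs:
        if value >= minimum:
            return points
    return default


def above(value, cutoffs, default):
    """Same idea, but value must be strictly above the cutoff (used for K:BB)."""
    for floor, points in cutoffs:
        if value > floor:
            return points
    return default


def effectiveness(s: Season) -> int:
    total = at_least(s.starts, [(30, 5), (25, 3), (20, 2)], 0)
    total += at_least(s.innings, [(230, 8), (220, 7), (200, 5), (150, 3), (100, 2)], 1)
    total += at_least(s.ip_per_game, [(6.5, 7), (6.0, 5), (5.5, 3), (5.0, 0)], -5)
    total += at_least(s.aqs_pct, [(75, 5), (67, 3), (50, 0)], -3)
    # Lower WHIP is better, so negate it to reuse the same helper.
    total += at_least(-s.whip, [(-1.15, 3), (-1.25, 2), (-1.30, 1), (-1.40, 0)], -2)
    total += above(s.k_bb, [(4, 7), (3, 5), (2, 3), (1, 0)], -3)
    total += 2 * s.cg + s.sho  # +2 per CG, an additional +1 per SHO
    total += -s.cheap_w + 2 * s.tough_l + 2 * s.legit_w - 2 * s.legit_l + s.nd_aqs
    return total


snell = Season(starts=32, innings=208.0, ip_per_game=6.52, aqs_pct=75, whip=1.33,
               k_bb=2.60, cg=1, sho=0, cheap_w=0, tough_l=4, legit_w=9,
               legit_l=8, nd_aqs=11)
zambrano = Season(starts=34, innings=216.1, ip_per_game=6.36, aqs_pct=53, whip=1.34,
                  k_bb=1.75, cg=1, sho=0, cheap_w=0, tough_l=2, legit_w=18,
                  legit_l=11, nd_aqs=0)
print(effectiveness(snell), effectiveness(zambrano))  # 48 35
```

The inputs are taken straight from the two tables, so the totals match the +48 and +35 shown.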
Zambrano had more legit wins but also more legit losses.  None of Zambrano’s 3 no-decisions were ND-AQS, whereas all 11 of Snell’s were.
Suppose each pitcher got a win for every game he pitched well (an AQS), a loss for every decision without one, and kept a no-decision only for no-decisions in which he pitched poorly or did not go a full 6 IP.  Their records would then look like this:

  • Carlos Zambrano (18-13) would actually be 20-11
  • Ian Snell (9-12) would actually be 24-8
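Those "would actually be" records fall straight out of the five adjusted W-L counts. A small sketch, with a function name of my own, and assuming (my reading of the rule) that a cheap win turns into a loss because it was not an AQS:

```python
def adjusted_record(cheap_w, tough_l, legit_w, legit_l, nd_aqs):
    """Return (wins, losses) if every AQS were a win and every non-AQS decision a loss."""
    wins = legit_w + tough_l + nd_aqs
    losses = legit_l + cheap_w
    return wins, losses


print(adjusted_record(cheap_w=0, tough_l=2, legit_w=18, legit_l=11, nd_aqs=0))  # (20, 11) Zambrano
print(adjusted_record(cheap_w=0, tough_l=4, legit_w=9, legit_l=8, nd_aqs=11))   # (24, 8) Snell
```

No-decisions that were not an AQS stay out of the record entirely, which is why Zambrano's adjusted line covers only 31 of his 34 starts.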

Snell went further into his games, had a better K:BB ratio, and had that higher AQS %.  It also tells us that of Snell’s 32 starts, 24 of them were of great quality, whereas Zambrano had 18 good-great starts and 16 average-bad starts.
This essentially tells us that while Zambrano’s good-great starts may have been better than Snell’s good-great starts, when Zambrano had his bad starts, Snell was still having good-great ones.
As mentioned before, I used this points system to evaluate 30 National League pitchers.  I compiled a group of spreadsheets, ranking the pitchers in order in different categories to show that certain stats we rely on do a bad job of proving effectiveness.
To view all of my results, click on the links below.  You can use this data in other areas, but please credit my work.

  • To see the list of pitchers and their statistics used to assign points, click here.
  • To see the list of pitchers in order of effectiveness points, click here.

I do not want to post a ridiculously long table in this article, so you will need to look at the linked files to see the full results, but I will list the top 15 pitchers (16, counting a three-way tie at +45) and their effectiveness points.

  1. Jake Peavy, +74
  2. Aaron Harang, +69
  3. John Smoltz, +69
  4. Brandon Webb, +67
  5. Cole Hamels, +65
  6. Brad Penny, +64
  7. Tim Hudson, +63
  8. Ted Lilly, +60
  9. Matt Cain, +52
  10. Roy Oswalt, +50
  11. Ian Snell, +48
  12. Bronson Arroyo, +47
  13. Derek Lowe, +47
  14. Greg Maddux, +45
  15. Adam Wainwright, +45
  16. Jeff Francis, +45

And, again, these points were assigned to statistics based on how strongly they correlate with effectiveness.  The points system essentially covers the statistics and averages from all angles.
The most shocking part of this was how low Chris Young of the Padres came out.  Young went 9-8, with a 3.12 ERA, in 30 starts.  He should have been more effective, I thought, based on those numbers.  After looking at his game logs, though, I changed my mind and realized it made sense.
Of his 30 starts, he was essentially two different people.  In the 19 starts in which he went for 6+ innings, he was 9-1 with a 1.64 ERA, averaging 6.6 IP/gm, with a 0.85 WHIP and 129 K’s in 126.1 innings.
In the other 11 starts, he was 0-7, with a 7.14 ERA, only going 4.2 IP/gm, with a 1.76 WHIP, and 38 K to his 36 BB, in 46.2 innings.
After analyzing his situation and the points system I realized that my effectiveness model favors consistency and lower standard deviations (the average of how far someone strays from his average).  To me, that truly defines effectiveness.
I would much rather have a guy who I knew would amass an AQS 67% or more of the time than a guy who might strike out 20 batters and pitch a two-hitter in one game, but give up 5 runs in 6 innings for the next three, before again pitching a brilliant game.
As long as the consistency is of a good nature, consistency in this model proves effectiveness.
I know, we’re finally at the end of the article, right?  I apologize for the length but it took this long to get everything across. 
Looking at Jake Peavy, the most effective NL pitcher at +74, we see that the only counted statistic in which he led was AQS.  Peavy had the most good-great starts of any NL pitcher.  While he may not have led in IP, IP/gm, K:BB ratio, or fewest losses (Brad Penny only had 1 legit loss), he led in consistency and being consistently good-great.
These results also show that Cole Hamels, with 6 more starts that he missed due to injury, would likely challenge Peavy for #1 in effectiveness – however, as my model dictates, the fact that he missed those 6 starts and Peavy did not shows that Peavy was more effective.
Yes, there were more stats we could add to this, and more variables to account for, but I feel this accurately levels the field of play between pitchers in distinctly different playing situations, and levels the difference between 2007 reputation and 2007 actual performance.
I must remind you before I come to a close, though, that this is only a measure of effectiveness, not the end-all solution to determining who the “best” pitchers are.
However, for this Sabermetrician, effectiveness directly correlates with quality and value.

Poor Matt Cain

Matt Cain was flat out awesome in 2007.  He pitched 200 innings, surrendering only 173 hits.  He also struck out 163 batters and posted a 3.65 ERA.  In fact, of pitchers with 200 or more innings, the only one who gave up fewer hits than Cain was Cy Young winner Jake Peavy.  And, Cain’s 3.65 ERA placed him tenth best in the entire National League.  Reread this paragraph and let the numbers sink in.
Matt Cain’s record in 2007 was 7-16.  7 wins and 16 losses!  Yes, that is correct!
Brad Penny, in 208 innings, gave up 200 hits and struck out only 135. And do you want to know his record?  16-4!
If that is not a clear enough indicator of how deceiving win-loss records can be, a microscopic look at Matt Cain’s 2007 campaign should do it.  Afterwards, try and argue that his 7-16 season was not better than any NL pitcher not named Peavy or Webb.
The number of hits a pitcher surrenders is an oft-overlooked statistic because most people want to know about WHIP ((walks + hits)/IP).  WHIPs usually average around the 1.4-1.5 mark.  Even if we look at Cain’s WHIP, something I wanted to avoid because I hate the statistic in this instance, it was 1.25.
I do not hate the WHIP statistic overall, but in a case like Cain, just examining the hits surrendered really shows an unhittable factor.  Batters may have reached base because of mistakes he made in walking them, but the fact that he allowed such a low number of hits for the innings he pitched shows that hitters truly had a very tough time hitting him.
Think about this… in 32 starts, Matt Cain went 7+ innings fifteen times and allowed four hits or fewer ten times.
In April alone, he pitched 35 innings in 5 games, and gave up only 12 hits. 12!!  In 35 innings!  That is roughly one hit for every three innings pitched.  And since he averaged seven innings per game, that means in April he gave up an average of 2.4 hits per game.
More numbers oft-overlooked are quality starts, tough losses, cheap wins, and blown wins. You might already know about quality starts and can probably figure out what a blown win is, but the tough losses and cheap wins are great devices for determining what a pitcher’s TRUE win-loss record should be.
A quality start refers to when a pitcher goes for at least six innings and gives up no more than three earned runs.  Quality starts are a useful number because they let us know how often a starting pitcher put his team in a position to win the game.  Since most teams average 4+ runs per game, if a pitcher gives up three or fewer, his team should win the game barring unforeseen circumstances.
Cain twirled 22 quality starts out of 32 possible starts.  That is a quality start percentage of 69%.  This means that when Matt Cain pitched, the Giants had just about a 70% chance of winning, or knowing that they would be kept in the game.  Nobody had higher than an 80% quality start percentage.
To put those numbers in perspective, the only NL pitchers with a higher quality start percentage were Jake Peavy, Tim Hudson, Brad Penny, and John Smoltz.  That puts a 7-16 pitcher 5th in quality start percentage, meaning only FOUR other pitchers in the NL gave their team a better chance to win.
Next, we have tough losses.  A tough loss refers to when a pitcher makes a great start, or quality start, and ends up getting a losing decision.  As I mentioned before, since most teams average over four runs a game (the Giants averaged 4.2 in 2007), a pitcher who gives up three runs or fewer should end up winning.
For Cain, that was not the case. Of his 16 losses, 9 were tough losses.  Nine times Cain gave up 3 or fewer runs while pitching 6 or more innings, and LOST.  Nine times!
In his 32 starts, the Giants only provided him with 3.3 runs per game.  And, if you subtract two big blowouts from that, where they scored 15 and 9, in the other 30 starts they provided him a grand total of 84 runs… or 2.7 runs per game, 1.5 fewer runs per game than their season average.
This would mean that, for Cain to win, he had to give up 2 or fewer runs.  Yet he did that 18 times out of his 32 starts and amassed only a 5-7 record (with 6 no-decisions) in that span.
This primarily occurred because even when Cain adapted to the lack of run support and prevented the other team from getting over two runs, the Giants also adapted and forgot how to score.  The Giants were shut out four times during Cain starts, scored only one run on another four starts, and only two runs on another five starts.  That adds up to thirteen starts where the Giants gave Cain a maximum of two runs.
Cheap wins refer to when a pitcher does not pitch very well but walks away with a win (SEE: ERIC MILTON).  Of Cain’s 7 wins, 0 were cheap.  All were legitimate wins.
Blown wins refer to when a pitcher leaves the game with a lead but does not get a decision because the bullpen blows the game.  I do not like to count these as no-decisions because extenuating circumstances prevented the pitcher from getting a decision.  The only decisions I like to count as legitimate ND’s are when a pitcher leaves a game while tied or losing and his team comes back to tie or win the game after he has left.
Of Cain’s 9 ND’s, 5 were blown wins.
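For the code-minded, the categories defined above can be sketched as a small classifier. This is only my reading of the definitions, not an official scoring rule: it uses runs allowed the way this article does (the standard quality start uses earned runs), it treats any win in a non-quality start as "cheap", and the `Start` record and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Start:
    innings: float                # innings pitched
    runs: int                     # runs allowed
    decision: str                 # "W", "L", or "ND"
    left_with_lead: bool = False  # True if the starter exited with his team ahead


def is_quality(start: Start) -> bool:
    # Quality start as used in this article: 6+ innings, 3 or fewer runs.
    return start.innings >= 6 and start.runs <= 3


def classify(start: Start) -> str:
    if start.decision == "W":
        # Assumption: a "cheap" win is any win without a quality start.
        return "legit win" if is_quality(start) else "cheap win"
    if start.decision == "L":
        return "tough loss" if is_quality(start) else "legit loss"
    # No-decision: a blown win if the bullpen gave back the starter's lead.
    return "blown win" if start.left_with_lead else "legit ND"


# Hypothetical starts, not taken from Cain's actual game log.
print(classify(Start(7.0, 1, "L")))                        # tough loss
print(classify(Start(6.0, 2, "ND", left_with_lead=True)))  # blown win
```

Run over a full game log, a tally of these labels would give the counts used in the rest of this piece.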
I threw a lot of numbers at you.  Let’s summarize everything and let it sink in. 

  • Matt Cain pitched 22 quality starts out of 32 starts, a percentage topped only by four other guys. 
  • He gave up 2 or fewer runs in 18 different starts but somehow only went 5-7 during that span. 
  • Nine of his sixteen losses were games he should have won, due to pitching brilliantly, and twirling better than a quality start.
  • Other than two blowout games, the Giants only gave him 2.7 runs of offense per start.
  • He lost 5 wins thanks to the bullpen.

This is where we will put the numbers to use and generate what Matt Cain’s true 2007 season looked like, since it sure as heck was not a 7-16 season.
Of his 32 starts, 4 were legitimate no-decisions. He had 9 total ND’s, but as mentioned before, I only count games when a pitcher left while tied or losing as a legitimate ND.  That means he should have received 28 decisions this year, win or loss.  Of his 7 wins, none were cheap wins, so all were legitimate and remain counted.  Clearly he could not win cheap because the Giants never scored for him.
Of his 16 losses, nine were tough losses, and seven were legitimate losses.  I’m not trying to make the guy look like a superhero – there were times (seven times) that he really deserved to lose due to poor pitching or just not being on top of his game.
The starting pitchers of the 2007 Giants, other than Cain, had a quality start-win percentage of 76%, meaning that when the other pitchers were on the mound and pitched a quality game, they won three out of four times.  If we apply that to Cain, he should have won 6 or 7 of those 9 tough losses.  If you take his own quality start-win percentage of 33%, he should have won 2 of the 5 blown wins.
So, he has 7 legit losses.  Add the remaining two losses from the tough losses (we’re counting 7 of his 9 tough losses as wins) and he has 9 losses.  Also add 3 more no-decisions from the blown wins I am not counting as wins, and he has 9 losses out of 25 possible decisions.
Now, add up his 7 legit wins with 2 blown wins and 7 tough losses being counted as wins, and you get what Cain’s 2007 numbers REALLY are – 16-9, 3.65.
16-9, with a 3.65 ERA is good enough for the top ten in Cy Young voting.  I cannot imagine many voters felt comfortable giving votes to a guy with a 7-16 record.
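The bookkeeping behind that 16-9 record can be sketched in a few lines. The function name and the rounding choice are mine (rounding 6.84 up to 7 and 1.65 up to 2, as the text does); cheap wins are left out because Cain had none, so treat this as a rough check of the arithmetic rather than a general-purpose formula.

```python
def adjusted_record(wins, losses, no_decisions,
                    tough_losses, blown_wins,
                    team_qs_win_pct, own_qs_win_pct):
    # Tough losses flipped to wins at the rate the other starters won quality starts.
    tough_flipped = round(tough_losses * team_qs_win_pct)
    # Blown wins flipped to wins at the pitcher's own quality start-win rate.
    blown_flipped = round(blown_wins * own_qs_win_pct)

    adj_wins = wins + tough_flipped + blown_flipped
    adj_losses = losses - tough_flipped
    adj_nds = no_decisions - blown_flipped
    return adj_wins, adj_losses, adj_nds


# Cain's 2007 line as given in this article: 7-16 with 9 no-decisions,
# 9 tough losses, 5 blown wins, 76% and 33% quality start-win rates.
w, l, nd = adjusted_record(7, 16, 9, 9, 5, 0.76, 0.33)
print(f"{w}-{l} with {nd} no-decisions")  # 16-9 with 7 no-decisions
```

The three totals still add up to his 32 starts, which is a handy sanity check on the adjustment.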
Point blank, win-loss records are often useless when determining the value or efforts of a pitcher, unless said pitcher shows consistency with it (SEE: GREG MADDUX). 
Look at the 2005 Cy Young Award situation.  Johan Santana went 16-7 with a 2.87 ERA, and led MLB with 238 strikeouts.  Bartolo Colon went 21-8 with a 3.48 ERA, with 157 strikeouts.  Colon won the award.  Voters HAD to have seen his win-loss record and voted accordingly, which makes no sense, since Santana clearly had the better season and would have had a better record if the Twins performed better.
Matt Cain should be thought of as one of the ten or fifteen best pitchers in the National League, whether he had a 7-16 record or not.  He cannot, and should not, be blamed for his team refusing to score runs when he pitched brilliantly.
In fact, you could honestly make the case that the only NL pitchers who really posted better overall seasons were Peavy and Webb, and maybe Hamels.
Seems odd to say that a 7-16 pitcher was potentially the third or fourth best in the whole league, but that is because the Win-Loss record has, for some unjustified reason, become the barometer for measuring the value of a pitcher.
It is just a shame for a guy like Matt Cain, who is not earning the big bucks yet and is pitching leagues better than guys making 15-20 times his salary. 
Win-Loss records should not be put on the pedestal anymore unless the stats justify it. Peavy truly was 19-6 this year. Beckett truly was 20-7 this year.   Anthony Reyes truly was 2-14 this year.
Matt Cain was not truly 7-16 this year.

2007 Sabermetric Year in Review: San Francisco Giants

Continuing our reverse alphabetical tour of MLB, StatSpeak heads west to the C-state for stop #7: San Francisco, which, I might add, is about to get the Lilo and Stitch treatment.
Record: 71-91, 5th in NL West.  For a team that had that much attention paid to them in the past year, they were… a last place team.
Pythagorean Projection (Patriot formula): 77.03 wins (683 runs scored, 720 runs allowed). 
Team Statistical Pages:
Baseball Reference
Baseball Prospectus
MVN Blog:
Giants Cove 
Other Giants Resources:
Latest News
Contract Status
Trade Rumors
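For anyone wondering where the 77.03 in the Pythagorean projection comes from, here is a minimal sketch. It assumes the "Patriot formula" means the Pythagenpat variant, where the exponent is runs per game raised to a small power; the 0.287 constant is my assumption (other values close to it are also in circulation), chosen because it reproduces the figure listed above.

```python
def pythagenpat_wins(runs_scored, runs_allowed, games=162, k=0.287):
    # Exponent floats with the run environment: (total runs per game) ** k.
    runs_per_game = (runs_scored + runs_allowed) / games
    exponent = runs_per_game ** k
    # Classic Pythagorean ratio, using the floating exponent instead of 2.
    win_pct = runs_scored ** exponent / (
        runs_scored ** exponent + runs_allowed ** exponent
    )
    return win_pct * games


# 2007 Giants: 683 runs scored, 720 runs allowed.
print(round(pythagenpat_wins(683, 720), 2))
```

Set against the actual 71 wins, the projection says the Giants finished about six wins short of what their run totals would suggest.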
Overview: Let me see here.  Did anything happen in 2007 of any importance in San Francisco? I’m not coming up with anything, except that the American Psychological Association held its annual convention there.  (I went.)  Must have been that kind of year.  No huge storylines.  No controversy.  Just your basic baseball season.  They did have the All-Star Game, which must have been fun.
What went right: Cain, Lowry, Lincecum.  Has that Smoltz, Glavine, Avery feel to it, doesn’t it?  I suppose that they can argue amongst themselves which one gets to be Steve Avery, but things worked out pretty well for that threesome of pitchers, eh?
Don’t let the record fool you.  Cain lost 16 games, but posted an ERA of 3.65.  His weakness is that he walks too many batters (3.56 per 9 innings), but he was also one of the better strikeout starters in the league last year.  Take a look at his plot for the amount of break on his pitches.  You’ll see that his fastballs are all generally within one blob, suggesting that he has a good idea of where the fastball is going, which is probably why it makes up more than 60% of his pitches.  With his off-speed/breaking stuff, on the other hand, there are a few curves and sliders and changes that seem to be little islands unto themselves.  Cain is 22, and has time to learn to control those pitches.  He also gives up a lot of flyballs, but he’s right-armed and lives in a spacious park that is murderous on left-handed power hitters (or at least so the reputation goes).  Cain is able.
Lincecum struck out more than a batter an inning, induced ground balls in 47% of the balls hit off of him, and had a line drive rate of 15.4%.  These are all good results.  He’s also got a 95 mph fastball, and a change and hook to go with it.  He’s also part of the ”I walk a few too many hitters (4 per nine innings)” club, which seems to be a problem with the Giants.  Maybe after seeing Barry Bonds walked so often, they just figured that’s what you’re supposed to do when facing a hitter.  Hmmm… Fantasy players, watch Cain and Lincecum’s walk rates early in the year.  If they’re going down, then buy buy buy buy buy.
Noah Lowry is being bandied about as possible trade bait.  He’s not awful, but he did walk as many batters as he struck out (5 per 9 IP).  He’s also 26, which means the ceiling isn’t quite as high.  But, people who aren’t paying attention might get him confused with Lincecum and Cain (who are 3 and 4 years younger) and assume that Lowry is also 22 or 23.  Maybe that will increase his value.  He’s also left-armed, so he’s looking more like Steve Avery every moment.
What went wrong:  I suppose to continue the above analogy, Barry Zito was supposed to be Greg Maddux, the former Cy Young Award winner free agent signing who would put the team over the top.  In fairness to Zito, he didn’t have a terrible season.  He threw 196 innings, put up respectable numbers, and hey for a fourth starter, I think most teams would be happy to have him in that spot in their rotation.  But, 7/126 is a set of numbers that will haunt the Giants for a very long time.  Six more years to be exact. 
There was one other little problem with the Giants this past year.  The offense was… offensive.  The Giants had three position players with a VORP above 10.  Barry Bonds (55.2) was one of them and he isn’t coming back next year.  The other two were Randy Winn (26.4) and Bengie Molina (14.4).  Pedro Feliz, Omar Vizquel, and Ray Durham, representing 3/4 of the Giants’ infield, all functioned below replacement level.  Even allowing that Feliz is one of the best fielding third basemen in the league, and Vizquel, even at 40-something, is still a premier fielding shortstop, that can’t be healthy for a team.
Yeah, that about sums it up: And now a list of everyone under the age of 30 who logged more than 250 AB for the Giants this past year: Kevin Frandsen.
Oh yeah, him:  Congratulations to Barry Bonds.  We’re not entirely sure for what yet, but it’s clear that he did something this year.  I think more ink has been spilled on Bonds this year than perhaps the rest of the league combined.  Why waste more?
Brad Hennessey: Here’s another case of a hidden closer who deserves a second look.  The Giants installed Hennessey as their closer at the end of May after they recognized that Armando… Benitez… sorry, I’m doubled over laughing here that Armando Benitez was allowed near the ninth inning.  By the looks of it, Hennessey was replaced by Brian Wilson after a few bad outings at the beginning of September.  During his tenure in the bullpen, Hennessey had 19 saves, 13 holds, and 5 blown saves, for a close lead protection rate of 86% (32/37), which stacks up decently against the rest of the league.  No one will argue that he’s an outstanding reliever and I wouldn’t want him as my first choice to close, but one could do worse.  He seemed to lose the job based on the fact that he had a few bad outings.  (Why do managers insist on playing the “hot hand?”)  Hennessey was a starter who didn’t really make it as a starter, and so he became a bullpen specialist (politically correct term for “reliever”).  He doesn’t have electric stuff, but now he has “closing experience” (which he can parlay into at least a million more per year on his next contract).  I’m not sure what the Giants have in mind for their bullpen this year, but they would do well to consider why, if Hennessey was good enough to close for them in July, he wasn’t good enough in September.  To me, it sounds like a team that’s clutching at straws.
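The close lead protection rate used for Hennessey is simply the leads he held (saves plus holds) divided by the leads he was handed (saves plus holds plus blown saves). A quick sketch, with a function name of my own invention:

```python
def lead_protection_rate(saves, holds, blown_saves):
    # Leads protected: the pitcher got either a save or a hold.
    protected = saves + holds
    # Every protected lead plus every blown save counts as a chance.
    chances = protected + blown_saves
    return protected / chances


# Hennessey's 2007 bullpen line: 19 saves, 13 holds, 5 blown saves.
rate = lead_protection_rate(19, 13, 5)
print(f"{rate:.1%}")  # 86.5%, i.e. 32 of 37
```

The same function makes it easy to line him up against any other reliever once you have their saves, holds, and blown saves.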
Hooked on speed?:  Forget steroids.  It looks like the Giants are hooked on speed.  Take a quick look at the run-down of the Giants’ minor league system.  Focus your eyes on the columns marked SB and CS.  See some eye-popping numbers in there?  See them repeated?  The Giants had 15 players in their minor league system who stole more than 20 bases this past year and five who stole at least 40.  The Giants have apparently decided to turn their farm system into a rabbit breeding ground.  Parlez-vous organizational philosophy?  While that’s nice, speed is only helpful if one is on base to use it.  Only five of those 15 speed-demons had OBP’s above .350.
Outlook: Well, let’s see.  Your team loses its biggest offensive weapon from an offense that wasn’t very good to begin with.  They were a last place team last year, even if you believe their Pythagorean record.  You do the math.  This is an organization that’s apparently building around pitching, defense, and speed, instead of… offense, I guess.  Call it the other Barry Bonds backlash.  Now that Bonds has his magic home run, what of the Giants?  They’ve basically existed for the last few years as a vehicle to get Bonds to 756.  Looks like it’s time to rebuild.