Ode to Ed

Quick, without any research or advanced thought, name baseball’s career leader in ERA!  No, not Walter Johnson.  It isn’t Cy Young either.  Bob Gibson isn’t the answer, nor is Sandy Koufax.  The answer, friends, is Ed Walsh, a White Sox pitcher prior to the 1919 scandal.  It is kind of ironic that ERA is used as an end-all barometer today and sits atop many fans’ lists of favorite statistics, yet a large majority of people have no idea who has the lowest mark in the history of the sport.  This reminds me of something I used to exploit in film school: most students would claim The Shawshank Redemption is the greatest film ever made, or their favorite movie, yet couldn’t name its director.  For the record, it is Frank Darabont, but back to Ol’ Ed Walsh.
Walsh, a masterful spitball artist, pitched for fourteen seasons, 1904-1917, though his “real” playing career ended following the 1912 season.  From 1913-17 he made just fifteen appearances and, in 1917, was one of the oldest players in the league at 36, which is relatively young by today’s standards.  He also spent three games in 1924 as ChiSox skipper, going 1-2 before ending his tenure.
For his career, which resulted in Hall of Fame induction in 1946, he produced a 1.82 ERA, 1.00 WHIP, and quite the impressive 1.87 BB/9.  Though he did not strike hitters out at a furious pace–5.27 K/9–his low walk rate led to a 2.81 K/BB ratio.  As mentioned, his ERA is the lowest ever, but he also holds the second lowest WHIP, 7th highest ERA+, and 11th most shutouts.
What should stand out from the previous sentence is his ERA+ ranking seventh despite posting a raw ERA lower than everyone else.  Though ERA+ isn’t a perfect statistic in its own right, it does stack the metric next to the rest of the league with some adjustments thrown in; this way we have a better indicator of success relative to the era.  Walsh pitched in a time when lower ERAs were more common.  Not to take anything away from his 1.82 mark, but a 1.82 career ERA in today’s game would be much more impressive, hands down.  Pedro Martinez actually holds the best career ERA+ at 157; Walsh’s is at 146.
Ed was twice the runner-up in MVP voting, in both 1911 and 1912, his final two truly effective seasons.  He likely would have won or ranked highly in voting during previous years, except the MVP didn’t exist.  Via a previous article of mine, Walsh would have won the Cy Young Award three times, in 1907, 1908, and 1911.  These three wins are all the more impressive given that his peak really only lasted from 1906-1912, meaning that in the span of seven years he would have likely won three “best pitcher” awards.  Here are his numbers in each year of this peak:

  • 1906: 278.1 IP, 58 BB, 171 K, 0.98 WHIP, 1.88 ERA
  • 1907: 422.1 IP, 87 BB, 206 K, 1.01 WHIP, 1.60 ERA
  • 1908: 464.0 IP, 56 BB, 269 K, 0.86 WHIP, 1.42 ERA
  • 1909: 230.1 IP, 50 BB, 127 K, 0.94 WHIP, 1.41 ERA
  • 1910: 369.2 IP, 61 BB, 258 K, 0.82 WHIP, 1.27 ERA
  • 1911: 368.2 IP, 72 BB, 255 K, 1.08 WHIP, 2.22 ERA
  • 1912: 393.0 IP, 94 BB, 254 K, 1.08 WHIP, 2.15 ERA

Some career to have when your decline years consist of a sub-1.10 WHIP and ERAs below 2.30.  An average year in this seven year span produced 361 IP, 68 BB, 220 K, 0.97 WHIP, 1.71 ERA.  In each of these years he ranked among the top ten in GP, ERA, ERA+, WHIP, K/9, K/BB, and Shutouts.  Without question he had a remarkable peak but he was not a super-duperstar like we would imagine someone with the lowest career ERA might be.
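For anyone who wants to check the "average year" line, here is a minimal Python sketch that averages the seven peak seasons listed above.  It assumes the usual box-score convention that 278.1 IP means 278 and one-third innings, and it takes a plain unweighted average of the seasonal WHIP and ERA figures rather than recomputing them from raw totals.  The names `PEAK`, `innings`, and `averages` are just illustrative.

```python
from statistics import mean

# (year, IP as printed, BB, K, WHIP, ERA), copied from the list above.
PEAK = [
    (1906, "278.1", 58, 171, 0.98, 1.88),
    (1907, "422.1", 87, 206, 1.01, 1.60),
    (1908, "464.0", 56, 269, 0.86, 1.42),
    (1909, "230.1", 50, 127, 0.94, 1.41),
    (1910, "369.2", 61, 258, 0.82, 1.27),
    (1911, "368.2", 72, 255, 1.08, 2.22),
    (1912, "393.0", 94, 254, 1.08, 2.15),
]


def innings(ip: str) -> float:
    """Convert box-score innings ('278.1' = 278 and 1/3) to a real number."""
    whole, _, thirds = ip.partition(".")
    return int(whole) + int(thirds or 0) / 3


# Simple per-season averages (not weighted by innings pitched).
averages = {
    "IP": mean([innings(row[1]) for row in PEAK]),
    "BB": mean([row[2] for row in PEAK]),
    "K": mean([row[3] for row in PEAK]),
    "WHIP": mean([row[4] for row in PEAK]),
    "ERA": mean([row[5] for row in PEAK]),
}

print(
    f"{averages['IP']:.0f} IP, {averages['BB']:.0f} BB, {averages['K']:.0f} K, "
    f"{averages['WHIP']:.2f} WHIP, {averages['ERA']:.2f} ERA"
)
```

Run as written, this reproduces the 361 IP, 68 BB, 220 K, 0.97 WHIP, 1.71 ERA line quoted above.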
His career’s longevity also suffered from tons of work.  Following his ridiculous 1908 season, Walsh was only able to muster up half the workload.  After all, in 1908 he made 66 appearances, which essentially worked out to be every other day.  This workload picked right back up following the 1909 season, and by 1913 his arm was virtually dead.  On top of that, he pushed himself to the max most of the time, often putting himself on the mound without proper off-season rehabilitation.
It came as no surprise that his arm was completely dead following the 1916 season.  He asked Charles Comiskey for time off but was instead released.  His 1917 stint with the Boston Braves did not last too long and Walsh soon found himself without a place to pitch.  He stuck around as a coach for a while, and tried his hand at umpiring as well, but nothing could take away from his extremely effective peak.
Ed Walsh, who supposedly convinced the architect of Comiskey Park to tailor the stadium to his and nobody else’s pitching style, is another example not only of a player who built gaudy career numbers on nothing more than an impressive peak, but also of how players from the past are easily forgotten; in Ed’s case, his career low ERA really has not made a difference in the broad spectrum of fans knowing he ever existed.

They got Mark Teixeira for what?

I’ve got to hand it to the Los Angeles California Angels of Anaheim, California which is near Los Angeles.  They didn’t get the Garrett Anderson replacement that they needed, but they got something that might end up being even better for them.  And they didn’t really give up a lot.  The Braves got hosed.
In case you missed it, the LACAoACwinLA’s got Mark Teixeira from the Braves (who acquired him a mere year ago… remember this?) for Casey Kotchman and Steve Marek.  I can only scratch my head and wonder what the Braves were thinking.  Sure, Teixeira is a free agent at the end of the year, and the Braves have decided to run up the white flag on this year, so they had to get something for him.  But all of the buzz said that there were several teams in line for Mr. Teixeira’s services, and that the Braves originally wanted to get three fairly high level players in return.  They ended up with two, and I’m not at all sold on Kotchman as being high level.  Maybe I’m being a bit harsh on the Braves.  This is really only a two-month rental, so we’re not talking about a big chip to trade, but Joe Sheehan at Baseball Prospectus pointed out that he’s probably the one player left on the market (Manny Ramirez notwithstanding) that’s worth 3 wins to a team over the next few months.
It’s nice that Kotchman doesn’t strike out much, and that he even walked more than he struck out last year, but all that means is that Kotchman puts the ball in play a lot.  When he does put the ball in play, he’s a very ground ball happy hitter, with the majority of his balls kissing the grass.  When he does get the ball in the air, only about 10% of those balls leave the yard.  Kotchman’s line drive rate is a lowly 16% this year.  He does make good contact, but who needs a contact/singles hitter who shows limited power potential at first base?  Then again, the Braves aren’t exactly busting at the seams with soon-to-be-ready first base prospects, so Kotchman might just be a stopgap for a few years.  On the other hand, the Angels have Matthew Brown (25 years old, hit .326/.377/.594 at Salt Lake City), and ever the prospect Kendry Morales (25 years old, .310/.347/.482 at SLC) ready to step in.  Maybe the Braves should have asked for one of those two?  Maybe?
Steve Marek is a 24 year old right-handed pitcher at AA, who just recently moved to the pen, and it seems to have done a world of good for him.  He’s now striking out 11 per nine innings (used to be 7.5), and has a good K/BB ratio and an OPS against of .601.  Pretty good numbers, which probably project him to be a good RH reliever (dare I say “closer”?) down the road.  The Angels had another like him (Ryan Aldridge) at AA who fits the same profile, so they must have figured that Marek was a redundant part. 
But that’s it?  For the fourth-best NL first baseman by VORP (behind Berkman, Pujols, and Conor Jackson) and 29th best player by VORP in all of MLB?  With multiple teams bidding?  Something just ain’t right here.  The Braves gave up that Saltalaralphmacchio guy and some other nice pieces for Teixeira last year at this time.  It made sense at the time, and the Braves were unlucky that it didn’t work out, but this seems like a fiasco for the Braves.  Maybe the market was much softer than was reported, although the Braves still had 48 more hours to play chicken with the Angels and anyone else who was calling.  People make bad decisions on short deadlines.  Did no one in the Braves organization ever take a psychology class?
But, here’s Teixeira walking away from the Braves for a good (in the sportscaster sense of the word) first baseman and a reliever prospect.  To add insult to injury, if Tex signs elsewhere next year, the LACAoACwinLA’s get the draft picks that he will surely rate.  Is that all there is?
David Cameron over at FanGraphs showed how Teixeira will benefit the Angels at the plate, at least on paper.  I looked at the OPA! numbers (my Retrosheet-compatible defensive rating system) for Teixeira and Kotchman for 2007.  Kotchman was about 6 runs better than (ahem, 2 time Gold Glove winner) Teixeira last year, but both were hovering around average for the position, with Kotchman slightly above and Tex slightly below.  The Angels take a small hit on defense, but nothing that Teixeira won’t easily make up for with his bat.  The Angels made out like a couple of high schoolers in an empty movie theatre.
The Angels are playing over their heads as far as their Pythagorean record goes (by 8-12 games, according to Baseball Prospectus’ adjusted standings), but the reality is that they are 11.5 games up on second place Texas, the A’s have already loudly surrendered, and the Mariners are on their way to losing 100 games.  The Angels simply have to coast into the playoffs this year.  Once they get there, they’re a much better team than they were 24 hours ago.  Even if they lose Teixeira at the end of the year, they have cover in the organization at first base that doesn’t look to be any worse than Kotchman.  So their net cost to load up for the playoffs was a placeholder and one AA pitching prospect.  Well played.
On a related note, be sure to tune in to MVN’s coverage of the trading deadline on Thursday, starting at 9 am Eastern, going to 4 pm.  Maybe later.

World Famous StatSpeak Roundtable: July 29

The roundtable rolls into the trading deadline this week with a visit from Sky King, who runs his own blog (with a hint of lime).  For a recap of all the moves that may or may not be made at the trading deadline in real time, be sure to tune into MVN’s live coverage of the trading deadline moves on Thursday, starting at 9:00 am Eastern.  Anyway, back to the roundtable, where today we discuss (what else) some trades that we think aren’t getting enough press, whether the Tigers are crouching or hiding, and why Pizza Cutter will never violate his own rule to never leave a baseball game early ever again.
Question #1: What players aren’t being discussed enough as trade targets and which contending teams have holes to be filled that aren’t getting enough attention?
Sky King: Mark Ellis needs to be playing for a contender. Calling him today’s Ozzie Smith isn’t too far-fetched. If Milwaukee was serious about upgrading from Rickie Weeks, Ellis is a ten-fold improvement over both him and Ray Durham. Randy Winn’s a good target, combining a solid bat with a good glove. And the Giants certainly don’t need him. Bengie Molina, either — he’s got a decent stick for a catcher. How about Aubrey Huff as a sell-high candidate from the Orioles. Is there a contending team who doesn’t need a bat? Scott Downs would quickly be the top reliever available next to Huston Street if Ricciardi wanted to trade him. It would take a bunch of prospects since he’s signed to a solid contract, but he’s a closer waiting to happen (2.5 K/BB and 60% GB). Brian Giles is an All-Star caliber corner outfielder. I’m not saying Kevin Towers would want to trade him, but there should definitely be teams beating down his door asking about Giles.
As for holes, nobody’s talking about the Angels’ needs because they’ve got the division locked up, but they need something better than Garrett Anderson in left field. Ugh. The Yankees have a complete hole when you consider Abreu is an awful fielder and Melky is an awful hitter. Abreu needs to DH, even before Giambi does (unless you think Abreu can move to first base). Nady should play right field every day, and Damon should be in center (his range almost makes up for his poor arm compared to Melky). And in left field? Well, that’s where the hole is. Think the Reds would spring for an Adam Dunn for Melky trade? I’d say something about Minnesota’s holes, but the Twins don’t seem interested even in plugging in internal options (Liriano) or trading for a massive upgrade (Beltre) with a team which historically makes poor trades.
Eric Seidman: I wanted to start by listing the contenders and then evaluating their weaknesses, but honestly, there are not a whole bunch of players easily available that would alleviate any of these concerns.  Matt Holliday’s name is talked about a ton but the Rockies aren’t even, by their own admission, committed to moving him, and a team like the Phillies doesn’t need another outfielder this year.  Then there are the starting pitchers, of whom AJ Burnett seems to be the nicest-looking.  But there’s a scary opt-out clause and the teams looking for a serious upgrade have traded for Sabathia, Harden, and, well, Joe Blanton.  The Cardinals have been linked to Burnett but he’s another one I don’t see moving.
If I had to pinpoint one player and team I would say Adam Dunn to the Diamondbacks, since Conor Jackson at 1B and Dunn in LF would be an upgrade over any combo of Clark at 1B/Jackson in LF, Clark at 1B/Tracy in LF, Tracy at 1B/Jackson in LF, Jackson at 1B/Tracy in LF.  Plus, they are 52-51 right now and with some added offense, should run away with that division.
The Mets should look for an upgrade as well, especially if Jason Bay is made available.  Endy Chavez may be a nice player, but with Alou and Church out for however long, they are throwing Marlon Anderson, Fernando Tatis, and Chavez into the outfield to sandwich Beltran.  Adding Bay would not only provide a nice offensive bat, but it would mean they could get rid of either Anderson or Tatis, neither of whom really inspires confidence in Mets fans (at least they shouldn’t).
Pizza Cutter: It’s hard to think that with MLB Trade Rumors going full force this time of year that there would be any needs or targets that are truly un-covered.  The Mets sure could use a left fielder, even though most of the media’s attention (and apparently the Mets’) is focused on their desire for a reliever.  The Mets have started five different guys at least ten times in left field this year: Angel Pagan (best name in baseball), Marlon Anderson, Endy Chavez, Fernando Tatis (he’s still around?), and Moises Alou.  They’ve also recently been letting Damion Easley (he’s also still around?) hit fifth.  Adam Dunn is a free agent at the end of the year and the Reds aren’t going anywhere this year.  The Reds have sworn that they won’t trade Dunn even though that makes no sense, and for some reason the media is believing them for the most part.  Are you thinking what I’m thinking?

The Real Gold (and Lead) Gloves

OPA! is more or less finished.  OPA! is my Retrosheet-compatible fielding system that is based on out probability added above average (OPAAA… get it?  OPA!).  Over the past few weeks, I’ve been refining the system little by little, and I’ve finally gotten it where I want it.  Now, I can have some fun with it.
For my first trick, I wrote some code that assigned a run value to all of the events I looked at.  Let’s take a simple grounder somewhere in the neighborhood of the third baseman.  If the third baseman makes the play (and the first baseman catches his throw), an out has been recorded, a runner is not on base, and any runners on at the time of that ground ball have less of a chance of moving up.  If the 3B lets the ball go through, there’s no out recorded, the batter reaches base, and the runners may just go crazy.  There’s a lot riding on that one play.  I looked at all ground balls that the third baseman either got to or were reasonably within his area of town (if the left fielder actually fielded it, the third baseman got part of the blame) and looked at the run expectancy (how many runs actually scored afterward) after a completed play on a grounder vs. a play that went for a hit.  The difference was the run value of a third baseman converting a ground ball into an out.  I did the same for all the other events for which I coded.
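To make the run-value idea concrete, here is a minimal Python sketch of the comparison described above.  It is not the actual OPA! code: the play log is invented for illustration, and `plays` and `run_value_of_out` are hypothetical names.  It only shows the arithmetic of "runs scored after a hit minus runs scored after a converted out."

```python
from statistics import mean

# Hypothetical play log for grounders in the third baseman's area:
# (converted_to_out, runs_scored_afterward). These numbers are invented.
plays = [
    (True, 0), (True, 0), (True, 1), (True, 0),
    (False, 1), (False, 0), (False, 2), (False, 1),
]


def run_value_of_out(play_log):
    """Average runs after a hit minus average runs after a converted out."""
    runs_after_out = [runs for converted, runs in play_log if converted]
    runs_after_hit = [runs for converted, runs in play_log if not converted]
    return mean(runs_after_hit) - mean(runs_after_out)


value = run_value_of_out(plays)
print(f"Run value of converting the grounder: {value:.2f}")
```

With a real Retrosheet-derived log, the same subtraction would be repeated for each event type and position; the toy numbers here say nothing about what the real values are.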
I then applied the run values to the players in the 2007 data file to answer the question, “Who were the best fielders in baseball at their respective positions in 2007?  And while I’m here, who were the worst?”  The Gold Glove is given to the player voted to be the best fielder at his position in a given year.  So, here I present the Lead Gloves!  I’m not the first Sabermetrician to present these types of awards, but I am the third-coolest.
I limited the contenders to those who played more than 729 innings (81 games) at the position in question in 2007.  (For pitchers, I simply asked for 50 IP.)  I ranked the players by how many runs they contributed over the course of the year, which does reward those who played more often.  I suppose I could do runs per inning and set a cutoff for innings played, but I’m lazy.  I continued to follow the tradition of having two winners per position, one from each league, and the tradition of not distinguishing by outfield spots.  One drawback: my system does not rate catchers.  Most of what catchers do is not immediately apparent on Retrosheet, particularly blocking balls in the dirt.
Pos  AL OPA! Gold Glove (OPA! runs)  AL OPA! Lead Glove (OPA! runs)
P    Shaun Marcum (2.76)             Bartolo Colon (-2.71)
1B   Kevin Youkilis (12.02)          Carlos Pena (-9.67)
2B   Mark Ellis (20.85)              Josh Barfield (-13.11)
SS   Tony Pena (16.88)               Brendan Harris (-21.82)
3B   Nick Punto (9.13)               Alex Rodriguez (-2.47)
OF   Curtis Granderson (22.74)       Jermaine Dye (-26.26)
OF   Coco Crisp (22.42)              Raul Ibanez (-15.04)
OF   Ichiro Suzuki (16.20)           Torii Hunter (-13.61)

Pos  NL OPA! Gold Glove (OPA! runs)  NL OPA! Lead Glove (OPA! runs)
P    Jeff Francis (3.03)             Wandy Rodriguez (-2.78)
1B   Todd Helton (27.88)             Lance Berkman (-14.27)
2B   Kaz Matsui (30.40)              Dan Uggla (-25.60)
SS   Omar Vizquel (23.18)            Hanley Ramirez (-20.56)
3B   Pedro Feliz (21.53)             Ryan Braun (-33.18)
OF   Austin Kearns (30.76)           Juan Pierre (-28.86)
OF   Jeff Francoeur (28.56)          Ken Griffey Jr. (-22.21)
OF   Alfonso Soriano (26.35)         Chris Duncan (-21.57)
Of the actual winners, only Youkilis, Ichiro, and Francoeur won OPA! Gold Gloves.  Torii Hunter won a Gold Glove in real life (his seventh in a row!), but was actually the third worst fielding outfielder in the American League last year.
A few other notable happenings:

  • To give you an idea of how bad Ryan Braun was at third base last year, the second worst fielder out there was Edwin Encarnacion, who checked in about 9 runs below average.  To give you an idea of how good Pedro Feliz was last year, Punto was in second place among all Major Leaguers.
  • The National League infield includes three Rockies (Francis, Helton, Matsui) and a fourth (Tulowitzki) just missed.  Speaking of, Omar Vizquel is nothing short of amazing.  Last year, he was 40 years old, and he’s still the best defensive shortstop in baseball.
  • Someone call the irony police: Nick Punto was tops at something and A-Rod was last!
  • Juan Pierre had the most horrid arm of anyone in baseball last year.  He cost whoever it is that he plays for (yeah, I know, the Dodgers), almost 24 runs by virtue of the fact that teams ran at will against him.  But, he’s fast!
  • The NL Lead Glove infield all made the 2008 NL All-Star team… including Dan Uggla.  Guess what happened.
  • Francoeur got most of his points off his arm.  Austin Kearns caught a lot of line drives.

Now, for a small laugh at the expense of the Gold Glove voters.  Let’s pretend that the actual winners played on a team and see what would happen with them defensively.
Pos  AL Winners (OPA! runs)      NL Winners (OPA! runs)
P    Johan Santana (1.46)        Greg Maddux (0.28)
1B   Kevin Youkilis (12.02)      Derrek Lee (-0.61)
2B   Placido Polanco (-5.62)     Orlando Hudson (1.13)
SS   Orlando Cabrera (-5.17)     Jimmy Rollins (8.35)
3B   Adrian Beltre (-1.65)       David Wright (5.04)
OF   Torii Hunter (-13.61)       Carlos Beltran (8.18)
OF   Grady Sizemore (-0.56)      Andruw Jones (17.64)
OF   Ichiro Suzuki (16.20)       Jeff Francoeur (28.56)
(Aaron Rowand, 11.56 OPA! runs, tied with Frenchy.  I gave the voters the benefit of the doubt that they really wanted Frenchy.)
It’s one thing if the top guy doesn’t win because he loses to a guy who’s #2 or #3 at his position; that’s forgivable.  It’s another to think that the majority (5 out of 8) of the AL Gold Glove winners actually rated below average in the OPA! system.  Even worse, the AL team, if it were real, would have a grand total of 3.07 runs above an average team on defense.  In the NL, the writers at least picked mostly above average defenders (with Lee coming in just a bit below), but Hudson, Rollins, and Wright, while above average, are hardly “the best.”
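The team totals are easy to check.  The short Python sketch below simply adds up the OPA! runs from the winners table above and counts who rated below average; `AL_WINNERS`, `NL_WINNERS`, and `summarize` are illustrative names, and the numbers are copied straight from the table.

```python
# OPA! runs for the actual 2007 Gold Glove winners, copied from the table above.
AL_WINNERS = {
    "Johan Santana": 1.46, "Kevin Youkilis": 12.02, "Placido Polanco": -5.62,
    "Orlando Cabrera": -5.17, "Adrian Beltre": -1.65, "Torii Hunter": -13.61,
    "Grady Sizemore": -0.56, "Ichiro Suzuki": 16.20,
}
NL_WINNERS = {
    "Greg Maddux": 0.28, "Derrek Lee": -0.61, "Orlando Hudson": 1.13,
    "Jimmy Rollins": 8.35, "David Wright": 5.04, "Carlos Beltran": 8.18,
    "Andruw Jones": 17.64, "Jeff Francoeur": 28.56,
}


def summarize(winners):
    """Return (team total in runs, names of below-average winners)."""
    total = round(sum(winners.values()), 2)
    below_average = [name for name, runs in winners.items() if runs < 0]
    return total, below_average


al_total, al_below = summarize(AL_WINNERS)
nl_total, nl_below = summarize(NL_WINNERS)
print(f"AL: {al_total:+.2f} runs, {len(al_below)} of {len(AL_WINNERS)} below average")
print(f"NL: {nl_total:+.2f} runs, {len(nl_below)} of {len(NL_WINNERS)} below average")
```

The AL side comes out to +3.07 runs with five below-average winners, matching the figures above; the NL side sums to +68.57 with only Lee below zero.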
The fact that the Gold Glove voters have no idea what they’re talking about can be proven by a quick examination of Derek Jeter’s trophy case, and the argument has been made before.  But, now with OPA!, I can look to see how good or bad the folks voting on these things have been in all the years that Retrosheet has available.  Perhaps I will.

From Z-Scores to T-Tests

Every now and again I’ll get an e-mail asking me to explain certain statistics terms or methods used in an article from myself, Pizza Cutter, or whomever.  While I do not profess myself to be an all-out expert, I did pay attention in my classes through high school and college, and gladly try to make a post an author slaved over more enjoyable through a learned understanding.  I pride myself on accessibility and so, from time to time, will offer some primers or “lessons” using baseball examples.  One thing I still fail to understand is how so few collegiate-level classrooms incorporate sports heavily into their statistics curriculum.  Seriously, do you know how many people would jump at the opportunity to learn statistics if sports were involved as an instructive tool?  But I digress…
Today the topics on the metaphorical table are standard deviations, z-scores, and t-tests.  Getting right into it, what is a standard deviation?
The given definition of a standard deviation is a measure of the dispersion of a set of values, but what does that mean and how does it relate to baseball?  Essentially, standard deviation is the square root of the mean of the squared deviation of each member of the dataset to the overall mean.  I realize that can be very confusing.  Say we have ten hitters, who average out to a .345 OBP.  To calculate the standard deviation of this group we must begin by finding how far from the mean, or average, each hitter falls.  If Hitter A has a .329 OBP, his deviation would be -.016.  The same is calculated for each member of the group.  The results are then squared, so the -.016 becomes a .000256.
Once we square all of the deviations, they are averaged together to form our mean of squared deviation.  This is known as the variance and it goes hand in hand with standard deviation.  In fact, to calculate the standard deviation, from here on out called SD, we simply take the square root of the variance.  If the variance (the average of squared individual deviations from the mean) comes to, say, .000228, we take its square root: .0151.  This tells us the standard deviation of OBPs amongst these ten players is .0151.  What do we make of this number, though?
SDs are tremendous for exploring ranges and where numbers in a dataset are expected to fall.  Since the mean is .345 and the SD is .0151, to find the range of 1 SD we add and subtract .0151 to/from .345.  So, 1 SD of our mean would fall between an OBP of .330 and .360.  If our data follows a bell-shaped curve, then the 68-95-99.7 rule comes into play.  This explains that 68% of the data in our sample is expected to fall within 1 SD; 95% is expected to fall within 2 SDs; and 99.7%, virtually everything we have, should fall within 3 SDs.
In terms of ranges, if 1 SD = .0151, then 2 SDs = .0302, and 3 SDs = .0453.  1 SD would fall between .330 and .360; 2 SDs would fall between .315 and .375; and 3 SDs would fall between .300 and .390.  Of course, this is just an example for this particular hypothetical dataset.
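Here is a minimal Python sketch of the same steps, using the standard library.  The ten OBPs are made up so that they average .345 (they are not real players, and their SD will not exactly match the .0151 in the example above).  Note that `pvariance` and `pstdev` divide by n, which matches the "mean of squared deviations" definition used here; the sample versions (`variance`, `stdev`) divide by n - 1 instead.

```python
from statistics import mean, pstdev, pvariance

# Ten hypothetical hitters; the values are invented to average exactly .345.
obps = [.329, .361, .345, .330, .360, .320, .370, .350, .340, .345]

avg = mean(obps)
variance = pvariance(obps)  # mean of the squared deviations from the mean
sd = pstdev(obps)           # square root of that variance

print(f"mean = {avg:.3f}, variance = {variance:.7f}, SD = {sd:.4f}")

# Ranges for 1, 2, and 3 SDs around the mean (the 68-95-99.7 rule
# applies only if the data are roughly bell-shaped).
for k in (1, 2, 3):
    low, high = avg - k * sd, avg + k * sd
    print(f"{k} SD: {low:.3f} to {high:.3f}")
```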
We can use the mean of a dataset as a jumping off point, so to speak, with the use of the z-score.  A z-score tells us how many standard deviations from the mean an individual piece of data fell.  To calculate, we subtract the mean from the individual data point and divide by the standard deviation.  If the average number of home runs hit was 14, and our player of interest hit 34, we know he exceeded the mean, but by how much?  Assuming our guy belonged to a set of data wherein 1 SD = 3.2 HR, the z-score would be: (34 - 14) / 3.2 = 20 / 3.2 = 6.25.
Recall that under the 68-95-99.7 rule, 99.7% of data can be expected to fall within 3 SDs of the mean, so a player who exceeded the mean by over six SDs is an extreme outlier.  The z-scores are great when comparing players from different eras.  If you wanted to know whether Roger Maris’s 61 HR in 1961 were more impressive than Mark McGwire’s 70 in 1998, find the mean HR in each of those years as well as the standard deviation and calculate the z-score.
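As a quick sketch, the z-score calculation is one line of Python.  The function name `z_score` is just illustrative, and the inputs are the hypothetical 34 HR, 14 HR mean, and 3.2 HR SD from the example above.

```python
def z_score(value, mean, sd):
    """How many standard deviations `value` sits above (+) or below (-) the mean."""
    return (value - mean) / sd


# The hypothetical slugger from the example: 34 HR, league mean 14, SD 3.2.
z = z_score(34, 14, 3.2)
print(f"z = {z:.2f}")
```

For an era comparison like Maris versus McGwire, you would call the same function twice, once with each season's league mean and SD, and compare the two results.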
Finally, that brings us to t-tests, which I used as recently as this past week and will use as soon as this upcoming week.  The t-test compares the means of two different groups to explain whether or not they are significantly different.  This is different than gauging a general difference between two groups.  If we have two sets of data, one with a mean BA of .276 and the other with a mean BA of .264, it definitely appears Group A performed better with regards to batting average.  This may be true, but it is not necessarily so.  Sure, the number itself is higher, but perhaps the sample sizes are too small and the difference is purely noise.  This test accounts for that possibility and explains when means are or are not significantly different from one another.
The calculation of the t-test can be found here, though it is much easier to automate the process via SPSS or some other statistics program.  Once the t-value is calculated we then have to match it up with its significance level to see if the difference between the means is real.  SPSS goes right to the significance level to save some time.  By convention, a p-value of .05 or below is treated as the means being significantly different; any higher and we cannot rule out that the gap is just noise.  If Group A had a .276 and B had a .264, and the p-value of the t-test is .013, then yes, the means are different from each other and Group A really did perform better relative to that metric.
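For those without SPSS, here is a minimal standard-library Python sketch of the t-value itself.  It uses Welch's version of the two-sample test (which does not assume the two groups share a variance); that choice, the function name `welch_t`, and the two five-man groups are my assumptions for illustration.  The invented batting averages are picked only so the group means come out to .276 and .264.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical batting averages; invented so the means are .276 and .264.
group_a = [.290, .281, .268, .275, .266]
group_b = [.270, .259, .262, .271, .258]


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f} with about {df:.1f} degrees of freedom")
```

Turning the t-value into a p-value still requires a t-distribution table or a statistics package; if SciPy is installed, `scipy.stats.ttest_ind(group_a, group_b, equal_var=False)` should return both the statistic and the p-value in one call.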
T-tests are great for comparing the means of two different datasets and, in baseball terms, can be used to do things like compare players before and after in splits, on the road or at home, etc, anything along those lines.  They help us understand that a higher number doesn’t always mean the group with the higher number is better, or that the lower number is worse.  For more recap on statistics, I highly recommend Pizza Cutter’s primers, which can be found by clicking on some words in this sentence.

The Deal With Derby Participants

In 2005, Bobby Abreu of the Phillies put on a showcase at the Home Run Derby, breaking the single-round record en route to a derby victory.  All told, Abreu swatted 41 home runs into the Detroit stands that night.  Since that fateful day, three years ago, Abreu has hit just 47 home runs total.  From 1999-2004, he averaged 24.3 HR/yr; combining the second half of 2005 with the first half of 2008, and adding it to 2006-07, Abreu has averaged just 16 HR/yr since.  While several factors could account for this drop, such as age, a decline in bat speed, or more grounders, to name a few, many fans and analysts alike attributed his second half dropoff to a “tired swing” due to the derby.
This story is not alone, either.  Juan Rodriguez, in this Florida Sun-Sentinel article, discusses the highly popular idea that home run derby participants will experience a second-half dropoff.  According to his research, half of those studied experienced drops in their home run rates while the other half stayed stagnant or increased their rate.  Still, whether in jest or born out of complete concern, the idea of a derby-driven dropoff is a very popular one.
There have not been many studies on the subject either, meaning nothing has necessarily debunked the myth or proven it to be true.  I have seen a couple of studies but they, just as I did in an initial look at this very subject, fell into the same trap.  By straight up comparing first half to second half, our results are not truly expressing what we intend.  The problem stems from a selection bias in that those named to the all-star team or home run derby are likely having big first-halves.  Overachieving first-halves, that is, meaning they are naturally due for a second half regression whether they participate in the derby or not.
To properly conduct this study, by using the true talent level, we need to compare the actual second half production to the projected production based on the previous three years and the big first half of the year in question.  This way, the real test will be whether or not players fall short of their projection in the second half.  If so, then yes, the idea of a decline following the derby does carry some weight.  If not, and/or the results are ambiguous, then it is nothing more than a theory, as there would be nothing to suggest a decline.
Those supporting the idea could play the “tired swing” or “uppercut mentality” cards but I say it’s largely poppycock.  I am, however, willing to be openly swayed by the numbers should they come to suggest such a result.  My first step was compiling a list of all derby participants from 2000-2007, then entering their actual second half performance into a spreadsheet.  Using the in-season Marcel projector, I then manually entered the pertinent numbers into the required fields, which took forever (Hardball Times, you need to go prior to 2004!) but eventually offered the projections for the second halves in each of those years for each of those players. 
Next, I tested the strength of the numbers by running a simple correlation.  As you will see below, everything other than batting average correlated quite strongly to each other between the halves:

  • AB/HR: .49
  • BA: .28
  • OBP: .65
  • SLG: .58
  • OPS: .61
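
For reference, a "simple correlation" here means Pearson's r between two paired columns of numbers.  Below is a minimal Python sketch of that calculation; the function name `pearson_r` and the five pairs of OBPs are invented for illustration and are not the study's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical paired OBPs for five players (invented numbers).
first = [.380, .395, .360, .410, .372]
second = [.391, .402, .355, .420, .368]

r = pearson_r(first, second)
print(f"r = {r:.2f}")
```

A value near 1 means the two columns rise and fall together, a value near 0 means little linear relationship, and a value near -1 means one rises as the other falls.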

Testing the correlations or running a linear regression could help in this study but I decided to go with a paired samples t-test instead.  A t-test compares the means between two sets of data and lets us know if the differences between the means are statistically significant or not.  Keep in mind the sample size here is 64 players so these results may not be anything definitive, but I’m really just testing to see if the idea of a decline should be given any credence, whether or not it shows up in any way in the numbers. 
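A paired samples t-test works on the player-by-player differences rather than on the two group means separately.  Here is a minimal standard-library Python sketch of the t-value; the function name `paired_t` and the six projected/actual OPS pairs are invented for illustration and are not the 64-player dataset.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(projected, actual):
    """Return (t statistic, degrees of freedom) for paired observations.

    A positive t means the actual values ran above the projections on average.
    """
    diffs = [a - p for p, a in zip(projected, actual)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical projected vs. actual second-half OPS for six players.
projected_ops = [.910, .945, .880, .960, .925, .900]
actual_ops = [.950, .930, .915, 1.010, .940, .935]

t, df = paired_t(projected_ops, actual_ops)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

With 64 players there would be 63 degrees of freedom.  The p-value comes from comparing t against the t-distribution; if SciPy is available, `scipy.stats.ttest_rel(actual_ops, projected_ops)` should give both numbers directly.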
Anyways, back to t-tests: In them, a p-value of .05 or less suggests that the means are, in fact, statistically different.  Higher than that and we cannot conclude the means really differ, regardless of whether or not one appears higher or lower than another.  Since we are testing for a decline here, the expectation is that the mean of the projected statistics will exceed the mean of actual statistics.  After running the t-test I was surprised to find that all five measured stats (AB/HR, BA, OBP, SLG, and OPS) had a p-value below .05; in fact they were all below .03, with batting average being the least significant.  Since the means are all significantly different from a statistics standpoint, here are the comparisons:

  • Projected AB/HR: 17.8
  • Actual AB/HR: 19.9
  • Projected BA: .293
  • Actual BA: .299
  • Projected OBP: .382
  • Actual OBP: .397
  • Projected SLG: .546
  • Actual SLG: .563
  • Projected OPS: .928
  • Actual OPS: .961
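
A paired samples t-test of the kind described above can be sketched in plain Python.  This is an illustration only, not the spreadsheet or stats package actually used for the study: the function names are my own, the p-value comes from a numerical integration of the t density, and the six SLG pairs are invented.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log(1 + x * x / df))


def paired_t_test(projected, actual, steps=10_000):
    """Return (t statistic, two-sided p-value) for paired samples.

    Assumes the paired differences are not all identical.
    """
    diffs = [a - p for p, a in zip(projected, actual)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    t_stat = mean_d / math.sqrt(var_d / n)
    df = n - 1
    # Simpson's rule (steps must be even) for the area between 0 and |t|.
    upper = abs(t_stat)
    h = upper / steps
    area = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    p_value = max(0.0, 1.0 - 2.0 * area)
    return t_stat, p_value


# Hypothetical SLG pairs for six imaginary players (NOT the study's data).
projected_slg = [0.520, 0.545, 0.560, 0.530, 0.575, 0.550]
actual_slg = [0.540, 0.550, 0.590, 0.525, 0.600, 0.570]

t_stat, p_value = paired_t_test(projected_slg, actual_slg)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```

The test works on the per-player differences, which is why each player's projection has to be lined up with that same player's actual number.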

According to these results, the derby participants from 2000-2007 have actually outdone their projections in the slash line department as well as in OPS.  This suggests, at least amongst these numbers, that these players are not declining in overall production in the second half relative to what they were expected to do.  In fact, it might even point in the opposite direction, that the derby was merely a stepping stone towards a great year for the players in question.  By outdoing the second half projections and beating the expected regression, the slash line and OPS do not suggest decline in the least.  We may be picking nits over whether it suggests improvement, but definitely not decline.
However, and it’s a big however, the AB/HR did get worse in the actual data.  On average, the projected player would hit a home run once every 17.8 at-bats, while the actual players did so once every twenty or so at-bats.  Essentially, the overall production of these players did not decline but their rate of home runs did.  From a psychological standpoint, Pizza Cutter noted that perhaps pitchers will bear down more in the second half against these all stars and derby participants to avoid surrendering home runs to them, even though pitchers tend to give up more flyballs in the second half.
Overall, I would like to extend this into a larger study unless someone beats me to the punch, to see if the results hold up when we add, say, 12-15 years of derby data to the fold.  Based on this study, however, it does appear that players will experience a drop in their home run rates while simultaneously beating their projections in BA, OBP, SLG, and OPS.  The key to remember is that we are comparing projected second halves to actual, not a straight up comparison between both halves for each player; that would produce different results, and wouldn’t be a fair test for decline.
Taking this to the next level would involve using a larger sample of derby participants and conducting another t-test to compare the means in several areas.  Additionally, we would want an equal sample of non-derby participants with similar numbers in perhaps the AB/HR area.  We would conduct the same t-test for them and see if the derby actually has an effect; if it does, then we would see the same worse AB/HR rates for derby guys but different results for the non-derby guys.  They would be our control group.  For now though, it’s interesting to see that the derby participants only decline in that area.  Essentially, we cannot automatically assume that the derby caused the AB/HR decline until we see them stacked alongside similar players not in the derby as a control.

On the reliability of defensive abilities, part 2

And as often happens here at StatSpeak, we return to the curious case of Denny Hocking.  Hocking, the former Minnesota Twins “10th man,” is to this day one of my favorite Sabermetric players.  Not because he was any good, mind you.  In fact, he never… really… did much.  He simply amuses me for the fact that a man with a career .251/.310/.344 line with at-best-average speed could hang around for thirteen seasons.  What was his secret?  He was “versatile.”  Hocking played all 7 non-battery positions in his career, and played all of them… correctly.  But, he’s a symbol of a guy who can extend his career simply by being willing to be a jack of all trades (and a master of none) and a symptom of the fact that it’s hard to find someone who can play more than one position in the field.
Really though, how hard is it to play more than one position?  Hocking did it.  If a player is a gifted enough athlete, shouldn’t he be able to perform, no matter where he’s placed on the ball diamond?  Let’s take a look to answer this one.
First off, let’s look to see if there is a correlation in skills within a position.  I ran the OPA! system on all players from 2004-2007 and took all those who had logged at least 450 innings (50 games) at a position.  Then, I looked, position-by-position, at whether there were inter-correlations among the various skills that a player might show.  That is, I looked to see whether, among first basemen, “range” OPA! on ground balls was correlated with “hands” OPA! on ground balls or “arm” OPA! on ground balls and so on.  (I normed each skill by dividing by the number of chances each player had.)  This serves as a test of whether “defensive ability” is one skill or several.
I found almost nothing.  Just about everything was uncorrelated with everything else.  (For the really statistically super-savvy, I even went so far as to run some factor analytical models to see if I could salvage anything.  If you ever want to see some ugly factor loading plots, call me.  This could be the result of the fact that I have some unstable parameters year-to-year, or it could be that fielding skills are unrelated to one another.  If you have no idea what I just said, don’t worry.)
The few significant correlations that I did find were modest at best (< .30), although they did put my mind a little bit at ease on a few things.  At both second and third base, a strong arm was correlated with a few unexpected things.  In both cases, a strong arm meant a good amount of skill at fielding the ball cleanly.  A strong arm also correlated (weakly) with, of all things, range on popups (third base) and range on line drives (second base).  Those might be type I errors, due to running so many tests.  But there’s also a correlation at shortstop between arm OPA! per ground ball and range OPA! per ground ball.  It was modest (r = .209) but it was positive.  I had actually worried that these would be negatively correlated in that a player who had excellent range would get to a lot of balls that others would simply let into left field, but which he would have no chance of actually making a throw on.  This probably does happen, but it looks like a shortstop with good range is more likely to have a good arm rating.  Looks like those gross-motor skills develop together.
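The worry about type I errors is easy to put a number on.  Here is a rough sketch, assuming the tests are independent (they almost certainly are not, since the skills share players) and using hypothetical test counts, since the actual number of correlations run is not reported here.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_alpha(num_tests, alpha=0.05):
    """Per-test threshold that keeps the overall error rate near alpha."""
    return alpha / num_tests


# Hypothetical numbers of tests; the real count is not given in the post.
for m in (1, 10, 45):
    print(m, round(familywise_error_rate(m), 3), round(bonferroni_alpha(m), 4))
```

With a few dozen tests at the .05 level, a couple of "significant" oddities like arm strength and popup range are about what chance alone would deliver.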
Turning the double play at second (r = .689) and short (r = .633) was correlated strongly with a player’s arm ratings on ground balls generally.  That makes sense since both are essentially throws anyway.  In the outfield, almost nothing was correlated with anything else.
This means that fielding skills (things like soft hands, a good arm, etc.) do not all come as one package, but are all component pieces that a player must have.  A player can easily have soft hands but a noodle arm… or soft hands and a good arm.  It’s hard to find a player who is particularly good at all of them, but what we can do is look at which ones are most important at which position.  I calculated a player’s OPA! per inning at the position, and looked at which of the player’s skills were most closely associated with the amount of out probability he was adding.
On the infield, not shockingly, ground ball abilities were most closely associated with OPA!  After all, that’s most of what an infielder does.  But, the parts of the ground ball that were most associated with making outs differed between the positions.  Below, I present the correlations at each position between the skill per GB rating for each position and OPA! per inning.
pos — range — field — arm  — catch
1B  — .604  — .670  — .477 — .265
2B  — .581  — .625  — .740 — .005
SS  — .651  — .386  — .741 — (.037)
3B  — .623  — .615  — .724 — (.045)
Line drive range is also significantly correlated at all four positions, but not equally so.  Line drive range is much more important for “basemen” than the shortstop.  Liner range correlated with OPA! per inning at .433 at first base, .490 at second base, .298 at shortstop, and .415 at third.
A first baseman actually benefits from being able to cleanly field a grounder most of all the infielders.  He’s also the only one who really benefits from being able to catch the ball, although he’s pretty much the only one who needs to use that skill on a consistent basis.  The big surprise is that a shortstop booting a few ground balls isn’t the best determinant of his defensive prowess.  A shortstop doesn’t benefit most from soft hands, but from good range and a good arm.  A player who has soft hands, but a lousy arm at another infield position might benefit from a move to first.  A shortstop who has lost some range but still has soft hands might make a good second baseman, and a second baseman who has good range, but some trouble fielding the ball on a grounder might actually benefit from a move to short.  A second baseman and a third baseman seem to share about the same profile for success.
In the outfield, range on flyballs and liners was uniformly important (no shock there), although a center fielder and a right fielder’s arm rating in gunning runners down was also moderately correlated with OPA! per inning (CF, r = .371, RF, r = .334).  Again, not much of a surprise that a right fielder’s arm would come into play, but I don’t know that a center fielder’s arm quite gets the credit it’s due for how important it is.
Are skills portable from one position to another?  For example, if I have a strong arm at second, will it translate into a strong arm at third if I play over there?  To study this further, let’s take a look at the Denny Hockings of the world.  I looked for all players in a year who had logged more than 90 innings (10 games) in a season at two different positions between 1995 and 2007.  I ran correlations to see if, for example, range at second base correlated with range at short.
Ground ball range was very portable across the infield.  In fact, other than first base and shortstop, every other combination of positions on the infield had a correlation of better than .74.  Range on line drives and to a lesser extent, popups told a similar story.  If you’re fleet of foot (or leaden of foot) at one position on the infield, you’re going to be similarly fleet (or leaden) elsewhere on the infield.  If a team has a slow infielder, it’s just a matter of figuring out where to hide him so that he will do the least damage.
Hands (not making a fielding error on a ground ball) was moderately consistent (correlations around .35) between first basemen and second basemen, and first basemen and shortstops (although oddly, not between second and short).  Arm ratings on ground balls were somewhat correlated (r = .224) for second and third basemen and catching throws was consistent across first and second base (r = .448).  So, some skills can be taken with you from position to position, but in general, it’s interesting to see how many correlations were not significant.  In particular, arm ratings did not translate from position to position.  Since we’re measuring players against themselves, arm strength is the same.  Maybe it’s a matter of different types of throws that need to be made.  The throw from third to first is long, but it’s a pretty straight shot.  The throw from second is shorter, but may require the fielder to throw from a much more contorted body position.
However, once you put everything together, total OPA! per ground ball is moderately consistent across most infield positions.  First base and third base are correlated at .360.  Shortstop and second base are correlated at .606.  Performance at third base also correlates fairly well with performance at short (r = .504) and second (r = .568).  The same pattern emerged for line drive OPA!  Because range is so consistent, and it’s the first step in doing anything, it’s going to have a profound effect.
One other finding that might mean something.  When it comes to catching line drives, SS and 2B have a correlation of .315.  However, 1B and 3B have a correlation of -.285.  The angle that a line drive comes off the bat might make a difference, and fielders might be better at some angles than others.  In the middle infield, where the angles are likely more similar, and the fielders play deeper anyway, it’s probably a little easier.
So, considering all of the above, if a player is a good fielder at one infield position, would he thrive at another?  After all, it doesn’t matter how it’s done, you just have to produce outs.  Correlations of total infield OPA! per inning at the various positions are presented below.
pos — 1b  — 2b   — 3b   — ss
1b  — 1.0 — .278 — .347 — .101
2b  — xx  — 1.0  — .504 — .528
3b  — xx  — xx   — 1.0  — .434
ss  — xx  — xx   — xx   — 1.0
Correlations are pretty good between second, short, and third, and if a first baseman is going to play anywhere else, it’s probably going to be third base.  That’s basically the pattern that plays out in actual bench construction.
In the outfield, things are much more murky.  I couldn’t find a single correlation worth reporting in which a skill at one outfield position carried over to another.  Nothing.  In part one, I found that most of the outfield abilities that I measured were not very stable from year to year.  It seems that outfield defense is a little bit like batting average on balls in play.  Sure, there’s some skill to the craft, but the outfield is a big place, and over the course of a season, there’s not enough to measure to really get a good idea on how good a fielder is.  At least in my system.
I have to leave open the thought that while my system seems to work well for infielders, the lack of location data is crippling me.  I can imagine that if I had more concrete location data to use (rather than I know that the ball was hit into the general vicinity of the left fielder), I’d be able to sharpen things a bit better.  As of right now, I have to say that I’m not thrilled with the results that OPA! is giving me in the outfield.  The other possibility is that outfield defense is much more luck than we’d like to give it credit for.  At this point, I’d have to say it’s an open question.
So, was Denny Hocking Superman?  His range rankings on the infield show up as usually pretty high, so he had the most basic skill, and the one that would help him the most.  Hocking got pretty high marks at 2b, ss, and 3b in the seasons that he played there on most of the other skills that OPA! measures.  So, he managed to master several different and seemingly unrelated skills.  Not a bad thing to be able to say.  And he can say he was a major leaguer for 13 more years than I was.  So, reader, you now know a little bit more about OPA! and a lot more about what talent it takes to be a utility infielder.

On the reliability of defensive abilities, part 1

What part of defense is luck and what part is skill?  As I continue the development of my fielding system (OPA! or out probability added above average), I’m curious as to whether fielding is a repeatable skill.  Some time ago, DIPS theory was coined when it was discovered that certain things that pitchers do (give up walks, strike batters out) were more stable from year to year, while others (BABIP) were not.  The conclusion is that a pitcher has much more of a repeatable skill for events that do not involve the defense than those that do.  What of fielding skill?  Are there fielding events which seem to be more repeatable from year to year?
The nicest part about my OPA! system is that I’m able to take an isolated look at various components of fielding, especially on ground balls.  Add a few adjustments for the number of balls hit to the fielder and it’s easy to create a rating per ground ball or fly ball or popup for everyone in baseball.  Select out the guys who have played a decent amount of time at the position (450 innings, which is 50 full games in a season).  My statistical technique of choice is intraclass correlation, which is sort of like a year-to-year correlation, but with the ability to use multiple datapoints.  In this case, I’m using four years’ worth of data from 2004-2007.  The higher the correlation (the maximum is 1.0), the more stable the stat from year to year.  In a lot of cases, the results were in the .10 to .20 range, meaning that either those particular skills at that particular position are much more the result of luck than anything else, or there’s a lot of skill involved and my measure is completely off.
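For the curious, here is a minimal sketch of one common flavor of intraclass correlation, the one-way random-effects ICC.  Which variant was actually run for these numbers is not stated, so treat the choice as an assumption; the sketch also assumes every player has a rating in every season, and the ratings below are invented.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for a balanced table.

    ratings: one list per player, each holding k season ratings.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 players with the same 2+ seasons each")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Variation between players versus variation within a player's seasons.
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2
                    for row, m in zip(ratings, row_means) for x in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical per-chance range ratings, four seasons each (NOT real OPA! data).
sample = [
    [0.020, 0.018, 0.025, 0.021],
    [-0.010, -0.004, -0.012, -0.008],
    [0.003, 0.010, -0.002, 0.005],
]
print(round(icc_oneway(sample), 3))
```

When players differ a lot from one another but each player's seasons look alike, the value climbs toward 1.0; when a player's own seasons bounce around as much as the gap between players, it falls toward zero.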
I’ve detailed my method for getting the ratings on infield grounders here and here, discussed fly ball extra bases prevented (cutting the ball off) here, and my model of fly ball range here.  I looked at line drives and pop ups (for infielders) much the same way.  I pretty much stole John Walsh’s method for rating outfield arms.
I’ve hidden the numerical spaghetti behind the cut.  For those who are interested in the gory details, it’s all there in black and white.  For those who just want the conclusions, here they are:

  • Range is generally the most consistent thing about a fielder from year to year.  That means it’s something you’re either good at or not.  Sorry, Derek.
  • Middle infielders are more consistent in those stats that measure gross motor skills (range, throwing) than fine motor skills (catching, fielding).  Makes sense.  They have to cover the most ground, so coaches look for the guy who can consistently get to the ball.  We can worry about picking the ball off the ground when you get there.
  • Ratings involving catching the ball for first basemen are pretty consistent from year to year.  Part of it is that the first baseman is asked to catch a lot of throws during the game, so his “true” talent level is more likely to be exposed.  But, we also see this pattern when it comes to catching line drives and popups.  Then again, I suppose that a coach finds a good first baseman by looking for the guy who’s good at consistently catching the ball.
  • In fact, corner infielders seem to be much more consistent in their fine motor skills (fielding, catching) from year to year.  Of course, they also have less time to react to grounders (but don’t have to cover as much ground), since they play closer to the plate, so it makes sense that a player who was particularly sure handed would be steered to those positions.  It also means that since they don’t have a lot of reaction time, whether they get to the ball or not will be more of a matter of luck.
  • Sean Smith, former StatSpeak writer and inventor of TotalZone, one of the systems to which I owe a great deal of credit on OPA!, once pointed out that pop ups tell us nothing about an infielder’s defensive abilities.  He’s right.
  • Line drives to the right side are mostly a matter of luck.  Line drives to the left side are slightly less a matter of luck.  I’m not entirely sure why.  It might be that there are more RH pull hitters who hit liners to the left side, so it’s just a matter of the left-sided fielders getting more chances.
  • Left fielder range is surprisingly more consistent year-to-year than CF or RF range.  Maybe it’s just that the guys who end up in LF are consistently slow.
  • In fact, most measurements in the OPA! system begin to become less consistent year-to-year as you move from left to right in the outfield.  Weird.
  • Arm ratings for outfielders showed an interesting pattern.  Left fielders were more consistent in their actual throwing runners out than their holding runners.  Center fielders showed the opposite pattern.  There was a lot of consistency in how many runners they held (presumably by the reputation of their arm), but little consistency in how many outs they recorded.  Seems that runners and third base coaches are afraid of center fielders’ arms on reputation despite the fact that a CF’s performance in the past has little bearing on his performance in the future.  Right fielders showed a good pattern of stability in both holding runners and throwing them out.
  • OPA! overall is pretty consistent for infielders, but not so much for outfielders.  Again, either my measure is flawed (entirely possible) or outfield defense really is something of a crapshoot.  It’s weird because the list of outfielders that I get in terms of range at least passes the smell test.  Hmmm…

Fielding positions are, of course, not arbitrarily assigned.  Unlike evaluating hitting, which everyone does, to evaluate second basemen or left fielders is to evaluate a very specific set of folks.  There are specific skill sets that come in handy in different positions on the field and it appears that managers (presumably at all levels) subconsciously or not-so-subconsciously put guys whom they know are consistent at that particular skill at those positions.
Tomorrow, I’ll look at the correlations between different fielding abilities, both within a position and among different positions (think utility infielders…)
Click below for the gory details.

World Famous StatSpeak Roundtable: July 21

The weekly roundtable is pleased to welcome Sal Baxamusa, of The Hardball Times for a little discussion of the latest in baseball news.  Sal is a grad student and a proud first-time dad (yay!) and today he joins us to talk about Rich Harden, deadline deals, and Pizza Cutter’s favorite topic in the whole world…
Question #1: Do the Indians have a strong enough core with Sizemore/Peralta/Martinez/Hafner/Carmona to regroup and try again next year, or is it time for another extended rebuilding process?
Sal Baxamusa: “Rebuilding” is a funny thing.  Lots of teams talk about doing it, lots of ink is spilled over whether or not teams should do it, but there’s a dirty little secret about rebuilding: nobody ever really does it successfully.  How many teams in the past ten years have dismantled a mediocre team with the express purpose of acquiring prospects, and then turned those prospects into the core of a great team?  The Marlins, certainly.  One might say the A’s of the late 90s.  And most definitely the recent-vintage Indians.
Pretty much every possible veteran on the 2002 team, from Chuck Finley to Ricardo Rincon, was made into prospects.  Some of these deals worked out and some of them didn’t.  But a strong core of Travis Hafner, Cliff Lee, Grady Sizemore, and Coco Crisp were all acquired for veterans.  Along with CC Sabathia, Jhonny Peralta, and Victor Martinez, the Indians created a team that was excellent for three years, outscoring their opponents by 148, 88, and 107 runs from 2005-2007.
Now the Indians are mediocre again.  What to do?  They still have a potentially strong core in Martinez/Peralta/Hafner/Carmona/Lee/Sizemore.  Going into the year, that would be the envy of any contending team: a young group of star to superstar performers.  But now Martinez, Hafner, and Carmona are hurt.  Peralta’s 2005 looks less like the breakout of a superstar and more like an early-career peak.  Only Lee and Sizemore are performing up to expectations (and, to their credit, well beyond expectations).
Management has already punted this season by trading Sabathia.  But I don’t believe they should punt on 2009.  For starters, the Indians don’t really have the veteran trade chips to get a huge prospect haul.  The guys who might have had the most trade value now have serious questions regarding their health and/or their performance.  The Indians would be selling low on them, hardly optimal conditions for accumulating top-shelf young talent.  They’re better off hoping for a return to form by Hafner, Martinez, and Jake Westbrook than by trading them.  And who’s lining up to deal for Paul Byrd?  Jorge Julio?  David Dellucci?
Secondly, and most importantly, the Indians have a huge head start on the rest of the league simply by fielding Grady Sizemore, easily a top-5 position player in the AL.  He’s coming into his prime, and there’s no way the Indians would get fair value on any trade of Sizemore – no team has the quantity or quality of prospects that the Indians would demand in a potential Sizemore trade (or, if they did, they’d be very close to being a good team).  Furthermore, a player like Sizemore rarely falls into your lap.  When you’ve got him, you build around him.
The 2009 Indians would be better off hoping for health from some of their key contributors, building some depth with low-key moves as a hedge against those contributors’ health (or performance), judiciously plugging holes with role players, and letting Sizemore do the rest.  Sure, it’s easy for me to say things like “plug the holes,” but I have faith in the Indians.  From my experience with the club, they have a bunch of really smart folks working for them.  They’ve got the ability to build a winner next year, and I hope they do it.
Eric Seidman: The Indians befuddle me.  In 2006 they were coming off of a great 2005 season and looked primed to win the division.  Then they went 78-84.  Despite that, their pythag record was 89-73.  They follow that up with a division win last year, and this year sit under .500 despite a 48-47 pythag record.  Sure, they lost Sabathia, but the return was solid, and Victor Martinez is not only hurt but homerless in the 54 games he played, so some guys are underachieving.  I don’t know what Pronk’s deal is but I think they’ll largely be fine next year with some added parts in the free agent market but another year or two like this could spearhead the rebuilding movement.
Pizza Cutter: Must… not…. gush.  But then, I write this from Cleveland and my real job is just down the street from Jacobs Progressive Field.  Grady Sizemore is ascending to the level of one of the best players in the game, and he’s signed at a ridiculously underpriced contract.  Victor Martinez has battled injury this year, but is one of the best catchers in baseball.  Fausto Carmona is for real and I’m starting to be a Cliff Lee believer.   On the other side, Hafner is looking more like his monster 2006 was a fluke.  Jhonny Peralta is the greatest enigma in Cleveland sports.  The Indians have some very good pieces, and are well positioned financially, plus they have some good pitching depth in their system, so there should hopefully be some money.  The problem is that other than the newly-acquired Matt LaPorta, there’s no earth-moving position player coming up through the minor league system.  So, they’ll have to shop for spare parts on the free agent market.  This might be my home-team tendencies shining through, but I think the Indians do have enough to re-load for next year, rather than to rebuild.  Like a few other teams they’re a few pieces and a little bit of luck away.  It’s close enough to justify taking a shot over the next year or two.

Then and Now: Cliff and CC

On Thursday, I offered a primer of sorts with regards to how players should and should not be evaluated.  The key involves using the player’s true talent level–derived from weighting the last three or more seasons of data–in order to project current numbers.  Without regurgitating the entire post I will sum it up by saying it is always incorrect to solely quote this season’s statistics when conducting such an evaluation.  At a certain point later in the year, like around now, the 90-100 game mark, current season numbers can truly garner the most weight, but still require the previous three years to serve as supporting evidence.
All of this leads me to my major point today, using math anyone and everyone can understand: Four starts in April is not 90-100 games into the season.  It may seem like common sense and elicit sarcastic reactions at the obvious nature of this statement, but rewind back to April and take another look at the almost uniform reactions to both Cliff Lee and CC Sabathia.  These were two pitchers heading in what appeared to be different directions entering the year–CC winning the Cy Young Award in 2007, primed for a big contract after the season, and Lee struggling to even receive a guaranteed spot in the rotation.  Despite this, they produced results of the inverse extremes through four starts:

  • Cliff Lee: 4 GS, 31.2 IP, 11 H, 1 ER, 2 BB, 29 K, 0.28 ERA, 1.21 FIP
  • CC Sabathia: 4 GS, 18 IP, 32 H, 27 ER, 14 BB, 14 K, 13.50 ERA, 7.25 FIP
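
As a quick sanity check on the lines above, ERA and K/BB can be recomputed from the raw counts.  The only wrinkle is that box-score innings such as 31.2 mean 31 innings plus two outs, not 31.2 in decimal; the helper names here are my own.

```python
def innings(ip_notation):
    """Convert box-score innings like '31.2' (31 innings + 2 outs) to a float."""
    whole, _, outs = ip_notation.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def era(earned_runs, ip):
    """Earned runs allowed per nine innings."""
    return 9 * earned_runs / ip


lee_ip = innings("31.2")
cc_ip = innings("18")

print(round(era(1, lee_ip), 2))   # 0.28
print(round(era(27, cc_ip), 2))   # 13.5
print(29 / 2, 14 / 14)            # K/BB: 14.5 1.0
```

FIP is left out on purpose, since it needs home runs allowed and a league constant that the lines above do not include.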

Most Cleveland fans didn’t know how to react, mainstream writers wrote Sabathia off as struggling under the weight of his impending new contract, and I’m sure John Kruk cherrypicked videos to show that, without question, something was wrong with CC’s mechanics.  After nothing more than four starts the minds of many were made up that Cliff Lee was absolutely amazing and Sabathia was in the midst of a severe post-award dropoff.  Recall Thursday’s example of a 3-9 hitter vs. a 300-900 hitter: with such small samples, a few more games of data could drastically change the outlook.
Granted, CC and Cliff were on extreme opposite ends of the four-start spectrum but what should have been asked is how their early season performance affected their true talent level.  Did it at all?  Or were their starts so incredibly good or bad that they were capable of overcoming small sample shortcomings to drastically alter the true talent level?  For starters, Lee was projected to post a 4.38 FIP entering the year, with Sabathia at 3.25.
Sabathia’s 7.25 was much higher than his projected 3.25 whereas Lee’s 1.21 was about four times lower than what was projected for him.  Plugging their starts into Sal Baxamusa’s in-season Marcel projector tells us that, after these four starts, Lee was projected to post an FIP of 4.03 over the remainder of the season whereas Sabathia would post a 3.47 FIP in the coming weeks and months.  Lee improved by about one-third of an FIP point while Sabathia dropped by about one-fifth of one.  Something to keep in mind is that each of these was an extremely small sample, with insanely good or bad numbers.   Still, the true talent levels had not changed as significantly as some thought.  Cliff Lee was not considered capable of being a 1.21 FIP pitcher with a 14.5 K/BB.  His projection understood that he would likely outdo his pre-season 4.38 FIP projection but not to the extent he did in the early going.  With 27-29 starts remaining after those initial four, the balance FIP of 4.03 would more than outweigh the early 1.21.
Sabathia, however, had been bad enough in the first four starts that his projection understood he HAS to perform better, but that his overall numbers this season might not be as great as was thought possible prior to the beginning.  A 7.25 FIP through four games would largely be drowned out by 30 more starts of 3.47 FIP, but the end result would not see Sabathia with a 3.25 FIP like his pre-season projection.  He would perform much better than his first four starts but not better enough to even things out to the 3.25.
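To see how little four starts move a full-season number, here is a back-of-the-envelope blend of the early FIP with the rest-of-season projection.  It weights by starts rather than innings, which is my simplification (the projector itself works differently), and it takes 28 remaining starts for Lee as the midpoint of the 27-29 range and 30 for Sabathia.

```python
def blended_fip(parts):
    """Starts-weighted average FIP over (starts, fip) pairs."""
    total_starts = sum(starts for starts, _ in parts)
    return sum(starts * fip for starts, fip in parts) / total_starts


# (starts, FIP): the first four starts, then the projected remainder.
# Weighting by starts instead of innings is an assumption of this sketch.
lee = blended_fip([(4, 1.21), (28, 4.03)])
sabathia = blended_fip([(4, 7.25), (30, 3.47)])

print(f"Lee: {lee:.2f}, Sabathia: {sabathia:.2f}")
```

Under those assumptions Lee lands near 3.68 and Sabathia near 3.91, which is the point of both paragraphs: the hot start only partly pulls Lee below his remaining-season projection, and the cold start leaves Sabathia short of his pre-season 3.25.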
All told, after these first four starts, Lee’s 4.03 projection deemed him a good, not great, pitcher, whereas Sabathia’s 3.47 still merited him some recognition in the “great” department.  What happened from that point until now?  I’m glad you asked!  Since those fateful fourth starts, Cliff Lee has made 14 starts at a 2.48 FIP while Sabathia has made 16 starts at a 2.32 FIP.  For the season, excluding last night’s action, Sabathia’s FIP at the half was 3.23, Lee’s was 2.31.  It appeared that, despite his four poor starts to begin the season, Sabathia had pitched so well in the following weeks that his halfway point FIP actually outdid his projection, something few people thought possible following his second straight 9-run performance in April.  Lee, however, was still outdoing his projection by a large margin even though it gravitated towards Earth.
Plugging their halfway point numbers in, Sabathia is projected to post a 3.21 FIP over the rest of the season while Lee’s projection calls for a 3.66 FIP.  Suffice it to say, these projections mean that relatively nothing will have changed for Sabathia, as his FIP would actually be an improvement, ever so slightly, over his pre-season 3.25 FIP.  Lee, on the other hand, would alter his projection significantly.  His FIP would be somewhere in the upper twos or lower threes, which would vastly outdo not only what was thought of as probable prior to the season but also what seemed possible during the season.
Entering next year, however, his 2008 numbers will be weighted the heaviest but not the sole factor in determining his true talent level.  Lee may very well post better numbers in this particular season, but that doesn’t make Sabathia’s any less remarkable.  Based on their track records, CC is more likely to sustain performance like this as well.  The point of all this is not to get carried away with small samples of early season performance.  It’s fine to look at what has happened and, on a granular level, analyze what went into those numbers, but don’t make Cliff Lee out to be the next Bob Gibson based on four tremendous starts that nobody could sustain, let alone a guy fighting for a rotation spot in spring training without a proven track record.  It’s not to say he couldn’t continue to pitch on a tremendous level or that he lacks talent, but that judgments and evaluations need to be based on concrete evidence, like a true talent level, and as I mentioned before, four starts is not the true talent level of a pitcher.
I believe it was Rob Neyer who once said that you have two choices as a baseball writer early in the season: have fun with the numbers or bore everyone about how the samples are so small.  I definitely agree, so long as that fun does not include going overboard and determining that a player has “changed” something based on four starts or something like twenty games for a hitter.
