A-Rod opts out (and oh yeah, Red Sox win World Series)

Has there been a free agent on the market of this calibre since, well, A-Rod was last on the market?  (Oh yeah, congratulations to the Red Sox for winning the World Series.)  For those of you who somehow didn’t hear, Alexander Emmanuel Rodriguez announced that he’s tearing up the most expensive contract in baseball history (that rush of wind is the sigh of relief coming from Arlington, Texas) and pursuing the free agent market.  The Yankees have already sworn that they won’t talk to him (but they will need someone to play third next year!), and leaving New York is never easy.  But, do you want a reigning MVP (has that been made official?), who can play third base or, if you asked him nicely enough, shortstop, and who will probably get himself up to 700 home runs (or more?) while playing for your team?  Maybe you’d like the man who will go down as the “Greatest Hitter of All-Time” (debatable, but another day) to go into Cooperstown wearing your hat?

All it will cost you is around $30 million per year (want to take a guess on exactly how much?), or roughly what the Florida Marlins paid everyone on their roster this past year.  Don’t worry about buying a lemon.  Every sign points to you getting the real deal.  I suppose A-Rod could decide to join a Trappist monastery halfway through the contract, but that’s a danger with every MLB player.  We’ve got a good idea of A-Rod’s talent level, and let me tell you, it’s quite high.  He’ll probably start to decline a little bit, but he’s going to be the best in the world for a bit longer.

But he’s not worth $30 million per year, at least not for what he does on the field.  This past All-Star break, I asked the question of how much a run costs.  I’ll ask it again.  I have 2007 salary data for most MLB players, but I stuck to those making $1 million or more.  The reason is that rookies and young guys who tore it up this year (Ryan Braun, Hunter Pence, Jack Cust) were all generally making the minimum salary because they haven’t had a chance to hit free agency/arbitration yet.  The guys making a million have at least had the chance to do a bit of negotiation on their behalf.  I also have Baseball Prospectus’ rating of Runs Above Replacement (if I’m not mistaken, it’s just batting stats taken into account here).  I regressed salary on Runs Above Replacement.  The equation of what MLB teams will pay for a run was roughly a base salary of $4M for a bum who does nothing more than a AAA/waiver wire/bench guy could give you (a sad statement right there), and roughly $110,000 for each run above replacement thereafter.  A-Rod had 85.6 runs above replacement, leading the league by 8 runs over 2nd place finisher Magglio Ordonez.  When you work it out, he’s worth about $13.6 million based on his 2007 performance, or about $9 million less than he was actually paid.  No one is worth $25 million per year.  (If it makes Yankee fans feel better, he was only the fifth most overpaid hitter in baseball behind Jason Giambi, Derek Jeter… I swear this is supposed to make Yankee fans feel better… Richie Sexson, and Jason Kendall.)
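For anyone who wants to play along at home, here is a minimal sketch of that salary line.  It assumes the rounded coefficients quoted above ($4M base, roughly $110,000 per run), since the exact fitted values aren’t given, and the function name is made up for illustration.  With those rounded numbers A-Rod comes out near $13.4 million, a shade under the $13.6 million from the unrounded fit.

```python
# Rounded coefficients quoted in the text; the exact fitted values are not given.
BASE_SALARY_MILLIONS = 4.0     # what teams pay a replacement-level hitter
MILLIONS_PER_RUN = 0.11        # roughly $110,000 per run above replacement


def estimated_salary_millions(runs_above_replacement):
    """Salary in $ millions implied by the line: base + rate * RAR."""
    return BASE_SALARY_MILLIONS + MILLIONS_PER_RUN * runs_above_replacement


arod_2007 = estimated_salary_millions(85.6)
print(f"A-Rod 2007: about ${arod_2007:.1f} million")
```

Plug in any hitter’s runs above replacement to see what the market line says he “should” have made.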

This isn’t to say that A-Rod isn’t worth more than just his on-the-field performance.  He will sell t-shirts, and extra tickets, and maybe even prop up a fledgling cable network for your team.  But, if you pay him $30 million per year, it looks like you’d better be getting $15-20 million extra in revenue from A-Rod being a part of your organization.  In Baseball Between the Numbers, there’s a chapter on just this topic.  They suggest that a team that could best benefit from A-Rod financially in the “extras” that he brings is a team that’s on the cusp of a playoff berth, where A-Rod would make the difference between the playoffs and golfing in October (after all, there’s only one October!).

So we need to find a team that can afford to splash out $30M, needs a third baseman or shortstop, and where A-Rod would be the difference between making the playoffs and not making the playoffs.   Well, looking at team payrolls, we can probably exclude all the teams south of $70 million.  I have no rationale behind that other than I just pulled it out of the air.  We’ve eliminated 11 teams. 

Even assuming that all the remaining teams had a league average hitter playing 3B (the Twins have Nick Punto at third!  A-Rod was 110 runs better!), A-Rod is worth 82 runs over the league average guy, so about eight wins at the usual rule of thumb of roughly ten runs per win.  Teams remaining that were within eight wins of the playoffs?  Detroit.  Seattle.  The Mets.  Atlanta.  Milwaukee.  St. Louis.  San Diego.  The Dodgers.

Detroit could use an upgrade over Brandon Inge.  Seattle already has their own mistake of a contract at third base in Adrian Beltre.  The Mets have David Wright.  Atlanta has Chipper Jones, who could, in theory, move back to left field.  Milwaukee has Ryan Braun, who might move anyway because he’s an awful fielder.  San Diego is the land of Kevin Kouzmanoff, who’s been less than thrilling to Pads fans.  St. Louis and Scott Rolen?  LA and Nomah?  So, we’re down to Detroit, and maybe Atlanta, Milwaukee, San Diego, St. Louis, or the Dodgers.  With the exception of the Dodgers, they’re not the sorts of markets where an investment of an extra $15 million over what a guy’s worth can be recouped, given the size of their market base.

The Cubs and Angels, playoff teams both, although mostly in name, and ones in big markets, have been mentioned as possible suitors as well.  Apparently, they have money to burn.  Speaking of money to burn, I left out one team that needs a third baseman, is a non-playoff team without A-Rod but a playoff team with him, and has plenty of money.  They play in a big media market and could probably benefit from A-Rod’s presence on their own cable network.  The Yankees!  For what it’s worth, the Red Sox also need a 3B with Mike Lowell about to become a free agent, but other than sticking it to the Yankees, the move doesn’t make financial sense.  (But then again, they could stick it to the Yankees!  If they do that, however, don’t the Red Sox become the very Yankees they hate?)

So, this could become a Yankees-Dodgers-Angels-Cubs-Red Sox? bidding war.  My guess is that someone will end up with a $32-million-per-year trophy for the next five years (I completely made those numbers up) but, when looking at it with a hard eye, will live to regret buying something that just isn’t worth that much.

A-Rod is really good, but he’s just not quite that good.  No one is.

Oh and by the way, the Red Sox are World Champions.  Which is nice, but not the news of the day.


I thought we were all professionals here

This is the story of my search for professional hitters.  You know the type.  He’s a .260-.270 (read: average) hitter, but a “guy who does the little things with the bat that don’t show up in the box score.”  He makes “productive outs” (there’s an oxymoron!)  He’s “good with the bat.”  He “moves runners along.” (Run along now children!)  I bet he has a great personality too.
Despite the fact that the point of the game (at least on offense) is not to make outs, some hitters apparently make outs like nervous eighth graders at their first co-ed party… but that’s OK, because they make the good kind of outs.  I suppose that in the game of “let’s make the best of a bad situation,” there are some plays that produce outs that are preferable to others.  A sacrifice fly is an out, but it does score a run.  A grounder to the right side with a runner at second and less than two outs is better than a grounder to the left side, because the grounder to the right side probably advances the runner.  But, are there really guys who excel in making “productive outs”?
Well, let’s look at situations in which a batter has a chance to make a “productive out.”  There have to be less than two outs, because a batter who makes an out when there are two outs… do I really need to explain what happens next?  Also, there need to be runners on base.  A batter who makes an out with no one on base has just gotten himself out, and there’s really nothing else left to happen.  He also needs to, ummm… make an out.  But not just any out.  Productive outs are usually attributed to some sort of ability to place the ball on the field.  So, let’s look at all at-bats where the batter made his out on a ball in play (i.e., not a strikeout).  I found all the situations from 2003-2006 that met these criteria.
I figured out how much win probability each of these events added (actually, it’s usually more like subtracted).  Now, when dealing with WPA, we need to remember the WPA is affected by leverage in a given situation, so I divided the WPA for each event by the leverage of that event.  This gives us context-neutral wins added.  I then found the mean context neutral wins added (within this subset of plate appearances), for a batted ball of the type hit to the fielder who fielded it.  (e.g., the average ground ball to the third baseman usually added -.03 context neutral wins).  Then, I looked at whether the batter outperformed (by being less of a drag on his team’s chances than expected) or underperformed (perhaps by hitting into a double play rather than just a fielder’s choice) this expectation.  Sum up a player’s total and see what happens.
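The steps in that paragraph can be sketched in a few lines.  This is only an illustration under assumptions: the event records and their field names (batter, ball_type, fielder, wpa, leverage) are hypothetical stand-ins for whatever the real event file provides, and the two toy events are invented so that the average ground ball to third comes out at -.03.

```python
from collections import defaultdict


def professional_hitting_totals(events):
    """Sum each batter's context-neutral wins above expectation on outs in play."""
    # Step 1: strip out leverage so every event is on a context-neutral scale.
    neutral = [
        (e["batter"], (e["ball_type"], e["fielder"]), e["wpa"] / e["leverage"])
        for e in events
    ]
    # Step 2: expected value for each (batted-ball type, fielder) bucket.
    by_bucket = defaultdict(list)
    for _, bucket, value in neutral:
        by_bucket[bucket].append(value)
    expected = {b: sum(vals) / len(vals) for b, vals in by_bucket.items()}
    # Step 3: credit each batter with the difference from expectation.
    totals = defaultdict(float)
    for batter, bucket, value in neutral:
        totals[batter] += value - expected[bucket]
    return dict(totals)


# Invented toy data: two ground balls to third base.
toy_events = [
    {"batter": "A", "ball_type": "GB", "fielder": "3B", "wpa": -0.02, "leverage": 1.0},
    {"batter": "B", "ball_type": "GB", "fielder": "3B", "wpa": -0.08, "leverage": 2.0},
]
totals = professional_hitting_totals(toy_events)
print(totals)
```

In the toy data, batter A was less of a drag than the average ground ball to third and batter B was more of one, so A ends up positive and B negative by the same amount.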
(If there’s a StatSpeak drinking game… and there should be… two shots should be required every time I do an intra-class correlation.  Pour yourself a double.)  Over the four years in the dataset, among those batters with at least 25 at-bats in the season under consideration, there was an ICC of .16 for the total sum of WPA over expectation in making these outs.  I divided by number of plate appearances and got an ICC of .14.
To put that in some perspective, I did a similar examination of clutch hitting and found an ICC of .074.  An intra-class correlation (a measure of year-to-year consistency over multiple years for the uninitiated) of .16 is about 5 times as strong (using r-squared), but that’s not saying much.  That’s around the range of year-to-year consistency of BABIP for pitchers.  So, professional hitting seems kinda like clutch hitting.  There are certainly clutch hits, in the same way that there are professional hits.  There are guys who in one year might have several professional hits to their names.  It’s just that year-to-year, it’s not consistent, and consistency is what we would expect if it were an inherent skill.  “Professional hitter” is a nice thing that people say about average hitters whom they like for some reason.  It’s based mostly on a few isolated incidents that someone remembered, and it doesn’t shake out when you look at all the data.
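The “about 5 times” figure is just the ratio of the two squared correlations.  A quick check, following the text in treating the ICC like an r and squaring it (the function name is mine):

```python
def variance_explained(correlation):
    """Square a correlation to get the share of variance it accounts for."""
    return correlation ** 2


productive_outs_icc = 0.16
clutch_hitting_icc = 0.074

ratio = variance_explained(productive_outs_icc) / variance_explained(clutch_hitting_icc)
print(f"Productive outs are about {ratio:.1f} times as consistent as clutch hitting")
```

The ratio lands a little under five, and even the bigger of the two numbers accounts for less than 3% of the variance.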
But, for what it’s worth, in 2006, the league leaders in context neutral wins above expectation on outs hit in play per relevant plate appearance (or, if you prefer it Baseball Prospectus style, CNWAEOHIP/RPA)… just call it the professional hitting index, were:

  1. Joey Gathright – who actually fell just short of a full win above expectation in this stat
  2. Jeremy Hermida
  3. Dave Ross
  4. So Taguchi
  5. Craig Wilson

The most un-professional hitters were

  1. Chris Duncan
  2. Tim Salmon
  3. Jason Kubel
  4. Gary Sheffield
  5. Willy Aybar

UPDATE: Tango Tiger requested that I put up the entire list.  The 2006 list is available here in Excel format.  Players are listed by their Retrosheet ID, and they’re sorted by CNWAEOHIP/RPA.  Enjoy.  (The reason it’s the 2006 list and not the 2007 list is that the 2007 Retrosheet event file is still being compiled.  When that comes out, I’ll check out who did what in 2007 using some of my home-cooked stats.)

The playoffs, The Gambler's fallacy, and The 50-50-90 rule

One of the basic rules of statistics applied to last night’s Game 7 of the ALCS:  The 50-50-90 Rule.  If there’s something that’s a 50/50 shot for the team for which you are cheering, your team will lose 90% of the time.  That is, unless you’re a Red Sox fan.  But I grew up in Cleveland and my first coherent memories are of watching the Cleveland Indians.  (True story.)  This is an iron-clad rule of statistics.  You can look it up.
I work in a hospital, and in the emergency room, they have a measure called “Subjective Units of Discomfort” (SUDs), to measure people’s level of pain when they come in.  It goes from 1 to 10.  Being a practicing Sabermetrician and a psychologist, I felt the best way to cope with this turn of events would be to make a new statistic that would adequately capture the magnitude of what happened.  I thought about calling it Pizza Cutter Depression Probability Added (this was a particularly high leverage game for that particular stat).  Finally, I settled on Subjective Units for Cleveland Knockouts (I’ll let you do the acronym).  The formula is Opponent wins x 2.5.  The scale goes from 0 to 10.
But then again, I should have known it was coming.  My wife, who’s never wrong (and the sentence should end right there, just ask her… although she did wonder out loud why Travis Hafner wasn’t trying to steal third), said this morning that she had a feeling the Indians would win.  She’s had a hot hand on picking these 50/50 shots, but this morning, we found out that one of her picks for the sex of one of the 457234 babies that are being born to people we know within the next few months was wrong.  (She called boy.  They’re having a girl.  She also picked a Cubs-Angels World Series)  Looks like her hand has gone cold.
With that said, I would warn fans of the Red Sox and Rockies to watch out for a (real) property of statistics: The Gambler’s Fallacy.  Consider the simplest of all games of chance: the flip of a coin.  Suppose that you flip a coin ten times, and ten times in a row it comes up heads.  What are the chances that the next flip will be tails?  Did you say something other than “Fifty percent?”  Did you mumble something about the “Law of Averages?”  Sound like a baseball team about which Dane Cook has been yelling all week?
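If you don’t believe the fifty percent answer, you can count it out.  The sketch below assumes a fair coin and simply lists every possible run of eleven flips, keeps the ones that open with ten straight heads, and asks how many of those finish with tails.  The function name is made up for the example.

```python
from itertools import product


def prob_tails_after_heads_streak(streak_length):
    """Exact chance of tails on the flip right after a run of heads (fair coin)."""
    all_sequences = product("HT", repeat=streak_length + 1)
    # Keep only the sequences that begin with the full streak of heads.
    after_streak = [
        seq for seq in all_sequences
        if all(flip == "H" for flip in seq[:streak_length])
    ]
    tails_next = [seq for seq in after_streak if seq[-1] == "T"]
    return len(tails_next) / len(after_streak)


print(prob_tails_after_heads_streak(10))
```

Only two sequences survive the filter, one ending in heads and one ending in tails, which is the whole point.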
Red Sox fans will probably be saying to themselves that they are sure to win the World Series because the Rockies are “due to lose.”  Rockies fans will probably be saying to themselves that they are “on a roll” and will win the World Series because of momentum.  Of course, one of them will be proven “right” in the next week and a half.  In fact, neither one is right.  The Red Sox had a better regular season record and a better Pythagorean record, plus they have four games at home to the Rockies’ three.  So, the Red Sox are the favorites.  But each game starts at 0-0, so the probabilities of either team winning reset themselves after each game.
Now, the other thing that will be bandied about is that “In a short series, anything can happen.”  This is a nice way of saying that a seven-game series is an inadequate sample size from which to determine the relative quality of the two teams.  Which is true.  If I were to submit something to a scientific journal with an N = 7, I would have the paper sent back to me with a laugh.  In baseball, you get a trophy for your efforts.  Still, there’s a part of me that wishes that the Indians were part of that inadequate sample size of independent events.
After the game, my wife, in an attempt to console me, said that she didn’t think of it so much as losing a series, but re-gaining a husband.

Did Doug Mientkiewicz get Joe Torre fired?

In what was the second least convincing lie told yesterday (the first being the Indians’ use of Josh Beckett’s ex-girlfriend to sing the National Anthem before Game 5 of the ALCS being… ready?… an “incredible coincidence!”), the Yankees attempted to make it look like they wanted Joe Torre to stick around.  It’s just that they offered him a one-year deal with a 33% pay cut and, just to make sure that he got the message, there was a second option year that would only have vested had Torre and the Yankees made it to the World Series.
I have to be honest here.  I hate the New York Yankees.  The two greatest days of the year in baseball (and by extension, the year in general) are Opening Day and the Day The Yankees Are Eliminated.  It’s nothing against the individual players.  It’s the principle of the thing.  I shouldn’t speak ill of evil empires, since my wife was born in Moscow, but well, maybe this will convince you.  Despite my general dislike for anyone wearing a Yankees cap who wasn’t born within the New York metropolitan area (oh, hi LeBron), I felt insulted for Joe Torre.
I’m assuming that since Joe wasn’t fired after any of the previous dozen seasons, he must have committed his unforgivable error sometime in the last 12 months.  I suppose that in Yankee-land, not winning the World Series this year (or for the past… gasp!… 7 years) is unforgivable enough.  Especially since the World Series trophy actually belongs to the Yankees and is just leased to the rest of baseball whenever the Bronx Bombers are feeling generous.  Well, let’s try to find this egregious misstep.
A manager has three jobs.  He is the team’s spokesman to the media, and by extension, the public.  He’s in charge of keeping the players happy, in essence being the psychologist-in-chief.  He makes the in-game strategic decisions.  On the first matter, dealing with the New York media is an impossible job.  New Yorkers are convinced that they are the most important people in the world.  They’re like Americans on steroids.  (Was that perhaps the wrong way to phrase that?)  I’m amazed that after 12 years of that part of the job alone, Torre didn’t quit.  As to keeping the players happy, we’ll never know.  We don’t know what went on in the locker room.  I suppose if the team was running off the rails in that direction, it would be OK to fire Torre, but I haven’t heard any indications that it had.
So, the in-game decisions.  Let’s first point out some of the things that the manager does not do.  He does not assemble the roster, for the most part.  I have to believe that the manager gets some say in player-personnel decisions, particularly those in-season moves like whether to send Player X down to AAA and whom to bring up when Player Y gets hurt.  But those are usually minor moves involving the 22nd through 25th spots on the roster.  The big-ticket items are usually provided to him by the general manager.  While the manager does set the starting rotation in the spring, the rotation generally runs itself.  He also doesn’t hit, pitch, or run in the game.
There are a few correct decisions that Torre made for which he gets absolutely no credit, nor does he deserve any.  One job of the manager is to apportion playing time.  Torre, every day, was faced with a key decision.  Whom should he start at third base?  Looking over his options (A-Rod, Miguel Cairo, Wilson Betemit), he picked A-Rod on a consistent basis.  Not exactly rocket science.  In fact, among the Yankee regulars, Posada, Cano, Jeter, A-Rod, and Abreu were all in the top 10 at their respective positions in VORP leaguewide.  (Hideki Matsui was the 11th best LF. )
In this area, the only place where Torre had to make a decision was figuring out who would play first base (options: Doug Mientkiewicz, Josh Phelps, a hobbled Jason Giambi, Andy Phillips, or hilariously enough, Miguel Cairo).  The job was split between Phillips (who functioned at replacement level and was a slightly below average fielder) and Doug M. (who functioned slightly above replacement level and was a slightly above average fielder).  Phillips and Cairo got most of their reps because Mientkiewicz was hurt for part of the year.  In other words, Torre was dealt a bad hand at first base.  Then, there was the matter of moving Melky Cabrera to center field and Johnny Damon to left.  RZR shows that Damon was actually the better left fielder and the better center fielder last year (both players logged a good amount of time at both positions), despite the general perception that Cabrera is the better fielder.  Hideki Matsui was actually a better left fielder than both of them, but was injured toward the end of the year.
So, Joe Torre’s biggest mistake this year was giving too much playing time to Andy Phillips and Miguel Cairo, when their would-be replacement, Doug M. wasn’t all that terrific either.  (And not believing in the fielding prowess of Johnny Damon.)  Torre’s fascination with Mientkiewicz is well-known and completely inexplicable.  Why they kept him around as “the answer” at first base baffles me.  It’s not like the Yankees were trying to keep costs low.  I have to wonder if Torre didn’t say something to keep the Yankees from pushing harder for Mark Teixeira (or somebody… anybody… who could play a more-than-replacement level first base!) mid-year because of his “belief” in Doug M.
The manager also takes care of bullpen management.  Ideally, the best relievers should pitch to the most hitters, right?  Looking at Yankee relievers who logged at least 100 batters faced in relief, we get the following list, ranked by batters faced.

  1. Luis Vizcaino (334)
  2. Mariano Rivera (295)
  3. Kyle Farnsworth (266)
  4. Scott Proctor (245)
  5. Brian Bruney (228)
  6. Sean Henn/Ron Villone/Mike Myers (175-181 each)
  7. Edwar Ramirez (103)

How could Joe not have Mariano face the most hitters, since he is clearly the best of the bunch?  Well, maybe Joe’s not as dumb as you think.  (Now do you understand why Joba Chamberlain went to the bullpen, instead of the rotation?)  Take a look at the average leverage that each pitcher faced in each of his plate appearances.  Mariano checks in at 1.76.  Vizcaino has an average of 1.00.  An average situation has a leverage of 1.00.  That means that the average situation that Vizcaino faced was exactly average compared to all other possible situations.  Rivera, on the other hand, faced situations that were, on average, one and three-quarters times as important as the average plate appearance.  Let’s multiply each pitcher’s batters faced by his average leverage index and see what happens to that list.

  1. Mariano Rivera (519.2)
  2. Luis Vizcaino (334)
  3. Kyle Farnsworth (282.0)
  4. Scott Proctor (264.6)
  5. Henn/Villone/Myers (106.8, 68.6, 87.5)

Looks like Joe got that right.  Rivera faced situations that were, all told, as important as 519 average plate appearances.  He pitched the more difficult situations.  Anyone can pitch garbage time.  You want the good guy in there when it’s crunch time.
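The re-ranking is nothing fancier than batters faced times average leverage.  Here is a small sketch using only the two relievers whose leverage figures are quoted above (Rivera at 1.76, Vizcaino at 1.00); the function and variable names are mine.

```python
# (batters faced in relief, average leverage index), as quoted in the text
relievers = {
    "Mariano Rivera": (295, 1.76),
    "Luis Vizcaino": (334, 1.00),
}


def leverage_weighted_batters(batters_faced, average_leverage):
    """Batters faced, scaled by how important the average situation was."""
    return batters_faced * average_leverage


ranked = sorted(
    relievers,
    key=lambda name: leverage_weighted_batters(*relievers[name]),
    reverse=True,
)
for name in ranked:
    print(f"{name}: {leverage_weighted_batters(*relievers[name]):.1f}")
```

Vizcaino faced more hitters, but once leverage is counted Rivera jumps to the top of the list.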
The manager also does things like give the steal sign.  70% is considered break-even, and the Yankees stole 123 bases and were caught 40 times for a success rate north of 75%.  The manager also puts in pinch hitters, although Joe didn’t really pinch hit that much this past year.  Damon, Posada, and Giambi all pinch hit more than ten times each, and all had an average leverage at insertion of more than 1.3.  The bit players who pinch hit usually did so in low-leverage situations.  Those were probably blow-out garbage time pinch hits.  The only weird exception was Dougie M.  He pinch hit 7 times in an average leverage of 1.79.  Why, I have no idea.
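The steal arithmetic is quick to verify.  This sketch takes the 70% break-even figure from the text as given rather than deriving it, and the names are invented for the example.

```python
BREAK_EVEN_RATE = 0.70  # break-even success rate quoted in the text


def steal_success_rate(stolen_bases, caught_stealing):
    """Share of steal attempts that succeeded."""
    return stolen_bases / (stolen_bases + caught_stealing)


yankees_rate = steal_success_rate(123, 40)
print(f"Yankees: {yankees_rate:.1%} (break-even: {BREAK_EVEN_RATE:.0%})")
```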
The one thing that Joe obviously did wrong this year was not win the ALDS against Cleveland.  He was criticized for his handling of the pitching (hard to handle a staff when you have only two men in the bullpen whom you can trust), although he put his two best starters out for Games 1 and 2 and they just outright got beat.  Bringing back Chien-Ming Wang on three days rest was stupid, but in some ways defensible.  The other criticism he took was not using Jason Giambi at first, but instead sticking with… Doug Mientkiewicz.  All of Torre’s foibles seem to go back to that one man.
But what else did Joe Torre do wrong?  Did he give the “take” sign too often?  Did he call for too many pitchouts?  What was it?  And how exactly would another manager have done things differently?
Joe Torre is being blamed for 7 years of no championships in Yankee Stadium and that’s not fair.  I could go back into previous years and calculate the same sort of numbers, but look at what Baseball Prospectus said about the Yankees’ odds of winning the World Series this year.  About 10%.  They weren’t even favored to win the Division Series.  The playoffs really are a crapshoot.  Even if a team is so good that they would win 60% of their playoff games if playoff series were a million games long, there’s still a 1 in 3 chance of losing a best-of-five series, and the Yankees haven’t been that good in a long time.  The Yankees have no divine right to the World Series trophy, and there’s not a whole lot that the manager can do on the field to affect his team’s chances of winning that Joe Torre wasn’t already doing.
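The 1 in 3 figure falls out of a short calculation.  The sketch below assumes every game is independent and that the better team wins each one with the same probability, which is a simplification (no home field, no pitching matchups).  The function name is made up.

```python
from math import comb


def series_win_probability(p_game, wins_needed=3):
    """Chance of taking a series that ends when one side reaches wins_needed.

    Assumes independent games with a constant per-game win probability.
    """
    q_game = 1 - p_game
    # Sum over the number of losses (0 .. wins_needed - 1) taken along the way.
    return sum(
        comb(wins_needed - 1 + losses, losses) * p_game**wins_needed * q_game**losses
        for losses in range(wins_needed)
    )


lose_best_of_five = 1 - series_win_probability(0.60)
print(f"A 60% team loses a best-of-five about {lose_best_of_five:.1%} of the time")
```

Passing wins_needed=4 gives the same calculation for a best-of-seven.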
About the only thing of which Joe Torre is guilty is having too high an opinion of Doug Mientkiewicz.  Maybe that’s what got him fired.

Still more Pythagorean musings

Things continue to get interesting on the SABR Statistical Analysis chatlist on the issue of those pesky Pythagorean over-achievers.  No less a luminary than the founder of the theorem itself, Bill James, has come up with a little study of his own on the subject of whether teams who under-achieve one year are more likely to under-achieve in the next year (and whether over-achievers will over-achieve the next year).
In it, he takes the top 100 over-achievers and the top 100 under-achievers of all time (using the Smyth/Patriot/Pythagenpat formula).  He finds that the top 100 over-achievers continued to over-achieve in the following year, although their level of over-achievement dropped from an average of 8.3 wins to an average of 0.47 wins.  For the under-achievers, they too underachieved on average, but fell from 8.68 wins to 0.24 wins.  He comes to the conclusion that the effect isn’t zero, although it must be pretty small.  (He also runs a matched groups design in Parts III and IV of his paper that made me scratch my head.)
What Bill is describing in his paper is a regression to the mean effect that doesn’t quite regress all the way to the mean.  Let me take a look at this using a slightly different and more complete method.  I took the database of all teams from 1901-2005 and calculated their actual and Pythagenpat winning percentages, plus the Pythagenpat residuals.  I did the same for the following year for each team and matched the two up.  This gave me 2084 team-seasons.  The year-to-year correlation for Pythagenpat residuals is .043.  The mean of Pythagenpat residuals is zero.
That means that, knowing nothing else, our best guess for next year’s Pythagenpat residual can be given as:
0.043 * This Year’s residual + (1 – 0.043) * mean.  Since the mean is zero, that term drops out.  8.3 wins above expectation in year one would have a year two expectation of .043 * 8.3 + (1 – .043) * 0.  The answer is .3569.  Bill found that the actual teams checked in at 0.47 in the next year.  (And if I had twenty minutes of your time, I’d explain why what I just did was playing really fast and loose with some rules of math to get that number… Suffice it to say, it’s good enough for the situation.)  The 100 under-achievers would have an expectation in year 2 of -.3732 (that is, .043 * -8.68).  Bill got -.24.  So, no, the effect size is not zero, just like the chances that I will be hit by a bus today on my way walking to work are not zero.  But, since I generally look both ways before crossing the street, those chances aren’t anything to worry about.  I wouldn’t worry about this effect either.
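Here is that regression-to-the-mean guess as a few lines of code, using the .043 correlation and the zero mean from above.  It carries the same fast-and-loose caveat as the hand calculation, and the function name is invented for the example.

```python
def expected_next_residual(this_year_residual, correlation=0.043, mean=0.0):
    """Best guess for next year's residual, knowing only this year's."""
    return correlation * this_year_residual + (1 - correlation) * mean


over_achievers = expected_next_residual(8.3)
under_achievers = expected_next_residual(-8.68)
print(f"Over-achievers:  {over_achievers:+.4f} wins")
print(f"Under-achievers: {under_achievers:+.4f} wins")
```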
Bill also brings up another interesting question posed by Mike Emeigh as to which was the better predictor of next year’s actual winning percentage for a team: their current year’s actual winning percentage or their current year’s Pythagorean projection.  Since I had the data set sitting in front of me, it seemed a shame not to ask the question.
Correlation between Year 2’s Actual Winning Percentage and:
Year 1’s Actual Winning Percentage = .603
Year 1’s Pythagenpat Winning Percentage = .626
I even ran Cohen’s test for specificity of correlated outcomes, and Pythagenpat really is significantly better (t = 4.35, for the curious) at predicting next year’s record.  Not by a lot, but it’s the better bet.  Still, score another one for Pythagoras.

The triumph of Pythagoras

On the SABR Statistical Analysis Listserv, there’s been a great deal of chatter concerning the good old Pythagorean win estimator.  This year, as it seems happens every year, most teams finish around their estimates.  But, there always seems to be that one oddity and this year, it’s the Arizona Diamondbacks.  The Diamondbacks were outscored this year (712-732), and had a Pythagorean expectation around 79 wins, depending on exactly which formula you use.  They won 90 games, good for  the best record in the NL.  Huh?
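To see where “around 79 wins” comes from, here is a sketch of two common versions of the estimator applied to the Diamondbacks’ 712 runs scored and 732 allowed over 162 games.  The fixed 1.82 exponent is a standard choice; for the Pythagenpat version I’m assuming the commonly quoted 0.287 power on runs per game, which is my assumption and may not be the exact constant anyone on the list used.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=1.82):
    """Expected winning percentage from runs scored and runs allowed."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def pythagenpat_exponent(runs_scored, runs_allowed, games, power=0.287):
    """Exponent that floats with the run environment (0.287 is an assumed constant)."""
    return ((runs_scored + runs_allowed) / games) ** power


GAMES = 162
fixed_wins = GAMES * pythagorean_pct(712, 732)
floating_wins = GAMES * pythagorean_pct(
    712, 732, exponent=pythagenpat_exponent(712, 732, GAMES)
)
print(f"Fixed 1.82 exponent: {fixed_wins:.1f} wins")
print(f"Pythagenpat:         {floating_wins:.1f} wins")
```

Both versions land within a fraction of a win of each other, about eleven wins short of the 90 the Diamondbacks actually won.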
So, are the Arizona Diamondbacks a sub .500 team, like their Pythagorean projection says or are they a 90 win team like their… ummm… actual record says?  It’s an interesting question.  When trying to figure out how “good” a team is, which should we look at?  This is a topic which has been taken up before by Chris Jaffe, specifically with reference to the Diamondbacks, and more theoretically a few years ago by Dan Fox.  Dan found that early in the season, if you want to know what a team’s season-ending winning percentage will be, you’re best to look at their Pythagorean record.  That is, until about 100 games in, when the team’s actual record becomes the better predictor of their season ending record.  (By the end of the year, actual record is a perfect predictor of season-ending actual record.)  But which one better predicts what a team will do in its future games?
In July of this year, Joe Sheehan of Baseball Prospectus made the assertion that “Run differential is a key measure of team quality, and a better predictor of future performance than win-loss record.”  Well now, sounds like something we can test.  I took the Retrosheet Game Logs from 1980-2006.  (666, no kidding, team-seasons)  I took each team’s games in sequence.  After each game, I calculated the team’s actual winning percentage, as of that moment, as well as their Pythagorean projection as of that moment.  So, if a team is 10-10 after 20 games and had scored 93 runs while giving up 91, I ran the numbers.  (Methodological note: I used the David Smyth/Patriot formula and the standard formula with a 1.82 exponent, although they were pretty indistinguishable, so I just reported the Smyth formula)  Then, I calculated the team’s actual winning percentage over the rest of the season.  So, if that team went 72-70 over the last 142 games, I calculated those numbers.  I ran the numbers 162 times, one for each game of the year.  Which of the first two (current actual win percentage or current Pythagorean projection) was a better predictor of performance over the rest of the season from that point forward?
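For one team, the bookkeeping described above might look like the sketch below.  It is a simplified illustration, not the code actually run on the Retrosheet logs: the game log is a made-up list of (runs scored, runs allowed) pairs, the fixed 1.82 exponent stands in for the Smyth formula, and the names are hypothetical.  The real study then correlates these columns across all team-seasons at each game number.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=1.82):
    """Expected winning percentage from cumulative runs scored and allowed."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def season_snapshots(game_log):
    """For each game n, compare the record so far with the rest of the season.

    game_log is a list of (runs_scored, runs_allowed) tuples in game order.
    Returns (n, actual_pct, pythagorean_pct, rest_of_season_pct) tuples.
    """
    snapshots = []
    for n in range(1, len(game_log)):  # stop early so some games always remain
        played, remaining = game_log[:n], game_log[n:]
        wins_so_far = sum(rs > ra for rs, ra in played)
        runs_for = sum(rs for rs, _ in played)
        runs_against = sum(ra for _, ra in played)
        wins_rest = sum(rs > ra for rs, ra in remaining)
        snapshots.append((
            n,
            wins_so_far / n,
            pythagorean_pct(runs_for, runs_against),
            wins_rest / len(remaining),
        ))
    return snapshots


# Invented four-game "season" just to show the shape of the output.
toy_log = [(5, 3), (2, 4), (6, 1), (0, 2)]
for row in season_snapshots(toy_log):
    print(row)
```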
Want to see a pretty graph?
The graph shows correlation coefficients of the two methods to performance the rest of the way.  Coefficients are low at the beginning of the season because after game one, everyone’s either got a winning percentage of 1.000 or .000, and that’s not going to correlate well with much of anything.  At the end of the year, there’s the same problem in the opposite direction.  Focus on the middle part of the graph, where the sample sizes in both halves are roughly equivalent.  That’s where the story is.  You’ll see that the green line, representing the Pythagorean projection (using the Smyth method, although the 1.82 method had the same pattern) at that particular moment is consistently above actual winning percentage.  At the exact midpoint of the season (81 games), Pythagorean projection correlates with winning percentage the rest of the way at .494, while actual winning percentage has a correlation of .464.
(Side note: The weird jump around game 110 is because of the 1981 and 1994 seasons.  Teams played a little less than 110 games in those years, which led to some funky data in those years… just enough to cause a little blip in the data.)
In terms of predictive power, run differential really is the more important information to know when it comes to predicting the future.  What’s the deal with the Diamondbacks?  Well, for what it’s worth, the correlation between Pythagorean projection and future performance at 81 games is .5, which isn’t bad, but it isn’t all that great.  In fact .5 is possibly the most infuriating correlation coefficient out there.  .5 means that about 25% of the variance is explainable by whatever factor you’re using as a predictor.  25% is a quarter of the variance!  But 25% is only a quarter of the variance.  As the season wears on, the gap between Pythagorean and actual win percentage narrows, until they become roughly the same around game 150 or so, where the correlations are around .35.  The thing is that at game 150, the sample size for the “rest of the season” is only 12 games, and by that point, Pythagorean projection and actual winning percentage are usually mirroring one another.
But, there’s evidence here that a team is better described, over the long run, by their run differential than their actual record.  This will certainly come as great news to fans of the Padres and Braves, who finished with the 2nd and 3rd best Pythagorean win percentages in the NL this year, as they watch the Diamondbacks in the playoffs this year.