2007 Sabermetric Year in Review: Cleveland Indians

My favorite spot on the tour.  Stop #23.  In a very real way, it’ll be the next stop on my tour of the U.S., since I’m moving (back) there this summer.  Please excuse any momentary lapses in objectivity on my part.  I love the Indians.
Record: 96-66, 1st in AL Central (Won ALDS 3-1 over Yankees, Lost ALCS 4-3 to Red Sox)
Pythagorean Projection (Patriot formula): 91.82 wins (811 runs scored,  704 runs allowed)… good to see that karma caught up with the Indians.
Team Statistical Pages:
Baseball Reference
Baseball Prospectus
FanGraphs
MVN Blog:
Tribe Report
More Indians Resources:
Latest News
Contract Status 
Trade Rumors
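For anyone who wants to check the Pythagorean figure quoted above, here is a minimal sketch.  It assumes the "Patriot formula" means the Pythagenpat form, where the exponent is runs per game raised to the 0.287 power; the post does not spell the formula out, so the function name and the 0.287 constant are assumptions on my part.

```python
def pythagenpat_wins(runs_scored, runs_allowed, games=162, power=0.287):
    """Expected wins from runs scored and allowed, Pythagenpat style (assumed form)."""
    # The exponent floats with the run environment: (runs per game) ** power.
    runs_per_game = (runs_scored + runs_allowed) / games
    x = runs_per_game ** power
    win_pct = runs_scored ** x / (runs_scored ** x + runs_allowed ** x)
    return win_pct * games


# 2007 Indians totals as quoted in the post.
indians_2007 = pythagenpat_wins(811, 704)
print(f"Projected wins: {indians_2007:.2f}")
```

With 811 runs scored and 704 allowed, this comes out at roughly 91.8 wins, in line with the number quoted, or about four fewer than the 96 the Indians actually won.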
Overview: They broke my heart, but my left brain knows that love is fleeting.  I have faith in my dear Indians.  They’re one of the organizations in baseball that has gone fully into Sabermetrics and advanced statistical analysis as a method to aid in decision-making.  All off-season, I’ve been nursing my broken heart with that fact.
What went right: Everything that went right for the Indians last year starts with Fausto Carmona… and C.C. Sabathia.  Sabathia simply continued a long arc of progress to the point where he can now claim to be among the best in baseball, but Carmona came out of nowhere.  In 2006, he was a failure as a would-be closer and was just awful.  Or was he?  Look at Carmona’s batted ball profiles from the last two years.  Carmona rarely gives up line drives and nearly 60% of his balls in play were grounders.  Carmona’s BABIP was up in 2006.  The kid has the stuff, although everyone says that he’s going to suffer from the fatigue of over-use last year.  Has anyone ever done a study on whether such an effect exists, once one controls for regression to the mean?
Here’s a stat that shows what the Indians’ bullpen meant to them last year.  Net WPA added by the offense, 2.92 wins.  By the starting pitching, 4.57 wins.  By the bullpen, 7.51 wins.  Rafael Betancourt had such a good year last year that I could have seen an honest vote going his way for the Cy Young Award.   Rafael Perez showed up mid-season and became Mr. Indispensable.  The bad news is that Dos Rafaelos (it’s a Cleveland thing) both had BABIPs in the .240s.  However, bullpen fillers Tom Mastny, Jensen Lewis, and Joe Borowski… yeah he was a bullpen filler… all had BABIPs in the .340s.  So, things should even out a bit in 2008.
And they eliminated the Yankees.
What went wrong: Josh Barfield was supposed to be the great solution at second base for the Indians.  He cost the Tribe Kevin Kouzmanoff (my Russian-speaking wife hates it when I say his name… apparently, I and everyone else in baseball butcher it), who, given the Indians’ other problem that is Andy Marte, would look good in an Indians uniform right now.  Barfield was one of the least valuable players (by VORP) in baseball last year.  What went wrong here?  Simple.  He developed a hole in his swing.  His batted-ball profile didn’t change.  His BABIP didn’t change.  His K rate went way up.  If you look at his Pitch f/x plots, they show a man who is chasing off-speed pitches down and away.  Barfield also looks like he’s fouling off a lot of pitches.  Hopefully, he’s been doing some work in the cages this year to fix that hole.
The David Dellucci and Trot Nixon signings didn’t exactly work out either.  Nothing fancy there.  The two of them simply fell victim to the fact that guys who are 33 tend to trend downward.
Yeah, that about sums it up: A small peek into the mind of manager Eric Wedge.  The Indians’ top four starters (Sabathia, Carmona, Jake Westbrook, and Paul Byrd) issued 11 intentional walks.  Then again, Sabathia and Byrd issued fewer than 1.4 total walks per nine innings.  Apparently, Cleveland’s pitching philosophy is “put the ball in play, we’ll take care of the rest.”
Oh Asdrubal Cabrera, I want to believe: Cleveland’s new second baseman is actually a shortstop whom they stole from Seattle at the trading deadline in 2006 for Eduardo Perez.  Cabrera came up to Cleveland in August and finally made Josh Barfield sit down.  Cabrera put up a nifty .283/.354/.421 in 186 PA.  Nice.  Cleveland fans, please do take note of the following.  We don’t yet know the real Asdrubal Cabrera.  186 PA is a pretty small sample size.  What can we say about a player after 186 PA?  Well, we have a pretty good idea of how he likes to swing the bat along with his walk and strikeout rates.  Cabrera struck out 18.2% of the time, and he hits a lot of ground balls.  I want to believe that he’s the second coming of Robby Alomar, but I’m worried that he’s the second coming of Tommy Hinzo.
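To put a rough number on "pretty small sample size," here is a quick sketch.  It treats each plate appearance as an independent trial with a fixed chance of reaching base, which is a simplifying assumption of mine (real plate appearances aren't that tidy), and it uses only the .354 OBP and 186 PA quoted above.

```python
import math


def binomial_standard_error(rate, trials):
    """Standard error of an observed rate, treating each PA as an independent trial."""
    return math.sqrt(rate * (1 - rate) / trials)


PLATE_APPEARANCES = 186  # from the post
OBP = 0.354              # from the post

se = binomial_standard_error(OBP, PLATE_APPEARANCES)
# Rough 95% range under the binomial assumption.
low, high = OBP - 1.96 * se, OBP + 1.96 * se
print(f"OBP {OBP:.3f}, standard error {se:.3f}, rough range {low:.3f} to {high:.3f}")
```

Under that assumption the standard error is about 35 points of OBP, so the same 186 PA are consistent with anything from a sub-.300 on-base guy to a .400-plus one.  That is the gap between Alomar and Hinzo.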
What happened to Travis Hafner?: His line drives were down and his grounders were way up.  His power numbers were reduced (it’s hard to hit ground ball home runs, but even his HR/FB were down).  Sounds like a guy who’s got a little hitch in his swing.  Cleveland fans, if you’re hoping that Pronk will make his way back up to 2006 levels when he had a good argument going for “The Best Hitter in Baseball,” you’re likely not going to get it.  However, Hafner’s track record says that he’s going to hit more fly balls this year, and more of them will leave the yard.  His BABIP was well below his career average (the ground balls, probably… he’s not going to beat many of them out…)  I’m bullish on Travis Hafner.  Perhaps I’m just hopeful.
Outlook: The entire city of Cleveland will be holding its collective breath during the season and then after to see whether C.C. will sign with the team.  Think of it as Johan Santana, Part Deux.  I don’t envy the position that the Indians’ PR department is in.  The Indians will probably (wisely) refuse to put more than four years on a contract for a pitcher, and they have super-prospect Adam Miller waiting in the wings, but they’ll have to explain why they’re letting the best* pitcher in baseball walk away.

The Name Game

Growing up in Philadelphia, and raised in an extreme sports environment, Jayson Stark has always been an idol of mine. In fact it was reading his Philadelphia Inquirer column every week that eventually propelled me into sabermetrics. His columns always combined humor and statistics in order to show all of the hilarious or newsworthy baseball happenings that could not be seen on an ESPN show. Not shocking in the least, ESPN eventually brought him onboard. That being said, I thought I would do my sports-writing idol proud by writing an article in a style similar to his.
The idea for this came to me when the Phillies signed Chad Durbin to be their: (circle the correct answer)

  • A) 5th Starter
  • B) 6th Starter
  • C) Mop-Up Reliever
  • D) Waste of Space
  • E) Who cares, we have Adam Eaton!?

Regardless of the answer you selected, this now gave the Phillies Chad Durbin and J.D. Durbin – two completely unrelated Durbins. Now, it isn’t as if we’re talking about two guys with the last name of Smith. I never knew “Durbin” was a last name until a couple of years ago and now there are not only two in major league baseball but two on the same team?
More interestingly, there have only been four Durbins in the history of major league baseball and the other two ended their careers during, or before, 1909. The only two Durbins in the last 98 seasons of major league baseball are now on the same team – and have no relation to one another.
SPEAKING OF J.D. DURBIN
The Phillies acquired J.D. Durbin after the Diamondbacks placed him on waivers in April. Durbin had appeared in one game for Arizona and surrendered 7 hits and 7 runs in 2/3 of an inning. For the Phillies, Durbin was somewhat serviceable, even throwing a complete game shutout against the Padres.
J.D. Durbin made his Phillies debut on June 29th during the first game of a double-header against the Mets.
At the time of acquiring J.D. Durbin, the Phillies had a minor league prospect with the name J.A. Happ. Due to rotation injuries, Happ made his first major league start on June 30th, against the Mets.
Now that would be odd enough on its own, but the Phillies also acquired J.C. Romero from the Red Sox. Romero also made his Phillies debut on June 29th, during the second game of Durbin’s double-header.
So, to recap, not only did the Phillies have three pitchers with the first names of J.A., J.C., and J.D., but all three of them made their Phillies debuts within the span of 48 hours from June 29th-June 30th!
STRIKINGLY SIMILAR DEPARTMENT
And, speaking of the Phillies, they acquired Tad Iguchi from the White Sox towards the end of the season. Since he would not have been able to play for the Phillies until May 15th, if he re-signed with them, he went elsewhere (Padres). The Phillies, in need of another bench player, decided to sign So Taguchi. I guess this way the transition will be easier for the players.
Or how about the Twins deciding to replace Luis Castillo with Alexi Casilla?

  • Believe it or not, the American League had an Ellis, an Ellison, and an Ellsbury.  And no, they were not Dale, Pervis, or Doughboy.
  • The Athletics had Dan Haren and Rich Harden.
  • The American League also had a Joakim, a Joaquin, and a Johan.  That’s never happened before with different players.
  • Lastly, there was the Rays’ Delmon Young and the Dodgers’ Delwyn Young, who sadly never got to face each other.

COINCIDENCE MATCHUPS
Speaking of “Youngs,” the NL West not only had two of them, but two Chris Youngs.  They could not be more different, either, as one is a 9-ft tall, white, former Ivy League pitcher and the other is a 6-ft, black, college-less outfielder.  Pitcher Chris Young (PCY for those keeping track) won the 2007 battle as his younger counterpart went 0-10, with a walk and 4 K’s against him.

  • Orlando Hudson went 2-11, with an RBI and 4 BB, against his “River” counterpart Tim Hudson.
  • Unfortunately, Reggie Abercrombie never got to face Jesse Litsch.  I wonder what Sportscenter would call that matchup.  Reggie and Jesse?  Reggie and Litsch?  Abercrombie and Jesse?  Ugh, who knows…
  • Aaron Rowand and Robinson Cano didn’t face each other this past year either.
  • Somehow, the Blue Jays and Rockies have played nine times and we are still waiting on a Halladay/Holliday matchup.
  • Scott Baker didn’t pitch against, or to, Paul Bako in 2007, though my fingers are crossed for 2008.

DELICIOUS MATCHUPS
Mike Lamb is 3-9 in his career against Adam Eaton (who isn’t?) as well as 1-7 off of Todd Coffey.
Coffey and Lamb usually don’t go well together, though, but Felix Pie is also 0-1 off of the caffeinated one.
Eaton has never gotten to face Pie yet.  I’d like to put a pie in Eaton’s face.  3 yrs and 24 mil worth of pies!
ULTIMATE MATCHUPS
In what would probably cause the universe to crumble, I am patiently awaiting a Rick VandenHurk vs. John Van Benschoten matchup.  I’m feeling 2008 or 2009.
In the long-name department, Jarrod Saltalamacchia went 1-2 against Andy Sonnanstine.  Salty also went 0-2 against Mark Hendrickson.  He went 1-1 against Ryan Rowland-Smith, but Ryan had two last names to reach eleven letters and therefore had an unfair advantage.
BIBLICAL DEPARTMENT
Easily the most hypocritical name award goes to Angel Pagan.  You can figure that one out.  Did you know, though, that the National League had “Two Wise Men”?  That’s right – Matt and Dewayne.
Though Matt Wise surrendered a hit to Angel Pagan, he struck out Dewayne Wise, proving what we already knew – Matt Wise is the smartest pitcher ever.
GENERIC BE GONE?
On a sad note, 2007 proved to be a disappointment in the generic name field (not Nate Field or Josh Fields).  Combined, there were only four Smiths: Jason, Joe, Matt, and Seth.
Even sadder, we only had three Williamses – Dave, Jerome, and Woody.  Scott Williamson tried his hardest but that does not count.  Could be a cool sitcom title – Three Williams and a Williamson.
BIRTH AND DEATH
Major League Baseball spanned the endpoints of the life cycle this year.  On one side we had Alan Embree (embryo) and Omar Infante (infant) and on the other there were Jermaine Dye (die) and Manny Corpas (corpse).
Dye has never faced Corpas but is 2-7 in his career off of Embree.  Infante has also never faced Corpas but has doubled in 4 at-bats against Embree.
“OF-THE” NAMES
Jorge de la Rosa and Eulogio de la Cruz did not face each other this year despite being the only two “of-the” names.  And, just to clarify for the none of you who asked, Valerio de los Santos would not qualify for this category since de los would technically be “of-them” or “of-those.”
CITY NAMES
Miguel Cairo has long been the MVP of this group but he welcomed two additions this year in the forms of Ben Francisco and Frank Francisco.  I had always thought of Francisco as a Spanish first name but was very surprised to find it as an American last name.  In fact, if you say Ben Francisco really quickly and in front of a drunk, it could even sound like San Francisco.
ZELDA NAMES
I recently got an original NES and could not help but notice that two major leaguers sound like items from a Zelda game.  Don’t both of these sentences make sense?

  1. Link, to defeat Ganon, you must hit him in the lower Velandia.
  2. Use your Verlander to blow up the stones blocking the entrance.

HOUSEGUEST AWARD
One of my favorite movies is Sinbad’s Houseguest, and whenever I hear the name of Giants’ 2B Kevin Frandsen I am reminded of Sinbad’s character Kevin Franklin.  Something tells me Frandsen never impersonated a dentist.
JOB NAMES
In addition to everyone else we had six players with job names.  Chris Carpenter and Lee Gardner maintained the stadiums and fields, Scott Proctor made sure they didn’t cheat, Skip Schumaker supplied them all with cleats, while Matt Treanor helped rehab Torii Hunter.
Schumaker did not face Carpenter, Gardner, or Proctor.  Treanor is 1-3 off of Carpenter in his career.  Hunter was 3-6 with a HR and 2 RBI off of Carpenter (career), as well as 2-6 with an RBI off of Proctor.
Clearly, a Hunter is more valuable than a Proctor and a Carpenter.
FAKE NAMES, INC.
Point blank – the following names sound incredibly made up and fake:

  • Frank Francisco
  • Dave Davidson
  • Emilio Bonifacio
  • Rocky Cherry

CAVEMEN AND ANATOMY
When primitive men first began to speak it was easiest to combine two words together without any intermediates.  Thousands of years later we still have names like Grady Sizemore, Jarrod Washburn, Mark Bellhorn, and Chris Bootcheck.
Speaking of Chris Bootcheck, I wonder what he and Jon Knotts would talk about.
In the anatomy field, Rick Ankiel and Brandon Backe were in the same division, with Ankiel going 0-3 with an RBI off Backe.
MISCELLANEOUS NAME AWARDS

  • DIRTY NAME AWARD – Rich (Dick) Harden
  • ACADEMY AWARD – Sean Henn
  • LED ZEPPELIN AWARD – Scott Kazmir
  • ACTION HERO NAME AWARD – Boone Logan
  • FUTURE PIZZA SHOP NAME AWARD – Doug Mirabelli (hon. mention – Mike Piazza)
  • FICTIONAL SERIAL KILLER AWARD – Mike Myers (as usual)
  • NAME TYPO AWARD – Jhonny Peralta
  • MOST FUN TO SAY AWARD – Jonathan Albaladejo
  • IMPERVIOUS AWARD – (tie) James Shields and Scot Shields
  • FIRST AND LAST NAME SHOULD BE SPELLED DIFFERENTLY AWARD – Kameron Loe

And there you have it.  We covered the life cycle, the entertainment (regular and adult) industry, jobs, cities, the bible, and more.
We can only hope that 2008 will finally bring us a VandenHurk/Van Benschoten or a Holliday/Halladay.
Keep your fingers crossed.

2007 American League SP Analysis

A couple of weeks ago, I presented the Seidman SP-Effectiveness Model, which takes a large majority of the statistics that mark a pitcher as effective and weights them with points based on how important or rare they are.  The system is designed to account for the factors needed to level the field of play between those on good or bad teams, those with or without run support, and those either called up/injured or those just plain bad.
Not surprisingly at all, Jake Peavy ended up being first, five points ahead of his competition, but the order of those that followed him turned out to be a bit more surprising than I thought.  Everything made proper sense, though, because the pitcher cannot be blamed for his team not scoring for him or not getting decisions in brilliantly-pitched games.
Essentially, my SP-Effectiveness Model answers the question – What would happen if a pitcher was rewarded every time he pitched well and penalized every time he pitched poorly?
I also introduced my statistic, the AQS, or Adjusted Quality Start, which extends the general rule of 6+ IP and 3 or fewer ER to also include games of 7.2+ IP (that is, 7 2/3 innings) and 4 or fewer ER.  Based on my analysis of innings pitched by starters and the frequency of when they were lifted for relievers, coming one out short of finishing the eighth inning truly merits being allowed to give up that fourth run.
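For readers who like to see a rule written out, here is a minimal sketch of the AQS test as described above.  The function names are mine, and the only assumption beyond the text is that innings are given in box-score notation, where "7.2" means 7 innings plus 2 outs rather than 7.2 decimal innings.

```python
def innings_to_outs(innings_notation):
    """Convert box-score notation such as '7.2' (7 innings, 2 outs) to total outs."""
    whole, _, partial = str(innings_notation).partition(".")
    return int(whole) * 3 + int(partial or 0)


def is_adjusted_quality_start(innings_notation, earned_runs):
    """True if the start meets either AQS threshold described in the post."""
    outs = innings_to_outs(innings_notation)
    standard = outs >= 18 and earned_runs <= 3   # 6+ IP, 3 or fewer ER
    extended = outs >= 23 and earned_runs <= 4   # 7.2+ IP, 4 or fewer ER
    return standard or extended


print(is_adjusted_quality_start("6.0", 3))  # meets the standard rule
print(is_adjusted_quality_start("7.2", 4))  # meets the extended rule
print(is_adjusted_quality_start("7.1", 4))  # one out short of the extension
```

Working in outs rather than decimal innings avoids the trap of treating 7.2 as seven and two-tenths.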
If you have not yet read the NL Article on this same subject, I highly suggest you click the below link – that way you will understand the rubric and reasoning.
To read the NL 2007 SP-Effectiveness article, and see the results, click here
AMERICAN LEAGUE
In this article, I am applying my model to 2007 American League pitchers.  Just like the NL, there were some expected results, as well as some initially peculiar results that make sense upon further thought.  Additionally, just like with my NL post, I did not apply this to every American League pitcher.  Instead, I selected 1-3 pitchers from each AL team.  Before the 2008 season begins I will plug every pitcher from both leagues into my system to see who was worst – which is always fun.
I will not explain all of the statistics or points values, since I did that in the previous post on the NL, but I will say that I did consider the fact that AL managers did not have to worry about pinch-hitters.  Due to this, I considered making the IP requirements more stringent with the AL, but the fact is that even though they do not need to be removed for pinch-hitters, they are facing an extra offensive player (not a pitcher in the 9th spot).  They should, in theory, give up more runs and have just as good a reason to come out of a game.
Overall, though, only a few more AL pitchers had over 225 IP than NL pitchers and so it was not worth changing.  The biggest difference between the leagues was the average IP/game of the selected pitchers.  AL starting pitchers accounted for 66.2% of the total IP in 2007, whereas NL starting pitchers accounted for 63.5%.  Though the numbers are pretty close, when we are dealing with over 23,000 IP in a league that extra 2.7% equates to approximately 600 IP.
RESULTS

  • To view the raw statistics of all the pitchers used, click here.
  • To view the list of AL SP used in the order of effectiveness points, click here

Again, if you wonder why certain statistics are used and/or why they were assigned certain points, please read the previous NL article linked at the top of the page.
I do not want to post a table of 28-30 pitchers, so you will have to click the link to view the results spreadsheet, but I will list the top ones below.

  1. CC Sabathia, +84
  2. Dan Haren, +76
  3. Fausto Carmona, +74
  4. John Lackey, +72
  5. Roy Halladay, +68
  6. Johan Santana, +60
  7. Mark Buehrle, +59
  8. Josh Beckett, +58
  9. Justin Verlander, +58
  10. James Shields, +57
  11. Javier Vazquez, +57
  12. Kelvim Escobar, +57
  13. Joe Blanton, +57

JOSH BECKETT
In the National League, the odd ranking was Chris Young, whose barometrical statistics suggested he should have been ranked higher.  In the AL, Beckett falls into the same category. The issue here has nothing to do with Beckett’s numbers, but rather the fact that there were other pitchers who were not as lucky as he was in getting run support or solid bullpen help. 
Of the pitchers listed above Beckett, Santana and Haren each had 7 tough losses, Buehrle and Lackey each had 5 tough losses, Halladay led MLB in IP/gm and CG, and Carmona had more legit wins and fewer legit losses.
Essentially, there is nothing wrong with Beckett’s 2007 numbers; other pitchers simply happened to perform better in certain areas than he did.
The Red Sox had a dynamite bullpen, so going to Okajima or Papelbon was something that just about any manager would feel comfortable and justified in doing, whereas some of these other teams needed their starters to last longer. 
No, this system does not take into account any sort of clutch factor, where I am sure Beckett would excel, but it does level the playing field to show which pitchers were the most effective, based on the numbers they individually put up. 
Just like the conclusion that was made in the Snell/Zambrano comparison, this is all about consistency.  The quality of Josh Beckett’s AQS’s may have been far greater than those of the other pitchers, but they occurred less frequently than theirs did.  Even though his good-great games may have been astounding, when he was having average or bad games, the other pitchers were still having good-great games.
Beckett had an AQS 67% of the time (20 of 30 starts) while those listed were 73% and higher. This is not necessarily a measure of how good a pitcher was in his good games, but rather how often he was good.
ADJUSTED W-L RECORDS
One of the major reasons we considered Beckett to have been so good this past season was his record.  If he had been only 15-9, like Dan Haren, there would not have been a Cy Young debate.
That tends to be a problem because, as I will get into in the next category, W-L records do not differentiate between these Cheap Wins and Tough Losses.  If we gave every pitcher a Win for each Tough Loss, and a Loss for each Cheap Win, Beckett’s record would not have been 20-7.  It would have been 19-8. 
There is not a huge difference between his 20-7 and 19-8, but when we do the same for the AL pitchers above him in points, we get the following records: Sabathia (21-5), Carmona (23-4), Santana (19-9), Haren (21-3), Buehrle (15-4), Lackey (22-6). 
If we are going to use W-L record as a barometer, and include these Tough Losses and Cheap Wins, all of those above records are either better than or equivalent to Beckett’s 19-8.
Based off of just looking at the Adjusted W-L records, if we were to use that as the barometer for the Cy Young Award or the best pitcher, the debate would not be between Sabathia and Beckett – it would be between Haren and Carmona.  I am not saying it should have been between Haren and Carmona, but rather that if we are going to use W-L as an “end-all” statistical solution, we should at least use the Adjusted W-L, or the True W-L.
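The Adjusted W-L swap described above is simple enough to sketch.  Haren's 15-9 record and 7 tough losses come from this article; his single cheap win is my inference from the 21-3 adjusted record, so treat that input as an assumption.  The same goes for Beckett's split (3 cheap wins, 2 tough losses), which I backed out of his 20-7 and 19-8 records.

```python
def adjusted_record(wins, losses, cheap_wins, tough_losses):
    """Credit a win for each tough loss and a loss for each cheap win."""
    adjusted_wins = wins - cheap_wins + tough_losses
    adjusted_losses = losses - tough_losses + cheap_wins
    return adjusted_wins, adjusted_losses


# Haren: 15-9 with 7 tough losses (from the article), 1 cheap win (inferred).
haren = adjusted_record(15, 9, cheap_wins=1, tough_losses=7)
# Beckett: 20-7 (from the article), 3 cheap wins and 2 tough losses (inferred).
beckett = adjusted_record(20, 7, cheap_wins=3, tough_losses=2)

print("Haren:", haren)      # (21, 3)
print("Beckett:", beckett)  # (19, 8)
```

Note that the swap never changes the number of decisions; it only moves games from one column to the other.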
TRUE WIN/LOSS
I described the different types of wins in the NL article but I did not mention the statistic “True W-L Record.”  In order to properly evaluate pitchers, W-L records have to be broken down and examined.  Some pitchers will get tremendous run support and win games even if they only last 5.1 innings and give up 4-5 runs. 
Then there are some who will go 6.2-7.1 innings, give up 2-3 runs, and lose.  After separating these Cheap Wins and Tough Losses from a W-L record, we are left with a record of legitimate wins and losses – games that a pitcher deserved to win or lose based on performance. 
A legit win occurs when you record an AQS and win, and a legit loss occurs when you do not record an AQS and lose.
The difference between True W-L and the Adjusted W-L I used in the Beckett comparison is that the True W-L does not include Cheap Wins or Tough Losses.  True W-L only includes games in which the pitcher recorded a win or loss when either decision was merited.
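Here is a rough sketch of how the four kinds of decisions sort out, start by start.  The legit win and legit loss definitions come straight from the text; treating a cheap win as "win without an AQS" and a tough loss as "loss with an AQS" is my reading of the complement, and the five-start sample is made up purely for illustration.

```python
from collections import Counter


def classify_decision(decision, was_aqs):
    """Label one start given its decision ('W', 'L', or None) and AQS flag."""
    if decision == "W":
        return "legit_win" if was_aqs else "cheap_win"
    if decision == "L":
        return "tough_loss" if was_aqs else "legit_loss"
    return "no_decision"


def true_record(starts):
    """True W-L: count only legit wins and legit losses."""
    counts = Counter(classify_decision(d, aqs) for d, aqs in starts)
    return counts["legit_win"], counts["legit_loss"]


# Hypothetical starts, not real game data: (decision, recorded an AQS?)
sample_starts = [("W", True), ("W", False), ("L", True), ("L", False), (None, True)]
print(true_record(sample_starts))  # (1, 1)
```

The cheap win and tough loss in the sample simply drop out of the True W-L, whereas the Adjusted W-L would flip them.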
You can see these True W-L Records in the raw statistics spreadsheet, but I have listed the best ones below.  In parentheses next to the True W-L Records are the Actual W-L Records.

  • Dan Haren, 14-2, (15-9)
  • Kelvim Escobar, 14-3, (18-7)
  • Fausto Carmona, 18-3, (19-8)
  • Josh Beckett, 17-5, (20-7)
  • Chien-Ming Wang, 17-5, (19-7)
  • CC Sabathia, 17-5, (19-7)

Again, we see that if win-loss was to be the “end-all” tool to evaluate a Cy Young Award or the best pitchers, Haren and Carmona would be atop the list.
BOB GIBSON, GREG MADDUX, PEDRO MARTINEZ
For fun, I decided to plug some legendary seasons into my system to see what the end results were. Yes, it is impossible to perfectly compare a season from 1966 to one from 1996, but still it is interesting to see how they would rank. To do this, I took the 1968 season of Gibson, the 1995 season of Maddux, and the 2000 season of Martinez. The points results for the three were:

  • Bob Gibson, 1968, +178 pts
  • Pedro Martinez, 2000, +104 pts
  • Greg Maddux, 1995, +97 pts

CONCLUSION
And there you have it.  By the middle of February I should have a spreadsheet/PDF made up of all NL and AL pitchers plugged into this effectiveness model.  That way we can see who were the absolute worst as I am sure we will find some surprises and unexpected names there.
The biggest surprises to me in both leagues, in a positive turn, were Bronson Arroyo and James Shields.
The most unexpected finishes were Beckett and Chris Young, as I predicted they would be higher.
An interesting thing to look at is how players on the same team ranked next to each other.  In the NL, Zambrano is widely thought of as the #1 of the Cubs, yet Ted Lilly finished much higher.  In the AL, Kazmir is definitely thought of as the Rays ace, yet Shields ranked 9th out of the pitchers used here, and Kazmir finished 20th.
And, since the Yankees have to be stubborn, Pettitte and Wang tied in effectiveness points.
This model is not the end-all solution to determining who the best pitchers are in a given year, but it is a darn good predictor and estimator since it equalizes the field of play and makes sure it is known that you do not have to be on a great team to be a great pitcher or have a very effective year.  
This measures a specific season, where some players may be better than others, even if they are nowhere near better in a retrospective look at their careers. 

The playoffs, The Gambler’s fallacy, and The 50-50-90 rule

One of the basic rules of statistics applied to last night’s Game 7 of the ALCS:  The 50-50-90 Rule.  If there’s something that’s a 50/50 shot for the team for which you are cheering, your team will lose 90% of the time.  That is, unless you’re a Red Sox fan.  But I grew up in Cleveland and my first coherent memories are of watching the Cleveland Indians.  (True story.)  This is an iron-clad rule of statistics.  You can look it up.
I work in a hospital, and in the emergency room, they have a measure called “Subjective Units of Discomfort” (SUDs), to measure people’s level of pain when they come in.  It goes from 1 to 10.  Being a practicing Sabermetrician and a psychologist, I felt the best way to cope with this turn of events would be to make a new statistic that would adequately capture the magnitude of what happened.  I thought about calling it Pizza Cutter Depression Probability Added (this was a particularly high leverage game for that particular stat).  Finally, I settled on Subjective Units for Cleveland Knockouts (I’ll let you do the acronym).  The formula is Opponent wins x 2.5.  The scale goes from 0 to 10.
But then again, I should have known it was coming.  My wife, who’s never wrong (and the sentence should end right there, just ask her… although she did wonder out loud why Travis Hafner wasn’t trying to steal third), said this morning that she had a feeling the Indians would win.  She’s had a hot hand on picking these 50/50 shots, but this morning, we found out that one of her picks for the sex of one of the 457234 babies that are being born to people we know within the next few months was wrong.  (She called boy.  They’re having a girl.  She also picked a Cubs-Angels World Series)  Looks like her hand has gone cold.
With that said, I would warn fans of the Red Sox and Rockies to watch out for a (real) property of statistics: The Gambler’s Fallacy.  Consider the simplest of all games of chance: the flip of a coin.  Suppose that you flip a coin ten times, and ten times in a row it comes up heads.  What are the chances that the next flip will be tails?  Did you say something other than “Fifty percent?”  Did you mumble something about the “Law of Averages?”  Sound like a baseball team about which Dane Cook has been yelling all week?
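If you don't trust the "Fifty percent" answer, you can count it out.  The sketch below assumes a fair coin and independent flips (that's the whole point), lists every possible sequence of eleven flips, keeps the ones that open with ten straight heads, and checks how many end in tails.  The function name is mine.

```python
from fractions import Fraction
from itertools import product


def chance_of_tails_after_heads_streak(streak_length):
    """Exact P(next flip is tails | the previous streak_length flips were all heads)."""
    streak = ("H",) * streak_length
    # Every sequence of fair-coin flips is equally likely, so counting is enough.
    matching = [
        seq
        for seq in product("HT", repeat=streak_length + 1)
        if seq[:streak_length] == streak
    ]
    tails_next = sum(1 for seq in matching if seq[-1] == "T")
    return Fraction(tails_next, len(matching))


print(chance_of_tails_after_heads_streak(10))  # 1/2
```

Only two of the 2,048 sequences start with ten heads, and exactly one of them ends in tails.  The coin has no memory.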
Red Sox fans will probably be saying to themselves that they are sure to win the World Series because the Rockies are “due to lose.”  Rockies fans will probably be saying to themselves that they are “on a roll” and will win the World Series because of momentum.  Of course, one of them will be proven “right” in the next week and a half.  In fact, neither one is right.  The Red Sox had a better regular season record and a better Pythagorean record, plus they have four games at home to the Rockies’ three.  So, the Red Sox are the favorites.  But each game starts at 0-0, so the probabilities of either team winning reset themselves after each game.
Now, the other thing that will be bandied about is that “In a short series, anything can happen.”  This is a nice way of saying that a seven-game series is an inadequate sample size from which to determine the relative quality of the two teams.  Which is true.  If I were to submit something to a scientific journal with an N = 7, I would have the paper sent back to me with a laugh.  In baseball, you get a trophy for your efforts.  Still, there’s a part of me that wishes that the Indians were part of that inadequate sample size of independent events.
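To see just how inadequate seven games are, here is a back-of-the-envelope sketch.  It assumes the better team wins each game independently with the same probability, ignoring home field and pitching matchups, and the per-game probabilities plugged in are hypothetical, not estimates for any real team.

```python
from math import comb


def best_of_seven_win_probability(p_game):
    """P(team wins a best-of-seven) if it wins each game independently with p_game."""
    q = 1 - p_game
    # The team clinches with its fourth win after losing k of the earlier games, k = 0..3.
    return sum(comb(3 + k, k) * p_game**4 * q**k for k in range(4))


for p in (0.50, 0.55, 0.60):
    print(f"per-game {p:.2f} -> series {best_of_seven_win_probability(p):.3f}")
```

Under those assumptions, a team that would beat its opponent 55 percent of the time in single games takes the series only about 61 percent of the time, so the weaker team gets the trophy in nearly four series out of ten.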
After the game, my wife, in an attempt to console me, said that she didn’t think of it so much as losing a series, but re-gaining a husband.
