LoHud Guest Post

As some of you may know, I’m a big Yankee fan, despite Pizza’s objections. As a Yankee fan, it is only natural that I’m a regular reader of the LoHud Yankees Blog, which happens to be the most widely read team-specific blog on the planet. For the past two years, the blog has featured guest posts from other bloggers during January to fill the time when the baseball world is debating pointless things such as that annoying Joe Torre book. Or, in the case of baseball nerds, whether Jeff Francoeur can possibly be worth $12 million.

Today, it was my turn to take a shot at wowing the masses with spectacular prose and sheer intelligence. The jury is still out on whether I accomplished either one. My post is about evaluating trades in a fair and objective manner. It’s nothing that regular StatSpeak readers don’t already know, but I figured I might as well throw a link up. Here it is, check it out.

Innovation Profile: Oakland Athletics

The easiest yet perhaps the most interesting profile to make, I’m starting with them for simplicity’s sake.

“How’d [Oakland] do it?  What was their secret?  How did the second poorest team in baseball, opposing ever greater mountains of cash, stand even the faintest chance of success, much less the ability to win more regular season games than all but one of the other twenty-nine teams?” (Moneyball: The Art of Winning an Unfair Game, Lewis, pg XII)

With these questions, Michael Lewis attempted to discover what exactly allowed one of the poorest franchises in the sport to become one of the most successful.  Certainly it was an intriguing question; the team was not stocked with the typical great players yet still managed to win 102 games without flashy numbers.  Lewis, in his preface, notes that such a team “in any ordinary industry would have long since acquired most other baseball teams, and built an empire” (Moneyball, Lewis, pg XIII).  Empire, monopoly, or dynasty: keywords that I started to discuss last time.  Most of you probably know the answers to the questions Lewis posed, as we write for a baseball-literate audience.

Here’s a thought: Michael Lewis has a master’s degree in economics.

Competitive Advantage

We harken back to the principles of Creative Destruction.  Billy Beane, GM of the Oakland Athletics, created a monopoly out of a small payroll in a tough division.  It’s easy to suggest that to create this monopoly, Beane had to have come up with an innovation.  In fact, he came up with several.

Detractors of the book Moneyball claim that it glorifies on-base percentage over all else, and it’s certainly easy to see why.  Beane’s use of OBP brought it into the mainstream school of thought.  But the book is not about OBP; it’s about developing new measures of player worth in order to find underrated players.  So Beane innovated by using more advanced player evaluations.  He knew that there was no advantage in signing players that the market already knew were good, so he avoided doing so.

He also used player evaluation to discover that most players play to their greatest potential when they are aged 27-30.  Again this is hardly surprising, but he concluded that there was less worth in paying free agents with declining value.  He was able to determine when his assets were at their maximum worth and used this to his advantage when trading.  Buying low and selling high is the basic rule of thumb when trading anything, and Beane follows it to a T.

Beane’s A’s were the first to change their drafting dogma to the idea that past statistical performance can predict future statistical performance.  This gave them a better handle on what to expect out of their players.  Also part of their draft approach was the idea that college players had a higher rate of becoming major league players and were not, in fact, less likely to become superstars.  The fact that college players often signed for less certainly did not hurt.

Ignorance Profile

Core to a systematic approach to innovation is the concept of building an ignorance profile.  When you are attempting to do something that has never been done before, it is vitally important to discover why no one has done it before.  It might be due to some market wisdom that is incorrect or it might be due to a fact that the potential innovator has overlooked.  A lot of people get caught out by not preparing something like this.  I am not suggesting that Billy Beane created an ignorance profile, but this is what it might have looked like.

Why did teams overlook the value of sabermetric evaluation?  Certainly the technology for the kind of sabermetrics we engage in now was not readily available until the late eighties, but on-base percentages are not too difficult to calculate by hand.  Although few general managers are former players, the wisdom of the players pervades the baseball school of thought.  Walks were not sexy while hits were.  For a long time people considered a walk to be a failed plate appearance because it did not produce a hit.  But a huge part of this was due to the prevailing wisdom that walks were entirely under the control of the pitcher and that the hitter was not a factor.  This is clearly and demonstrably false, but it dictated management of the game for a long period of time.

Another large problem with the market is that scouts – who used a kind of subjective analysis that Beane recognized as problematic – would only scout players whom other scouts scouted (did I say scout enough in that sentence?).  But if this is the case, then where is the advantage in scouting at all?  If scouts only discovered players that all other teams had discovered, then scouting gave teams no competitive advantage.  So the system of scouting was completely flawed from an economic point of view.

So Beane could be sure that it was the market that was ignorant and not he, which meant he had a valid innovation on his hands.

Market Adjustment

As with any innovation, these baseball innovations could only give the A’s an advantage for a short length of time.  On-base percentage has now reached the mainstream and even better tools of evaluation have been created.  There is little competitive advantage to evaluating players based on on-base percentage because most teams do this now.  Consider the Boston Red Sox under Theo Epstein, who have a large budget while applying Beane’s small budget principles to great effect.

As much as the book Moneyball helped bring Beane’s brilliance into the limelight, it also revealed to the market the source of Beane’s success.  An innovation is most successful while it remains secret, so that the innovator can keep using it to his advantage.  What Lewis did was expose Beane’s innovation in a very easy-to-read sort of way.  That’s not to suggest the market would not have adjusted anyway, but it could not have helped.

A very curious factor is the very idea of blogging.  Most bloggers are hobbyists who are not paid and are under no obligation to help certain teams, yet some brilliant analyses are created in this medium; just look back into the archives of this very website.  Bloggers are trending towards better player evaluations than I suspect teams are capable of.  But it’s all available for free (or for a pittance) over the Internet and so can only be a competitive advantage for a team insofar as other teams are ignorant of the particular analysis.  So, by writing a sabermetric blog, we’re morphing the market towards one where sabermetrics give no competitive advantage.

Into the Future

Sure, Beane’s initial innovations give less of an advantage to him now, but I think that Beane has demonstrated the traits of an innovator.  He has certainly shown himself to be a superior trader and salesman who knows exactly when his assets are at their maximum value (see: Zito, Barry).  In time he will create a new innovation to destroy the monopolies created by older ones and soon be at the top of the game again.  The competitive pressures demand it of him.

Some rockin' links

From the “stuff we’re reading” file: Derek Carty takes a look at predicting BABIP for batters.  He finds that a regression-based formula put together by Peter Bendix and Chris Dutton rates the best of them all.  This is a good next step.  There are multiple techniques out there for this (and for estimating a bunch of other things), so now we need to start looking at which is the best.  If there’s one place for a follow up, it would be making sure that the errors aren’t systematically distributed (does the formula over/under-shoot in a consistent manner for certain types of players), but one thing at a time I suppose.

While you’re at THT: Chris Jaffe has a list of 50 “closer” entry songs waiting to happen.  I suppose I know what mine would be.

Baseball game Hall of Fame?: Over at Beyond the Boxscore, there’s a delightful discussion of what games belong in the Hall of Fame.  I vote for this one… because I was there…

So how long does it take for BABIP to become reliable?

Seems a simple question.  We know that BABIP (batting average on balls in play) for pitchers has a low correlation from year to year.  As a result, a Sabermetric standard has been that one year in a pitcher’s life tells you little about his actual ability to prevent hits on balls in play, which is true.  In statistical terms: one year is not a sufficient sample to get a good estimate of the parameter, primarily because a pitcher only faces a few hundred balls in play each year.

Suppose though that a pitcher’s season lasted billions of plate appearances.  Eventually, we’d know exactly how good a pitcher was.  If we let him face another billion hitters, he’d come up with the same number again.  That sort of sampling frame produces reliable statistics, but it’s a fantasy.  We have to deal in reality.

But after looking at year-to-year stats, with the low correlation between BABIP at year 1 and BABIP at year 2 (which has held any which way you try to break it), it’s been assumed that pitchers have no control at all over their BABIP, ever.  That’s a big jump, one that I think people make without fully stopping to realize that they’ve made.  (I’ve probably made it myself.)  There’s a difference between a parameter being entirely random and it being unobservable given our limited data and the amount of noise present. 

The assumption goes that everyone is a .300 pitcher once the ball is in play and doesn’t leave the stadium.  After all, if there’s no stability, it must all be random noise.  Right?  It’s just that no one has ever been really comfortable with that thought.  Pitchers don’t differ in their BABIP ability at all?  Pedro Martinez in his heyday was the equivalent of Mike Bacsik in his heyday?  It just doesn’t make sense.  Then there is the curious case of Troy Percival (my personal favorite piece of anecdotal evidence.)  His BABIPs have been consistently below the magic .300 line throughout his entire career, and it’s been a long one.  Could it happen by chance?  Sure, but perhaps something else is afoot.

Maybe the problem is that we need to widen the sampling frame.  Maybe one year doesn’t tell us much about a pitcher’s true talent on BABIP, but what if several years do? 

I took 30 years’ worth of Retrosheet data (1979-2008) and dumped it into a giant file.  I selected all balls in play (not a strikeout, not a walk, not a home run, not HBP, not one of those weird catcher interference thingies.)  As I have been wont to do lately, I started running some split-half reliability analyses.  I split each pitcher’s batters faced into even and odd numbered appearances (so, I’m drawing the first PA into the odd group, then the second into the even group… it balances out the two halves of a player’s performance so that I’m drawing some from year one, some from year two, etc.)

For each pitcher, I started by taking a sample of 500 balls in play and splitting them into two 250 BIP halves (those that had 500 to give).  I ran a correlation between those two halves for all 1461 pitchers in the sample who fit the criteria.  The correlation was .174.  So, at 250 BIP, BABIP has a split half reliability of .174.  It’s numbers like that which led to the creation of DIPS theory to begin with.
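
The split-half procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pitchers are simulated rather than drawn from Retrosheet, the true-talent spread (`talent_sd`) is a made-up number, and the function names are hypothetical, not the code actually used for this study.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def split_half_babip(outcomes, half_size):
    """outcomes: chronological list of balls in play, 1 = hit, 0 = out.
    Returns (BABIP of odd-numbered BIP, BABIP of even-numbered BIP)."""
    sample = outcomes[: 2 * half_size]
    odd = sample[0::2]   # 1st, 3rd, 5th ... ball in play
    even = sample[1::2]  # 2nd, 4th, 6th ... ball in play
    return sum(odd) / len(odd), sum(even) / len(even)

def split_half_reliability(pitchers, half_size):
    """Correlate the two halves across every pitcher with enough BIP."""
    pairs = [split_half_babip(o, half_size)
             for o in pitchers if len(o) >= 2 * half_size]
    odds = [p[0] for p in pairs]
    evens = [p[1] for p in pairs]
    return pearson(odds, evens), len(pairs)

def simulate_pitchers(n_pitchers, n_bip, talent_sd, seed=0):
    """ASSUMPTION: each fake pitcher has a fixed true BABIP drawn around .300."""
    rng = random.Random(seed)
    pitchers = []
    for _ in range(n_pitchers):
        talent = rng.gauss(0.300, talent_sd)
        pitchers.append([1 if rng.random() < talent else 0
                         for _ in range(n_bip)])
    return pitchers

if __name__ == "__main__":
    fake = simulate_pitchers(200, 8000, talent_sd=0.010, seed=1)
    for half in (250, 500, 4000):
        r, n = split_half_reliability(fake, half)
        print(half, n, round(r, 3))
```

In simulated data like this the correlation climbs as the halves get bigger, which is the same pattern the real numbers show; the specific values depend entirely on the invented talent spread.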

But let’s expand.  Let’s take two samples of 500 BIP.  That bumps things up to .253.  Hmmmm, getting a bit more reliable.  The question becomes when does it hit that “good enough” point.  I’ve argued previously for the use of .70 as the cutoff for reliability. It’s an arbitrary point (I guess in an ideal world, we’d want a reliability of 1.0), but .707 has an R-squared of .50, which means anything north of that accounts for more than 50% of the variance.  Can we get to .70?

Turns out that the answer is… yes.

At a sample of 3750 balls in play, (a 7500 BIP sample, chopped in half… there were 48 pitchers in the last 30 years who had that many BIP to look at… not outstanding, but enough to not discount), the split-half reliability was .696.  At 4000, it reached .742 (in 34 pitchers).  So, it only takes about 3800 BIP before we get a reliable read on a pitcher’s BABIP abilities.  That’s a lot, but it’s not an obscene amount.  In 2008, the average pitcher saw roughly 3 balls in play per inning pitched.  At that rate, a starter who throws 180 innings would see about 540 BIP in a year (rough estimates here.)  So, it would take about seven years, at that same 180 IP per year rate, to get to the required number of BIP.  Not easy, but not out of the realm of possibilities.
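
The back-of-the-envelope arithmetic here is easy to check. The sketch below estimates where reliability crosses .70 by drawing a straight line between the two measured points (the straight line is an assumption for illustration; reliability does not really grow linearly with sample size), then converts balls in play into seasons using the rough figure of 3 BIP per inning. The function names are hypothetical.

```python
def bip_at_reliability(target, low_point, high_point):
    """Linear interpolation between two (BIP, reliability) points."""
    (bip_lo, r_lo), (bip_hi, r_hi) = low_point, high_point
    frac = (target - r_lo) / (r_hi - r_lo)
    return bip_lo + frac * (bip_hi - bip_lo)

def seasons_needed(bip_needed, innings_per_year=180, bip_per_inning=3.0):
    """Seasons required to accumulate bip_needed balls in play."""
    return bip_needed / (innings_per_year * bip_per_inning)

crossing = bip_at_reliability(0.70, (3750, 0.696), (4000, 0.742))
print(round(crossing))                 # 3772, i.e. "about 3800"
print(round(seasons_needed(3800), 1))  # 7.0
print(round(0.707 ** 2, 2))            # 0.5, the R-squared behind the .70 cutoff
```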

Now, about those guys who had two matching 4000 BIP samples, there was still some variability in the sample.  Andy Pettitte had BABIPs in his twin samples of .318 and .312.  Charlie Hough had the other extreme at .248 and .266.  So, it looks like there is such a thing as the “ability” to exert some control over what happens to a ball in play.  It just takes a while (but not forever) to reveal itself.

This isn’t a very functionally useful finding for evaluating players or predicting what they will do.  A pitcher is not the same man at the end of seven years that he was at the beginning (either as a pitcher or a human being).  The ability to prevent hits on BIPs may deteriorate over the years and at that point, we’re using data that are 6 and 7 years old to predict what will happen tomorrow.  In a single season, which is really the sampling frame that most fans are concerned about, there will still be a lot of noise around the signal, but the signal is definitely there.  Now if we can just get a better radio to pick it up.

Those Left Behind

I want to take a few minutes here to look at some of the remaining members of the 2009 free agent class. We’ve been reading for weeks about the depressed economy having an effect on some of the potential buyers this winter, and how some players aren’t happy about it. Bobby Abreu is still unsigned as of this writing, as are Adam Dunn and Manny Ramirez. Those three combined for 97 home runs last season, though they admittedly are not the three best fielders in the world. Abreu, Dunn and Manny are three of the most consistent players in baseball, so it’s easy to figure out what they will each add to a team. Obviously, teams have figured out that value and don’t like the asking price so far.

So what about some of the more volatile members of the current free agent class? I’m not talking about the Milton Bradley throwing a fit kind of volatility here. I mean the guys who are more difficult to value because of their potential to be productive, albeit less so than in the past. This is by no means an all-encompassing list; it’s more of a survey of the field. For right now, I’m going to stick with hitters. Let’s get to it…

Ivan Rodriguez

Depending on who’s doing the talking, an Ivan Rodriguez signing could be described as either a bargain or a rip-off. Despite having an OBP below .300 in two of the past four years, replacement level is so low for catchers that Pudge has remained a very productive member of society. Let’s say he signs a one-year, $5 million deal. If he reverts to his 2007 form, or even his pre-Yankees 2008 form, he’ll be a steal. But if the real Ivan Rodriguez right now is the one who got benched in favor of Francisco Cervelli in September, then his new team just threw that money down the drain. My money is on him being a good buy.

Nomar Garciaparra

Nomar is an interesting case, in that even when he’s healthy you’re not really sure if he’s healthy. He won comeback player of the year in 2006 despite missing 40 games due to injury. Just one season later, in essentially the same amount of playing time, he saw his home run totals drop by 65%. His slugging went from Burrell-esque levels to below the Melky-line. He’s still probably a decent hitter, but not for a first baseman. If some team is in need of a part-time utility infielder and is willing to put up with his limited range, then I can justify acquiring him. Other than that, I’d say he’s not worth keeping on the roster.

Jim Edmonds

I suspect that when he’s up for the Hall of Fame, there will be a lot written about his career in particular, and whether he deserves to be in. Just a wild guess: he gets in the third time around. Side note: Why is it that fielding is rarely considered for awards, but comes up so often in HOF debates? Anyway, Edmonds surprised everybody last season with his 20 home runs in limited playing time (and his .353 wOBA, but I suspect the number of people shocked by that was far smaller). He’s probably no longer a viable center fielder, but the positional adjustment might just cancel out the value added by moving him to a corner spot. He’d have to be just about average at the corners to come out ahead. Offensively, Edmonds should continue to provide good value since he still has a great walk rate and that power bat (yay Moneyball!). If he’s used in the same way he was last season, he’ll be a good addition to some contending team.

Frank Thomas

The Big Hurt caps off this list of seemingly forgotten free agents. He held up pretty well in ’06 and ’07 but went down with a series of quad injuries that limited his 2008 season to just 71 games. I don’t see any team signing him to be the DH, so it looks like Fragile Frank (just made that up, I think) may have to hang it up. He’s not much different than guys like Mike Piazza and Sammy Sosa, who did not play last season after having varying degrees of success in 2007.

So that wraps up this post on some of the potential bargains of the remaining free agent class. Next week, I might look at some pitchers, although that’s less fun so I reserve the right to scrap the idea entirely. 

A philosophical question

Let’s say that there was a team (we’ll call them the Mapleland Bees) that was faced with a choice between two free agents.  (And no, this isn’t one of those things where I reveal that these are two real players at the end and say “surprise!”)

Free Agent A is good at the types of things that people will pay to watch. 

  • He hits home runs
  • He’s a “basestealing threat”, which means he’s good for 15-20 SB per year
  • He has a reputation for “coming through in the clutch,” dating superstars and models, and being a pretty “face of the franchise.”  The reporters love him because he’s great copy.
  • He hits for a high (.300+) batting average and because of the guys in front of him had 120 RBI last year.

The problem is that he’s not so good at some of the “hidden” things in baseball that few people know to even look for and fewer would probably pay for.  He’s awful as a defender, and makes the routine plays look spectacular.  He doesn’t draw walks (and so has a low OBP), strikes out a lot, and while he steals bases, it’s primarily because he just runs a lot and gets lucky sometimes.  In other words, he’s something of a slightly altered version of Derek Jeter.  (I know, I know, there’s nothing wrong with Jeter’s OBP).

Free Agent B is not a home run hitter, nor much with the batting average, and due to hitting behind an OBP nightmare, he had a mere 65 RBI last year.  He does put up his share of doubles, but isn’t a fun player to build a marketing campaign around (read: ugly) and is happily married to his college sweetheart, has two kids, and doesn’t say much after games.  However, he’s a fantastic defender who makes the tough plays look routine.  He also draws a lot of walks, so while his average looks low, he’s actually getting on base quite a bit.  Plus he doesn’t really strike out so much.

So, he’s what would happen if Mark Ellis and Brian McCann had a baby together.

Here’s the thing.  The Bees are an enlightened team and they see the two players for just what they are.  And after crunching some numbers, they realize that Ellis McCann would actually stand the better chance of making the team better, once everything is considered (hitting, fielding, running, the fact that being pretty has no bearing on on-the-field results).

But wait, not all is that simple.  Jerek Deter, despite being the lesser player of the two, will sell a lot of t-shirts, tickets, and is worth a few million more in the TV contract and a lot more in getting his name mentioned on E!.  Ellis McCann is boring and will only be appreciated by a handful of nerds. After crunching some more numbers, the owner realizes that he stands to make more money by signing Jerek Deter than Ellis McCann, even after you factor in what is likely to be a big difference in salary… in Deter’s favor.  It even works out to favor more profit from Deter after you figure in the thought that a team that wins more (and Ellis McCann is worth more wins) makes more because everyone loves a winner.

So, whom should the owner of the Mapleland Bees instruct his GM to sign?  He’s a businessman who’s running an entertainment venture, and if there’s nothing else that we learned from the unfortunate ten years that were the peak of Britney Spears’s career, it’s that you have to give the people what they want, no matter how stupid that is.

But… winning…

Baseball Economics and Drive for Innovation

Aside: Well hello there, I’m a new writer here at StatSpeak.  I’m a Jays fan with an interest in economics, so I started a little blog about it.  Now I’m here to do economic style discourse about baseball.  In spite of my soon to be completed advanced degree in mathematics I can’t pretend to know nearly as much about stats as my fellows here, so don’t expect much of that in my posts.

So I’ve been a pretty huge baseball nerd for a while now.  It got so bad that I avoided taking evening lectures at the risk of missing any games my hometown Blue Jays were playing, and when I did attend I would be refreshing the score pretty often.  So when I took Econ 101 I was at the same time following the game, so perhaps it was natural that I got the two pretty mixed up.  When my prof would explain a concept I thought of an example of how it applied to the game.  Surprisingly this led to a good mark in the class and here we are.

One of the aspects of the game that has always fascinated me the most is how it was in essence the microcosm of an economy.  You have thirty competing companies in the same industry who bid for the top employees and assets, who worry about P.R., and are constantly driving toward creating that monopoly – a dynasty in baseball parlance.  Every team wishes for a string of successful years and there seems to be no shortage of strategies toward achieving this.

There are the Yankees who hope to buy up free agency and drive the cost of player contracts into the stratosphere (more on this eventually), the Rays who are a recent example of success through strong drafting, and the sabermetrically beloved A’s whose success rode the back of a better understanding of statistics.  A successful team must have a strategy if they hope for continued success, to have a competitive advantage over their opponents.  Money is a part of this but teams like the A’s and the Twins demonstrate that it is not always the most important part.

In economics there exists the idea of Creative Destruction, popularized by Schumpeter, who argued that monopolies are created through powerful innovations.  Innovations give the creator the ability to grab a significant market share because they simply do it better than their competitors.  Look at Google, who created a monopoly on Internet search engines by having the cheapest cost per search.  But that is only the first part of the idea.  The second is that while monopolies are created by innovations, they are also destroyed by superseding innovations which create new monopolies.  Since monopolies are both created and destroyed by innovations, the ability to create innovations is the greatest competitive advantage there is.

The interesting part of this is that the pace of this cycle of creation and destruction of monopolies is accelerating.

Right, back to baseball.  If baseball is a microcosm for an economy then the same principle of creative destruction should apply.  I think this is easily demonstrated.  The Yankees of the late nineties and early 2000s were undoubtedly the creme de la creme of the baseball world and are really what most people think of first when they think of baseball dynasties.  They created a monopoly through innovation and their monopoly was destroyed by the superseding innovations of the Red Sox.  As a guess, the Yankee dynasty lasted about 5 or 6 years.  If we can believe that the Rays are as good as they demonstrated last year (which, as a Jays fan, I sure hope not) they could have superseded the previous Red Sox monopoly.

So if we can create an innovation in baseball as in business we can create a monopoly.  The discussion of what constitutes a baseball innovation is more than I’m willing to write right now; first we will have to dissect the successes of the past to have a better appreciation of the innovations needed in the future.  The easiest innovation to understand from the sabermetric point of view is, of course, the innovation of the Oakland Athletics, but more on that next time.  Unless I start complaining about the idea of a salary cap first.

Until next time,

Shawn Freaking Estes is not about to regress to your damned mean

I don’t mean to pick on Shawn Estes, I really don’t. I am eternally grateful to him for shutting out the Reds late in 2003, and that’s as far as that goes.

But he came up in an old thread on Tango’s blog about the worst pitchers in baseball, and he certainly is in the running. And I was looking for a face to put on the idea that some guys, no matter how long we wait around, are never going to manifest improvement of any sort. We’re not talking about learning a new pitch, or gaining mental toughness. We’re talking about the numbers catching up to the fact that real MLB teams are still willing to give you a job somewhere. I could have easily made this post about a guy like Daniel Cabrera – and in fact, let’s do that too. Daniel Cabrera laughs at your measurements of central tendency.

I’ve taken the time recently – here, here and here – to muse about how talent is distributed in MLB. And I think I’ve finally come up with something approaching a resolution, or at least a direction to take the conversation in.

I’d like to note for the record that I hate, hate, hate arbitrary playing-time cutoffs for studies. They are at times a necessary evil, but they’re never ideal. But without them, too much weight was being put on fringe players and not enough on regular players when I went to study the issue.

So here’s what I did. I took three years of data (2006-2008), broken down into single-season pitching lines. I took each pitcher-season’s RA and weighted it by the number of batters faced – so a pitcher with 1000 batters faced counted for 100 times more than a pitcher with 10 BFP.

Then, for the sake of being able to actually do graphs, I subsampled out 20,000 pitcher-seasons from the result. So, for instance, there are 105 pitching lines of Jeff Francis’s in the result set, 104 from Johan Santana – but only 22 from guys like Bob Wickman or 21 from Renyel Pinto. (I did this several times, to make sure I wasn’t ending up with a particularly biased subset of the data.) Then I graphed it:
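
One way to reproduce this kind of weighting is to draw pitcher-seasons at random with probability proportional to batters faced. The sketch below assumes sampling with replacement via Python's `random.choices`; the post does not say exactly how the subsample was drawn, and the three pitcher-seasons in the example are invented placeholders, not real pitching lines.

```python
import random
from collections import Counter

def weighted_subsample(pitcher_seasons, k, seed=0):
    """Draw k pitcher-seasons with replacement, weighted by batters faced."""
    rng = random.Random(seed)
    weights = [season["bfp"] for season in pitcher_seasons]
    return rng.choices(pitcher_seasons, weights=weights, k=k)

# HYPOTHETICAL data, purely for illustration.
seasons = [
    {"name": "Workhorse", "bfp": 900, "ra": 4.10},
    {"name": "Swingman", "bfp": 90, "ra": 5.30},
    {"name": "Cup of coffee", "bfp": 10, "ra": 13.50},
]

draws = weighted_subsample(seasons, 20000)
counts = Counter(d["name"] for d in draws)
print(counts.most_common())
```

Re-running with a different `seed` is a cheap way to check that no single draw is badly unrepresentative.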


I cut off the graph at 20 RA so that there was enough meaningful detail for us to see anything. The shape of the graph seems somewhat normal to the left of the 5 RA mark, but seems to taper off much more slowly to the right than we would anticipate if pitching were truly normally distributed. This – and grant that this is the interpretation of a layman, nothing more – looks like a very modest application of the “fat tail,” which is the reason that you can’t buy or sell a house for money these days.

Compare to this graph of, oh, the logarithm of RA (all values of RA included):


This graph seems more normal, doesn’t it?

The biggest difference isn’t in skew – skew was never the major issue in the distribution of pitching, unlike what several of us (including myself) speculated in the comments. The problem is kurtosis – our distribution is too tall in the middle to be truly normal, and the tails are out of whack as a result. (There is, in fact, too much kurtosis for us to even be truly log-normal, although the log-normal distribution seems to describe pitching better than the normal distribution.)
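
For anyone who wants to put numbers on skew and kurtosis, both fall out of the central moments of the data. The sketch below uses the standard moment formulas on simulated values; `fake_ra` is an invented log-normal sample standing in for real RA data, so by construction its logarithm looks normal, which the real data (per the paragraph above) does not quite manage.

```python
import math
import random

def central_moment(xs, k):
    """k-th central moment: the average of (x - mean)^k."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** k for x in xs) / len(xs)

def skewness(xs):
    """0 for a symmetric distribution, positive for a long right tail."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """0 for a normal distribution, positive for a tall peak with fat tails."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0

# ASSUMPTION: simulated stand-in for RA, not the 2006-2008 data.
rng = random.Random(42)
fake_ra = [rng.lognormvariate(math.log(4.8), 0.4) for _ in range(5000)]
log_ra = [math.log(x) for x in fake_ra]

for label, data in (("RA", fake_ra), ("log RA", log_ra)):
    print(label, round(skewness(data), 2), round(excess_kurtosis(data), 2))
```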

Is there a practical application to this? I think so – although I have no more fancy graphs or evidence to present, so consider what follows from here to be nothing more than informed speculation.

There is the assumption that comes with the normal distribution that events a certain number of standard deviations, or “sigma,” away from the mean are practically impossible. (This is where the term “Six Sigma” comes from, if you were curious.)
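
To put a number on “practically impossible,” the tail probability of a normal distribution can be computed directly. This is a small sketch using Python's standard library; `lower_tail` is just a hypothetical helper name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def lower_tail(sigmas):
    """Probability of landing at least `sigmas` standard deviations below the mean."""
    return std_normal.cdf(-sigmas)

for s in (2, 3, 6):
    print(s, lower_tail(s))
```

Under a true normal distribution, six sigma below the mean works out to roughly a one-in-a-billion event.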

We can pretty readily disprove this assumption when it comes to major league pitching. You simply need:

  • A baseball.
  • A bat.
  • A major-league hitter.
  • An idea of where the fence would be in an MLB park.

Gather those things, and then you try throwing the ball to the hitter while he holds the bat. I think it’ll be pretty quickly demonstrated that it’s possible to be much, much worse at pitching than six sigma below the league average.

Any regression-based projection of the worst pitchers in baseball is likely to be too rosy, for the simple fact that it’s possible to have a worse true talent level for pitching than the normal distribution is fully able to comprehend and accept.

Is baseball talent normally distributed?

A lot of the basic assumptions of sabermetrics are based around the notion that baseball talent is normally distributed, or at least approximates a normal distribution.

Briefly, the normal distribution describes the bell curve – what you tend to see with a normal distribution is a peak in the middle, at the average, with two tails that spread out evenly from there, in a very symmetric fashion.

The normal distribution is important to sabermetrics – it’s the basis on which regression to the mean works. Regression to the mean simply states that extreme observations tend to become less extreme in the long run. This is the basis of how sabermetricians estimate a player’s true talent level, and is how most projection systems work.
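
One common way to put regression to the mean into practice is to shrink an observed rate toward the league average, trusting the observation more as the sample grows. The sketch below is a generic illustration, not the method of any particular projection system; `k`, the amount of league-average performance blended in, is a hypothetical constant chosen only for the example.

```python
def regress_to_mean(observed, league_mean, n, k):
    """Blend an observed rate over n chances with k chances of league average."""
    return (n * observed + k * league_mean) / (n + k)

# A .350 observation against a .300 league mean, with k = 500 (assumed).
for n in (100, 500, 2000):
    print(n, round(regress_to_mean(0.350, 0.300, n, k=500), 3))
```

The larger the sample, the closer the estimate stays to what the player actually did; with few chances, it sits near the league mean.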

But how true is it?

What I’m using right now is a tool called quantile-quantile plots. I should warn that I’m not a professional statistician, nor do I really play one on the Internet. I’m a hobbyist, so take the following with a grain of salt; this is the best I can discern from the documentation for the GNU R statistics package and some light reading on the topic.

But the Q-Q plot has a straight line on it that’s supposed to represent how the data would look if the distribution was normal, as well as a graph of how the data is actually distributed. This is the Q-Q plot for RA, from pitchers with 20+ IP from 1998-2008:


I should probably take this moment to note that those cutoffs are wholly arbitrary.

But what we see, especially on the high end of the tail, is a high amount of non-linearity in our graph – in other words, pitching does not seem to wholly fit to the normal distribution. If we look at the center part of the band we can see that for many – maybe even most – pitchers, the normal distribution is a “good-enough” fit – there’s a good fit there in the middle.
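
For readers who want to see what goes into a Q-Q plot, here is a minimal sketch of the underlying calculation in Python. It pairs each sorted observation with the value a standard normal distribution would put at the same rank, then measures how far the points stray from a reference line built from the sample mean and standard deviation (one common choice, and an assumption here, not necessarily how any given stats package draws its line). The two samples are simulated, not real RA data.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def qq_points(values):
    """Pair each sorted value with its theoretical standard-normal quantile."""
    xs = sorted(values)
    n = len(xs)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, xs))

def max_gap_in_sd(values):
    """Largest distance from the reference line, in standard deviations."""
    m, s = mean(values), stdev(values)
    return max(abs(x - (m + s * z)) for z, x in qq_points(values)) / s

# ASSUMPTION: simulated samples, one normal and one with a long right tail.
rng = random.Random(7)
normal_sample = [rng.gauss(4.8, 1.0) for _ in range(2000)]
skewed_sample = [rng.lognormvariate(math.log(4.8), 0.6) for _ in range(2000)]

print(round(max_gap_in_sd(normal_sample), 2))
print(round(max_gap_in_sd(skewed_sample), 2))
```

A sample that really is normal hugs the line from end to end; a sample with a long tail peels away from it at the extremes, which is the pattern described above.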

The biggest issues seem to be with pitchers who have an RA of 6+, which to be honest we probably aren’t too concerned with. There is a more subtle – and yet more important – shift away from the line at pitchers with an RA below 2.

Now, the same, but for wOBA of hitters with 20+ PA from 1998-2008, pitchers excluded:


Again, in the middle the normal distribution seems to fit well, but at the extremes the assumptions of normality seem to be suspect. The issues above a .400 wOBA seem to be with performances that are very unsustainable to begin with, so I don’t know if they’re important. The sub-.300 wOBA issues, again – I don’t know if they come into play often enough to be a concern.

I’m sorta musing aloud in this post, and would love to get your thoughts on the issue. (This goes double for all of you who have a better stats background than I do!)

