# Are Position Players Easier to Project than Pitchers?

Hi everyone. My name is Dan Jerison, and I’m excited to be joining this excellent blog. I am a grad student in math and an avid Red Sox fan. My posts will probably be on the technical side, but I’ll try not to get too caught up in the details.

We all know that position players are easier to project than pitchers. Right? For example:

| Year | Roy Halladay ERA+ | David Wright OPS+ |
|------|-------------------|-------------------|
| 2005 | 184               | 139               |
| 2006 | 143               | 133               |
| 2007 | 120               | 150               |
| 2008 | 154               | 141               |

Okay, but why pick ERA+ and OPS+? Why not other stats, like…

| Year | Roy Halladay GB% | David Wright WPA/LI |
|------|------------------|---------------------|
| 2005 | 61%              | 4.2                 |
| 2006 | 57%              | 2.7                 |
| 2007 | 53%              | 5.0                 |
| 2008 | 54%              | 5.2                 |

Now who’s easier to project?

I hope I’ve demonstrated that the question doesn’t make sense unless we
compare the pitcher and the position player along the same metric. It
would be best if the metric captured the player’s total value to his
team rather than just a portion. For instance, neither OPS+ nor WPA/LI
measures fielding.

Before we jump straight to the WARP-style uberstats, though, I want to
focus on a narrower issue. There are a lot of stats out there that
measure batting+baserunning wins above average. I think the easiest way
to understand the differences between them is to look at a particular
example.

On September 23, 2008,
the Mets trailed the Cubs 2-0 at Shea Stadium in the bottom of the 5th.
David Wright came to the plate with 2 outs and the bases loaded. He
singled to left, scoring 2 runs and sending the other runner to second
base.

How many wins above average do we give Wright for his 2 RBI single?
Here are four different stats: Basic Linear Weights, Run Expectancy Wins, Win Probability Added, and WPA/LI.

BLW: A single is worth +0.49 runs. There are 10 runs per win. (Numbers are approximate.) So, Wright gets 0.049 wins above average.

REW: The run expectancy before Wright’s single was 0.77. After
his single, it was 0.44. Two runs scored on the play, so Wright added 2
+ 0.44 – 0.77 = 1.67 runs. At 10 runs per win, that is 0.167 wins above
average.

WPA: Wright raised the Mets’ win expectancy from 32.3% to 55.7%. He gets credit for the difference: 0.234 wins above average.

WPA/LI: The Leverage Index of Wright’s at-bat was 4.16. He gets 0.234 / 4.16 = 0.056 wins above average.
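To make the arithmetic concrete, here is a minimal sketch that reproduces the four credits from the numbers quoted above. The function names are my own, chosen for illustration, and the 10 runs per win and +0.49 runs per single are the approximate values used in this post, not exact constants.

```python
# Approximate conversion used in the post: about 10 runs per win.
RUNS_PER_WIN = 10.0


def blw_wins(event_run_value, runs_per_win=RUNS_PER_WIN):
    """Basic Linear Weights: a fixed run value per event, ignoring context."""
    return event_run_value / runs_per_win


def rew_wins(re_before, re_after, runs_scored, runs_per_win=RUNS_PER_WIN):
    """Run Expectancy Wins: runs scored plus the change in run expectancy."""
    return (runs_scored + re_after - re_before) / runs_per_win


def wpa_wins(we_before, we_after):
    """Win Probability Added: the change in the team's win expectancy."""
    return we_after - we_before


def wpa_li_wins(we_before, we_after, leverage_index):
    """WPA/LI: WPA scaled down by the leverage of the plate appearance."""
    return wpa_wins(we_before, we_after) / leverage_index


# Inputs for Wright's two-run single, as quoted in the text.
credits = {
    "BLW": blw_wins(0.49),
    "REW": rew_wins(re_before=0.77, re_after=0.44, runs_scored=2),
    "WPA": wpa_wins(we_before=0.323, we_after=0.557),
    "WPA/LI": wpa_li_wins(we_before=0.323, we_after=0.557, leverage_index=4.16),
}

for name, wins in credits.items():
    print(f"{name:7s} {wins:+.3f}")
```

The same single is worth anywhere from about 0.05 to 0.23 wins. The difference is how much context each stat uses: the base-out state for REW, the inning and score as well for WPA, and that same context divided back out by leverage for WPA/LI.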

Now let’s look at Wright’s totals in each of these stats for 2005-08. (For BLW, I am using Chone Smith’s implementation. The other three are on Fangraphs.)

David Wright

| Year | BLW | REW | WPA | WPA/LI |
|------|-----|-----|-----|--------|
| 2005 | 3.4 | 4.9 | 2.6 | 4.2    |
| 2006 | 3.4 | 3.7 | 4.7 | 2.7    |
| 2007 | 5.5 | 4.5 | 4.2 | 5.0    |
| 2008 | 4.4 | 4.1 | 4.2 | 5.2    |

That’s a lot of disagreement! Just for starters, which was
Wright’s best season? Four stats, four different answers. But my main
point has to do with year-to-year consistency. By REW, Wright averaged
+4.3 wins per year, with a high of +4.9 and a low of +3.7. By WPA/LI,
he averaged +4.2 wins per year, with a high of +5.2 and a low of +2.7.
The latter stat shows much wider variation than the former.
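The comparison can be checked directly from the table. The sketch below summarizes each stat's four seasons and finds the best season under each one. The standard deviation is an extra spread measure I have added for illustration; it is not part of the comparison in the text, and with only four seasons it should be read loosely.

```python
from statistics import mean, pstdev

seasons = [2005, 2006, 2007, 2008]

# Wright's season totals (wins above average), as rounded in the table.
wright = {
    "BLW": [3.4, 3.4, 5.5, 4.4],
    "REW": [4.9, 3.7, 4.5, 4.1],
    "WPA": [2.6, 4.7, 4.2, 4.2],
    "WPA/LI": [4.2, 2.7, 5.0, 5.2],
}


def summarize(values):
    """Return simple year-to-year summary figures for one stat."""
    return {
        "mean": mean(values),
        "low": min(values),
        "high": max(values),
        "range": max(values) - min(values),
        "sd": pstdev(values),  # population standard deviation of the four seasons
    }


summary = {stat: summarize(vals) for stat, vals in wright.items()}

# The season in which each stat reaches its maximum.
best_year = {stat: seasons[vals.index(max(vals))] for stat, vals in wright.items()}

for stat, s in summary.items():
    print(
        f"{stat:7s} mean={s['mean']:.2f} low={s['low']:.1f} high={s['high']:.1f} "
        f"range={s['range']:.1f} sd={s['sd']:.2f} best={best_year[stat]}"
    )
```

Because the inputs are the rounded table values, a mean computed this way can differ in the last digit from the averages quoted in the text.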

We saw earlier that Roy Halladay looked very consistent using GB% and
less consistent using ERA+. It’s well known that the pattern holds in
general: pitchers’ ground-ball rates vary less than their ERAs. The
David Wright example is somewhat more puzzling. Here are four stats,
all of which purport to measure the value a hitter provides for his
team by batting and baserunning. Yet they disagree on the amount of
year-to-year variation in that value.

So, how much does a hitter’s value vary from year to year? One
response is that it depends on which stat you use. I consider this to be a cop-out. In upcoming posts, I’ll lay out a way to determine
whether a particular stat has too little year-to-year variation, too
much, or the right amount. At least, that’s my goal. We will see how it
goes.

### 7 Responses to Are Position Players Easier to Project than Pitchers?

1. Pizza Cutter says:

When trying to get a handle on what a player’s value really is, is there such a thing as too much variation year to year?

2. Pizza Cutter says:

Sorry… changing diapers in the middle of the night does fun things to your concentration…
Should read: Is there such a thing as too little variation?

3. Millsy says:

I would imagine too little variation would make it difficult to extrapolate your projections very much. It wouldn’t give you a very nice distribution to pull from, and, while it may not be as likely that you miss, when you do miss it could be by a LOT since you have no precedent in finding that sort of anomaly. Also, too little variation could indicate that the metric isn’t really picking up everything it should be, since we know that player value DOES vary from year to year.

4. Colin Wyers says:

The correct answer for the amount of variance in a metric y-t-y is the amount of variance in player skill y-t-y, isn’t it? In the case of position player offense, that’s really not a great mystery – we know what works and what doesn’t work when it comes to evaluating hitting.
Trying to decide between LWTS and WPA/LI based upon a four-year section of David Wright’s career and the amount of variance year to year… why do it that way? We know exactly what those things measure and we have the data to analyze them on a pretty granular level.

5. Pat_Andriola says:

This would be interesting to see regarding UZR, which seems to sporadically fluctuate (take a look at Carlos Gomez this year).
Are you going to do RMSE scores for BLW, REW, WPA, and WPA/LI?

6. Millsy says:

Ah. Misunderstood the basis of the question. Going for a simple answer which wasn’t really useful.

7. Matt S. says:

You briefly brought up the issue of fielding and that really gets me excited. I feel that the general understanding of fielding ability regards it as a fixed skill that varies little if at all. That is clearly untrue, and advanced metrics (like UZR) show the variations. Still, there is little mention of fielding in most projections, and you would need to change that significantly if you were to develop a system that predicts overall value for a position player.
This may be even more problematic than it seems since the best fielding analysis systems are only available for the most recent years and it would be hard to create an age-based regression standard with such limited data. Still, predicting defensive performance would be very helpful on many fronts. I would love to see more on the fielding side of this question, as I think it is the most interesting part.