A new framework for offensive evaluation: Total Production
October 31, 2008 26 Comments
If you’ll forgive me for being brief, I have some (you guessed it) linear weights for your edification and entertainment.
This is an update to the “RE from zero” method I introduced a while back. There are two key differences:
 Instead of doing the entire Retro-era, I simply used what I am calling the modern offensive era, 1994-2007. (I don’t have 2008 Retrosheet data yet, because nobody does.)
 I have made some corrections to the way I was figuring out the negative run value of the out.
Here is the run expectancy table, 1994-2007:
OUTS   RUN1    RUN2    RUN3
0      0.395   0.613   0.842
1      0.265   0.412   0.658
2      0.131   0.229   0.287
3      0.000   0.000   0.000
And now, the linear weights, broken down by component:
SHORTNAME  LONGNAME          NUM       RUN     RBI     OUT      LWTS
Out        Generic Out       1223130   0.005   0.028   -0.132   -0.099
K          Strikeout         419719    0.001   0.003   -0.117   -0.114
SB         Stolen Base       35402     0.170   0.000   0.000    0.170
CS         Caught Stealing   19585     0.036   0.000   -0.271   -0.236
NIBB       Walk              203512    0.256   0.065   0.000    0.320
IBB        Intentional Walk  17489     0.196   0.005   0.000    0.201
HBP        Hit By Pitch      22391     0.265   0.082   0.000    0.347
1B         Single            395520    0.272   0.206   -0.004   0.474
2B         Double            117508    0.426   0.337   -0.002   0.761
3B         Triple            12505     0.606   0.429   0.000    1.036
HR         Home Run          69809     1.000   1.401   0.000    1.401

And now I have some ‘splaining to do. These are “absolute” linear weights, as opposed to linear weights above average (the usual presentation). They are not derived from a standard run expectancy table, so no additional step was needed to reconcile them with absolute runs scored, and I have employed no reconciliation method. (Since converting LWTS from above-average to absolute runs is a topic of some confusion and no little controversy – or at least that’s how it feels to me – I really need to do a longer writeup of the issue at some point.)
I’ve arranged the values so that they match up with the three basic elements of scoring runs: providing a baserunner, advancing a baserunner and making outs. You’ll note that I am double-counting the home run; this is because doing so provides us with linear weights values that match up well with a player’s runs scored and runs batted in, at least for the population as a whole.
The benefit of this approach – at least to my thinking – is that you can present linear weights as context-neutral runs scored and runs batted in, with a third component – a player’s negative contribution by making outs.
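As a rough sketch of how the three components could be tallied for one hitter, the snippet below applies the RUN, RBI and OUT weights from the table to a stat line. The stat line is made up for illustration, the names `WEIGHTS` and `total_production` are hypothetical, and two things are assumptions rather than anything stated above: the OUT values are taken as negative (the sign that makes RUN + RBI + OUT agree with the LWTS column), and the home run's run is subtracted once from the total so that the total matches the LWTS column despite the double-count.

```python
# (RUN, RBI, OUT) weights per event, copied from the table above.
# Assumption: OUT values are negative, so that RUN + RBI + OUT ~= LWTS.
WEIGHTS = {
    "Out":  (0.005, 0.028, -0.132),
    "K":    (0.001, 0.003, -0.117),
    "SB":   (0.170, 0.000, 0.000),
    "CS":   (0.036, 0.000, -0.271),
    "NIBB": (0.256, 0.065, 0.000),
    "IBB":  (0.196, 0.005, 0.000),
    "HBP":  (0.265, 0.082, 0.000),
    "1B":   (0.272, 0.206, -0.004),
    "2B":   (0.426, 0.337, -0.002),
    "3B":   (0.606, 0.429, 0.000),
    "HR":   (1.000, 1.401, 0.000),
}


def total_production(counts):
    """Return context-neutral runs, RBI, outs and their total for event counts."""
    runs = sum(WEIGHTS[event][0] * n for event, n in counts.items())
    rbi = sum(WEIGHTS[event][1] * n for event, n in counts.items())
    outs = sum(WEIGHTS[event][2] * n for event, n in counts.items())
    # Assumption: the home run's own run appears in both RUN and RBI,
    # so it is removed once here to keep the total in line with LWTS.
    double_counted = WEIGHTS["HR"][0] * counts.get("HR", 0)
    return {"runs": runs, "rbi": rbi, "outs": outs,
            "total": runs + rbi + outs - double_counted}


# Hypothetical season line, for illustration only.
line = {"1B": 100, "2B": 30, "3B": 3, "HR": 25, "NIBB": 60, "IBB": 5,
        "HBP": 5, "SB": 10, "CS": 4, "K": 100, "Out": 280}
for name, value in total_production(line).items():
    print(f"{name}: {value:.1f}")
```

The runs and RBI figures can then be read next to a player's actual R and RBI, with the outs figure as the cost side of the ledger.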
I’m calling it Total Production, because to be quite frank I can’t come up with anything better and I should be in bed already. I am throwing myself on the mercy of you, the reader, to give me a better name. Please give me a better name.
And… here is your Total Production leaderboard from 1994-2007, minimum 300 PAs. (For no other reason than the size limit of a table I can publish via EditGrid.)
From here, you could do a lot of things with the numbers – add in park adjustments, provide versions for different baselines like above average or replacement level, combine them with some defensive metrics so that the “total” part of the name isn’t an absolute lie.
I should note that I’m not claiming any benefit from using these weights over any similarly capable set of linear weights appropriate to the time period involved. The only potential benefit I’m claiming here is in the presentation of the data – it occurred to me that some people might have an easier time approaching linear weights if they were presented in the guise of the traditional counting stats. Let me know if you think so, too.
You want a better name for what you have done? How about Total Crap? Seriously this is the worst implementation of coverting Runs Above Average linear weights to Absolute Runs that I have seen. Your values for non-out events stay the same as Runs Above Average linear weights while you drastically reduce the negative value of making an out. Have you given any thought to the theoretical implications of that? You are basically distributing all the average runs by giving an equal value to each out event and NO VALUE TO EACH NON-OUT EVENT. Players who make more outs than average get MORE VALUE. That is crazy.
In Tango’s method of giving .12 runs for each batting event the average runs are distributed among all events and every batter gains in proportion to the number of PAs that he has. This makes sense if you believe that the out value in runs above average correctly compensates for the loss of additional PAs to other batters. The method I proposed of distributing .36 runs to every non-out event would be correct if you believe that some additional value for the additional PAs of other batters is not correctly compensated in linear weights above average and therefore additional value should go to the batters who make fewer outs than average. There is no rationale that I can conceive that would favor your method of giving additional value to batters that make more outs than average.
As always, Peter, your comments are appreciated.
I would like to see you name a published implementation of absolute LWTS that doesn’t look pretty much like the weights I’ve published here. All of the ones I’m aware of – Estimated Runs Produced, Extrapolated Runs – do it this way. That doesn’t make it right, but since this is “the worst implementation of co[n]verting Runs Above Average linear weights to Absolute Runs that [you] have seen” I’m curious as to how it’s worse than either of those.
I didn’t start from the assumption that this was the correct way to do it – I actually spent quite a bit of time simply trying to find out how to start from the “Zero Point run expectancy” and come up with weights that properly reconciled. I could have simply ad-hoced everything to make it look different, but that defeats the point of figuring the weights empirically, doesn’t it?
A while back I stepped through the different approaches to converting Runs Above Average to absolute runs, using Batting Runs as my example. Here they are, for everyone’s reference.
Above Average:
(.47*H)+(.38*D)+(.55*T)+(.93*HR)+(.33*(BB+HBP))+(.22*SB)-(.38*CS)-(.26*(AB-H))
Traditional:
(.47*H)+(.38*D)+(.55*T)+(.93*HR)+(.33*(BB+HBP))+(.22*SB)-(.38*CS)-(.09*(AB-H))
Tango:
(.55*H)+(.50*D)+(.67*T)+(1.05*HR)+(.45*(BB+HBP))+(.22*SB)-(.38*CS)-(.14*(AB-H))
Peter:
(.60*H)+(.73*D)+(.90*T)+(1.28*HR)+(.68*(BB+HBP))+(.22*SB)-(.38*CS)-(.26*(AB-H))
Unlike the values I normally present, these use hits instead of singles, which is probably the more common presentation of the data.
Here’s an exercise to try right now. Apply the different values for these linear weights to pitchers batting. I think that’ll show why Peter’s method is problematic.
Here’s a head start. Houston Astros pitchers this year hit .159/.194/.193, which is a bit above average for pitchers hitting. How many additional runs would the Astros have scored if they had a DH, presuming that DH was a league average hitter? First, Runs Above Average:
(.47 * 47) + (.38 * 4) + (.55 * 0) + (.93 * 2) + (.33 * (12 + 1)) + ((-.26) * (296 – 47)) = -34.98
Now, let’s use the “traditional method”, only changing the out value:
(.47 * 47) + (.38 * 4) + (.55 * 0) + (.93 * 2) + (.33 * (12 + 1)) + ((-.09) * (296 – 47)) = 7.35
Now Tango’s method, or the “reconcile” method:
(.55 * 47) + (.50 * 4) + (.67 * 0) + (1.05 * 2) + (.45 * (12 + 1)) + ((-.14) * (296 – 47)) = 0.94
Now Peter’s method:
(.60 * 47) + (.73 * 4) + (.90 * 0) + (1.28 * 2) + (.68 * (12 + 1)) + ((-.26) * (296 – 47)) = -22.22
Now, if an average hitter produces .12 runs per PA (and he does, varying a bit based upon run environment) that gives us:
.12 * 334 = 40.08
So now, runs above average, based upon:
Traditional: 7.35 – 40.08 = -32.73
Tango: 0.94 – 40.08 = -39.14
Peter: (-22.22) – 40.08 = -62.30
The traditional and reconcile methods both line up pretty well with what we get using Runs Above Average. Peter’s method is far out of line with all three of the other methods.
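For anyone who wants to check the arithmetic, here is a small sketch that recomputes the four totals from the Astros pitchers' line quoted above (296 AB, 47 H, 4 D, 0 T, 2 HR, 12 BB, 1 HBP, 334 PA). The function and variable names are hypothetical, the out weights are entered as negative numbers, and SB and CS are left out because the worked example leaves them out.

```python
# Astros pitchers' batting line from the example above.
AB, H, D, T, HR, BB, HBP, PA = 296, 47, 4, 0, 2, 12, 1, 334


def batting_runs(h_w, d_w, t_w, hr_w, bb_w, out_w):
    """Linear weights on hits (not singles); out_w is the (negative) out value."""
    return (h_w * H + d_w * D + t_w * T + hr_w * HR
            + bb_w * (BB + HBP) + out_w * (AB - H))


above_avg = batting_runs(.47, .38, .55, .93, .33, -.26)
traditional = batting_runs(.47, .38, .55, .93, .33, -.09)
tango = batting_runs(.55, .50, .67, 1.05, .45, -.14)
peter = batting_runs(.60, .73, .90, 1.28, .68, -.26)

# What an average hitter would produce in the same number of PAs.
avg_runs = .12 * PA

print(f"Runs Above Average: {above_avg:.2f}")
for name, absolute in [("Traditional", traditional),
                       ("Tango", tango),
                       ("Peter", peter)]:
    print(f"{name}: {absolute:.2f} absolute, "
          f"{absolute - avg_runs:.2f} relative to average")
```

Subtracting the same .12 runs per PA from each absolute total is what puts the three methods on the Runs Above Average scale for comparison.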
I really don’t understand Peter’s criticisms. All that Colin is doing here is breaking down the Linear Weights similar to the way that Tangotiger did in his “How are Runs Really Created” series. The run values in the LWTS column, which are the sum of the Runs + RBI + Outs columns, are similar to the results you would get if you calculated absolute Linear Weights from a traditional Run Expectancy Chart.
Does the CS category include pickoffs and pickoff errors? If it does, the -.236 run value looks correct.
What are the component values for “Reached on Errors”?
Context-neutral Runs and RBI. What a great idea! I never thought that this would have been possible.
Terpsfan101–
Baseball Prospectus has some kind of context-neutral RBI stat, although it’s configured completely differently than this. I think they take the rate at which you drive in runners on base and then scale the RBI count to what you’d do in an average lineup. Or something like that. So A-Rod’s totals get knocked down a little bit because he has 3 great hitters in front of him always on base, and someone like Pujols might have his total increased because the guys in front of him suck.
Dan,
Yes, I was aware of BP’s RBI stats. I should have said that I never thought that it would have been possible to convert Linear Weights into Runs scored and RBIs. As far as I know, Colin is the first person to do this.
terpsfan, it should be noted that Tango has since advanced another position when it comes to reconciling linear weights with absolute runs:
http://www.tangotiger.net/reconcile.html
This is because of the difference between a runs per out framework (which is essentially what I have here) and a runs per plate appearance framework.
The correct way to turn these values into Runs Above Average is as follows:
Total Production – (.192 * (AB-H)) = RunsAboveAverage
(And on that note, I’ve updated the spreadsheet with Production above average and above replacement, where replacement is defined as 75% of average, not adjusted for position.)
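A minimal sketch of that conversion, for illustration: the function name and the sample inputs are hypothetical, and only the .192 per-out adjustment comes from the formula above.

```python
def runs_above_average(total_production, ab, h):
    """Convert Total Production to Runs Above Average.

    Subtracts .192 runs for every out, with outs taken as AB - H.
    """
    return total_production - 0.192 * (ab - h)


# Hypothetical hitter: 95.0 Total Production in 550 AB with 160 H.
print(f"{runs_above_average(95.0, 550, 160):.1f}")
```

Because the adjustment scales with outs rather than plate appearances, two hitters with the same Total Production but different out totals land at different Runs Above Average figures.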
I don’t want to resort to jerkish technicalities, but the values presented here are not wrong, so long as you recognize that these are runs per out and you treat them as such. Since we normally measure playing time for hitters with plate appearances (since this is of course how playing time is allotted) for a variety of reasons it makes more sense to handle the issue Tango’s way.
But I don’t think it’s right to say that Tango is correct here and I am wrong. Each answers two different questions. You could argue (probably pretty well) that for hitters the question that Tango is answering is more relevant and meaningful, and you won’t get an argument from me.
But so long as we’re careful and know what we’re dealing with, we can pretty readily convert between the two without worries. What you need to know to compare two players (with ANY counting stat, regardless of baseline) is their production level and playing time.
If I am providing runs per out and Tango is providing runs per plate appearance, then I have no idea what Peter is providing. Up until this point Peter has provided no evidence that his approach is correct, and has only really provided blunt assertion. Tango and I have both shown him points where his model breaks down at the extremes (I think that -20 runs for a pretty typical pitching staff’s hitting is an absurdity in an absolute run estimator.)
As for a reach on error, it has a Run value of .303, an RBI value of .213, and an out value of -0.002. CS does include pickoffs.
Colin #2 – I admit that Estimated Runs Produced and Extrapolated Runs are no better absolute run estimators than yours, so my claim that yours was the worst I had ever seen was a tad exaggerated. But Estimated Runs Produced was invented over 23 years ago and Extrapolated Runs about 10 years ago.
As far as I can tell your “New” Framework has added nothing innovative and has repeated their earlier mistakes of adjusting only the out value to convert runs above average to absolute runs.
You must have a theoretical basis for the decisions you make. In my earlier post I presented the theory behind Tango’s method and my alternative hypothesis. If you want to defend your method by presenting the theoretical reasoning behind giving added value in absolute runs to players who make above average number of outs I will be glad to listen, but that is where the discussion needs to start.
Colin # 3 and 4 – If you had actually done the math correctly on your examples you would have found that the absolute run values for the Astros pitchers by the different methods were: yours – 7.35, Tango’s – 2.1, mine (-13.38). Over the several months since the discussion about conversion methods began I have been giving a lot of thought to what the best method might be. I honestly don’t know whether Tango’s method or mine is correct. Lately, I have been leaning more in Tango’s direction but I have failed to come up with a solid proof of his underlying assumption that runs above average linear weights correctly compensate players for the added PAs their non-outs produce for other players. The practical differences between the different conversion methods (yours, mine, and Tango’s) for most players are within only a few runs, and probably also within the margin of error of linear weights itself. But still it would be good to have a method that could be proven to be theoretically correct so that any error would be in the correct direction.
Colin,
Yes, I have been aware that you are using a different method here. You are using a RE chart that starts from zero instead of the average runs/inning.
Perhaps you could clarify one more thing for me. Why do the values for absolute Linear Weight values that you present here:
http://statspeak.net/2008/10/topoffensiveperformers2008.html
differ from the set of absolute Linear Weights that you present in the “Total Production” article?
Colin,
Please explain this as well,
“This is because of the difference between a runs per out framework (which is essentially what I have here) and a runs per plate appearance framework.”
Isn’t it the other way around? Isn’t your framework R/PA, and the old framework R/O?
If you could point out where the error in calculation occurred, that would be appreciated.
As far as:
“You must have a theoretical basis for the decisions you make. In my earlier post I presented the theory behind Tango’s method and my alternative hypothesis. If you want to defend your method by presenting the theoretical reasoning behind giving added value in absolute runs to players who make above average number of outs I will be glad to listen, but that is where the discussion needs to start.”
That is incorrect. In order to know a player’s value, you need two pieces of information: his production and his playing time. For the values I presented, the proper denominator is outs.
The problem with your method is that it distorts the run values of the underlying events. I went ahead and tested everything on team run scoring, 1993-2007, figuring a set of linear weights above average based off a standard RE chart and adjusting from there. Absolute value linear weights based upon runs per out have an average error of +/- 17 runs on the team/seasonal level, compared to +/- 20 runs for runs per PA and +/- 38 runs for runs per safe (your method).
This type of value (the -.1 out value, for lack of a better term) is the same that you will get from using a dynamic model of run scoring (BsR, Markov model, simulation, etc.) and measuring the marginal value of each event. They have a theoretical basis–the run value of the event to a team.
As Colin points out, since they are team-based values, they have to be viewed in a runs/out framework when they are turned into rates. And I also agree wholeheartedly with Colin when he says “But I don’t think it’s right to say that Tango is correct here and I am wrong. Each answers two different questions.”
So the only difference between the Absolute Linear Weights presented in this article and traditional Absolute Linear Weights is that the Absolute LWs in this article are calculated from an Absolute RE table, while traditional Absolute LWs are calculated from an average runs/inning RE table, with the additional step of adding the R/O for out plays. Is this the only difference?
Now, what is the difference between the Absolute LW in the “Total Production” article and the Absolute LW in the “Offensive Performance” article?
http://statspeak.net/2008/10/topoffensiveperformers2008.html
In the “Offensive Performers” article you are adding the average run value of a PA to each PA event. Why are you not doing this here? Finally, are the values in the LWTS column of the “Offensive Performance” article generated from an absolute RE table or are they generated from an average runs/inning table?
Right. These should be functionally identical to absolute runs reconciled using runs per out. The values in that past article were figured from a standard RE table and reconciled using runs per plate appearance.
I’m not doing this here because I didn’t ever get to runs above average. I’ll admit that I’m still not entirely decided myself on which approach I favor; I can see upsides and downsides to both. And I’m trying to work out which I think is preferable to me. Writing these articles is a real learning experience for me.
“Writing these articles is a real learning experience for me.”
Reading these articles is a learning experience.
After reading your last response, I am less confused. The remaining confusion could probably be cleared up if I knew the advantages/disadvantages of your new methods (R/PA Standard RE Table and Absolute RE Table) compared to the old way of doing things.
(BTW, I’m an idiot. I didn’t realize that today was daylight savings time until a few minutes ago.)
Love the RUN and RBI linear weights, Colin. You can be sure I’ll use and abuse these in the near future.
Tom Tango said he tried to submit the following, but it wouldn’t go through:
“I need to make the correction to this:
(.55*H)+(.50*D)+(.67*T)+(1.05*HR)+(.45*(BB+HBP))+(.22*SB)-(.38*CS)-(.14*(AB-H))
When you add .12 runs per PA, and if PA = H+BB+(AB-H), then you do NOT want to add .12 to D, T, HR. You ONLY add it to H, BB, and (AB-H).
— Tangotiger”
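To see what the correction does in practice, the sketch below adds a flat constant on top of the above-average Batting Runs total for the Astros pitchers' line used earlier in the thread, once per PA (.12, Tango's approach) and once per non-out event (.36, Peter's approach). The variable names are hypothetical, and counting HBP alongside BB when defining PA is an assumption made here because the formula lumps the two together.

```python
# Astros pitchers' batting line quoted earlier in the thread.
AB, H, D, T, HR, BB, HBP = 296, 47, 4, 0, 2, 12, 1

# Batting Runs above average; D, T and HR weights are extras on top of H.
above_avg = (.47 * H + .38 * D + .55 * T + .93 * HR
             + .33 * (BB + HBP) - .26 * (AB - H))

# Each PA is counted exactly once as a hit, a walk/HBP, or an out.
# Assumption: HBP is grouped with BB, as in the formula.
pa = H + (BB + HBP) + (AB - H)

# Tango: .12 runs per PA, so D, T and HR get no extra bump.
tango_absolute = above_avg + .12 * pa

# Peter: .36 runs per non-out event only.
peter_absolute = above_avg + .36 * (H + BB + HBP)

print(f"PA counted: {pa}")
print(f"Tango, .12 per PA: {tango_absolute:.2f}")
print(f"Peter, .36 per non-out: {peter_absolute:.2f}")
```

Adding the constant once per plate appearance, rather than once per term in the formula, is the whole point of the correction: a double is already counted as a hit, so bumping both the H and D weights would credit that PA twice.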
These are the new values I’ve been using (along with the test results by team and by pitcher):
http://www.editgrid.com/user/cwyers/Reconcile
Colin,
I’m not so sure that you should be assigning a run value to foul errors. These are neutral plays from a Run Expectancy perspective, because they do not change the base-out situation.
OK, I notice that you only assigned a LW value to foul errors in the LWTS_PA column. I’m just curious, how are you defining Plate Appearances for LWTS_PA?
Because of the inclusion of non-PA events into the weights, I went ahead and reconciled everything at the “run per event” instead of strictly runs per PA. I’m not certain this is correct, but it should work out fine for this because that’s how I figured the correction factor from the dataset, so it should reconcile out. (Everything does not in fact reconcile precisely, because partial innings were excluded when the RE table was generated.)
When I get home tonight I’ll look at excluding the foul errors – you’re right that they aren’t really events for the purposes of run expectancy. I don’t think it’ll change the overall thrust of the numbers presented, however.
I’m thinking that the foul errors may be the reason my LW do not reconcile to exactly 0. From 1954-2007, the LW sum to 227. Obviously, we can’t exclude foul errors when we initially generate the RE tables since PA_START is the definition we use for Plate Appearances in the RE table. I’ll see if subtracting (EVENT_RUNS_CT + FATE_RUNS_CT) for foul error events from the initial RE tables corrects the problems of the LW not reconciling.
If you are excluding partial innings when you generate your RE tables, your LWTS won’t reconcile when applied to data that includes those partial innings.
Excluding foul errors is an interesting idea, though. Something else to try when I get home.
(For those of you on RetroSQL – having partitioned my events and games table makes it a lot easier to play around with these sort of things and test different ideas. I highly recommend it.
For those who aren’t a member of RetroSQL, and who do work like this, by all means join!)
Let me see if I can’t correct a common misconception about Linear Weights. Even if you exclude partial innings and the home half of the ninth and later innings, they do not reconcile exactly to zero (they come very close to reconciling). There are no games with partial innings in my LW database. If the LW are applied to only those events in my LW database, they do not sum to exactly zero. I believe this has something to do with the RE chart overstating Runs to the End of the Inning (REOI) for the bases-empty, zero-out state.
I could be wrong though (See post #25). Maybe Colin could verify whether his LW reconcile or don’t reconcile when excluding partial innings and the home half of the ninth and later innings.