The Visual WPA Project
April 10, 2008
Across the statistical spectrum a major debate has raged for quite some time: the statistical analysts vs. the scouts. Each side thinks the other is wrong and bases its decisions on faulty methods. Though a small portion of each side embraces the other, the large majority does not. For this very reason WPA (Win Probability Added) has taken some heat from those against baseball statistical analysis.
Essentially, WPA tracks the contributions of an individual to the win or loss of his team. It adds up the changes in Win Expectancy on the plays a player is involved in to determine how much, in percentage terms, he helped or hindered his team’s chances of winning.
To find the Win Expectancy of any game state, look in the Toolshed section of The Book or visit Christopher Shea’s Win Expectancy Finder online. For a great article on WPA, read Studes’ “The One About Win Probability.”
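For readers who think better in code, the bookkeeping behind WPA can be sketched roughly as follows. This is only an illustration under simple assumptions: the player names, the `plays` list, and the `tally_wpa` helper are hypothetical, the Win Expectancy values are made up rather than looked up, and each play's full swing is assigned to the batter and the pitcher alone.

```python
from collections import defaultdict

# Hypothetical play log from the batting team's point of view:
# (batter, pitcher, Win Expectancy before the play, Win Expectancy after).
# These numbers are invented for illustration, not taken from a real table.
plays = [
    ("Batter A", "Pitcher X", 0.50, 0.54),
    ("Batter B", "Pitcher X", 0.54, 0.51),
    ("Batter A", "Pitcher X", 0.51, 0.60),
]

def tally_wpa(plays):
    """Sum each player's Win Expectancy swings over a list of plays."""
    totals = defaultdict(float)
    for batter, pitcher, we_before, we_after in plays:
        delta = we_after - we_before
        totals[batter] += delta    # batter is credited with the swing
        totals[pitcher] -= delta   # pitcher is debited the same amount
    return dict(totals)

wpa = tally_wpa(plays)
for player, total in wpa.items():
    print(f"{player}: {total:+.2f}")
```

Because every credit to a batter is matched by an equal debit to the pitcher, the totals always sum to zero. That all-or-nothing assignment is exactly the part a viewer's judgment could refine.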
When I described WPA to a friend of mine, one on neither side of the analysis war, he responded with: “Yeah, but they’re human. Certain numbers like that cannot track true effort.” While I disagree that a number cannot accurately track effort, I do feel the current WPA could be improved to track even more of it, or to divvy up the credit in a way that takes these more human qualities into account. His comment made me wonder what would happen were we to combine our intuitive scouting as fans with a statistic like WPA: would the results be so different from what we currently have?
If those in opposition to numbers really feel that the human aspects of the game make such a drastic difference that an anarchic overthrow of WPA is necessary, then it seems like a good idea to test that theory. In conducting a study like this we would basically be measuring, with a stringent set of criteria, certain game aspects previously deemed immeasurable.
TangoTiger helped me harness this idea when it turned out I was preparing to write an article he had already written. He informed me that his thoughts echoed those of my friend: the numbers might be improved by combining percentages with intuitive scouting. The Visual WPA would work much like the current statistic, only there would be certain plays or situations to which we could apply our opinion of effort or contribution.
In May 2007, the Phillies were playing the Marlins and Rod Barajas made an absolutely boneheaded play. Hanley Ramirez was rounding third base and Pat Burrell’s throw creamed Ramirez in the race; by the time Barajas had the ball, Ramirez was still one-fourth of the way from home plate. Despite this, Barajas, for whatever reason, did not block the plate or attempt a tag until Ramirez slid. Ramirez ended up being safe. In terms of WPA, pitcher Brett Myers was debited the full amount, but Barajas clearly deserves some of the blame. This is an example of a situation in which only watching the game reveals who deserves the credit or debit.
Though the Barajas situation falls into the category of separating fielders from pitchers in the contribution department, there are also the ever-so-frequent non-error errors. We’ve all seen these plays, wherein a fielder should be able to get to a ball but it gets through the infield or drops in the outfield. No error is charged on these plays, yet we know the play should have been made. Why should the pitcher be fully debited for allowing a single when we intuitively understand that the play should have been made?
Other examples where a Visual WPA would benefit us are:
- Runner on first legging it out to third on a single, or scoring on a double when we intuitively feel he has no shot.
- A pitcher with a slow wind-up, not the catcher, should be debited on an SB, whereas someone like Roy Oswalt (fast wind-up) would have more of an effect on a runner trying to steal.
- Separating a bad judgment or decision by a 3B Coach from the runner thrown out at a base he perhaps had no legit shot at reaching.
- We’ve all seen examples of Harry Kalas’s famous line: “Right down the middle for a ball.” If a pitcher strikes a batter out but the ump fails to call the strike, we should debit the umpire a bit because he incorrectly lengthened the inning.
And these are just a few of the examples of situations that would benefit from intuitive scouting.
Separating the contributions between batter/runner and fielder/pitcher has been studied before but I am proposing a use of our own intuition as fans in order to make these separations instead of a concrete set of numbers or measurable criteria. For instance, a runner legging it out to third on a single when we normally think he would have to stop at second base would be left to our intuition. We would be using our knowledge of the runner, the position of the outfielder, the throwing arm of the outfielder, and the importance of the situation in order to make our judgment.
If David Ortiz legs it out to third base in the first inning of a 0-0 game, we may be inclined to split the WPA between him and the hitter by giving Ortiz 1/3 of the increase and the hitter 2/3. If the same situation occurs in the bottom of the 9th in a game in which the Red Sox trail 2-1, we may be much more inclined to give Ortiz 2/3 and the hitter 1/3. This allows us to separate contributions based on how we feel and our own scouting abilities. Essentially, it lets us determine whether scouting and the more human/gutsy/Eckstein-esque plays really affect what the statistics tell us.
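The Ortiz split can be written down as a small sketch. The `split_wpa` helper is hypothetical, and the Win Expectancy swings of 0.06 and 0.30 are invented purely for illustration; only the 1/3 and 2/3 shares come from the example above. Exact fractions are used so that the check that the shares add up to the whole play does not trip over rounding.

```python
from fractions import Fraction

def split_wpa(delta, shares):
    """Divide one play's Win Expectancy swing among the players involved.

    `shares` maps each player to the fraction of the swing the viewer
    assigns him; the fractions must sum to 1 so no credit is created or lost.
    """
    if sum(shares.values()) != 1:
        raise ValueError("shares must sum to 1")
    return {player: float(share * delta) for player, share in shares.items()}

# First inning, 0-0 game: hypothetical swing of +0.06, Ortiz gets 1/3.
early = split_wpa(0.06, {"Ortiz": Fraction(1, 3), "Hitter": Fraction(2, 3)})

# Bottom of the 9th, trailing 2-1: hypothetical swing of +0.30, Ortiz gets 2/3.
late = split_wpa(0.30, {"Ortiz": Fraction(2, 3), "Hitter": Fraction(1, 3)})

print(early)
print(late)
```

The same helper would cover the Barajas play: the debit on the run could be shared between pitcher and catcher in whatever proportion the viewer judges fair, while the play's total swing stays the same as in standard WPA.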
Opinions vs. Measures
One of the big gripes here is that an opinion of mine with regard to a runner legging out an extra base may be completely different from someone else’s. This, however, is the beauty of baseball and how scouting works; scouts will differ in opinions on the same player. Though statistics are generally immobile, scouting and intuition can shift. It is scary to present a concept like this, one that combines fact and opinion, but the idea is to see if we will truly produce different results, or at least results different enough to show that the immeasurable human aspects of the game really do make certain numbers less useful.
Conclusion: A Call to Arms
I am going to test this out with a three-game series in order to compare the differences, and if anybody would like to help by conducting their own three-game test, please e-mail me. If we can get a bunch of series logged, and if we see there are potentially big differences, there may be good reason to try this out over an extended period of time. Ultimately, if the results are deemed significantly different, then we can say that these game aspects, evident only by watching a game, truly make a difference. At the very least we will be garnering a more accurate version of an already accurate statistic. I will post my initial results of the three-game series next week.