Vindicating Derek Jeter’s fielding at short (sorta)

Introducing OPA! OPA! is my new (still in the works) fielding system for use with Retrosheet, one that I’ve been meaning to create for a while now. Last week, I teased the beginnings of OPA!, at least the ground ball part. This week brings a fuller exploration of the ways in which we can rate infield play without the benefit of knowing where the ball went.
First, the framework. You may be wondering what OPA! stands for. Other than my goal of making it the most festively-named fielding system out there (next time you go to a Greek wedding, they won’t be shouting UZR! or FRAA!), OPA! is short for OPAAA, or out probability added above average. Consider a ground ball. Any ground ball will do. The infielder’s job is to turn it into an out. He can either succeed or fail at this job, but several things must happen in order for him to succeed. He must have:

  • Good range: he has to get himself and his glove in the neighborhood of the ball
  • Good hands: he has to actually get the ball into his glove
  • Good arm: he has to then throw the ball to first (or second?) and put it somewhere in the neighborhood of the first baseman’s glove
  • The first baseman has to catch the ball

All of these things must happen in order for a ground ball to become a ground out. One of the major problems that I see with some of the major fielding systems is that they treat all of these as one giant package. Either the play was made or it was not. Sure, the point of the game is to make the play, but let’s think about the following situations. A ground ball to short where the SS gets to the ball, fields it cleanly, makes a throw right to the first baseman… who drops the ball. Sure, the 1B will pick up an error for his efforts, but the play not being completed, the SS gets no credit when he did everything right!
One of the things that spawned the new generation of fielding stats was an understanding that fielding percentage, indeed, the entire concept of an “error” was flawed. An error means that the fielder did something right, namely that he got to the ball. Yes, he booted it, but we don’t have a debit for those guys who are too slow to even get to the ball to begin with. So, an error actually penalizes one of the skills that you hope a player has. But, the type of error given (fielding, throwing) does tell us where things went wrong. It’s time to develop that line of logic more fully.
The average ground ball hit somewhere on the third base side of the infield has an X% chance of being turned into an out. We can play with the parameters around pitcher handedness and batter handedness and, if I had more detailed data, hit location, but there will be some number that emerges. The very act of the fielder ranging to the ball and at least stopping it from going to the outfield adds some additional percentage chance that the ball will become an out. Letting the ball through destroys what chance there was to make an out. (I’m sure most of you have figured out by this point, but if anyone’s still lagging, I’m basing this model on the idea of WPA.) If the third baseman makes the play, we ought to credit him with the out probability he adds based on his range. If the ball goes through to left field, we should assign the 3B some blame, along with the shortstop. How to chop up that blame was neatly explored last week.
But, now let’s take a look at what happens if the third baseman gets to the ball (range), but boots it (hands). He’ll be charged with a fielding error, and the out probability that he built up by getting to the ball is now gone. To more accurately reflect what happened though, we can put his range OPA in the “range” basket and debit his “hands” basket. (And if the first baseman drops the ball, we can debit his “hands” basket, while leaving the third baseman’s contributions alone.) Now, we have a much more fine-grained idea of where a player’s strengths and weaknesses are.
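The basket bookkeeping described above can be sketched in a few lines. This is only an illustration under stated assumptions: the out probabilities in `P_OUT` are made-up placeholders rather than 2007 values, and the names `STATES`, `SKILL_FOR_STATE` and `credit_play` are hypothetical, not part of the actual OPA! code. When a ball gets through, the resulting range debit would still have to be split among the nearby fielders, which this sketch does not attempt.

```python
# Hypothetical out probabilities by play state; illustrative only, not 2007 values.
STATES = ["off_bat", "reached", "fielded", "thrown", "caught"]
P_OUT = {"off_bat": 0.40, "reached": 0.85, "fielded": 0.95, "thrown": 0.997, "caught": 1.0}

# Which skill basket is credited (or debited) for arriving at each state.
# "catch" belongs to whoever receives the throw, usually the first baseman.
SKILL_FOR_STATE = {"reached": "range", "fielded": "hands", "thrown": "arm", "caught": "catch"}


def credit_play(failed_at=None):
    """Return {skill: out probability added} for a single grounder.

    failed_at names the first state the play did NOT reach,
    or is None when the out was recorded.
    """
    ledger = {}
    prev = "off_bat"
    for state in STATES[1:]:
        skill = SKILL_FOR_STATE[state]
        if state == failed_at:
            # The breakdown gives back all the out probability built up so far.
            ledger[skill] = -P_OUT[prev]
            break
        ledger[skill] = P_OUT[state] - P_OUT[prev]
        prev = state
    return ledger


print(credit_play())                     # clean play: every basket gets a credit
print(credit_play(failed_at="fielded"))  # booted ball: range credited, hands debited
```

On the booted ball, the range credit stays in the ledger and only the hands basket takes the hit, which is the separation the paragraph above is after.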
That’s the theory. For the numerical spaghetti and some 2007 results (including a few things about Jeter), keep reading.

Here’s the really short version of what it took me two weeks to write the proper code to do:

  1. Take 2007 and isolate all ground balls
  2. Figure out the rates of expected outs by play state (after it leaves the bat, fielder got there, clean pick, good throw, 1B catches) controlling for who fielded it, and batter and pitcher hand.
  3. Create a separate look at double play grounders, in which we isolate the two plays that will hopefully happen, and account for the fact that it’s harder to turn the second leg of a double play.
  4. On each play, code for whether the play was completed with no problems or where the play broke down (ball went through to the OF, it broke down at the “range” stage; fielder was charged with a fielding error, “hands” stage; no error, but the batter reached base OR fielder gets a throwing error, “arm” stage; 1B is charged with an error on the catch, “catch” stage) and if it broke down, who was at fault.
  5. Aggregate it all together, including a total “outs added above average” column.
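Step 2 above can be sketched roughly as follows. This is a guess at the shape of the calculation, not the actual code: the record layout (`bucket`, `last_state`) is a hypothetical stand-in for whatever gets parsed out of the Retrosheet event files, and a play is treated as an out only if it reached the final `caught` state.

```python
from collections import defaultdict

STATES = ["off_bat", "reached", "fielded", "thrown", "caught"]


def expected_out_rates(plays):
    """Estimate the share of plays passing through each state that became outs.

    plays: iterable of dicts with hypothetical keys
      'bucket'     e.g. (position, batter_hand, pitcher_hand)
      'last_state' the furthest state in STATES that the play reached
    Returns {(bucket, state): out rate}.
    """
    seen = defaultdict(int)
    outs = defaultdict(int)
    for play in plays:
        depth = STATES.index(play["last_state"])
        is_out = play["last_state"] == "caught"
        # A play counts toward every state it passed through on the way.
        for state in STATES[: depth + 1]:
            key = (play["bucket"], state)
            seen[key] += 1
            outs[key] += is_out
    return {key: outs[key] / seen[key] for key in seen}


# Tiny made-up sample: four grounders in one bucket.
bucket = ("SS", "R", "R")
sample = [
    {"bucket": bucket, "last_state": "caught"},
    {"bucket": bucket, "last_state": "caught"},
    {"bucket": bucket, "last_state": "off_bat"},   # went through to the outfield
    {"bucket": bucket, "last_state": "reached"},   # fielder got there but booted it
]
rates = expected_out_rates(sample)
print(rates[(bucket, "off_bat")], rates[(bucket, "reached")])
```

The differences between consecutive states' rates are what would get credited to range, hands, arm and catch on a completed play.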

I have to admit, the code is still in the debugging stage, but I’m getting results that make enough sense to publish. Let’s take a look at some early results for shortstops. To control for the number of chances each player received, I gave him credit for a ball in his area if a) he fielded it or b) if he bore more than a 20% blame on the ball getting through, using the division of responsibility chart from last week. I limited the following to gentlemen who had at least 100 balls hit in their general direction.
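The chance-counting rule in that paragraph might look something like this. The blame shares would come from last week's division of responsibility chart, which is not reproduced here, so the numbers in the usage lines are invented; the function name and arguments are likewise hypothetical. The sketch also assumes that a ball fielded by someone else is not a chance for this player.

```python
BLAME_THRESHOLD = 0.20  # "more than a 20% blame", per the rule above


def counts_as_chance(player_pos, fielded_by, blame_shares):
    """Decide whether a grounder counts as a ball in the player's area.

    fielded_by:   position that fielded the ball, or None if it went through
    blame_shares: {position: share of blame} for a ball that got through
    """
    if fielded_by == player_pos:
        return True
    if fielded_by is None:
        return blame_shares.get(player_pos, 0.0) > BLAME_THRESHOLD
    return False


print(counts_as_chance("SS", "SS", {}))                        # he fielded it
print(counts_as_chance("SS", None, {"SS": 0.6, "3B": 0.4}))    # through, mostly his ball
print(counts_as_chance("SS", None, {"SS": 0.1, "2B": 0.9}))    # through, not really his
```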
In 2007, the best OPA! per ball among shortstops in the “range” category:

  1. Adam Everett: .0281 outs added per grounder above average
  2. John McDonald: .0243
  3. Eric Bruntlett: .0197
  4. Troy Tulowitzki: .0168
  5. Bobby Crosby: .0138

The bottom five in range:

  1. Mark Loretta: -.0207
  2. Nick Punto: -.0231 (wow, he couldn’t do anything right last year)
  3. Jeff Keppinger: -.0275
  4. Aaron Miles: -.0303
  5. Derek Jeter: -.0319

I promise that I come not to bury Jeter, but to praise him (a little). However, he had the worst range of any Major League shortstop in 2007. In fact, in raw numbers, Jeter had an OPA! on range of -20.55, which was more than double that of the second worst gentleman, Hanley Ramirez (-9.64). The problem of course is that getting there is half the fun… and a good chunk of the play (so a lot of the out probability). A shortstop with no range isn’t a very good shortstop.
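As a rough sanity check on how the per-ball and raw figures relate, Jeter's two range numbers can be divided to back out his workload. This assumes the raw total is simply the per-ball rate times the number of balls in his area, and both published figures are rounded, so the result is approximate. The helper name is hypothetical.

```python
def implied_chances(raw_opa, opa_per_ball):
    """Back out the number of balls from a raw total and a per-ball rate."""
    return round(raw_opa / opa_per_ball)


# Jeter's 2007 range figures quoted above: -20.55 raw, -.0319 per ball.
print(implied_chances(-20.55, -0.0319))  # roughly 644 balls
```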
But let’s take a look at some of the other skills that a shortstop can possess, such as picking the ball cleanly once he’s gotten to the ball (“hands”). The top 5 and bottom 5 from 2007:

  1. Troy Tulowitzki: .0209 outs added per grounder above average
  2. Marco Scutaro: .0183
  3. Felipe Lopez: .0181
  4. Bobby Crosby: .0177
  5. Jeff Keppinger: .0166

Bottom 5:

  1. Brendan Harris: -.0161
  2. Jack Wilson: -.0181
  3. Aaron Miles: -.0212
  4. Adam Everett: -.0258 (yes, this is the bottom 5 list)
  5. Ben Zobrist: -.0265

Adam Everett, near the bottom? One concern that I have about this fielding system is the matter of a “range penalty.” We saw above that Everett had the best range in baseball at short last year (before he got hurt). He probably got to balls that no other shortstop would have. But some of those probably tested the very limits of his range and he had to kinda stab at the ball to even have a chance to pick it. In doing so, he might have looked like he “booted” a few ground balls. Again, those errors are the product of his spectacular range. Ideally, I’d like to factor this “range penalty” into my system, and correct for it. Not quite there yet.
Moving on to arms. Let’s look at which shortstops excelled on fielding a ground ball and then making their first throw. In some cases, that would be to second (on a double play ball) or to first (on just about everything else… yeah, there’s the occasional throw home to nab the guy there and those are factored in.)
Top 5:

  1. Ryan Theriot: .0355
  2. Tony Pena: .0311
  3. Omar Vizquel: .0290
  4. Adam Everett: .0258
  5. John McDonald: .0234

Bottom 5:

  1. Aaron Miles: -.0181
  2. Eric Bruntlett: -.0190
  3. Felipe Lopez: -.0265
  4. Ben Zobrist: -.0366
  5. Brendan Harris: -.0477

Bruntlett might be another range penalty guy in that here, a player would be debited if he gets to the ball, fields it cleanly, and either “puts it in his pocket” or throws but not in time. Sure, that could be a sign of a bad arm, but suppose that a shortstop has a habit of ranging deep into the hole and stopping balls from going into left field. I couldn’t make that play either.
What about turning the double play? That second throw on a double play, particularly coming across the bag for a shortstop, is a recipe for a collision with a runner who just might be bent on taking him out. In fact, I’ve shown it’s the one “gritty” and “hard nosed” play that players seem to have any skill for. It’s also one with a lower out expectancy, but once the ball is in the hand of the shortstop at second base and he makes the phantom tag of the base, there’s a certain number of plays at first we would expect him to complete on average. Prorated for number of potential DPs to complete (so, there’s already been at least a something-6 forceout, even if it’s 6 unassisted), the number of
The top 5 of 2007:

  1. Tony Pena: .1767 outs above average per double play ball
  2. Yuniel Escobar: .1634
  3. Ryan Theriot: .1555
  4. Ben Zobrist: .1536
  5. Cesar Izturis: .1340

The bottom 5 of 2007:

  1. Felipe Lopez: -.0559
  2. Julio Lugo: -.0559
  3. Mark Loretta: -.0559 (yeah, all three the same)
  4. David Eckstein: -.0726 (but he’s so “hard nosed!”)
  5. Miguel Tejada: -.0908

The fact that Zobrist was among the worst in making the first throw and then among the best in making the second throw was weird enough that I looked into whether there was any correlation between the two skills. The correlation was .514, which is decently strong. Maybe just a fluke.
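For anyone who wants to run this kind of check, a plain Pearson correlation is easy to compute by hand. The sketch below assumes that is the statistic meant here, which the post does not actually say. It uses only the four shortstops who happen to appear on both the arm and double play lists above, so it will not reproduce the .514 figure, which came from the full set of qualifiers.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# (arm rate, double play rate) for players listed in both sections above.
both_lists = {
    "Pena": (0.0311, 0.1767),
    "Theriot": (0.0355, 0.1555),
    "Zobrist": (-0.0366, 0.1536),
    "Lopez": (-0.0265, -0.0559),
}
arm = [pair[0] for pair in both_lists.values()]
dp = [pair[1] for pair in both_lists.values()]
print(round(pearson(arm, dp), 3))
```

With only four players, all drawn from the extremes of the lists, the printed value says very little; the point is just the mechanics.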
Finally, what about catching the ball on a throw? The shortstop doesn’t take as many throws as the first baseman (obviously), but he does take a few, particularly on the 4-6-3 and 1-6-3 double plays. Catching the ball might not seem like much, because it’s rare that a fielder actually fails to catch a ball thrown at him, but one dropped ball could destroy a lot of built up out probability.
The shortstops who do the best catching a throw:

  1. Eric Bruntlett: .0104 OPAAA per catch opportunity
  2. Royce Clayton: .0097
  3. Jeff Keppinger: .0092
  4. Jack Wilson: .0089
  5. Michael Young: .0088

The bottom five:

  1. Felipe Lopez: -.0136
  2. Ryan Theriot: -.0158
  3. Jason Bartlett: -.0198
  4. Yuniel Escobar: -.0368
  5. Cristian Guzman: -.0507 (yikes!)

So, what of Jeter? You might notice that he avoided all of the bottom 5 lists after range. In fact, had I extended some of the top 5 lists to ten entrants, Jeter would have shown up: he ranks ninth overall in arm, ninth in turning the double play, and eighth in receiving throws. So, there are some places where he’s above average, particularly around throwing and catching. It’s fielding that’s his problem. But all told, was he the worst fielding shortstop in baseball in 2007? Here’s the top 5 and bottom 5 of cumulative OPA! divided by grounders for which the player had some semblance of responsibility. Starting at the top:

  1. John McDonald: .0671
  2. Troy Tulowitzki: .0470
  3. Tony Pena: .0467
  4. Omar Vizquel: .0427
  5. Adam Everett: .0332

And the bottom:

  1. Hanley Ramirez: -.0463 (poetic)
  2. Jack Wilson: -.0524
  3. Brendan Harris: -.0707
  4. Aaron Miles: -.0714
  5. Ben Zobrist: -.0758

Derek Jeter ranked #31 last year out of 43 qualifiers. Certainly he’s not a Gold Glove defender (according to the numbers), but he’s also not the worst of the worst. Finally, just to make sure, we need to remember that while Zobrist, Miles, and Wilson were awful ground ball fielders at shortstop last year, they were only part-timers with none of them logging more than 250 grounders. To some extent McDonald (can’t hit) and Everett (injured, can’t really hit either) had a similar story on the other side. Ramirez was a full-time shortstop, and so he did a lot more to damage his team by being so bad and yet so immovable from that spot between second and third. Who helped his team the most/did the most damage to his team over the course of the year?
Top 5:

  1. Troy Tulowitzki: 36.18 OPA! (also got more GB his way than anyone else in baseball… think the Rockies have a strategy?)
  2. Tony Pena: 29.04
  3. Omar Vizquel: 26.38
  4. John McDonald: 25.44
  5. Jose Reyes: 19.93

Bottom 5:

  1. David Eckstein: -11.76
  2. Michael Young: -15.50
  3. Carlos Guillen: -19.26
  4. Brendan Harris: -25.33
  5. Hanley Ramirez: -29.65

Jeter, for the curious, came in at -8.37, good for 35th place out of 43 qualifying shortstops. Derek Jeter cost the Yankees about eight ground ball outs last year compared to an average shortstop. So, all this talk about Jeter being the absolute worst in baseball doesn’t hold in my system. He’s 9th from the bottom. Consider him vindicated.
Now, I haven’t (yet) looked at liners or pop ups, and I suppose that could tilt the balance in another direction, but I have one more question for those who have been paying close attention. Last year, the two shortstop Gold Gloves went to Jimmy Rollins and Orlando Cabrera. You’ll notice that they haven’t been mentioned at all in this article until the last sentence.
Rollins ranked 18th in range, 16th in hands, 11th in arm, 13th in turning DPs, 29th in catching throws, 12th in OPA! per ball and 8th in cumulative OPA! Not bad numbers by any means, and he was usually above average, but he was middle of the pack. (What else Troy Tulowitzki had to do, I don’t know…) Cabrera was 27th in range, 23rd in hands, 24th in arm, 18th in turning DPs, 17th in catching throws, 20th in OPA! per ball, and 22nd in cumulative OPA! Same story. Tony Pena, and perhaps John McDonald, even though he was a part-timer, deserved better.
As with anything I do in Sabermetrics or life, it is not perfect and it is a work in progress. I would, of course, appreciate feedback using the lovely comment button below.

23 Responses to Vindicating Derek Jeter’s fielding at short (sorta)

  1. jinaz says:

    Well, I see you talked about this stuff a fair bit in your previous column. Dang real life got in the way of my statspeak reading, and I’m just now catching up. :) Still, some of those questions might still be relevant… -j

  2. Eric Seidman says:

    It seems like you tackle the range penalty issue quite a bit here. Have you thought about weighting the range part a bit higher to take this into account? Or determining the order of importance of all of these instead of just range and then weighting them along those lines? That might get rid of your problem.
    I’d rather have a guy that can get to a ton of balls as opposed to one who would have a higher rating/percentage of sorts but with much less range.
    Stabbing at the ball or slightly booting it can be the difference between runs scoring as well. With a runner on second, a grounder to leftfield could score him, whereas excellent range could keep the ball in the infield, even if the SS doesn’t end up coming up to throw (or even ends up slightly booting it).

  3. Pizza Cutter says:

    I suppose the weights can come from getting some sort of run value on each part of the process.
    The problem with the range penalty is that all my system sees is that the ball was handled by the SS. It doesn’t take into account that a ball hit right to the SS has a different out expectancy than a ball in the hole.

  4. ekogan says:

    I’m surprised that the hands and arm scores are in the same ballpark as the range score. I thought that catching and throwing errors were much rarer than an infielder getting to/failing to get to a ground ball

  5. Eric Seidman says:

    Yeah, I just feel like there needs to be some type of inverse relationship, perhaps, between balls not gotten to by shortstops (and therefore not bobbled) and balls gotten to by others but bobbled/not made.
    Have these outs/run expectancies been tackled elsewhere or incorporated into other systems? It would seem that would be mecca-important in terms of fixing the range penalty. For instance, we could compare the expected outcomes of all balls in the edges of certain player’s ranges against actual results.
    Perhaps at the outer edges of all players with similar ranges, Everett had a better percentage of conversion, though measuring him relative to everyone outside of his peer range would make him look worse in certain aspects.

  6. Pizza Cutter says:

    ekogan, remember that these are all “above expectation”. It is rare for a fielder to throw a ball away, but it’s fairly devastating to the out probability when he does. e.g., when a ball is on its way to the 1B’s glove, the out expectancy is something like 99.7%. If the 1B catches it, he gets .3% worth of credit. If he drops it though, he gives back all of that 99.7%. Thankfully for most 1B’s, they catch most of the balls thrown their way.

  7. jinaz says:

    You’ll have to forgive me because I haven’t completely digested your system yet. But how does this system compare to other retrosheet-based systems like that of Sean Smith’s TotalZone (first discussed on this site when he wrote here) or Dan Fox (SFR)?
    Is the primary difference that you’re breaking down fielding events with a bit more granularity into range, hands, and arm (which I love–the Jeff Keppinger stuff is dead-on in line with scouting reports on him), and then combining them again to produce a composite picture of the player? How would you expect it to differ in overall results from those two systems, if at all (which are pretty similar to one another)? Given that you’re using more event types (and thus smaller sample sizes), do you think that it’s likely to be less stable from year to year than TotalZone or SFR?
    Sorry, I’m sort of a fielding stat junky and I like to think about strengths and merits of different systems. I find these retrosheet systems to be really fascinating because they have the potential to allow simple folks like me to do fielding research without relying on BIS or STATS inc for hit location data.
    -j

  8. Have you considered taking a look at the 3rd basemen these shortstops play alongside to see if the 3rd baseman’s range impacts balls a shortstop would get to? I’m guessing it wouldn’t matter a whole lot, but in the case of a player playing alongside a very gifted 3rd baseman, perhaps it takes away enough plays for a SS’s range to be affected.
    Also, what is the correlation between Adam Everett stabbing at balls at the edges of his range window vs. Jeff Keppinger stabbing at balls at the edges of his range window. The greater your window, the less likely you are to field the balls in the outer reaches of that window, however, is there a normal value that allows us to know that Keppinger actually might have better hands than Everett, especially when attempting to play a ball outside of comfortable for either player?

  9. Pizza Cutter says:

    Justin, I’m taking a lot of my cue from Dan Fox’s SFR and Sean’s TotalZone. My guess is that the composite results won’t be all that different, and in that respect, I’m re-inventing the wheel. You’re right in that my hope is that the improvement is breaking things down a little more fine-grained-ly at least into component skills. One study I’d like to do is to see which skills translate from position to position, and maybe identify some players who would benefit from a position change. And I will be doing stability studies soon enough. This is a big huge project that I will be looking at for the next few weeks.
    Spitting, that’s the problem with a lot of fielding measures. De-tangling the interactive nature of fielding is going to be hard, no matter what you do. TangoTiger’s “with or without you” framework might be a good starting point to answer that question though.

  10. Yes, I’ve read TangoTiger, and I understand the nature of things. I do, however, think there should be a measurable in your formula that helps you determine the difference between “good hands” and “good hands at the edges of range.” There’s a huge difference between Adam Everett gobbling up what comes to him vs. what he can’t quite handle and another SS who gobbles up most of what is in his area even if he can’t get to balls Everett gets. Everett would help more by getting to balls in the hole when there are runners at first and second, but he’s not doing much good when there’s a runner at third or at simply getting the batter out. If you fine tune you’ll get more detailed results as to when which shortstops matter more.

  11. Pizza Cutter says:

    I actually looked a bit at that topic in the 1993-1998 Retrosheet files that have the hit location data to do it. What I found was that (not shockingly) most balls right at the SS were picked up, but that the reliability from year to year on this percentage was low. This says that any variations from year to year and player to player are largely the result of chance. However, the reliability went up when looking at the zone to the SS’s left and right. Makes sense.
    The part that I struggle with (and the part that dooms all Retrosheet-based fielding systems) is that we don’t have good hit location on where the ball went in the recent RS data. It also seems like you’re arguing for possibly controlling for the game state (at least for where the runners are) probably in some sort of run-expectancy way. That’s an interesting idea…

  12. jinaz says:

    @PC, If you do look into position changes, I’d be interested to see how it relates to what the fan’s scouting report data would suggest on the same issue.
    Looking forward to seeing this thing develop! -j

  13. Yes, game state or runner positioning is what I’m talking about. It’s great that Adam Everett gets to balls in the hole, and it probably gives him a chance to create more outs overall, but where runners are positioned makes a difference as to the possibility that he makes more outs and more importantly prevents more runs than other shortstops. Like you said, there’s likely not enough data on where the ball went, but I think this is critical in coming up with who is the better defensive player.
    Good luck…

  14. Carlos Rubi says:

    It’s amazing how leaders in OPA! and bottom-feeders, as well, turned out to be the guys we’d expect to see in such a list.
    Great work.

  15. joe arthur says:

    One thing which may distort your ability to isolate components is positioning.
    As a thought experiment, strong armed shortstops “should” play deeper than average to take advantage of their arms [ in theory; consider that if they played shallower than average, the ball would get to them quicker; and assuming they got to the ball, with a shorter throw and more time to make it, they would lose the advantage of their arm strength.
    As I understand your explanation of OPA!, a shortstop with average range and a strong arm in reality, by playing deep would “reach” more balls, preventing more outfield singles, but not necessarily throwing out more batters. You would measure his above average skill as “range” rather than arm.
    In more specialized situations (infield in; double play depth?) there might be less variation in positioning, so that you could isolate the skill better, but of course much smaller samples.
    Is there anything interesting to say about grounders which are turned into outs without a throw?

  16. Pizza Cutter says:

    OPA! does look at double play-possible-GBs a little differently, although, of course, I don’t know if the fielders were positioned differently.
    Grounders without a throw (3 unassisted?) are counted as the first baseman fielding the ball and oddly enough, throwing to himself. It’s a bit of a flaw in the system that I’m not sure how to correct.

  17. joe arthur says:

    Since the article is about shortstops, I was thinking about shortstops fielding a grounder and stepping on 2nd. Same thing as your example with first basemen, where sometimes they have an option between making the toss to the pitcher covering and taking it to the bag themselves.
    Interesting point there – wouldn’t you want to separate the throw on a 3-1 play as a different skill than throws to other bases? It’s often underhanded, and while accuracy still matters, timing rather than velocity is the key skill.
    Just to restate my initial point, players should position themselves in some kind of equilibrium which takes advantage of their relative strengths and weaknesses (for arm and range at least) so that you cannot infer from retrosheet data what the underlying skills are.
    But maybe I am wrong and you usually will be able to isolate the underlying skills. It’s a practical question, and it’s worth examining.

  18. joe arthur says:

    And if you are able to isolate the underlying skills, it may be from a counterintuitive consequence that true range aside, strong armed shortstops have a surplus of infield hits [get to more balls] , just because they play deeper.

  19. Pizza Cutter says:

    Joe, good points all around. Now to figure out how to correct for this…

  20. Some really interesting stuff here. People usually lose me with these kind of stats, but I have to say I understood every word of this and found it to be pretty cool.
    I think other metrics have supported your point that Jeter has below average range, especially going up the middle to his left.

  21. Xeifrank says:

    One thought I came up with after reading this article, is that the same weight is given to a thirdbaseman who fields an easy ground ball and makes the out at firstbase to a diving stop on a hard hit ball who also records the out at firstbase. I think there needs to be some distinction to the level of difficulty of the play. Does the OPA! system do this? If not, are there other defensive rating systems that take level of difficulty into account? I like that approach.
    vr, Xeifrank

  22. Pizza Cutter says:

    Xeifrank, if there’s one flaw in the system, that’s it, and if Retrosheet had more detailed data, I would so do it. I have to settle for figuring out the average value, whether it was a two-hopper right to the 3B or an amazing running/diving/leaping play. Not ideal, but… free.

  23. [...] Statistically Speaking | MVN – a statistical and sabermetric baseball blog Blog Archive Vindic… Another fielding stat. Renteria doesn’t show up on the top or bottom lists, leaving him somewhere in the middle. [...]
