Some homemade stats from 2007

Over the past year, I’ve introduced a few stats looking at different aspects of the game, but was always hamstrung about presenting the 2007 data.  These stats require a Retrosheet database, and that doesn’t appear until after the season is done plus a few weeks (how do they turn it around that fast!?!?).  So, I’ve had to be patient and stick with 2006 data.  Now, the 2007 file is out and ready to go.
I’ve gone through the specific processes for calculating these stats in previous posts and I’ve provided links for the curious.  I’ve also included links to Excel files in Google Docs that list everyone who qualifies, identified by Retrosheet ID.
Plate discipline
Original article
(also, the extended article [PDF file] in By the Numbers, “Is Walk the Opposite of Strikeout”)
Quick synopsis: These are stats based on signal detection theory around a player’s management of the strike zone.  I wrote this article just before the whole Pitch f/x thing was revealed to the world and people really started digging into that.  The two main stats are sensitivity, which is a measure of how good a player is at avoiding strikes (both by taking balls and putting pitches into play… what happens to those balls in play is another issue altogether), and response bias, which is a measure of how likely a player is to swing.  Walks were more correlated with response bias (fewer swings led to more walks) and strikeouts were more correlated with sensitivity.  I’ve also included swing percentage and contact percentage (when swinging) in that file, because they’re used in the calculations for the numbers.  I also included pitches/PA because it was 3 extra buttons to push.
Higher sensitivity means that a player avoids strikes.  Higher numbers are better.  Higher levels of response bias mean a player is more likely to swing at a given pitch.  The optimal level is 1.0, and the further away a player gets from 1.0 in either direction, the worse the outcome.  For example, a player with a 1.05 rating swings too much, and a player with a .95 rating should swing more.
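For readers who haven’t met signal detection theory before, here is a minimal sketch of the textbook equal-variance versions of the two measures (usually written d' and beta).  This is not necessarily the exact calculation from the original article: the function name and the two generic input rates are my own illustration, and the direction of the bias scale depends on which response gets coded as a "yes," so it need not match the direction in the data file.  What does carry over is that a bias of 1.0 is the neutral point.

```python
from math import exp
from statistics import NormalDist


def sdt_measures(hit_rate, false_alarm_rate):
    """Textbook equal-variance SDT measures (hypothetical helper).

    Both rates must be strictly between 0 and 1, otherwise the
    inverse normal CDF is undefined and raises an error.
    Returns (d_prime, beta).
    """
    z = NormalDist().inv_cdf            # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa              # sensitivity: separation of the two distributions
    beta = exp((z_fa ** 2 - z_hit ** 2) / 2)  # likelihood-ratio bias; 1.0 is neutral
    return d_prime, beta


# Example with made-up rates: symmetric rates give a neutral bias of 1.0
d_prime, beta = sdt_measures(0.8, 0.2)
print(round(d_prime, 3), round(beta, 3))
```

Keeping the two numbers separate is the point of the approach: one captures how well the two kinds of events are told apart, the other how readily the response is made at all.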
Notable leaders: Once again, Vlad Guerrero leads the league in sensitivity and also has the highest response bias.  The man swings at everything, and most of the time, hits it.  The rest of the top five in sensitivity: Moises Alou, Scott Hatteberg, Mike Sweeney, and Brian Giles.  The hitter least likely to swing was Hatteberg.  The player with the worst sensitivity was Ryan Langerhans, followed closely by Jack Cust.
The 2007 data file
Speed Scores
Original article
(also, extended article [again, PDF] in By the Numbers, “Do You Have Any Idea How Fast You Were Going?”)
Quick synopsis: I re-did the classic Bill James speed score, with a little more statistical rigor and using the PBP database more effectively.  Turns out that the James method works just fine.  Mine does have better scale properties, but the James method is much easier to calculate.  Still, I calculated my way.  I base my scores on six categories: SB success rate, percentage of times on first drawing a throw from the pitcher, percentage of times on first in which a steal attempt was made, triple to double ratio (3b / (2b + 3b)), percentage of infield grounders beaten out for hits, and percentage of double plays avoided. As always, the gory details are in the article.  (I did allow for two missing scores, rather than one, when taking my average, this time.)
Notable leaders: Tony Gwynn, Jr. wins the title as the Majors’ fastest man (minimum 100 PA), followed by Carlos Gomez, Michael Bourn, Dave Roberts, and Nyjer Morgan.  Last year’s champion, Ichiro Suzuki, came in 6th.  Ryan Howard and Victor Martinez were the most leadfooted of all players.
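A rough sketch of two pieces of the speed score described above: the triple-to-double ratio, and the averaging step that tolerates up to two missing components.  The function names are hypothetical, and the sketch assumes each component has already been put on a common scale; how that scaling is done is in the article, not here.

```python
def triple_ratio(doubles, triples):
    """3B / (2B + 3B); None when the player has no doubles or triples."""
    total = doubles + triples
    return triples / total if total else None


def average_components(scores, max_missing=2):
    """Average the available component scores.

    `scores` holds one value per category, with None for a missing one.
    Returns None when more than `max_missing` components are missing.
    Assumes the scores are already on a common scale (an assumption here).
    """
    present = [s for s in scores if s is not None]
    if not present or len(scores) - len(present) > max_missing:
        return None
    return sum(present) / len(present)


# Made-up player: two of the six components are missing, so he still gets a score
print(triple_ratio(30, 10))
print(average_components([5, None, 7, None, 6, 6]))
```

Allowing two gaps instead of one lets more part-time players qualify, at the cost of a score built on as few as four categories.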
The 2007 data file
The Best Closers
Original article
Quick synopsis: I took a look at which gentlemen were best at the art of coming in to start an inning with a lead of three or less, getting three outs, and not giving up that lead, no matter what inning it was.  Surprisingly, it wasn’t always the guys who were the closers.
Notable leaders: Five guys (minimum 10 relevant appearances) had perfect records on this task in 2007 (Danny Baez, Pedro Feliciano, Hideki Okajima, Tony Pena Jr., and Chris Ray).  Okajima had the most chances (23), so I guess he gets the title as the best closer in baseball, despite the fact that he wasn’t the closer on his own team.  Joe Borowski, who led the AL in saves, was 52nd of 70 qualifiers at 83%.  The bottom of the barrel was Jorge Julio, Reynel Pinto, Octavio Dotel, LaTroy Hawkins, and Tom Gordon.  As I pointed out in the article, this isn’t a very consistent skill from year to year.
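The task above can be sketched as a simple filter and success rate.  The record layout and field names (`started_inning`, `lead`, `outs`, `kept_lead`) are made up for illustration and are not Retrosheet fields; the real work is in pulling those appearances out of the play-by-play data.

```python
def lead_protection_rate(appearances, min_appearances=10):
    """Share of relevant appearances in which the pitcher did the job.

    Relevant: entered to start an inning with a lead of one to three runs.
    Success: recorded three outs without giving up the lead.
    Returns None for pitchers below the minimum number of relevant chances.
    """
    relevant = [
        a for a in appearances
        if a["started_inning"] and 1 <= a["lead"] <= 3
    ]
    if len(relevant) < min_appearances:
        return None
    successes = sum(1 for a in relevant if a["outs"] >= 3 and a["kept_lead"])
    return successes / len(relevant)


# Made-up pitcher: 9 clean innings, 1 blown lead, plus 1 mid-inning entry that is ignored
good = {"started_inning": True, "lead": 2, "outs": 3, "kept_lead": True}
bad = {"started_inning": True, "lead": 1, "outs": 1, "kept_lead": False}
mid_inning = {"started_inning": False, "lead": 1, "outs": 3, "kept_lead": True}
print(lead_protection_rate([good] * 9 + [bad, mid_inning]))
```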
The 2007 data file
Pickoff moves
Original article (Part I, Part II, Part III, and Part IV)
Quick synopsis: It started as a look at the throw to first more generally, and by Part IV, it turned into a method for evaluating pickoff moves.
Notable leaders: Steve Trachsel wins for best pickoff move, at least in terms of most runs saved (I used average delta-run-expectancy values), followed by Wandy Rodriguez and Justin Germano.  Bringing up the rear were Adam Eaton and Livan Hernandez.
Note that in this file, negative is good.
The 2007 data file
Enjoy.  Feel free to copy, distribute, or otherwise use the data.  I would only ask (on the honor system) that you give me a proper mention.
And from all of us at StatSpeak, may you have a happy, healthy, and gentle 2008.  Unless you’re a Yankee fan.


One Response to Some homemade stats from 2007

  1. Love the stats. Especially the speed. For the closers, I’ve actually been researching something along those lines for a while, involving the breakdown of percentage/effectiveness in 1-run, 2-run, and 3-run saves, separately, as well as mop-up duty.
    As in, looking at their statistics in all 4 areas, and then looking at the percentage that their team outscored other teams in these save situations.
    That way we can tell which closers were the most effective in which areas. Just an example – we could say that CLOSER X this past year posted tremendous numbers in 2-run and 3-run save situations, but average or bad numbers in 1-run saves…. however his team outscored opponents by 2 or 3 in 85% of the save situations, so he was extremely effective in what his team ended up needing.
    That would tell us even though he was not great as a 1-run closer, he was great for his team because they did not have many of those 1-run games and 2-3 run games were fantastic.
    It goes along with my article on pitching effectiveness, where a guy may be very effective for his team however it is hard to compare him to others in other situations.
    I smell one of my next posts..
