World Famous StatSpeak Roundtable: September 10

Wednesday means many things, but here at StatSpeak, we like to think that you associate Wednesday with our world famous roundtable. This week, our special guest is (Doctor!) Justin Inaz, of On Baseball and the Reds, a blog about the second-best team in the state of Ohio. Justin joins the usual four suspects for a discussion of the Astros, Cole Hamels, the best player you’ve never heard of, measuring defense, and the search for the “next Pitch F/X.”
Question #1: Last year, pitch f/x debuted, and the availability of those data spawned research that collectively represented the biggest advance in baseball research of the 2007 season, if not the previous 10 years. As the 2008 season winds down, what do you feel has been the biggest advance in baseball research this season?
Justin Inaz: At least from my perspective, this year has been about projections for the masses, by the masses. First, in March, Tom Tango posted two studies thoroughly investigating the accuracy of different projection systems. Surprisingly, he found that simple little Marcel performed just about as well as every other system, including the much heralded PECOTA. Suddenly, we don’t need to rely on a horrendously complicated black box for projections–we can literally do it ourselves and come out just as well!
Then, mid-season, Sal Baxamusa followed up his series of articles on in-season Marcel projections with the release of a spreadsheet that allowed us to get an up-to-date Marcel projection for the rest of the season for any player in MLB. Suddenly, we went from trying to make an “eyeball” judgement on how much to believe 2008 stats vs. previous year stats to being able to get a good, objective projection based on a combination of this and previous season stats. And now, more recently, we have your own Brian Cartwright working on a more sophisticated version of Marcel that could potentially be updated daily throughout the 2009 season.
This sort of work is changing how the sabr-inclined, yet skill-lacking among us (i.e. people like me) evaluate players during the season. To me, that’s huge–arguments about what we can expect from a player based on in-season performance seem to dominate virtually every message board and blog I read.
Brian Cartwright: Not an advancement yet, but sitting there waiting… (through its GameDay application) has been collecting play by play data for all minor league games since the beginning of 2005. I’m sure places like and use it to do daily updates of their minor league statistics, but there are researchers like us who don’t even know it exists yet. Except for pitch f/x, all the same play by play that we have become accustomed to for the major leagues is also available for the minors, and just waiting for sabermetric analysis…park factors, defense, baserunning, etc.
Colin Wyers: Retrosheet’s 1999 data has, at long last, been published, along with 1954 to 1956. That finally gives us an unbroken stretch of play-by-play accounts (minus a handful of games) that runs for over 50 years. The difference between studying baseball with and without play by play data is really like the difference between biology before and after the invention of the microscope.
Eric Seidman: The biggest advance, from my perspective, is a greater understanding of how certain systems work. There are plenty of projection systems around, and plenty of defensive systems around, but with the discussions of new projection systems, in-season projection systems, and ideas for new defensive metrics, vast dialogues regarding the functionalities are sprouting up all over the place, and I have found so many new people who are now grasping the idea of a true talent level. I also feel the work Victor Wang has done on trade analysis at The Hardball Times has been fantastic in terms of advancing our criterion for evaluating trades.
Pizza Cutter: Well, there hasn’t been a data-stream released that can match Pitch F/X. I have to wonder if pound-for-pound there’s anything that could match the impact of Pitch F/X. There really hasn’t been an earth-shattering “this changes everything” moment in Sabermetrics so far this year. However, I think the biggest untapped pool of data is the stuff hiding in the minor league GameDay files. Our own Brian Cartwright has talked about how he hopes to marry Retrosheet files to GameDay files, which would be awesome. But those minor league files are the next game-changer. Jeff Sackmann is already starting to exploit this at Minor League Splits (and beautifully!), but there’s so much more to be done there.

Question #2: Even with the advanced play by play record keeping available for today’s games, what data that you’d really like to have is still missing that would help us in developing the best in defensive metrics?
Justin Inaz: One that Tom Tango has been pounding on for years and would be really easy to do: stopwatch recordings for time in flight for all fly balls, and time through the infield (or to the fielder) for all ground balls. With those times, plus the hit location data that is already collected, we could finally move away from the terribly subjective “hard, medium, soft” designations and get some accurate measures of how hard balls were struck. And this, in turn, would give us a much better means of classifying how difficult a given ball would be to turn into an out.
Hittrackeronline is already doing it for home runs, and Greg Rybarczyk did it for all balls hit by Torii Hunter and Andruw Jones in last year’s Hardball Times Annual. There’s just no good reason for a company like BIS to not be doing this. Maybe some version of fieldf/x will be able to do this in the near future, though.
Fielder positioning prior to the ball being struck would be another nice item to have, especially when we try to evaluate center fielders. Positioning is an important part of a fielder’s ability to get to balls, and I wouldn’t want it removed from the equation. But it’d be neat to be able to evaluate how important positioning is compared to “true” rangeyness. Do older fielders make up for some lost mobility by better positioning and anticipation? It’d be fun to study that quantitatively.
Brian Cartwright: I’ll go into it more in tomorrow’s column, but it drives me nuts that when looking at play by play, the available public sources don’t specifically list who is the one fielder who had the best chance of fielding the ball for an out. Batted balls are coded by where they are hit, either on a vector or into a zone, but without specifying a responsible fielder (ground single to leftfield) we are left to guess who was closest. These are commonly referred to as “split zones”, those zones in which responsibility is split and thus must be estimated. All that guessing would disappear with something as simple as “ground single, passed short, into leftfield”.
Colin Wyers: Starting positioning of fielders. That would achieve two things simultaneously:

1. Reduce the sample needed to make our defensive metrics say something meaningful.
2. Open up a wide area of new research – what’s the best way to position fielders? How effective are shifts? Does double play depth or playing against the bunt pay off?
Eric Seidman: One of my biggest pet peeves with the Pitch F/X data, if you can even call it that, is that we never know where the catcher sets up. Due to this, it’s nearly impossible to truly use the dataset to track accuracy and missed spots. Along similar lines, if we have no idea where a fielder starts in the field, it’s much tougher to gauge whether or not he should get to a certain ball. With that knowledge, the already great defensive metrics could be even fantastic-er.
Pizza Cutter: If Baseball Info Solutions would release the dataset that they used to make The Fielding Bible, that would be amazing. If we had stop-watch times on balls from off the bat to when they either hit a glove or the ground, that would be even cooler, since we already have decent enough data on where they went. Then we could really get into who has the best range. It would also help to have an idea of how quickly the batter got up the first base line.
Question #3: The Astros have been one of the hottest teams in baseball recently. But their third-order win percentage for this year is still only a measly .481, despite being 10 games over .500 right now. Would the Astros have been better off losing more games in the past week, in the long run?
Justin Inaz: Wait, you’re saying that they’d be better off losing instead of winning? That’s a new one. 🙂
I suppose the answer to this has to do with whether this streak causes the Astros to evaluate themselves differently. It does seem as though they’re in a win-now sort of strategy, though that traces back to this past offseason. The trades for Miguel Tejada and Jose Valverde all but emptied their already weak farm system, making rebuilding a rather difficult proposition. And then there’s the midseason acquisition of Randy Wolf in an effort to “save” their season this past July. This is already a team that seems to have gone all-in, and I don’t really see what choice they have but to make as much of it now as they can. And they are arguably in the hunt for a wild card, though you’re right that their pythagorean record would project that they’re unlikely to beat out their competition for that playoff berth.
If they don’t win this year, the best path for the organization would seem to be blowing it up Oakland-style, trade away Berkman, Oswalt, Wigginton, and Lee (if you can find a taker) and get the best set of mlb-ready talent that you can get. All of them will probably be at their peak value, so if you’re not likely to win in ’09 then this will be the time to deal. But from what I see and hear from Houston’s front office and ownership, they’re going to keep shooting for the title in the short term no matter what…
Brian Cartwright: The trap management has to avoid is thinking the team is better than it actually is. Then they might hesitate to make necessary changes. In the Astros’ case, they made roster moves for a win-now approach in the second half, bringing in older players like Randy Wolf at the expense of younger talent. Whether they are winning or losing now, if it’s clear that the season is lost, I’d think it’s time to get a head start on spring training. Even at 77-67, BPro’s Playoff Odds Report gives the Astros a 2.8% chance of making the postseason. Although losing doesn’t inspire confidence in the fans, it does get the team a better draft position.
Colin Wyers: I think the Astros are going to be next year’s version of this year’s Mariners – the team that doubles down when they should be sitting out a few hands. They have that perfect combination of a team that’s worse than their record would indicate and a club filled with aging veterans on the wrong side of 30.
Eric Seidman: I flip-flop on the Astros. I’m not one to advocate trading the farm in order to just have a winning record in a given year, but I am not a hater of trading the farm to win now if you can legitimately win now. The Astros made these moves to essentially finish 1-2 games out of the playoffs, with a record of something like 86-76. Their run differential is atrocious, meaning they aren’t “real,” and some of these guys won’t be back next year, meaning they will have traded the future for an above .500, somewhat meaningless season, for veterans who won’t be back… meaning they will be without both. The only reason I don’t like the moves they made is because they didn’t realistically understand the assets they had and where they could go. The Brewers were on the cusp of serious playoff contention and acquiring Sabathia made them almost a lock. This is not the case for the Astros.
Pizza Cutter: I don’t know that any team is ever better off for losing, but it would help to keep some delusions of adequacy out of Astros fans’ minds. I did hear some murmuring that the Astros “just might be this year’s Colorado.” While I understand that baseball is a game of hope, I’m enough of a psychologist to realize that false hope is a very dangerous thing. So, maybe they should have lost a few games, if only for the long-term mental health of the Houston area.
Question #4: Who is the best player in baseball right now that the vast majority of fans either haven’t heard of or know nothing about?
Justin Inaz: Assuming fans don’t remember his days with the Pirates, you can make a pretty good argument for Brian Giles. While his power has fluctuated over the years, Giles has never posted a seasonal OBP below 0.361. His career OBP is an astounding 0.403, which means that his career OPS of 0.913 underestimates his value (OPS undervalues OBP). Add to that the fact that he’s been playing since mid-2003 in San Diego, home of one of the most extreme pitcher’s parks in baseball, and you have a pretty impressive career. But it’s not just his career numbers. This year, he’s got a 0.391 OBP in San Diego (PETCO!) as a 37-year-old.
On top of his offensive prowess, Giles is a pretty good defender. In recent years, the Fans’ Scouting Report has ranked him as at least an average defender, which makes him a plus fielder in a corner outfield position. This year, using an average of run translations for ZR and RZR, I have him as a +17 run fielder in right field. And combining both his offensive and defensive value, along with park, league, and position adjustments, my estimates put him as the 12th-most valuable position player in baseball this year. Granted, he’s not exactly a no-name. But how many fans would rank him as among the top-15 position players in baseball?
Brian Cartwright: Chase Utley. I don’t think he gets enough national press, but the guy is a stud. At 29, he’s finishing his fourth straight season of hitting for average, hitting for power, and playing plus defense. If the Pirates played the Phillies as much as they used to, I’d have to hate him, but honestly, except for the other-worldly Albert Pujols, he is the guy I’d most like to have.
Colin Wyers: I don’t know if I’m answering the question or not, but J.J. Hardy is one of those guys that is criminally underrated. On a very good Brewers team, he gets overshadowed by Braun, Fielder and Hart (and the promise, unfulfilled, that Rickie Weeks will someday not suck). He’s a good hitter, and depending on who you ask a good to great defensive shortstop. Shortstops in general are guys who have a tendency to get underrated.
Eric Seidman: Jayson Werth of the Phillies, and it’s a no-doubter for me. The dude doesn’t even qualify for leaderboards on Fangraphs, yet ranks 15th in the NL in the counting stat of WPA/LI at 2.50. Many people forget or never knew he was a first round pick once upon a time ago, and after some injury problems he really seems to have made a name for himself in Philadelphia… except most people outside of Philly don’t know him… or they know him as “that guy who used to be a Dodger.”
Pizza Cutter: When it comes to the gulf between talent and recognition, I still say that Chase Utley is the most under-appreciated player in baseball, but that’s not the question. Who is the best player about whom no one has heard? The two following players are both National League catchers. One about whom you’ve heard a lot, the other whom you couldn’t name. (Stats as of Monday night…)
Player A: .326/.365/.512, 6.98 RC/27
Player B: .290/.368/.509, 6.90 RC/27
Player B is Geovany Soto, who will fight to the death with teammate Kosuke Fukudome for the NL Rookie of the Year Award and about whom everyone talks because he plays in Chicago. No problem there… Soto has had a very very good year and deserves some press. But why is it that everyone knows about him and not about Player A, a guy who would likely give you the same basic production? His name: Ryan Doumit.
Question #5: This past Sunday, Charlie Manuel moved Cole Hamels up to start against Johan Santana and the Mets, over Kyle Kendrick. (It didn’t work, the Mets won 6-3.) Was this a wise or a desperate move?
Justin Inaz: The answer to this seems to me to be something that you’d best answer by talking to Hamels’ trainers rather than some random stathead like myself. If I were Manuel, I’d want to hear from both Hamels and whoever works him out during his off-days to find out how he tends to feel on that 4th day. Wouldn’t hurt to also use a strength-testing device like the thing the Red Sox use on Jonathan Papelbon in order to assess arm strength. And maybe the Phillies did all of this and decided to go for it.
Here’s a stats-based argument, though: The Book taught us that the current 5-man rotation does seem to be the optimal rotation. In their study, pitchers on three days rest had a wOBA of 0.369, while those same pitchers on four days rest had a wOBA of 0.352. Even including a fairly bad #5 starter in the rotation wasn’t enough to make this tradeoff worthwhile for the rotation as a whole.
The issue in this case, though, is whether using your ace on short rest for a single meaningful game is worth doing. If the difference between your ace and the pitcher who would otherwise start is more than roughly 0.017 points of wOBA, then it’s a defensible move. In this case, that other pitcher was Kyle Kendrick, right? Through Sunday’s game, Hamels’ career wOBA against is a superb 0.297, whereas Kendrick’s career wOBA against is 0.357. That’s 0.060 points of wOBA, which is more than three times the 0.017 threshold.
So yeah, assuming the trainers were on board, I’d be fine with this move. Even though it didn’t work out.
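Justin’s short-rest rule of thumb is simple enough to check in a few lines of Python. This is only a sketch using the wOBA figures quoted in his answer; the function and variable names are invented for illustration, and it ignores everything a trainer would tell you.

```python
# Short-rest penalty from The Book, as quoted above:
# wOBA allowed on three days' rest minus wOBA allowed on four days' rest.
WOBA_THREE_DAYS_REST = 0.369
WOBA_FOUR_DAYS_REST = 0.352


def short_rest_penalty():
    """Extra wOBA a starter is expected to allow when pitching on short rest."""
    return WOBA_THREE_DAYS_REST - WOBA_FOUR_DAYS_REST


def short_rest_is_defensible(ace_woba, replacement_woba):
    """True if the gap between the two pitchers exceeds the short-rest penalty."""
    gap = replacement_woba - ace_woba
    return gap > short_rest_penalty()


# Career wOBA against, as quoted above (through Sunday's game).
hamels = 0.297
kendrick = 0.357

print(f"penalty: {short_rest_penalty():.3f}")
print(f"gap:     {kendrick - hamels:.3f}")
print("defensible" if short_rest_is_defensible(hamels, kendrick) else "not defensible")
```

Career wOBA against is a blunt stand-in for current talent, so a projected figure for each pitcher would probably be the better input if one were available.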
Brian Cartwright: The strategy would be to pass over your 5th starter in favor of your #1 or #2 getting an extra start instead. We would think that a pitcher can handle starting on three days rest, as most starters did it regularly up until 1976, but as that almost never happens these days, there is an uncertainty about whether a pitcher today is conditioned for it. In the 1960’s and 70’s, when bringing a top starter back on short rest (less than what they are accustomed to), it rarely succeeded for more than one start. The second and third time around, the top starter, now fatigued, frequently pitched no better than the guy at the bottom of the rotation, which then ended up backfiring. The most famous example was Gene Mauch’s 1964 Phillies. Their top two starters, Chris Short and Jim Bunning, each made nine starts in the last month of the season, and each made three on short rest. Short handled it well, allowing 18 hits and 6 earned runs in 21.2 innings, but Bunning did not, as he allowed 26 hits and 18 earned runs in only 10.2 innings. Short made his first start of September on two days rest, and pitched a four hit complete game. However, in his next start, back on the normal three days rest, he left after 2.1 innings. Both pitchers then continued on their normal rotation until the Phils headed into a 10 game losing streak late in the month. Bunning went out on two days rest three times, getting shelled and losing each of them. Short went 7.1 and 5.1 innings in his two starts, pitching reasonably well, but losing both. Short remained effective, but Bunning collapsed, and that was enough for the Phillies to lose the pennant.
Colin Wyers: There really isn’t the space in a roundtable to run the numbers on this, but what the question boils down to (for me) is: do you burn Hamels to get a matchup that, quite frankly, you’re favored to lose no matter what you do? Or do you essentially concede the game, let Kendrick take the hill against Santana, and save Hamels for the next day, when you can get him a matchup he’s favored to win?

My gut says the latter is a better idea. Using pitcher projections and the log5 method, it really shouldn’t be too tricky to prove or disprove my gut feeling.
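For anyone who wants to try the check Colin describes, here is a rough Python sketch of the log5 comparison. The log5 formula itself is the standard Bill James one; the four winning percentages are hypothetical placeholders (not real projections), and the variable names are invented for illustration.

```python
def log5(p_a, p_b):
    """Bill James's log5 estimate of the chance that side A beats side B,
    given each side's winning percentage against average opposition."""
    numerator = p_a * (1 - p_b)
    return numerator / (numerator + p_b * (1 - p_a))


# Hypothetical "team winning percentage with this starter on the mound" values.
# These are placeholders for illustration, not real projections.
HAMELS, KENDRICK, SANTANA, NEXT_DAY_STARTER = 0.60, 0.42, 0.65, 0.50

# Option 1: burn Hamels against Santana, Kendrick goes the next day.
ace_vs_ace = log5(HAMELS, SANTANA) + log5(KENDRICK, NEXT_DAY_STARTER)

# Option 2: Kendrick takes the hill against Santana, Hamels goes the next day.
hold_hamels = log5(KENDRICK, SANTANA) + log5(HAMELS, NEXT_DAY_STARTER)

print(f"Hamels vs. Santana first: {ace_vs_ace:.3f} expected wins")
print(f"Hamels held back:         {hold_hamels:.3f} expected wins")
```

The answer depends entirely on the projections fed in, so swapping in real pitcher projections is the whole exercise.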
Eric Seidman: When the Mets got Santana in the offseason, I don’t think I was alone in desiring a Hamels-Santana matchup in September, with the division on the line. That the game was on ESPN at 8 pm, Sunday Night Baseball, only added to the “drama.” Unfortunately, the game was one-sided, with Hamels ultimately disappointing. The decision to use Hamels, however, instead of Kendrick, made sense to me due to Charlie Manuel’s previous usage of Cole Hamels against the Mets. In two different series against the Metropolitans, Manuel had the option of starting Hamels ahead of time in that series or the series prior, and opted not to. This way, the Mets had not seen Cole in quite a bit. In that regard, I liked the decision. As a fan, though, I probably would have preferred to have Hamels pitch against the Marlins, as it’s tough to beat Johan no matter who you have starting.
Pizza Cutter: Figure that sending Kendrick out to face Santana is basically conceding the game to the Mets, and Hamels is a 50/50 shot against Santana. But, Hamels against some other NL pitcher is probably a pretty good bet for the Phillies, while Kendrick and his over 5.00 ERA and FIP don’t match up well against anyone at all…. let’s give him a 30% chance of winning that game. The Phillies can trade a pretty sure loss (Kendrick vs. Santana) for a pretty sure win (Hamels vs. whoever they play next), or they can have a 50/50 shot (Hamels vs. Santana) and a 30/70 shot (Kendrick vs. whoever’s next). If they hold Hamels back, they probably win one and lose one. If they pitch Hamels against Santana, the expected outcome is .8 wins. Now, these are just probabilities that I’m pulling out of the air, but the point is that it’s pretty easy to build a logical case that this was a stupid move based on nothing more than expected wins. It feels better to say “We’re going ace vs. ace”, but the goal isn’t to win on ESPN Sunday Night Baseball, but to win more games than your division rivals over 162.
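The expected-wins arithmetic in that last answer can be written out in a few lines of Python. This is a sketch only: the 0.5 and 0.3 come straight from the paragraph above, while the 0.0 and 1.0 are a literal reading of “pretty sure loss” and “pretty sure win”, which is an assumption made for illustration.

```python
def expected_wins(win_probabilities):
    """Expected number of wins is the sum of the per-game win probabilities."""
    return sum(win_probabilities)


# Kendrick vs. Santana (treated as a sure loss), Hamels vs. whoever's next
# (treated as a sure win). Both extremes are simplifying assumptions.
hold_hamels_back = expected_wins([0.0, 1.0])

# Hamels vs. Santana (50/50), Kendrick vs. whoever's next (30/70).
ace_versus_ace = expected_wins([0.5, 0.3])

print(f"Hold Hamels back: {hold_hamels_back:.1f} expected wins")
print(f"Ace vs. ace:      {ace_versus_ace:.1f} expected wins")
```

Softening the “sure” outcomes to something like 0.1 and 0.9 leaves the first total unchanged, so the comparison does not hinge on treating them as certainties.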


9 Responses to World Famous StatSpeak Roundtable: September 10

  1. jinaz says:

    I asked John Dewan about this a while back when C. Trent Rosecrans interviewed him. Dewan seemed thoroughly uninterested in doing anything but soft, medium, hard for ball velocity. Here’s a link:

  2. jinaz says:

    a blog about the second-best team in the state of Ohio
    Nice. 🙂
    The sad thing is that I won’t argue in the slightest. And I don’t even have much optimism that this will change any time soon. ::sigh::

  3. Pizza Cutter says:

    I suppose if you count the Toledo Mud Hens, the Columbus Clippers, the Mahoning Valley Scrappers, the Lake County Captains, and the JV team at my high school alma mater… 😉

  4. Doug Gray says:

    Re: Question 2 and Justin’s response.
    Given that BIS does all of their stuff via video on a computer, the timing of the play should actually be quite simple for them to do. 30 (or 29.97 actually) frames per second, it should be fairly simple to time how fast a ball goes from bat to fielder. Stopwatches take actual time. Computers could be configured to do it for you in time.

  5. Xeifrank says:

    I am curious as to what the impacts of the pitch fx system really are. It seems like a wonderful tool for scouting and performance purposes. But really why should I care what the exact measurements of the movement of a pitch are? How can I evaluate how good a pitcher is based on this? If I want to know how many wins or runs above average a pitcher is, how can pitch fx ever translate to this? I am not questioning its worth, I just feel ignorant as to how to ever apply the pitch fx data. Thoughts?
    vr, Xei

  6. Eric Seidman says:

    You should read Mike Fast’s article at The Hardball Times, as it explains plenty of aspects of analysis that the system helps us with. Personally, as a Pitch F/X analyst myself, I’m not concerned with movement unless there is something going on with a pitcher that could use diagnosing. For instance, early in the year, Brandon Webb had wicked movement… then it dropped a lot, when he said he had a dead arm.
    Now, he has stunk his last three starts, but his movement has been the same, so I know that his velocity and movement are not problems… could it be his release point? Yes, it’s much lower now. What effect does that have? These are all areas of analysis that Pitch F/X allows us to explore.
    Should you draft a pitcher for your fantasy team because he has a 3.00 ERA/FIP or 6 inches of horizontal movement? Clearly the ERA/FIP, but the movement, release point, break, location, etc, all this data allows our regular analyses to go to the next level.

  7. Pizza Cutter says:

    The psychologist in me sees Pitch F/X and thinks “we could crack the code for what to throw when…” Pitch sequencing is the great unexplored chess game on the mound, and Pitch F/X would give us a look into how it works.
    Then there’s projecting out which young pitchers might have a chance. If we see that a certain arsenal makes for a good pitcher at the MLB level, it’s just a matter of finding that same arsenal in the minors, or college, or high school, or the Little League World Series.

  8. Brian Cartwright says:

    Components such as speed and break let us identify pitches to great detail, such as a 12/6 curve compared to a sweeping curve. This can then allow us to group players who throw similar pitches. One of the steps in projection is to find all the previous players who have the same profile, and see how those others performed.

  9. dan says:

    Josh Kalk has similarity scores, but I don’t find them to be incredibly meaningful. Some very very good pitchers come out as being very similar to a lot of bad ones. And then some are just weird (like Joba and Darrell Rasner being similar).
    For example, Zack Greinke’s top comps are:
