## The foul ball, part three: What does it tell us about an at-bat?

In part one of this series on foul balls, I took a look at what they say about batters.  In part two, I looked at what foul balls say about pitchers.  Now, let’s take a look at what the foul ball tells us about an individual at-bat.  After all, baseball is a series of at-bats.  They are the game within the game, in which a batter and a pitcher square off in individual combat.  But, what does a foul ball tell us about the chances that a batter or pitcher will complete his mission during an at-bat (recording/not making an out)?  When you see a foul ball, should you be encouraged or discouraged?  Is a foul ball just another swinging strike?
I took my database of all plate appearances from 2000-2007 (thanks Retrosheet!) and looked for the answer to that question.  I looked at how the ball-and-strike count progressed in each plate appearance, specifically whether the strike had been recorded by way of a foul ball or a swinging strike or (since I was in the neighborhood anyway) a called strike.  Of course, anything that produces a strike is bad news for the batter, but perhaps not all strikes are created equal.  A foul ball can only produce a strike if the count before the pitch had 0 or 1 strikes, so I looked only at those pitches (I’ll get to 2 strike fouls in a minute).  That left eight possible counts in which a foul ball could have produced a strike (0-0, 1-0, 2-0, 3-0, 0-1, 1-1, 2-1, 3-1).  I looked at all cases in which any strike had been produced, whether by foul ball, swinging strike, or called strike, and at the resulting OBP of those plate appearances.  Before starting, I took a look at the expected OBP that would result from the batter/pitcher matchup in play (using the odds ratio method, since OBP is a probability number) for the purposes of making sure that my groups were roughly equal.  I used seasonal OBPs as my baseline.
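For readers who want to see the expected-OBP step spelled out, here is a minimal sketch of one common form of the odds ratio method. It is not necessarily the exact calculation behind the numbers in this piece: the function names and the .330 league baseline are assumptions made up for the illustration.

```python
def odds(p: float) -> float:
    """Convert a probability (such as OBP) into odds."""
    return p / (1.0 - p)


def expected_obp(batter_obp: float, pitcher_obp_allowed: float, league_obp: float) -> float:
    """Combine batter and pitcher OBP relative to league average via odds ratios.

    Assumed form: matchup odds = batter odds * pitcher odds / league odds.
    """
    matchup_odds = odds(batter_obp) * odds(pitcher_obp_allowed) / odds(league_obp)
    return matchup_odds / (1.0 + matchup_odds)


# Hypothetical inputs: a .360 OBP batter against a pitcher who allows a .310 OBP,
# in a league with a .330 OBP.
print(round(expected_obp(0.360, 0.310, 0.330), 3))
```

A handy sanity check on this form: when the pitcher is exactly league average, the expected OBP collapses to the batter's own OBP.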
Let me show what I did by using an example.  I took a look at all plate appearances in which the first pitch (a 0-0 count) ended up as a strike on the batter (so, now a 0-1 count).  I tallied up how that strike got there, which created three “baskets” of plate appearances (called, swinging, foul).  I should also note that I only used plate appearances in which a batter with 250+ PA in that season faced a pitcher with 250+ BF in that season.  First, I wanted to make sure that the baskets were roughly equal (batters/pitchers who swing at/induce more swinging strikes might have higher/lower OBPs/OBPs allowed than batters/pitchers who… ah you know what I’m getting at).  The overall expected OBPs for the three groups were called strikes: .332, foul balls: .329, and swinging strikes: .326.  This pattern actually played itself out pretty consistently.  The overall expected OBP for those who took a called first pitch strike was usually a little higher than for those who fouled off the first pitch, which was higher still than for those who swung and missed.  However, the differences were never large; at their greatest, there was a spread of about 7 or 8 points among the three groups.
(Methodological note: A plate appearance might be represented in two different “baskets” here.  For example, a batter who takes a called first strike pitch, then two balls, then fouls off strike two would be in the 0-0 called bin and the 2-1 foul bin.  Such is life.)
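The basket assignment can be sketched roughly as below. This is only an illustration under stated assumptions: the single-letter pitch codes (B, C, S, F, with anything else ending the plate appearance) are a simplified stand-in rather than Retrosheet's full pitch vocabulary, and `strike_baskets` is a hypothetical helper name.

```python
from typing import List, Tuple

# Simplified, assumed pitch codes: B = ball, C = called strike,
# S = swinging strike, F = foul. Any other character ends the plate appearance.
KIND = {"C": "called", "S": "swinging", "F": "foul"}


def strike_baskets(pitches: str) -> List[Tuple[str, str]]:
    """Return (count before the pitch, how the strike was recorded) for every
    pitch that added strike one or strike two."""
    balls = strikes = 0
    baskets = []
    for pitch in pitches:
        if pitch == "B":
            balls += 1
            if balls == 4:  # walk
                break
        elif pitch in KIND:
            if strikes < 2:
                baskets.append((f"{balls}-{strikes}", KIND[pitch]))
                strikes += 1
            elif pitch != "F":  # strike three; a two-strike foul changes nothing
                break
        else:  # ball in play, hit by pitch, etc.
            break
    return baskets


# The example from the note: called strike, two balls, then a foul for strike two.
print(strike_baskets("CBBF"))
```

With that input the plate appearance lands in both the 0-0 called basket and the 2-1 foul basket, which is the double counting the note describes.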
What then came of those plate appearances with the first strike?  The actual OBPs for the three groups were:

• Called strike: .287
• Foul ball: .295
• Swinging strike: .263

The most important pitch for a pitcher is strike one, but how he gets it is worth 32 points of OBP!  A ball would certainly be a better outcome for a batter (plate appearances with a 1-0 count have an OBP of .385), but if he’s going to have a strike against him, he’s much better off if he swings and fouls the ball off than if he swings and misses.  I went through and did the same analysis for all eight counts in question.  The results:
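
| Count before pitch | If called | If foul | If swinging |
|---|---|---|---|
| 0-0 | .287 | .295 | .263 |
| 1-0 | .321 | .329 | .308 |
| 2-0 | .404 | .407 | .397 |
| 3-0 | .585 | .596 | .597 |
| 0-1 | .219 | .233 | .199 |
| 1-1 | .248 | .256 | .227 |
| 2-1 | .315 | .322 | .287 |
| 3-1 | .458 | .486 | .442 |
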
If the batter swings, the simple act of making contact and hitting it foul signals a much better outcome for him, often on the order of 20-40 points worth of OBP, even though the result of that swing (a strike on the scoreboard) is the same.  The only notable exception is at 3-0, where a swinging strike is a hair better than a foul ball (.597 vs. .596).  At 3-0, the batter is in a good position no matter what he does.  Then there’s the matter of called strikes.  Outside of 3-0, a called strike is consistently better than a swinging strike, but worse than a foul ball, although usually closer to the foul ball.  A strike is not a strike is not a strike.  You ignored the poor foul ball all this time, but it’s been trying to send you a message.  It’s important to pay attention to not only what the count is, but how those strikes got there.
What of two-strike foul balls?  The rules, of course, change and a foul ball at this point doesn’t affect the count.  A swinging strike or a called strike on a 2-2 pitch will result in a .000 OBP.  Does fouling off a two-strike pitch increase the chances that a batter will get on base?  What about spoiling multiple two-strike pitches?  In part one of the series, we saw that two-strike foul balls (at a seasonal level) were generally associated with different types of hitting outcomes (more singles, fewer HR), but weren’t really connected to OBP.  Does that finding still hold?
Again, I isolated all plate appearances in which there was at some point a count of 0-2 or 1-2 or 2-2 or 3-2.  I then counted up the foul balls that happened after that point.  I struggled with exactly how to compare apples-to-apples in this case.  Counting only the foul balls hit during the count in question (i.e., foul balls only when the count was 1-2) solves the confound that different counts have different expected OBPs.  However, it doesn’t account for the fact that the batter’s mindset might be focused not so much on the count as on spoiling as many pitches as possible and waiting out the balls and/or waiting for a good pitch to hit.  In that case, the better way to look at it would be foul balls from that point onward after that count had been reached.  (So, from the point of having a 1-2 count, if a batter fouled one off, took a ball, then fouled two more off on 2-2, that would be three fouls.)  I coded things the latter way (split into zero fouls, one, two, and three-plus).  For fun, I did it the other way (not shown here), and the base conclusions didn’t really change.  Again, I first checked for the expected OBP based on the batter/pitcher match-up, and the differences were negligible.
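The "from that point onward" coding can be sketched as follows. Treat it as an illustration rather than the actual code used here: the pitch codes (B, C, S, F, anything else ends the plate appearance) are a simplified assumption, and `fouls_after_count` and `bucket` are hypothetical helper names.

```python
from typing import Optional, Tuple


def fouls_after_count(pitches: str, target: Tuple[int, int]) -> Optional[int]:
    """Count two-strike fouls hit from the moment the (balls, strikes) count
    `target` is first reached. Returns None if the count never comes up."""
    balls = strikes = 0
    fouls = 0
    reached = False
    for pitch in pitches:
        if (balls, strikes) == target:
            reached = True
        if pitch == "B":
            balls += 1
            if balls == 4:  # walk
                break
        elif pitch == "F":
            if strikes < 2:
                strikes += 1
            elif reached:
                fouls += 1  # a spoiled two-strike pitch
        elif pitch in ("C", "S"):
            strikes += 1
            if strikes == 3:  # strikeout
                break
        else:  # ball in play, hit by pitch, etc.
            break
    if (balls, strikes) == target:  # sequence stopped right at the target count
        reached = True
    return fouls if reached else None


def bucket(fouls: int) -> str:
    """Collapse a foul count into the 0 / 1 / 2 / 3+ columns."""
    return "3+" if fouls >= 3 else str(fouls)


# The example from the text: reach 1-2, foul, ball, two more fouls, ball in play.
print(bucket(fouls_after_count("CSBFBFFX", (1, 2))))
```

The same plate appearance scores two fouls when measured from 2-2, which is why one plate appearance can show up in different columns for different starting counts.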

| Count | 0 fouls | 1 foul | 2 fouls | 3+ fouls |
|---|---|---|---|---|
| 0-2 | .209 | .264 | .231 | .253 |
| 1-2 | .235 | .266 | .279 | .282 |
| 2-2 | .307 | .313 | .314 | .312 |
| 3-2 | .468 | .467 | .451 | .482 |

If the batter is fouling off two-strike pitches after being behind in the count, it means that he’s more likely to get on base (although the effect is not a simple matter of more fouls predicting a higher OBP).  But after the count evens, there’s no particular advantage to fouling off a lot of pitches.  It seems that even if the batter is behind in the count, if he’s still at least making contact, it’s a good sign.  However, the effects don’t seem to grow by huge margins when the batter spoils multiple pitches.  Talk of the pitcher having to “show” the batter extra pitches and this being a net gain for the batter doesn’t seem to hold water, at least as far as this particular batter being able to get on base in this particular at-bat.  A lot of foul balls do, however, extend the pitcher’s pitch count, which might be helpful later in the game.  But, too often, commentators say that the batter is having “a good at-bat” if he fouls off a lot of 2-2 and 3-2 pitches.  In fact, he’s not likely to be having a better or worse at-bat in terms of his result than if he hadn’t fouled those pitches off.
So, what have we learned in our examination of the foul ball?  First off, they matter.  A foul ball may count as a strike, but that’s not totally fair.  If it were just another strike, there wouldn’t be such major discrepancies between foul balls and called and/or swinging strikes.  It’s odd because a case can be made that the foul ball is something that’s positive for both the pitcher (it counts as a strike, and strikes are good) and for the batter (it’s not as damaging as other strikes).  A batter is better off if he collects balls, or perhaps home runs, and a pitcher is better off if the batter can’t touch his stuff at all.  But, it speaks to the importance of getting beyond simply counting balls and strikes.  In order to really understand a batter, a pitcher, or a plate appearance, it’s important to know how those strikes got there.  And you thought it was just a souvenir.

## A Tale of Two Cain's

When Brian Sabean signed Aaron Rowand this offseason, it became clear that the San Francisco Giants had no clear direction. Signing an average hitter coming off of a career year in a hitter’s park to spend five years in a pitcher’s park just did not make much sense. This lack of direction put even more pressure on the dynamic young arms in their starting rotation. With Barry Zito on pace to break the single-season losses record, a rotation featuring Tim Lincecum, Matt Cain, and Jonathan Sanchez needed to step up.
Lincecum and Sanchez, by most accounts, have been better than expected. Cain, however, is somewhat of a different story. He looked brilliant in three of his first five starts and terrible in the other two. Now, most of the readers here know I am a big fan of his; the fandom began as pity for his extreme lack of luck and blossomed into just really enjoying watching him pitch. Since he has had a topsy-turvy season I decided to get my feet wet with Pitch F/X data to provide a scouting report on him.
### Mechanics
Carlos Gomez, former writer for The Hardball Times and current scout for the Diamondbacks, wrote a tremendous article breaking down Cain’s mechanics just about a year ago today. Cain has a windup very similar to that of Daisuke Matsuzaka, one that derives its power from balance, torque, and hellish momentum. As Carlos points out, the lower body completely uncoils his upper body; this, along with the loading of his shoulder, helps increase his velocity.
One of the reasons Cain can occasionally struggle with control is a problem at his release point: he tends not to meet the glove with his body but rather brings the glove towards him. I had noticed this while watching some of his starts this year, prior to reading the mechanics column, and was very happy to read something supporting my eyes. Cain’s release point has seemingly been off this season, which may have contributed to his 18 walks through five games.
### Pitch Classification
Since there were some issues with classifying pitches within the PFX algorithm I decided to mix my methods. Basically, anything that went unidentified or seemed fishy was corrected through watching the games and making due corrections. The incorrectly classified pitches mainly tended to be changeups. The algorithm recorded some of these changeups as splitters; however, from watching Cain pitch as well as reports on his repertoire, it is evident a splitter is not part of his arsenal. His changeup is more of a circle-change, though, one that offers significant sink; probably the main reason it was classified as a splitter in some instances.
This brings up an interesting point that I want to explore more in the coming weeks. Essentially, the algorithm classifies a pitch based on what similarly classified pitches do; regardless of what we call it, the pitch looks like a splitter due to its sink, break, velocity, and spin. If I call it a changeup, the pitch is not any easier to hit. Regardless, Cain throws a fastball, slider, and changeup, with the curveball used sparingly. He has significantly reduced the usage of a curveball since his minor league tenure and early major league upbringing. He has thrown so few curves, in fact, that I’ve actually excluded them from many of the charts to come later. He has always relied on his fastball and this year is no different. Here is his location breakdown to lefties and righties:

And here is a look at how he has distributed the three major pitches amongst these hitters:

He clearly favors the slider against RHH and the changeup against LHH. With a circle-change with as much sink as his, it makes sense to use it primarily against lefties; it tails away from them. He appears to throw his fastball a bit more often against lefties, but the discrepancy is much closer in the early going than the graph would suggest.
### Count Results
Last year, as documented by Josh Kalk’s card as well as Chris Quick’s article at Bay City Baseball, Cain threw his fastball at least 50% of the time in all counts except for 1-2. In the early going of 2008, Cain has thrown his fastball much more, with a minimum of 58% in the same 1-2 counts. Turning to his slider more often in these counts, he has also recorded the highest percentage of his strikeouts there. Here is his current graph of pitches by count:

As mentioned, he has upped his fastball usage thus far, starting 80% of batters off with one. He has virtually cut the curveball out of his diet, failing to throw it on 0-2 counts and any count with two or more balls on the batter. Yes, that line sounds funny in retrospect, but no more innuendos.
### Sequencing
The area of pitch data I am going to focus on the most involves sequencing, i.e., what does a pitcher throw after a certain pitch? After a series of pitches? Does this differ between batter types? Between handedness? Once the 2008 data is corrected I would love to explore this more in-depth via location quadrants: up and in, down and in, etc. For now I will look solely at the pitches thrown. The following charts show the percentage of pitches that follow a given pitch, broken down by lefties and righties. To further explain, for anyone unfamiliar, in the LHH chart, in the ‘CH’ row, the number 4.5% refers to the fact that, against lefties, Cain has followed a changeup with a slider 4.5% of the time so far:
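For anyone who wants to build a similar follow-up table from their own pitch logs, a rough sketch is below. It assumes each plate appearance is already a list of pitch labels for one batter handedness, and it normalizes within each row (so a row sums to 100%), which may or may not match how the charts here were built. The function name, the labels, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def follow_up_pct(plate_appearances: List[List[str]]) -> Dict[str, Dict[str, float]]:
    """For each pitch type, the percentage of the time each pitch type came next.

    Only consecutive pitches within the same plate appearance are paired.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for seq in plate_appearances:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: 100.0 * n / sum(c.values()) for nxt, n in c.items()}
        for prev, c in counts.items()
    }


# Made-up plate appearances against left-handed hitters (FB, SL, CH labels).
sample = [["FB", "CH", "SL"], ["CH", "FB"], ["FB", "FB", "CH"]]
table = follow_up_pct(sample)
print(table["CH"])  # what followed a changeup, as row percentages
```

Splitting the input by batter handedness before calling the function would give one table for lefties and one for righties.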

## StatSpeak World Famous Roundtable: April 28

What better cure for the Mondays can there be than a StatSpeak roundtable?  Today, StatSpeak is proud to welcome Geoff Young of Ducksnorts (where Geoff writes about the San Diego Padres), as well as Baseball Digest Daily and The Hardball Times.  Geoff joins Eric and Pizza in a discussion of what you should be seeing on the bottom of your screen, Trevor Hoffman, and which player who’s had a crazy good start to the season has the best chance of keeping things up.
### Question #1: Trevor Hoffman, small sample size victim or toast?
Geoff Young: I’m going to cheat and say a little of each. On the one hand, I don’t feel comfortable making a firm judgment based on eight games to start the season. On the other, Hoffman is 40 years old and he hasn’t been dominant since the ’90s. Skills erode.  From a visual standpoint, what concerns me most is his decreased ability to locate the fastball. He used to be deadly accurate with that pitch, but not so much thus far in ’08. Whenever Hoffman has had this sort of problem in the past, he’s been able to correct it quickly. We’re still waiting for that to happen this time, and at his age, there’s no guarantee that it will.
From a statistical standpoint, the declining strikeouts are a huge yellow flag. His K/9 over the past five years in which he was healthy paints a troubling picture:
2002: 10.47
2004: 8.73
2005: 8.43
2006: 7.14
2007: 6.91
There isn’t a way to cast those numbers in a positive light. At the same time, largely because of smarts and great control, Hoffman has managed to remain effective despite decreased dominance. Here’s his ERA+ over that same period:
2002: 137
2004: 168
2005: 130
2006: 189
2007: 135
There’s no identifiable pattern, but from looking at the K/9, it’s clear that he’s more mirrors than smoke at this point. The agonizing part of the equation, from the standpoint of the Padres and their fans, is that by the time enough of a sample is gathered, it’s probably too late to make the right decision. This, of course, is why hindsight is 20-20. I’d say give him another month or so to right the ship. If Hoffman still hasn’t figured it out by the end of May, then slap together a new plan.
Eric Seidman: I would personally be inclined to think it’s a sample size issue right now, but there seem to be luck-based factors at work as well.  Hoffman’s K/BB has plummeted since 2004, dropping from 6.63 to 2.93 last year (and currently at 2.00); however, his current K/9 is actually higher than in 2006 and 2007.  As of this moment his percentage of line drives has decreased from 17% to 7% while his grounder frequency has jumped from 30% to 39%.  His BABIP is currently the second-highest it has ever been in his career, likely as a result of the grounder increase.  Labeling is a big factor in situations like this, as I’m sure Pizza can attest, because once we say something about a person, every subsequent action is viewed in this light.  When Brad Lidge gave up the home run to Pujols in the playoffs, he was labeled a mentally bruised and battered pitcher.  Playing directly into a convenience factor, Lidge posted a 5.28 ERA the following year and lost his spot.  Most in the media wrote him off as having a fragile psyche because he followed a devastating home run with a seemingly subpar season.  His FIP in 2006 was 3.84.  His FIP last year, when he had a solid on-the-surface statistical season?  3.84.  He was unlucky in 2006 and a tad lucky last year, but because he was given the label of being toast we viewed every blown save as more evidence of his demise.  With regard to Hoffman, it seems that he is currently a bit unlucky: his FIP is 1.5 points lower than his ERA, on top of the aforementioned factors.  I’d love to revisit this question in June or July to see where he stands.  He won’t be good forever, but I think this is a bit overblown.  (Ed. note: When did Eric get a degree in psychology? – P.C.)
Pizza Cutter: Mmmmm, I like toast… Well, it takes about 150 PAs to get even some basic pitching stats to stabilize enough that they can really be counted on as reliable.  As I write this, Trevor has faced 40.  His BABIP is high, as is his HR/FB, so luck has not been his friend.  On that evidence, I’d say he’s just the victim of a small sample size.  But, there are a few concerning signs.  Trevor hasn’t lost much velocity off his pitches (assuming that FanGraphs has good data on Hoffman’s pitch selection), but he’s been throwing more sliders than normal, at the expense of his changeup.  Why would a pitcher who’s had success in the past mess around that drastically with his pitch selection?  (An injury?  A lack of confidence in the changeup?  Maybe he has new confidence in the slider now.)  Plus, both last year and this year, he’s seen a decent sized jump in his fly ball rate.  Last year, he gave up a ridiculously low HR/FB so he got away with it.  This year, he might not be so fortunate.  Closers are usually brought into high leverage situations where a home run is catastrophic for their team.  Sure, Hoffman pitches at Petco, so it might not be as big a concern half the time, but still, it’s not like it’s a good idea to be giving up so many fly balls.  Then, there’s the issue of his strikeout rate creeping downward (although his current rate is creeping back up to 8 per 9 innings), and his walk rate creeping upward.  Since his velocity isn’t down, perhaps it’s his control that is fading?  He is 40 years old.  I have a hard time reading too much into stats this early in the season, but I do see some signs for concern, even dating back to last season.

## When Non-Pitchers Attack

Trailing 18-0 as the eighth inning came to a close, the Arizona Diamondbacks knew that their chance of coming back had passed its statute of limitations.  When the ninth inning rolled around, and the likes of Rick Helling, Mike Morgan, Eddie Oropesa, and Bret Prinz had already appeared, Bob Brenly decided to give one hopeful his pitching debut: Mark Grace.  Brenly called upon Grace to pitch the final inning of this September 2nd, 2002 game in order to rest a weary bullpen in a situation that had become meaningless.  What happened next will be etched in the baseball part of my mind forever: Grace began impersonating pitchers on his team, namely Mike Fetters, eliciting much laughter from the severely depleted fan base as well as his colleagues.
Grace induced flyouts off the bats of Jeff Reboulet and Wilson Ruan before surrendering a first-pitch home run to Dave Ross.  The dinger turned out to be the first of Ross’s career and you couldn’t help but smile at Grace’s mock yelling angrily at him as he circled the bases.  Tyler Houston then flew out to end the inning and the DBacks lost 19-1.
Non-pitchers taking the mound seems to be an event so rare in nature that it can help quell the disgust at our team for being blown out; or, for the team enjoying the huge lead, it can become quite the comical moment as pressure ceases to exist when up by 15+ runs.  Despite this, it could potentially produce embarrassing results if a certain non-pitcher happens to strike batters out.  Perhaps not as embarrassing as it would have been if Paul Maholm gave up a single to Billy Crystal but, if I’m a major league hitter, I’m very likely to get razzed if Jeff Cirillo comes in to pitch and strikes me out.
Well… if you replace “a major league hitter” with Craig Counsell the previous sentence takes on the form of a factual description of a ninth inning event on August 20th, 2007.  Ahead 9-0, Bob Melvin thought it appropriate to give his oft-used bullpen a break, and handed the ball to the veteran infielder.  Maybe Counsell had been trying to act like a leadoff batter, getting himself into a longer at-bat in order to show his teammates Cirillo’s repertoire, because an epic 7-pitch matchup followed.  After a called strike and a swinging strike, Jeff wasted two pitches, evening up the count at 2-2.  Counsell fought back, fouling the next two pitches off, but Cirillo came back and struck him out swinging on the next pitch.
Counsell is not the only one who has ever fallen victim to a strikeout at the hands of a non-pitcher and I decided to research more instances of this occurring.  Luckily, Sean Forman informed me of a page in the Frivolities section at Baseball-Reference that kept track of non-pitchers pitching, or else this would have taken quite some time.  Below are the more recent players that have been struck out by non-pitchers; since it will come in story and not list form the non-pitcher and strikeout victim will be bolded.
Tim Bogar had made a relief pitching appearance on June 10th, 2000, giving nothing up in his one inning, while throwing 12 pitches/9 strikes.  The Astros called on him again two weeks later, June 24th.  After surrendering a leadoff home run to J.T. Snow, Bogar struck out Felipe Crespo on three straight swings and misses.
Wade Boggs came in to pitch in two different years: 1997 with the Yankees and 1999 with the Devil Rays.  No, that isn’t a messup, they were still the Devil Rays back then.  He struck out one batter in each of his appearances.  In 1997, victim #1 was Angels catcher Todd Greene, and in 1999, victim #2 was none other than the venerable Delino DeShields.
In 1991, Cubs outfielder Doug Dascenzo made three pitching appearances, striking out one batter in two of them.  Against the Cardinals he struck out pitcher Willie Fraser, which I guess is not as difficult as someone like DeShields, but is still a strikeout.  Against the Pirates, later on in the year, he struck out pinch-hitter Joe Redfield.  Also of note: he got Barry Bonds to fly out.
Mets legend Matt Franco struck two batters out in multiple 1999 appearances.  With two outs in the ninth inning, against the Braves, Matt came in to spell John Franco.  After giving up a Gerald Williams home run and an Otis Nixon triple, he struck out Andruw Jones swinging.  A little over a month later, against the Dodgers, Franco struck out super-mega-pinch-hit star Dave Hansen.
In 1990, current Red Sox skipper Terry Francona struck out Stan Javier.
Gary Gaetti made three pitching appearances throughout his career, but none more memorable than July 3rd, 1999, when he struck out the immortal Kevin Sefcik.
In 1998, another super-mega-pinch-hit star, Lenny Harris, pitched a scoreless inning against the Reds; in the process he struck out Brent Mayne.  Mayne did make a pitching appearance of his own but sadly did not strike anyone out.
Though not necessarily recent, Dave Kingman struck out three Dodgers in a 1973 game in which he pitched two innings.  His victims: Steve Yeager, Joe Ferguson, and Bill Russell.  Earlier in the year he struck out Darrel Chaney of the Reds.
In 2001, Tim Laker struck out Jose Valentin.  I’m not counting this, though, because Laker was named in The Mitchell Report.  Clearly, magical performance elixirs are the only reason this K exists.
Also in 2001, Mark Loretta struck out reliever Chris Nichting and outfielder Ruben Rivera of the Reds in the same inning.
On June 19th, 1987, third-year player Paul O’Neill had one wild appearance against the Braves: lasting two innings, he gave up two hits and three runs while walking four and striking out two.  His victims were Ken Griffey (the dad) and pitcher Jeff Dedmon.
Former Pirates backup catcher Keith Osik pitched in one game in 1999 and one in 2000, striking out one in each.  In 1999 he struck out fellow backup catcher Paul Bako.  Osik’s membership to the Below Average Backup Catchers Union was promptly revoked.  In 2000, John Rodriguez of the Cardinals fell victim.
In 2001, Desi Relaford struck out reliever Jose Antonio Nunez in an at-bat that I would bet neither Relaford nor Nunez even remembers.
With two outs in the eighth on May 2nd, 1993, Kevin Seitzer relieved Kelly Downs and struck out Carlos Martinez to end the inning.
On July 31st, 1998, Mark Whiten of the Indians pitched the eighth inning against the Athletics in which he struck out the side!  Mike Blowers, Miguel Tejada (he was two years older though so it’s different), and Mike Neill (who?) were no match for the powers of the “Light-hittin’” one.
And lastly, as a member of the Rockies in 2002, Todd Zeile struck out Wilson Ruan.
Wow.  I don’t know if you realized it or not but the first example of a non-pitcher pitching in this article involved Wilson Ruan and so did the last.  That was entirely unintentional and what we in the filmmaking community refer to as a “happy mistake.”  I never thought I would ever write an article bookended by Wilson Ruan.  My personal favorite non-pitcher pitching moment was Grace’s, but what are yours?

## Visual WPA Results

Two weeks ago I discussed a version of WPA in which intuitive scouting would aid in the in-depth division of contributions between batter/runner and pitcher/fielder.  Though not a new or revolutionary technique I felt it was worth bringing up to serve as a potential measure of the human aspects not currently found in the WPA statistic.  Essentially, if those against stats like WPA feel that its results are tainted due to a lack of division amongst the true efforts in each play–situations like a tremendous fielding or baserunning play being solely credited to the respective pitcher and hitter–then incorporating said division would theoretically produce different and more accurate results.
Making Judgments
In conducting this analysis I used the 4/21-4/22 series between the Phillies and Rockies.  Though I definitely agree with Pizza Cutter’s assertion that a group of different eyes determining the credit or debit division is a better idea, the judgments here were left up to my own eyes.  In no way did I attempt to cherry-pick any data to prove a point; this was merely done to investigate an idea.  I watched every play with eyes that have seen a ton of baseball games, often watching plays a few times.
Comparing Results
The key in comparing results is understanding that we cannot jump from Point A to Point C.  The current WPA does not divide contributions; if we compared those figures to the in-depth play divisions, common sense suggests drastically different results would be found.  Before going in-depth, WPA needs to be adjusted to divide credit or debit on, at the very least, errors.  With this in mind I broke the analysis into two steps: first, just separating contribution amongst amazing/bad plays unfairly credited or debited fully to the wrong person; and second, dividing contribution on a deeper level, gauging things like outfielder distance, strength of some arms, whether your average runner would reach third base, etc.  For instance, Chase Utley’s web gems in this series would be included in Step 1 as well as Step 2, whereas something like properly determining the debit recipient on a Willy Taveras stolen base would apply solely to Step 2.  This way, we can compare the current WPA to Step 1 and then Step 1 to Step 2; this comparative system will be more effective in determining just how different the results may be.
Results
Here are the links to the Fangraphs WPA for these two games, as well as a PDF showing the total WPA for the series:

Here are the links to the VisPA results:

Analysis
Of the 170 plays in this two-game set, a total of fourteen were affected by Step 1 and 26 were altered via Step 2.  Despite only 15% of the plays requiring some type of adjustment, there were noticeable shifts in the WPA of certain players.  Chase Utley, for instance, came in at +.454 via standard WPA; both his VisPA1 and VisPA2 were +.589.  Because of his tremendous fielding plays his total increased noticeably.  Pat Burrell, on the other hand, had a standard WPA of +.674; his VisPA1 was +.512 while his VisPA2 was +.333.  Due to poor fielding and certain plays benefiting from heads-up baserunning on the part of others, Burrell’s total decreased significantly from WPA to VisPA1 and dropped off even more from VisPA1 to VisPA2.  Here is a file showing all three types of WPA for everyone in this series:

Something I ended up doing, which I’m curious to hear thoughts on, was to treat a certain play in a fashion similar to inherited runners.  It was an error made by Pat Burrell that put two men on prior to Yorvit Torrealba hitting a three-run homer.  If Burrell makes that play, the inning ends; since he didn’t, and three runs scored on the home run, I charged him with 1/3 of the WPA debit on that play.  This is not necessarily something I fully advocate but something to consider and generate feedback on.  It was only one play in this short series so the end result isn’t too significant.  There were no instances of an umpire affecting the outcome of a play with a bad call and no signs of wrongdoing by the third base coaches either.  Some plays were divided based on the difficulty level of properly executing; others, as in Utley’s web gems, were awarded entirely to the fielder.
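For anyone who wants to play along at home, the bookkeeping behind that kind of split can be sketched in a few lines of Python.  This is only an illustration: the function name `split_wpa`, the -0.300 swing, and the decision to leave the remaining two-thirds with the pitcher are hypothetical placeholders, not figures from the series.  Only the one-third charge to Burrell comes from the play described above.

```python
def split_wpa(play_wpa, shares):
    """Divide one play's WPA swing among players by fractional share.

    play_wpa: the WPA credit (+) or debit (-) for the whole play.
    shares:   dict of player -> fraction of the play; must sum to 1.
    """
    total = sum(shares.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return {player: play_wpa * share for player, share in shares.items()}


# Hypothetical numbers: a -0.300 swing on the home run, with the fielder
# whose earlier error extended the inning charged one third of it.
debits = split_wpa(-0.300, {"pitcher": 2 / 3, "Burrell": 1 / 3})
for player, wpa in debits.items():
    print(f"{player}: {wpa:+.3f}")
```

Forcing the shares to sum to one keeps the total WPA for the play unchanged, so the adjusted numbers still add up to the same game total as the standard ones.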
Based on standard WPA, the most valuable player of this series was Pat Burrell.  Using VisPA1, and in this case VisPA2, Chase Utley is the most valuable player.  If you asked anybody that watched the series who they would pick as the MVP it would likely be unanimous in Utley’s favor.
This just reinforces that much more can be gathered from mixed methods in sabermetrics.  For all we know, over the course of a season, everything could even itself out to the point that standard WPA is 90%+ accurate.  I’m still very intrigued by the idea of putting something together across the web to track this for an extended period of time.  Even if the results end up cancelling each other out in the aforementioned scenario I feel like we owe it to ourselves to try.  After all, by using a group of eyes to evaluate the proper debit and credit on specific plays meriting said division, we will incorporate human aspects of the game not found in a play-by-play file or game log, and in turn offer a more accurate measurement of what we are seeking to measure.

## The foul ball, part two: What does it tell us about a pitcher?

Last week, I took a look at what a foul ball tells us about a batter.  In general, we saw that what type of foul balls a batter hit (whether they were two-strike spoilers or they were basically really long strikes) may have provided a bit of a diagnostic to his mindset at the plate, whether he was a high risk/reward swinger or a low risk/reward swinger.  Now, we look at it from the pitcher’s perspective.
Again, I’ve calculated a few basic foul ball metrics, including foul balls per plate appearance, zero and one strike fouls per PA, two strike foul balls per PA, overall contact and swing percentages, and percentage of balls with which the batter made contact that went foul (foul contact).  And I ran a big correlation matrix to look at whether any of these metrics were correlated with a pitcher’s batted ball profile, the usual slash stats, and some basic outcome rates.
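To make those definitions concrete, here is a rough Python sketch of how the metrics could be computed from a pitcher's season totals.  The function name, the argument names, and the sample totals are hypothetical, and I am assuming that contact simply means fouls plus balls put in play (home runs included) and that swing percentage is swings over total pitches.

```python
def foul_metrics(pa, pitches, swings, fouls, two_strike_fouls, in_play):
    """Basic foul ball rates from season totals (all counts, not rates)."""
    contact = fouls + in_play  # assumption: contact = fouls + balls in play
    return {
        "fouls_per_pa": fouls / pa,
        "early_fouls_per_pa": (fouls - two_strike_fouls) / pa,  # 0 or 1 strike
        "two_strike_fouls_per_pa": two_strike_fouls / pa,
        "swing_pct": swings / pitches,
        "contact_pct": contact / swings,
        "foul_contact": fouls / contact,  # share of contacted balls that went foul
    }


# Hypothetical season line, for illustration only.
metrics = foul_metrics(pa=800, pitches=3000, swings=1400,
                       fouls=520, two_strike_fouls=200, in_play=580)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```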
Batters were fairly consistent from year to year on these foul ball metrics.  What about pitchers?  Again, I looked at the years 2004-2007 with a minimum of 250 BF.  Foul balls per PA (intraclass correlation = .696), contact percentage (ICC = .805), and foul contact (ICC = .753) were all pretty stable.  So, there is some repeatable skill in inducing (or not inducing) foul balls, or in getting the ball to go foul when it has been hit.
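For readers who haven't met the intraclass correlation before, here is a minimal sketch of one common version, the one-way random-effects ICC for a balanced table (every pitcher has the same number of seasons).  I'm not claiming this is the exact flavor of ICC behind the numbers above; the function name and the toy data are hypothetical.

```python
def icc_oneway(rows):
    """One-way random-effects ICC for a balanced table.

    rows: one list per pitcher, each holding k seasonal values of a metric.
    """
    n = len(rows)
    k = len(rows[0]) if rows else 0
    if n < 2 or k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need >= 2 pitchers, each with the same k >= 2 seasons")
    grand_mean = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    # Between-pitcher and within-pitcher mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for r, m in zip(rows, row_means) for x in r) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical fouls-per-PA values for three pitchers over two seasons.
toy = [[0.60, 0.62], [0.70, 0.69], [0.55, 0.57]]
print(f"ICC = {icc_oneway(toy):.3f}")
```

The closer the value is to 1, the more of the spread in the metric comes from differences between pitchers rather than from season-to-season noise within the same pitcher.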
Splitting the foul balls by when they happened in the count didn’t make for very reliable stats though.  (Two strike fouls ICC = .585, 0-and-1 strike fouls = .454).  Those numbers are nice, but to be considered reliable, they should be north of .70.  Further, there was a moderate correlation (r = .359) between those two stats.  Sounds like pitchers don’t control when the foul ball happens, but they do have an overall skill in getting the ball to go foul.
Taking a quick look at simple foul balls per PA gives us some interesting information.  A pitcher who gives up a lot of foul balls is more likely to give up fly balls (r = .411) and less likely to give up ground balls (r = -.440).  He’s also more likely to strike batters out (r = .440), but not any more or less likely to walk batters (r = -.020).  So, it pays to have a pitcher who induces a lot of foul balls, although he might pay for it with more home runs coming off of those fly balls.
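The r values quoted here are ordinary Pearson correlations.  As a quick sketch of what goes into one cell of the correlation matrix, the snippet below computes r for two columns of pitcher-season numbers; the sample values are made up for illustration and are not from my data set.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical fouls per PA and strikeout rates for four pitchers.
fouls_per_pa = [0.55, 0.60, 0.65, 0.70]
k_rate = [0.15, 0.18, 0.17, 0.22]
print(f"r = {pearson_r(fouls_per_pa, k_rate):.3f}")
```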
The real story though is in figuring out what happens to the ball after the batter makes contact.  A higher overall contact rate (again, from the pitcher’s perspective) is associated with higher numbers on all three slash stats (AVG/OBP/SLG, those correlations being .610/.381/.494).  It’s also weakly associated with fewer walks (r = -.245), but very associated with fewer strikeouts (r = -.844!!!) and more singles (r = .519).  Now, given that, we would never want a pitcher who pitches “to contact”, right?  Maybe we would.
The foul contact index has some rather interesting findings.  Here we see that the ratio of foul balls to the number of all balls hit (foul or in play… or over the fence), has a bunch of strong correlations in the other direction.  A lot of foul balls here is related to lower numbers on the three slash stats (r = -.535/-.352/-.387), as well as a higher strike out rate (r = .725), and fewer singles (r = -.491) and doubles and triples (r = -.310).  Might have something to do with the fact that foul balls are generally counted as strikes, and strikes are… um, good if you’re a pitcher.  Foul contact, strangely enough, does correlate moderately with more walks, however (r = .205).  Weird.  Moral of the story: you can get by as a pitcher who pitches “to contact”, as long as they’re hitting it foul most of the time.
Who were the league leaders in foul contact in 2007?  (Top 20, from highest to lowest, min 250 BF): Rafael Betancourt, Russ Springer, Al Reyes, Juan Cruz, Scott Kazmir, Chris Young, J.J. Putz, Rafael Soriano, Jonathan Broxton, Jose Valverde, Kevin Gregg, Joe Nathan, Alan Embree, Brandon Morrow, Frank Francisco, Matt Garza, Mariano Rivera, Bob Howry, Erik Bedard, and Jake Peavy.  Mostly relievers, and some guys with some pretty high-test stuff, but a few guys who aren’t considered “closer material” but still have had good seasons with less-than-classically-beautiful stuff.  Maybe there’s something to this.
One other issue worth looking at was brought up by StatSpeak alumnus Mike Fast in his comment on part one of this series.  Do a lot of two-strike foul balls mean that a pitcher lacks a “strikeout pitch?”  Surprisingly, the answer is no.  We’ve already seen that two-strike fouls are a rather unreliable stat from the pitcher’s perspective, so there’s not a lot of repeatable skill in inducing two-strike fouls.  Still, do they correlate well with strikeouts?  The correlation is -.172, so there’s a weak relationship in which more two-strike fouls go with fewer strikeouts, but .172 isn’t much of anything.  It looks like if the pitcher at first doesn’t succeed in striking the batter out on a two-strike pitch, he can try, try again.
So what have we learned?  Foul balls are a good thing for a pitcher!  (Primarily because they count as strikes.)  If a pitcher has a repertoire of “stuff” that no one can touch, that’s great!  It’s hard to hit a home run if you can’t get the bat on the ball.  However, if a pitcher doesn’t have world class gas, it’s OK if he has tricky stuff.  It might be one of those abilities that hide in the data that no one really pays attention to.  Consider, a foul ball means that the batter is thinking “hey, I can hit that!” and so he swings.  He aims his bat where he thinks the ball is going, but apparently, he’s a little off and he fouls it off.  The pitcher has tricked him!  Perhaps foul contact is a decent proxy for how tricky a pitcher’s “stuff” really is.  If a pitcher can trick a batter over and over, it means that he’s doing something right.
Next week, we’ll finish up our study of the foul ball (who knew they were so interesting!) by looking at what a foul ball tells us about an at-bat.

## StatSpeak World Famous Roundtable: April 21

This week, our roundtable features special guest Jessica Bader from MVN’s Take the 7 Train and Mets Geek.  Read on to see what Eric, Pizza, and Jessica have to say about Kosuke Fukudome, umpiring, and the role of luck in the game of baseball.
Question #1:  One thing that has been an important discovery of the sabermetric movement is the role that luck plays in baseball, both for individual players (BABIPs that are out of line with batted-ball data) and for teams (discrepancies between actual and Pythagorean winning percentage that can’t entirely be explained by bullpen performance). Yet this concept is one that many less sabermetrically-oriented followers of baseball are especially disdainful towards. Why do you think this is the case?

1. People want to believe that the “better” team always prevails. Many see sports as a diversion from the stresses of everyday life and don’t want to believe that baseball (or football, or basketball) is subject to the same vagaries of fortune as the rest of the daily grind.
2. There is a misconception as to the coexistence of luck and skill. While a high level of skill can minimize the impact of luck (a pitcher who strikes a lot of batters out has fewer opportunities to be victimized by bad results on balls in play), it will not completely eliminate it (that pitcher is still going to give up some hits on grounders that find holes). However, some people believe that when you attribute a player or team’s success or failure to luck, you are saying that skill has nothing to do with it, and those people are likely to take offense upon being told that Hitter X has benefited from good luck or that Pitcher Y has been unlucky.
3. Luck is a four-letter word ending in –uck, and people tend to have issues with those, don’t they?

Eric Seidman: I think a lot of the disdain towards luck-based indicators correlates well with the reason people would be opposed towards the ball/strike ump standing behind the pitcher or near the mound: the results take more of the human aspect out of the game.  I’ve asked a few people with cursory knowledge of Pythagorean W-L, who are on the “scouts rule” part of the analysis war, and they seem to feel that luck is a human aspect of the game and attempting to measure human aspects with statistics misses the point.  Though I disagree, I do feel there is some merit to this stance, as a team whose Pythagorean record is better than its actual record is not necessarily unlucky.  Look at the Braves this year: they lose one-run games and win blowouts.  Their Pythagorean record is going to be exaggerated due to these blowouts; the games are individual commodities, though, and so losing 2-1 and winning 8-0 should produce a 1-1 record.  Regardless of the runs scored/runs allowed, the team won a game and lost a game.  Winning by eight should have no effect on a loss by one and those opposed to these luck-based statistics likely understand this whether they realize it or not.
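To put numbers on that 2-1 loss and 8-0 win, here is a small Python sketch using the classic Pythagorean formula with an exponent of 2.  The exponent choice and the function name are assumptions for illustration; other exponents are in common use and would shift the number slightly.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    if rs + ra == 0:
        raise ValueError("need at least one run scored or allowed")
    return rs / (rs + ra)


# The two games from the example: (runs scored, runs allowed).
games = [(1, 2), (8, 0)]
runs_scored = sum(rs for rs, _ in games)
runs_allowed = sum(ra for _, ra in games)
actual = sum(1 for rs, ra in games if rs > ra) / len(games)
expected = pythagorean_pct(runs_scored, runs_allowed)
print(f"actual {actual:.3f}, Pythagorean {expected:.3f}")
# prints: actual 0.500, Pythagorean 0.953
```

Nine runs scored against two allowed looks like a juggernaut to the formula, even though the team simply split two games.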
Another issue, dealing with BABIP, is whether or not certain expectations follow suit for certain players.  In the aforementioned Upton example, a player with Upton’s speed is likely to leg out more grounders than, say, Pat Burrell; a high ground-ball rate and high BABIP for Burrell will regress whereas Upton could maintain a .360+ BABIP due to his speed.
Pizza Cutter: Luck means that the world around us is chaotic and that there is no control or plan or design or destiny.  When people say that baseball is part of the “American identity” or the “fabric of our lives”, they’re making the argument that baseball is part of people’s life narrative.  I’m guessing that many of you reading this would discuss at least some of the events of your life in relation to major baseball events that happened.  Who wants to believe that their life itself is chaotic, uncontrollable, and without purpose or destiny?  Whether life is (or not) is a philosophical question.  Whether baseball is (or not) is a matter of investigation.   It’s a much more comforting thought to believe that life is not random and that there is some plan behind it.  When a lot of “traditional” baseball fans hear Sabermetricians talk, they’re hearing someone critique not just their views on baseball.  Deep down, they’re hearing us say “Life is meaningless.”

## This Week in News and Sabermetrics, 4/13-4/19

With another full week of baseball in the books, let’s recap the daylights out of it… or at least in a similar fashion to last week’s TWINS article.
Interesting Bits of Tid
Well, the first of many Johan vs. Hamels matchups is in the books, with the edge going to Senor Santana.  Unfortunately, both pitching lines were affected by inherited runners scoring, as JC Romero and Aaron Heilman did not help their causes.  Santana completely dominated the Phillies hitters, making Ryan Howard look plain silly.  With the Phillies down 5-1, Greg Dobbs hit a three-run homer to bring them within one run but, in the end, they lost by a count of 6-4.  After winning 10 straight against the Mets the Phillies have now lost three straight; however, some Mets fans have even told me they don’t consider themselves to have really beaten the Phillies until Jimmy Rollins is back in the lineup.
The Yankees and Red Sox had a 19-hr ESPN game in which Dave O’Brien and Joe Morgan joked that they ran out of things to say.  Though I didn’t watch the whole thing I can only imagine cliches such as Alex Rodriguez’s lack of clutch performance, Derek Jeter’s gamer-ness, and something about David Eckstein came up.  Oh, and then there was this (Farnsworth “unintentionally” throwing at Manny) that caused quite a stir.  Good thing Manny didn’t respond, though, because if my memory serves me correctly Farnsworth once linebacker-tackled an angry batter as he charged the mound.  Why doesn’t anyone ever use Hog Ellis’s line in Major League 3: Back to the Minors when a batter charges at him?  I don’t care if baseball players don’t have screenwriters.  Next time someone throws at someone, contact me, I’m a screenwriter, I’ll write you up some material to verbally intimidate an angry batter.
Then, the Padres and Rockies had a 22-inning game that confused the Fangraphs WPA system and resulted in the longest WPA graph anyone has ever seen.
Lastly, Miguel Tejada admitted he was actually 33 going on 34, not 31 going on 32, after being schooled and made to look like an idiot by an E:60 reporter.  Whether or not it was “fair” to “ambush” Tejada like that, one thing is clear: this story is not making headlines anymore.  Maybe it will when the E:60 show airs but, for now, people are just letting Tejada “..play beisbol, man… I just here to play beisbol..”
Cy Young Predictor
In The Neyer/James Guide to Pitchers Bill James presented a formula that could, with pretty good accuracy, predict the eventual Cy Young Award winner.  For a description, click here.  Here are the top five in the NL:

1. Brandon Webb, 38.5
2. Ben Sheets, 35.6
3. Jake Peavy, 35.3
4. Dan Haren, 31.0
5. John Smoltz, 30.0

And in the AL:

1. Daisuke Matsuzaka, 38.3
2. Cliff Lee, 33.3
3. Zack Greinke, 32.5
4. Joe Saunders, 31.1
5. Carlos Silva, 29.0

Beane Count
The teams most fitting Billy Beane’s desired attributes, via Rob Neyer’s statistic, as of this week, are: Chicago White Sox, St. Louis Cardinals.
The Cardinals score a 13.5 (lower is better) while the DBacks are close behind them with a 13.6.  After that the distance grows as the Braves chime in with a 21.8.  In the AL it is not close by any stretch of the imagibeaneneation.  The White Sox have a 9.7 and the second place Devil Rays (I’ll get it one of these days) have a 22.3.
Game Scores of the Week
Here are the best games of this week, via Bill James’s single-game evaluative statistic:

• Cliff Lee, April 18th: 8 IP, 2 H, 0 R, 0 ER, 1 BB, 8 K, W – 85 GSC
• Cliff Lee, April 13th: 8 IP, 2 H, 1 R, 1 ER, 0 BB, 8 K, W – 82 GSC
• Jake Peavy, April 17th: 8 IP, 4 H, 0 R, 0 ER, 3 BB, 11 K, ND – 82 GSC

Cliff Lee was insanely consistent and consistently insane this week… using insane as an adjective to describe the high quality of his games, not his mental state.
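If you want to check those scores yourself, here is a short Python sketch of Game Score as it is usually defined: start at 50, add a point per out, two per inning completed after the fourth, and one per strikeout, then subtract two per hit, four per earned run, two per unearned run, and one per walk.  That recipe is the standard definition as I understand it rather than something spelled out in this article, and the function name is my own.

```python
def game_score(outs, hits, runs, earned_runs, walks, strikeouts):
    """Bill James's Game Score for a starting pitcher's line."""
    innings_completed = outs // 3
    unearned_runs = runs - earned_runs
    return (50
            + outs                               # one point per out recorded
            + 2 * max(0, innings_completed - 4)  # innings completed after the 4th
            + strikeouts
            - 2 * hits
            - 4 * earned_runs
            - 2 * unearned_runs
            - walks)


# The three lines listed above (8 IP = 24 outs).
print(game_score(24, 2, 0, 0, 1, 8))   # Cliff Lee, April 18th -> 85
print(game_score(24, 2, 1, 1, 0, 8))   # Cliff Lee, April 13th -> 82
print(game_score(24, 4, 0, 0, 3, 11))  # Jake Peavy, April 17th -> 82
```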
Weekly Oddibe Award
The Oddibe Awards are given to the hitter with the slash stats (BA/OBP/SLG) closest to the league average and are named after Oddibe McDowell, whom RJ Anderson of Beyond the Box Score determined to have the career slash line closest to the league average from 1960-2006.  As of this week the league average slash line is .258/.329/.401.  Should the season end today, for whatever reason, the 2008 Oddibe Award would go to….. drumroll….. Orlando Hudson at .254/.314/.413.  This is the O-Dog’s second straight Oddibe Award.  He won the award in 2006.
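How you measure “closest” is a judgment call.  The sketch below assumes the simplest option, adding up the absolute gaps in BA, OBP, and SLG; that choice, the function names, and the second player in the example are hypothetical and only there to show the mechanics.

```python
LEAGUE_AVG = (0.258, 0.329, 0.401)  # BA/OBP/SLG as of this week


def slash_distance(line, league=LEAGUE_AVG):
    """Sum of absolute gaps between a slash line and the league average."""
    return sum(abs(player - avg) for player, avg in zip(line, league))


def oddibe_winner(lines, league=LEAGUE_AVG):
    """Name of the hitter whose slash line sits closest to league average."""
    return min(lines, key=lambda name: slash_distance(lines[name], league))


lines = {
    "Orlando Hudson": (0.254, 0.314, 0.413),        # from this week's numbers
    "Hypothetical Slugger": (0.310, 0.400, 0.560),  # made up for contrast
}
print(oddibe_winner(lines))
print(f"{slash_distance(lines['Orlando Hudson']):.3f}")
```

A different yardstick, such as squared gaps or weighting OBP more heavily, could crown a different winner in a close race.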
If the Season Ended Today
Speaking of whether or not the season ended today I think it will be interesting to look at the playoff matchups each week if it did end.  This way we can see which teams were in it all year as opposed to burning out or surging in.

• White Sox (AL Central) vs. Orioles (AL WC)
• Red Sox (AL East) vs. Angels (AL West)
• DBacks (NL West) vs. winner of tiebreaker between Mil/CHC (NL WC)
• Cardinals (NL Central) vs. Mets (NL East)

For the second straight week the DBacks play either the Brewers or Cubs while the Cardinals have a new opponent in the Marlins.  The Red Sox and Angels make their first appearances as the Orioles shift to Wild Card and the Royals/Athletics find themselves out looking in.
In Case You Missed It
Here are some great articles/notes from the past week.

• David Appelman at Fangraphs brought both myself and USS Mariner’s Dave Cameron in to write daily entries at the fantastic site.  I’ll have about two articles per day over there, either recapping games, or looking at interesting stats, or, like this article, reminiscing about Jose Lima’s career.
• Paul Nyman looks at the pitching mechanics of left-handed pitchers through the years.
• Tom Tango suggests some interesting rule-changes the MLB should potentially implement but will more than likely ignore.
• R.J. Anderson looks at Billy Beane’s desire to lock-up Huston Street and why this is such an un-Beanelike move.

## The Batting Hall of Current

A topic that never ceases to cause debate in the baseball writing community is who does or does not belong in the hall of fame.  Most of the debate revolves around whether or not a player has “the numbers.”  The majority of those chiming in intuitively know what makes up a worthy player but, because no common denominator exists, we revert to statistical milestones in order to base judgments.  While this is not wrong, by any means, there are also those who possess the mindset that a healthy combination of solid stats and contributions to the game is a better way to gauge induction-worthiness.
I personally feel the hall of fame should work more along the lines of an historical document that will serve to inform future generations which players from the past are really worth knowing about.  The fact of the matter is that there are many different ideas and definitions about what the Cooperstown hall is or should be; this plethora of ideas is one of the key reasons we so fervently debate.
In my favorite baseball book (as of now) Whatever Happened to the Hall of Fame? Bill James attempts to uncover what makes a hall of fame player as well as why Player A got in and Player B did not.  While he did not necessarily find a common denominator he did notice that a large percentage of those enshrined reached certain statistical milestones.  With that in mind he created a few tests to determine the likelihood of a player getting inducted.
The test I like to examine the most is the Hall of Fame Monitor.  For a full explanation click the link of the title, but it essentially weights different milestones and awards points as players positively distance themselves from said achievements.  Anyone with a score of 100+ is considered to have a shot; anyone with 130+ is considered a virtual shoo-in.  For instance, Ken Griffey Jr. currently has a 225 and Alex Rodriguez has a 316; based on what others currently inducted have done, these two players would be no-doubters if they retired today or tomorrow.
There are currently 35 batters with 100+ not yet eligible for induction.  I thought it might be fun to show them and get your thoughts on whether or not they are worthy, as well as why or why not.  If we can get enough of a response we’ll have an official fan ballot.  In just taking a cursory scan of these 35 I have a strong sense we will find some players with 130+ that are not necessarily worthy of induction based on the standards of some.  Here are the seven above 200:

• Barry Bonds, 350
• Alex Rodriguez, 316
• Ivan Rodriguez, 228
• Ken Griffey Jr, 225
• Derek Jeter, 221
• Mike Piazza, 205
• Sammy Sosa, 201

The bookends of that list bring up the topic of steroids and magical performance elixirs (what I imagine Sesame Street would call PED’s) but I am only mentioning them due to a quota of PED mentions in articles in need of being reached.  Here are the players above 150:

• Frank Thomas, 194
• Roberto Alomar, 193
• Manny Ramirez, 187
• Rafael Palmeiro, 178
• Craig Biggio, 172
• Ichiro Suzuki, 170
• Albert Pujols, 166
• Todd Helton, 162

It’s very interesting to see Albert Pujols and Vlad in there so highly due to them still having a nice portion of their careers left.  Here are the players above 130:

• Jeff Bagwell, 149
• Larry Walker, 147
• Gary Sheffield, 146
• Chipper Jones, 141
• Jim Thome, 139
• Bernie Williams, 133
• Edgar Martinez, 131

And here are the players between 100 and 130:

• Jeff Kent, 121
• Nomar Garciaparra, 120
• Juan Gonzalez, 120
• Barry Larkin, 118