Ah, so we meet again…

A trivia question: Over the last 25ish years (since 1981), what batter/pitcher combo has faced each other the most?  As you might expect, these are two gentlemen who played more than 20 years each (and both debuted in the same year), both spent their entire careers in the same league, but were never teammates.  The names are at the end, but if this is any hint, they faced each other 154 times over their careers.

So after the 35th time, who really had the advantage?  Is it the pitcher who now “knows how to get the batter out?”  After all, he’s had the experience to see what a batter will swing at and what he won’t.  Then again, maybe the batter has the advantage.  He’s had the experience to see what the pitcher throws and can figure out his patterns.  Indeed, there’s always talk that when a player is traded to/signs in a new league, he will have a period of adjustment, owing to the fact that he likely hasn’t faced many of the batters/pitchers that he will now be facing.  I’ve actually heard it all four ways, that a batter will benefit from/suffer for his first foray into a new league (because the pitchers haven’t seen him/he hasn’t seen the pitchers before) and that a pitcher will benefit from/suffer for his foray into a new league (same logic).  What gives?

Well, let’s look at what really happens.  I took all the Retrosheet play-by-play files from 1980 to 2007 and put them into one big file.  (My computer currently hates me.)  I sorted them into chronological order and then numbered the different confrontations between batter and pitcher.  I dumped everyone who appeared in the 1980 season from the data set.  Johnny Bench faced Tom Seaver in 1980, but certainly, that wasn’t the first time that they’d seen each other (although my data set would have considered them to be just introduced).  In order to maintain the integrity of the sample, they had to go.
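For anyone who wants to try the numbering step at home, here is a minimal sketch of the idea, not the actual code used for this post. It assumes each plate appearance has already been boiled down to a (date, batter, pitcher) tuple with an ISO-style date string; the function names and that layout are made up for illustration and are not Retrosheet's own format.

```python
from collections import defaultdict

def drop_1980_players(plate_appearances):
    """Remove every PA involving a batter or pitcher who appeared in 1980.

    plate_appearances: list of (date, batter_id, pitcher_id) tuples, where
    date is a "YYYY-MM-DD" string (an assumed layout, not Retrosheet's).
    """
    seen_1980 = set()
    for date, batter, pitcher in plate_appearances:
        if date.startswith("1980"):
            seen_1980.update((batter, pitcher))
    return [
        pa for pa in plate_appearances
        if pa[1] not in seen_1980 and pa[2] not in seen_1980
    ]

def number_meetings(plate_appearances):
    """Tag each PA with the career meeting number for that batter/pitcher pair.

    Returns (date, batter_id, pitcher_id, meeting_number) tuples in date order.
    The sort is stable, so PAs on the same date keep their input order.
    """
    counts = defaultdict(int)
    numbered = []
    for date, batter, pitcher in sorted(plate_appearances, key=lambda pa: pa[0]):
        counts[(batter, pitcher)] += 1
        numbered.append((date, batter, pitcher, counts[(batter, pitcher)]))
    return numbered
```

Dropping the 1980 players before numbering matters: otherwise a pair like Bench and Seaver would get a "meeting #1" that was nothing of the sort.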

Then I coded for whether the plate appearance ended in the batter being on base (even if that meant an ROE).  My first thought was to run a simple OBP broken down by the number of times that the two had faced each other.  But then in order to get to a point where a player had been around long enough to face a pitcher 20 times, he was probably a different class of hitter than the guy who only got marginally introduced to a couple of pitchers.  Same logic goes for pitchers who stick around.  So, I had to calculate what the expected OBP of the plate appearances in question might be.  I calculated both the batter’s yearly OBP and the pitcher’s yearly OBP given up (plus the league OBP for the year).  To make sure I wasn’t getting any .500 OBPs from someone going 1-for-2, the pitcher and batter each had to have logged 250 PAs in the year in question.  This had the nice side effect of getting rid of pitchers hitting.

You can calculate what the expected OBP of a particular batter/pitcher matchup is by converting OBPs into odds ratios (OBP / (1 - OBP)), and then using the formula:

(batter OR / lg OR) * (pitcher OR / lg OR) = (expected OR / lg OR) 

Once you have the expectation, you can turn it back into an OBP rather easily (OR / (OR + 1)). 
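Here is the same calculation as a short sketch, in case the formula is easier to follow as code. The function names are mine, picked for illustration; the arithmetic is just the odds ratio formula above solved for the expected odds (batter odds times pitcher odds, divided by league odds).

```python
def to_odds(obp):
    """Convert an OBP (a probability) into odds: OBP / (1 - OBP)."""
    return obp / (1.0 - obp)

def to_obp(odds):
    """Convert odds back into an OBP: OR / (OR + 1)."""
    return odds / (odds + 1.0)

def expected_obp(batter_obp, pitcher_obp, league_obp):
    """Expected OBP for one batter/pitcher matchup via the odds ratio method.

    From (batter OR / lg OR) * (pitcher OR / lg OR) = (expected OR / lg OR),
    the expected odds are batter OR * pitcher OR / lg OR.
    """
    league_odds = to_odds(league_obp)
    expected_odds = to_odds(batter_obp) * to_odds(pitcher_obp) / league_odds
    return to_obp(expected_odds)

# A .380 batter against a pitcher who allows a .350 OBP, in a .330 league.
print(round(expected_obp(0.380, 0.350, 0.330), 3))
```

A quick sanity check: if the pitcher is exactly league average, the league terms cancel and the expected OBP is just the batter's own OBP. In the example, an above-average hitter facing a hittable pitcher comes out at roughly .401, a bit better than his season line.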

Then, it was simply a matter of watching what happened when I compared what would have been expected to what actually happened.  I fumbled around with some binary logit models to see what happened, and they generally showed that as a pitcher and batter faced each other more often, the advantage slowly worked its way in the batter’s favor, but I think that the graph shows the effect a little better.  On this graph, numbers above zero mean that the pitcher has the edge.  Below, the batter has the edge.
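A rough sketch of that comparison, assuming each plate appearance has been reduced to a (meeting number, reached base, expected OBP) triple. The function name and row layout are invented for illustration, and taking expected minus actual is my reading of the sign convention, chosen so that positive numbers mean the pitcher has the edge.

```python
from collections import defaultdict

def pitcher_edge_by_meeting(rows):
    """Average expected-minus-actual OBP, in points, for each meeting number.

    rows: iterable of (meeting_number, reached_base, expected_obp), where
    reached_base is True/False. Positive values mean batters reached base
    less often than expected, i.e. the pitcher had the edge.
    """
    reached = defaultdict(int)         # times the batter got on base
    expected_sum = defaultdict(float)  # sum of expected OBPs
    pa_count = defaultdict(int)        # plate appearances
    for meeting, on_base, exp_obp in rows:
        reached[meeting] += 1 if on_base else 0
        expected_sum[meeting] += exp_obp
        pa_count[meeting] += 1
    return {
        m: (expected_sum[m] / pa_count[m] - reached[m] / pa_count[m]) * 1000
        for m in sorted(pa_count)
    }
```

Keeping the PA count for each meeting number around is worthwhile, because the number of pairs thins out quickly at the higher meeting numbers.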


[Figure: pitcher batter learning.JPG]

In the first meeting between batter and pitcher, the pitcher had a 7 point advantage in OBP.  By the time of the second meeting, that advantage was almost entirely gone (down to 1.5 points), and then by the third meeting, the outcome was most likely to be even-up to expectations.  Following that, you can see that the graph jumps around a little, but the general trend-line is downward until about 35 PAs.  After that, the graph just gets really unstable.  My interpretation is that this means we have something of a real effect, although not a very coherent one, and the fluctuations may have to do with selective sampling and a decreasing number of pitcher-batter pairs that have met 30-something times.

There’s certainly a trend line to be had, and it certainly looks like it points toward the batter having the edge as he faces a pitcher more often, and by meeting #35, the magnitude is 13 points worth of OBP.  At first, the pitcher has the element of surprise, but the pitcher must strategize on how to remove the batter from the batter’s box with a new strategy each time, while the batter himself must simply react to what’s thrown at him.  At first, the batter has nothing to go on, but if he can learn the pattern (and it looks like he does) he can react better.

So for a short period of time, an exotic pitcher does have the advantage.  But not for long.  That advantage wears off pretty much the second time through the lineup.

Trivia answer: Greg Maddux has faced Barry Bonds* 154 times over their careers.  Second place on the list, incidentally, also belongs to Greg Maddux, this time paired with Craig Biggio (140).


11 Responses to Ah, so we meet again…

  1. dan says:

    I’ve been waiting a long time for someone to write something like this. In the past, Tom Tango has advocated (or maybe just given the “okay”) to using this shortcut for odds-ratios (using OBP):
    BatterOBP + PitcherOBP – Lg.OBP. How well do you think that works for individual matchups?

  2. Pizza Cutter says:

    That shortcut is probably better than nothing, but I don’t see any mathematic derivation for it. It’s possible that there is, but if the full way is available, I prefer that.

  3. Shane says:

    What are Bonds’ and Biggio’s records against Maddux?

  4. tangotiger says:

    My recommended method is what Pizza is doing (Odds Ratio). As Pizza said, in the “something better than nothing” category, you can use the differential method.
    Since we know that there is an in-game advantage for the pitcher the first time meeting, it is not a surprise that the first time they meet ever happens to also match the first time meeting in any game.
    So, I’d like to see a control for the meeting within game, and for career. That is, if you look at the 23rd time meeting in a career, and it was the 1st, 2nd or 3rd time meeting in a particular game, how does the graph look? Is the function almost entirely an in-game effect?
    Finally, sample size almost certainly accounts for the up-and-down at the end of your graph. Perhaps you can show what the +/- 2 SD range is, based on the binomial, for the number of PAs you have at each level.

  5. Xeifrank says:

    Very interesting study! I wonder if any of the noise you are getting could be attributed to portions of the OBP that are out of the pitchers control (non FIP), namely singles, doubles and triples. If instead of OBP, you looked at a FIP stat (choose your own), or narrowed it down to HR rate, SO rate and less interesting BB rate what you would get as results.
    vr, Xei

  6. MGL says:

    I wrote this on the Inside the Book Blog:
    I agree with Tango. You really have to find some way to control for the in-game effects. As he says, most of the 1st time meetings are also the first time in a game, most of the second-time meetings are the second time in a game, etc.
    I guess you don’t HAVE to separate the two, but I would like to know how much any effect is in-game and how much is overall. And we already KNOW that there is a large in-game effect.
    One could do the thing by performance during an entire game versus how many prior meetings. One benefit of that is that it would mimic a “real life situation” and answer a “real life” question, e.g., “If a batter has never faced a pitcher before, does he have a disadvantage in the game?”
    I would also prefer to see wOBA used, as it is possible that an effect like this shows up in the slugging and not so much in the OBA (although it would likely show up in any hitting measure).
    I also like the idea of using an FIP or DIPS kind of stat or at least showing K and BB rates (preferably all the components, actually) in addition to any other stat.

  7. Pizza Cutter says:

    That’s probably a large portion of the noise. And there’s a lot of noise. Then again, there’s probably just some random noise in there too. I cut the graph off at 67 PA because that’s where it falls below 500 pitcher/batter pairs. I’ll monkey around with some of the other issues later on.

  8. Phil Birnbaum says:

    Could the trend line have to do with batters aging more gracefully than pitchers? Batters gain more walks as they age, if I remember correctly, but pitchers just get worse.
    I’m not sure if you used this-season-only stats for the pitcher and hitter — if you did, it would make my hypothesis wrong.

  9. Pizza Cutter says:

    Phil, I used “this season only.” So that’s not a factor. But then again, you might be on to something more generally. A 35 year old batter facing a pitcher for the first time might approach him in a different way than a 25 year old. The 35 year old may not have seen this particular guy before, but he has seen a few other pitchers in his day.

  10. Pingback: The Difficulty of Scouting Yu Darvish | Yu Darvish

  11. Pingback: Quora
