BABIP Projection, Batted Ball Types, and Interaction Terms
February 20, 2009 5 Comments
This is my first post
at StatSpeak, and I am excited to join the excellent StatSpeak crew. I am an Economics Ph.D. student and a
Phillies fan, which affects my ability to analyze baseball objectively in a
positive and negative way, respectively.
Most of my baseball research is empirical, despite the fact that my
dissertation is actually theoretical, but I will occasionally post general
economic analysis of baseball decision making.
While this post will be a continuation of some of my older research, I
will summarize some of my previous results and hyperlink a few things as
needed.
Voros McCracken introduced the concept of Batting Average on
Balls in Play (BABIP) nearly a decade
ago, when he suggested that pitchers may not vary in their ability to
control it. There are clear year-to-year
correlations with respect to a pitcher’s ability to control homeruns, walks,
and strikeouts, but the correlations were smaller or nonexistent for
BABIP. A slew of research followed, in
an attempt to determine exactly how much pitchers do control BABIP. What was agreed upon within the sabermetric
community was that you can learn the vast majority of what you need to know
about a pitcher by studying his ability to affect the Three True Outcomes (HR,
BB, K). The first thing I do when I
analyze a pitcher is to look at his Defense Independent Pitching Statistics
(DIPS). Hitters certainly do exhibit
stronger correlations with respect to these outcomes than they do with respect
to BABIP, but BABIP skill is clearly a real thing for hitters, and a large
portion of a hitter’s value derives from their ability to control BABIP. Last year, 70% of major league plate
appearances resulted in a ball in play.
Trying to determine how valuable a hitter will be requires some model of
predicting their BABIP, even if not explicit.
A
few years ago, Dave Studeman introduced a couple different ways to
approximate a hitter’s BABIP: Firstly, he suggested simply finding line drive
rate and adding .120. Later, he
suggested a regression using groundball rate, line drive rate, and strikeout
rate. This spawned a lot of research on
hitters’ BABIP, and this baseball offseason has seen a flurry of excellent
research on the topic.
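Studeman’s first rule of thumb is simple enough to write out. Here is a minimal sketch in Python, assuming line drive rate is expressed as a fraction of balls in play; the function name and the sample hitter are made up for illustration.

```python
def quick_expected_babip(ld_rate: float) -> float:
    """Rule of thumb described above: expected BABIP = line drive rate + .120."""
    return ld_rate + 0.120


# Hypothetical hitter with a 20% line drive rate (not a real player).
estimate = quick_expected_babip(0.20)
print(f"Quick expected BABIP: {estimate:.3f}")
```

The regression approaches discussed in the rest of this post try to improve on this by using more than one input.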
Perhaps the most widely read articles have been an article
by Chris Dutton and Peter Bendix in which they introduced a regression formula
illustrating a number of strong correlates with BABIP, and an article
by Derek Carty which declared that an updated version that Dutton had done of
that formula slightly beat Tom Tango’s Marcel projections’ BABIP estimates and
outperformed the model Studeman introduced a few years ago.
I have written a couple of articles as well over at www.thegoodphight.com, where I have
been posting all of my research until this article. My first
significant article on the topic, written in January, suggested that the
way to analyze BABIP is to dissect BABIP by batted ball type, and I ran a few
regressions and tested correlations to determine GBBABIP, FBBABIP, and LDBABIP. The data set that I have been using is rather
small–just the 224 hitters who managed 100 PA each year from 2005-2008–and I am
working on acquiring a larger data set (if you have any way to get me this,
please email swartzm@econ.upenn.edu
and let me know), but I have been able to get a significant amount of
information out of this small dataset.
In the first article, I developed regression models for each batted ball
type, and found the following list of explanatory variables for each regression:
Groundballs’ BABIP (GBBABIP):
–GBBABIP (positive)
–Infield hit rate (a more repeatable skill within GBBABIP, positive)
–Contact rate (as defined on fangraphs.com: the percentage of pitches a hitter swings at with which he makes contact, positive)
Flyballs’ BABIP (FBBABIP):
–Infield fly rate (negative)
Line drives’ BABIP (LDBABIP):
–Ln(HR/AB) (positive)
I also noted that GB% itself is positively correlated with
GBBABIP, and that FB% is negatively correlated with FBBABIP. This will be related to the subject of my
post today, as I will be introducing interaction terms into my regression.
In my second
article a few weeks ago, I developed a larger regression formula for BABIP
using this knowledge, and developed a prediction method for using one year of
data and another method for using three years of data to improve the existing
methods for predicting BABIP. Using the
121 hitters who were able to get 300 PA in each year from 2005-2008, I
developed a regression model that was able to achieve a .63 correlation with
actual BABIP using only a few regressors:
–GB% (line drive rate was insignificant in this regression as the other statistics proved to be more reliable)
–Natural Log of HR/AB
–GBBABIP
–IFFB%
–Outfield flyball BABIP
–Natural Log of Contact rate (again, as defined by fangraphs.com)
I also developed a model for determining expected BABIP
using one year of data. Using the 148
hitters in my dataset who were able to get 300 PA in both 2007 and 2008, I was
able to generate an expected BABIP for 2008 from 2007 data that had a .53
correlation with actual 2008 BABIP. The
regressors that I used included:
–LD%
–GB%
–Natural log of HR/AB
–IFFB%
–Outfield flyball BABIP
–Natural log of Contact% (as defined by fangraphs.com)
–Spray (as defined by Dutton and Bendix, the absolute value of LF% - RF% for hit location)
–Dummy variables for handedness
The
same model applied to 2005/2006 data was able to yield a correlation of .54,
but the model using 2006/2007 data was only able to yield a correlation of .38. I found this surprising and I am curious what
may be causing this–whether it is noise or something else. My personal belief is that some of this may
be teams responding to the massive amount of new information that became
available from 2005-2006 and adjusting their defenses accordingly. As league-wide BABIP was actually higher
(.303) in 2007 than 2005 (.295), 2006 (.301), or 2008 (.300), I am not sure if
this theory tells the whole story.
The
point that I am trying to make is that the very reason that BABIP was developed
in the first place was that it was not
defense independent. It was intended to
be segregated from the Three True Outcomes on the basis that defenses affected
it. It is true that some of this was a
way of saying, “There is some luck involved in whether you hit the ball at
people or between them, but not as much luck involved in whether you swing and
miss,” but some of it is that hitters hit the ball in certain places, at
certain trajectories, and baseball teams budget large sums of money to
determine where those places are and put fielders there. Then hitters train themselves to hit the ball
where the fielders are not. In fact,
this is the reason why Dutton and Bendix’s spray variable comes up as significant
in so many regressions–hitters who spray the ball across the field are able to
avoid fielders all clustering on one side of the field to defend against them. It also justifies the introduction of
interaction terms in the regressions, and you will see that these come out as
significant and improve BABIP prediction.
As
I was thinking about introducing interaction terms, I realized how appropriate
it was to include them. If I am going to
use batting average by batted ball type, I should acknowledge that each of
those terms has varying levels of usefulness depending on how frequently those
batted balls are hit.
Consider this BABIP equation (ignoring bunts):
BABIP = GB%*GBBABIP + FB%*FBBABIP + LD%*LDBABIP
One could also say:
BABIP = GB%*GBBABIP + FB%*(1 - IFFB%)*OFFBBABIP + LD%*LDBABIP
(where OFFBBABIP is Outfield Flyball BABIP.)
And therefore:
BABIP = GB%*GBBABIP + (Outfield flyball hits)/(Total Balls in Play) + LD%*LDBABIP
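To see that these three forms agree, here is a small Python check using made-up batted-ball counts for a single hitter-season. It assumes, as the second equation implicitly does, that infield flies never fall for hits and that IFFB% is measured as a share of all flyballs; the variable names are only illustrative.

```python
# Hypothetical batted-ball counts (bunts ignored, home runs excluded).
gb, gb_hits = 180, 43        # groundballs and groundball hits
ld, ld_hits = 80, 58         # line drives and line drive hits
if_fb = 20                   # infield flies, assumed here to always be outs
of_fb, of_fb_hits = 120, 18  # outfield flyballs and hits on them

bip = gb + ld + if_fb + of_fb   # total balls in play
fb = if_fb + of_fb              # all flyballs

gb_pct, ld_pct, fb_pct = gb / bip, ld / bip, fb / bip
iffb_pct = if_fb / fb           # infield flies as a share of flyballs
gb_babip = gb_hits / gb
ld_babip = ld_hits / ld
of_fb_babip = of_fb_hits / of_fb
of_hit_p = of_fb_hits / bip     # outfield flyball hits per ball in play

# BABIP computed directly from hits and balls in play.
babip_direct = (gb_hits + ld_hits + of_fb_hits) / bip

# Second form: flyball term split into (1 - IFFB%) * OFFBBABIP.
babip_by_type = (gb_pct * gb_babip
                 + fb_pct * (1 - iffb_pct) * of_fb_babip
                 + ld_pct * ld_babip)

# Third form: flyball term collapsed into outfield hits per ball in play.
babip_with_ofhitp = gb_pct * gb_babip + of_hit_p + ld_pct * ld_babip

print(babip_direct, babip_by_type, babip_with_ofhitp)
```

All three values match up to floating-point rounding, which is why products such as GB%*GBBABIP are natural candidates for interaction terms.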
As line drive rate itself does not have that strong a year-to-year correlation, I
did not even use it in the regression where I used multiple years of data. So I developed a regression for hitters using
2005-2007 data to predict 2008 BABIP using the following regressors:
–GB%
–GBBABIP
–GB HITS/TOTAL BALLS IN PLAY (GBHITP)
–IFFB%
–OF HITS/TOTAL BALLS IN PLAY (OFHITP)
–NATURAL LOG OF HR/AB
–NATURAL LOG OF CONTACT RATE
This regression had an R-squared of .4324, meaning that the actual BABIP for 2008
and the expected BABIP for 2008 had a correlation of .66, beating my previous
correlation of .63, and with a higher adjusted R-squared (to account for the
additional variables) as well.
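The step from R-squared to correlation, and the adjustment for the number of regressors, can be checked with a few lines of Python. This sketch uses the standard adjusted R-squared formula with the 121 hitters and 7 regressors of this model. Since the input R-squared is itself rounded, the result should only match the reported figure approximately.

```python
import math

r_squared = 0.4324   # reported R-squared
n_obs = 121          # hitters in the sample
n_regressors = 7     # regressors, not counting the constant

# In a linear regression with a constant, the correlation between actual
# and fitted values is the square root of R-squared.
correlation = math.sqrt(r_squared)

# Standard adjusted R-squared: penalizes each additional regressor.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)

print(f"correlation: {correlation:.2f}")
print(f"adjusted R-squared: {adj_r_squared:.4f}")
```

The correlation rounds to .66, and the adjusted R-squared lands within rounding of the .3973 reported in the regression output.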
Here is the output for that regression:

Source     SS         df    MS          #Obs       121
Model      0.049269   7     0.007038    F(7,113)   12.3
Residual   0.064663   113   0.000572    Prob>F     0
Total      0.113932   120   0.000949    R^2        0.4324
                                        Adj R^2    0.3973
                                        RMSE       0.02392

tbabip08       Coef.       Std.Err.   t       P>t     95% CI low   95% CI high
gbpavg         0.933799    0.332795   2.81    0.006   0.274473     1.593125
gbbabipavg     1.476632    0.576233   2.56    0.012   0.33501      2.618254
gbhitpavg      -2.88594    1.319034   -2.19   0.031   -5.49918     -0.27269
iffbpavg       -0.36193    0.072787   -4.97   0       -0.50613     -0.21772
ofhitpavg      0.717482    0.263895   2.72    0.008   0.194658     1.240306
loghraavg      0.013685    0.004989   2.74    0.007   0.003802     0.023569
logcontact~g   0.160608    0.044852   3.58    0.001   0.071748     0.249468
_cons          -0.08084    0.146061   -0.55   0.581   -0.37021     0.208531
Interestingly, GBHITPAVG, which is GB%*GBBABIP, has a negative coefficient, meaning that
hitters who historically had high groundball rates did not get as much of a
positive effect from historically high GBBABIPs as hitters with low
groundball rates. Perhaps groundball hitters give defenses more opportunity
to see where to play, and historical success erodes. Alternatively,
there may be another variable here that is causing an effect, and I’m just not
thinking of it or don’t have access to it.
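One way to read the interaction term is that the effect of GBBABIP on expected BABIP is no longer a single number. It becomes the GBBABIP coefficient plus the interaction coefficient times GB%. Here is a short Python sketch using the two coefficients from the regression output (1.476632 on GBBABIP, and 2.88594 in magnitude on GB%*GBBABIP, taken as negative as described above). The GB% values are hypothetical, chosen only to span a plausible range.

```python
B_GBBABIP = 1.476632   # coefficient on GBBABIP
B_GBHITP = -2.88594    # coefficient on GB% * GBBABIP (the interaction term)


def gbbabip_marginal_effect(gb_pct: float) -> float:
    """Change in expected BABIP per one-unit change in GBBABIP, at a given GB%."""
    return B_GBBABIP + B_GBHITP * gb_pct


# Hypothetical groundball rates, expressed as fractions of balls in play.
for gb_pct in (0.35, 0.45, 0.55):
    effect = gbbabip_marginal_effect(gb_pct)
    print(f"GB% = {gb_pct:.2f}: marginal effect of GBBABIP = {effect:.3f}")
```

The payoff from a high historical GBBABIP shrinks as groundball rate rises, which is the pattern described in the paragraph above. Given the wide confidence intervals on both coefficients, the exact values should not be taken too literally.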
What was also interesting was that OFFBBABIP (outfield flyball BABIP) was no longer significant, and I removed it from the regression, as outfield hits per total balls in play seemed more relevant. I do not have a great hypothesis for why this is, but I did find it interesting and worth noting. If nothing else, I guess it means that getting hits via outfield flyballs is a persistent skill, but having the balls that reach the outfield land for hits is not. That does make some intuitive sense to me. The general philosophy that many have with respect to BABIP for pitchers, that sometimes the ball is hit at people and sometimes it is hit between them, applies to hitters in some sense as well. It’s just a matter of getting the ball to the outfield, not necessarily about hitting the ball in the gaps or being able to dunk a flyball in front of an outfielder.
I also developed a regression for 2008 BABIP using 2007 data, with the following
regressors:
–LD%
–GB%
–OFFBBABIP
–OFHITS/TOTAL BALLS IN PLAY
–NATURAL LOG OF HOMERUN RATE
–NATURAL LOG OF CONTACT RATE
–SPRAY
–SWITCH HITTER DUMMY VARIABLE
This had an R-squared of .3090, meaning that actual and expected BABIP had a .56 correlation
instead of a .53 correlation as in my previous model, and it had an improved
adjusted R-squared as well.
Here is the output for that regression:

Source     SS         df    MS          #Obs       149
Model      0.041241   8     0.005155    F(8,140)   7.83
Residual   0.092228   140   0.000659    Prob>F     0
Total      0.133469   148   0.000902    R^2        0.309
                                        Adj R^2    0.2695
                                        RMSE       0.02567

tbabip08       Coef.       Std.Err.   t       P>t     95% CI low   95% CI high
ldp07          0.551839    0.12061    4.58    0       0.313386     0.790291
gbp07          0.436318    0.10191    4.28    0       0.234837     0.6378
loghra07       0.009712    0.004291   2.26    0.025   0.001229     0.018195
offbbabip07    -0.64351    0.263342   -2.44   0.016   -1.16415     -0.12287
ofhitp07       2.222809    0.716896   3.1     0.002   0.805467     3.64015
logcontact07   0.045724    0.038554   1.19    0.238   -0.0305      0.121947
spray07        -0.06485    0.031401   -2.07   0.041   -0.12693     -0.00277
shb            0.009259    0.005983   1.55    0.124   -0.00257     0.021087
_cons          0.041088    0.062641   0.66    0.513   -0.08276     0.164934
Here,
outfield flyball BABIP came up as negative and outfield hit percentage came up
positive. This is surprising, given the
insignificance of the term for the regression using more data, but perhaps it
does indicate some of the same effect–being able to get the ball to the
outfield in the air is a skill, but those hitters who got it to fall in more
were just lucky.
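For readers who want to see how a regression like this turns one year of data into an expected BABIP, here is a rough Python sketch that plugs a hitter’s 2007 inputs into the reported coefficients. Several things here are assumptions rather than part of the original analysis: the rates are taken to be fractions rather than percentages, homerun rate is taken to be HR/AB, the spray coefficient is taken as negative (the only sign consistent with its reported confidence interval), and the hitter’s inputs are invented for illustration.

```python
import math

# Coefficients from the one-year regression output (2007 data, 2008 BABIP).
COEFS = {
    "ld_pct": 0.551839,
    "gb_pct": 0.436318,
    "log_hr_per_ab": 0.009712,
    "of_fb_babip": -0.64351,
    "of_hit_p": 2.222809,
    "log_contact": 0.045724,
    "spray": -0.06485,         # sign assumed negative, per the confidence interval
    "switch_hitter": 0.009259,
}
CONSTANT = 0.041088


def expected_babip(inputs: dict) -> float:
    """Linear prediction: constant plus the sum of coefficient * regressor."""
    return CONSTANT + sum(COEFS[name] * inputs[name] for name in COEFS)


# Hypothetical hitter (not a real player); all rates are fractions.
fb_pct, iffb_pct, of_fb_babip = 0.35, 0.10, 0.16
hypothetical_hitter = {
    "ld_pct": 0.20,
    "gb_pct": 0.45,
    "log_hr_per_ab": math.log(0.04),
    "of_fb_babip": of_fb_babip,
    # Outfield flyball hits per ball in play, built from the identity
    # FB% * (1 - IFFB%) * OFFBBABIP.
    "of_hit_p": fb_pct * (1 - iffb_pct) * of_fb_babip,
    "log_contact": math.log(0.80),
    "spray": abs(0.40 - 0.30),  # |LF% - RF%|
    "switch_hitter": 0,
}

prediction = expected_babip(hypothetical_hitter)
print(f"Expected 2008 BABIP: {prediction:.3f}")
```

Because OFFBBABIP enters both on its own (negatively) and through outfield hits per ball in play (positively), raising it while holding the other inputs fixed moves the prediction in two directions at once. That is one way to see the offsetting effect discussed above.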
So
who are the hitters we would expect to have the best BABIPs in 2009?
1–Chipper Jones: .356
2–Joe Mauer: .356
3–Derek Jeter: .342
4–Magglio Ordonez: .341
5–Derrek Lee: .335
6–Dmitri Young: .335
7–Gary Matthews Jr.: .334
8–Orlando Hudson: .333
9–Jorge Posada: .333
10–Brian Roberts: .332
Also, here are the hitters whom the model expects to have the biggest increase in BABIP from 2008
to 2009:

Name               BABIP’08   E(BABIP’09)
Corey Patterson    .215       .278
Gary Matthews      .292       .334
Gary Sheffield     .237       .282
Jason Michaels     .258       .300
Jose Vidro         .243       .312
Luis Castillo      .267       .326
Robinson Cano      .283       .329
And due for a drop?

Name               BABIP’08   E(BABIP’09)
Dioner Navarro     .318       .264
Manny Ramirez      .370       .319
Miguel Olivo       .310       .257
Milton Bradley     .388       .316
Nick Punto         .335       .290
Reed Johnson       .360       .308
Ryan Doumit        .333       .282
It’s pretty clear that it helps to consider how often hitters hit each
type of batted ball when considering BABIP by batted ball
type. I believe that this article is a
movement in the right direction. I am
not done with hitters’ BABIP, and I am eager to hear criticisms and suggestions
for future research. I think that this
research is important and can help us improve BABIP projection and projection
for hitters in general. Please feel free
to leave questions or comments, or contact me at swartzm@econ.upenn.edu as well.
Just eyeballing your results, your regression predicts that a bunch of older players were extremely unlucky in 2008. Is it possible that as players age, their BABIP “skills” erode? Might age be a useful addition to your model?
Good point. Age comes up as highly significant when I throw it into the regression, but is probably due to the limits of regression analysis, rather than the limits of age’s effectiveness on lowering BABIP. Certainly, groundball BABIP skills do seem to erode with age, but the whole regression did not account for it.
I think that part of the problem is that the model will naturally project all hitters back towards the mean, hence the origin of the term ‘regression’ for ‘regression to the mean’. Guys like Sheffield, Castillo, and Vidro are probably losing their BABIP skills for real reasons but those aren’t showing up in a linear model.
Also, the model is based on a sample of hitters who got 300 PA in each year from 2005-08. Those hitters who did poorly enough in 2007 did not get 300 PA in 2008. So this regression is almost asking the question “if these players were healthy and MLB teams were to find them good enough to give 300 PA, what would their BABIP be this year?” If Gary Sheffield is actually healthy enough to get 300 PA this year, I suspect his BABIP will significantly improve.
Perhaps some kind of censored regression analysis might handle this better: something that accounts for those hitters who did not get 300 PA in 2008 but did in 2005-07. Maybe something like a Heckman two-step would work, and it might be better if the goal is to actually predict how these hitters will do from an MLB team’s perspective, rather than the conditional analysis one might use in building a fantasy baseball team. I don’t remember the Heckman two-step model well enough, but there are a couple of assumptions (normality?) that might make that not the right model. I’d love suggestions.
Anyway, good observation and appreciated. Thank you.
My stats knowledge has all but disappeared into the depths of time, but I recall that Heckman did indeed assume normality.
Yeah, I used to be a math major, but that was 30 years ago…but speaking of age, my research and some other work I have read show that speed starts declining at a very early age, 21 or maybe earlier (there is too little sample size at earlier ages to be sure). This shows up in the percentage of extra-base hits that go for triples, tr/(do+tr), in stolen base percentage, and in stolen base attempts. I have not yet tested rate of infield hits, but I would think that would be a major factor in lowering BABIP as a player ages.
True. It’s certainly probable that age will cause BABIP to fall over time, but I would guess that it does not show up as significant in this regression because its effect may be linear.
If the aging process accelerates BABIP decline, then it would show up as significant. In fact, the average expected BABIP in my sample was a bit below .299, despite the average BABIP in my sample being .301 for 2008.