# The Deal With Derby Participants

July 23, 2008

In 2005, Bobby Abreu of the Phillies put on a showcase at the Home Run Derby, breaking the single-round record en route to a derby victory. All told, Abreu swatted 41 home runs into the Detroit stands that night. In the three and a half years since that fateful day, Abreu has hit just 47 home runs total. From 1999-2004, he averaged 24.3 HR/yr; combining the second half of 2005 and the first half of 2008 with all of 2006-07, he has averaged just 16 HR/yr since. While several factors could account for this drop, such as age, a decline in bat speed, or more grounders, many fans and analysts alike attributed his second-half dropoff to a “tired swing” caused by the derby.

Abreu’s story is not unique, either. Juan Rodriguez, in this Florida Sun-Sentinel article, discusses the highly popular idea that Home Run Derby participants will experience a second-half dropoff. According to his research, half of those studied experienced drops in their home run rates while the other half stayed stagnant or increased their rate. Still, whether voiced in jest or out of genuine concern, the idea of a derby-driven dropoff is a very popular one.

There have not been many studies on the subject either, meaning nothing has definitively debunked the myth or proven it true. I have seen a couple of studies, but they fell into the same trap I did in an initial look at this very subject: by comparing first half to second half directly, the results do not truly express what we intend. The problem stems from a selection bias, in that those named to the all-star team or Home Run Derby are likely having big first halves. Overachieving first halves, that is, meaning they are naturally due for second-half regression whether they participate in the derby or not.

To properly conduct this study using true talent levels, we need to compare the actual second-half production to the projected production based on the previous three years and the big first half of the year in question. This way, the real test is whether players fall short of their projections in the second half. If so, then yes, the idea of a decline following the derby carries some weight. If not, or if the results are ambiguous, then it is nothing more than a theory, as there would be nothing to suggest a decline.

Those supporting the idea could play the “tired swing” or “uppercut mentality” cards, but I say it is largely poppycock. I am, however, willing to be swayed by the numbers should they suggest such a result. My first step was compiling a list of all derby participants from 2000-2007 and entering their actual second-half performance into a spreadsheet. Using the in-season Marcel projector, I then *manually* entered the pertinent numbers into the required fields, which took forever (Hardball Times, you need to go prior to 2004!) but eventually yielded second-half projections for each of those players in each of those years.

Next, I tested the strength of the relationship by running simple correlations. As you will see below, everything other than batting average correlated quite strongly between the halves:

- AB/HR: .49
- BA: .28
- OBP: .65
- SLG: .58
- OPS: .61
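For anyone who wants to reproduce this step, a half-to-half correlation like the ones above is just a plain Pearson coefficient over paired player values. This is only a sketch: the five OPS pairs below are invented stand-ins for illustration, not the study’s actual data.

```python
# Sketch of the half-to-half correlation check.
# The player values here are made up for illustration only.

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical OPS values for five players, first half vs. second half.
first_half_ops  = [0.950, 0.910, 0.870, 1.010, 0.890]
second_half_ops = [0.900, 0.880, 0.850, 0.960, 0.870]

print(round(pearson_r(first_half_ops, second_half_ops), 3))
```

A coefficient near the article’s .61 for OPS would say that players who out-hit their peers in the first half tend to do so in the second half as well, which is why these stats are stable enough to project.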

Testing the correlations or running a linear regression could help in this study, but I decided to go with a paired samples t-test instead. A t-test compares the means of two sets of data and tells us whether the difference between the means is statistically significant. Keep in mind the sample size here is 64 players, so these results may not be definitive; I am really just testing whether the idea of a decline deserves any credence, i.e. whether it shows up in the numbers at all.

Anyway, back to t-tests: in them, a p-value of .05 or less suggests that the means are, in fact, statistically different. Higher than that, and the means are not really that different, regardless of whether one appears higher or lower than the other. Since we are testing for a decline here, the expectation is that the mean of the projected statistics will exceed the mean of the actual statistics. After running the t-test, I was surprised to find that all five measured stats (AB/HR, BA, OBP, SLG, and OPS) had a p-value below .05; in fact, they were all below .03, with batting average being the least significant. Since the means are all statistically different, here are the comparisons:

- Projected AB/HR: 17.8
- Actual AB/HR: 19.9

- Projected BA: .293
- Actual BA: .299

- Projected OBP: .382
- Actual OBP: .397

- Projected SLG: .546
- Actual SLG: .563

- Projected OPS: .928
- Actual OPS: .961

According to these results, the derby participants from 2000-2007 have actually outdone their projections in the slash line departments as well as in OPS. This suggests, at least within these numbers, that these players are not declining in overall second-half production relative to expectations. In fact, it might even point in the opposite direction: that the derby was merely a stepping stone toward a great year for the players in question. By outdoing their second-half projections and beating the expected regression, the slash lines and OPS do not suggest decline in the least. We may be picking nits over whether they suggest improvement, but definitely not decline.

However, and it’s a big however, the AB/HR did get worse in the actual data. On average, the projected player would hit a home run once every 17.8 at-bats, while the actual players did so only once every twenty or so. Essentially, the overall production of these players did not decline, but their rate of home runs did. From a psychological standpoint, Pizza Cutter noted that perhaps pitchers bear down more in the second half against these all-stars and derby participants to avoid surrendering home runs to them, even though pitchers tend to give up more flyballs in the second half.

Overall, I would like to extend this into a larger study, unless someone beats me to the punch, to see if the results hold up when we add, say, 12-15 years of derby data to the fold. Based on this study, however, it does appear that players experience a drop in their home run rates while simultaneously beating their projections in BA, OBP, SLG, and OPS. The key to remember is that we are comparing projected second halves to actual second halves, not making a straight-up comparison between both halves for each player; that would produce different results and would not be a fair test for decline.

Taking this to the next level would involve using a larger sample of derby participants and conducting another t-test to compare the means in several areas. Additionally, we would want an equal sample of non-derby participants with similar numbers, perhaps in the AB/HR department. We would conduct the same t-test for them and see if the derby actually has an effect; if it does, we would see the same lower AB/HR rates for the derby guys but different results for the non-derby guys. They would be our control group. For now, though, it is interesting to see that the derby participants declined only in that area. Essentially, we cannot automatically assume that the derby caused the AB/HR decline until we see these players stacked alongside similar players not in the derby as a control.

What is the difference between the projected AB/HR rate and the actual AB/HR for the leagues as a whole during this time?

I think including the first half of the All-Star season is problematic, though I understand the inclusion of the three previous years is meant to help correct for that. I wonder: if there were an Anti-HR Derby and we examined 64 players in it who mirrored the HR rates of these participants, say a smattering slightly overachieving, a good deal on target, and a fair number underachieving, would we find statistical significance for an increase in HR rate? Is that selection bias just too strong…

Harper… the first half of the All-Star/Derby season ISN’T included. The numbers you’re looking at are the Projections for the second half of the season and the Actual numbers for the second half.

I actually mentioned in the article how we fall into a selection bias trap by comparing the first and second halves straight up.

By comparing the projected second half to the actual second half performance we’re seeing if the decline is truly taking place; of course if we compare the players’ first and second halves straight up we’ll find more of a decline because they aren’t likely to sustain such big first halves.

Sorry for the confusion, but I’m talking about using the 1st half as the basis for projections. You do say: “we need to compare the actual second half production to the projected production based on the previous three years and the big first half of the year in question.”

Harper, you have to use the first half of that season in conjunction with the previous three years to get an accurate projection, no way around it. The first half isn’t problematic in that sense, though.

If we were merely comparing the first half to the second half then yes, there would be a problem but using it in conjunction with the three prior years is the correct way to make the projections and the correct way to conduct this study, so all the numbers are good with a selection bias taken into account.

If you go back a week and read my article ‘Projecting the Landscape’ I give a primer of sorts on the true talent level of a player and why the projections are important/how it helps us evaluate a player.

We have to use the first half numbers for a study like this because, without them, we wouldn’t really have an accurate projection for the second half. The first half in conjunction with the three prior years gives us the 2nd half projection, with which we can compare the 2nd half actual.

Overall, though, the first half numbers are only used to help, when combined with the last three years, generate the second half projections. It’s not as if a really spectacular first half will completely erase the three prior years of average play, if that’s your concern.

The true talent level is a tricky concept but it’s a really, really important one to learn when evaluating players.

My concern isn’t that it will erase three years of average play, just that it’ll nudge the projected vs. actual 2nd half comparison into significance. Yes, you have to use that first half; projections without it would of course be more flawed than projections with it, but there may be no way around the fact that the selection bias inherent in the sample will lead to significance. Simply saying that using the previous 3 years accounts for selection bias isn’t enough (unless there’s a study out somewhere showing that Marcel doesn’t overpredict those with strong first halves – which, by the way, how was that weighted? All I could find for the Marcels was predicting the next season, not the remainder of the current season).

Now this is a ton more work, but I think what would remove that worry would be to take a random sample in those years of non-HR-derby participants with similar AB/HR ratio changes from the previous 3 years to that first half and do the analysis with those. If that analysis shows significance, then it has to be seriously considered that the selection bias is driving the results.

(Of course, this is all moot if the projection itself is flawed in some way such that it over-projects AB/HR in general, if we were to look league-wide. On the flip side, if it tends to under-project, that makes your case that much stronger.)

The projection is the player’s true talent estimate.

I skimmed over the other comments so if that’s irrelevant just ignore it.

Harper, the point about the AB/HR is definitely a legit one, but as far as the in-season Marcel projection goes, we’re looking at the true talent estimate, as Dan said. I understand your concerns, but I don’t think they would necessarily affect this study too immensely.

By projecting in-season we’re weighting the current season at 1.0, last year at 0.79, the year before at 0.62, and then it decreases further as we go back. My recent article on Sabathia and Lee delved into the true talent level idea and how, even after ridiculous starts by both of them (one good, one bad), their true talent levels weren’t changed much; this was due to the small amount of data in the sample.

Now that we’re 100 games in (and these derby seasons were 89-96 games in at the all-star break), the results are a bit more meaningful: if someone with a projected .775 OPS entering the season is at .925 during the first half, he isn’t going to go .625 for the rest of the season to even it out, because his Marcel weights that large .925 sample heavily as the most recent data. He won’t be projected to go .925 the rest of the year, but his new true talent level would no longer be .775; it would be higher.
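As a rough illustration of that weighting scheme, here is a sketch of a recency-weighted true-talent blend. The year weights 1.0, 0.79, and 0.62 come from the comment above; the 0.49 for three years back is my assumption (the comment only says the weights keep decreasing), the at-bat and OPS figures are invented, and real in-season Marcels also regress toward league average, which this sketch omits.

```python
# Sketch of a recency-weighted "true talent" blend, Marcel-style.
# Year weights per the discussion above; 0.49 for year -3 is assumed.
# All OPS/at-bat figures are invented, and regression to the league
# mean (which real Marcels apply) is left out for simplicity.

def blended_talent(samples, weights):
    """Weight each (ops, at_bats) sample by recency and sample size."""
    num = sum(w * ops * ab for (ops, ab), w in zip(samples, weights))
    den = sum(w * ab for (ops, ab), w in zip(samples, weights))
    return num / den

# (OPS, at-bats): big current first half, then three ordinary seasons.
history = [(0.925, 300), (0.780, 600), (0.770, 580), (0.775, 610)]
weights = [1.00, 0.79, 0.62, 0.49]

# The hot .925 half nudges the estimate up, but nowhere near .925.
print(round(blended_talent(history, weights), 3))
```

The point of the sketch is the one made above: a spectacular half raises the talent estimate from the high .770s into the low .800s, rather than either erasing the prior seasons or being ignored.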

The projections do account for the selection bias because they measure the true talent level of the player, not the individual halves that may include crazy or overachieving performance. Marcel and other projection systems do not overpredict based on a strong first half; they re-calculate the true talent level every day, weighting each day slightly more than the last.

The control group in an extended study like this, though, as you said, would definitely be a group of players with similar AB/HR numbers who DIDN’T participate in the Derby; we would see what they did. Perhaps we’ll find that all guys in the same range lose some AB/HR regardless of the Derby.

The one point it seems I forgot to make here along those lines is that assuming the Derby has an effect wouldn’t necessarily be correct on its own; it could be any number of other things causing the drop in AB/HR or the increases in the other areas.

Until the in-season Marcels are harnessed to make it easy to copy years prior to 2004, though, I won’t have as much time from here on out as I did this past week to manually enter 3.5 years of data for 64+ guys.

That would be the definitive test, though, comparing the rates of derby participants to those with similar numbers NOT in the derby, in a large enough sample.

MGL, thanks, glad you enjoyed it. It took FOREVER to enter the data to get those projections. I’m actually working right now to get a database up that automates the process which would allow me to much more easily find this control group and make the comparisons. I’ll likely revisit this after the season so I can include 2008 in the sample as well.

I agree that because Marcels are not nearly perfect and there may in fact be a bias in them (e.g., maybe they under-project home run rate for ALL players who have high HR rates in their most recent sample), using a control group of similar players who were not in the Derby is essential. I am glad that Eric picked this up. When I started reading the article, that is the first thing that came to mind. Nice job, Eric!
