He always gets off to a hot start

That’s what she said.

Every year, I hear respectable radio broadcasters, fantasy analysts, and the rest of the baseball commentariat mumble things like “he’s got a track record as a fast starter” or “he’s never any good in April.”  As the season and the weather both heat up, we’re treated to declarations that the man at bat warms up with the summer heat.  And, of course, in September, we hear about the guys who always really turn it on to that extra… don’t say it… don’t say it… clu… clutch time of the year.

But is that true?  There’s no doubt that sometimes players have good months, sometimes better than might be expected from them, given their previous track record.  A month is a small sample size, and there’s always a lot of random variation in small sample sizes.  And there do seem to be players who are “always” (in the sportscaster sense of the word) hot or cold to start the season.  Proof by example!

It makes sense on an intuitive level to believe that some players would have better months than others.  Humans are a species who work on hormonal cycles, some of them daily, some of them monthly, some of them yearly.  There’s the phenomenon of seasonal affective disorder, in which some people see their mood particularly affected in the winter, usually for lack of sunlight.  Besides, some people are just better once they’ve gotten into the, ahem, swing of things.  Right?

Boys and girls, repeat after me.  Not everything that makes sense is true (and not everything that is true makes sense).  That’s why we run experiments.  So, I decided to put this old saw to the test.  Are there really hot and cold starters in baseball?  Because starting pitchers make maybe six starts in a month, and because they are rather mercurial from outing to outing, I stuck to hitters.

I took the Retrosheet logs from 2004-2008 (5 years’ worth).  I took a player’s on-base percentage for each month of the year (anything that happened in March was counted as April, and anything in October as September) and then his OBP for the year.  In order to qualify, the player had to have at least 70 PA in the month in question and 400 for the year.

I took the difference between the batter’s OBP for the month and his OBP for the year.  The simple reason is that if Albert Pujols starts out in April with a .400 OBP, that’s not a hot streak.  That’s just business as usual.  (Methodological note: I realize that Pujols’s April PA’s will count in his yearly OBP totals.  To be a little more mathematically pure, I should compare April to the rest of the year, minus April, but I’m not in an exacting mood today.)
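
For anyone who wants to play along at home, here is a rough sketch of that bookkeeping in Python.  It is not the code I actually ran.  The row layout (player, year, month, PA, times on base, OBP denominator) and the function names are made up for illustration, and it assumes someone has already boiled the Retrosheet logs down to one row of counts per player-month.

```python
from collections import defaultdict

MIN_MONTH_PA = 70   # qualifying threshold for the month
MIN_YEAR_PA = 400   # qualifying threshold for the season


def fold_month(month):
    """Count March as April (4) and October as September (9)."""
    return min(max(month, 4), 9)


def obp(on_base, chances):
    """On-base percentage: times on base over OBP chances (AB + BB + HBP + SF)."""
    return on_base / chances if chances else float("nan")


def monthly_deviations(rows):
    """Return {(player, year, month): monthly OBP minus season OBP}.

    rows is an iterable of (player, year, month, pa, on_base, chances)
    tuples; this layout is an assumption, not a Retrosheet format.
    Only player-months meeting both PA thresholds are kept.
    """
    month_totals = defaultdict(lambda: [0, 0, 0])
    year_totals = defaultdict(lambda: [0, 0, 0])
    for player, year, month, pa, on_base, chances in rows:
        m = fold_month(month)
        for totals in (month_totals[(player, year, m)],
                       year_totals[(player, year)]):
            totals[0] += pa
            totals[1] += on_base
            totals[2] += chances

    deviations = {}
    for (player, year, m), (pa, ob, ch) in month_totals.items():
        year_pa, year_ob, year_ch = year_totals[(player, year)]
        if pa >= MIN_MONTH_PA and year_pa >= MIN_YEAR_PA:
            deviations[(player, year, m)] = obp(ob, ch) - obp(year_ob, year_ch)
    return deviations
```

A positive number means the hitter beat his own season line that month, and a negative number means he fell short of it.  Note that the month’s own PA are still inside the season total here, just as in the write-up above.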

Now, over the five years, if a player really does have something about him that makes him hot in April/July/September, he should be consistently out-hitting his seasonal number across all five years.  Or perhaps if he’s a cold guy, he might under-shoot it consistently.  Either way, if guys really are hot and cold starters or finishers or just hot or cold July guys, then there should be some consistency across time in how much or how little they outperform their end-of-season stats during a specific month.

To check for that, we turn to our friend, AR(1) intraclass correlation.  (Short version: consider it like a year-to-year correlation.  But imagine a correlation that instead of having two data points could actually take on five data points at once.  That’s ICC.)  In the five years in the data set, I looked at the deviations… I can’t believe I’m blundering into my second bad unplanned pun of the article… across five Aprils.  Overall, was there consistency from year to year?

No.  In fact, the ICC for April deviations was .01 (read that like a regular correlation.)  In other words, non-existent.  Less than “clutch hitting” usually registers.  That means that the deviations across the years are almost completely random.  I ran the same analyses for the other 5 months of the regular season.  The highest any of them got to was .11 (May).  So, overall, there doesn’t seem to be any reliable skill in being a creature of one particular month.
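
If you want a feel for what an ICC is doing under the hood, here is a minimal sketch.  Fair warning: this is the plain one-way ANOVA flavor of ICC, not the AR(1) version I used, so it would not reproduce the numbers above exactly.  It also assumes a tidy table where every player has a deviation for the same number of seasons, which real data will not hand you for free.  The function name is mine.

```python
def icc_oneway(table):
    """One-way random-effects ICC for a balanced table.

    table is a list of rows, one per player, each holding the same
    number k of yearly deviations for a single calendar month.
    Returns (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW
    are the between-player and within-player mean squares.
    """
    n = len(table)
    if n < 2:
        raise ValueError("need at least two players")
    k = len(table[0])
    if k < 2 or any(len(row) != k for row in table):
        raise ValueError("every player needs the same number (>= 2) of seasons")

    grand_mean = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]

    # Spread of player averages around the overall average.
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    # Spread of each player's seasons around his own average.
    ss_within = sum((x - m) ** 2
                    for row, m in zip(table, row_means)
                    for x in row)

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

If every player posted the same April deviation year after year, this would come out at 1.  If a player’s Aprils bounce around as much as any random set of Aprils would, it sits near zero, which is roughly where the real numbers landed.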

So next time you hear that Larry is a “notoriously slow starter,” smack that person upside the head.  Sure, he might have a track record of starting slowly, but that has no predictive power over what he’ll do in the coming year.


4 Responses to He always gets off to a hot start

  1. dan says:

    I always find the monthly splits for Albert Pujols to be mildly amusing. Even look at how little variability there is with the batting averages. The guy’s a robot.

  2. Brandon H says:

    Now possibly I am off the mark here, but simply because most players do not adhere to monthly trends, is it fair to assert that no one does?
    Looking at every player proves that this theory does not hold true for every player, which one could simply guess to be true.
    This is similar to the issue with BABIP. Typically a pitcher is in a range of 290-300 and is considered unlucky if they are two SD’s above 300 and lucky if the opposite is true. However, we are seeing now that specific pitchers (left handers with big curves, or giants that pitch from a much higher arm slot) are not as vulnerable to this trend.
    That being said, the ‘streaky’ hitter debate is clearly not a rule of thumb, but I doubt a sports broadcaster or fantasy pundit would assert that every player has some sort of clear and obvious split.

  3. Pizza Cutter says:

    I suppose retro-actively, we might look back and say that Jones has always had “good Aprils” and can find a way to quantify that. That’s nice. The question that more interests me is “So what?” Does that mean this coming April will be good? I suppose I could identify guys who have a track record (say 3 straight of the same month with above/below expected performance) and see if they continue that trend. It would simply be a matter of chi-squared analysis.

  4. Brandon H says:

    I think ‘so what’ is not the correct question to ask. Rather, ‘why’?
    While it is simplistic, maybe it has something to do with the weather (media promotes this idea more than it is truly accurate). Other issues could be the added days off in April. Maybe pitchers are not at ‘full strength’ and we are more likely to see mediocre relievers.
    It then wouldn’t surprise me if teams/analysts looked further into these ‘trends’ to figure out if there is merit behind them. While this is probably far fetched, imagine a team having a September ‘fire man’ – as a player who is traditionally hot in September.
