MLE’s, how useful are they?

Do MLE’s (Major league equivalencies of minor league stats) predict future major league performance as well as major league stats? Last winter I compared the results for several projection systems (CHONE, ZIPS, Marcel) for all hitters with at least 300 at bats in 2006.
I took this list and determined which players had a substantial part of their projection coming from minor league stats. Generally, if a player had less than one full season in the majors, I marked him as an MLE player. While the CHONE and ZIPS projections use MLE’s, Marcel does not, so for a player with no MLB time, Marcel’s projection is simply the league average.
With all players who had at least 300 at bats in 2006, the correlation between projected OPS and actual OPS was between .61 and .62 for each system. Taking out the low-experience players, the correlations jump to the .64 to .65 level. But using only the MLE players (41 of them), the correlations are .38 (ZIPS) and .34 (CHONE/Marcel). MLE’s aren’t useless, but they do not predict as well as major league stats. Marcel’s correlation would be zero for players with no major league experience, since it would project only the league average for them, but it did pretty well using a half season or so of major league performance for this group of players.
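For anyone who wants to repeat this kind of comparison, here is a minimal sketch of the correlation calculation in Python. The player rows are made-up placeholders, not the 2006 data, and the names `pearson_r` and `group_r` are only illustrative. A constant projection (such as the league average for everyone) has no variance, so the sketch returns NaN there rather than a number.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant projection carries no information about who will
        # hit better, so r is undefined; report NaN instead of guessing.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)

# Placeholder rows (NOT real 2006 data): (projected OPS, actual OPS, is_mle_player)
players = [
    (0.780, 0.810, False),
    (0.720, 0.700, False),
    (0.850, 0.830, False),
    (0.690, 0.740, False),
    (0.750, 0.820, True),
    (0.700, 0.660, True),
    (0.730, 0.790, True),
]

def group_r(rows, mle_flag=None):
    """Correlation for all rows, or only rows whose MLE flag matches mle_flag."""
    subset = [r for r in rows if mle_flag is None or r[2] == mle_flag]
    return pearson_r([r[0] for r in subset], [r[1] for r in subset])

print("all players:  ", round(group_r(players), 3))
print("veterans only:", round(group_r(players, mle_flag=False), 3))
print("MLE players:  ", round(group_r(players, mle_flag=True), 3))
```

Splitting the sample by the MLE flag mirrors the three comparisons above: everyone, experienced players only, and MLE players only.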
It’s possible that the sample size was low, and this was just a bad year for predicting rookies. Also, there are selective sampling problems, in that rookies who play well get to keep playing, and those who do not are more likely than players with a strong MLB track record to get sent back to the minors. This doesn’t seem to be a huge problem though, as among this group the average actual OPS was only 3% higher than the average projected OPS.
The players who CHONE missed the most on:
1. Ryan Howard. Everyone missed on him; the only question is by how much. Howard’s MVP season was likely over his head.
2. Brian McCann. Same story. He was supposed to be good, especially for a catcher, but not THAT good.
3. Dan Uggla. MLE’s said he wasn’t very good at all. He is.
4. Andre Ethier. Another one the minor league stats didn’t think would be so good.
5. Ronny Cedeno. The first one on the list who played worse than his projection. If he wasn’t a shortstop, he would not have gotten so many at bats, and would not be part of the sample.
6. Ryan Zimmerman. Actually, ZIPS nailed this one; CHONE didn’t think he was quite ready. Marcel gave him a good projection simply based on 58 great at bats in September 2005.
7. Jeremy Hermida
8. Hanley Ramirez
The Marlins really screwed up the whole MLE process, didn’t they? Uggla in 2005 played in AA and hit .259/.302/.354. Ramirez in AA hit .271/.335/.385. They played well. Hermida hit (in as good a pitcher’s league as the other two) .293/.457/.518 at the same level. So of course Ramirez and Uggla are the ones who hit at the major league level.
Players that CHONE projected very well, within 10 points of actual OPS, were Chris Shelton, Jose Lopez, Shane Victorino, Conor Jackson, Matt Murton, Jose Bautista, Yuni Betancourt, and Ian Kinsler. I didn’t predict the hot start and later slump of Shelton, but at least got the final batting line down.
Minor league stats should not be ignored, but it appears they are not as predictive of major league stats. Perhaps we can improve projection systems by weighting them less than major league performances.


9 Responses to MLE’s, how useful are they?

  1. Guy says:

    Other than weighting most recent years more heavily, do you also have a regression component? And if so, are you regressing the MLE players to league mean, rookie/age mean, or something else? And position specific or generic?

  2. Guy says:

    I assume your 41 MLE players had fewer average PAs than the veteran sample, so that would explain a portion of the lower r (probably not a lot, though).
    Can you remind me how CHONE weights the MLE seasons? (or provide link)
    It may be impossible to say with such a small sample, but does there appear to be anything systemic about the way the projections miss? For example, you could divide the pool in two by: higher/lower projected OPS
    C/SS/2B/3B vs. OF/1B/DH
    Then see if one group noticeably overperforms projection while other underperforms. That may point the way toward improving the projections.

  3. Sean Smith says:

    I’d have to remind myself, I know I changed the weighting and added a 4th year of data for 2007.
    I think the weights for 2006 were 1/0.8/.06 or something close to that. MLE data was combined with MLB data, they were weighted equally.
    On BTF Dan S says that it was a bad year for MLE’s, the first time since 1995 that they didn’t do as well as major league data. I’ve only done projections from 2006-2007 so I’d have to go back (retroject?) to verify that.

  4. Rob McMillin says:

    “Not as predictive of minor league stats” as what?

  5. Rob McMillin says:

    Er, “not as predictive of major league stats” as what?

  6. Sean Smith says:

    Not as predictive as past major league stats.

  7. Sean Smith says:

    There is regression. It’s based partly on speed, weight, and highest level played, so Hanley Ramirez would have part of his regression to MLB for his 1 at bat in 2005, while Uggla would be regressed partly to the mean of AA players.
    It is not position specific.

  8. Guy says:

    Sean, one thought on MLEs: If I understand the methodology, they are, in part, already a prediction. That is, the MLE for a 19-yr-old player in AA ball, in a specific park and league, doesn’t really tell us what that player would do in MLB if he were forced to play in the majors that season. Rather, it tells us what this player WILL do when he’s in the majors, usually several years later. Presumably, he isn’t already that good. On the other hand, a AAA MLE is probably much closer to being a true “equivalent.”
    I’m not sure exactly what this should mean for projection systems. But at least for lower levels and younger players, I wouldn’t think you could treat MLEs the same as actual ML data. Probably they need special regressing, based on player’s age at time of MLE and/or current age.

  9. Sean Smith says:

    The way I do MLE’s is look at the change in AAA to MLB, AA to AAA, A to AA, etc., and chain the results. Without any claims to the accuracy of the results, the attempt is to state what an A player would have hit in the majors that season. There is no attempt at prediction at this stage.
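The chaining idea in comment 9 can be sketched roughly as follows. This is only an illustration: the step factors are hypothetical placeholders, not CHONE’s actual numbers, and applying a single multiplier straight to OPS is a simplifying assumption (a real MLE would likely adjust individual stat lines and account for park and league). The names `STEP_FACTORS`, `chained_factor` and `mle_ops` are invented for this example.

```python
# Hypothetical ratios of a hitter's OPS at the higher level to his OPS
# at the lower level. Placeholder values, NOT CHONE's real factors.
STEP_FACTORS = {
    ("A", "AA"): 0.92,
    ("AA", "AAA"): 0.95,
    ("AAA", "MLB"): 0.90,
}

# Levels ordered from lowest to highest.
LADDER = ["A", "AA", "AAA", "MLB"]

def chained_factor(level):
    """Multiply every step factor from `level` up to MLB."""
    start = LADDER.index(level)
    factor = 1.0
    for lower, upper in zip(LADDER[start:], LADDER[start + 1:]):
        factor *= STEP_FACTORS[(lower, upper)]
    return factor

def mle_ops(minor_ops, level):
    """Rough same-season major league equivalent of a minor league OPS."""
    return minor_ops * chained_factor(level)

# An .850 OPS in AA is translated through AA->AAA and then AAA->MLB.
print(round(mle_ops(0.850, "AA"), 3))
```

Because each step is estimated separately and then multiplied, any error in one step carries through to every level below it, which is one reason translations from the lower levels deserve more caution.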
