An archive of StatSpeak from its days on MVN
April 12, 2009
Filed under chone, Marcel, PECOTA, ZiPS
Are any of these differences between systems statistically significant?
Also, using RMSE as the only measurement of the accuracy of projections is problematic. It may be the single best measure, but using RMSE alone assumes the aim of all these systems is to get the best RMSE, and that may not be the case. Very cautious projection systems will get better RMSEs than systems which try harder to project shifts in performance level, but the latter may be considered better systems if they fulfill their aims.
“Very cautious projection systems will get better RMSEs than systems which try harder to project shifts in performance level”
While that may be true (I don’t know), it didn’t make a big difference here. Marcel didn’t go out and win every category despite being the most conservative of all the systems. Unless I’m misinterpreting your use of “cautious.”
Greg, that’s a good question about statistical significance. I’m somewhat embarrassed to say that I can’t recall how to test whether the difference between two root mean square errors is statistically significant. Does anybody know the formula I should use for that? I’m having trouble even deriving what it would be.
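For what it’s worth, one way to sidestep a closed-form formula is a paired bootstrap: resample the players with replacement, recompute both systems’ RMSE on each resample, and look at the spread of the differences. The sketch below is just that idea under stated assumptions, not the method used in the article. The function names (`rmse`, `paired_bootstrap_rmse_diff`) and the arguments are hypothetical, and it assumes each system projected the same set of players.

```python
import numpy as np

def rmse(pred, actual):
    """Root mean square error between projections and actual results."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((pred - actual) ** 2)))

def paired_bootstrap_rmse_diff(pred_a, pred_b, actual, n_boot=10000, seed=0):
    """Return (observed RMSE_A - RMSE_B, lower, upper) with a 95% percentile
    interval. Assumes all three inputs are aligned player by player."""
    pred_a, pred_b, actual = (
        np.asarray(x, dtype=float) for x in (pred_a, pred_b, actual)
    )
    rng = np.random.default_rng(seed)
    n = len(actual)

    # Squared errors per player; resampling players keeps the pairing intact.
    sq_a = (pred_a - actual) ** 2
    sq_b = (pred_b - actual) ** 2

    # Each row of idx is one bootstrap sample of player indices.
    idx = rng.integers(0, n, size=(n_boot, n))
    diffs = np.sqrt(sq_a[idx].mean(axis=1)) - np.sqrt(sq_b[idx].mean(axis=1))

    lower, upper = np.percentile(diffs, [2.5, 97.5])
    observed = rmse(pred_a, actual) - rmse(pred_b, actual)
    return observed, float(lower), float(upper)
```

If the interval excludes zero, the gap between the two systems is unlikely to be resampling noise alone; if it straddles zero, the data can’t separate them.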
As far as RMSE, using correlations did not change the answers significantly, so I used RMSE. Over on Tom Tango’s blog, there is a discussion in the thread about this article and about what method to use, and the general consensus is RMSE. Some people think average absolute error might be the way to go, but they seem to think it’s better than correlations.
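To make the three candidate measures concrete, here is a minimal sketch that computes RMSE, average absolute error, and the Pearson correlation for one set of projections. The function name `accuracy_metrics` is hypothetical and this is not the code behind the article’s numbers.

```python
import numpy as np

def accuracy_metrics(pred, actual):
    """Return RMSE, mean absolute error, and Pearson correlation."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    err = pred - actual
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),  # penalizes big misses more
        "mae": float(np.mean(np.abs(err))),         # every miss weighted equally
        "corr": float(np.corrcoef(pred, actual)[0, 1]),  # ignores level and scale
    }
```

One difference worth keeping in mind: correlation does not change if a system runs uniformly high or low, while RMSE and average absolute error both penalize that kind of bias.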
As far as my personal thoughts on which to use, I see your general point about the goal being to find shifts in performance level and determine who is underrated or overrated. I see that more as a goal for fantasy baseball: you don’t want to know how good a guy is, you want to know if he’s better than other people think he is and whether you should draft him. For professional teams, the answer is a bit different. In that case, you’re trying to get a certain number of wins and approximate how many wins a player gets you, and that determines his worth. In that case, RMSE seems appropriate since it rewards correct valuations of players.