How do I become a Sabermetrician?
June 30, 2008
Occasionally, I get e-mail from someone who reads StatSpeak or some of the other writings that I sprinkle into the blogosphere, and my favorite always goes something like “I’ve read a bunch of stuff around and I’m interested in learning how to do my own Sabermetric research. Can you help me?” Yes, I can. I’m a therapist by training, and do you ever need help!
So you wanna be a Sabermetrician, eh? Well, first you should know that there’s no school for Sabermetrics (well, there is a class out there…) We’re all self-taught in one way or another, mostly in the form of guys using skills from their day jobs to study baseball. It’s part of the charm of the field. Most of us have respectable day jobs and we use this just to pass the time. Just about anyone can get themselves a free blog and start posting their work. That’s how I started out. So if you want to be a Sabermetrician, then by the power vested in me by no one in particular and the state of confusion, I now pronounce you an official Sabermetrician. The certificate’s in the mail.
Now of course, you don’t want to be just any Sabermetrician. You want to be one of those cool guys that actually gets hired by an MLB team someday. You want to publish a book. You want to be the next big thing. I suppose I’m not any of those things either, but I can give you a few tips on how to get started.
- I can’t stress this enough. There are far too many junk stats out there. A junk stat goes something like this: “I just came up with the formula HR x 15 + RBI x 7 + HBP x 4.5 + SLG x 90 based on how important I thought each one was.” I’ve heard that particular reasoning far too many times. There are formulae that look like that, but they are developed using a very specific process. I’ve seen several cases of someone posting one of those, being ignored, and then disappearing, never to be heard from again. I’m guessing that they were frustrated that no one saw their brilliance. Don’t start with a junk stat and be frustrated. There is good work to be done and you might be the one who can do it. Read on.
- Spend a few months reading Sabermetric work. There are plenty of good sites out there. We all link to each other. Read their stuff. Read the comments. Read Baseball Between the Numbers. (When you get advanced enough, read The Book: Playing the Percentages in Baseball.) Go over to the Baseball Fever boards and read the discussions that go on over there. Participate.
- One of the things that can frustrate newcomers is the thought that their brilliant ideas that came to them in the middle of the night… have already been studied by someone else. We’ve all done studies on the illusion of clutch and why RBIs are a bad stat (and bad grammar). They’ve been studied to death… unless you can take a little more nuanced look at things. And to do that, you’ll need a good understanding of what research has come before you. Probably the biggest mistake that people make is to try to jump into Sabermetrics with both feet, not really knowing what they’re doing. Slowly, my friend. Slowly.
- You’ve probably already read Moneyball, which should give you a broader idea of what’s going on. We are not in the business of making baseball more “pure” or more enjoyable or more special or more cosmic or more whatever. (Do watch Field of Dreams, because it’s a good movie… but understand that’s not what we do here.) Sabermetrics is the scientific method applied to the goal of winning a baseball game/championship. I’ll type that again. Sabermetrics is the scientific method applied to the goal of winning a baseball game/championship. May I recommend that you have some background in the scientific method before you begin. I’m not saying that you need to be a Ph.D. level physicist, but simply that you need to understand how science works. Yes, we spend a lot of time debunking some sacred conventional wisdom. Be prepared to have some of your basic beliefs about baseball challenged.
- It’s good to be a fan. In fact, I recommend that you watch/listen to/go to as many baseball games as you can. It’s OK to have a favorite team and to occasionally be irrational in evaluating them, because you love them. Ask me about growing up with the Cleveland Indians some time. But, with that said, understand that science is a dispassionate process. We go into a situation not looking to confirm that so-and-so is the best player in baseball, but we come up with a reasonable definition of things and let the numbers fall where they may. Sometimes that means realizing that the numbers don’t bear out what you used to think as a kid (or as a fan now). That’s actually a lot harder to come to terms with than you might imagine. If you can get past that, you’ll make a fine Sabermetrician.
- Are you in college? (Surprise! A lot of the guys who travel in these circles are in/barely out of college themselves!) Sign up for a class in statistics. Trust me on this one. Even if you’re an English major, it’ll come in handy both in Sabermetrics and in the rest of life. Plus, it’ll teach you a little bit of how to use some of the computer programs that Sabermetricians like to use. And computers make life so much easier.
- Draw from your background. I’m a psychologist by training. Most of the questions that intrigue me center around “Why did he do that?” That’s what I’ve been trained to look for in life. You may think that your chosen field has nothing to do with baseball, but you’re wrong. Sure, there are a lot of guys who are physics/math majors who look at algorithms for figuring out what a player will do next year, and that’s fine. I’m personally waiting for a good Sabermetric sociologist to come along to figure out why it is that baseball teams and society in general are so poor at assigning value to baseball players.
- You do not need a doctorate in math. Sure, the more analytical techniques you know, the more complicated questions you can ask. And you do have to know some statistical/analytical techniques, but some of the biggest discoveries in Sabermetrics involve little more than knowing what a correlation is (e.g., DIPS) and are simple to the point of elegance. The math can be taught. The real work in Sabermetrics is perceptual and creative. It’s in seeing the game in a slightly new way and understanding how that insight can be measured and then tested. The rest is just an engineering problem.
- Keep a running idea list of things that you want to accomplish and ideas that you’ve had. Any time I have an idea pop into my head, I put it into my special file. When I need a project, I go back and pick one that sounds fun. Even if you don’t know exactly how you’d do it, if an interesting question or idea occurs to you, write it down.
- You’ll notice that I haven’t specifically pointed you to any how-to guides. The reason is that you’ll come across those in the process of reading through things. And you’ll also pick up, by osmosis, the statistical tricks that others use. Don’t focus so much on the actual technical details of how Pitch f/x works or what’s available from Retrosheet. If you really get restless, download some Retrosheet files and play around with them, but you’ll probably learn naturally just by doing some reading.
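If you’re wondering what “little more than knowing what a correlation is” might look like in practice, here is a minimal sketch in Python. It is only an illustration, not anyone’s official method: the function name pearson_r is my own, and the two lists are made-up numbers standing in for the same stat measured for the same players in back-to-back seasons. The rough idea is to ask whether a stat in one year tells you much about that stat the next year.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of the deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of the squared deviations for each sequence
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only. These are NOT real player data.
year1 = [0.285, 0.310, 0.295, 0.301, 0.279, 0.320]
year2 = [0.302, 0.288, 0.311, 0.290, 0.297, 0.299]

r = pearson_r(year1, year2)
print(f"year-to-year correlation: {r:.3f}")
```

A value near 1 or -1 would suggest the stat carries over from one season to the next, while a value near 0 would suggest it mostly doesn’t. With real data you would want far more than six players before trusting the answer.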