Gameday Quirks

I was “watching” the second and final game of the Yankees-Red Sox series on Tuesday night on Gameday, not paying much attention. I hate that I have to watch games on Gameday, by the way. I don’t have a TV in my dorm because I just can’t justify spending $50 a month to watch it. Yes, it costs $50 a month, you read that right. I know of only two or three people who have TVs in their rooms, and I envy them. But back to the point…

I went back to look at the pitch-by-pitch for some of Josh Beckett’s at-bats, since I wasn’t really watching when they happened live, and noticed something odd. Look at this (click the link, I still can’t get the file upload to show after about ten tries), which is a partial screenshot of the Gameday interface.

See anything weird? What about pitch 3 is different than pitch 4? I have no idea what “BRK” means, other than it stands for “break,”* but that number is the same on both, and so is the PFX column. The only difference is the speed of the pitch, which is slower on the 4-seam fastball (pitch 3) than it is on the cutter (pitch 4).

*Is it weird that I know all of the advanced, analytical stuff for PITCHf/x but don’t know the more basic stuff that they show on Gameday? Isn’t Gameday supposed to make it easier to understand? I have NO IDEA what “break” means and it frustrates the hell out of me.**

**Also, I usually find it weird when writers use the Pozterisk, but it was the only way I could write that aside and still keep this a semi-coherent post.

So why is the faster pitch called the cutter? The two pitches have exactly the same movement, so if anything, the slower one should be called the cutter. Am I the only one who is bothered by this? Hello?

Allow me to quote from a THT article by another StatSpeak alum, Mike Fast (whose articles I miss dearly):

Basically, he uses a neural net algorithm that takes the location,
velocity and acceleration data from PITCHf/x as inputs, weights the
various inputs with a hidden layer, and outputs the confidence that the
sampled pitch matches each pitch type. A 1.5x multiplier is applied to
the confidence for pitches that are known to be part of the pitcher’s
repertoire, and the pitch with the highest output confidence is
reported in Gameday.
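
To make the last step of that quote concrete, here is a minimal Python sketch of the repertoire boost. Only the 1.5x multiplier and the "highest confidence wins" rule come from the quote; the function name, the pitch labels, and the confidence numbers are all made up for illustration, and this is not the actual Gameday code.

```python
def pick_pitch_type(confidences, repertoire, boost=1.5):
    """Return the label with the highest confidence after boosting known pitches.

    confidences: dict of pitch label -> raw confidence (hypothetical values)
    repertoire:  set of labels the pitcher is known to throw
    boost:       multiplier for repertoire pitches (1.5 per the quote)
    """
    adjusted = {
        label: (score * boost if label in repertoire else score)
        for label, score in confidences.items()
    }
    # The label with the largest adjusted confidence is the one reported.
    return max(adjusted, key=adjusted.get)


# Made-up raw confidences for a single pitch.
raw = {"four-seam": 0.40, "cutter": 0.35, "slider": 0.25}

print(pick_pitch_type(raw, repertoire=set()))
print(pick_pitch_type(raw, repertoire={"cutter", "slider"}))
```

With these invented numbers the raw winner is the four-seamer, but once the cutter is flagged as part of the repertoire its boosted confidence (0.35 × 1.5 = 0.525) beats the four-seamer's 0.40, so the label flips. That is how two nearly identical pitches could come out with different names.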

The fact that he says “basically” and “neural net algorithm” in the same sentence amuses me. Maybe I’m missing something, but this seems way more complicated than it has to be. And given the cutter vs. fastball example above, I don’t know how much more accurate it is than a decision tree. A decision tree, in the tiniest of nutshells, asks, “Does this pitch behave like a fastball? If not, does it behave like a slider? If not, does it…” and so on. Maybe it’s just me, but I think the mistakes in pitch identification would be reduced using this method. Yes, the difference between a cutter and a fastball is very small, but they are two distinct pitches (if you don’t believe me, click this link to see the difference). As the quote says, the Gameday people pretty much know the pitcher’s repertoire in advance, and if they don’t, they can get the general idea from FanGraphs or, in the case of minor leaguers, from a scouting report.
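
Here is roughly what that chain of questions could look like as a hand-written decision tree in Python. Every threshold, parameter name, and pitch label below is an assumption invented for illustration, not a tuned or validated rule, and the sketch makes no claim about how accurate such a tree would be on real PITCHf/x data.

```python
def classify_pitch(speed_mph, horizontal_break_in, vertical_break_in, fastball_mph):
    """Walk a hand-written decision tree; all cutoffs are illustrative guesses.

    speed_mph:            velocity of this pitch
    horizontal_break_in:  horizontal movement in inches
    vertical_break_in:    vertical movement in inches
    fastball_mph:         the pitcher's typical four-seam velocity
    """
    speed_gap = fastball_mph - speed_mph

    # Does this pitch behave like a fastball?
    if speed_gap < 2 and vertical_break_in > 7:
        return "four-seam"
    # If not, does it behave like a cutter (a bit slower, little sideways run)?
    if speed_gap < 6 and abs(horizontal_break_in) < 3:
        return "cutter"
    # If not, does it behave like a slider?
    if speed_gap < 10 and vertical_break_in < 4:
        return "slider"
    # If not, does it drop like a curveball?
    if vertical_break_in < 0:
        return "curveball"
    # Otherwise fall through to the last option in this toy repertoire.
    return "changeup"


print(classify_pitch(94, -5, 10, fastball_mph=94))
print(classify_pitch(90, 1, 6, fastball_mph=94))
print(classify_pitch(78, 5, -6, fastball_mph=94))
```

Note that in this sketch speed relative to the pitcher's own fastball is checked first, so two pitches with identical movement would still be split by velocity, with the slower one landing in the cutter branch.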

By now you may have figured out that this post won’t have much of a conclusion. I just wanted to point out the strange outputs that Gameday occasionally gives, with some musings on the subject. Feel free to post any other Gameday oddities you’ve noticed or qualms you may have in the comments section.

3 Responses to Gameday Quirks

  1. Mike Fast says:

    Dan…lots of legitimate points there. I have a few random thoughts/responses.
    1. I’m pretty sure both of those pitches by Beckett were cutters. Looking at the data at BrooksBaseball.net, Beckett threw 8 cutters, and the Gameday algorithm only picked up on 4 of them.
    2. Gameday ought to throw away that BRK number. It’s not particularly helpful. As for an explanation of what it is, try this one on for size:
    http://baseball-fever.com/showthread.php?t=77550&page=2
    3. I’m not a proponent of using a neural net to classify pitches; however, although classifying about 80% of pitches correctly is fairly easy, getting to 90% accuracy or better with real-time classification is a lot tougher.

  2. Dan Novick says:

    Ah, ok. So now that I know what break means, I think I find it even less useful.
    If you can explain, what would you do to classify pitches better in real-time?

  3. Mike Fast says:

    Classifying pitches in real time and getting good accuracy is not trivial. One can get to around 95% by applying some clever corrections for arm angle and fastball speed and using a Bayesian algorithm rather than a neural net, but I think higher than that depends upon giving the system some foreknowledge of each pitcher’s repertoire.
    The downside of doing that is that you may not do very well at detecting when a pitcher changes the pitches in his repertoire, which happens enough to be a problem. Plus, it’s a bit of work to get accurate repertoire information for ~400 pitchers in the first place, not to mention keeping up to date with rookies.
