# On throwing to first, Part II

In the first part of this series, I presented a list of the runners that drew the most throws per time that they were on first.  I don’t think it’s much of a coincidence that the list contained a bunch of known speedsters and stolen-base leaders.  In fact, far from being a deterrent to running, throws to first seem to go to those who actually end up running.  But, throwing to first cut stolen base success rates by about 11%, so the pitcher must be doing something right.
This brings up a small problem: there’s a selection bias in who gets a throw. Both the average number of throws a runner sees per time on first and the percentage of situations in which he draws a throw are correlated with two things: the percentage of opportunities in which he made any attempt at second (excluding 3-2 counts with 2 outs, but including attempts on which the pitch was fouled off), and his Bill James speed score. The correlations were all around .75. This isn’t surprising. Even looking at the list from my first article (or watching a few games), the bias is pretty apparent. Pitchers are worried about the fast runners, as they should be. The fast runners are the ones who attempt to steal bases most often (again, a significant correlation of .75). So, from here on out, when discussing outcomes, it is important to control for the runner’s speed.
So, let’s take a look at whether throwing to first predicts stolen base success rates when controlling for speed. Again using a binary logit regression predicting stolen base success (vs. being caught stealing), I entered speed scores and a dummy-coded variable (that is, 0 for no, 1 for yes) for whether a throw was made. The model had a Nagelkerke r-squared value of .076 (not overwhelming), although both predictors were significant. For those unfamiliar with binary logit, it’s not set up to tell us directly by how much the throw over reduced the probability of success (binary outcomes don’t follow the normal distribution, and the function itself is an exponential one that predicts an odds ratio), but it does give us a little bit of information.
There’s general agreement from the long history of run-expectancy research that a would-be base-stealer needs to be successful about 70% of the time to break even. Given the results of this equation, we can calculate how fast a runner would have to be (at least in this model) to have a 70% success rate both when a throw is made and when it is not. The results: when a throw is not made, a runner with a speed score of 4.18 (think Andruw Jones) is expected to make it 70% of the time. When a throw is made, the score needed rises to 6.75 (think Felipe Lopez/Orlando Cabrera). Holding a runner really does slow him down quite a bit, most likely by keeping him a step or two closer to first base and making him a bit more hesitant in trying to get a jump. (If you’d like to know all the gritty details of how I calculated these particular numbers, e-mail me.)
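
For readers who want to see the mechanics, here is a minimal sketch of how a break-even speed score falls out of a logit model with a speed term and a throw dummy. The coefficients `B0`, `B_SPEED`, and `B_THROW` are hypothetical placeholders: the fitted values are not given in this article, and these were picked only so that the results land near the two break-even scores quoted above. The function names are likewise just for illustration, and the sketch assumes a simple main-effects model, which may not be exactly what was fitted.

```python
import math

# HYPOTHETICAL coefficients for: log-odds = B0 + B_SPEED*speed + B_THROW*throw.
# These are NOT the article's fitted values (which are not published here);
# they are placeholders chosen to land near the reported break-even scores.
B0 = -0.34
B_SPEED = 0.284
B_THROW = -0.73


def success_probability(speed, throw):
    """Predicted chance of a successful steal; throw is 0 (no) or 1 (yes)."""
    log_odds = B0 + B_SPEED * speed + B_THROW * throw
    return 1.0 / (1.0 + math.exp(-log_odds))


def break_even_speed(target, throw):
    """Speed score at which the predicted success rate equals `target`."""
    target_log_odds = math.log(target / (1.0 - target))
    return (target_log_odds - B0 - B_THROW * throw) / B_SPEED


no_throw = break_even_speed(0.70, throw=0)
with_throw = break_even_speed(0.70, throw=1)
print(f"break-even speed score, no throw: {no_throw:.2f}")
print(f"break-even speed score, throw:    {with_throw:.2f}")
```

The point of the sketch is the inversion step: the model is linear in log-odds, so a target success rate is converted to log-odds and then solved for speed.
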
What about other “stolen” bases? Does keeping a runner close to first mean that he’s less likely to take third on a single or home on a double? I isolated all of the cases in this data set in which the batter singled. Because a runner can go on contact with two outs, I further split the data to look at two-out singles vs. singles with zero or one out. The dependent variable was whether the runner ended up on third (success, at least from the runner’s perspective) or whether he landed on second or was thrown out (failure). Again, I used binary logit regression with speed scores and whether a throw to first had been made during the at-bat as my predictors. Turns out that with fewer than two outs, a runner who has drawn a throw is more likely to take third, even controlling for his speed score. With two outs, there is no effect. Perhaps with fewer than two outs the runner is in a situation where he is more likely to take chances, whether by going to third on a single or by stealing a base, and the pitcher compensates by being more likely to throw over.
I repeated the same analyses for situations in which the batter doubled (to see whether the runner “stole” home). With fewer than two outs, throwing over had no effect, but with two outs it did: throwing to first made it less likely that the runner would be able to make it home. Strange and contradictory findings, these. Because of their inconsistent nature, I decided not to pursue these additional “stolen” base situations further. As the primary purpose of the throw to first is to keep the runner from stealing, I focused exclusively on that. (I suppose I could/should do the rest, but not today.)
Because of the nature of binary logit regression, to go any further, we’re going to have to standardize our runner. That is, we’re going to pick a speed score for a hypothetical runner and use it for the rest of our calculations. I looked at the speed scores for all batters in 2006 with at least 100 AB. Because the fast runners are the ones drawing the most throws, it makes sense to pick a fast runner, so I looked for the score at the 90th percentile (that is, better than 90% of all speed scores). Among those players with 100 AB or more last year, that score was roughly 6.56, which is somewhere between Eric Byrnes and Derek Jeter.
Using this new speed score (or putting Derek Jeter on first, if you will) and the regression equations generated above, I calculated the probabilities of a successful stolen base attempt with (69.31%) and without (82.10%) a throw to first.  Then, I multiplied each of those probabilities by the run expectancy of the possible outcomes (weighted for number of outs in the inning and whether there was a runner on third or not), using the run expectancy chart for 2006.  For each attempted steal (where the ball was not fouled off), run expectancy was reduced by a weighted average of 0.06 runs when there had been a throw made to first.  As a whole, runners from the 85th to the 95th percentile (the decile around Jeter) recorded a stolen base attempt (either SB or CS, as pitches fouled off will not change run expectancy) in 25.2% of the situations in which they were on first with second base open.  So, the weighted reduction in run expectancy by throwing over to first at least once (whether or not the runner goes) is 0.0166 runs per situation.
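
As a rough illustration of the run-expectancy step, the sketch below works through a single base-out state. The two success probabilities come from the paragraph above; the run-expectancy values `RE_SAFE` and `RE_CAUGHT` are hypothetical placeholders, not entries from the 2006 chart, and the function name is only for illustration. The 0.06 figure above is a weighted average over all the relevant states, which this sketch does not attempt to reproduce.

```python
def expected_runs_after_attempt(p_success, re_safe, re_caught):
    """Run expectancy after a steal attempt, weighted by the chance of success."""
    return p_success * re_safe + (1.0 - p_success) * re_caught


# Success probabilities quoted in the article for the standardized runner.
P_NO_THROW = 0.8210
P_THROW = 0.6931

# HYPOTHETICAL run expectancies for one state (runner on first, one out).
# Placeholders only; these are not taken from the 2006 run expectancy chart.
RE_SAFE = 0.70    # runner now on second, still one out
RE_CAUGHT = 0.10  # bases empty, two outs

runs_without_throw = expected_runs_after_attempt(P_NO_THROW, RE_SAFE, RE_CAUGHT)
runs_with_throw = expected_runs_after_attempt(P_THROW, RE_SAFE, RE_CAUGHT)
cost_of_throw = runs_without_throw - runs_with_throw

print(f"expected-run reduction in this state: {cost_of_throw:.3f}")
```

Within any one state the reduction is just the drop in success probability times the gap between the safe and caught outcomes, so states with a bigger gap weigh more heavily in the average.
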
There is a modest correlation between speed score and the rate at which players find themselves on first in this situation (.451), so those with wheels really are more likely to be in this position. (Generally, they hit near the top of the lineup and are singles hitters.) So, how many runs does throwing to first save over the course of a season, in terms of quieting down the opposition’s base-stealing abilities? In 2006, there were 34,942 cases of a runner on first with second base open, covering 43,995 plate appearances. Let’s assume that they are equally distributed among the 30 teams, meaning that each team faced about 1,164 of them. We also know that a throw was made in only 25.6% of all possible situations, or roughly 298 per team. Those throws are most likely to go to fast runners, such as Jeter. Assuming that the estimate of the effect for Jeter is good enough to serve as an estimate for everyone who might draw a throw (I know, it’s somewhat dubious, but it’s close enough for government work), teams, with their current habits, are shaving an average of 4.95 runs each off their runs-allowed total for the year, just accounting for the effects on stolen base success rates.
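
The per-team arithmetic in this paragraph can be checked with a few lines. All of the inputs are the figures quoted in the text; only the variable names are made up for the sketch, and the even split across teams is the same simplifying assumption made above.

```python
# Figures quoted in the article (2006 season).
SITUATIONS_2006 = 34_942        # runner on first, second base open
TEAMS = 30
THROW_RATE = 0.256              # share of situations with at least one throw
RUNS_SAVED_PER_THROW_SITUATION = 0.0166

# Assumes situations are spread evenly across teams, as the article does.
situations_per_team = SITUATIONS_2006 / TEAMS
throw_situations_per_team = situations_per_team * THROW_RATE
runs_saved_per_team = throw_situations_per_team * RUNS_SAVED_PER_THROW_SITUATION

print(f"situations per team:       {situations_per_team:.0f}")
print(f"throw situations per team: {throw_situations_per_team:.0f}")
print(f"runs saved per team:       {runs_saved_per_team:.2f}")
```
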
Then there’s the matter of actually picking a runner off with the throw, and the flip side of throwing the ball away down the right field line. In 2006, there were 100 throws to first on which an error was recorded which allowed the runner(s) to advance. Over the course of the season, they increased run expectancy by a league-wide grand total of 28.92 runs. There were also 276 runners picked off in 2006, and these reduced run expectancy by an aggregate of 126.22 runs. So, the net decrease in run expectancy from pickoffs (and errors leading to advances) is 97.3 runs, or about 3.24 runs per team over the year.
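
A quick tally of the pickoff and error figures, added to the 4.95 runs from the stolen-base effect, looks like this. The inputs are the numbers quoted in the text; the variable names are only for illustration.

```python
# League-wide 2006 figures quoted in the article.
PICKOFF_RUNS_SAVED = 126.22   # run expectancy removed by 276 pickoffs
ERROR_RUNS_ADDED = 28.92      # run expectancy added by 100 throwing errors
TEAMS = 30

# Per-team stolen-base effect from the earlier paragraph.
STEAL_EFFECT_PER_TEAM = 4.95

net_league_runs = PICKOFF_RUNS_SAVED - ERROR_RUNS_ADDED
net_runs_per_team = net_league_runs / TEAMS
total_runs_per_team = STEAL_EFFECT_PER_TEAM + net_runs_per_team

print(f"net league-wide runs saved: {net_league_runs:.2f}")
print(f"pickoff/error runs per team: {net_runs_per_team:.2f}")
print(f"total runs saved per team:   {total_runs_per_team:.2f}")
```
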
Overall, throwing to first to hold the runners saves a team a net total of eight runs (8.19 to be exact) over the course of a season by controlling the running game.  In Part III, I’ll explore whether throwing to first has any effect on the batter/pitcher matchup.