Different Factors For Different Folks Part I

Longtime Pirates announcer Bob Prince used to say, especially after a Willie Stargell blast, “That would have been out of any park, including Yellowstone.”

I think it is a safe assumption that players who hit a higher percentage of homers also, on average, hit their homers farther. (Perhaps in Part II I will correlate HR% with HR distance from hittrackeronline.com.) Anecdotally, in 2007 Prince Fielder’s 50 homers had an average distance of 410 feet. None of Jimmy Rollins’ 30 HRs went that far. Which park he hits his fly balls in shouldn’t matter as much to Fielder as it does to Rollins, since most of Fielder’s will go yard anyway. The question is, can different types of hitters have different homerun park factors?

More than 25 years ago, when Tom Tippett was publishing his “Pursue the Pennant” table-top game, there were two types of HRs that could be rolled. A straight HR didn’t depend on the ballpark; it was gone. The conditional HR required another dice roll and a lookup on the ballpark chart to see if it cleared the fences. If you had someone like Mike Schmidt or Dale Murphy, playing in the Astrodome didn’t matter as much. I realized that proper proportioning of these two types would be able to recreate a player’s home/road homerun ratio.

It’s been on my to-do list for a while to run park factors on selected subsets of players. I still haven’t done it; that will be Part II. This past weekend I did download the 2008 version of the Japanese league statistics. Of course, I wanted to go ahead and run projections on all the batters playing in the Land of the Rising Sun. My current method of calculating translation factors is to find which players played both in the major leagues and in the league to be evaluated. I had nine Japanese players who had come to the U.S., and a handful with enough major league experience who had headed the other way. Despite a relatively small sample size, the projections for the Japanese-born players matched well to their stats when playing here. However, when calculating MLEs for those who went to Japan, the results were almost always well under their performance here. I needed to fix this before I proceeded with projecting everyone in the league.

The major league average of homeruns per ball hit fair (ab-so+sf) the past few years has been .039. The only Japanese-born player to exceed this in his U.S. career is Hideki Matsui, at .043. Then we have Kenji Johjima and Tsuyoshi Shinjo at .028, Kosuke Fukudome at .022, Kazuo Matsui at .017, and Akinori Iwamura, Ichiro and So Taguchi at .015. As a group, these players are well below the major league average in HR%, and they had a Japan to MLB HR factor of 2.00. That is, 30 homers in Japan would translate to 15 in MLB.
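The arithmetic in the paragraph above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample batting line are invented here, and the only things taken from the post are the denominator (ab-so+sf) and the 2.00 factor.

```python
def hr_rate(ab, so, sf, hr):
    """Home runs per ball hit fair, using the post's denominator (AB - SO + SF)."""
    balls_hit_fair = ab - so + sf
    if balls_hit_fair <= 0:
        raise ValueError("no balls hit fair")
    return hr / balls_hit_fair


def translate_hr(japan_hr, japan_to_mlb_factor):
    """Convert a Japanese HR total to an MLB equivalent by dividing by the factor."""
    return japan_hr / japan_to_mlb_factor


# Hypothetical batting line, made up for illustration (not a real player).
rate = hr_rate(ab=500, so=100, sf=5, hr=16)   # 16 HR in 405 balls hit fair
mlb_hr = translate_hr(30, 2.00)               # the post's example: 30 in Japan

print(f"HR per ball hit fair: {rate:.3f}")
print(f"MLB-equivalent HR: {mlb_hr:.0f}")
```

With a factor of 2.00 the second call reproduces the 30-to-15 example from the text.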

This time, instead of using the limited number of players who had enough MLB playing time, I decided to use MLEs calculated from the non-Japanese leagues and compare those to the same players’ stats in Japan. Now I had a total of 106 players in the study, from the mid-1990s to 2008. I separated the batters into five grades of HR%, with A being greater than .065, B from .050 to .065, C from .030 to .050, D from .016 to .030, and E from .000 to .016. One bucket had unadjusted career statistics in the Central and Pacific leagues in Japan, the other had U.S. MLEs, and each player’s lines were weighted by the smaller of his two plate appearance totals. Then the two buckets were summed and grouped by HR grade, with the factors being the Japanese totals divided by the U.S. totals.
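Here is a rough Python sketch of the grading and weighting just described. It is one reading of the method, not the actual code behind the study: the field names are hypothetical, the two sample players are invented, the choice of which grade gets a value exactly on a boundary is an assumption, and "weighted by the smaller of the plate appearances" is interpreted as scaling both of a player's lines down to the smaller PA total.

```python
from collections import defaultdict


def hr_grade(hr_rate):
    """Grade a batter by HR per ball hit fair, using the post's cutoffs.

    Assumption: a value exactly on a boundary goes to the higher grade,
    except .065, which the post puts in B ("A being greater than .065").
    """
    if hr_rate > 0.065:
        return "A"
    if hr_rate >= 0.050:
        return "B"
    if hr_rate >= 0.030:
        return "C"
    if hr_rate >= 0.016:
        return "D"
    return "E"


def hr_factors_by_grade(players):
    """Sum PA-matched HR totals per grade and divide Japan by U.S."""
    japan = defaultdict(float)
    usa = defaultdict(float)
    for p in players:
        weight = min(p["jp_pa"], p["us_pa"])      # smaller of the two PA totals
        grade = hr_grade(p["us_hr_rate"])         # graded on the U.S. MLE rate
        japan[grade] += p["jp_hr"] * weight / p["jp_pa"]
        usa[grade] += p["us_hr"] * weight / p["us_pa"]
    return {g: japan[g] / usa[g] for g in japan if usa[g] > 0}


# Two invented players, for illustration only.
players = [
    {"jp_pa": 1000, "jp_hr": 60, "us_pa": 500, "us_hr": 25, "us_hr_rate": 0.070},
    {"jp_pa": 400, "jp_hr": 10, "us_pa": 800, "us_hr": 8, "us_hr_rate": 0.012},
]
factors = hr_factors_by_grade(players)
print(factors)
```

Scaling both lines to the same number of plate appearances keeps a player with a long career on one side of the Pacific and a short one on the other from dominating only one of the two buckets.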

Grade   BHFw    SDTf   SIf    DOf    TRf    HRf    SHf
A        5536   0.98   1.08   0.83   0.46   1.14   0.25
B       16069   1.03   1.05   0.92   0.31   1.39   0.23
C       22237   1.05   1.02   0.97   0.43   1.66   0.37
D       18813   1.06   1.01   1.01   0.58   1.82   0.69
E        6920   1.02   0.98   1.19   0.56   2.27   1.13
ALL     69578   1.03   1.02   0.98   0.50   1.55   0.68

Batters with the highest HR% saw only a 14% increase in Japan, while those in the lowest group saw a 127% increase! Group A also had more singles and fewer doubles – I will guess that the outfielders played deeper, not allowing many balls over their heads, while allowing more singles in front of them. Power hitters bunt much less often when in Japan, but the non-HR hitters bunt more.

One other thing I was able to do was group the data by Japanese team, giving me some team park factors. There are 106 players for the 16 teams, few enough per team that there are probably some biases, but the number of weighted plate appearances per team is fairly good. This is a first, indirect attempt at Japanese park factors, so don’t take these as 100% accurate. These values are scaled so that 1.00 is the average for all Japanese parks. They are not scaled to U.S. parks.

Team                  PAw     SDTf   SIf    DOf    TRf    HRf
Chiba, Pacific         9459   1.01   0.95   1.18   0.99   0.93
Chunichi, Central      7329   1.05   1.04   0.86   0.83   0.98
Fukuoka, Pacific      11662   1.01   0.99   0.99   1.44   0.90
Hanshin, Central       6133   0.99   0.99   1.03   1.03   0.91
Hiroshima, Central     6569   1.08   1.04   0.87   0.81   1.14
Kintetsu, Pacific      3298   1.04   0.99   1.01   1.00   1.21
Nippon, Pacific        9111   0.95   1.01   1.00   0.65   0.85
Orix, Pacific         15160   0.99   0.98   1.08   0.98   1.00
Seibu, Pacific         8805   0.96   1.00   0.98   1.51   0.97
Yakult, Central        8028   0.99   1.02   0.97   0.76   1.09
Yokohama, Central      3736   1.01   1.03   0.94   0.95   1.17
Yomiuri, Central       6696   0.97   1.01   0.95   1.41   1.23
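For anyone who wants to reproduce the scaling to a 1.00 league average, one possible way to do it is sketched below in Python. The post does not say whether the average is weighted, so weighting by plate appearances is an assumption here, and the team names and raw factors are made up for the example.

```python
def normalize_factors(raw, weights):
    """Rescale raw factors so that their weighted mean is 1.00.

    Assumption: the league average is weighted by plate appearances;
    an unweighted mean would be an equally plausible reading.
    """
    total_weight = sum(weights[team] for team in raw)
    mean = sum(raw[team] * weights[team] for team in raw) / total_weight
    return {team: raw[team] / mean for team in raw}


# Hypothetical raw Japan-to-U.S. HR factors and weighted PA for two teams.
raw_hrf = {"Team X": 1.8, "Team Y": 1.2}
pa_w = {"Team X": 1000, "Team Y": 3000}

scaled = normalize_factors(raw_hrf, pa_w)
for team, value in scaled.items():
    print(f"{team}: {value:.2f}")
```

After this step a factor above 1.00 only says a park is friendlier than the average Japanese park, which is why the table cannot be read against U.S. parks directly.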

Next I will do the park factors for MLB stadiums, from RetroSheet play-by-play data for 1953-2008, filtered by the HR% of the batter. With a great deal more player seasons, I hope to be able to generate many more groups with smaller spreads, perhaps .005 for each group.

What I haven’t quite figured out yet is how to apply this new knowledge to the normalization formula (suggestions are welcome). In the study, I used the U.S. MLE HR% to generate the grades, but in fact that is our unknown value. The HR park factor for any hitter is going to be a function of the overall factor of that park as well as the HR% of the batter. For Japan, Grade A is 73% of the overall factor, while Grade E is 146%. Will this hold fairly constant in a larger study? Should we use the single season unadjusted HR%, or a Marcel weighting of that and past seasons, regressed to the league mean?

These preliminary results are right in line with my hypothesis that the higher a player’s HR%, the closer his personal park factor will be to 1.00. In these numbers for Japan, a Grade A HR hitter regresses the overall park factor about 75% of the way back to 1.00. For any given pf, this could be expressed as pf - (pf - 1) * x, where x is the fraction regressed toward 1.00 (about 0.75 for Grade A here).
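As a quick check on that formula, here is a small Python sketch that applies it and also solves it backwards for x, using the HRf column from the grade table. The function names are mine, and treating the ALL row (1.55) as the overall factor follows the comparison made above.

```python
def personal_park_factor(pf, x):
    """The post's formula: move the overall factor pf a fraction x toward 1.00."""
    return pf - (pf - 1.0) * x


def regression_fraction(pf, personal_pf):
    """Solve the same formula for x, given the overall and personal factors."""
    if pf == 1.0:
        raise ValueError("x is undefined when the overall factor is exactly 1.00")
    return (pf - personal_pf) / (pf - 1.0)


OVERALL_HRF = 1.55  # ALL row of the grade table
GRADE_HRF = {"A": 1.14, "B": 1.39, "C": 1.66, "D": 1.82, "E": 2.27}

for grade, hrf in GRADE_HRF.items():
    x = regression_fraction(OVERALL_HRF, hrf)
    share = hrf / OVERALL_HRF  # grade factor as a share of the overall factor
    print(f"Grade {grade}: x = {x:+.2f}, share of overall = {share:.0%}")
```

Grades A and B come out with positive x (pulled toward 1.00), while C, D and E come out negative, meaning their factors sit farther from 1.00 than the overall one. Whether x can be written as a smooth function of HR% is exactly the open question.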

Still questions to be answered, but I’ve been quite excited by the results so far. The full set of data can be viewed in this spreadsheet at Google Docs.

3 Responses to Different Factors For Different Folks Part I

  1. Derek Carty says:

    “The question is, can different types of hitters have different homerun park factors?”
    Brian, I’d definitely recommend checking out Greg Rybarczyk’s article in the 2009 THT Annual. It deals with this exact question.

  2. Brian Cartwright says:

    I’ve had some conversations with Greg over the last year or so…I will have to get the book and check it out.

  3. KJOK says:

    Brian:
    Great stuff as always!
    “However, when calculating MLEs for those who went to Japan, the results were almost always well under their performance here. I needed to fix this before I proceeded with projecting everyone in the league.”
    I didn’t try to do ‘personal’ HR factors, but you might want to read my articles on Japanese MLE’s, at http://www.seamheads.com, as I addressed this issue.
    I would also recommend using more players in your sample. I have a list of hundreds of players who have gone between MLB and NPB, and vice versa, depending on how many years you want to go back.
    “What I haven’t quite figured out yet is how to apply this new knowledge to the normalization formula (suggestions are welcome). In the study, I used the U.S. MLE HR% to generate the grades, but in fact that is our unknown value. The HR park factor for any hitter is going to be a function of the overall factor of that park as well as the HR% of the batter.”
    If I understand what you’re trying to do here, I think you need either an iteration process, or some type of multi-variable matrix algebra, to solve this.
    THANKS,
    KJOK
