OneStat, Take Two
After some spirited discussions with John from Washington Baseball Blog, I shifted my attention away from the K/9*K/BB/(1+HR/9) formulation and toward a statistic based on expected runs saved vs. expected runs allowed, using only the stats a pitcher controls (K, BB, and HR).
In short, the formula is:
Runs Saved = (Expected Runs Saved per K * K) - (Expected Runs Allowed per BB * BB) - (Expected Runs Allowed per HR * HR)
That gives you the number of runs a pitcher saves his team through defense-independent performance.
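Here's a minimal Python sketch of that formula. The function and constant names are my own, and the default run values are the 2004 league-wide figures reported further down in this post. The per-nine-innings scaling described later is included for convenience, and it assumes innings pitched is given as a true decimal (200.333, not the box-score style 200.1).

```python
# Run values per event: the MLB 2004 figures reported later in this post.
# Stored as positive magnitudes; the signs are applied in the formula.
RUNS_SAVED_PER_K = 0.294422
RUNS_ALLOWED_PER_BB = 0.327641
RUNS_ALLOWED_PER_HR = 1.39299


def runs_saved(k, bb, hr,
               per_k=RUNS_SAVED_PER_K,
               per_bb=RUNS_ALLOWED_PER_BB,
               per_hr=RUNS_ALLOWED_PER_HR):
    """Runs saved from strikeouts minus runs given up on walks and homers."""
    return per_k * k - per_bb * bb - per_hr * hr


def runs_saved_per_9(k, bb, hr, ip):
    """Scale runs saved to a nine-inning rate (ip must be decimal innings)."""
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return 9 * runs_saved(k, bb, hr) / ip


# Hypothetical stat line, not a real pitcher: 200 K, 50 BB, 20 HR in 200 IP.
example_rate = runs_saved_per_9(200, 50, 20, 200.0)
```

Keeping the run values as parameters makes it easy to swap in refined weights later without touching the formula itself.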
The only problem was deriving the expected run values of strikeouts, walks, and home runs. I started with the Baseball Prospectus 2004 expected run values by situation (I'd link it, but it is subscriber-only). Using that matrix, I created similar matrices for the expected runs added (or, in the case of strikeouts, subtracted) by a marginal strikeout, walk, or home run.
The strikeout case is easy; you just take the expected run value of the current situation and subtract the value of the same situation one out later. For example, if a team expects to score 0.8 runs with a runner on first and none out, but 0.4 runs with a runner on first and one out, the value of a K in that situation is 0.4 runs saved.
Walks are easy as well; you just take the difference between the expected runs of the situation after the walk and the situation before it. Thus, if a team expects to score 0.4 runs with a man on first and one out, but expects to score 0.8 runs with men on first and second and one out, the value of the walk in that situation is 0.4 expected runs. With the bases loaded, the value of a walk is 1 run.
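The strikeout and walk calculations can be sketched in a few lines of Python. The table below holds only the illustrative numbers quoted above, not the real Baseball Prospectus matrix, and the names are my own. I'm also assuming that a strikeout with two outs ends the inning, so the situation afterward is worth zero expected runs.

```python
# Expected runs keyed by (runners, outs). Only the illustrative values
# from the examples above; the real matrix has 24 cells.
RUN_EXP = {
    ("1st", 0): 0.8,
    ("1st", 1): 0.4,
    ("1st+2nd", 1): 0.8,
}


def strikeout_value(run_exp, runners, outs):
    """Runs saved by a K: same runners, one more out."""
    before = run_exp[(runners, outs)]
    # Assumption: the third out ends the inning, leaving 0 expected runs.
    after = 0.0 if outs == 2 else run_exp[(runners, outs + 1)]
    return before - after


def walk_value(run_exp, runners_before, runners_after, outs, runs_forced=0):
    """Runs added by a BB: change in expected runs plus any run forced home."""
    change = run_exp[(runners_after, outs)] - run_exp[(runners_before, outs)]
    return change + runs_forced


k_value = strikeout_value(RUN_EXP, "1st", 0)        # the K example above
bb_value = walk_value(RUN_EXP, "1st", "1st+2nd", 1)  # the walk example above
```

With the bases loaded, the runners are the same before and after the walk, so the change term is zero and `runs_forced=1` gives the 1-run value mentioned above.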
Homers are a little counter-intuitive. With the bases empty, the value of a home run is 1 run (obviously). With runners on, it's a little different. For example, if there is a runner on third and none out, the expected run value is 1.45 runs. If the batter homers, the team gets 2 runs, but is left with none on and none out - a situation with an expected run value of 0.54 runs. So the difference (1.45 - 0.54) must be subtracted from those 2 runs, leaving a home run value of roughly 1.08 runs. The calculation is a little jarring at first, until you realize that your team is pretty much going to get that guy in anyway, so the real value you provide by hitting the homer is getting yourself around the bases.
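Here's the home run arithmetic as a small Python sketch (the function name is mine). It takes the runs that actually score, then adjusts for the change in expected runs between the situation before the homer and the bases-empty situation left behind with the same number of outs.

```python
def home_run_value(exp_before, exp_bases_empty, runners_on):
    """Marginal runs from a homer.

    exp_before      -- expected runs in the situation before the homer
    exp_bases_empty -- expected runs with none on and the same number of outs
    runners_on      -- number of runners on base when the ball leaves the park
    """
    runs_scored = runners_on + 1  # every runner plus the batter
    return runs_scored + (exp_bases_empty - exp_before)


# Runner on third, none out, using the rounded figures quoted above.
hr_runner_on_third = home_run_value(1.45, 0.54, runners_on=1)

# Bases empty: before and after are the same situation, so the value is 1.
hr_bases_empty = home_run_value(0.54, 0.54, runners_on=0)
```

With the rounded inputs, the runner-on-third case works out to about 1.09, a hair off the 1.08 quoted above; I'd guess the difference is rounding in the underlying table values.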
Then I weighted the K, BB and HR matrices by the relative occurrence of each cell in the real world. There were 188,539 plate appearances in MLB this year, and 103,387 (54.84%) came with no one on base. I got the plate appearances by runner situation from MLB.com, although they were not cross-referenced with the number of outs. So I had to weight each runner situation by the relative occurrence of each out situation. Out situations are very evenly distributed - 34.5% of plate appearances came with none out, 33.2% with one out, and 32.32% with two outs. If someone has the weightings of each cell in the 24-situation matrix, I could refine the data further.
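The weighting step might look something like this in Python. It's a sketch of the approximation described above, where each cell's weight is the runner-situation share times the out-situation share (which assumes the two are independent). The names are mine, and I normalize the out shares because the rounded percentages add up to slightly more than 100%.

```python
PA_TOTAL = 188_539
PA_BASES_EMPTY = 103_387

# Share of plate appearances by number of outs, as quoted above.
OUT_SHARE = {0: 0.345, 1: 0.332, 2: 0.3232}


def cell_weights(runner_share, out_share=OUT_SHARE):
    """Approximate weight of each (runners, outs) cell.

    Assumes runner situation and out count are independent, since the
    source data is not cross-referenced.
    """
    total_out = sum(out_share.values())  # normalize away rounding error
    return {
        (runners, outs): r_share * o_share / total_out
        for runners, r_share in runner_share.items()
        for outs, o_share in out_share.items()
    }


def weighted_run_value(cell_values, weights):
    """League-wide run value of an event: weighted sum over all cells."""
    return sum(cell_values[cell] * w for cell, w in weights.items())


# Only the bases-empty share is quoted in this post; the other seven
# runner situations would be added to this dict the same way.
empty_share = PA_BASES_EMPTY / PA_TOTAL
weights = cell_weights({"empty": empty_share})
```

Feeding `weighted_run_value` the full K, BB, or HR matrix along with all eight runner-situation shares would produce a single league-wide run value per event.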
The expected run values across MLB 2004 of a K, BB and HR are -.294422, +.327641, and +1.39299, respectively. I plugged these values into the formula and created a runs saved per 9 IP figure by dividing by innings pitched and multiplying by nine. I calculated the values for all MLB pitchers in 2004 and plotted runs saved per 9 IP against ERA. Here's the XY scatterplot I got for all pitchers with 20+ innings pitched in 2004:
It's an interesting calculation, and I'll have to noodle on it further. For the time being, here are the top 10 pitchers (with 100 or more IP) in terms of runs saved per 9 IP through defense-independent pitching: