Tuesday, December 07, 2004

Quiet Nats Blog

One of the reasons Nats Blog went dark yesterday is that DM's comment on my post about free agent pitchers got both me and DM thinking about a way to put a pitcher's strikeout, walk and home runs allowed totals into one simple statistic (which I have been calling OneStat).

Both DM and I have been noodling the idea and we've come up with a rough version of it, which we need to back-test against data. But it looks something like this:

K/9 * K/BB
---------------
1 + HR/9

The resulting number shows (no shocker here) that Ben Sheets and Randy Johnson were the best starters in the National League in 2004. But it also reveals some non-obvious comparisons. Odalis Perez had a 3.25 ERA and an ERA+ of 127 in 2004. At the same time, Kris Benson had a 4.31 ERA and a 97 ERA+.

But judging the value of Perez vs. Benson is not so easy. Here are some key stats from both of them last year:

Name     IP        HR   BB   K     ERA    ERA+   OneStat
Perez    196 1/3   26   44   128   3.25   127    7.79
Benson   200 1/3   15   61   134   4.31   97     7.90
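
For anyone who wants to check the OneStat figures, here is a minimal Python sketch of the formula above applied to the two stat lines. The function names (innings, onestat) are made up for illustration, and it assumes "196 1/3" means 196 and one-third innings.

```python
def innings(whole, thirds=0):
    """Convert innings notation such as 196 1/3 into a plain number."""
    return whole + thirds / 3


def onestat(ip, hr, bb, k):
    """(K/9 * K/BB) / (1 + HR/9), as written in the post."""
    k_per_9 = 9 * k / ip
    hr_per_9 = 9 * hr / ip
    k_per_bb = k / bb
    return k_per_9 * k_per_bb / (1 + hr_per_9)


# 2004 lines for the two pitchers compared in the post
perez = onestat(innings(196, 1), hr=26, bb=44, k=128)
benson = onestat(innings(200, 1), hr=15, bb=61, k=134)

print(f"Perez  {perez:.2f}")   # 7.79
print(f"Benson {benson:.2f}")  # 7.90
```

Note that the function divides by BB, so a pitcher with zero walks would need special handling; the sketch does not attempt that.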


ERA and ERA+ wise, it looks like Perez was a much better pitcher. But dive into the peripherals, and you'll see that Perez gave up 11 more homers (pitching at Dodger Stadium, no less), and struck out 6 fewer folks. Benson had 17 more walks, but that simply doesn't explain the 1-run difference in ERA and 30-point difference in ERA+ (one would think the 11 fewer home runs would more than compensate for the extra 17 walks).

Anyway, DM and I continue to noodle this, but it is something worth further thought, and we would appreciate any input you have.

11 Comments:

At 11:21 AM, Blogger John said...

What about what you're trying to do is different than DIPS (defense independent pitching stat)?

 
At 1:04 PM, Blogger SuperNoVa said...

John,

Can you get me a link to a site that has DIPS? I can't seem to find dERA or any other DIPS at baseballprospectus.com. I've got a subscription, so if you post a deep link, I can get to it.

 
At 1:11 PM, Blogger SuperNoVa said...

OK, John, I just found dERA for Benson and Perez:

Benson: 4.45 PIT / 4.67 NYM
Perez: 3.70

Based on their peripherals, wouldn't you expect there to be little or no difference in their dERAs? What is wrong with the picture here?

Doesn't the OneStat show that they are much closer in terms of performance independent of defense?

 
At 8:02 PM, Blogger DM said...

John, at bottom what we are doing is the same as DIPS, and based on the same theory. What I wanted was a simple stat that looked at only SO, BB and HR, that would not be hard to compute on the fly when looking at a pitcher's stats. As I understand McCracken's DIPS and even dERA, they use the principles to come up with a set of pitcher stats adjusted to be defense independent. I just want something easier to calculate at hand.

Ever since I read up on DIPS, I've made a point to focus only on a pitcher's SO, BB and HR when reviewing a guy, just to get a sense of how useful DIPS is. If it really is the case that most of a pitcher's effectiveness can be isolated in these stats, then that is all we should need to look at to get a sense of what kind of pitcher he is.

I ran the OneStat for the 2004 Phils (my home team and one I followed pretty closely all year thanks to DirecTV and MLB.tv). The rank it gave was pretty consistent with how I viewed the value of each pitcher on that roster. Sure, that's anecdotal, but all I want is something in the ballpark.

 
At 9:04 PM, Blogger John said...

Try to formulate in English what your stat is measuring. It's a proportion, but what is that proportion telling you? What does the scaling of K/9 by K/BB do? This is the same as:

K/IP * 9 * K/BB
-----------------
1 + HR / IP * 9

Which is the same as:

K^2
-----------------
(IP*BB)/9 + HR*BB
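
A quick numerical check that the two forms above really are the same thing, as a rough Python sketch. The function names are invented for illustration, and the inputs are the Perez and Benson lines from the post.

```python
import math


def onestat_rates(ip, hr, bb, k):
    """Rate form: (K/IP * 9 * K/BB) / (1 + HR/IP * 9)."""
    return (k / ip * 9) * (k / bb) / (1 + hr / ip * 9)


def onestat_counts(ip, hr, bb, k):
    """Counting form: K^2 / ((IP*BB)/9 + HR*BB)."""
    return k ** 2 / (ip * bb / 9 + hr * bb)


lines = {
    "Perez": dict(ip=196 + 1 / 3, hr=26, bb=44, k=128),
    "Benson": dict(ip=200 + 1 / 3, hr=15, bb=61, k=134),
}

for name, line in lines.items():
    a = onestat_rates(**line)
    b = onestat_counts(**line)
    # The two forms should agree up to floating-point rounding.
    print(name, round(a, 4), round(b, 4), math.isclose(a, b))
```

Multiplying the numerator and denominator of the rate form by IP/9 is all it takes to get from one to the other.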

Why is this a useful proportion? What exactly does this measure? Why would I scale homers by the number of walks? There's no clear reason why this is any more reliable an indicator than removing the K^2, which would end up giving much different results on Perez vs. Benson for example if you don't square the strikeouts:

Perez: .061
Benson: .056

All of a sudden, Benson has a smaller number than Perez. It seems to have no more or less meaning than your formulation.

The DIPS stats themselves don't try to factor in the small bit of impact that being a fly-ball or ground-ball pitcher has, but what they measure is clear, and why they measure it the way they do is clear. Note that dERA at Baseball Prospectus isn't DIPS ERA... I haven't ever seen the exact formulation, but it's supposedly better than DIPS ERA. I assume that it factors in the effect of flies vs. grounders or something like that.

Also, DIPS style stats are incorporating other defense-independent things you're not, including intentional walks and hit by pitch.

In short, even if you seem to be getting data correlated to how you view pitchers, there is no science that I can see in how you're combining the various components. The existing statistics do the same thing, and it's clear what they're doing, which is adjusting existing stats that already have meaning based on quantifiable factors. dERA is the earned runs per inning, once you take into account the quantifiable impact of defense, which is complicated to do because of the impact of parks, etc., but it is clear what quantifiable effect each part of the equation has (i.e., what they are measuring each step of the way, and how it impacts the stat they're adjusting).

 
At 9:08 PM, Blogger John said...

Typo: When I said "by removing the K^2", I meant "by removing the square operation on K". That is, calculate

K
-----------------
(BB*IP)/9 + HR*BB

Or, in your formulation, this would be:

K/9 * 1/BB
----------
1 + HR/9

 
At 11:31 PM, Blogger SuperNoVa said...

In English, OneStat is measuring absolute strikeout rate (K/9) against relative strikeout-to-walk rate (K/BB) against absolute home run rate (HR/9). Since there is a possibility of zero home runs, we add 1 to HR/9 to ameliorate the Div/0 problem. We also add 1 because very low home run rates (.5 HR/9 or less) have a distorting effect: dividing by a number less than 1 has a multiplicative effect.

The purpose is to measure all three stats at once. It is not to define the quality of a pitcher (although better pitchers would have better OneStats)...it's just to combine the stats. That's all.
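
To make the point about the "+1" concrete, here is a small Python sketch comparing the ratio with and without it. The pitcher is hypothetical (K/9 of 6.0 and K/BB of 2.0 are made-up numbers, not anyone's real line), and the ratio function and its offset parameter are names invented for this illustration.

```python
def ratio(k_per_9, k_per_bb, hr_per_9, offset=1.0):
    """K/9 * K/BB divided by (offset + HR/9); offset=1.0 is OneStat."""
    denominator = offset + hr_per_9
    if denominator == 0:
        # Without the offset, zero home runs means dividing by zero.
        return float("inf")
    return k_per_9 * k_per_bb / denominator


# Hypothetical pitcher: only the home run rate changes from row to row.
for hr_per_9 in (0.0, 0.25, 0.5, 1.0):
    with_one = ratio(6.0, 2.0, hr_per_9)
    without = ratio(6.0, 2.0, hr_per_9, offset=0.0)
    print(f"HR/9 {hr_per_9:.2f}  with +1: {with_one:5.1f}  without: {without:5.1f}")
```

With the offset, going from 1.0 HR/9 down to zero only doubles the number (6 to 12). Without it, the number goes from 12 to 24 at 0.5 HR/9 and then blows up entirely at zero.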

 
At 3:38 AM, Blogger John said...

SuperNoVa: That makes no sense... the English doesn't even map to the formula.

Clearly you're scaling the strikeout rate by the Ks per walk rate. What good is that? Since you are SCALING here, you're basically implying that there is a relationship of scale here that is useful, but you don't indicate what it is. Plus, you made it a quadratic relationship, and you don't really give any reasoning why that would ever make sense. Then, why you would want to measure this as a rate that is arbitrarily defined is even less clear. Hint: even if you were measuring something in terms of how often it occurs relative to a home run, there's no reason not to let infinity be a possible value. It becomes less intuitive and useful to add arbitrary constants in a denominator. We don't formulate ERA as earned runs per (IP + 1), for instance.

Sure, the fact that you wrote it down indicates there is a relation... the equation defines the relation. Yet, you have a stat that:

1) Is decidedly non-linear for no obvious reason.
2) Maps to nothing that you can explain easily and intuitively, even if you were to make it linear.

I could pick apart the ad hoc nature of the construction further, but hopefully you will step back and see that I'm right with regard to your stat being poorly designed, instead of clinging to something that you like simply because you were involved in defining it.

If not, I'm not going to argue this any further. I intended to give constructive criticism, and have spent a lot of my time doing it, and have been pretty patient about it, even continuing to provide further detail, when I could have easily just gone on to other things, and let other people hand your ass to you when you try to publicize your stat in the real world.

Considering the time I've put into it, and particularly since this is a pretty poorly traveled corner of the net, if you think about it, I'd hope that it's reasonably clear that I'm not trying to make you feel personally assaulted because your stat wasn't top notch. I'm trying to help you learn and improve your thinking. DM seems to have responded well with his latest thinking.

Yet, you've been overly defensive about the issue, even saying that BP has a crap stat without really trying to dig into what the number may actually mean. You've not even really acknowledged that your formulation is basically an unstructured

 
At 10:17 AM, Blogger SuperNoVa said...

Jeez, John, now you're just getting insulting. You call it a crap stat without any testing of the data at all. It very well may be a crap stat! I agree! It may be the worst thing ever conceived. But I've played around with it and my initial, non-scientific reaction is that, hey, it seems to fit reality OK. We are at the hypothesis stage. Our hypothesis is that OneStat (we really need a new name) explains reality well and is a fair combination of three (or 4 if we include HBP) defense-independent statistics. The burden is on US to prove the hypothesis, which we haven't done, I agree. This idea is THREE DAYS OLD. Did Bill James put his Win Shares out to the public 3 days after he drew up his first formula for Win Shares?

Have you performed an analysis to check whether OneStat is a better predictor of the next season's ERA than, for example, dERA? Nope, neither have I. But I don't think it should be dismissed (nor should it be embraced) without further analysis.

Moreover, are you not troubled by the 0.70 difference in dERA between Benson and Odalis Perez? I certainly am. It doesn't explain reality very well in that particular pairing. Of course, there are thousands of possible pairings from which assessments of reality could be drawn - the Perez/Benson problem is not the only one out there. There may be a reasonable explanation for dERA's superficial failing. And it's true, I don't have a deep understanding of dERA; there is no real explanation/equation for dERA at BP.

 
At 10:56 AM, Blogger John said...

You've made no attempt to explain why there is this quadratic relationship, you can't explain why you're doing scaling, and you've already provided data that demonstrates it doesn't fit on the ends, and I've provided data on Perez vs. Benson that suggests it's not getting the right answer in the middle either. Surprise, surprise, it's not worth my time to perform a regression analysis, because there is no valid hypothesis here with evidence to support it. You're showing me a formula with a not-very-concrete hypothesis and data that doesn't support it well.

Sorry if I got pissed off, but I feel like I've spent a lot of time on a volunteer basis, and you've been too defensive about it to make a serious attempt to understand it. And, I was tired. No excuse, but I still apologize.

If you understood my analysis of the .70 difference, you would see that it makes a ton of sense. The ERAs alone for the two guys are too far apart. But then you factor in parks and they get a bit closer, as expected. Then, there's a huge jump of almost half a run when you factor in the Dodgers' superior defense. How does that show dERA sucks? By what basis do you believe that, all other things equal, over the course of an entire year, the difference between the two guys would be less than, say, half a run per game?

 
At 10:58 AM, Blogger John said...

Actually, it's only a quarter of a run, which is still a pretty big jump and says a lot for the Dodger defense. I actually get .82 for the difference. But the difference in ERA+ when you convert back to an ERA number is 1.04, and the difference in raw ERA is 1.06, if you're reading it here first, instead of the other thread.

 
