Sunday, January 02, 2005

Sabermetric Scoring

I am pretty much a loser. For proof of that, here's what I did on Saturday night: I watched the Braves/Expos game from May 31 on MLB.tv. I wasn't in the mood to watch college football, and wanted to watch some baseball, so I figured $3.95 a month was not too much to pay to watch our Nats, 2004 version. It helps that I haven't actually watched the Expos play since 1994, so I have almost no idea who won these games (except for the reasonable expectation that the Expos will lose), and it is almost like they are live.

But because I had to watch it on my laptop, I also opened up a spreadsheet with the Expected Run Value matrix just for fun (further loser evidence). A good example of an ERV matrix can be found in this article. It shows you how many runs on average are scored from each state of a baseball game (e.g. 1 out, runner on first; 0 out, runner on second, etc.). SuperNoVa describes it well in this post, where he explains how you can determine the precise value of a batter's action based on these values. It is very strong proof that bunting a man over to second makes you worse off (except maybe where you are almost certain that the batter will otherwise make an out, i.e. the pitcher), and that you need to steal with a 70 percent success rate to make it worthwhile.

So, as I was watching the Expos and thinking about ERV, it struck me -- why don't we keep track of ERV while we score a game? For each batter, we can record the runs his action created by comparing the ERV for the state before he was up with the one after. For example, for the first batter of an inning (none on, none out), the ERV is about 0.5 runs. If that first batter gets to first base, the ERV for that state is 0.9 runs, so that batter created 0.4 runs by getting to first, and you could record that in your scorecard. If the first batter gets out, it costs his team 0.2 runs (ERV goes from 0.5 to 0.3). You could print the ERV matrix on a scorecard for easy reference. In the end this really tells you which players contributed to the team's success or failure (I think it is also the basis for play-by-play win shares).
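If you would rather let a laptop do the subtraction, here is a minimal sketch of the before/after comparison. It only contains the three ERV values quoted above; the state labels and the `run_value` name are my own invention for illustration, and a real table would cover every combination of outs and baserunners.

```python
# Approximate ERV values quoted in the post, keyed by (outs, runners on base).
# The key format is a made-up convention for this sketch.
ERV = {
    (0, "none"): 0.5,
    (0, "first"): 0.9,
    (1, "none"): 0.3,
}


def run_value(before, after):
    """Runs created (or lost) by moving from one game state to another."""
    # Round to one decimal place, matching the precision of the table.
    return round(ERV[after] - ERV[before], 1)


# Leadoff batter reaches first base.
print(run_value((0, "none"), (0, "first")))
# Leadoff batter makes an out.
print(run_value((0, "none"), (1, "none")))
```

The first call gives the 0.4 runs created by reaching first, and the second gives the 0.2 runs lost by making an out.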

For a standard scorecard given out at most games, you could record the RV in the totals columns on the right-hand side, where I usually record pitch counts.

You can add further nuance by accounting for errors and even great fielding plays. With errors, you can record for the batter the runs that he would have lost had the play been made, but then record (maybe in a separate fielding section of the card) the runs created by the error (the difference between the actual state of the game and the state that would have occurred had the play been made). For example, if the first batter hits a routine grounder to short that is booted, he would get a minus 0.2 in his column, because it should have been an out (0.5 ERV down to 0.3). But you would record a plus 0.6 in the fielding column for the error (0.9 ERV that should have been a 0.3).
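The error bookkeeping can be sketched the same way. This is just one way to write it down, assuming the same three ERV values as before; `score_error` and the state labels are hypothetical names, and the fielding number is expressed from the batting team's point of view (a plus means the error gave them runs).

```python
# Approximate ERV values quoted in the post, keyed by (outs, runners on base).
ERV = {
    (0, "none"): 0.5,
    (0, "first"): 0.9,
    (1, "none"): 0.3,
}


def score_error(before, expected_after, actual_after):
    """Split a misplayed ball into a batter entry and a fielding entry.

    expected_after is the state if the play had been made;
    actual_after is the state that really resulted.
    """
    batter_rv = round(ERV[expected_after] - ERV[before], 1)
    fielding_rv = round(ERV[actual_after] - ERV[expected_after], 1)
    return batter_rv, fielding_rv


# Leadoff batter hits a routine grounder to short that is booted.
batter_rv, fielding_rv = score_error((0, "none"), (1, "none"), (0, "first"))
print(batter_rv, fielding_rv)
```

For the booted grounder this reproduces the minus 0.2 in the batter's column and the plus 0.6 in the fielding column.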

I did this for the May 31 Expos/Braves game and it was a lot of fun. Much more fun than counting pitches, which is what I used to do, and much more useful. In fact, in a pinch I'd score a game just with batter's outcome and RV (no basepath action, pitches, etc.), and have a good accounting of the game. You could even do a Manager's Column for the sacrifice bunts, IBBs and other plays where they directly determine an outcome. I also did it for the June 1 game (which is a great one if you haven't seen it). I think I might also work on design of a scorecard to record this stuff.

14 Comments:

At 9:02 PM, Blogger John said...

Actually, you can do better than calculating expected runs for each play: you can calculate the effect each action has on the probability of the team winning the game. This measures the absolute value of actions towards their goal. If a pitcher is pitching in a blow-out and focuses on just getting it over the plate, because he's got a 12-run lead, he's not going to be dinged hard for giving up a homer. But if his team were tied and he gave up the same homer to the same hitter, he'd get a pretty big ding, because it shifts the probabilities more dramatically.

This has been done before (called Player Game Probability). Read An Economic Evaluation of the Moneyball Hypothesis for a recent example. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=618401

I've developed a neat way to apply this stuff that may or may not turn out to be really interesting. I'm currently playing around with the last two years of play-by-play data from Gameday, and will get more serious about it as the season draws near.

 
At 9:07 PM, Blogger John said...

And MLB.TV is definitely cool, not a loser thing, though I'm not so happy with their move towards Windows Media.

 
At 9:13 PM, Blogger DM said...

Yes, I am familiar with PGP, and it does focus more on the ultimate goal of winning the game. But I don't think you can calculate that by hand on paper with a dog in one hand and a beer in the other. (If you can, I'd love to hear how.) The ERV stuff you can, which makes it a good candidate for scoring at a game, and gives you good info that you're not going to get from the stadium or TV broadcast.

 
At 9:30 PM, Blogger John said...

You would need a small booklet of tables, but you can definitely do it. I'm probably going to have a computer do it for me using Gameday data, and email my phone updates after every play...

 
At 9:41 AM, Blogger dexys_midnight said...

There has been a lot of talk over the past couple of decades about how the idea that someone is a "clutch" hitter is a myth. I wonder if this ERV scoring (or the method John brings up) has ever been used to show or disprove that clutch hitting actually exists. (I remember in his new Baseball Abstract, which I actually used to keep in my office, Bill James had a section on Joe Carter and a clutch hitting study--I don't remember, as I sit here, whether he said what he used for it.) It seems to me that if you could map a correlation between, let's say, OPS and ERV, then anyone more than a couple of standard deviations off that correlation on a consistent basis would have to be considered a "clutch" or "choke" hitter.

 
At 10:11 AM, Blogger tmk67 said...

The best part about printing the ERV table in the program would be all the boos from the stands once a player breaks for second base on an attempted steal!

It is actually a nifty idea to score based on ERV. You would end up with a number that operates similar to plus/minus for hockey. I like being able to record and credit the throw from right field that prevents the runner from advancing to third, or the cost of missing the cut-off man.

And John, where do I sign up for your SMS in-game probability service? Possibly the best sabermetric hack I've heard of.

Happy New Year, all...Signed Backward K (used to be TMK67, now I am signing to shamelessly plug my Strikeout Blog).

 
At 10:20 AM, Blogger DM said...

Excellent analogy to plus/minus, Backward K, as this is a lot like that, since what we are really recording is marginal runs created or lost.

 
At 10:31 AM, Blogger DM said...

Also, I scored the June 2 game last night using ERV, which really showed that the apparent offensive juggernaut of the Expos (17 hits) was just about as valuable to them as the 4 errors made by the Braves.

http://sports.yahoo.com/mlb/boxscore?gid=240602115

 
At 11:05 AM, Blogger John said...

IIRC, the PGP people looked at clutch hitting, and determined that, with two exceptions, there was no strong evidence that, over a period of a few years, anyone was more likely to do better than his normal stats in the clutch than not.

Backward K, where's your strikeout blog? And I'll keep you posted on the SMS service on National Pastime, when the time comes.

 
At 12:56 PM, Blogger tmk67 said...

John,

Thanks for your interest. It is

http://backwardk.blogspot.com

Still rudimentary (I am using it partially as a way to train myself in HTML). I am currently working on a piece examining dERA and the Davenport "Stuff" ratings (a Three True Values calculation) as a predictor of future effectiveness. That piece is tentatively titled, "Should the Nats Sign Schoenweiss?"

 
At 1:31 PM, Blogger tmk67 said...

IMHO, play-by-play analysis like PGP and your ERV idea is likely to be the most interesting aspect of sabermetrics for the next five years, as we are only now getting enough data to draw firm conclusions. A shame we do not have play-by-play historically, as it could show some interesting things...As in, did Jim Palmer EVER give up a home run when it mattered? And from our parochial interest, I bet it will show Livan Hernandez to be much better than we think (this is my hunch at least.)

 
At 2:56 PM, Blogger John said...

Retrosheet does have quite a bit of historic play-by-play. You can get it for 2002-2004, but in between, there's a pretty big gap.

 
At 7:26 PM, Blogger Yuda said...

I was thinking about this, and I want to clear up some minor details to make sure I'm reading it right.

Let's say there are runners on second and third with nobody out. Batter hits a single and drives both runs in. You give him 2 (for the RBIs) plus 0.4 (for getting to first base), right?

How do we handle if, say, the runner on second base gets thrown out at the plate?

 
At 9:18 PM, Blogger DM said...

This is the counter-intuitive part of this scoring. The batter does not get credit for 2 full runs, because he didn't get those guys on base. You start by strictly comparing the "before" and "after" states. In this case, before the single, the team had an ERV of 2.1 runs. After the single the situation is no out and a man on first, which has an ERV of 0.9. So the batter sort of put the team in a "worse" state (-1.2). But he did actually knock in two runs, so you add them back in, for a final RV for this at bat of 0.8.

In essence, the batter gets credit for 0.8 of those 2 runs, because some credit has to go to the runners who got there in the first place.
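Written out, the rule is: RV = (ERV after) minus (ERV before) plus (runs that scored on the play). A minimal sketch, using only the two ERV values from this example; the function name and state labels are made up for illustration.

```python
# Approximate ERV values from this example, keyed by (outs, runners on base).
ERV = {
    (0, "second and third"): 2.1,
    (0, "first"): 0.9,
}


def run_value(before, after, runs_scored=0):
    """Change in expected runs, plus any runs that actually scored."""
    return round(ERV[after] - ERV[before] + runs_scored, 1)


# Two-run single with runners on second and third and nobody out.
print(run_value((0, "second and third"), (0, "first"), runs_scored=2))
```

That comes out to the 0.8 above: minus 1.2 for the change in state, plus 2 for the runs.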

As for what you do if the runner on second is thrown out, I think you have to make a judgment here. You can do it simply and say the ERV went from 2.1 to 0.5 with one run scored, so it is actually a minus 0.6 (1 run minus 1.6) for the batter. I don't think this is fair to the batter, so I give the batter credit for scoring that guy (0.8). But now we have to account for the difference between him scoring (0.8) and him being thrown out (-0.6) or 1.4 RV. In essence, throwing him out saved 1.4 runs, because it stopped one run and added an out, which reduces the chance for more runs in the inning.
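Here is the same arithmetic as a rough sketch. I am assuming the 0.5 state is one out with the batter on first, which is what the play would leave; the names are hypothetical, and how to split the 1.4 is left open.

```python
# Approximate ERV values from this example, keyed by (outs, runners on base).
# Assumption: the 0.5 state is one out with a runner on first.
ERV = {
    (0, "second and third"): 2.1,
    (0, "first"): 0.9,
    (1, "first"): 0.5,
}


def run_value(before, after, runs_scored=0):
    """Change in expected runs, plus any runs that actually scored."""
    return round(ERV[after] - ERV[before] + runs_scored, 1)


before = (0, "second and third")

# What the batter is credited with: both runners treated as scoring.
if_safe = run_value(before, (0, "first"), runs_scored=2)

# What actually happened: one run scores, the trailing runner is out at home.
thrown_out = run_value(before, (1, "first"), runs_scored=1)

# Runs saved by the out at the plate, to be split among fielders and runner.
runs_saved = round(if_safe - thrown_out, 1)
print(if_safe, thrown_out, runs_saved)
```

The three numbers printed are the 0.8, the minus 0.6, and the 1.4 that still has to be handed out.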

At this point I haven't quite figured out what I would do. I think it makes sense to apportion the 1.4 between the outfielder, cut-off man, catcher and runner according to their responsibility. If trying to score was a bonehead play that any fielder would have stopped, I think it all should be deducted from the baserunner. But if it was like the Derek Jeter play in the 2001 ALDS, then the fielders should get most of those runs saved. There may be a way to calculate this more precisely, but I haven't thought about it enough.

 
