Sunday, January 02, 2005

Sabermetric Scoring

I am pretty much a loser. For proof of that, here's what I did on Saturday night: I watched the Braves/Expos game from May 31 on MLB.tv. I wasn't in the mood to watch college football, and wanted to watch some baseball, so I figured $3.95 a month was not too much to pay to watch our Nats, 2004 version. It helps that I haven't actually watched the Expos play since 1994, so I have almost no idea who won these games (except for the reasonable expectation that the Expos will lose), and it is almost like they are live.

But because I had to watch it on my laptop, I also opened up a spreadsheet with the Expected Run Value matrix just for fun (further loser evidence). A good example of an ERV matrix can be found in this article. It shows you how many runs on average are scored from each state of a baseball game (e.g. 1 out, runner on first; 0 out, runner on second, etc.). SuperNoVa describes it well in this post, where he explains how you can determine the precise value of a batter's action based on these values. It is very strong evidence that bunting a man over to second makes you worse off (except maybe where you are almost certain that the batter will otherwise make an out, e.g. the pitcher), and that you need to steal with a 70 percent success rate to make it worthwhile.

So, as I was watching the Expos and thinking about ERV, it struck me -- why don't we keep track of ERV while we score a game? For each batter, we can record the runs his action created by comparing the ERV for the state before he was up with the one after. For example, for the first batter of an inning, (none on, none out) the ERV is about 0.5 runs. If that first batter gets to first base, the ERV for that state is 0.9 runs, so that batter created 0.4 runs by getting to first, and you could record that in your scorecard. If the first batter gets out, it costs his team 0.2 runs (ERV goes from 0.5 to 0.3). You could print the ERV matrix on a scorecard for easy reference. In the end this really tells you which players contributed to the team's success or failure (I think it is also the basis for play-by-play win shares).
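As a rough sketch, the bookkeeping above is just a lookup and a subtraction. The table holds only the three approximate values quoted in this post (a full matrix has 24 states), and the state labels and the `run_value` name are made up for illustration.

```python
# Approximate ERV values quoted in the post, keyed by (outs, bases).
# "---" means bases empty, "1--" means a runner on first.
ERV = {
    (0, "---"): 0.5,
    (0, "1--"): 0.9,
    (1, "---"): 0.3,
}

def run_value(before, after):
    """Runs a batter added (or cost) by moving the game from one state to another."""
    return round(ERV[after] - ERV[before], 1)

leadoff_single = run_value((0, "---"), (0, "1--"))  # batter reaches first
leadoff_out = run_value((0, "---"), (1, "---"))     # batter makes an out
print(leadoff_single, leadoff_out)
```

The rounding to one decimal matches the precision you would realistically write on a scorecard.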

For a standard scorecard given out at most games, you could record the RV in the totals columns on the right-hand side, where I usually record pitch counts.

You can add further nuance by accounting for errors and even great fielding plays. With errors, you can record for the batter the runs that he would have lost had the play been made, but then record (maybe in a separate fielding section of the card) the runs created by the error (the difference between the actual state of the game and the state that would have occurred had the play been made). For example, if the first batter hits a routine grounder to short that is booted, he would get a minus 0.2 in his column, because it should have been an out (0.5 ERV down to 0.3). But you would record a plus 0.6 in the fielding column for the error (0.9 ERV that should have been a 0.3).
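Here is a minimal sketch of that split, using the booted-grounder numbers above. The function name and the idea of passing in the "expected" state separately are my own assumptions for illustration, not an established scoring convention.

```python
def split_error(erv_before, erv_expected, erv_actual):
    """Split a play with an error into a batter entry and a fielding entry.

    erv_expected is the ERV of the state if the play had been made;
    erv_actual is the ERV of the state that actually resulted.
    """
    batter_rv = round(erv_expected - erv_before, 1)    # what the batter earned
    fielding_rv = round(erv_actual - erv_expected, 1)  # what the error gave away
    return batter_rv, fielding_rv

# Leadoff grounder to short, booted: should have been 1 out, none on (0.3),
# but the batter is on first with none out (0.9).
batter_rv, fielding_rv = split_error(0.5, 0.3, 0.9)
print(batter_rv, fielding_rv)
```

The two entries always add up to the plain before/after change (here 0.4), so nothing gets counted twice.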

I did this for the May 31 Expos/Braves game and it was a lot of fun. Much more fun than counting pitches, which is what I used to do, and much more useful. In fact, in a pinch I'd score a game just with batter's outcome and RV (no basepath action, pitches, etc.), and have a good accounting of the game. You could even do a Manager's Column for the sacrifice bunts, IBBs and other plays where they directly determine an outcome. I also did it for the June 1 game (which is a great one if you haven't seen it). I think I might also work on design of a scorecard to record this stuff.

4 Comments:

At 9:13 PM, Blogger DM said...

Yes, I am familiar with PGP, and it does focus more on the ultimate goal of winning the game. But I don't think you can calculate that by hand on paper with a dog in one hand and a beer in another. (If so, I'd love to hear it). The ERV stuff you can, which makes it a good candidate for scoring at a game, and gives you good info that you're not going to get from the stadium or TV broadcast.

 
At 10:20 AM, Blogger DM said...

Excellent analogy to plus/minus, Backward K, as this is a lot like that, since what we are really recording is marginal runs created or lost.

 
At 10:31 AM, Blogger DM said...

Also, I scored the June 2 game last night using ERV, which really showed that the apparent offensive juggernaut of the Expos (17 hits) was just about as valuable to them as the 4 errors made by the Braves.

http://sports.yahoo.com/mlb/boxscore?gid=240602115

 
At 9:18 PM, Blogger DM said...

This is the counter-intuitive part of this scoring. The batter does not get credit for 2 full runs, because he didn't get those guys on base. You start by strictly comparing the "before" and "after" states. In this case, before the single, the team had an ERV of 2.1 runs. After the single the situation is no out and a man on first, which has an ERV of 0.9. So the batter sort of put the team in a "worse" state (-1.2). But he did actually knock in two runs, so you add them back in, for a final RV for this at bat of 0.8.

In essence, the batter gets credit for 0.8 of those 2 runs, because some credit has to go to the runners who got there in the first place.
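Written out as a sketch, the rule is the change in ERV plus the runs that actually scored on the play. The function name is just for illustration, and the ERV figures are the approximate ones used in this comment.

```python
def run_value(erv_before, erv_after, runs_scored=0):
    """Change in expected runs, plus runs that actually crossed the plate."""
    return round(erv_after - erv_before + runs_scored, 1)

# Two-run single: ERV was 2.1 before, 0.9 after (none out, man on first).
state_change = run_value(2.1, 0.9)        # the "worse" state on its own
two_run_single = run_value(2.1, 0.9, 2)   # with the two runs added back in
print(state_change, two_run_single)
```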

As for what you do if the runner on second is thrown out, I think you have to make a judgment here. You can do it simply and say the ERV went from 2.1 to 0.5 with one run scored, so it is actually a minus 0.6 (1 run minus 1.6) for the batter. I don't think this is fair to the batter, so I give the batter credit for scoring that guy (0.8). But now we have to account for the difference between him scoring (0.8) and him being thrown out (-0.6), or 1.4 RV. In essence, throwing him out saved 1.4 runs, because it stopped one run and added an out, which reduces the chance for more runs in the inning.
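The arithmetic for the thrown-out runner can be sketched the same way. This only reproduces the numbers above; it does not settle how the 1.4 should be shared out, and the variable names are invented for the example.

```python
def run_value(erv_before, erv_after, runs_scored=0):
    """Change in expected runs, plus runs that actually crossed the plate."""
    return round(erv_after - erv_before + runs_scored, 1)

# Strict accounting: one run scores, the trailing runner is out at the plate,
# and the ERV drops from 2.1 to 0.5.
strict_rv = run_value(2.1, 0.5, 1)

# Batter credited as if both runners had scored (ERV 2.1 -> 0.9, two runs in).
credited_rv = run_value(2.1, 0.9, 2)

# The gap is what the play at the plate took away from the offense.
runs_saved = round(credited_rv - strict_rv, 1)
print(strict_rv, credited_rv, runs_saved)
```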

At this point I haven't quite figured out what I would do. I think it makes sense to apportion the 1.4 between the outfielder, cut-off man, catcher and runner according to their responsibility. If trying to score was a bonehead play that any fielder would have stopped, I think it all should be deducted from the baserunner. But if it was like the Derek Jeter play in the 2001 ALDS, then the fielders should get most of those runs saved. There may be a way to calculate this more precisely, but I haven't thought about it enough.

 
