Tuesday, January 04, 2005

More on ERV Scoring

I've scored a few Expos games from last year with the ERV scoring method I talked about in this post, and it's been a lot of fun. To make it clearer how it works, here's a sample inning from this game between the Expos and Reds: the top of the fourth, Expos batting. Here's the ERV chart I'm using:



Outs   ---   1--   -2-   12-   --3   1-3   -23   123
0        5     9    12    15    15    19    21    23
1        3     5     7    10    10    12    15    16
2        1     2     3     5     4     5     6     8

Note that I have multiplied the Estimated Run Values by 10 and rounded to the nearest whole number, which makes it easier to score this on paper (for me, anyway). So I record "tenths" of a run -- 1 run equals 10 RVs, half a run equals 5 RVs. With bases loaded and no outs, the average team will score 2.3 runs, or 23 RVs.
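The lookup-and-subtract arithmetic used in the rest of this post can be sketched in a few lines of Python. This is only an illustration: the names `erv` and `rv` are made up here, and the table values are the chart as I read it, in tenths of a run. A play's RV is the ERV after the play, minus the ERV before it, plus 10 for each run that scores.

```python
# ERV chart in tenths of a run (RVs). Base states are written as in the
# chart: "1-3" means runners on 1st and 3rd, "---" means bases empty.
BASES = ["---", "1--", "-2-", "12-", "--3", "1-3", "-23", "123"]
ERV = {
    0: [5, 9, 12, 15, 15, 19, 21, 23],
    1: [3, 5, 7, 10, 10, 12, 15, 16],
    2: [1, 2, 3, 5, 4, 5, 6, 8],
}


def erv(bases, outs):
    """Look up the ERV for a base/out state; after three outs it is 0."""
    if outs >= 3:
        return 0
    return ERV[outs][BASES.index(bases)]


def rv(before, after, runs=0):
    """RV of a play: ERV after, minus ERV before, plus 10 per run scored.

    `before` and `after` are (bases, outs) tuples. The name and signature
    are illustrative only.
    """
    return erv(*after) - erv(*before) + 10 * runs


print(rv(("---", 0), ("-2-", 0)))          # leadoff double: 12 - 5
print(rv(("1-3", 1), ("1--", 2), runs=1))  # sac fly: 2 - 12 + 10
```

The two calls print 7 and 0, the values a leadoff double and a one-out sac fly with runners on first and third work out to on this chart.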

So here's what happened in the Expos 4th inning:

Vidro leads off with a double. He gets 7 RVs. Before, 5 ERV (0 on, 0 out); After, 12 ERV (2nd, 0 out), an increase of 7.

Batista flies out to right, pretty deep, but Wily Mo Pena makes a strong throw to third to hold Vidro at second. This is a good example of how to score fielding with this system. In my judgment, the ball was deep enough to get the runner to third, but for Pena's throw. So I give Batista the RVs he would have gotten if that had happened, -2. (Before: 12 ERV, After (1 out, Runner on 3d): 10 ERV, difference of -2). But I give Pena the difference between what should have happened and what did happen, or -3 RV (Runner on 2d, 1 out: 7 ERV, instead of runner on 3d, 1 out: 10 ERV). So I record the -2 in Batista's box and a -3 in the margin for Pena.

Johnson hits a grounder to second that Jimenez muffs, Vidro to third. The official scorer ruled this a hit, but I thought the average fielder would have made the play, so I give Jimenez an error. Another good example here. I give Johnson the RVs as if the play had been made, or -3 (Before, 2nd and 1 out: 7 ERV; After, 3d and 2 out: 4 ERV). Then I give Jimenez the RVs that represent the error, the difference between what happened (1st and 3d, 1 out: 12 ERV) and what should have happened (3d, 2 out: 4 ERV), or a +8 RV for the error. (When I compute Jimenez's total RV for the game, this turns into a -8 when combined with his offensive RV -- he had a bad game, since his Batting RV was -11 and his Fielding RV was -8, for a total of -19. He cost the Reds almost 2 runs, and they lost 4-2).
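The batter/fielder split in the last two plays follows one pattern, which can be sketched as below. The function name `split_credit` is hypothetical, and the inputs are ERV chart values in tenths of a run. Both numbers are from the offense's point of view, so a positive fielder number means the fielder helped the batting team.

```python
def split_credit(before, expected_after, actual_after):
    """Split a play's RV between the batter and a fielder.

    All three arguments are ERV chart values in tenths of a run.
    The batter is scored as if the average fielder had made the play;
    the fielder gets the gap between that and what actually happened.
    """
    batter_rv = expected_after - before
    fielder_rv = actual_after - expected_after
    return batter_rv, fielder_rv


# Batista's fly out: before 2nd/0 out (12); expected 3rd/1 out (10);
# actual 2nd/1 out (7) because of the throw.
print(split_credit(12, 10, 7))

# Johnson's grounder: before 2nd/1 out (7); expected 3rd/2 out (4);
# actual 1st and 3rd/1 out (12) because of the muff.
print(split_credit(7, 4, 12))
```

This prints (-2, -3) and (-3, 8): the first number in each pair goes in the batter's box, the second in the margin for the fielder.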

Cabrera hits a sac fly that scores Vidro, Johnson stays at first. Before: 12 ERV (1&3, 1 out); After: 2 ERV (1st, 2 out), plus 10 for the run scored equals a RV of 0 for Cabrera.

Sledge walks. Before: 2 ERV (1st, 2 out); After: 5 ERV (1st & 2d, 2 outs), so Sledge gets 3 RV.

Schneider flies out to end the inning. Before: 5 ERV; After: 0 ERV (Expos can't score anymore), so Schneider gets a -5 RV.

So here's how my scoresheet looks:

Vidro: 7 RV, 2B, run scored
Batista: -2 RV, F8, Footnote A: Pena gets -3 RV for good throw.
Johnson: -3 RV, E4, Footnote B: Jimenez gets a +8 RV for Error.
Cabrera: 0 RV, Sac F8
Sledge: 3 RV, BB
Schneider: -5 RV, F7

Total Offensive RV: 0
Total Defensive RV: +5
Total RV: +5

Why is the total RV 5 (1/2 a run) if they actually scored 1 run? Because we are recording marginal runs above the average. Recall that the ERV for 0 on, 0 out -- the start of the inning -- is 5, or 0.5 runs. That means the average team scores approximately 0.5 runs each inning, or 4.5 per game. So when they actually score 1 run, they are scoring 0.5 above the average, or 5 RVs. Note that the ERV scoring shows that the run scored here was half due to the Expos batters and half due to the Reds fielders. Also note that Jimenez's error is the largest RV -- scoring this way shows how errors really kill, because they turn outs into bases and/or runs.

It seems that each inning should result in a Total RV that is a multiple of 5. A 1-2-3 inning is a -5 (-2, -2, -1). One run scored should be +5, two runs +15, three runs +25, etc. I didn't expect things to work out like that, i.e., to be a zero-sum thing. I guess it makes sense, but I need to think more about it.
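One way to see why the totals land on these values: each play is scored as (ERV after) minus (ERV before), so when the plays of a full inning are added up, every intermediate ERV cancels and only the starting 5 and the ending 0 are left, plus 10 per run. A small check of that, with a made-up function name and the ERVs taken from the chart in tenths of a run:

```python
def inning_total(ervs, runs):
    """Sum the RVs of an inning from the ERV of each state it passed through.

    `ervs` starts with the ERV at 0 on, 0 out (5) and ends with 0 for the
    third out. Each play is worth (after - before), plus 10 per run scored.
    """
    plays = sum(after - before for before, after in zip(ervs, ervs[1:]))
    return plays + 10 * runs


# A 1-2-3 inning: 5 -> 3 -> 1 -> 0
print(inning_total([5, 3, 1, 0], runs=0))

# The Expos 4th as it actually unfolded, one run scored:
# 5 -> 12 -> 7 -> 12 -> 2 -> 5 -> 0
print(inning_total([5, 12, 7, 12, 2, 5, 0], runs=1))
```

This prints -5 and 5. However the inning unfolds, the sum reduces to 10 times the runs scored minus 5, which gives the -5, +5, +15, +25 pattern.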

As for the pitcher, I usually assign them the negative of the Offensive RV number (here it was 0), so they don't suffer the errors of their fielders. This is usually a multiple of 5, except where there are errors. In this game, in the ninth, Rocky Biddle got an 11 RV (runs saved in his case), because there was one error in the inning, yet the Reds did not score (the error gave the Reds 6 RVs). He essentially gets credit for getting four outs in one inning.
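The pitcher arithmetic can be sketched the same way. This assumes a full inning, so the inning total is 10 times the runs allowed minus 5, and it treats the fielding RV as whatever the fielders handed the offense (positive for errors, negative for good plays). The function name is made up for the example.

```python
def pitcher_rv(runs_allowed, fielding_rv):
    """Pitcher's RV for one full inning, in tenths of a run.

    Assumes the inning total is 10 * runs - 5 (start-of-inning ERV of 5)
    and that `fielding_rv` is the net RV the fielders gave the offense.
    The pitcher gets the negative of what is left for the batters.
    """
    inning_total = 10 * runs_allowed - 5
    offensive_rv = inning_total - fielding_rv
    return -offensive_rv


print(pitcher_rv(runs_allowed=0, fielding_rv=6))  # scoreless inning, one 6-RV error
print(pitcher_rv(runs_allowed=1, fielding_rv=5))  # one run, fielders net +5
```

The first call prints 11, matching the Biddle inning; the second prints 0, matching the Expos 4th from the Reds pitcher's side.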

I pulled a Word document from this great site and modified it for ERV scoring, but I'm still working on it. If anyone is interested, use the e-mail link to the right and I can send it to you, especially if you are good at form design and can improve it.

7 Comments:

At 10:37 AM, Blogger DM said...

Well, I don't think you can judge fielding without being a bit subjective, just as we judge errors now. My standard is whether the play would have been made by the average fielder, and for good plays I have a pretty high threshold. The Pena throw in this inning drew oohs and ahhs from the crowd, so it met the standard. In the three games I've scored, it was the only good play I recognized.

You could score this way only on the official scorer for errors and not do good fielding plays, to be more consistent. But that doesn't strike me as much fun.

 
At 11:20 AM, Blogger SuperNoVa said...

I'm having a hard time with the fact that, offensively, the ERV scoring mechanism only credited the 'pos with about .5 runs. While I realize that the ERV matrix is based on average run scoring, I'm still a little uncomfortable with not giving the batters credit for a whole run (if they deserved it).

I'd be more inclined to say that of the .5 increase in expected runs, Vidro accounted for 70% etc., and then applied that to the actual runs scored. So Vidro would get .7 runs credited, rather than .35... But there's no question you are doing it accurately the way you're doing it.

I think that this DEFINITELY is the way that statistics could take into account defensive performance. I'm pretty sure that ZR (zone rating) takes into account the various "zones" in the field to which balls are hit. For example, if you could say that, of 120 balls hit to zone 95 with a runner on second, 80% of the runners on second went to third, you might say that the fielder saved [(ERV runner on 3rd, 1 out) - (ERV runner on 2nd, 1 out)] * 80% runs.

It would take a MONUMENTAL amount of number crunching for someone to figure that one out, though. You'd literally have to figure out, for each of the 24 offensive situations, what happened when a ball was hit to each of the zones.

 
At 12:51 PM, Blogger SuperNoVa said...

If DM carried the decimal out another point...you'd see that the expected run value of the sac fly was actually a positive .02 runs (.0224 if you want to go out a little further).

I do think that John has a point inasmuch as there may be a more significant difference in the probability of winning the game based on Cabrera's sac fly...according to BP's expected win matrix, the sac fly took the Expos from a .556 win probability (down two runs, 1st and 3rd with 1 out in top of the 4th) to a .367 win probability (down one run, man on 1st with two outs in the top of the 4th). So, according to BP, Cabrera's sac fly actually made the Expos' expected win value decline from .556 to .367. Ouch.

As you can tell, John, there are problems with the expected win values of a particular act in a game...namely that the sample sizes are getting smaller and can be affected by a few games where someone hit a 3 run homer in that situation, etc. Using BP's matrix, a team would be better off two down in the top of the 4th with men on first and third (.556 expected winning %) than being TIED in the top of the 4th with men on first and third and NOBODY out (.538 expected winning %). And we know that just ain't true...it's a problem of small sample size for the year 2004. Give me an expected win matrix for the last 40 years, and then it may be more useful.

 
At 2:01 PM, Blogger SuperNoVa said...

You know what bugs me a bit about ERV? That a sac fly with men on 1st and 3rd and zero outs is actually worth -0.4 runs. That's a pretty harsh penalty for bringing a guy home with a sac fly. Given the possibility of a double play in that situation, I have no problem with the guy putting some lift under the ball and getting the bird in hand. At the end of the day, you do have to get that guy in from third. As Mr. Miyagi might say, you can't expect success, you must do success.

 
At 4:17 PM, Blogger DM said...

SNV, it depends on what your definition of "success" is. Your post reminds me of different views on investing. You seem to take the conservative approach, and would consider one run for one out a "success" or fair price. The ERV data indicates that you may be too conservative, in that you're paying too high a price for that certainty. I don't know which is true, but I like scoring this way because it forces you to think about whether your judgment is right in that circumstance. As Dexys points out, many managers still believe it's a good deal to bunt a guy to second. In investing terms, they are stuffing cash under their mattress and ignoring inflation.

 
At 5:39 PM, Blogger SuperNoVa said...

Oddly enough, you've described my investing strategy pretty well....Dexy's knows that I like cash in hand (dividends) more than anything. Of course, had I held onto FRO like Dexy's did, I would have had a much better year in 2004 than I did (but it was pretty good in its own right).

There's another point here that can't be lost. Not all runs are of equal value. If the average runs you allow in a game is 4.5, the first 5 runs you score are more valuable than runs 6-8, which are more valuable than runs 9-11 in their own right.

One run down in the 8th inning, I'm going to take the sure run every time over a 55% chance at 2 runs, even though I'm losing 1/10th of a run in the exercise.

 
At 9:39 PM, Blogger DM said...

SNV, you're definitely right that runs have different values, which is why (I think) John is exploring PGP and a measure of that value, based on how the run relates to the ultimate outcome of the game.

It seems to me, though, that the situations where the runs have different values may be limited to late inning situations, where outs become more valuable. It seems you could ERV score a game, then review it for those events that would fit into this "more valuable" category (or less valuable, for the blowout games), and perhaps make adjustments if need be. So that the sac fly in the top of the ninth to put a team ahead gets some added benefit. I bet the Expected Win Percentage data could help us derive these values. Maybe instead of adding in 10 RV for a run, you add more when the inning is late and the game is close. Or have a separate ERV chart for the late innings of a close game, reflecting the added value of the runs at that point.

 
