Wednesday, July 18, 2012

Counterpoint

Let me be the first to challenge my previous post.

The probability of scoring a run in baseball is a complicated mix of the probabilities of walks and hits and steals (and hit batsmen and home runs and so forth). So a nice simple Poissonian model isn't really appropriate for estimating the expected score of the highest-scoring inning. It is possible that these combinations conspire to make higher scores more likely. If that is the case, then the "second-highest" inning should have a distribution more like that of the "highest" than a run-of-the-mill inning.

Also, I was hinting that this might be due to pitcher failure. That should be more likely in later innings, so I should check that also.

So, I want to look at the highest and second-highest distributions, and at the Monte Carlo estimates of what those distributions should look like based on the reduced-reduced score. I also want to see what the distributions of the inning numbers of the highest and second-highest scores look like. The former should be more skewed toward later innings than the latter if this is due to pitcher issues. And the second-highest score should look more like the prediction from the average than like the highest score, if the high-score inning is pitcher failure.
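For reference, here is a minimal sketch of what the Monte Carlo baseline could look like, assuming each inning is an independent Poisson draw with the same mean. This is not the code behind any numbers in these posts. The function names, the nine-inning game length, and the 0.5 runs-per-inning figure in the usage note are placeholders of mine, not values taken from the data.

```python
import math
import random
from collections import Counter


def poisson_sample(rng, mean):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_top_two(mean_runs_per_inning, innings=9, games=10000, seed=0):
    """Tally the highest and second-highest inning scores over simulated games.

    Assumes every inning is an independent Poisson draw with the same mean,
    which is exactly the simplification being questioned here.
    """
    rng = random.Random(seed)
    highest = Counter()
    second = Counter()
    for _ in range(games):
        scores = sorted(
            (poisson_sample(rng, mean_runs_per_inning) for _ in range(innings)),
            reverse=True,
        )
        highest[scores[0]] += 1
        second[scores[1]] += 1
    return highest, second
```

Calling `simulate_top_two(0.5)` returns two counters mapping a score to the number of simulated games in which it was the highest (or second-highest) inning. Those are the baselines to hold the observed distributions against. If the real second-highest innings come out heavier-tailed than this baseline, that points toward the "combinations conspire" explanation rather than pitcher failure.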

So let's see if my predictions agree with the data. (No, I haven't done these yet.)
