In pre-season I wrote: the league leader was putting up numbers between 70-78, while the third-best mark was in the 54-64 range. My best guess right now is that by the end of 2023 we will see numbers about like that.

At the end of the year we had Ronald Acuña leading the majors with 73 (between 70 and 78), with Esteury Ruiz at 67 and third-place Corbin Carroll at 54 (in the 54-64 range). That held up remarkably well.

Way back when, I used to routinely run stats comparing how different run estimators performed. Been several years since I did that, so time to blow the electron dust off the code and see how things went.

Let’s start with 2023, and say: it was a good year for Equivalent Runs. Here’s how it looks straight off the window (and keep in mind, I apparently wrote the initial version of the code around 1995):

DESK /home/clayd/baseball/alltime $ eqacomp2016
AMERICAN (1) OR NATIONAL (2) LEAGUE, or BOTH (3)?
3
CALCULATE ENTIRE FILE? (1=YES, 2=NO)
2
ENTER FIRST YEAR,LAST YEAR TO WORK (i.e, 1950,1994)
2023,2023
PRINT ALL OUTPUT? (1=YES,0=NO)
1


team year r eqa woba rc bsr
COL-N 2023 721 687 686 703 693 -34 -35 -18 -28
CIN-N 2023 783 792 783 780 788 9 0 -3 5
WAS-N 2023 700 697 692 706 699 -3 -8 6 -1
ATL-N 2023 947 957 966 992 948 10 19 45 1
MIL-N 2023 728 685 692 686 687 -43 -36 -42 -41
ARI-N 2023 746 751 746 741 745 5 0 -5 -1
MIA-N 2023 668 690 701 722 694 22 33 54 26
SF_-N 2023 674 662 662 660 671 -12 -12 -14 -3
NY_-N 2023 717 723 720 714 729 6 3 -3 12
LA_-N 2023 906 889 889 877 882 -17 -17 -29 -24
PHI-N 2023 796 815 811 820 812 19 15 24 16
CHI-N 2023 819 800 795 787 796 -19 -24 -32 -23
PIT-N 2023 692 698 685 680 699 6 -7 -12 7
STL-N 2023 719 761 766 764 764 42 47 45 45
SD_-N 2023 752 769 775 757 767 17 23 5 15
NY_-A 2023 673 667 661 660 677 -6 -12 -13 4
HOU-A 2023 827 820 831 826 816 -7 4 -1 -11
BOS-A 2023 772 776 782 788 773 4 10 16 1
OAK-A 2023 585 625 613 610 629 40 28 25 44
TB_-A 2023 860 839 843 841 832 -21 -17 -19 -28
LA_-A 2023 739 753 755 755 756 14 16 16 17
SEA-A 2023 758 765 764 753 765 7 6 -5 7
BAL-A 2023 807 764 765 765 759 -43 -42 -42 -48
CHI-A 2023 641 615 602 632 626 -26 -39 -9 -15
MIN-A 2023 778 790 801 786 791 12 23 8 13
DET-A 2023 661 655 649 655 662 -6 -12 -6 1
CLE-A 2023 662 681 668 684 677 19 6 22 15
TEX-A 2023 881 872 882 876 866 -9 1 -5 -15
KC_-A 2023 676 686 665 674 682 10 -11 -2 6
TOR-A 2023 746 771 785 782 770 25 39 36 24


EQA 13352. 21.10
WOBA 15830. 22.97
WOBA-RC 15291. 22.58
RC 17446. 24.11
BaseR 13929. 21.55
OPS 18459. 24.81
TA 14917. 22.30
BA 64418. 46.34
OBA 34009. 33.67
SLG 21019. 26.47
BP_BR 13761. 21.42
LW 14704. 22.14
SilverR 16017. 23.11
XR 14123. 21.70
XRR 16179. 23.22
OPS_WRNG 18026. 24.51
MORRIS 19891. 25.75
OPI 40320. 36.66
LGE 151776. 71.13
OTS 19562. 25.54
SCA 69453. 48.12
CA 20940. 26.42

So the first block (if I print all) gives each team's actual runs (r), then the run estimates from four of the stats I run, then each estimate's delta from actual (estimate minus actual). As you can see, everybody had problems with some teams: Milwaukee, St. Louis, Baltimore. The second block lists every statistic I have written into the program, along with its error sum of squares and its root mean square error. Let me re-show that last part in reverse order, worst to best:

LGE       71.13  League average runs per PA.
SCA       48.12  Secondary Average
BA        46.34  Batting Average
OPI       36.66  I don’t remember what this is. A weird linear weight divided by outs.
OBA       33.67  On-base average.
SLG       26.47  Slugging average
CA        26.42  Combined average (early version of EQA; 2/3 BA + 1/3 ScA)
MORRIS    25.75  Morris Exact RPG
OTS       25.54  On-base Times Slugging
OPS       24.81  On-base plus slugging (normalized separately)
OPS_WRNG  24.51  On-base plus slugging (normalized together)
RC        24.11  Runs Created
XRR       23.22  Furtado Extrapolated Runs, Reduced
SilverR   23.11  Silver Linear weights formula
WOBA      22.97  WOBA (book formula)
WOBA-RC   22.58  WOBA (website version with their constants)
TA        22.30  Total Average (Boswell)
LW        22.14  Linear Weights (Palmer)
XR        21.70  Extrapolated Runs (Furtado)
BaseR     21.55  BaseRuns (Smyth)
BP_BR     21.42  BaseRuns – alternate version I developed
EQA       21.10  Equivalent Runs
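For anyone who wants to check the arithmetic, here is a rough Python sketch of how the two columns in the second block appear to relate. The assumption, inferred from the printed numbers rather than from the program itself, is that the root mean square error is the square root of the error sum of squares divided by the number of team-seasons (30 for 2023). The function names are made up for illustration; the deltas are the EQA column from the 2023 team listing above.

```python
import math

def rmse_from_sse(sse, n_teams):
    """Root mean square error from an error sum of squares (assumed: sqrt(SSE / n))."""
    return math.sqrt(sse / n_teams)

def rmse_from_deltas(deltas):
    """Same statistic, computed directly from per-team (estimate - actual) deltas."""
    sse = sum(d * d for d in deltas)
    return math.sqrt(sse / len(deltas))

# EQA deltas for the 30 teams in the 2023 listing, in printed order.
EQA_DELTAS_2023 = [
    -34, 9, -3, 10, -43, 5, 22, -12, 6, -17, 19, -19, 6, 42, 17,
    -6, -7, 4, 40, -21, 14, 7, -43, -26, 12, -6, 19, -9, 10, 25,
]

# Reported SSE for EQA in 2023 was 13352 over 30 teams.
print(round(rmse_from_sse(13352, 30), 2))   # 21.1, matching the printed 21.10
print(round(rmse_from_deltas(EQA_DELTAS_2023), 2))
```

The second figure comes out a little lower than the printed one, presumably because the listing shows estimates rounded to whole runs while the program sums the unrounded errors.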

I should note a couple of things. Every statistic on this list has access to the same information – AB, H, DB, TP, HR, BB, SO, SB, CS, HBP, SH, SF. And all have access to the league total of runs scored. All statistics will get the league total runs exactly right – the challenge is how to split them amongst the various teams. The typical pattern goes like this:

BA=(H/AB)/(LGH/LGAB) #calculate normalized variable: team batting average divided by league batting average

BARUN=(1.9*BA-.9)*PA*LGRPPA #use a simple regression equation, which varies with each stat, to get the estimate for runs per plate appearance. A couple of these use Runs per Out, rather than PA.

If someone’s formula relies on errors, they don’t get it here. If they use a different weight for BB and HBP – no. I do force those to be the same.
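The normalize-then-regress pattern for the batting average example can be sketched in Python as below. This is only an illustration, not the program's actual code: it reads the denominator of the normalization as league hits over league at-bats, takes the 1.9 and -0.9 coefficients from the BA example above, and uses made-up team and league totals. The function and argument names are hypothetical.

```python
def normalized_ba(h, ab, lg_h, lg_ab):
    """Team batting average relative to league batting average (1.0 = league average)."""
    return (h / ab) / (lg_h / lg_ab)

def ba_runs(h, ab, pa, lg_h, lg_ab, lg_r, lg_pa):
    """Estimated team runs from batting average alone, using the post's BA equation."""
    ba = normalized_ba(h, ab, lg_h, lg_ab)
    lg_rppa = lg_r / lg_pa              # league runs per plate appearance
    return (1.9 * ba - 0.9) * pa * lg_rppa

# Hypothetical league totals and two hypothetical teams (not real data).
lg = dict(lg_h=42000, lg_ab=165000, lg_r=22000, lg_pa=184000)
avg_team = ba_runs(h=1400, ab=5500, pa=6100, **lg)   # exactly league-average BA
good_team = ba_runs(h=1500, ab=5500, pa=6100, **lg)  # same PA, more hits

print(round(avg_team, 1), round(good_team, 1))
```

A team with a league-average batting average gets a multiplier of 1.9 - 0.9 = 1, so its estimate is just its plate appearances times the league runs per PA. That is the sense in which the regression only redistributes the league's runs among teams.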

Lest anyone think that is a fluke, here’s the chart for 2001-2023. EQA comes in third here:

BP_BR     19.55
XR        19.69
EQA       19.72
BaseR     19.74
SilverR   20.30
XRR       20.30
LW        21.17
WOBA      21.21
RC        21.77
WOBA-RC   22.36
TA        22.37
MORRIS    22.93
OPS_WRNG  24.11
OPS       24.59
OTS       25.21
SLG       26.82
CA        26.97
OPI       30.74
OBA       34.01
BA        43.49
SCA       45.98
LGE       58.87

And then again for all of professional league history, 1871-2023:

EQA       22.87
BP_BR     23.05
BaseR     23.40
XR        23.44
SilverR   23.73
XRR       23.96
TA        24.30
WOBA-RC   24.51
WOBA      24.74
LW        24.85
RC        25.75
MORRIS    27.37
OPS_WRNG  28.12
OPS       28.38
OTS       29.14
CA        29.37
SLG       32.62
OBA       36.14
OPI       36.17
BA        42.39
SCA       47.87
LGE       66.83

This is why I keep running things with EQA. I don’t see any evidence that any other stat (other than BaseRuns, either the original or my modified version) consistently does even as well, much less better. And it is why I continually sigh over seeing such wide use of WOBA.

 
