Projections and Accuracy for hitters – a look back at 2016

By clayd On March 7, 2017

When it comes to trying to determine who had the most accurate projections, there are a lot of things to consider. And, if you try to consider them all, you won’t have a post – you’ll have a book.

You can use straight statistics, or you can bias-adjust. You can use whole stats. Rate stats. One-on-one comparisons for individual players. All stats. Just fantasy-relevant stats.

What I’ll do, here, is a fairly straightforward comparison between the numbers projected and the results. I’ll do some comparisons with and without bias adjustments, although part of your forecast skill should include not having large biases in the first place. And, for today, we’ll limit ourselves to hitters.

I’m going to be looking at average forecast errors for multiple statistics. If you’re going to do that, you really need to make sure that you are dealing with a consistent set of players. Having someone included in stat A but not stat B’s list can distort things badly. I eventually came up with a list of 256 players to form my focus group, if you will. These 256 were people that I thought, in the 2016 pre-season, would get at least 200 PA in 2016. But not just me – they only made the list if I, and Steamer, and Pecota, and Rotowire, all agreed they would get 200 PA, and then really did. We all made forecasts for these players, and they played enough to make a reasonable grade.

For projections, I have my own, of course, in two varieties – the pure computer printout (“Autoclay”), and what I get from plugging those projections into lineups and depth charts and making my own judgments of playing time (“Clay”). I took these from copies I had on my computer from April 3; the latter were the ones I used for my own fantasy league drafts, last April 2 and 3. The fellows at FantasyPros.com helped me out by sharing copies they had made for ZIPS and Steamer (FanGraphs) last spring. I had downloaded stats from Pecota (Baseball Prospectus) and Rotowire, and I could run a variety of stats myself.

Now, the simplest, most naive forecast you could make would be to simply use each player’s 2015 major league performance as his 2016 projection. That yields this:

Stat	PA	H	DB	TP	HR	BB	SO	R	RBI	SB	CS	BA	ISO	bbrate	sorate		Sum	rates	roto
2015 real	118.22	34.65	8.14	1.86	6.78	14.44	30.10	20.82	22.75	4.55	2.07	12.33	17.33	10.21	22.84		327.09	62.70	67.24

The numbers for PA through CS are simply the raw, average, absolute value of the differences, between forecast and reality, for all 256 players. The number for batting average shows the average difference, in hits, between projected and actual batting average, given real at-bats. For example – suppose someone really went 100-for-400, a .250 batting average. The forecast was .293 – so the error is 17.2 hits (.293 minus .250, times 400). Similarly, the ISO number shows the difference in extra bases between actual and forecast isolated power. bbrate is the difference, in walks, between real and forecast BB/PA, for actual PA; sorate is the same thing for strikeouts. The “Sum” box is the simple sum of the 15 category boxes; “rates” sums just the four rate statistics; “roto” sums the R, HR, RBI, SB, and BA categories. So a 327 sum – that’s your naive baseline.
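The scoring described above can be sketched in a few lines of Python. This is an illustration rather than the code behind the tables: the function names are my own, and I am assuming the ISO error is scaled by actual at-bats the same way the batting average error is, which the description implies but does not spell out. The `row_2015` values are copied from the "2015 real" table.

```python
from statistics import mean


def mean_abs_error(projected, actual):
    """Average absolute difference between forecast and actual counts."""
    return mean([abs(p - a) for p, a in zip(projected, actual)])


def rate_error(projected_rate, actual_events, actual_chances):
    """Rate-stat error, converted to events over the actual opportunities."""
    actual_rate = actual_events / actual_chances
    return abs(projected_rate - actual_rate) * actual_chances


# The batting-average example from the text: forecast .293, actual 100-for-400.
ba_error = rate_error(0.293, 100, 400)
print(round(ba_error, 1))  # 17.2

# The "2015 real" row as published, to show how the summary columns are built.
row_2015 = {
    "PA": 118.22, "H": 34.65, "DB": 8.14, "TP": 1.86, "HR": 6.78,
    "BB": 14.44, "SO": 30.10, "R": 20.82, "RBI": 22.75, "SB": 4.55,
    "CS": 2.07, "BA": 12.33, "ISO": 17.33, "bbrate": 10.21, "sorate": 22.84,
}
RATES = ("BA", "ISO", "bbrate", "sorate")
ROTO = ("R", "HR", "RBI", "SB", "BA")

total = sum(row_2015.values())           # the "Sum" column
rates = sum(row_2015[k] for k in RATES)  # the "rates" column
roto = sum(row_2015[k] for k in ROTO)    # the "roto" column
print(round(total, 2), round(rates, 2), round(roto, 2))
```

Summing the rounded table entries reproduces the published 327.09. The rates and roto sums land within 0.01 of the published 62.70 and 67.24, which is what you would expect if the published totals were computed from unrounded values.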

You can do a little better by taking 2013-15 average, instead of just 2015:

Stat	PA	H	DB	TP	HR	BB	SO	R	RBI	SB	CS	BA	ISO	bbrate	sorate		Sum	rates	roto
2013-15 real	119.13	34.55	7.78	1.49	6.41	15.01	24.65	21.37	22.49	4.26	1.90	11.54	15.40	9.77	23.50		319.27	60.21	66.08

Your next step up in sophistication is to start adding some minor league stats into the ratings. When I run a simple 2013-15 DT for each player, and use that for my forecast – this will have no park factors, no adjustments to league average, and the PA will generally be projected into a 650-PA total, so they won’t project part-time PA at all – well, that gets me this:

Stat	PA	H	DB	TP	HR	BB	SO	R	RBI	SB	CS	BA	ISO	bbrate	sorate		Sum	rates	roto
2013-15 DT	152.04	41.39	9.25	1.95	7.25	17.37	29.25	25.11	24.32	5.83	2.62	9.91	14.06	8.30	10.15		358.80	42.43	72.42

which isn’t really very good at all, thanks to the way PA are thrown off, and all the counting stats along with it. Projecting to a 650-PA season is definitely NOT the way to go. But the rate stats are dramatically better. Let’s improve on the previous by taking a simple average of the three years 2013-15, with one little added wrinkle – minor league stats only count for 50% of major league stats.

Stat	PA	H	DB	TP	HR	BB	SO	R	RBI	SB	CS	BA	ISO	bbrate	sorate		Sum	rates	roto
13-15 DT+	104.98	30.73	6.74	1.40	5.95	12.99	17.96	17.64	19.78	4.08	1.88	9.85	14.13	8.29	9.90		266.28	42.17	57.29

That figure, by itself, is comparable to most of the other stats I’ll run. That value is pretty much the primary input to the autoclay – the computer printout, without any further help from me:

Stat	PA	H	DB	TP	HR	BB	SO	R	RBI	SB	CS	BA	ISO	bbrate	sorate		Sum	rates	roto
autoclay	98.70	27.64	6.41	1.52	5.94	11.37	22.47	16.04	16.38	3.81	1.50	9.37	14.43	7.34	12.80		255.72	43.94	51.53

Next was ZIPS, taking another step forward:

Stat	PA	H	DB	TP	HR	BB	SO	R	RBI	SB	CS	BA	ISO	bbrate	sorate		Sum	rates	roto
zips	94.80	26.82	6.39	1.66	6.22	11.56	22.96	16.08	16.72	3.97	1.67	8.98	15.07	8.16	11.92		252.99	44.14	51.98

Followed by Rotowire’s projections:

Stat	PA	H	DB	TP	HR	BB	SO	R	RBI	SB	CS	BA	ISO	bbrate	sorate		Sum	rates	roto
rotowire	91.30	27.20	7.04	1.65	6.08	11.68	22.34	15.84	16.83	4.37	1.83	9.51	15.19	8.11	12.73		251.69	45.54	52.63

Ahead of Rotowire is where I show up:

Stat	PA	H	DB	TP	HR	BB	SO	R	RBI	SB	CS	BA	ISO	bbrate	sorate		Sum	rates	roto
clay	91.77	26.32	6.30	1.50	6.11	11.35	22.35	15.85	16.60	3.72	1.51	9.32	14.42	7.74	12.47		247.33	43.95	51.59

I’d like to say that’s where it ends, but I have to honestly report that still leaves PECOTA:

Stat	PA	H	DB	TP	HR	BB	SO	R	RBI	SB	CS	BA	ISO	bbrate	sorate		Sum	rates	roto
pecota	90.83	25.77	6.14	1.59	5.90	11.77	21.69	15.57	16.25	4.04	1.51	9.04	14.62	8.08	12.61		245.40	44.34	50.80

And, on top, Steamer:

Stat	PA	H	DB	TP	HR	BB	SO	R	RBI	SB	CS	BA	ISO	bbrate	sorate		Sum	rates	roto
steamer	88.62	25.57	6.18	1.51	6.20	10.91	21.64	15.85	16.48	3.93	1.94	8.88	14.91	7.44	12.01		242.08	43.24	51.35

The differences really aren’t that big between them. For instance, if we take the best score on a player by player basis for the 256 player group, then among the top three the results are fairly even – Clay was best on 80, Steamer on 89, Pecota on 87. Clay beats Steamer 129-127 head to head; Pecota beats Clay 133-123; and Steamer beats Pecota 132-124.
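Counts like these are easy to tally once you have one score per player per system. Here is a rough sketch, with the caveat that the per-player score is my assumption (I am treating it as each player's total error across categories, lower being better), and the names and numbers below are invented for the example. Ties are simply not credited to either side.

```python
from collections import Counter

# Hypothetical per-player total error for each system (lower is better).
scores = {
    "Player A": {"clay": 210.0, "steamer": 195.5, "pecota": 201.0},
    "Player B": {"clay": 180.2, "steamer": 190.0, "pecota": 185.4},
    "Player C": {"clay": 250.0, "steamer": 240.0, "pecota": 230.0},
}


def best_counts(scores):
    """How many players each system had the lowest error on."""
    return Counter(min(by_sys, key=by_sys.get) for by_sys in scores.values())


def head_to_head(scores, a, b):
    """(wins for a, wins for b) over all players; ties count for neither."""
    a_wins = sum(1 for s in scores.values() if s[a] < s[b])
    b_wins = sum(1 for s in scores.values() if s[b] < s[a])
    return a_wins, b_wins


print(best_counts(scores))
print(head_to_head(scores, "clay", "steamer"))
```

Note that head-to-head records do not have to line up with the overall ranking, or even with each other: the real results above form a circle, with Clay edging Steamer, Pecota edging Clay, and Steamer edging Pecota.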

{Note – I had this virtually finished by about Valentine’s Day, and then – right before hitting the publish key – made one more check and discovered that the “real” stats I was using for validation were not the actual, real stats, wiping out a lot of work and several erroneous conclusions. Then things went crazy at work, and I focused on getting stats for other leagues in, and before I knew it, it’s March – not just barely, but a full week in. Time sure seems to move a lot faster than it used to. It may also have to do with being more committed to my day job than I used to be – I could never work on this during the day, the way I used to.}