mark says:

Show your projections from last year.

On the Projections page, there are links to the 2012 and 2013 projections. They come from the spreadsheets I saved on the dates given, run through the same csv-to-webpage script I used to make the current pages.

tangotiger says:
To help people understand how the #1 team is forecasted to “average” 91 wins, can you also show the averages for #1 through #30? That is, take the highest win total for each of your simulations (regardless of team), and show us that average. Then do the same for the second highest and so on.

Andy says:
He has no team winning more than 91 games… very likely.. lol

Tom Sheffield says:
It’s still way too early for projections like this but I do find great fault with 91 wins being the best record in baseball this year. The AL East looks about right standings wise.

There’s an issue here that I find hard to explain.

It is almost certainly NOT the case that the best record in baseball will only amount to 91 wins. In fact, if you look at the playoff chances page, you’ll see that the AL East says this

Average wins by position in AL East: 95.2 87.7 81.9 76.1 68.5

indicating that it will take 95 wins, on average, to win the division – even though no team in the division, on average, gets above 90. Every division, in fact, takes 94-95 wins to finish first. WTF? Teams don’t win _on average_. The winning team will be the one who combines a good projection AND beats their projection. If the past three years are any indication, the average team is going to be 5 games off these projections – and a couple of teams will miss by 20. In the odds page, I play the season out a million times. In the real world, it will only play once, and how you perform relative to your projection determines your final standing.
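The gap between each team’s average and the average by finishing position can be reproduced with a toy simulation. This is only a minimal sketch, not the code behind the odds page: the five projected win totals are made up, each team’s season is drawn independently from a normal distribution (which ignores the fact that division rivals play each other), and the 6-win standard deviation is an assumption.

```python
import random

def average_wins_by_position(projected_wins, sd=6.0, seasons=20000, seed=1):
    """Average win total at each finishing position over many simulated seasons."""
    rng = random.Random(seed)
    totals = [0.0] * len(projected_wins)
    for _ in range(seasons):
        # One simulated season: each team lands somewhere around its projection,
        # then the results are ranked from first place to last.
        season = sorted((rng.gauss(mean, sd) for mean in projected_wins), reverse=True)
        for position, wins in enumerate(season):
            totals[position] += wins
    return [total / seasons for total in totals]

# Hypothetical division in which no team projects above 90 wins.
division = [90, 87, 82, 76, 70]
by_position = average_wins_by_position(division)
print("Projected:  ", division)
print("By position:", [round(w, 1) for w in by_position])
```

The first-place average comes out above the best projection and the last-place average below the worst one, because in each simulated season the top spot goes to whichever team both projects well and beats its projection.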

There is no doubt in my mind that the best teams will be better than their projection, and the worst teams will be worse. Last year, the six first place teams averaged 8.7 wins better than their projection. Only the Tigers were able to underperform their projection and still win their division.

The six second place teams were +6.5.
The third place teams averaged -0.2…basically zero. Just meeting your projection is a recipe for mediocrity.

The fourth place teams averaged -3.
The last place teams averaged -10.

Whether the projection error comes from mis-estimating the real quality, or just random luck, or a mid-season tradeoff of talent from the weak to the strong that exaggerates the difference…there will be errors, and they have as much to do with deciding the winners as real talent. I’m sorry if that sounds like a copout.

David Lowe says:
You might want to tweak your software. The Royals aren’t going to be 9 games worse than they were last year, bro.

arttieTHE1manparty says:
Insane! How does the computer project the Royals to get worse??? With that defense and relief corps? No way…

Any projection is going to upset fans of various teams, especially if the projection comes in lower than they think is deserved.

With the Royals, the big concern for me is the pitching. I expect Shields to come back about a half run in ERA, and I don’t see quality replacements for Santana and Chen, who surprisingly put up over 400 IP @ 3.50 ERA. Two things I will concede – there is some evidence, looking at the last two years of projections, that I under-count defense…or rather, that teams with good(bad) defense don’t get their runs allowed moved down(up) enough. The Royals and Orioles are two teams who might be suffering from that bias…if it is real. It didn’t show up in the 2011 data with nearly the same effect as in 2012-13.

Now, Guthrie at a 5.00-ish ERA. I’m perfectly comfortable with that projection. He was 20 runs above average in the DR component – my way of saying he gave up 20 fewer runs than expected, based on his other stats. He doesn’t have a history of putting up that kind of number, and even if he did, that component score heavily, heavily trends towards zero in future years. The issue I have with the projection, in retrospect, is that there’s no way he gets 30 starts with that level of performance. It’s not as though there’s a ton of depth there, though, so it’s not going to make a big difference, but future iterations are liable to come up a couple of wins for them. It IS a process to run these stats, and this was just an opener.

JR says:
Sorry, but if you think the Reds will be under .500 your computer has a bad virus.

I predict that Cincinnati fans will become thoroughly sick of the phrase “you can’t steal first base” this season.


14 Responses to …and replies to the first draft

  1. Darren says:

    Thanks Clay, I always look forward to these. A couple of points. A) The SD of your projections is just 6.5, which seems lower than the other projection systems that run sims. Last year I tracked a number of projections and I think yours had the lowest SD of all of them. The rest usually fall in the 7.5 to 8.5 range. Is this a function of you being too conservative on the defence? B) The Wins add up to 2431 and the Losses add up to 2429. I assume this is rounding. C) I notice that very few players are projected for negative WARP. However, from looking at previous history, most teams lose about 5-6 wins giving PA and IP to below-replacement players. Is this a function of you using a much lower level of replacement value than the other sites (i.e. 1000 WAR on Fangraphs and Baseball Reference)?


    • clayd says:

      I’d say A) yes, it’s a conservative system. B) It is definitely rounding; my internal check is right on 2430.0, but that does carry the decimal wins. C) I don’t think a team ever gives plate appearances to a player they know or even think is below replacement. Teams give PA to players who _turn out to be_ below replacement, but that is quite different from _expecting_ them to be below replacement…and they sometimes pile up because they keep expecting a player with a history of positive performance to turn things around. I would contend that if you are in a position where you can’t actually get ahold of a replacement level player, then you’ve defined replacement too high.
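The rounding effect in point B can be shown with a small sketch. The league below is entirely hypothetical (it is not the actual projection data); it is built so that the decimal wins sum to exactly 2430, which is 30 teams times 162 games divided by two, while the per-team rounded totals drift by one in each direction.

```python
def rounded_totals(decimal_wins, games=162):
    """Sum wins and losses after rounding each team's line to whole numbers."""
    wins = sum(round(w) for w in decimal_wins)
    losses = sum(round(games - w) for w in decimal_wins)
    return wins, losses

# Hypothetical 30-team league; the three fractional teams all round upward.
league = [90.6, 90.6, 80.8] + [81.0] * 8 + [80.0] * 19

print(round(sum(league), 1))   # 2430.0, the total with decimal wins carried
print(rounded_totals(league))  # (2431, 2429) after rounding team by team
```

Wins and losses still add up to 4860 between them; the rounding only shifts one game from the loss column to the win column.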

  2. Darren says:

    A couple of other points: based on the WARP projections for Major League players, you are allocating 658 wins for Hitters and 629 for Pitchers for a total of 1287 (although I note Chris Carter is projected to have -13 WARP, which I think you meant to be his Fielding). This means you split Hitting and Pitching 51/49, which is much tighter than what I have seen from other sites and from what Tango accepts (~58/42). Is this your intent? Also, your WARP prediction for Trout (6 WARP) is the lowest I have seen out of any of the projections thus far (8-9). Is this due to much higher regression, or something your model sees that others do not?

    • clayd says:

      The Carter number was just a straight-out bug; the total WARP for hitters is 677, so a 52/48 split. I haven’t been paying particular attention to that value, but…checking…I gave 591 to players in 2013, and 554 to pitchers. That’s real, not projected, so at least I’m consistent. As for Trout, I think the model is too low on the batting average…yes, regression may be overdone. That was a conscious decision I made, but I can’t say I’d scoped the results on every player. In passing, the 6.0 may seem low…but no one scores higher.

  3. John C says:

    Actually, Jeremy Guthrie has had a number of seasons where he has out-performed his FIP and xFIP by wide margins. What he did in 2013 was not out of the ordinary for him. He beat his expectations by even bigger amounts in 2008 and 2010, and by nearly as much in 2007. He beat expectations by smaller margins in ’09 and ’11, and the only place he’s been worse than expected was in his time with the Rockies.

    The 2013 season for Guthrie wasn’t anything unusual. That’s just who he is. Assuming a 5.00 ERA for him because his xFIP says so is a poor assumption to make.

    • clayd says:

      Disagree. He has had a number of seasons where he has allowed substantially fewer hits than expected, which comes through in the ‘component batting average’ stat I use, the cBA column. For the majority of pitchers, that component is highly volatile and regresses strongly to zero. Guthrie is the rare pitcher who does seem to have some skill for this stat. But that isn’t what he did last year; he gave up essentially an average number of hits, but was 20 runs above average in the cDR category – that’s how many runs he gave up, compared to expectation. That is an even more volatile statistic than cBA, and regresses even more strongly towards zero.

  4. Wallee Wright says:


    How does Chris Carter of the Astros earn a -13+ WARP when the next lowest in all of baseball is just -3+? I know he strikes out often, but he also had respectable power and RBI numbers last year.

    • clayd says:

      He’s actually a +0.8. There was a cell in the spreadsheet that should have been blank, but wasn’t, and as a result he picked up -100 runs of defense at DH. I fixed that and checked other teams to make sure the same issue wasn’t happening for another team.

  5. Tim Blaker says:

    According to the list that was available from Baseball Prospectus last year, Yasiel Puig had this Davenport ID (PUIG19901207A) and Adeiny Hechavarria had this one (HECHAVARR19890415A). But in your projections this year, you appear to be using PUIGcubaY01 and HECHEVARRIAcubaA01, respectively.

    Just wondering whether the change is correct … as I map sets of player IDs against various sources (including MLB’s ID) and I want to make sure I’m using the correct one.


    • clayd says:

      I guess “right” would depend on whether you’re getting the data from me or BP, since we’re not co-ordinating our lists with each other. In the cases you mention, I had used these IDs for the players while they were still in Cuba; that’s the standard form for Cuban players, since I can’t usually get birthdates. Even though the LASTNAMEdateA format is preferred, I don’t change ones that are already in place if I can help it. It’s a real pain in the rear to track down all of the places I might have used one or the other when I screw up and list a guy under two different names.

  6. Rick says:

    Clay, I get that the Reds offense will take a big hit given the loss of Choo. But you also have them allowing more than 50 runs more than they did last year. Where is that coming from?

    Their team BABIP was low last year, but it’s been consistently low the last few years given the quality of their defense. And when you consider that Hamilton will replace Choo in CF and that the loss of Arroyo is offset by a healthy Cueto and more starts from Cingrani, I’m just struggling to see 6 wins worth of run prevention disappearing unless you regress BABIP as if defensive performance were random.

    • clayd says:

      The biggest differences I can see are:
      a) a substantial fall-off from Mike Leake, who was +13 last year in components that generally trend towards zero.
      b) several bullpen members, principally Simon and LeCure, also fall back. Still good, but a drop nonetheless.
      c) Cincinnati only used 7 starters last year; the expected 5 from the rotation at the start, plus Cingrani, plus Greg Reynolds. It is unlikely that they have such good health again, and so I have 100 innings going to scrubby others where, last year, they only had the 30 from Reynolds. There’s no reserve of Cingrani’s quality this time.
      d) the defense had a very high rating last year, and will probably decline, even with Hamilton.

  7. tangotiger says:

    I give a 57/43 split, similar to what the other reader posted. I think Fangraphs gives 57/43 and Baseball Reference gives 59/41. Bill James I think is at 62/38.


    Clay: I tried to explain the “best team” v the “top observed team” on my blog.


    I saw you on Clubhouse Confidential: congratulations! They at least seemed receptive, so that’s a step forward. I was worried they might be dismissive.

    • clayd says:

      Thanks. A couple of remarks on the Clubhouse Confidential appearance:

      1) I had no graphics in front of me at all. Not even what they had on screen. No references to any notes.
      2) Spent 20 minutes getting from the front of the building to the entrance of the parking garage behind it. I clearly don’t know Bethesda, and that did not help my mindset going in.
      3) They had me perched on top of a bar stool in front of the camera. With no back. My feet were hanging at least a foot off the ground. No anchor whatsoever. And they kept telling me through the mike in my ear, “Don’t swivel”.
