I pulled this out from the comments because I thought the answers might be helpful to more than one person. With thanks to Josh Clayton…

Hello Clay. Thank you for your work – I really enjoy the player comps. Why do some of a player’s comps change when the Organizational DT reports are generated every month? Michael Conforto had Lance Berkman as a comp in November but no longer does now.

The simple answer to the question is that it means I changed something. The offseason is a time for me to review, reanalyze, and test the assumptions that go into the model that generates the comps; I try to avoid doing it while the season is underway, simply because I cannot reprocess everybody else and keep up with the real-time player updates. There were two main changes I worked in (before a third one, which I’ll get to in a moment). First, I wanted to raise the importance of the player’s level – enlarging the penalty between stats accumulated at AAA versus stats accumulated at AA, for instance. The second, and probably the one with the larger effect, was changing the player’s baseline stats. The way I calculate the 3-year averages, for all players, moved to a weighted average, with different weights for different stats, and also with variable regression to the mean.

What I mean by that is – some statistics are more deterministic than others. Pitchers’ strikeouts are strong – the most recent season has a high weight, and regression to the mean (RTM) is small, with weights of about .5/.2/.1. The most recent season is carrying 63% of the total, and the difference between the sum of the weights (.8) and 1.0 is small. Pitchers’ hits allowed, by contrast, have weights like .15/.1/.1. They are essentially even over the last three years, with a large RTM.
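
As a rough illustration of that weighting, here is a minimal Python sketch. It assumes the gap between the sum of the season weights and 1.0 is assigned to a league-average rate, which is one reading of the description above and not necessarily how the actual program works. The function names, the sample rates, and the league mean are all hypothetical.

```python
def baseline(rates, weights, league_mean):
    """Blend seasonal rates (most recent first) with a league mean.

    Assumption: whatever weight is not given to the player's own
    seasons (1.0 - sum(weights)) goes to the league mean.
    """
    if len(rates) != len(weights):
        raise ValueError("need one weight per season")
    rtm_weight = 1.0 - sum(weights)
    own = sum(w * r for w, r in zip(weights, rates))
    return own + rtm_weight * league_mean


def recent_share(weights):
    """Share of the player's own weight carried by the most recent season."""
    return weights[0] / sum(weights)


# Weights quoted in the post; the rates and league mean are made-up numbers.
STRIKEOUT_WEIGHTS = (0.5, 0.2, 0.1)      # small regression: 0.2 to the mean
HITS_ALLOWED_WEIGHTS = (0.15, 0.1, 0.1)  # large regression: 0.65 to the mean

k_baseline = baseline([10.0, 9.0, 8.0], STRIKEOUT_WEIGHTS, league_mean=7.0)
h_baseline = baseline([10.0, 9.0, 8.0], HITS_ALLOWED_WEIGHTS, league_mean=7.0)
```

With identical inputs, the strikeout-style weights keep the result close to the player's own recent numbers, while the hits-allowed-style weights pull it most of the way to the league mean.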

And where do you get the player’s height and weight info? Dominic Smith’s page lists him at 185 pounds but Baseball Reference has him at 239.

Well, this is going to set off another change in the comps, and is just flat-out embarrassing. I have a program that is supposed to run to keep my master bio file up to date. It appears that said program broke a couple of years ago without me noticing. As a result – once a player entered my database, his info became static. When Dominic Smith entered my lists – as an 18-year-old in the Appalachian League – he was listed as 6’0″, 185. Now – aged 22 – he’s added 54 pounds and is listed as 239. He was still listed at 185 by MLB after 2014, and after 2015. He went to 250 during the 2016 season, and 239 this year. The numbers come from an MLB stat feed.

While weights are the most likely thing to change, the same program will also pick up on changes in height (anyone drafted 18 or younger may still have some growth upwards, not just outwards), names (Michaels become Mikes), and occasionally correct errors in handedness or birthdates – recognizing that some of those birthdate “errors” are deliberate. So that program did get patched up, and everyone was updated, and now everyone needs to be rerun. That is going to take a few days.

Also, the comps for some of the pitchers the Mets drafted in 2017 are for different players. David Peterson was the Mets’ top pick this past year, and I was curious what his comps were, but his page instead lists the comps for Tommy Peterson. The other pitchers whose pages display the comps of a different player are: Marcel Renteria, Aaron Ford, Tony Dibrell, Matt Cleveland, Kyle Wilson, Joshua Walker, Nate Peden, Noah Nunez, Liam McCall, Dariel Rivera, Bryce Hutchinson, Ronnie Taylor Jr., and Yadiel Flores.

That indicates the program broke while running his projection, and inserted the numbers for the player previously run – which indicates I am not clearing a file before starting. I can see that the re-runs did fix Matt Cleveland’s page – it hasn’t gone to the web yet, but he’s alphabetically the earliest one there, and the program is only up to the D players so far. I think you’ll still be out of luck with Peterson, though – having fewer than 10 innings pitched, total, will break the program.
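
The failure mode described here (output left over from the previous player being reused when a run breaks) can be sketched in a few lines of Python. This is only an illustration of the general fix, not the actual program: `project` is a hypothetical callable standing in for the projection step, and the comp lists are placeholders.

```python
def run_all(players, project):
    """Run a projection for each player and return {name: comps or None}.

    `project` is a hypothetical callable that returns a list of comps
    and may raise when it cannot handle a player.
    """
    results = {}
    for name in players:
        scratch = None  # cleared before every player so old output cannot leak
        try:
            scratch = project(name)
        except Exception:
            pass  # a failed run stays None and is never filled from the prior player
        results[name] = scratch
    return results


def fake_project(name):
    """Stand-in projection: fails for one player, as in the case above."""
    if name == "David Peterson":
        raise ValueError("too few innings pitched")
    return ["placeholder comp for " + name]


pages = run_all(["Tommy Peterson", "David Peterson"], fake_project)
```

In this sketch a player whose run fails ends up with no comps at all, which is easier to spot than a page that quietly shows someone else's.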

 

5 Responses to A question from you

  1. Jack Everitt says:

    Wait, we can ask questions? I have one: The last player listed for the A’s is Logan Farrar. This is INCREDIBLY surprising, as last year was his first in the pros, at short-season A-level (the Lake Monsters). I’m just not even slightly imagining him starting at Beloit (not Stockton), then Stockton, then Midland, then Nashville, and then 30 AB in September. THIS guy is going to play at FIVE levels in 2018? A 36th round draft choice?! Why. Him?!

  2. Josh Clayton says:

    Many thanks to you, Clay, for not only doing this work but taking the time to answer questions that we neophytes have. Thank you and I look forward to the next DT reports.

  3. Ruben Arutiunov says:

    THANK you so much for Your work !
    Please Tell me where you can find stats on games since 1871 or as early as possible ? Statistics need this format :
    Nov 01, 02: 20 PM Los Angeles Dodgers – Houston Astros 5 : 1
    Just date-time (or no time)- home team-guest team-match score.
    It would be great to contact you by e-mail.

  4. Ruben Arutiunov says:

    THANK you so much for Your work !
    Please Tell me where you can find statistics on games since 1871 or as early as possible ? Statistics need this format :
    01.11.2017 14:20 Los Angeles Dodgers – Houston Astros 1:5
    Just date-time (or no time)- home team-guest team-match score.
    It would be great to contact you by e-mail.
    I do not speak English. I write with the help of an online translator. Sorry if it is unclear…

  5. 3cardmonty says:

    I have a question…what do all the abbreviations on this site mean? How can there be dozens of different stats and no glossary?

    The ones I’m particularly curious about are RAR and RARP from the 6-year projection sections of the hitter and pitcher DT pages respectively. I’m sure RAR is runs above replacement. Is RARP just runs above replacement prevented? The reasons I ask are (a) there is no pitcher WARP listed, so maybe RARP is something different entirely, and (b) RAR and RARP seem to be on very different scales. Is this just because pitchers are subject to so much more attrition, so their projections are much more conservative?
