when looking at spring training stats.
Florida games: .254 batting average, .331 on-base, .414 slugging, 9.4 runs per game
Arizona games: .275 (+21!) .351 (+20) .469 (+55!), 11.5 (+2.1)
Offense is 22% higher in Cactus games than Grapefruit games.
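As a quick check on that figure, here is a minimal sketch of the arithmetic using the runs-per-game numbers above. The function name is only illustrative.

```python
def pct_higher(value, baseline):
    """Return how much higher `value` is than `baseline`, in percent."""
    return (value / baseline - 1.0) * 100.0


florida_rpg = 9.4   # Grapefruit League runs per game, from the line above
arizona_rpg = 11.5  # Cactus League runs per game, from the line above

cactus_vs_grapefruit = pct_higher(arizona_rpg, florida_rpg)
print(f"Cactus offense is {cactus_vs_grapefruit:.0f}% higher than Grapefruit")
```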
On top of that, teams have wildly different run environments, even within the two leagues. The run environment includes what might be called park factors – although, over just 15 or so games, it also includes windy days, umpires, and a whole lot of variability. It will also lean strongly low for teams whose pitching is a lot better than their hitting, or high for the reverse. That may be the main reason that the Baltimore Orioles (good offense, suspect pitching) are playing in the most extreme run environment of the spring; 11.3 runs per game is 21% higher than the Florida average. Their real-life neighbors, the Washington Nationals, are playing in the most extreme low run environment this spring, at 7.3 rpg (a lower run environment than the 1968 Washington Senators!), 23% below average.
All teams' run environments, relative to their league average:
1. Baltimore 1.211
2. San Francisco 1.197
3. Pittsburgh 1.137
4. Detroit 1.108
5. White Sox 1.106
6. Oakland 1.090
7. Houston 1.087
8. Toronto 1.075
9. Arizona 1.075
10. Cleveland 1.073
11. Mets 1.068
12. Kansas City 1.066
13. Yankees 1.045
14. Boston 1.035
15. San Diego 1.030
16. Dodgers 1.008
17. St Louis .997
18. Philadelphia .947
19. Texas .942
20. Cincinnati .932
21. Cubs .931
22. Tampa Bay .919
23. Angels .914
24. Colorado .890
25. Minnesota .883
26. Seattle .873
27. Milwaukee .867
28. Atlanta .855
29. Miami .840
30. Washington .776
I pulled this out from the comments because I thought the answers might be helpful to more than one person. With thanks to Josh Clayton…
Hello Clay. Thank you for your work – I really enjoy the player comps. Why do some of a player’s comps change when the Organizational DT reports are generated every month? Michael Conforto had Lance Berkman as a comp in November but no longer does now.
The simple answer to the question is that it means I changed something. The offseason is a time for me to review, reanalyze, and test the assumptions that go into the model that generates the comps; I try to avoid doing it while the season is underway, simply because I cannot reprocess everybody and still keep up with the real-time player updates. There were two main changes I worked in (before a third one, which I’ll get to in a moment). First, I wanted to raise the importance of the player’s level – enlarging the penalty between stats accumulated at AAA versus stats accumulated at AA, for instance. Second, and probably the larger effect, was changing the player’s baseline stats. The way I calculate the 3-year averages, for all players, moved to a weighted average, with different weights for different stats, and also with variable regression to the mean.
What I mean by that is – some statistics are more deterministic than others. Pitchers’ strikeouts are strong – the most recent season has a high weight, and the regression to the mean (RTM) is small, with weights of about .5/.2/.1. The most recent season is carrying 63% of the total, and the difference between the sum of the weights (.8) and 1.0 is small. Pitchers’ hits allowed, by contrast, have weights like .15/.1/.1. They are essentially even over the last three years, with a large RTM.
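To make the weighting concrete, here is a minimal sketch of one way such a baseline could be computed. It assumes the weight left over after the three seasons (1 minus the sum of the weights) is assigned to the league mean; the post does not spell out the exact formula, and the pitcher and league numbers below are hypothetical.

```python
def weighted_baseline(rates, weights, league_mean):
    """Blend per-season rates with regression to the mean (RTM).

    rates   -- per-season rates, most recent season first
    weights -- matching season weights, most recent season first
    The leftover weight, 1 - sum(weights), is given to the league mean.
    That leftover rule is an assumption, not a documented formula.
    """
    if len(rates) != len(weights):
        raise ValueError("need exactly one weight per season")
    rtm_weight = 1.0 - sum(weights)
    blended = sum(rate * weight for rate, weight in zip(rates, weights))
    return blended + rtm_weight * league_mean


STRIKEOUT_WEIGHTS = (0.5, 0.2, 0.1)      # strong stat: small RTM (0.2)
HITS_ALLOWED_WEIGHTS = (0.15, 0.1, 0.1)  # weak stat: large RTM (0.65)

# Hypothetical pitcher: strikeouts per 9 innings, most recent season first.
k9_by_season = (10.0, 9.0, 8.0)
league_k9 = 8.0  # hypothetical league average

k9_baseline = weighted_baseline(k9_by_season, STRIKEOUT_WEIGHTS, league_k9)

# Share of the season weights carried by the most recent year (about 63%).
recent_share = STRIKEOUT_WEIGHTS[0] / sum(STRIKEOUT_WEIGHTS)
print(f"baseline K/9 = {k9_baseline:.2f}, recent share = {recent_share:.1%}")
```

With the hits-allowed weights, the same function would pull a pitcher much harder toward the league mean, since 65% of the total weight goes to regression.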
And where do you get the player’s height and weight info? Dominic Smith’s page lists him at 185 pounds but Baseball Reference has him at 239.
Well, this is going to set off another change in the comps, and is just flat-out embarrassing. I have a program that is supposed to run to keep my master bio file up to date. It appears that said program broke a couple of years ago without me noticing. As a result – once a player entered my database, his info became static. When Dominic Smith entered my lists – as an 18-year-old in the Appalachian League – he was listed as 6’0″, 185. Now – aged 22 – he’s added 54 pounds and is listed as 239. He was still listed at 185 by MLB after 2014, and after 2015. He went to 250 during the 2016 season, and 239 this year. The numbers come from an MLB stat feed.
While weights are the most likely thing to change, the same program will also pick up on changes in height (anyone drafted 18 or younger may still have some growth upwards, not just outwards), names (Michaels become Mikes), and occasionally correct errors in handedness or birthdates – recognizing that some of those birthdate “errors” are deliberate. So that program did get patched up, and everyone was updated, and now everyone needs to be rerun. That is going to take a few days.
Also, the comps for some of the pitchers the Mets drafted in 2017 are for different players. David Peterson was the Mets top pick this past year and I was curious what his comps were but his page instead lists the comps for Tommy Peterson. The other pitchers whose pages display the comps of a different player are: Marcel Renteria, Aaron Ford, Tony Dibrell, Matt Cleveland, Kyle Wilson, Joshua Walker, Nate Peden, Noah Nunez, Liam McCall, Dariel Rivera, Bryce Hutchinson, Ronnie Taylor Jr., and Yadiel Flores.
That indicates the program broke while running his projection, and inserted the numbers for the player previously run – which indicates I am not clearing a file before starting. I can see that the re-runs did fix Matt Cleveland’s page – it hasn’t gone to the web yet, but he’s alphabetically the earliest one there, and the program is only up to the D players so far. I think you’ll still be out of luck with Peterson, though – having fewer than 10 innings pitched, total, will break the program.
Those of you with long memories may remember that I have my own Hall of Fame rating scale. It is a purely objective system, based only on the player’s season by season WARP ratings – no postseason, no All-Star games, no subjective flash, no drugs, no gambling. It isn’t meant to be perfect, and it isn’t meant to simulate who is actually in.
The system is based on what I called the player’s “MVP Career” score. It is just a weighted average of the player’s WARP3 scores, and I called it “MVP Career” because it uses a weighting system similar to the MVP ballot’s. The player’s best season gets counted 14 times. His second-best season gets counted 9 times, his third-best 8 times, and so on. Everything from the 10th-best season through the end of his career gets counted once. So longevity will help build a score, but for the most part you have to have a strong peak to have a high enough score to get in. The best score in history belongs to Babe Ruth, at 785; 6 players have a 600+, 34 have 500+, 145 have 400+, and 484 have 300+.
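A minimal sketch of that scoring rule follows. It assumes the "and so on" continues down the MVP ballot's 14-9-8-7-6-5-4-3-2-1 scale, and it treats the score as a plain weighted sum with no normalization, which is an assumption; the function name and sample seasons are hypothetical.

```python
# Multipliers for the best through tenth-best seasons, following the MVP
# ballot's point scale (assumed for the seasons the text does not list).
BALLOT_WEIGHTS = (14, 9, 8, 7, 6, 5, 4, 3, 2, 1)


def mvp_career_score(season_warps):
    """Weighted sum of season WARP values, best season weighted most."""
    ranked = sorted(season_warps, reverse=True)
    score = 0.0
    for rank, warp in enumerate(ranked):
        # Anything past the tenth-best season is counted once.
        weight = BALLOT_WEIGHTS[rank] if rank < len(BALLOT_WEIGHTS) else 1
        score += weight * warp
    return score


# Hypothetical careers with the same total WARP (24.0) but different shapes.
peak_career = [9.0, 6.0, 5.0, 4.0]
flat_career = [6.0, 6.0, 6.0, 6.0]
print(mvp_career_score(peak_career), mvp_career_score(flat_career))
```

The two sample careers show why peak matters more than bulk here: the weights reward concentrating value in the best seasons.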
One thing that does make my Hall system different from other systems – like Jaffe’s JAWS or James’s Monitor – is that it has no set value to decide whether a player is in or out. Like the real Hall, the selection is serial. I have a defined size for the Hall – and the top available players needed to fill it get in.
I defined the size of my Hall by looking at the size of the real Hall, and comparing it to the number of teams in major league history. It turns out that the real Hall of Fame has inducted about one member for every 12 team-seasons – that number may have drifted a bit since I first did the study however many years ago, but it provides a nice, workable number.
Like the real Hall, I started my calculations from 1936. At that time, there should already have been 70-odd players in the Hall, based on the number of teams in history then. But I set rules that more or less mirrored the behavior of the real Hall. I limited the entry class for any given year to 5 initially, then dropped it to 4 after a few years, and then to 3 – until the Hall had caught up to size. I imposed a five-year wait for all players to become eligible (even dead ones; sorry), and gave them a 15-year window of eligibility (the 15-year window only kicked in at 1960). My first class matched the real Hall on three of them – Honus Wagner, Ty Cobb, and Walter Johnson. I have Pete Alexander ranked higher than Christy Mathewson, and so reversed their real-life induction years. And I also moved Nap Lajoie up a year, because Babe Ruth wasn’t eligible for me until 1941.
So, with 30 teams in the majors, I have a Hall that grows by either 2 or 3 players a year. The best 2 (or 3) players available at that time are my inductees (which helps to limit the consequences of, say, any bias in the WARP rating system over time). There are players who do wait multiple years before breaking through. The score needed to get you in varies with time. In the 60s and 70s, some inductees had scores barely over 300; today, players with a 400 score cannot be certain of getting in.
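The sizing and selection rules can be sketched roughly as below. This is a simplification under stated assumptions: it ignores the class-size caps, the five-year wait, and the 15-year window, and the function names and the zero starting total are hypothetical. The 2018 candidate scores are the ones quoted in this post.

```python
TEAM_SEASONS_PER_MEMBER = 12  # about one Hall member per 12 team-seasons


def yearly_slots(teams_per_year, start_team_seasons=0):
    """New Hall openings created each year as team-seasons accumulate."""
    slots = []
    total = start_team_seasons
    for teams in teams_per_year:
        capacity_before = total // TEAM_SEASONS_PER_MEMBER
        total += teams
        slots.append(total // TEAM_SEASONS_PER_MEMBER - capacity_before)
    return slots


def pick_class(eligible_scores, slots):
    """Serial selection: the top `slots` eligible players by score get in."""
    ranked = sorted(eligible_scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _score in ranked[:slots]]


# With 30 teams a year, openings alternate between 2 and 3.
print(yearly_slots([30, 30, 30, 30]))

# Eligible players and scores for 2018, as quoted in this post.
candidates_2018 = {
    "Jorge Posada": 437,
    "Scott Rolen": 426,
    "Sammy Sosa": 422,
    "Johan Santana": 417,
    "Jim Thome": 410,
    "Chipper Jones": 409,
}
print(pick_class(candidates_2018, 3))
```

Because there is no fixed cutoff score, the same player can miss in a crowded year and get in later once stronger candidates have cleared out.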
This year, 2018, was a three-player year – the program says I have room for Hall members #227, #228, and #229. The three best players available to me, my own inductees for 2018, were:
Jorge Posada, 437. Sadly, dropped from the writers’ ballot last year.
Scott Rolen, 426. Received only 10% of the vote this year.
Sammy Sosa, 422. Received only 8%, pays a hefty PED penalty.
Looking over the rest of the writers’ voting:
Chipper Jones – Jones has a 409 career score, and for my purposes trails Scott Rolen. I have a huge defensive split between them, with Rolen at +115 runs for his career and Jones at -160, easily enough to explain the difference between their scores and their perceptions. One of the features, if you will, of my set-up is that I can tell you right now what the selections are for the next five years, based on players who did not play in 2017, assuming no one un-retires. Chipper ranks as my third runner-up in 2018, and will remain among the top 5 runners-up until finally making it in 2022.
Vladimir Guerrero – 369 career MVP score. Does not make my ballot, and never makes the top 5 runners-up. He only has 1 truly outstanding season for me, a 9.0 WARP3 in 2004 which was third-best in the league. He only had one other season above 6, two more over 5. I have him ranked 24th among all right fielders – that doesn’t quite make it.
Jim Thome – 410. Thome is my second runner-up in 2018, one spot ahead of Chipper. He’s set to break through in 2021.
Trevor Hoffman – 380. Hoffman’s score is the third best score for a relief pitcher, better even than Eckersley (377), Gossage (365), Fingers (342), or Wilhelm (310). Due to the vagaries of the system, Wilhelm and Eckersley made it into my Hall (in 1979 and 2004; Gossage was first runner-up 4 different times but never broke through). Hoffman never breaks into the top 5 also-rans.
Edgar Martinez – 466. I set a player’s position by where he recorded the most WARP3, not games played. For me, Ernie Banks is a shortstop, not a first baseman. And Edgar Martinez is a DH, and he ranks as the best DH ever, ahead of Paul Molitor (433, inducted 2004) and David Ortiz (423, scheduled to go in with A-Rod and Chipper Jones in 2022). Martinez went in for me on his first ballot in 2010.
Mike Mussina – 435. Mussina went in for me in 2016, his third ballot.
Roger Clemens – 609. Clemens has the second highest pitching score ever, behind only Walter Johnson’s 687. Without PEDs, there is no doubt that Clemens would be in – and I mean that in every sense of the word, because he did enough before ever touching PEDs to earn a spot. My system makes no allowance for moral failings, any more than it did for Joe Jackson (1941) or Pete Rose (1993). He was part of a two-man class to make my Hall in 2013.
Barry Bonds – 749. The other member of my 2013 class, and most of what I just said about Clemens fits here too. He has the second highest MVP score in history, behind only Ruth. Also noteworthy that his father’s 355 score was good enough to make the Hall in 1987, giving me a father/son duo.
Curt Schilling – 392. Another example where outside controversy won’t follow a player into my system, but his score isn’t going to be enough to get him in. He was the third runner-up in 2013, dropped to fifth in 2014, and hasn’t made the top 5 since. He’ll be in the running for best score that doesn’t make it in.
Omar Vizquel – 254. A player who is only listed on the ballots because of his defense, but I only give him +56 for his career. I do believe that is too low. I have a couple of programs I have played around with over the years, one that compares fielders against other fielders behind the same pitcher, and another that controls for opposing hitters. With his pitchers, Vizquel was only about 33 plays above average, which would be about +25 runs; but controlling for opposing hitters, he’s +218 plays. That would be good for an extra 100 runs or so, and would be more in keeping with his popular image, but still wouldn’t compare to Ozzie Smith (+586) or Mark Belanger (+347) using the same scale. A 254 score only makes him the 63rd best shortstop in history, and doesn’t come close to Hall standards.
Larry Walker – 318, 44th best in right field.
Fred McGriff – 316, 36th best among first basemen. That comparison to opposing hitters I was talking about with Vizquel? McGriff has a -223 on that, the worst career total for any first baseman.
Manny Ramirez – 449. Ramirez managed to outslug his awful defensive numbers, and earn a score that got him into my Hall in 2017.
Jeff Kent – 402. Kent has been among the top 5 runners-up for five straight years, and will be there again in 2020-22. I think he’ll make it in the 2024-26 timeframe, but it depends on how some of the currently active players ahead of him (Pujols, Mauer, Bautista, Cano) spread out their retirements. The longer they hold on, the better for Kent.
Billy Wagner – 337. Ranks 8th among relievers but has no shot.
Gary Sheffield – 436. Made my Hall in 2016.
Scott Rolen – 426. Made my Hall this year.
Sammy Sosa – 422. Made my Hall this year, his sixth on the ballot.
Andruw Jones – 361. 17th ranked center fielder, better than a couple of center fielders (Cy Seymour and Duke Snider) who did get in.
Johan Santana – 417. Santana was my first runner-up this year. He’ll be first runner-up again next year, when two first-time players (Mariano Rivera and Roy Halladay) step ahead of him, and then he joins my Hall in 2020. He had a short, sharp career – I give him at least a 7.8 WARP for five straight seasons, which is more than I can say for, oh, Sandy Koufax or Dizzy Dean, both of whom are in my Hall and the real Hall. And not even 5% of the writers would give him their vote.
Jamie Moyer – 329. Has a score of 311 just from his seasons after age 33, which is 15th all-time. And special to me, since he was the last player older than I was playing in the major leagues.
Hideki Matsui – 223.
Kerry Wood – 231.
Chris Carpenter – 298.
Livan Hernandez – 243.
Carlos Lee – 227.
Carlos Zambrano – 284.
Brad Lidge – 208.
Kevin Millwood – 253.
Aubrey Huff – 262.
Orlando Hudson – 252.
Jason Isringhausen – 178.
in translated statistics.
BA .265, OBA .335, SLG .415, OPS .750
For 650 PA, it comes out to about 579 AB, 153 H, 29 DB, 3 TP, 17 HR, 55 BB, 108 SO, 10 SB, 5 CS, 80 R, 76 RBI, and 74 EQR.
For pitchers, 198 IP, 198 H, 22 HR, 66 BB, 132 SO, 110 R, 99 ER, 4.50 ERA
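As a sanity check on those averages, here is a small sketch that recomputes the rate stats from the counting stats above. The function names are mine, and because the counting stats are rounded, the batting rates only land within a few points of the listed line; the ERA works out exactly.

```python
def batting_average(hits, at_bats):
    """Hits per at-bat."""
    return hits / at_bats


def slugging(hits, doubles, triples, home_runs, at_bats):
    """Total bases per at-bat."""
    singles = hits - doubles - triples - home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def era(earned_runs, innings):
    """Earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings


# Counting stats for the average hitter and pitcher, from the lines above.
avg = batting_average(hits=153, at_bats=579)
slg = slugging(hits=153, doubles=29, triples=3, home_runs=17, at_bats=579)
pitcher_era = era(earned_runs=99, innings=198)

print(f"BA {avg:.3f}, SLG {slg:.3f}, ERA {pitcher_era:.2f}")
```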
To be honest, I first set up the 2018 projections a little after Thanksgiving, but never felt like writing up anything about them. At this stage – with pitchers and catchers still having four more weeks of liberty before reporting for duty, and most of the free agents still free – there’s an awful lot of guesswork to go around. How are the Marlins going to fill their outfield? The Orioles their rotation? How long will Player X, expected to be ready by May, really be out?
These are my token stabs in the dark at some of those answers.
Baltimore Orioles – Signed Audry Perez and Eddie Gamboa; neither is likely to play.
Chicago White Sox – Signed Miguel Gonzalez and Gonzalez Germen. Both Gonzalezes should claim a regular spot for the Sox this year, with Miguel in the rotation and Germen in the pen.
Detroit Tigers – Signed Johnny Barbato, who could get a spot callup to fill the back end of the bullpen for a month or so.
Kansas City Royals – Added Tyler Collins, who is no better than the bad outfielders I already have slotted in the KC outfield.
Minnesota Twins – Added Addison Reed and Jermaine Curtis. Curtis won’t play. Reed, however, is probably the best reliever in the Twins bullpen, and is going to be a very popular bid in roto leagues waiting for Rodney to fail.
New York Yankees – Signed Wade LeBlanc to a minor league contract; I expect him to work in long relief for a few innings.
San Francisco Giants – Trading for McCutchen primarily means that the Parker/Slater combo, who were fighting for left field playing time, are now fighting for fourth outfielder time. Although they do have a chance of knocking off Hunter Pence, if he can’t recover from last year’s disaster.
Tampa Bay Rays – No changes, but I had only had Brendan McKay registered as a hitter. Now he’s listed as both a hitter and a pitcher.
Texas Rangers – Picked up Curt Casali and Deolis Guerra. Casali is probably the fourth catcher, which puts him on the fringe of getting into the majors this season – survey says about a third of similar players were in the majors. Guerra projects better than several current members of the pen, and I think he’ll get a fair amount of playing time.
Toronto Blue Jays – Traded Dominic Leone and Conner Greene for Randal Grichuk. Also signed Curtis Granderson and Al Alburquerque. Granderson figures to take over the big side of a LF platoon with Steve Pearce, while Grichuk more or less takes over in right. That should be a marked improvement over the Teoscar Hernandez/Anthony Alford/Ezequiel Carrera combos I had in there before. Replacing Leone with Alburquerque is a loss for what is already a pretty sorry-looking bullpen.
Sorry, folks, for the lack of updates over the last three weeks, but I was actually out of the country on vacation. I hope you’ll forgive me for not announcing in advance, on a public forum, “I’ll be out of my house for the next three weeks!”
I knew some reports would suffer without daily maintenance – mainly adding new people to the database. What I did not expect was that my computer would put itself in sleep mode the day I left, meaning it did not even make the regular data pulls, much less run anything during my absence. I’m trying to rebuild some of the files (mainly the playoff odds charts) that needed daily data.
One vacation pic – from 2000m up in the mountains on the Polish/Slovak border. Not much of an expression, but the background was gorgeous!
This is just too far out there not to pass on.
I had seven players in my database with the last name “Pantoja”: Fidel, Jhonny, Thormar, another Jhonny, Jorge, Alexis, and Yorvin, with birthdates ranging from 1950 to 1997.
This morning’s stat update from MLB had the professional debut, for the Athletics in the Dominican Summer League, of an eighth Pantoja.
And his first name is Enrry.
I guess you could call these my power rankings. All major league teams, ranked by WARP1, compared to win percentages. By way of comparison, I looked up Fangraphs team WAR while I was working on it.
|Team||Wins||Losses||WARP (me)||WAR (Fangraphs)|
|Correlation with WinPct||0.857||0.834|
I guess I must still be doing something right.
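For anyone who wants to run the same kind of comparison, here is a minimal sketch of a Pearson correlation between a team value metric and winning percentage. The four teams below are made-up numbers used only to exercise the function; they are not the data behind the table above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


# Hypothetical teams: winning percentage and total team WARP.
win_pct = [0.600, 0.550, 0.480, 0.400]
team_warp = [20.0, 14.0, 9.0, 6.0]

r = pearson(team_warp, win_pct)
print(f"correlation with win pct: {r:.3f}")
```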
The Ides of March are come – but not yet gone.
As a round of cuts hits the spring teams, I don’t have cuts – I have adds. I’ve been digging through a slew of other stats, and now have translations up for – get this – 17 leagues not previously listed for 2016.
Now, the fun part of this for me wasn’t so much where the players fell within their leagues, but where the leagues fell with respect to each other.
The Pecos League, a 12-team league playing a 72-game schedule in the Southwest, was the weakest of the leagues I reviewed. The common players I found in Organized Baseball came from low-level leagues – Arizona, Gulf Coast, Pioneer, and a few AB in the South Atlantic. The rating comes in at .245 – meaning that every run in the Pecos league was only worth a quarter of a run in the majors.
Next up was the Dutch league, which rated .269. Really hard to say much, though, since the rating depends almost entirely on Nick Urbanus and Kevin Moesquit, who played in the SAL and Midwest leagues.
The Pacific Association, a four-team group operating in northern California, checks in at .273.
For reference, we now hit a set of OB leagues. The GCL and Arizona leagues come in at .338 and .333; the Pioneer and Appalachian leagues register at .387 and .381. And, not far above, the short-season As – the Northwest league rated .387 in 2016, while the New York-Penn drew .406.
The weakest of the winter leagues was the one in Australia – which, ironically, plays in the (Australian) summertime. They draw a variety of players, quite a few from AA. It comes in at a New York-Penn equivalent of .406.
The Frontier League comes in with a .433 rating. The majority of the common players came from the slightly-stronger mid-A leagues, the Midwest and South Atlantic.
I was surprised by the Italian League’s rating, a robust .504, splitting the difference between the South Atlantic (.476) and Midwest (.511) leagues. They even had a former major leaguer, in Ronny Cedeno.
The next set of OB leagues up are the high-A leagues. In 2016, the Carolina rated .556, the California was at .577, and the Florida State league led with a .595. Fitting neatly into them would be the new American Association, at .587. The 12 teams of the AA mainly lie along a north-south line running between Laredo, Texas, and Winnipeg, Manitoba; the team in Gary, Indiana is the only one significantly off the line.
Between the high-A and AA leagues are the highest independent leagues – the Canadian-American league at .617 and the Atlantic league at .644. Both are primarily in the northeast US region. The Can-Am common players are primarily from AA and high-A, which is a little surprising. Generally speaking, independent leagues made up of players from league X tend to play at a level somewhat below X – they were, after all, players who did not move up in Organized Baseball. The Atlantic fits better – it is highly regarded, with many players from AA, AAA, and even recognized major league players.
The Mexican league, at .652, also falls in here. Matching it is the Puerto Rican winter league.
Double-A leagues come in at .667 (Texas), .671 (Southern), and .697 (Eastern). The Arizona Fall League, at .667, steps right in at the lower end of AA.
Just on the high side of AA are three more winter leagues, surprisingly equal to one another. The Venezuelan rated .701, the Mexican Pacific was at .708, and the Dominican was on top with .733.
The Korean Baseball Organization rates .734.
The Triple-A leagues are at .759 (PCL) and .802 (International).
The Japanese Central league (.809) and Japanese Pacific (.840) are the highest leagues outside of the majors…
which is the only thing left, the National League at 1.000 and the AL (1.107).
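To show how these ratings can be read, here is a minimal sketch that treats a rating as a straight multiplier on runs, as in the Pecos example above (a rating of .245 means a run there is worth about a quarter of a major league run). The real translations presumably involve more than one multiplier, so this is only an illustration; the function names are hypothetical, and the ratings are a subset of the 2016 values listed above.

```python
# 2016 league ratings quoted above, on a scale where the National League is 1.000.
LEAGUE_RATING_2016 = {
    "Pecos": 0.245,
    "Dutch": 0.269,
    "Pacific Association": 0.273,
    "Frontier": 0.433,
    "Italian": 0.504,
    "American Association": 0.587,
    "Can-Am": 0.617,
    "Atlantic": 0.644,
    "Mexican": 0.652,
    "KBO": 0.734,
    "National": 1.000,
}


def major_league_runs(runs, league):
    """Scale runs in `league` to their rough major league equivalent."""
    return runs * LEAGUE_RATING_2016[league]


def relative_strength(league_a, league_b):
    """How strong league_a is relative to league_b (1.0 means equal)."""
    return LEAGUE_RATING_2016[league_a] / LEAGUE_RATING_2016[league_b]


pecos_equivalent = major_league_runs(100, "Pecos")
print(f"100 Pecos League runs are worth about {pecos_equivalent:.1f} major league runs")
print(f"Atlantic vs. Frontier: {relative_strength('Atlantic', 'Frontier'):.2f}")
```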
Similar to the last one, for hitters. I believe I used a 50 IP requirement, with a caveat for anyone projected to get at least 5 saves. There were 229 pitchers in the totals.
Just going to drop everything in at once, starting with 2016’s best. PECOTA really killed it on pitchers last year, particularly with BB and SO.
|Stat||W||L||SV||G||GS||IP||H||R||ER||HR||BB||SO||ERA (R)||WHIP (BR)||SO(R)||BB(R)||sum||rates||roto|
|2013-15 mj + mnr/2||3.04||2.87||2.93||9.72||5.80||33.49||33.53||0.00||16.25||5.79||11.24||32.55||11.22||17.10||11.72||7.77||205.05||47.81||66.85|
Note: The “R” category is zero for all because one stat I tried to use didn’t have it – so I just cut it from everybody’s.
While everything on this site is free, a donation through Paypal to help offset costs would be greatly appreciated. -Clay